I am still a bit confused: does this design the Station or the Disk Caching part of the Station?
…
Suggest a picture for "efficient" (i.e., define "efficient"). You are going to follow this design with an implementation, are you not?
………Clearly, the availablity
Typo: "availability".
of a job's files on disk cache greatly affects its expected turnaround and therefore the extent to which it may be desirable to schedule the job sooner.
…..
Well yes below. I would have liked the below up front and some of the other stuff at the end for future consideration… but that is just me.
In summary, the rationale for the station design in its present form, i.e., largely restricted to disk management, is as follows. We strongly believe that intelligent disk (cache) management is (a) a well-defined task of the station, to be integrated seamlessly into the bigger picture, (b) a natural step towards efficient overall resource management, rather than a diversion from the ongoing overall analysis, and (c) a necessity at the present stage of the evolution of SAM as a project.
Specifically in the context of the station design, we will use the following terms:
I think the below is important and Station should be capitalized.
The primary contribution of the present document is given by the following discussion. The proposed design differs significantly from earlier ideas.
The distinction between the Cache and the Buffer is too fine
After all this time we are getting rid of the distinction? What about the Enstore cache/buffer terms? I know you have talked to Don; what is the distinction in Enstore? It would be nice to have a comparison somewhere in this document or in an appendix, because when we start talking about the Enstore buffer and cache we will get confused. See if Don will give you some words to massage, maybe. What about asking Waldman, who is actually working on the Enstore caching, to review your doc?
and becomes cumbersome when enforced by the design. In many cases, it is simply not possible to predict whether a file will be reused in the near future or not. Treating a part of the disk as a buffer simply means applying a particular (FIFO) cache replacement algorithm to it. We are not presenting any particular cache replacement algorithm; moreover, we assume that multiple algorithms will be possible (and dynamically set) for various parts of the total disk on the station. Thus, we erase the boundaries between Buffer, Short Term Cache, and Long Term Cache, while understanding that different parts of the station may be configured to behave effectively as one of these. In short, we treat all the station's disk as THE CACHE.
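To illustrate the point that a "buffer" is just a cache area running FIFO replacement, here is a minimal Python sketch. The class CacheArea, its method names, and the choice of LRU as the second policy are assumptions made for illustration; the design itself deliberately does not fix any replacement algorithm.

```python
from collections import OrderedDict


class CacheArea:
    """One part of the station disk with its own replacement policy (hypothetical)."""

    def __init__(self, name, policy="lru"):
        if policy not in ("fifo", "lru"):
            raise ValueError("unknown policy: " + policy)
        self.name = name
        self.policy = policy          # could be changed at run time
        self._files = OrderedDict()   # first entry is the first candidate for erasure

    def add(self, file_name):
        self._files[file_name] = True

    def touch(self, file_name):
        # Under LRU an access makes the file the newest entry;
        # under FIFO the arrival order is never changed.
        if self.policy == "lru" and file_name in self._files:
            self._files.move_to_end(file_name)

    def next_victim(self):
        # The file that would be erased first, or None if the area is empty.
        return next(iter(self._files), None)


buffer_area = CacheArea("buffer-like", policy="fifo")
cache_area = CacheArea("cache-like", policy="lru")
for area in (buffer_area, cache_area):
    for f in ("a", "b", "c"):
        area.add(f)
    area.touch("a")   # file "a" is read again after delivery
```

With the same sequence of deliveries and accesses, the FIFO area would erase "a" first while the LRU area would erase "b" first; nothing else distinguishes the two areas.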
The Station's Cache Manager (CM) is responsible for coordination of projects requesting files and proper cooperation with the global resource manager (i.e., the optimizer). Naturally, the cache management algorithm will essentially generalize that in the project master's replenisher: while the replenisher serves only its (directly attached) project master, the station's disk manager serves any number of projects, possibly with overlapping file requests.
For backward compatibility with projects that must (or wish to) run without the station master, the cache manager will implement all the interfaces of the replenisher. Thus, every project master will communicate with the same interface, implemented either as a directly attached replenisher or in the station, with the decision being made at project startup time.
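The "same interface, two implementations" arrangement can be sketched as follows. The interface name FileSource, its two methods, and the helper file_source_for are hypothetical; the actual replenisher interfaces are not listed in this document.

```python
from abc import ABC, abstractmethod


class FileSource(ABC):
    """Hypothetical interface seen by every project master."""

    @abstractmethod
    def request_files(self, project, files):
        """Ask for delivery of the given files on behalf of a project."""

    @abstractmethod
    def release_file(self, project, file_name):
        """Tell the source that the project is done with a file."""


class AttachedReplenisher(FileSource):
    """Serves exactly one project master (the stand-alone case)."""

    def __init__(self, project):
        self.project = project
        self.wanted = set()

    def request_files(self, project, files):
        if project != self.project:
            raise ValueError("a replenisher serves only its own project")
        self.wanted.update(files)

    def release_file(self, project, file_name):
        self.wanted.discard(file_name)


class StationCacheManager(FileSource):
    """Serves any number of projects, possibly with overlapping requests."""

    def __init__(self):
        self.wanted = {}  # file name -> set of interested projects

    def request_files(self, project, files):
        for f in files:
            self.wanted.setdefault(f, set()).add(project)

    def release_file(self, project, file_name):
        projects = self.wanted.get(file_name, set())
        projects.discard(project)
        if not projects:
            self.wanted.pop(file_name, None)


def file_source_for(project, station=None):
    """Decide at project startup which implementation to use."""
    return station if station is not None else AttachedReplenisher(project)
```

The project master only ever calls the FileSource methods, so it does not need to know whether a station is present.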
When a project is started, its snapshot files are added to the "requested file" set in the Cache Manager. The CM then requests authorization from the optimizer for all the newly requested files (i.e., those that weren't already known
Known? Known about by the CM? Already requested in previous snapshots from this station?
before this project started). At all times, each file in the "requested" set is associated with at least one project that expressed interest in it.
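A minimal sketch of the "requested file" bookkeeping follows. It assumes that "already known" means "already present in this CM's requested set", which is only one possible reading; the class and method names are hypothetical.

```python
class RequestedFiles:
    """The "requested file" set, with the projects interested in each file."""

    def __init__(self):
        self._projects_by_file = {}  # file name -> set of project names

    def add_project(self, project, snapshot_files):
        """Register a project's snapshot.

        Returns the files that were not previously requested, i.e. those
        for which authorization must be asked from the optimizer.
        """
        newly_requested = []
        for f in snapshot_files:
            if f not in self._projects_by_file:
                self._projects_by_file[f] = set()
                newly_requested.append(f)
            self._projects_by_file[f].add(project)
        return newly_requested

    def projects_for(self, file_name):
        # Every requested file has at least one interested project.
        return set(self._projects_by_file.get(file_name, ()))
```

For example, if project p1 requests f1 and f2 and project p2 later requests f2 and f3, only f3 needs a new authorization for p2, and f2 ends up associated with both projects.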
When the authorization for a file arrives, the file is added to the "can go" file list. This is the list of files, hopefully grouped by volume (if the optimizer has done a good job), whose HSM-to-disk retrieval can begin as soon as there is enough cache space. Specifically, if the disk requirements for the next delivery group (see below) can be met by erasing some of the disposable files (called "can free" in the replenisher), the CM instructs the stager(s) to erase the disposable files and initiate the deliveries for the group. A delivery group is a sublist of the "can go" list that is a unit of ENCP work; naturally, it is a set of files from one physical volume (tape). If tape mounts are the scarcest resource, a group includes all the files from the tape that are needed by all the known projects. If disk space becomes limited as well, the group size may decrease to a single file (as in the initial implementation of the replenisher).
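The grouping and the space check described above can be sketched as follows. The tuple layouts, the function names, and the idea of passing the disposable files in an already chosen erasure order are assumptions for illustration; which files to erase first remains a cache-policy decision outside this sketch.

```python
from collections import defaultdict


def delivery_groups(can_go):
    """Group (file, volume, size) entries of the "can go" list by tape volume."""
    groups = defaultdict(list)
    for name, volume, size in can_go:
        groups[volume].append((name, size))
    return dict(groups)


def plan_delivery(group, free_space, disposable):
    """Return the disposable files to erase so that `group` fits, or None.

    `group` is a list of (file, size); `disposable` is a list of
    (file, size) in the order in which the files may be erased.
    """
    needed = sum(size for _, size in group) - free_space
    to_erase = []
    for name, size in disposable:
        if needed <= 0:
            break
        to_erase.append(name)
        needed -= size
    return to_erase if needed <= 0 else None


can_go = [("f1", "T1", 40), ("f2", "T2", 10), ("f3", "T1", 30)]
groups = delivery_groups(can_go)
```

A result of None means the group cannot be delivered yet; under disk pressure the caller could then retry with a smaller group, down to a single file.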
When a stager notifies the CM of a successful file retrieval, the file becomes a "cached file" and is served to the projects associated with it. The newly cached file is marked as being in use, and its new location is added to the database. Each project then serves the file to its consumers in the usual way; when all the consumers are done, the project releases the file by calling the CM. It is important for the CM to be able to limit the time a project takes to process a file, much as projects themselves impose time limits on their consumers.
Finally, when all the projects have released a file, the file is added to the "disposable" list (see above) and the CM reviews its chances of delivering the next group, at which point the file may be erased. Exactly which disposable files are selected for erasure is irrelevant for this document; what is important is that the CM possesses enough information about file accesses (both past and near-future) to execute some intelligent generalization of LRU or another cache algorithm (see the section on persistent variables). When a file is erased from disk, its associated location is erased from the database.
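The life of a cached file between delivery and disposal can be sketched as below. The class CachedFile and its methods are hypothetical, and times are plain numbers supplied by the caller so that the example stays deterministic; how the CM would react to an overdue project is not specified here.

```python
class CachedFile:
    """A delivered file, in use until every associated project releases it."""

    def __init__(self, name, projects, delivered_at):
        self.name = name
        # project name -> time at which the file was served to it
        self.in_use_by = {p: delivered_at for p in projects}

    def is_disposable(self):
        return not self.in_use_by

    def release(self, project):
        """Called by a project when all its consumers are done.

        Returns True if the file has just become disposable.
        """
        self.in_use_by.pop(project, None)
        return self.is_disposable()

    def overdue(self, now, limit):
        """Projects that have held the file longer than `limit`."""
        return sorted(p for p, t in self.in_use_by.items() if now - t > limit)
```

Only when release returns True would the CM move the file to the "disposable" list and reconsider the next delivery group.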
The station Cache Manager is required to support the notion of a locked (AKA pinned) file, i.e., a file that has been marked as "unerasable" until further notice. We will assume that any cached file (whether in use or disposable) may be locked on disk by a user with sufficient privileges. Clearly, uncontrolled use of this facility will incapacitate the CM by eventually locking all the files, leaving effectively no free space on disk and preventing any intelligent cache algorithm from working. Therefore, the locking of files is intended primarily for specific kinds of data (such as Thumbnail or calibration) and for use by group administrators only.
Locked files (and the space they occupy) are effectively excluded from the disk management algorithms above. It is critical, however, that locked files, like any other files on disk, are subject to full access history monitoring. This access history will be provided to the administrators for their viewing pleasure (well, actually to facilitate decisions to change the contents of the locked area).
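A small sketch of how locked files might be kept out of erasure decisions while still being monitored. All names are hypothetical, and "locked" is kept separate from "in use" as an assumption of this sketch.

```python
class DiskFile:
    """A file on the station disk (hypothetical record)."""

    def __init__(self, name, size):
        self.name = name
        self.size = size
        self.locked = False   # pinned by an administrator
        self.in_use = False   # currently held by at least one project
        self.accesses = []    # access times, recorded for locked files too


def record_access(f, when):
    # Access history is kept regardless of the locked flag.
    f.accesses.append(when)


def erasable(files):
    """Names of the files the replacement algorithm is allowed to consider."""
    return [f.name for f in files if not f.locked and not f.in_use]


def locked_space(files):
    """Total disk space currently occupied by locked files."""
    return sum(f.size for f in files if f.locked)
```

A figure such as locked_space, together with the per-file access lists, is the kind of information an administrator would need when deciding what to unlock.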
It's obvious, but the status of a cached file as locked is maintained in the database. I agree with Rich's comment about locked and "in use" being different (at least that is what I think he said; I have not reread his comments yet).
Do you want to have a maximum amount of disk that can contain locked files?
Station configuration is the set of parameters to be controlled by system and group administrators. The number of parameters should be neither too small (lest administrators think that SAM is too simplistic or that they don't have enough control) nor too large (lest administrators get too confused). These parameters fall into approximately three categories:
Note: allocations of some global resources, such as tape drives or tape mounts per hour, to a station will likely not be a part of that station configuration; rather, those will define the configuration of the global resource manager (optimizer).
Example activities of administrators changing these parameters include:
The station master is a permanent, "stateful" server; it must therefore store its state persistently in order to recover from software failures and system reboots. Upon startup, the station master reads its state from the database using the interface with the database server. That interface is of course driven by what constitutes the state of the station.
In this section, we present the required DB support for the proposed design. It is not the purpose of this document to decide the exact table organization in the database; together with the other project developers, we have ample expertise to do so. Instead, we intend to define which variables must be made persistent.
The quasi-permanent configuration-related variables are based on the following entities and relationships:
The more dynamic objects that are created by the station itself will require the following entities to be added to the database:
We hereby suggest that the remaining information could then be derived from these tables upon station startup. For example, the access history for a particular file is based on the already existing analysis_projects table and analyzed_files table.
The DB server interfaces should be such that they allow storage and retrieval of the above station variables. In addition, the interfaces for recording significant events, which already cover project begin/end, should be extended to incorporate file delivery/erasure.
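As an illustration of why recording delivery and erasure events is useful, the sketch below replays such events to recover which files are on disk. The event names "delivered" and "erased", the tuple layout, and the function name are hypothetical; the real DB server interface and table layout are intentionally left open by this document.

```python
def cached_files_from_events(events):
    """Rebuild the set of files on disk by replaying events in time order.

    `events` is an iterable of (kind, file_name) pairs.
    """
    on_disk = set()
    for kind, name in events:
        if kind == "delivered":
            on_disk.add(name)
        elif kind == "erased":
            on_disk.discard(name)
        # other event kinds (e.g. project begin/end) are ignored here
    return on_disk


events = [
    ("delivered", "f1"),
    ("project_begin", "p1"),
    ("delivered", "f2"),
    ("erased", "f1"),
]
```

In this example only f2 remains on disk after the replay.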
In this section we attempt to predict the change in the "look and feel" of SAM, i.e., give the flavor of the new commands and outline the benefits for the end users (aside from the performance increase due to extensive caching of files). With the introduction of the SAM station, and from that time on, a clear distinction will be made between administrators and end users. Almost all of the new commands/tools will be for use by administrators for configuring and restarting the station.
Typical command lines for configuration will feel like:
sam add disk --disk=/sam/cache1 --size=1000000 --station=d0mino
sam increase allocation --group=mcc99 --disk=/sam/cache1 --size=200000
Typical administrative command to lock a file on disk:
sam lock --file=sim.pmc02_01.pythia.zhbbmet_mb1.1av_200evts.292_1753
Actually, I like the word "pin" rather than "lock", as "lock" has lots of other meanings.
(This command may involve physical moving of the file.)
As for the end users, the major benefit will be relief from explicit buffer allocation/cleanup for their projects. The sam start project command (or its successor) will be a request to the station, rather than an action that physically starts the project master; therefore, the command may fail if the station rejects the job. Furthermore, as we work towards integration with the batch system, we will speak more frequently of a user job and less frequently of a project. A single consumer project is a part of the user job, which essentially entails (1) starting a project, (2) running an analysis program, and (3) stopping the project. We are leaning toward a single command such as one of the following:
sam run XXX.py <params
sam submit XXX.py <params
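The three steps a single command would wrap can be sketched as a context manager, so that the project is stopped even if the analysis program fails and so that a rejection by the station surfaces before anything runs. Everything here (the names StationRejected, project, run_job, and the boolean standing in for the station's decision) is hypothetical.

```python
from contextlib import contextmanager


class StationRejected(Exception):
    """Raised when the station refuses to start the job."""


@contextmanager
def project(station_accepts, log):
    # Step 1: starting a project is a request that the station may reject.
    if not station_accepts:
        raise StationRejected("station rejected the job")
    log.append("start project")
    try:
        yield
    finally:
        # Step 3: the project is stopped even if the analysis raised.
        log.append("stop project")


def run_job(analysis, station_accepts=True):
    """Start a project, run the analysis callable, stop the project."""
    log = []
    with project(station_accepts, log):
        log.append("run analysis")   # Step 2
        analysis()
    return log
```

Calling run_job with a no-op analysis yields the three steps in order; with station_accepts=False it raises StationRejected before any step is logged.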
Users will have to deal with SAM-imposed resource restrictions, such as limits on disk/ATL usage. We are excited to see how we can, by (seemingly) creating problems for every particular individual, brighten the life of the Collaboration as a whole!
=============================================================================
Project : SAM
Package : sam_doc
$Id: ruth-station.html,v 1.1 1999/11/17 00:16:24 terekhov Exp $
This work is part of a development project, called SAM, which consists of a
number of coordinated packages each named sam_xxxx .
Notice of authorship, copyright status, and terms and conditions, should
the software eventually become available for use outside Fermilab, can be
found in the README and LICENCE files in the top level directory of the main
sam package.
==============================================================================