Storing Data Files with SAM

 

1. Introduction

Storing a file into SAM involves physically copying the data to a new location and recording descriptive information about the file in the SAM catalog. Files can be stored into the SAM system in several contexts; the currently supported types of file storage are:
  1. Files produced within a project,
  2. Monte Carlo files produced locally or remotely,
  3. RAW data files produced by the online data logger.
Files can be stored to a local disk, to a disk on a remote machine, or to a robotically mounted tape. The SAM file storage system requires that the file storage server (FSS) be running for the system on which you are performing the store operation, and that a running stager server be associated with it. However, on the most widely used systems, such as d02ka and d0mino, the required servers are established and maintained by SAM. The setup descriptions that follow include information about starting and using all possible servers and are provided for general reference.

An alternative method for adding information about files to the SAM catalog is described in the sam declare documentation. This method is used, for example,  when files are brought to the system on exchange media and added directly to the tape library without an intermediate disk staging operation.

2. Initial setup


To use SAM, the user needs to do some basic UPS setups first:

$ setup n32 #IRIX only
$ setup sam
$ setenv SAM_STATION <station_name>

On d0mino the <station_name> is "central-analysis", and this is set up automatically when you run setup sam.
To verify that the FSS is running and operational, the user can type:

$sam dump fss

to see the state of the server, including  its known stagers and file store requests. Initially, the server is blank.  A typical user should never need to start the FSS or stagers. However, in certain cases this information is needed and is provided in a later section of this document.
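
Scripts that wrap the commands in this document may want to confirm first that SAM_STATION is set. The following is a minimal Python sketch of such a check; the function name require_station is hypothetical and is not part of SAM.

```python
import os


def require_station(env=None):
    """Return the SAM station name, or raise if SAM_STATION is unset or empty.

    `env` defaults to the real process environment; passing a plain dict
    makes the check easy to try out without touching the environment.
    """
    env = os.environ if env is None else env
    station = env.get("SAM_STATION", "").strip()
    if not station:
        raise RuntimeError("SAM_STATION is not set; run 'setup sam' and set it first")
    return station
```

On d0mino, where the station is set up automatically, the function would be expected to return "central-analysis".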
 

3. Storing Files

3.1 The Store Command


The general command for storing a file or files using FSS is:

$sam store --descrip=<descrip_file> --source=<source_dir> [--dest=<dest_location>] [--timeout=<mins>] [--resubmit] [--copy=<n>] [--create-rm-file]

This call returns 0 if all files were stored to the destination and the database without error and non-0 otherwise.

Here descrip_file is a description of the file(s), explained below; source_dir is the full path to the local directory containing the data file(s); and dest_location is the destination location, either an ENSTORE location beginning with "/pnfs" or a remote disk location of the form <node>:<remote_disk_dir>. If no --dest is specified, a destination is derived from the information in the description file and compared to the list of valid destinations stored in the database. Use the --dest flag only if you are an expert. The cpid, the ID of the process that created the file, can be included in the description file but is not required for certain applications, such as adding Monte Carlo files. For additional details about this parameter, see how to run projects, or the data logger example later in this document. This operation executes synchronously, i.e., it returns to the user's command prompt upon completion of the request or after the time-out period, if one is specified. A timeout value of zero (equivalent to --asynchronous) causes the command to exit immediately without waiting for the store operation to complete.
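
When the store is driven from a script, the command line and its return status can be handled as in the following Python sketch. Only flags named in the synopsis above are used; the helper names build_store_command and run_store are hypothetical and are not part of SAM.

```python
import subprocess


def build_store_command(descrip, source, dest=None, timeout_mins=None,
                        resubmit=False):
    """Assemble the argument list for 'sam store' from the documented flags."""
    cmd = ["sam", "store", "--descrip=%s" % descrip, "--source=%s" % source]
    if dest is not None:
        # Experts only: normally SAM derives the destination itself.
        cmd.append("--dest=%s" % dest)
    if timeout_mins is not None:
        # A timeout of zero makes the command return immediately.
        cmd.append("--timeout=%d" % timeout_mins)
    if resubmit:
        # Disables some error checking; not meant as a default.
        cmd.append("--resubmit")
    return cmd


def run_store(cmd):
    """Run the command; True means exit status 0 (all files stored)."""
    return subprocess.call(cmd) == 0
```

A caller would typically pass the result of build_store_command to run_store and treat a False result as a failed store.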

The descrip_file parameter is the name of the Python file containing the description of the file being stored. If you choose to add the .py suffix to the filename, this must also be used in the flag. You can also specify the complete path to the description file.

The user is encouraged to view newly imported files in the database by using the SAM Data Browsing web pages.

In the case of errors, the user can pass the --resubmit flag and SAM will attempt to fix what went wrong. It is very important not to use this as your default submission, as it turns off a number of SAM's error-checking mechanisms.

The --storeagain flag allows the duplicate storage of files. As with --resubmit, this disables some error checking in the system (i.e., those checks which verify that the file hasn't been stored before) and should only be used when you are trying to store a file for a second or subsequent time.

The --create-rm-file flag tells SAM to create a file named rmfile in the current directory with the commands needed to delete the files that were successfully stored. The SAM system will not remove the files for you; it only creates this file. It is the responsibility of the user to make certain that they are ready to delete the files before executing this output file. You can run sam store several times until all files are in the system, and SAM will keep appending successful transfers to the file.
 

3.2 Inquiring the Status

At any time during the "sam store" command execution, one can obtain a verbose status dump of the store request:

$sam get file store status <file_name>

Here the <file_name> is the name of the file being stored rather than that of the meta-data file.

3.3 The Description File

The description file includes all of the information needed to catalog the data being stored. In general, this includes the filename, size in kbytes (number of bytes/1024), first event, last event, number of events, data tier, start time, end time, and parentage information. The content and format differ slightly among the various types of data, reflecting the nature of the data. Below are examples of the description files needed to add Reconstructed, Monte Carlo, and data logger files. Any number of files can be described in each description file, but all of the physical files need to be in the same source directory for the store to work correctly.
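
The size in kbytes can be computed from the file on disk. The following Python sketch shows one way to do so. The text above only gives the ratio bytes/1024, so rounding a partial kbyte upward is an assumption made here, and both function names are hypothetical.

```python
import os


def size_in_kbytes(num_bytes):
    """Convert a byte count to kbytes (bytes/1024).

    Assumption: a partial kbyte is rounded up; the document does not
    state how fractional values should be handled.
    """
    return (num_bytes + 1023) // 1024


def file_size_in_kbytes(path):
    """Return the size of the file at `path` in kbytes."""
    return size_in_kbytes(os.path.getsize(path))
```

A file of 12641280 bytes gives 12345, the sizeK value used for fifo.ace in the project example.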

These description files are, in fact, Python scripts and use Python objects to record information about each file. Before such a file is sent to SAM, it is recommended that basic syntax checking be done by running the description file through the Python interpreter, e.g.,

$python -c "import descrip_file" && echo "My file is good"

If the message "My file is good" doesn't appear, there is a problem with descrip_file. This is not an acid test of whether your data will be accepted by the system; it only tests the format of the description file. The user needn't know anything about Python to be able to submit file store requests, so the example files below should be viewed merely as templates. If you are not familiar with Python, just follow the punctuation and the indentation of the examples. Please note the time format for start_time and end_time.
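
A syntax-only check can also be written in Python itself, as sketched below. Unlike the import shown above, it parses the file without executing it, so it does not need import_classes to be available and cannot catch anything beyond syntax. The function name is hypothetical.

```python
def check_description_syntax(path):
    """Return None if the file parses as Python, else a short error message."""
    with open(path) as handle:
        source = handle.read()
    try:
        # compile() only parses the text; nothing in the file is executed.
        compile(source, path, "exec")
    except SyntaxError as err:
        return "%s, line %s: %s" % (path, err.lineno, err.msg)
    return None
```

Run the check with the same Python version that SAM uses. The data logger example in this document, for instance, uses long-integer literals with an L suffix, which only Python 2 accepts.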
 

4. Store Examples

4.1  Project Files


Assume the user has created a description file called project_metadata.py in her current directory /home/samuser/tmp, with exactly the contents of the sample description file shown later in this section, and that she has a data file fifo.ace of size 12345 KBytes in the directory /home/samuser/outbuffer. Further, assume that the user is running a consumer process in a project and the process ID is 100. The user can then type:

$sam store --descrip=project_metadata.py --source=/home/samuser/outbuffer --dest=/pnfs/samson/NULL

This example has deliberately chosen a non-existent (as far as the SAM database is concerned) file ssh-kurino as the parent, among other things. There should be a clear error message. While the file is being stored, the user can view the status of this (and all other) request via

$sam dump fss

The sam store command can also be executed in the background by using the usual & symbol at the end of the command line, allowing the user to continue the job and perhaps submit more requests. The FSS is capable of handling any number of submitted requests in parallel (up to a reasonable limit determined by system resources). The fulfillment of requests is regulated by SAM resource management mechanisms, primarily by the optimizer.
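
Submitting several requests at once from a script follows the same idea as the & symbol. The Python sketch below starts every command before waiting for any of them and then collects the exit codes. The two demo commands are stand-ins that simply exit with a chosen status so that the sketch runs without SAM; in real use each entry would be a sam store argument list. The function name is hypothetical.

```python
import subprocess
import sys


def run_in_parallel(commands):
    """Start every command at once, then wait; return exit codes in order."""
    processes = [subprocess.Popen(cmd) for cmd in commands]
    return [proc.wait() for proc in processes]


# Stand-in commands: one succeeds, one fails with status 3.
demo = [
    [sys.executable, "-c", "import sys; sys.exit(0)"],
    [sys.executable, "-c", "import sys; sys.exit(3)"],
]
exit_codes = run_in_parallel(demo)

# A non-zero status marks a request that ended in error.
failed = [cmd for cmd, code in zip(demo, exit_codes) if code != 0]
```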

Once a request has been submitted, it cannot be re-submitted or canceled unless the request has entered the error state, of which the submitting process will be notified (i.e., the sam store command returns with a non-zero status and a brief message). Thus, users are advised to catalogue and store files responsibly. Although SAM will make every effort to verify the files' description and accept only files that are consistent with the supporting data in the SAM database, it cannot ensure that the description is 100% correct.

The format of the description file  for reconstructed data is exemplified below.

# A sample description file for SAM store
from import_classes import *

TheFile = ReconstructedFile(name='fifo.ace', sizeK=12345,
                            events=Events(1, 100, 70),
                            tier='reconstructed',
                            start_time='08/01/1998 17:00:00',
                            end_time='08/01/1998 18:00:00',
                            parent_name='ssh-kurino',
                            pid=656)

In terms of the Python programming language, the file contains the instantiation of a ReconstructedFile object. (This and other relevant classes are defined in the import_processed.py file under the sam_user package.) From the user's perspective, the file describes the following attributes of the file being stored (fifo.ace): the name, the size in kilobytes, the event information (first and last event numbers, and the number of events in the file), the data tier for the file (in this example, the generic reconstructed tier is used; in the future, a more specific tier such as EDU250 must be supplied), the start and end times for file creation, the name of the parent file, and the Consumer Process ID of the creator. There must be exactly one parent for the ReconstructedFile class.
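
A description file of this form can be generated rather than typed by hand. The Python sketch below renders the text for one reconstructed file and formats the times in the style of the example. The month/day/year order is inferred from dates such as 08/16/2000 elsewhere in this document; TEMPLATE and render_description are hypothetical names, not part of SAM.

```python
from datetime import datetime

# Month/day/year order is inferred from example dates such as '08/16/2000'.
SAM_TIME_FORMAT = "%m/%d/%Y %H:%M:%S"

TEMPLATE = """# A sample description file for SAM store
from import_classes import *

TheFile = ReconstructedFile(name=%(name)r, sizeK=%(size_k)d,
                            events=Events(%(first)d, %(last)d, %(count)d),
                            tier=%(tier)r,
                            start_time=%(start)r,
                            end_time=%(end)r,
                            parent_name=%(parent)r,
                            pid=%(pid)d)
"""


def render_description(name, size_k, first, last, count, tier,
                       start, end, parent, pid):
    """Return description-file text for one reconstructed file."""
    return TEMPLATE % {
        "name": name, "size_k": size_k,
        "first": first, "last": last, "count": count,
        "tier": tier,
        "start": start.strftime(SAM_TIME_FORMAT),
        "end": end.strftime(SAM_TIME_FORMAT),
        "parent": parent, "pid": pid,
    }


# Reproduces the values of the sample description file above.
text = render_description("fifo.ace", 12345, 1, 100, 70, "reconstructed",
                          datetime(1998, 8, 1, 17, 0, 0),
                          datetime(1998, 8, 1, 18, 0, 0),
                          "ssh-kurino", 656)
```

The resulting text can be written to a file such as project_metadata.py and passed to sam store with --descrip.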
 
 

4.2 Monte Carlo

To store Monte Carlo data files, the consumer process ID is not required and a pseudo run number is generated by the SAM system for each generated (PrimaryMCFile) file. The user does not need to know this number to store or use the data. The store command is employed as shown in the example below.

$sam store --descrip=mc_metadata.py --source=/home/samuser/outbuffer --dest=/pnfs/sam/NULL

In this case, the description file is called mc_metadata.py. Example Monte Carlo description files are shown below, illustrating the format used for Monte Carlo data. Although not shown, more than one file can be stored at a time. These files were generated automatically by the Monte Carlo launching tool called runMCjob, described in the Monte Carlo D0 documentation.
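
The example files that follow form a parentage chain: each processing stage names the output of the previous stage as its parent. The Python sketch below checks such a chain before anything is submitted. It assumes the simulated file is named lees-sam-v2.1-test.d0g, which is the name the digitizer example gives as its parent; the function name is hypothetical and SAM performs its own checks independently of this.

```python
# (file name, parent name) pairs in processing order; the generator
# file has no parent.
CHAIN = [
    ("lees-sam-v2.1-test.gen", None),
    ("lees-sam-v2.1-test.d0g", "lees-sam-v2.1-test.gen"),
    ("lees-sam-v2.1-test.psim", "lees-sam-v2.1-test.d0g"),
    ("lees-sam-v2.1-test.reco", "lees-sam-v2.1-test.psim"),
]


def broken_links(chain):
    """Return (name, parent) pairs whose parent was not described earlier."""
    problems = []
    known = set()
    for name, parent in chain:
        if parent is not None and parent not in known:
            problems.append((name, parent))
        known.add(name)
    return problems
```

An empty result means every parent named in the chain was itself described earlier in the chain.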
 

Example of Generator description file:

 

 

from import_classes import *
#
# Generated by runMCwin
#
my_generator = AppFamily( "generator","psim01.00.01","single" )

class MyProcess(ProcFamily):
    group="mcc99"
    origin_location="FNAL"
    origin_facility="d0mino"
    produced_for="Qizhong Li"
    phase="mcp03"
    def __init__(self, stream, param_file, produced_by):
        self.stream=stream
        self.param_file=param_file
        self.produced_by=produced_by
 
class Generator(MyProcess):
    appfamily=my_generator

channel = Channel("pdgid13","incl")

gen_fil=Generator(stream="notstreamed", \
                  param_file="spw_single_test.params", \
                  produced_by="Greg Graham")
 

gen_fil_import = PrimaryMCFile("lees-sam-v2.1-test.gen",
    gen_fil, 133, Events(1, 10, 10), \
    "08/16/2000 09:38", "08/16/2000 09:38", 2.000, channel)
 

Example of simulation (D0gstar)  description  file:

from import_classes import *

#
# Generated by runMCwin
#
my_d0gstar   = AppFamily( "simulator","pmc03.00.01","d0gstar" )

class MyProcess(ProcFamily):
    group="mcc99"
    origin_location="FNAL"
    origin_facility="d0mino"
    produced_for="Qizhong Li"
    phase="mcp03"
    def __init__(self, stream, param_file, produced_by):
        self.stream=stream
        self.param_file=param_file
        self.produced_by=produced_by

class Simulator(MyProcess):
    appfamily=my_d0gstar

channel = Channel("pdgid13","incl")

d0g_fil=Simulator(stream="notstreamed", \
                  param_file="spw_d0gstar_test.params", \
                  produced_by="Greg Graham")

d0g_file_import = SimulatedFile("lees-sam-v2.1-test.d0g",\
    d0g_fil, 128, Events(1, 10, 10),\
   "08/16/2000 09:40", "08/16/2000 09:40","lees-sam-v2.1-test.gen", 1, 1, channel)
 

Example of digitizer (D0sim)  description  file:

from import_classes import *

#
# Generated by runMCwin
#
my_d0sim   = AppFamily( "digitizer","psim01.00.01","d0sim" )

class MyProcess(ProcFamily):
    group="mcc99"
    origin_location="FNAL"
    origin_facility="d0mino"
    produced_for="Qizhong Li"
    phase="mcp03"
    def __init__(self, stream, param_file, produced_by):
        self.stream=stream
        self.param_file=param_file
        self.produced_by=produced_by
 
class Digitizer(MyProcess):
    appfamily=my_d0sim

channel = Channel("pdgid13","incl")
minbi = MinBias("none","0.0")

dig_fil=Digitizer(stream="notstreamed", \
                  param_file="spw_d0sim_test.params", \
                  produced_by="Greg Graham")

dig_file_import = DigitizedFile("lees-sam-v2.1-test.psim",
    dig_fil, 613, Events(1, 10, 10),
   "08/16/2000 09:44", "08/16/2000 09:44",
    "lees-sam-v2.1-test.d0g", 1, 1, channel, minbi)
 

Example of reconstruction (D0reco)  description  file:

from import_classes import *

#
# Generated by runMCwin
#
my_reco   = AppFamily( "reconstruction","preco04.00.02","d0reco" )

class MyProcess(ProcFamily):
    group="HiT"
    origin_location="FNAL"
    origin_facility="d0mino"
    produced_for="Qizhong Li"
    phase="mcp03"
    def __init__(self, stream, param_file, produced_by):
        self.stream=stream
        self.param_file=param_file
        self.produced_by=produced_by
 
class Reconstruction(MyProcess):
    appfamily=my_reco

channel = Channel("pdgid13","incl")
minbias = MinBias("none","0.0")

rec_fil=Reconstruction(stream="notstreamed", \
                       param_file="spw_d0reco_test.params", \
                       produced_by="Greg Graham")

rec_file_import = ReconstructedMCFile("lees-sam-v2.1-test.reco",
    rec_fil, 671, Events(1, 10, 10),
   "08/16/2000 09:50", "08/16/2000 09:50",
    "lees-sam-v2.1-test.psim", 1, 1,channel,minbias)
 

4.3 Data Logger

Storing data generated by the online system has a few minor differences but shares the basic elements of the other modes of storing. As in the other cases, it is assumed that an FSS and associated stager are active on the system, and that the SAM_STATION environment variable is set. In order to store data from the online data logger, additional information must be supplied. First, a process is established, allowing the definition of an application family and version. The start time will be acquired from the system unless the optional start-time parameter is used.

sam establish online process --appfamily=datalogger --version=<data_logger_version> [--start-time=<process_starting_time>]

This command returns a process ID, pid, which is needed to properly identify subsequent entries in the database and is needed when the process is ended.   Next, a  run is established with the sam establish run command.

sam establish run --number=<run_number> --type=<run_type> --cme=<center_of_mass_energy> --start-time=<run_starting_time>

This command returns the run_id, which is needed in the data description file and when the run is ended. The file is stored using the store command.

sam store file --descrip=<description_file> --source=<source_directory> --dest=<destination_directory> [--keep-description]

The keep-description qualifier allows one to store the meta-data to the database even if the physical file transfer fails. Finally, at the end of each run, the end run command is issued.

sam end run --end-time=<run_ending_time> --runID=<run_id>

To finish the process, possibly at the end of each run, the end online process command is used.

sam end online process --pid=<process_id> [--end-time=<process_ending_time>]

If the end-time parameter is not supplied it will be obtained from the local system clock.
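
The order of the five data logger commands can be summarized in a short Python sketch that only builds the argument lists. The process ID and run ID come back from the establish commands; since this document does not describe the format in which they are returned, they are left as placeholders by default. The function name and the sample values are hypothetical.

```python
def datalogger_commands(version, run_number, run_type, cme, run_start,
                        descrip, source, dest, run_end,
                        pid="<process_id>", run_id="<run_id>"):
    """Return the data logger command lines in the order they are issued."""
    return [
        ["sam", "establish", "online", "process",
         "--appfamily=datalogger", "--version=%s" % version],
        ["sam", "establish", "run", "--number=%s" % run_number,
         "--type=%s" % run_type, "--cme=%s" % cme,
         "--start-time=%s" % run_start],
        ["sam", "store", "file", "--descrip=%s" % descrip,
         "--source=%s" % source, "--dest=%s" % dest],
        ["sam", "end", "run", "--end-time=%s" % run_end,
         "--runID=%s" % run_id],
        ["sam", "end", "online", "process", "--pid=%s" % pid],
    ]


# Hypothetical values, for illustration only.
cmds = datalogger_commands(version="v1", run_number=12346,
                           run_type="physics", cme=1960,
                           run_start="09/09/1999 17:42:25",
                           descrip="raw_metadata.py", source="/data/out",
                           dest="/pnfs/sam/NULL",
                           run_end="09/09/1999 17:44:26")
```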

The data logger description file is more complete than the others, since for each file to be added to the SAM database there is an event list. The event list contains the event number, level 1, level 2, level 3, and luminosity block information for each event. This format is generated by the data logger and stored by SAM in a fashion similar to other transfers described above.

from import_classes import *

TheEventList = [
    RawEvent(        ev_num = 4,        lum_block = 911,
        level_1 = 0x41C6967E2781C46BF94B95FBD9E29CFBL,
        level_2 = 0xBF540FF60ABD31DF237CAF1C7DE1C487E201D2BFE231E3DEE9569372500F2847L,
        level_3 = 0x2C6775664287B3594DAAE488F73CEF59EEEA5656E113CA7B31D2AD8599A169D8L),
    RawEvent(        ev_num = 5,        lum_block = 911,
        level_1 = 0xB53C3B547D55102F1B377AAEDE65B45BL,
        level_2 = 0xE3DA61027A79839828CCE0E39F1A4B76858EFA5F28D98799388F751F493F8F36L,
        level_3 = 0x48EE2043BF781E4D3D0D33FAEFBE36A6ADDA30E40586148EC2DC59290C6DB34EL),
    RawEvent( ev_num = 6,        lum_block = 911,
        level_1 = 0x62FF9F56ABE11D70A620A6FBF18F84B1L,
        level_2 = 0xD35833055690DDC5F8091DDCEB537BCDE3AAB73B5648E799145223D31152EE9DL,
        level_3 = 0xE0061A9F11EAA5B5E6C21C06C813DB989949FEB22001371E60AC6E32F288FD31L),

        ...

  RawEvent(        ev_num = 4000,        lum_block = 2000,
        level_1 = 0xB53C3B547D55102F1B377AAEDE65B45BL,
        level_2 = 0xE3DA61027A79839828CCE0E39F1A4B76858EFA5F28D98799388F751F493F8F36L,
        level_3 = 0x48EE2043BF781E4D3D0D33FAEFBE36A6ADDA30E40586148EC2DC59290C6DB34EL)]

TheRawDataFile = RawDataFile(name = 'STREAM-000_0000012346_001.raw',stream = 'STREAM-000',
    part_nr = 1,start_time = '09/09/1999 17:42:25',end_time ='09/09/1999 17:44:26',sizeK=0,
    lum_min = 911,lum_max = 911,ev_min = 4,ev_max = 63,ev_list = TheEventList,
    pid = 15042,run_id = 102939)
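
The RawDataFile entry carries ev_min, ev_max, lum_min, and lum_max alongside the event list. Assuming these are meant to be the extremes over the listed events (this document does not say so explicitly), they can be derived as in the following Python sketch. Plain (ev_num, lum_block) pairs stand in for RawEvent objects, and the function name is hypothetical.

```python
def summarize_events(events):
    """Return the summary fields for a list of (ev_num, lum_block) pairs."""
    ev_nums = [ev_num for ev_num, _ in events]
    lum_blocks = [lum_block for _, lum_block in events]
    return {
        "ev_min": min(ev_nums),
        "ev_max": max(ev_nums),
        "lum_min": min(lum_blocks),
        "lum_max": max(lum_blocks),
        "count": len(events),
    }


# The first three events of the example above.
summary = summarize_events([(4, 911), (5, 911), (6, 911)])
```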
 

5. Starting FSS and stagers


NB: The information in this section is for experts only. Do not, under any circumstances, start your own FSS or stager on machines where a SAM-supported FSS or stager is already running.

On a properly configured station,  a server called FSS (File Store Server),  runs and  manages user requests to store processed files.  The following information is not needed by general users, but may be required in special cases, e.g. on the reconstruction farm.   The server's CORBA name is /SAMStations/<station_name>/FSS:Sewer. If the server is not running, which may be the case for a farm station, it must be started with the following command:

$sam start fss [--quiet|--verbose]

On a farm, starting the FSS typically occurs in the beginning section of the job. The above command assumes that the SAM_STATION environment variable names a valid station. See the list of valid SAM stations available on the Quickie Query Lists in the SAM Data Browsing web pages. For developers: use --opter-suffix=devel in the development environment (created by setup sam -q dev) to communicate with the development, rather than production, optimizer.

The processed files that the user wishes to store with SAM must reside on a node that is a part of the station. More precisely, the files to be stored must be on a file system that is managed or at least read-accessible by the station. Thus, there has to be at least one stager running at the node of original submission. Again, a properly configured station has stagers running at all of its component nodes. If, however, no stager known to the FSS is running at the node, as may be the case for a farm station, a stager has to be started as follows:

$sam start stager [--quiet|--verbose] [--rtfile=<pid_file>]

On a farm, starting the stager typically occurs in the worker-node script. If there is already a stager running at the node, such as a stager used to deliver input files, that stager can be used for storing output files as well. Next, a running stager must be connected to the FSS via:

$sam add stager --pid=<pid> --fss=FSS

Here the pid of the affected stager may be extracted from the return file (<pid_file>) of the previous command. This command merely connects the stager to the FSS; it does not create a new stager. If a stager has to be re-started for any reason, it must be re-connected to the FSS afterwards. Note: in the next version of SAM, when stations are configured and have station masters running permanently, the above steps of starting the FSS and/or stagers will no longer be needed.
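
For expert use on a farm, the stager steps can be sketched in Python as below. The sketch only builds the argument lists and reads the pid back. The layout of the return file is not described in this document, so reading its first whitespace-separated token as the pid is an assumption, and all three function names are hypothetical.

```python
def start_stager_command(pid_file):
    """Argument list that starts a stager and records its pid in pid_file."""
    return ["sam", "start", "stager", "--quiet", "--rtfile=%s" % pid_file]


def read_stager_pid(pid_file):
    """Read the stager pid from the return file.

    Assumption: the first whitespace-separated token in the file is the pid.
    """
    with open(pid_file) as handle:
        return int(handle.read().split()[0])


def add_stager_command(pid):
    """Argument list that connects an already running stager to the FSS."""
    return ["sam", "add", "stager", "--pid=%d" % pid, "--fss=FSS"]
```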
 
 

6. Troubleshooting Possible Error Conditions


Aside from configuration problems such as not finding the right station, stager, optimizer, etc., which should concern the SAM administrator rather than the user, the following error conditions may occur.

Problem: FSS server is not running
Message: CORBA Exception, server is probably dead (Minor: 0 Completed: COMPLETED_NO)
Solution: Contact sam-design and have server(s) restarted.

Problem: Missing description file
Message: No module named foo, where foo appears as the --descrip value.
Solution: Ensure  that there is a file foo.py in your current directory.

Problem: Syntax errors in the description file.
Message: description file is not a valid Python file.
Solution: The Python interpreter will hopefully describe the nature of the error. Fix the description file format.

Problem: Invalid data tier, parent file name, etc.
Message: These are semantic errors in the description file found by the SAM database.
Solution: Check the spelling of the data tier, file names, etc. If no problems are found, check the SAM browser for correct options. Contact sam-design if new options are required.

Problem:  Invalid source location. The source directory containing the data file is invalid or inaccessible to SAM.
Message:  stream of python error messages
Solution:  Check spelling and location.

Problem:  File delivery problems or Invalid destination.
Message: stream of python error messages
Solution: These errors are reported by the mover agent, such as encp for disk-to-Enstore transfers, or rcp. Check for valid locations using the browser. If a new location is required, contact sam-design.

Problem: Invalid location is reported by the SAM database at the last step of storing the file when the FSS attempts to store the new location of the file.
Message: Invalid location
Solution: All valid file locations must be known to SAM; if a location is not known and it is not the user's mistake, contact sam-design.
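
For scripted use, the messages listed above can be matched against the output of a failed command, as in the following Python sketch. The marker strings are taken from the messages in this section; the table and function names are hypothetical, and matching on substrings is an assumption about how stable the message wording is.

```python
# (marker text, suggested action) pairs, in the order listed above.
KNOWN_ERRORS = [
    ("CORBA Exception",
     "FSS server is not running: contact sam-design to have it restarted."),
    ("No module named",
     "Missing description file: check that the --descrip file exists."),
    ("not a valid Python file",
     "Syntax error in the description file: fix the file format."),
    ("Invalid location",
     "Location unknown to SAM: check it, then contact sam-design."),
]


def diagnose(output):
    """Return the suggested action for the first known marker found, or None."""
    for marker, advice in KNOWN_ERRORS:
        if marker in output:
            return advice
    return None
```

Output that matches none of the markers, such as a stream of Python error messages, still has to be read by hand.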
 

=============================================================================
Project : SAM
Package : sam
$Id: SamStore.html,v 1.24 2005/04/15 19:21:45 lauri Exp $

This work is part of a development project, called SAM, which consists of a
number of coordinated packages each named sam_xxxx .

Notice of authorship, copyright status, and terms and conditions, should
the software eventually become available for use outside Fermilab, can be
found in the README and LICENCE files in the top level directory of the main
sam package.

==============================================================================