Using SAM within D0 Framework

1. Introduction

The D0 Framework provides a straightforward interface to the SAM system. The package that allows any Framework application to interact with SAM is called SAMManager. Its main purpose is to provide the application with the location of the next input file by communicating with the SAM Project Master. It also determines the location of the output files created by the application (if the output buffer space is managed by SAM) and generates metadata for those files. If an output file is to be stored and catalogued with SAM in real time, SAMManager initiates the file store once the file is closed.
 

2. Getting started: running an interactive demo

The easiest way to get familiar with SAMManager is to see an example of a working application. This can be accomplished by logging into the D0 Central Analysis Server (d0mino) and following these steps in one of the available scratch areas:
 
flotsam-clued0> setup d0cvs #where to get code
flotsam-clued0> setup D0RunII t04.20.00 #where to find test executable
flotsam-clued0> cd scratchdir #go to your scratch area
flotsam-clued0> cvs checkout sam_manager #check out SAMManager
flotsam-clued0> cd sam_manager/test/demo  #go to the demo directory
flotsam-clued0> submit2sam_interactive #run the demo program

After running the above commands, one should see output similar to the one given here. Note that the same directory contains several other demo scripts that use the sam submit command: submit2sam, submit2sam_script and submit2sam_parallel_script. All of these submit the SAM demo job to the batch system: the first submits it as a Framework executable, while the other two submit it as the user-provided scripts runsam_batch (for running a single user process) and runsam_parallel_batch (for running multiple user processes in parallel).

3. Understanding the example

The only SAM command used in submit2sam scripts (sam submit) is described in the document on running SAM projects. In the following we describe the framework-related SAM issues.

The test executable SAMTest uses SAMManager to retrieve an input file, reads and dumps all events using the ReadEvent and DumpEvent packages, and writes them into an output file using the WriteEvent package. When the output file is closed, SAMManager catalogues it in the SAM database and initiates its storage.

All of the packages used in SAMTest are listed in the top-level RCP file SAMTest.rcp:

// ----------------------------------------------------------------
// SAMTest.rcp
// ----------------------------------------------------------------
// Required by the framework.
string PackageName = "Controller"
string Packages = "mgr read dump write"
// ----------------------------------------------------------------
// One of these for every item in the above Packages variable,
// and each must refer to an RCP file.
RCP mgr = <sam_manager SAMManager>
RCP read = <io_packages ReadEvent>
RCP write = <io_packages WriteEvent>
RCP dump = <io_packages DumpEvent>
// ----------------------------------------------------------------
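
The rule stated in the listing's comments (every name in Packages needs a matching RCP line) can be checked mechanically. The Python sketch below is only an illustration: it relies on a simplified regular-expression reading of the two line shapes shown above, not on the real RCP parser, and the helper name check_packages is hypothetical.

```python
import re

# Simplified patterns for the two line shapes seen in SAMTest.rcp.
# This is NOT the real RCP grammar, only enough for this illustration.
PACKAGES_RE = re.compile(r'^\s*string\s+Packages\s*=\s*"([^"]*)"', re.MULTILINE)
RCP_RE = re.compile(r'^\s*RCP\s+(\w+)\s*=\s*<\s*(\w+)\s+(\w+)\s*>', re.MULTILINE)


def check_packages(rcp_text):
    """Return names listed in Packages that have no RCP line (hypothetical helper)."""
    match = PACKAGES_RE.search(rcp_text)
    if match is None:
        raise ValueError("no Packages line found")
    listed = match.group(1).split()
    # Lines commented out with // do not match RCP_RE and are ignored.
    defined = {name for name, _package, _rcp in RCP_RE.findall(rcp_text)}
    return [name for name in listed if name not in defined]


SAMTEST_RCP = '''
string PackageName = "Controller"
string Packages = "mgr read dump write"
RCP mgr = <sam_manager SAMManager>
RCP read = <io_packages ReadEvent>
RCP write = <io_packages WriteEvent>
RCP dump = <io_packages DumpEvent>
'''

print("missing RCP entries:", check_packages(SAMTEST_RCP))
```

An empty list means that every package named in Packages has a corresponding RCP line.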

The behavior of each individual package is controlled by its RCP file: ReadEvent.rcp, WriteEvent.rcp and DumpEvent.rcp (all located in the io_packages/rcp subdirectory of demo), and SAMManager.rcp (in sam_manager/rcp).

In order to use ReadEvent with SAMManager, its RCP parameter InputFile has to be set to SAMInput:, a pseudo-file which indicates that SAM will be used to dynamically generate the list of input files.

// ----------------------------------------------------------------
// ReadEvent.rcp
// ----------------------------------------------------------------
string PackageName = "ReadEvent"
string InputFile = "SAMInput:"
string InputFormat = ""
int SkipEvents = 0 // Number of events to skip.
int NumEvents = 0 // Maximum number of events to read (0 = all).
bool MakeUnknown = false // Instantiate unknown objects?
bool ExceptUnknown = false // Throw exception for unknown objects?
string ApplicationName = "ReadEvent" // String passed to d0om_init.
string EventOrder = "sequential" // Event order. 
// ----------------------------------------------------------------

Similarly, WriteEvent will work with SAMManager only if its RCP parameter OutputFile begins with SAMOutput: or SAMGenerated:. In the case of SAMOutput:, SAMManager replaces this string with the name of the directory into which the output file is written (currently the working directory), while the remaining file name expanders, such as the standard D0 output file name generator, are responsible for the rest. If SAMGenerated: is used instead, SAMManager generates the name of the output file itself.
 
// ----------------------------------------------------------------
// WriteEvent.rcp
// ----------------------------------------------------------------
string PackageName = "WriteEvent"
//string OutputFile = ("SAMOutput:file1","SAMOutput:file2")
string OutputFile = "SAMGenerated:"
int MaxEventsPerFile = 10  // 0 = no limit.
bool Synchronize = false  // Synchronize file advance with ReadEvent.
int InputFilesPerFile = 1  // Number of input files per output file.
string OutputFormat = "EVPACK"  // "DSPACK", "EVPACK", "D0MSQL", or "ORACLE".
int CompressionLevel = 1  // 0=none, 9=most. Only works with EVPACK format.
bool CopyUnknown = true  // Copy unknown objects when writing?
string ApplicationName = "WriteEvent"  // String passed to d0om_init.
bool UseOutputFilter = false
string WriteClassList = ""  // Space-separated list.
string VetoClassList = ""  // Space-separated list.
string WriteEventTags = "*"  // Space-separated list ("*" = all). 
string VetoEventTags = ""  // Space-separated list. 
string WriteChunkTags = "*"  // Space-separated list ("*" = all). 
string VetoChunkTags = ""  // Space-separated list.
// ----------------------------------------------------------------
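
The difference between the two OutputFile prefixes can be pictured with a small Python sketch. It is not SAMManager's code: the function expand_output_file, the way the directory and the remainder are joined, the placement of a generated name in the same directory, and the sample names are all assumptions made for illustration.

```python
import posixpath

SAM_OUTPUT = "SAMOutput:"
SAM_GENERATED = "SAMGenerated:"


def expand_output_file(spec, output_dir, generated_name):
    """Illustrate the two OutputFile prefixes (hypothetical helper).

    spec           -- value of the OutputFile RCP parameter
    output_dir     -- directory chosen for output (the working directory)
    generated_name -- stand-in for whatever name SAMManager would generate
    """
    if spec.startswith(SAM_GENERATED):
        # The whole file name comes from SAMManager; putting it into
        # output_dir is an assumption of this sketch.
        return posixpath.join(output_dir, generated_name)
    if spec.startswith(SAM_OUTPUT):
        # Only the prefix is replaced by the directory; the remainder is
        # left for the other file name expanders.
        remainder = spec[len(SAM_OUTPUT):]
        return posixpath.join(output_dir, remainder)
    raise ValueError("OutputFile must begin with SAMOutput: or SAMGenerated:")


print(expand_output_file("SAMOutput:file1", "/scratch/work", "unused"))
print(expand_output_file("SAMGenerated:", "/scratch/work", "generated_name"))
```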

The RCP file for the DumpEvent package determines the frequency and the total number of events to dump.
 
// ----------------------------------------------------------------
// DumpEvent.rcp
// ----------------------------------------------------------------
string PackageName = "DumpEvent"
int DumpPeriod = 1
int DumpNumber = 1000
bool DumpCollisionID = true
bool DumpEvent = true
string DumpChunks = ""  // Chunks to dump (empty string = all chunks)
string OutputFile = ""  // Output stream (out(...)) or filename.
// ----------------------------------------------------------------

The only parameter in the SAMManager RCP file that always has to be present is PackageName. If the sam submit command is used, AnalysisProject and Station are passed to the user's executable as environment variables. The user also has to provide ApplicationName, ApplicationVersion and (optionally) WorkingGroup. Valid ApplicationName/ApplicationVersion pairs can be found on the SAM Data Browsing page.

The remaining parameters are all optional: ProcessDescription is the description of the user application, and MaxNumberOfInputFiles determines the maximum number of files the application will process (the default value of 0 means that there is no limit).

StoreOutput determines whether output files are stored, while FileStoreLocation can be used to specify the storage location. If the FileStoreLocation parameter is not used, the storage location is determined by the SAM autodestination server. Note that even if storing of output files is disabled, the output file metadata is generated and printed to the standard output, and is also written to a python file that can be submitted to the SAM File Storage server using the sam store command. SAMManager stores output files asynchronously: it submits a file store request to the File Storage Server and does not wait for callbacks.

OutputFileParentage controls whether output file parentage is determined by opened (default) or closed input files. There are also several parameters used to specify the output file metadata: OutputFileType, OutputFileFormat, OutputFilePhysicsGroup, OutputFileDataStream, and OutputFileDataTier.

The Verbose flag controls the volume of messages sent to the standard output, and LogFile names the SAMManager log file. If for any reason SAMManager has to be disabled, the optional UseSAM flag can be set to 0.

// ---------------------------------------------------------------
// SAMManager.rcp
// ---------------------------------------------------------------

//
// Framework parameters.
//

string PackageName = "SAMManager"

//int UseSAM = 1 // 1 yes (default), 0 no

//
// Parameters for file consumption.
//

//string AnalysisProject = "test"

//string Station = "central-analysis"

//string ProcessDescription = "test process" // default: "D0 Framework Process"

//string WorkingGroup = "test" // default: dzero

string ApplicationName = "test"

string ApplicationVersion = "1"

// int MaxNumberOfInputFiles = 1        // 0 no limit (default) 

//
// Parameters related to output files and their metadata.
//

// int StoreOutput = 1 // 1 yes, 0 no (default)

// If not provided (or given as empty string), files will be stored 
// into the storage location determined by the SAM autodest server
// string FileStoreLocation = "/pnfs/sam/dzero/db4/sam_test/2004/12/21"

string OutputFileDataTier = "reconstructed"  // default: reconstructed-bygroup

string OutputFileDataStream = "notstreamed"  // default: notstreamed

string OutputFilePhysicsGroup = "dzero"   // defaults to WorkingGroup

// Output file parentage mode: 
//   0 opened input files 
//   1 closed input files (default)
int OutputFileParentage = 1       

// Output file format: 
//   0 unknown (default)
//   1 dspack
//   2 evpack
//   3 compressed evpack
int OutputFileFormat = 2       

// Output file type: 
//   0 physics generic (default)
//   1 derived detector
//   2 derived simulated
int OutputFileType = 1       

//
// Debugging.
// 

//int Verbose = 1 // 1 yes, 0 no (default)

//string LogFile = "samManager.log" // default: no log file
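
The effect of MaxNumberOfInputFiles on file delivery can be sketched in Python as follows. This is an illustration under stated assumptions, not the SAMManager implementation: get_next_file is a hypothetical stand-in for the request to the SAM Project Master, assumed to return None when no files remain, and the file paths are invented.

```python
def input_files(get_next_file, max_files=0):
    """Yield input file locations until none remain or the limit is reached.

    get_next_file -- hypothetical callable standing in for the request to the
                     SAM Project Master; returns a location, or None when done
    max_files     -- mirrors MaxNumberOfInputFiles; 0 means no limit
    """
    delivered = 0
    while max_files == 0 or delivered < max_files:
        location = get_next_file()
        if location is None:
            break
        delivered += 1
        yield location


# Stand-in for a project with three files (paths invented for the example).
queue = iter(["/cache/a.evpack", "/cache/b.evpack", "/cache/c.evpack"])


def fake_next():
    return next(queue, None)


for path in input_files(fake_next, max_files=2):
    print("processing", path)
```

With max_files=2 the loop stops after two files even though a third is available; with the default of 0 it runs until the stand-in returns None.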

Some of the parameters controlling SAMManager can also be supplied on the command line, in which case they override parameters given in the RCP file. For example, if one runs the test executable using

SAMTest  -project $SAM_PROJECT -station $SAM_STATION -rcp SAMTest.rcp

then the RCP parameters AnalysisProject and Station would be ignored. If some of the required parameters are not supplied in the SAMManager RCP file or as command line options, SAMManager will attempt to determine them from the corresponding environment variables. All valid RCP parameters that have corresponding command line options and environment variables are listed below.

RCP Parameter        CL Option        Environment Variable
-------------------  ---------------  -----------------------
AnalysisProject      -project         SAM_PROJECT
Station              -station         SAM_STATION
ProcessDescription   -desc
ApplicationName      -app_name        SAM_APPLICATION_NAME
ApplicationVersion   -app_version     SAM_APPLICATION_VERSION
WorkingGroup         -working_group
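
The lookup order described above (command line option first, then the RCP file, then the environment) can be summarized in a short Python sketch. The mapping is copied from the table; the function resolve and the sample values are hypothetical, and the sketch applies the environment fallback to every parameter that has a variable, which is a simplification of the behavior described for required parameters.

```python
# RCP parameter -> (command line option, environment variable or None),
# copied from the table above.
PARAMETERS = {
    "AnalysisProject": ("-project", "SAM_PROJECT"),
    "Station": ("-station", "SAM_STATION"),
    "ProcessDescription": ("-desc", None),
    "ApplicationName": ("-app_name", "SAM_APPLICATION_NAME"),
    "ApplicationVersion": ("-app_version", "SAM_APPLICATION_VERSION"),
    "WorkingGroup": ("-working_group", None),
}


def resolve(name, cl_options, rcp_values, environ):
    """Return (value, source) for one parameter, or (None, None) if unset."""
    option, env_var = PARAMETERS[name]
    if option in cl_options:
        # A command line option overrides the RCP file.
        return cl_options[option], "command line"
    if name in rcp_values:
        return rcp_values[name], "RCP"
    if env_var is not None and env_var in environ:
        # Last resort: the corresponding environment variable.
        return environ[env_var], "environment"
    return None, None


# Sample inputs (values invented for the example).
cl = {"-project": "myproject"}
rcp = {"AnalysisProject": "test", "ApplicationName": "test"}
env = {"SAM_STATION": "central-analysis", "SAM_APPLICATION_VERSION": "1"}

for parameter in PARAMETERS:
    print(parameter, resolve(parameter, cl, rcp, env))
```

Here AnalysisProject is taken from the command line even though the RCP file also sets it, while Station falls through to the environment.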

4. Compiling SAMManager package

Compiling the SAMManager package and the test executable should be straightforward. On flotsam-clued0 the necessary steps are the following:
 
flotsam-clued0> setup d0cvs #where to get code
flotsam-clued0> setup D0RunII t06.00.00 #set environment
flotsam-clued0> cd scratchdir #go to your scratch area
flotsam-clued0> newrel -t t06.00.00 testdir #will create directory testdir
flotsam-clued0> cd testdir #go to testdir
flotsam-clued0> addpkg -h sam_manager #get SAMManager code
flotsam-clued0> gmake #compile the library
flotsam-clued0> gmake sam_manager.test #compile test executable

After compilation, the SAMManager library and the test executable should be located in the lib/$BFARCH and bin/$BFARCH subdirectories of testdir, respectively.
 
 

5. Compiling applications

In order to use SAMManager, user applications should be linked with the SAMManager library (sam_manager), as well as with several other SAM libraries (sam_corba and sam_util) and CORBA libraries (corba_util, OB, JTC and CosNaming). The file RegSAMManager.o should also be included in the list of object files. The example below is the GNUmakefile used to compile the SAMManager test executable:


# Makefile for compiling SAM test executable.

USE_BINLIBS := true

override LOADLIBES += \
  -lsam_manager \
  -lsam_cpp_api \
  -lsam_mis_cpplib \
  -lsam_corba \
  -lsam_util \
  -lsam_station \
  -lsam_mis \
  -lsam_db_srv \
  -lsam_dimension_server \
  -lsam_server_base \
  -lsam_corba_base \
  -lsam_legacy \
  -lcorba_common \
  -lcorba_util \
  -lOB \
  -lJTC \
  -lCosNaming \
  -lio_packages \
  -leventflags \
  -lrun_config_fwk \
  -lstream_ds \
  -ld0om_ds \
  -levpack \
  -ldspack \
  -lframework \
  -lmemutil \
  -ld0stream \
  -lname_translation \
  -lfwkprofiling \
  -lrcp \
  -lptr \
  -ledm \
  -lidentifiers \
  -ld0om \
  -ld0_util \
  -lZMtools \
  -lExceptions \
  -lErrorLogger \
  -lErrLogEx \
  -lZMutility \
  -lZMtimer \
  -lCLHEP \
  -lrelversion \
  -lconfig_base 

include SoftRelTools/arch_spec_f77.mk
include SoftRelTools/arch_spec_zlib.mk

override LDFLAGS += \
  -L$(SAM_CPP_API_DIR)/lib

COMPLEXBIN := SAMTest
COMPLEXBINCPPFILES :=

BINSTANDALONEOFILES := \
  framework.o \
  RegSAMManager.o \
  ReadEvent.o \
  WriteEvent.o \
  DumpEvent.o

TBINS := $(COMPLEXBIN)

.PHONY: test tbin


test:
	+@$(MAKE) -j 1 tbin

################################################
include SoftRelTools/standard.mk

6. Possible problems


All SAMManager error messages that users may encounter within their applications are meant to be sufficiently detailed to enable one to take the appropriate action.
 
 

=============================================================================
Project: SAM
Package: sam_manager
$Id: framework_sam_projects.html,v 1.45 2005/05/13 18:53:17 veseli Exp $

This work is part of a development project, called SAM, which consists of a number of coordinated packages, each named sam_xxxx.

Notice of authorship, copyright status, and terms and conditions, should the software eventually become available for use outside Fermilab, can be found in the README and LICENCE files in the top level directory of the main sam package.

==============================================================================