NOTE 0: The following commands are valid only for version 5 of SAM and are no longer accurate for versions 6 and 7. See the version 6/7 documentation for the new command syntax.
NOTE 1: The terms Dataset Definitions and Project Definitions are synonymous, as are the terms Datasets and Project Snapshots. These terms are used interchangeably in this document, and may be used interchangeably in the SAM commands.
sam define dataset --group=<group> --defname=<name> {constraints}
or (the original names still work)
sam define project --group=<group> --defname=<name> {constraints}
where the valid constraints are listed below. The most useful is the file name pattern, for example --filename='%ttbar%', where the percent sign (%) is the wildcard. The definition name that you choose must be unique, and the group name must be known to the SAM database. For valid groups, see the Quickie Query List for Work Groups on the SAM Data Browsing pages. If successful, you may skip the rest of this document and go on to running SAM projects.
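When definitions are created from a script, the command line can be assembled programmatically. The following is a minimal sketch, not part of SAM: the helper name build_define_command and the definition name ttbar_files are hypothetical, and only options that appear in this document (--group, --defname, and constraint options such as --filename) are used.

```python
import shlex

def build_define_command(group, defname, **constraints):
    """Return the argv list for 'sam define dataset' (hypothetical helper).

    Keyword arguments become --name=value constraint options,
    e.g. filename='%ttbar%' becomes --filename=%ttbar%.
    """
    argv = ["sam", "define", "dataset", f"--group={group}", f"--defname={defname}"]
    for name, value in constraints.items():
        argv.append(f"--{name}={value}")
    return argv

# 'groupa' is the group used in the example later in this document;
# 'ttbar_files' is a made-up definition name.
argv = build_define_command("groupa", "ttbar_files", filename="%ttbar%")

# shlex.join quotes any argument that needs it, so the line can be pasted
# into a shell. Passing argv directly to subprocess.run avoids the shell.
command_line = shlex.join(argv)
print(command_line)
```

Keeping the arguments as a list, rather than one string, means the wildcard pattern reaches the sam command unchanged regardless of shell quoting rules.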
Initialization - To initialize the environment, the user should
enter the following, if not already done:
setup sam
3. SAM Project Commands
sam translate constraints - Translates a set of constraint parameters, or a set of dimension/constraint criteria, into a summary of the files that would result from such a project definition. It returns the number of files found, the average file size, and the number of volumes needed to access the data. Note: files are included in the result only if they are immediately available.
usage:
sam translate constraints <constraints_or_dim>
where <constraints_or_dim> is either:
[--runnum=run_number] [--eventnum=event_number] [--datatier=data_tier] [--filename=file_name] [--physicaldatastream=physical_datastream_name] [--logicaldatastream=logical_datastream_name] [--physicaldataset=physical_dataset_name] [--applicationfamily=application_family] [--applicationfamilyversion=application_family_version]
or:
--dim=dimensions_and_constraints
or:
--rpn=dimensions_and_constraints_in_rpn_format (may be useful in odd situations where you're scripting things)
return: 1. files info, 2. volume info
for help using constraints:
sam translate constraints --help will provide the basic help, while
sam translate constraints --dim=help will provide detailed help on the new dimension, including a list of available dimensions and how to use them.
sam create dataset definition -
usage:
sam define dataset --defname=dataset_definition_name --group=work_group_name [--defdesc=dataset_definition_description] <constraints_or_dim>
where <constraints_or_dim> is either:
[--runnum=run_number] [--eventnum=event_number] [--datatier=data_tier] [--filename=file_name] [--physicaldatastream=physical_datastream_name] [--logicaldatastream=logical_datastream_name] [--physicaldataset=physical_dataset_name] [--applicationfamily=application_family] [--applicationfamilyversion=application_family_version]
or:
--dim=dimensions_and_constraints (see the help for translate constraints above for a list of available dimensions)
or:
--rpn=dimensions_and_constraints_in_rpn_format
return: Status of the definition creation.
sam create dataset -or- sam create project snapshot
usage:
sam create dataset {--defname=project_definition_name || --defid=project_definition_id} [--group=work_group_name] [--snapdesc=project_snapshot_desc]
return: Status of the dataset/snapshot creation. Note: the group option is needed only if the dataset is created in the context of a group other than the group for which the project had been defined.
sam verify snapshot -or- sam verify project snapshot
usage:
sam verify project snapshot --defname=project_definition_name [--snapvers=project_snapshot_version]
or:
sam verify project snapshot --defid=project_definition_id [--snapvers=project_snapshot_version]
return: The file differences between the original snapshot and the version of the snapshot that would be created if the same definition were applied now. Each file listed starts with either a plus or a minus sign. Files new to SAM since the original snapshot was recorded start with a plus sign (+). Files that were in the original snapshot but are no longer in SAM start with a minus sign (-). These missing files might have been deleted, or may be otherwise inaccessible.
4. Example of setting up and using a project
We would like to analyze all data from run 100930 with a data tier of "digitized". First, we test the constraints using the translate constraints command:
sam translate constraints --runnum=100930 --datatier=digitized
File Count: 11
Average File Size: 1910894
The return tells us that there are 11 files which satisfy these constraints, with an average size of 1910894 kBytes. Next, we decide that a more (or less) complex query is needed than the translate constraints method allows. Possibly, we need to use dimensions that were not provided in the original list of constraints. For example, we can decide we want all these files, but only if they are not in a certain physical datastream.
sam translate constraints --dim="(run_number 100930 data_tier digitized) minus physical_datastream_name electron+jet"
File Count: 2
Average File Size: 1760936
Let's create a project definition using the original definition, and proceed through the chain of creating a snapshot and an analysis project. First, the project definition is created. We can create the project definition using the original constraints.
sam define project --defname=ace_project --group=groupa --runnum=100930 --datatier=digitized
Project definition created with Id: 2159
Or, we can create the project definition using the more complex dimensions.
sam define project --defname=ace_project --group=groupa --dim="(run_number 100930 data_tier digitized) minus physical_datastream_name electron+jet"
Project definition created with Id: 2160
Optional: create a snapshot.
sam create project snapshot --defname=ace_project
Snapshot Id: 2181 Version: 1
You may omit this step and create a "new" snapshot on the fly when starting the actual project.
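If the chain above is driven from a script, the counts and Ids printed by each step can be picked out of the command output. The following is a sketch under the assumption that the output lines look exactly like the samples shown in this example; the function name parse_sam_output and the regular expressions are illustrative and not part of SAM.

```python
import re

def parse_sam_output(text):
    """Extract the numbers that appear in this example's sample outputs.

    Returns a dict containing only the fields that were found in 'text'.
    """
    patterns = {
        "file_count": r"File Count:\s*(\d+)",
        "average_file_size": r"Average File Size:\s*(\d+)",
        "definition_id": r"Project definition created with Id:\s*(\d+)",
        "snapshot_id": r"Snapshot Id:\s*(\d+)",
        "snapshot_version": r"Version:\s*(\d+)",
    }
    found = {}
    for key, pattern in patterns.items():
        match = re.search(pattern, text)
        if match:
            found[key] = int(match.group(1))
    return found

# Sample outputs copied from the example in this document.
summary = parse_sam_output("File Count: 11\nAverage File Size: 1910894")
definition = parse_sam_output("Project definition created with Id: 2159")
snapshot = parse_sam_output("Snapshot Id: 2181 Version: 1")
print(summary, definition, snapshot)
```

Fields that do not occur in the text are simply absent from the result, so the same function can be applied to the output of any of the three steps.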
Later, you may find it useful to compare your analysis results with the current physics data, by verifying the original project snapshot.
sam verify project snapshot --defname='ace_project_old'
Defaulting to the latest snapshot version of 2 for the project definition.
Project Snapshot file differences:
- ALL_076151_04.IGOR_01
- ALL_076151_32.IGOR_01
- EXPRESS_076151_03.IGOR_01
- INSPILL_076151_01.IGOR_01
- EXPRESS_076151_02.IGOR_01
This means that since the last time the snapshot was created for the given definition, the set of available files has changed. If you were to create a project snapshot now, you would not get these five files.
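The plus and minus prefixes in the verify listing lend themselves to simple post-processing. The following sketch separates added from removed files; it assumes the output format matches the sample above, and the function name split_snapshot_differences is hypothetical rather than part of SAM.

```python
def split_snapshot_differences(output):
    """Split 'sam verify project snapshot' output into added and removed files.

    Lines starting with '+' are files new to SAM since the snapshot; lines
    starting with '-' are files that were in the snapshot but are now missing.
    Any other line (such as the two header lines) is ignored.
    """
    added, removed = [], []
    for line in output.splitlines():
        line = line.strip()
        if line.startswith("+"):
            added.append(line[1:].strip())
        elif line.startswith("-"):
            removed.append(line[1:].strip())
    return added, removed

# Sample output copied from the example in this document.
sample = """Defaulting to the latest snapshot version of 2 for the project definition.
Project Snapshot file differences:
- ALL_076151_04.IGOR_01
- ALL_076151_32.IGOR_01
- EXPRESS_076151_03.IGOR_01
- INSPILL_076151_01.IGOR_01
- EXPRESS_076151_02.IGOR_01"""

added, removed = split_snapshot_differences(sample)
print(len(added), "added,", len(removed), "removed")
```

A non-empty removed list is a signal that results based on the old snapshot may no longer be reproducible from the files currently in SAM.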
Once you have created a dataset of interest, proceed to retrieving the
dataset files: running the SAM projects.