Installing SAM

1. Introduction and Prerequisites


SAM consists of a number of UPS products whose names look like sam_xxx. Generally, you will need to install both the client part and some of the servers (specifically, station components). This document describes installation, start-up and configuration of SAM components.

The node on which the station is to be established will require the following advance setup by the system administrator:

  1. The node must have a user account named "sam", with a uid of 7816. The sam account needs to have a sufficiently large disk quota for storing log files, etc. By convention, this area is called ~sam/private. Typically, the ~sam/private directory is a symbolic link to a large scratch area.

  2. ups and upd must be installed and configured properly. In particular, the upd configuration file (typically found in $PRODUCTS/.updfiles/updconfig) must include all four components (product, version, flavor and qualifier) of a product instance in the ${UPS_PROD_DIR} definition. A good example would be:
    GROUP:
      product       = ANY
      flavor        = ANY
      qualifiers    = ANY
      options       = ANY
      dist_database = ANY
      dist_node     = ANY
    
    COMMON:
          UPS_THIS_DB = "${UPD_USERCODE_DB}"
         UPS_PROD_DIR = "${UPS_PROD_NAME}/${UPS_PROD_VERSION}/${UPS_PROD_FLAVOR}${UPS_PROD_QUALIFIERS}"
      UNWIND_PROD_DIR = "${PROD_DIR_PREFIX}/${UPS_PROD_DIR}"
          UPS_UPS_DIR = "ups"
       UNWIND_UPS_DIR = "${UNWIND_PROD_DIR}/${UPS_UPS_DIR}"
      UPS_TABLE_FILE  = "${UPS_PROD_NAME}.table"            # new recommendation: Oct 2001
     UNWIND_TABLE_DIR = "${UNWIND_UPS_DIR}"                 # new recommendation: Oct 2001
    
    Note how the UPS_PROD_DIR line includes the product name, version, flavor, and qualifiers. Note also that the table file named by UPS_TABLE_FILE is placed into the product's UPS directory, so that each instance of a product has its own copy of the table file.
    The updconfig file recommendations were changed in October 2001; see the relevant sam_admin hypermail archives for details.

  3. You must register all of the nodes in your cluster that will be using sam. Send mail to sam-design@fnal.gov with a list of the fully-qualified node names. For each node, include the name of the node, OS type, and hardware type. (We hope to automate this step.)

  4. You will need to pick one station name per set of disks to be managed by sam. These names must be registered in the sam database. Again, send mail to sam-design@fnal.gov with a list of the station names, a description of each, and a list of administrators. (We hope to automate this step too.)

  5. Fermilab Local Nodes only: If you are authorized to fetch data directly from the Fermilab Enstore mass storage system, PNFS must be mounted. Send mail to enstore-admin@fnal.gov requesting permission to mount PNFS space, and include a list of all of the nodes that will need to mount it. Authorization for PNFS may be requested through email to Wyatt Merritt, Heidi Schellman, Lee Lueking, or Dave Fagan. (We hope to automate this step.)

    The necessary steps (at least on a Linux box; your mileage may vary on other platforms) are listed below.
    (Thank you, Jason Allen, for sending these examples and instructions.)

    1. Modify /etc/fstab to include the following lines:
      d0ensrv1:/sam-mammoth /pnfs/sam/mammoth nfs user,nosuid,intr,bg,hard,rw,grpid,noac 0 0
      d0ensrv1:/sam-m2 /pnfs/sam/m2 nfs user,nosuid,intr,bg,hard,rw,grpid,noac 0 0
      d0ensrv1:/NULL /pnfs/sam/NULL nfs user,nosuid,intr,bg,hard,rw,noac 0 0 
      
    2. Create the mount points:
         $ mkdir -p /pnfs/sam/mammoth
         $ mkdir -p /pnfs/sam/m2
         $ mkdir -p /pnfs/sam/NULL
      
    3. Mount the filesystems:
         $ mount /pnfs/sam/mammoth
         $ mount /pnfs/sam/m2
         $ mount /pnfs/sam/NULL
      
    4. Make sure the filesystems are mounted:
         $ grep d0ensrv1 /etc/mtab
         d0ensrv1:/NULL /pnfs/sam/NULL nfs rw,noexec,nosuid,nodev,intr,bg,hard,noac,addr=131.225.164.21 0 0
         d0ensrv1:/sam-mammoth /pnfs/sam/mammoth nfs rw,noexec,nosuid,nodev,intr,bg,hard,noac,addr=131.225.164.21 0 0
         d0ensrv1:/sam-m2 /pnfs/sam/m2 nfs rw,noexec,nosuid,nodev,intr,bg,hard,noac,addr=131.225.164.21 0 0
      
    5. Possible errors while mounting:
         # Case 1: mount point probably doesn't exist. Check /etc/fstab for typos.
         $ mount /pnfs/sam/mammoth
         mount: backgrounding "d0ensrv1:/sam-mammoth"
      
         # Case 2: filesystem probably isn't exported to this node.  
         $ mount /pnfs/sam/mammoth
         mount: d0ensrv1:/sam-mammoth failed, reason given by server: Operation not permitted
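
The prerequisites above can be checked before any SAM product is installed. The following is a minimal sketch, not part of the SAM distribution: the function names are hypothetical, and it assumes that a single updconfig file lives under $PRODUCTS as described in item 2. The PNFS check applies only to Fermilab local nodes.

```shell
#!/bin/sh
# Preflight sketch for a prospective SAM station node.
# All function names here are hypothetical helpers, not SAM or UPS commands.

# Succeeds if account $1 exists and has numeric uid $2.
account_has_uid() {
    actual=$(id -u "$1" 2>/dev/null) || return 1
    [ "$actual" = "$2" ]
}

# Succeeds if the UPS_PROD_DIR definition in updconfig file $1 mentions
# the product name, version, flavor and qualifiers.
updconfig_has_full_prod_dir() {
    line=$(grep -E '^[[:space:]]*UPS_PROD_DIR[[:space:]]*=' "$1" 2>/dev/null | head -n 1)
    [ -n "$line" ] || return 1
    for var in UPS_PROD_NAME UPS_PROD_VERSION UPS_PROD_FLAVOR UPS_PROD_QUALIFIERS; do
        case $line in
            *"\${$var}"*) ;;          # this component is present
            *) return 1 ;;            # this component is missing
        esac
    done
}

# Prints every mount point among $2.. that is absent from mtab-style file $1.
missing_mounts() {
    mtab=$1; shift
    for mp in "$@"; do
        awk -v mp="$mp" '$2 == mp { found = 1 } END { exit !found }' "$mtab" ||
            echo "$mp"
    done
}

# Runs all three checks with the values given in this document.
preflight() {
    account_has_uid sam 7816 ||
        echo "WARN: account 'sam' is missing or its uid is not 7816"
    # Assumption: $PRODUCTS names a single products database directory.
    updconfig_has_full_prod_dir "${PRODUCTS:-}/.updfiles/updconfig" ||
        echo "WARN: UPS_PROD_DIR in updconfig is missing or incomplete"
    # Fermilab local nodes only.
    for mp in $(missing_mounts /etc/mtab /pnfs/sam/mammoth /pnfs/sam/m2 /pnfs/sam/NULL); do
        echo "WARN: $mp is not mounted"
    done
}
```

Calling preflight prints one WARN line per failed check and nothing when all checks pass.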
      

2. Install the sam products

To install the client software:

      upd install sam -q prd -G "-c"
As a convenience to your users, you should also run
      upd install sam -G "-c"
(so that they can merely 'setup sam' rather than having to 'setup -q prd sam').

If you need access to the development environment, you will also need to run

      upd install sam -G "-c" -q dev

To install the server software, first install the client software as directed above. Then:

      upd install sam_bootstrap -G "-c"
      ups tailor sam_bootstrap
and follow the instructions. Keep the following in mind at this point:

The sam_station product installation includes the 'current' encp (enstore) product, which may not be the version you want; it doesn't hurt to ask beforehand to find out what the recommended version is. The special orbacus "-q sam_station" is also required for sam_station.

3. Configuring sam_bootstrap

The command to configure the servers (i.e., the sam_bootstrap package) is:

    ups tailor sam_bootstrap

This command should be executed immediately after the above upd install sam_bootstrap command. It may also be executed any time that you need to update or modify your sam_bootstrap configuration. Please see the complete sam_bootstrap documentation for more information on this command.

The command will ask you about the file that describes what servers are to be run on this particular node. If this node is part of a cluster, execute this command on every node. As a general SAM user, you will only run the station/fss/stager servers, and perhaps a sam_bbftp server. Therefore, a typical configuration file might look like:

    station prd v2_2_10 new-station
    fss prd v2_2_10 new-station
    stager prd v2_2_10 new-station
    station dev v2_2_10 new-station
    fss dev v2_2_10 new-station --route=enstore,central-analysis:d0mino.fnal.gov:/sam/cache18/prague
    stager dev v2_2_10 new-station stager_config.txt
    bbftp prd f1_9_4b

In this example, "new-station" is the name of the new station being added. All of the servers listed in the file are started when the command ups start sam_bootstrap is issued from within the sam account. The node should be configured so that this occurs automatically on boot-up.
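
A server list file can be given a quick sanity check before the servers are started. The sketch below is not part of sam_bootstrap; list_servers is a hypothetical helper. It assumes, based only on the example above, that fields are separated by whitespace and that the first three fields of each entry are the server name, the environment, and the version.

```shell
#!/bin/sh
# Print "server environment version" for every entry in server list file $1.
# Entries with fewer than three fields are reported on stderr and make the
# function return non-zero. Blank lines are ignored.
list_servers() {
    awk '
        NF == 0 { next }
        NF < 3  {
            printf "line %d: expected at least 3 fields, found %d\n", NR, NF | "cat 1>&2"
            bad = 1
            next
        }
        { print $1, $2, $3 }
        END { exit bad }
    ' "$1"
}
```

For example, list_servers my_servers.txt (where my_servers.txt is whatever file you named during ups tailor sam_bootstrap) prints one line per server and flags truncated entries.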

If you install a sam_bbftp server, you will also need to tailor and install the sam_bbftp authorization file. This is done in two separate steps.

  1. From a product maintainer account,
        ups tailor sam_bbftp
    
    This configures the location of the authorization file. The file should be in a location which is visible only on the node in question (i.e., not NFS-exported). Ideally, the file would be located in an area such as /etc where it is extremely secure; if you are unable to write into such system areas, you may use the default (in /tmp).

  2. From the sam account (or the 'root' account, if you are writing your configuration file into an area where sam does not have access):
        ups installAsSam sam_bbftp
    
    During this step, you will need to enter the authorization string, available from the sam_admin team.

4. Configuring individual servers

In the above example server list file, the entry for the development FSS contains the --route option specifying how files are routed when stored into SAM (see the documentation for this particular server). Every server in the sam_bootstrap may be configured by supplying such (space-separated) options at the end of the respective server entry in the list.


Advanced users may further modify servers' behavior by supplying the name of a file to be sourced by the Bourne Shell script that runs the servers (instead of the options string). In the above example, the entry for the development stager contains the string stager_config.txt. If the file stager_config.txt in the server home directory contains a line

PATH=/home/sam/bin:$PATH
then the PATH of this particular server will be modified accordingly, causing the stager to invoke programs from /home/sam/bin. (Don't do this unless you really know what you're doing!)
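
The effect of a sourced settings file can be illustrated with a small stand-alone sketch. This is not the actual run.bash script; run_with_settings is a hypothetical function that only demonstrates the mechanism: the file is sourced in a subshell, so the settings apply to the command started there and not to the calling shell.

```shell
#!/bin/sh
# Source settings file $1, then run the remaining arguments as a command.
# The subshell keeps the caller's environment unchanged.
# Give the file with a directory part (e.g. ./stager_config.txt), because
# "." searches PATH when the name contains no slash.
run_with_settings() {
    settings=$1; shift
    (
        . "$settings" || exit 1
        export PATH
        "$@"
    )
}
```

With a stager_config.txt containing the PATH line shown above, run_with_settings ./stager_config.txt sh -c 'echo $PATH' prints a PATH that begins with /home/sam/bin, while echo $PATH in the calling shell is unaffected.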

5. Starting the station for the first time and troubleshooting

When you finish the installation, you can start the station using
      ups start sam_bootstrap
from the sam account.

For some types of problems with the installation and/or configuration, you may see warning or informational messages. For example:

Message:

INFORMATIONAL: Product 'sam_config' (with qualifiers 'prd'), has no current chain (or may not exist)
/local/ups/prd/sam_bootstrap/v2_1_0/NULL/bin/run.bash: .: /dev/null: not a regular file

Resolution: sam_config has not been installed with the -q prd qualifier; see section 2 above.

Other messages need to be understood and the installation fixed as needed. Some of the more difficult problems to understand may involve not having the correct orbacus station or optimizer pieces installed. See section 2 above.

If everything works properly, you should be able to use the ps command and see a number of processes started by user sam; an example might be similar to:

     d0bbin> ps -fu sam
     sam   41757280   41671422  0   Jan 12 ?       0:05 stager start --station=protofarm --max-transfers=100 --nopid --nofork
     sam   41782673          1  0   Jan 12 ?       0:00 bash /d0farm/ups/prd/sam_bootstrap/v2_2_15/NULL/bin/run.bash start fss prd 2
     sam   41745551   41675629  0   Jan 12 ?       0:22 stagerng start --station=protofarm --max-transfers=100 --nopid --nofork
     sam   44744414   44598524  0 15:49:52 pts/13  0:00 ps -fu sam
     sam   41671422          1  0   Jan 12 ?       0:00 bash /d0farm/ups/prd/sam_bootstrap/v2_2_15/NULL/bin/run.bash start stager_op
     sam   41675629          1  0   Jan 12 ?       0:00 bash /d0farm/ups/prd/sam_bootstrap/v2_2_15/NULL/bin/run.bash start stager dv
     sam   41787689          1  0   Jan 12 ?       0:00 bash /d0farm/ups/prd/sam_bootstrap/v2_2_15/NULL/bin/run.bash start fss dev 2
     sam   41651368   41552218  0   Jan 12 ?      20:30 fss --station=protofarm --nofork
     sam   41552218          1  0   Jan 12 ?       0:00 bash /d0farm/ups/prd/sam_bootstrap/v2_2_15/NULL/bin/run.bash start fss int 2
     sam   41357841   41772469  0   Jan 12 ?       0:24 stagerng start --station=protofarm --max-transfers=100 --nopid --nofork
     sam   41788616   41755952  0   Jan 12 ?       0:05 stager start --station=protofarm --max-transfers=100 --nopid --nofork
     sam   41664799          1  0   Jan 12 ?       0:00 bash /d0farm/ups/prd/sam_bootstrap/v2_2_15/NULL/bin/run.bash start stager_oi
     sam   41772469          1  0   Jan 12 ?       0:00 bash /d0farm/ups/prd/sam_bootstrap/v2_2_15/NULL/bin/run.bash start stager iv
     sam   41755952          1  0   Jan 12 ?       0:00 bash /d0farm/ups/prd/sam_bootstrap/v2_2_15/NULL/bin/run.bash start stager_od
     sam   41756469   41782673  0   Jan 12 ?      27:48 fss --station=protofarm --nofork
     sam   41735806          1  0   Jan 12 ?       0:00 bash /d0farm/ups/prd/sam_bootstrap/v2_2_15/NULL/bin/run.bash start stager pv
     sam   41665585   41664799  0   Jan 12 ?       0:05 stager start --station=protofarm --max-transfers=100 --nopid --nofork
     sam   41649251   41735806  0   Jan 12 ?       0:56 stagerng start --station=protofarm --max-transfers=100 --nopid --nofork
     sam   41388836   41787689  0   Jan 12 ?      18:57 fss --station=protofarm --opter-suffix=devel --nofork

If you do not see such processes, you may find clues to the problems by looking at the log files. The log directories are under ~sam/private and are named based on the server, node name, station name, and environment. For example, the station server running on node d0mino for the central-analysis station in the development environment writes its log files to a directory named ~sam/private/station__d0mino__dev__central-analysis.

The file named trace in this directory is a good source of clues when your servers are not properly starting.
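
Locating the right trace file can be scripted. The sketch below uses hypothetical helper names and infers the order of the name components (server, node, environment, station, joined by double underscores) from the single example above, so verify it against the directories actually present on your node.

```shell
#!/bin/sh
# Print the log directory for base directory $1, server $2, node $3,
# environment $4 and station $5.
sam_log_dir() {
    printf '%s/%s__%s__%s__%s\n' "$1" "$2" "$3" "$4" "$5"
}

# Show the last 40 lines of the trace file in that directory (same arguments).
trace_tail() {
    dir=$(sam_log_dir "$@")
    if [ -f "$dir/trace" ]; then
        tail -n 40 "$dir/trace"
    else
        echo "no trace file in $dir" >&2
        return 1
    fi
}
```

For the example above, trace_tail ~sam/private station d0mino dev central-analysis shows the end of the development station server's trace file.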

6. Automatic Start on System Reboot

The system should be modified so that the sam servers are automatically started when the system boots.

If your system is not configured so that ups_startup is called automatically during system boot, you will need to ask your system administrator to add a command similar to the following to the boot sequence:

  /bin/su - sam -c ". /usr/local/etc/setups.sh; setup setpath; ups start sam_bootstrap"
Note, "/usr/local/etc" should be replaced with the absolute path to your system's $SETUPS_DIR, if necessary. If the sam account login shell is a C shell variant (such as tcsh), change this to
  /bin/su - sam -c "source /usr/local/etc/setups.csh; setup setpath; ups start sam_bootstrap"

If your system is configured so that ups_startup is called during system boot, then you can add this command to the ups start configuration files without the intervention of a system administrator. The ups startup files are generally found in $PRODUCTS/.upsfiles/startup/<node>.products or $PRODUCTS/.upsfiles/startup/<flavor>.products.
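
Choosing the right boot command for the sam account's login shell can be sketched as follows. sam_boot_command is a hypothetical helper that only prints the command line; it does not install anything. Both variants go through /bin/su, since su is what switches to the sam account, and the setups directory defaults to /usr/local/etc.

```shell
#!/bin/sh
# Print the boot-time start command for the sam account.
# $1: login shell of the sam account (e.g. /bin/bash or /bin/tcsh)
# $2: setups directory (optional, defaults to /usr/local/etc)
sam_boot_command() {
    setups_dir=${2:-/usr/local/etc}
    case $(basename "$1") in
        csh|tcsh) init="source $setups_dir/setups.csh" ;;   # C shell variants
        *)        init=". $setups_dir/setups.sh" ;;         # Bourne shell variants
    esac
    printf '/bin/su - sam -c "%s; setup setpath; ups start sam_bootstrap"\n' "$init"
}
```

On a Linux node the login shell can be looked up with getent passwd sam | cut -d: -f7 and passed as the first argument.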

7. Final Administrative setup

Finally, the station to be administered must have cache areas which can be used by groups. For a complete description of configuring a station, refer to the Sam Station Administration Guide. The minimal configuration is established as follows, while logged in under the user account of the station administrator (as specified in the station registration).

First, set the station environment variable.

setenv SAM_STATION new-station

Add the disk for the cache area(s).

[lueking@d0lxbld1 ~]$ sam add disk --station=d0small-01 --mount=/sam/cache1 --sizeK=14000000
OK

Add groups and cache allocations.

[lueking@d0lxbld1 ~]$ sam add group --group=test --max-disk=2000000 --max-projects=4 --admin=lueking,white
OK

Configure the station as needed using the sam configure station command. To get started, the defaults are OK.
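
When choosing a value for --sizeK, it can help to look at how much space the cache filesystem actually has. The sketch below rests on two assumptions that are not stated in this document: that --sizeK is expressed in kilobytes, and that leaving 10% of the free space unallocated is a sensible default. The helper names are hypothetical, and the command is only printed, never run.

```shell
#!/bin/sh
# Print the available space, in kilobytes, on the filesystem holding path $1.
avail_kb() {
    df -kP "$1" | awk 'NR == 2 { print $4 }'
}

# Print (do not run) a "sam add disk" command for station $1 and mount
# point $2, using $3 percent of the currently free space (default 90).
suggest_add_disk() {
    pct=${3:-90}
    free=$(avail_kb "$2")
    [ -n "$free" ] || return 1
    echo "sam add disk --station=$1 --mount=$2 --sizeK=$((free * pct / 100))"
}
```

For example, suggest_add_disk new-station /sam/cache1 prints a command line that can be reviewed and then run by the station administrator.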


Most Recent Update:
$Date: 2005/04/15 19:21:45 $
$Author: lauri $