SAM Installation: File Transfer Protocols


You must install the sam_cp and sam_cp_config wrappers, and at least one of the specific supported underlying protocols:

  • sam_gridftp:
    • this replaces bbftp for inter-site transfers
    • wrapper around gridftp for use by the 'sam' account, specifically for sam transfers
  • sam_bbftp:
    • will be phased out and replaced by sam_gridftp
    • wrapper around bbftp for use by the 'sam' account, specifically for sam transfers
  • sam_kerberos_rcp:
    • most on-site Fermilab nodes will use this also
    • wrapper around kerberized rcp for use by the 'sam' account, specifically for sam transfers
  • sam_encp:
    • only a limited number of on-site Fermilab nodes will use this
    • wrapper around transfers to/from D0's Enstore installation


  1. Installing and Configuring the sam_cp wrapper:

    sam_cp is a wrapper around all of the various file transfer protocols used by SAM. It is called by the sam station software, and handles the "decision-making" process of selecting the appropriate transfer mechanism.

    Installation:

        $ setup upd
        $ upd install -G -c -R sam_cp
    
    This will install sam_cp and the required sam_cp_config package. You must also select at least one of the supported underlying transfer protocols and install the corresponding package:
    • sam_gridftp (remote sites should be migrating to this)
    • sam_bbftp (being phased out)
    • sam_kerberos_rcp (kerberized on-site nodes may use this)
    • sam_encp (only nodes directly connected to the Fermilab Enstore system will use this).

    Once you have selected the specific transfer protocols, continue the installation by tailoring sam_cp_config.


  2. Tailoring sam_cp_config:

    sam_cp_config is the means by which the sam_cp package decides which protocol to use when transferring SAM files.

    Installation: sam_cp_config will be installed automatically via:

        $ setup upd
        $ upd install -G -c -R sam_cp
    
    Or, it can be independently installed/updated via
        $ setup upd
        $ upd install -G -c sam_cp_config
    

    Once sam_cp_config is installed, you must ensure that your system is listed in the $SAM_CP_CONFIG_FILE, which is the "map" used by sam_cp in determining capabilities of your system. This is done via

        $ setup sam_cp
        $ <edit> $SAM_CP_CONFIG_FILE
    
    The default $SAM_CP_CONFIG_FILE should be fairly self-documenting, and looks like:
    #
    # sam_cp_config.py
    #
    # python module which gives the mappings for
    # different ways to transfer files.
    #
    
    import os,sys
    import copy
    import SamCpClasses
    
    global KNOWN_SAM_CP_CLASSES
    global DOMAIN_CAPABILITY_MAP
    
    
    #
    # Known methods for handling the different types of Sam Copy protocols.
    # The dictionary key is a string which determines the class to use
    # for performing the copy.  The classes are defined in
    # SamCpClasses.py.
    #
    
    KNOWN_SAM_CP_CLASSES = { 'sam_kerberos_rcp' : SamCpClasses.SamKerberosRcp,
                             'local_cp'         : SamCpClasses.SamLocalCp,
                             'rcp'              : SamCpClasses.SamRcp,
                             'scp'              : SamCpClasses.SamScp,
                             'enstore'          : SamCpClasses.SamEnstore,
                             'encp'             : SamCpClasses.SamEncp,
                             'dcache'           : SamCpClasses.SamDcache,
                             'dcache_gridftp'   : SamCpClasses.SamDcacheGridftp,
                             'sam_gridftp'      : SamCpClasses.SamGridftp,
                             'jim_gridftp'      : SamCpClasses.JimGridftp,
                             'dccp'             : SamCpClasses.SamDccp,
                             'GridKaCp'         : SamCpClasses.SamGridKaCp,
                             'rfio'             : SamCpClasses.SamRfio,
                             'bbrfio'           : SamCpClasses.SamBbftpRfio,
                             'sam_bbftp'        : SamCpClasses.SamBbftp,
                             }
    
    
    
    #
    # Mapping between a DOMAIN name and the SAM CP CLASSES
    # that it provides.
    #
    # If multiple matches are found, the priority will
    # be given to EARLIER entries in the INITIATING
    # domain's entry.
    #
    DOMAIN_CAPABILITY_MAP = { 'enstore'      : [ 'enstore',  \
                                                 'encp', \
                                                 'dcache', \
                                                 'dccp', \
                                                 'dcache_gridftp', ],
                              'fnal.gov'     : [ 'sam_kerberos_rcp', \
                                                 'rcp', \
                                                 'enstore', 'encp', 'dcache', 'bbrfio', ],
                              'man.ac.uk'    : [ 'sam_gridftp', ],
                              'cs.wisc.edu'  : [ 'sam_gridftp', ],
                              'dummyDomain'  : [ 'dummyClass', ],
                              'in2p3.fr'     : [ 'rcp', \
                                                 'rfio', \
                                                 'sam_bbftp', 'bbrfio', ] ,
                              'rfio://in2p3.fr': [ 'rfio', \
                                                   'bbrfio', ],
                              'YOUR_NODE_HERE' : [],
                              }
    
    
    ########################################################################
    #
    # Template for adding new SamCp classes:
    #  a) create the Class with methods __init__(self) and
    #     copy(src,dest) which handles this new protocol
    #  b) add a string description dictionary key and the name
    #     of the path to the KNOWN_SAM_CP_CLASSES dictionary
    
    
    # New class definition,
    #  overriding the default __init__() and copy() methods.
    #
    #
    class DummySamCpClass(SamCpClasses.SamCp):
        def __init__(self):
            # SamCp is only reachable through the imported SamCpClasses module.
            SamCpClasses.SamCp.__init__(self)
    
        # The guts of the matter take place here.
        #
        # srcFileObject and destFileObjects
        # are described in the file
        # $SAM_CP_DIR/src/SamCpFileParser.py
        #
        # The known classes are defined in the
        # file
        # $SAM_CP_DIR/src/SamCpClasses.py.
        #
        def copy(self, srcFileObject, destFileObject):
            SamCpClasses.SamCp.copy(self, srcFileObject, destFileObject)
    
    
    # Add the new class definition to the
    # KNOWN_SAM_CP_CLASSES dictionary, so that
    # any sites that have 'dummyClass' in their
    # DOMAIN_CAPABILITY_MAP entry know which class
    # to call to perform this transfer.
    KNOWN_SAM_CP_CLASSES['dummyClass'] = DummySamCpClass
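
    The priority rule stated in the comments above (priority goes to EARLIER entries in the INITIATING domain's entry) can be sketched in a few lines of Python. This is only an illustration, not the actual sam_cp logic: the function names capabilities_for and choose_protocol are hypothetical, the map is a shortened stand-in, and matching a host to a map key by its longest domain suffix is an assumption. The sketch uses Python 3 syntax, which the original file predates.

```python
# Shortened stand-in for the map shown above; not the real sam_cp data.
DOMAIN_CAPABILITY_MAP = {
    "fnal.gov": ["sam_kerberos_rcp", "rcp", "enstore", "encp", "dcache", "bbrfio"],
    "in2p3.fr": ["rcp", "rfio", "sam_bbftp", "bbrfio"],
    "man.ac.uk": ["sam_gridftp"],
}


def capabilities_for(host, capability_map):
    """Return the protocol list of the longest map key that matches host.

    Assumption: a key matches when it equals the host name or is a
    dot-separated suffix of it.
    """
    best = None
    for key in capability_map:
        if host == key or host.endswith("." + key):
            if best is None or len(key) > len(best):
                best = key
    return list(capability_map[best]) if best is not None else []


def choose_protocol(initiating_host, other_host, capability_map):
    """Pick the first protocol of the initiating host that the other host shares."""
    other = set(capabilities_for(other_host, capability_map))
    for protocol in capabilities_for(initiating_host, capability_map):
        if protocol in other:
            return protocol
    return None  # no common protocol
```

    Under these assumptions, the order of the list in the initiating domain's entry decides which shared protocol is chosen, so a site would list its preferred protocol first.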
            
    
    
    
    
    

    In the future, we hope to automate more of the configuration process...
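
    The template in the listing above (define a class with __init__ and copy, then register it under a string key) amounts to a small registry lookup. The following self-contained sketch shows that pattern under stated assumptions: the SamCp class here is a stand-in that only records calls, and run_copy is a hypothetical helper, not a function provided by sam_cp.

```python
class SamCp:
    """Stand-in base class; the real one is defined in SamCpClasses.py."""

    def __init__(self):
        self.log = []

    def copy(self, src, dest):
        # A real class would run the transfer command here.
        self.log.append((type(self).__name__, src, dest))
        return 0


class DummySamCpClass(SamCp):
    """A new protocol handler following the template: __init__ plus copy."""

    def __init__(self):
        SamCp.__init__(self)

    def copy(self, src, dest):
        # Protocol-specific work would go here.
        return SamCp.copy(self, src, dest)


# Register handlers under the string keys used in DOMAIN_CAPABILITY_MAP.
KNOWN_SAM_CP_CLASSES = {"local_cp": SamCp}
KNOWN_SAM_CP_CLASSES["dummyClass"] = DummySamCpClass


def run_copy(protocol, src, dest, classes=KNOWN_SAM_CP_CLASSES):
    """Look up the handler class for a protocol key and call its copy()."""
    try:
        handler_class = classes[protocol]
    except KeyError:
        raise ValueError("unknown sam_cp protocol: %r" % protocol) from None
    handler = handler_class()
    return handler.copy(src, dest), handler
```

    A domain entry that names a key missing from the registry (a misspelled 'dummyClass', for instance) would fail at lookup time in this sketch, which is one reason to add the class and the map entry together.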


  3. Installing and Configuring sam_gridftp:

    sam_gridftp is a wrapper around the gridftp protocol. The SAM system can use it to transfer files among participating stations.

    Requirements: ups v4_7 or higher

    Step 1: Installation of the products

    %products> upd install -G-c vdt
    %products> ups tailor vdt
    This installs globus from the Virtual Data Toolkit (VDT) distribution. Accept the defaults. The installation can take more than 30 minutes on some machines.
    %products> upd install -G-c sam_gsi_config_util -q vdt
    This will also install both sam_gsi_config and sam_gridftp. The product sam_gsi_config helps with the configuration of the Globus Security Infrastructure (GSI), used by gridftp.

    Step 2: Configuration of GSI
    This step takes you through an interview that configures GSI for sam_gridftp and all the JIM products.
    You will be asked where to store the sam server certificate: it must be a non-exported directory, writable by user sam. Create this directory (if you don't have a preference, create the directory indicated by the default).
    Note that in the past we used to store certificates in /etc/grid-security: this directory is in general writable by root only; create a subdirectory writable by sam if you want to keep the service certificates here.
    Execute:

    %products> ups tailor sam_gsi_config -q vdt
    At the end of the tailoring, you will be told what user needs to take the next step.
    For experts: you can achieve custom installations by editing the file ${UPS_THIS_DB}/sam_gsi_config/sam_gsi_config.`hostname`.conf

    Step 3: Installation of the trusted Certificate Authorities
    If you want to install sam_gridftp ONLY, do the following as user sam. Otherwise, follow the instructions printed at the end of Step 2. You may need to execute this command several times as different users. The installation script will stop and prompt you if you are trying to install the CAs as a user other than the recommended one.
    For file transfers, the trusted certificate authorities are the EDG CAs.

    %sam> ups install_ca sam_gsi_config -q vdt

    Step 4: Request a sam service certificate
    This certificate defines the identity of the gridftp and JIM servers for security purposes. The following command helps you request a certificate signed by the DOEGrids CA. See the expert instructions to use a different CA.

    %sam>  setup sam_gsi_config -q vdt
    %sam>  sam_cert_request --name=<Name> --email=<email> --phone=<number>
    
    This will try to submit your request directly to the DOEGrids CA. Should this command fail, follow the instructions below to do a manual submission.

    Manual submission of your certificate to the DOEGrids CA
    FOLLOW THESE INSTRUCTIONS ONLY IF SAM_CERT_REQUEST TELLS YOU TO DO SO.
    Your request is the file called samservice.request.
    DO NOT EMAIL YOUR REQUEST.
    Instead, go to https://pki1.doegrids.org, click "Grid or SSL Server" and fill in the form.
    Hints:
    • PKCS#10 field: look at your request. Cut and paste all the characters between and including
        -----BEGIN CERTIFICATE REQUEST-----
        ...
        -----END CERTIFICATE REQUEST-----
    • Affiliation: OSG
    • VO: select DZero or CDF
    • Additional Comments: "SAM Grid Service Certificate for [DZero|CDF]"
    • For experts: If you want to use a CA other than DOEGrids, you MUST request a common name of the form "CN=sam/fully.qualified.domain.name".
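
    Cutting the PKCS#10 block out of the request by hand is easy to get wrong (a missing dash or a dropped END line). As a rough aid, the sketch below pulls the block, including its BEGIN and END lines, out of a text. The function name extract_pem_block is hypothetical and is not part of sam_gsi_config.

```python
def extract_pem_block(text, label="CERTIFICATE REQUEST"):
    """Return the block for label, including its BEGIN and END lines."""
    begin = "-----BEGIN %s-----" % label
    end = "-----END %s-----" % label
    stripped = [line.strip() for line in text.splitlines()]
    if begin not in stripped or end not in stripped:
        raise ValueError("no %s block found" % label)
    start = stripped.index(begin)
    stop = stripped.index(end, start)  # first END line after BEGIN
    return "\n".join(stripped[start:stop + 1])
```

    For the manual submission it would be called on the contents of samservice.request. With label="CERTIFICATE" the same function could extract the signed certificate from the text the CA sends back.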

    Step 5: Register your sam service certificate
    Send email to d0sam-admin@fnal.gov or cdfsam-admin@fnal.gov and ask to add your sam server certificate to the central sam_gridftp grid-mapfile AND to the gridftp DCache door (if applicable).
    Include your certificate subject in the email, as reported by sam_cert_request (Step 4).
    Hint
    The certificate subject is a string of the form /DC=org/DC=doegrids/OU=Services/CN=sam/d0mino.fnal.gov
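
    Note that the CN value in this subject itself contains a '/', so splitting the string on every '/' gives the wrong fields. The sketch below builds the subject for a host in the form shown above and splits a subject only where a new ATTRIBUTE= field starts. Both function names are hypothetical, and the fixed prefix applies only to certificates issued by the DOEGrids CA.

```python
import re


def expected_subject(host):
    """Build the DOEGrids-style subject shown above for a given host name."""
    return "/DC=org/DC=doegrids/OU=Services/CN=sam/" + host


def parse_subject(subject):
    """Split a subject into (attribute, value) pairs.

    The CN value contains a '/', so split only at a '/' that is
    directly followed by 'LETTERS='.
    """
    fields = re.split(r"/(?=[A-Za-z]+=)", subject)
    return [tuple(field.split("=", 1)) for field in fields if field]
```

    Comparing expected_subject(host) with the subject reported by sam_cert_request before sending the email is a cheap way to catch a certificate requested for the wrong host name.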

    Step 6: Install your sam service certificate
    The CA will send you an email with a link to your certificate. The format you are interested in is right below "Base 64 encoded certificate". You need to cut and paste it into your certificate file, including the lines "-----BEGIN CERTIFICATE-----" and "-----END CERTIFICATE-----". You can find the location of your certificate file with the command

    %sam>  setup sam_gsi_config -q vdt
    %sam>  sam_gsi_read_config SAM_GSI_GRIDFTP_X509_CERT
    
    For experts: Make sure to install the public and private keys where the GSI configuration expects them. The command above returns the path to the signed public key; the same command with SAM_GSI_GRIDFTP_X509_KEY returns the path to the private key. Make sure your private key is readable only by user sam.
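
    The permission requirement on the private key can be checked with a few lines of Python. This is a minimal sketch: key_is_private is a hypothetical helper, and it only tests that no group or other permission bits are set; it does not check that the file is owned by sam.

```python
import os
import stat


def key_is_private(path):
    """Return True if neither group nor others have any permission on path."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    return (mode & (stat.S_IRWXG | stat.S_IRWXO)) == 0
```

    The path to pass in is the one returned by sam_gsi_read_config SAM_GSI_GRIDFTP_X509_KEY. A key file with mode 400 or 600 passes this check; one with mode 644 does not.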

    Step 7: Test your certificate (optional)
    DO THIS STEP ONLY AFTER Step 6 is completed.
    The following commands are for Bourne-type shells (sh, bash). Change them accordingly if you use a different shell.

    %sam> setup sam_gsi_config -q vdt
    %sam> export X509_USER_CERT=`sam_gsi_read_config SAM_GSI_GRIDFTP_X509_CERT`
    %sam> export X509_USER_KEY=`sam_gsi_read_config SAM_GSI_GRIDFTP_X509_KEY`
    %sam> export X509_CERT_DIR=`sam_gsi_read_config SAM_GSI_GRIDFTP_CERT_DIR`
    %sam> grid-proxy-init
    %sam> grid-proxy-destroy
    
    The last 2 commands create and destroy a valid proxy, if everything is OK. IMPORTANT: you must destroy the proxy (grid-proxy-destroy command) after the test in order for sam_gridftp to work correctly afterwards.
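
    If the test is driven from a script rather than typed by hand, the three exports above can be collected into one mapping. The sketch below assumes that sam_gsi_read_config prints the requested value on standard output (as the backquotes in the shell commands imply) and that setup sam_gsi_config -q vdt has already been done in the calling environment. The function names are hypothetical.

```python
import subprocess

# Environment variable -> sam_gsi_config key, as in the shell commands above.
GSI_VARIABLES = {
    "X509_USER_CERT": "SAM_GSI_GRIDFTP_X509_CERT",
    "X509_USER_KEY": "SAM_GSI_GRIDFTP_X509_KEY",
    "X509_CERT_DIR": "SAM_GSI_GRIDFTP_CERT_DIR",
}


def read_config(key):
    """Run sam_gsi_read_config for one key and return its output."""
    result = subprocess.run(
        ["sam_gsi_read_config", key],
        check=True, capture_output=True, text=True,
    )
    return result.stdout.strip()


def gsi_environment(reader=read_config):
    """Return the X509_* settings needed by grid-proxy-init."""
    return {var: reader(key) for var, key in GSI_VARIABLES.items()}
```

    The reader argument exists so that the mapping can be exercised without the SAM tools installed, for example by passing a function that returns fixed paths.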

    Step 8: Configure sam_gridftp
    For the default setting, you will not need to answer any questions.

    %products> ups tailor sam_gridftp -q vdt
    
    For experts:
    1) you can customize your installation by editing the configuration file ${UPS_THIS_DB}/sam_gridftp/sam_gridftp.config
    2) to tighten the security of the gridftp server, edit the access file ${GLOBUS_LOCATION}/etc/ftpaccess
    Full instructions at http://www.landfield.com/wu-ftpd/man/current/ftpaccess.html
    A typical access file allows reading only from the sam cache area (/sam in the example below) and denies all uploads to the sam home area (/home/sam in the example):
    noretrieve /*
    allow-retrieve /sam
    upload /home/sam * no
    
    The gridftp daemon must be restarted for changes to take effect.

    Step 9: Get the sam service authorization list (grid-mapfile)
    DO THIS STEP ONLY AFTER Steps 5, 6 and 8 are completed.

    
    %sam>  setup sam_gsi_config_util -q vdt
    %sam>  sam_gsi_get_gridmap --overwrite-gridmap
    
    This step is also a test of sam_gridftp: if you can get the new list, your gridftp client is configured properly.
    You can also get a grid-mapfile for a server other than gridftp. Use the -help option for more details.
    You may want to establish a cron job as user sam that keeps this list up to date. The cron command will look something like
    0 * * * * . /usr/local/etc/setups.sh && setup sam_gsi_config_util -q vdt && sam_gsi_get_gridmap --overwrite-gridmap --no-default-gridmap > /dev/null 2>&1
    
    For Experts: You can force sam_gsi_get_gridmap to retrieve a grid-mapfile for a given VO using the environment variable SAM_GSI_CONFIG_VO=THE_VO. This overrides the default configured for the given server at Step 2. This can be useful for central systems like d0mino, where both the cdf and d0 grid-mapfiles are updated and merged in a single file.

    Step 10: Make SAM aware of gridftp

    1) Edit sam_cp_config file to include sam_gridftp as a protocol for your sam installation
    %products>  setup sam_cp_config
    %products>  EDIT $SAM_CP_CONFIG_FILE 
    
    Add/modify 2 lines to the dictionary DOMAIN_CAPABILITY_MAP. Example
          'your.node.name'    : [ 'sam_gridftp', ],
          'd0mino01.fnal.gov' : [ 'sam_kerberos_rcp', \
                                  'rcp',\
                                  'sam_gridftp', \
                                  'enstore', ],
    
    2) As sam, add the gridftp daemon to the node_server_list.txt (typically under ~sam/private/). The additional line will look something like
    gridftp prd vX_X 
    
    Optional Step: Tuning of the gridftp configuration
    This section documents all the parameters that can be configured within sam_gridftp. It is meant for experts or for people with special requirements at their site, such as firewalls.
    The values of parameters can be modified by editing the file $SAM_GRIDFTP_CONFIG_FILE (available after setup sam_gridftp). The parameters are described in the configuration file. This is the default configuration file:

    #What protocol is prepended to the remote file to form a url for globus-url-copy
    RemoteProtocol gsiftp
    
    #What protocol is prepended to the local file to form a url for globus-url-copy
    LocalProtocol file
    
    #Number of parallel data streams. This parameter is ignored and 
    #set internally to 1 if FTPActiveMode is false 
    NStreams 1
    
    #In active mode the client opens the data channel and instructs the server to
    #connect to it. In passive mode, the server opens the data channel and
    #instructs the client to connect to it.
    #The mode should be selected depending on the configuration of the network.
    #Note that multistreamed communication is implemented in active mode only.
    #This parameter takes precedence over the NStreams parameter.
    FTPActiveMode true
    
    #Size (in bytes) of the buffer to be used by the underlying ftp data channels
    #Default is globus-url-copy default
    TCPBufferSize default
    
    #Size (in bytes) of the buffer to be used by the underlying transfer methods
    #Default is globus-url-copy default
    BlockSize default
    
    #Data port range. Useful when running gridftp behind a firewall.
    #The port range is comma separated e.g. 50001,50100
    #This option initializes the GLOBUS_TCP_PORT_RANGE environment variable
    #The default does not impose any restriction on the ports for the data channel.
    TCPPortRange default
    
    #Runs gridftp with the -dbg option.
    #Warning: this is very verbose. If it is used in conjunction with the
    #sam station, the output may be bigger than the eworker string buffer.
    DebugMode false
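
    The interaction between these parameters (NStreams is forced to 1 unless FTPActiveMode is true, and TCPPortRange feeds GLOBUS_TCP_PORT_RANGE) can be made concrete with a small parser. This is a sketch based only on the comments in the file above; the function names are hypothetical and the real sam_gridftp code may read the file differently.

```python
def parse_gridftp_config(text):
    """Parse 'Name value' lines, skipping blank lines and '#' comments."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        parts = line.split(None, 1)
        settings[parts[0]] = parts[1].strip() if len(parts) > 1 else ""
    return settings


def effective_streams(settings):
    """NStreams only applies in active mode; otherwise one stream is used."""
    if settings.get("FTPActiveMode", "true").lower() != "true":
        return 1
    return int(settings.get("NStreams", "1"))


def port_range_environment(settings):
    """Return the GLOBUS_TCP_PORT_RANGE setting implied by TCPPortRange, if any."""
    value = settings.get("TCPPortRange", "default")
    if value == "default":
        return {}
    return {"GLOBUS_TCP_PORT_RANGE": value}
```

    For a site behind a firewall, this shows why setting NStreams alone has no effect when FTPActiveMode is false: the stream count stays at 1 regardless of the configured value.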
    

  4. Installing and Configuring sam_bbftp:

    sam_bbftp is a wrapper around the bbftp protocol, used only by the 'sam' account between nodes which have been configured to use it.

    You will need to obtain the sam_bbftp authorization string by sending an email request to sam_admin before proceeding.

    Installation:
        # from a product maintainer account:
        $ setup upd
        $ upd install -G -c sam_bbftp
    
        $ ups tailor sam_bbftp
            # location of authorization file
            # number of retry-attempts (we recommend 2)
            # number of parallel streams (we recommend 2)
    
        # from the sam account:
        $ ups installAsSam sam_bbftp
            # authorization string here
    
    


  5. Installing and Configuring sam_kerberos_rcp:

    sam_kerberos_rcp is a wrapper around kerberized rcp, specifically for use with the special 'sam' principals in the FNAL.GOV domain. Note that the kerberos rcp command itself may be wrapped using fcp (or a similar product), which controls the total number of incoming or outgoing rcp file transfers.

    • Your system administrator must request the creation of special sam/d0/<node.domain>@FNAL.GOV principals
    • Your system administrator must create special keytab files (typically, /var/adm/krb5/sam_keytab).
    • sam account must be able to successfully obtain tickets using
            $ kinit -k -t </path/to/sam_keytab> sam/d0/<node.domain>@FNAL.GOV
          
    • Once these steps are complete, install and tailor via:
            $ setup upd
            $ upd install -G -c sam_kerberos_rcp
            $ ups tailor sam_kerberos_rcp
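
    The principal name in the kinit bullet above follows a fixed pattern, so it can be generated from the node name instead of being typed by hand. The sketch below builds the principal and the kinit argument list; the function names are hypothetical, and the 'd0' component and FNAL.GOV realm are simply the values used in the example above.

```python
def sam_principal(node, experiment="d0", realm="FNAL.GOV"):
    """Return the special sam principal for a fully qualified node name."""
    return "sam/%s/%s@%s" % (experiment, node, realm)


def kinit_command(keytab, node):
    """Return the kinit argument list that obtains a ticket from a keytab."""
    return ["kinit", "-k", "-t", keytab, sam_principal(node)]
```

    The returned list can be passed to subprocess.run as user sam to confirm that the keytab and principal created by the system administrator actually work together.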
          

    New in v4_0_8: as user sam, you can now use the utility

        $ setup sam_kerberos_rcp
        $ sam_kinit.sh
    
    in order to authenticate even if you do not need to rcp any files (for example, if you need to authenticate in order to update a cvs package).

  6. Installing sam_encp:

    sam_encp is a wrapper around encp copy commands, in order to make sure that the appropriate options and parameters are set.

    Installation: Your node must be authorized to mount /pnfs space and read/write directly from D0's enstore system.

        $ setup upd
        $ upd install -G -c sam_encp
    


For further information, contact sam-admin@fnal.gov
=============================================================================
Project  : SAM
Package  : $Name:  $
Revision : $Revision: 1.1 $
Modified : $Date: 2002/04/17 16:03:42 $ by $Author: lauri $

This work is part of a development project, called SAM, which consists
of a number of coordinated packages each named sam_xxxx.  Notice of
authorship, copyright status, and terms and conditions, should the
software eventually become available for use outside Fermilab, can be
found in the README and LICENCE files in the top level directory of
the main sam package.
=============================================================================