Using the SAM Administrative Console
1. Introduction
All important events in the SAM system, such as station events, project
manager actions, file transfers, consumers starting and
completing file processing, and errors, are recorded in a
log file common to the entire SAM system. This information enables
one to examine performance parameters, observe system
operation, diagnose problems and accumulate information to further
tune the system. The entries in the log file are
parsed by the info_server and cached in memory. This time-dependent
data is then available to administrative clients that can view the
information as charts or tables. The following parameters are available
from the info_server (encp is the copy procedure provided by Enstore):
- Number of files authorized for delivery by the SAM global optimizer.
- Number of files successfully delivered to the SAM system by encp.
- Number of files unsuccessfully delivered to the SAM system by encp.
- Number of files per minute transferred from Enstore to the SAM system.
- Average data throughput in MBps from Enstore to the SAM system.
- Number of files per minute transferred to a particular station, project,
  consumer combination.
- Average data throughput in MBps transferred to a particular station, project,
  consumer combination.
- Total number of files transferred to a station, project, consumer combination.
- Total number of file errors for each consumer.
Refer to the sam_info_server
documentation for additional details about this part of the system.
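As a rough illustration of how two of these parameters could be derived from delivery events, here is a minimal Python sketch. The event layout (seconds since the start of the window, file size in MB) and the function names are assumptions made for this example; the actual log format and info_server internals are not described in this document.

```python
from collections import defaultdict

# Hypothetical delivery events: (seconds since start of window, file size in MB).
# The real SAM log format is not described here; this layout is an assumption.
events = [(5, 120.0), (42, 80.0), (70, 100.0), (185, 60.0)]

def files_per_minute(events):
    """Count delivered files in each one-minute bin (bin 0 = first minute)."""
    counts = defaultdict(int)
    for t, _size in events:
        counts[int(t // 60)] += 1
    return dict(counts)

def average_throughput_mbps(events, window_seconds):
    """Total MB delivered divided by the length of the window in seconds."""
    total_mb = sum(size for _t, size in events)
    return total_mb / window_seconds

per_minute = files_per_minute(events)
avg = average_throughput_mbps(events, window_seconds=240)
print(per_minute)
print(avg)
```

Minutes with no deliveries simply do not appear in the result; a charting client would treat them as zero.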
2. SAM Admin Charts, Histograms, and Tables
The SAM admin tools currently consist of a Java application known
as the SAM Administration Console. When a console is activated, it contacts
the info_server and requests information. With this tool the operation
of the entire system can be monitored. Following is a walk through the
charts, histograms and tables currently available using this tool. The
console can be activated to monitor either SAM development or production
activities. The recommended browsers are Netscape 4.5 or higher, and IE
5.0 or higher.
2.1 Main Console Page
The main console page enables setting up the time period to monitor
and the item of interest. Figure 1 shows the console page. To monitor
up to the current time, select "now"; otherwise, use the "enter date"
option and type in the date and time at which you would like the
monitoring period to end. Then, select a time range to view, from 1
day to 1 week. Press the "refresh" button to update the information
in the charts. Finally, select the parameter of interest from the tabs
above the charting region of the window.
Figure 1. The SAM Admin Console.
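The time selection described above amounts to choosing an end time and a range. The following Python sketch shows that calculation only; the function name and arguments are hypothetical and are not part of the console.

```python
from datetime import datetime, timedelta

def monitoring_window(end=None, range_days=1):
    """Return (start, end) for the monitored period.

    end=None stands for the console's "now" option; otherwise pass the
    date and time typed under "enter date". range_days is limited to
    1..7, matching the one-day to one-week ranges described above.
    """
    if not 1 <= range_days <= 7:
        raise ValueError("range must be between 1 day and 1 week")
    end = end or datetime.now()
    return end - timedelta(days=range_days), end

# Example: a one-week window ending at a date chosen for illustration.
start, end = monitoring_window(datetime(2000, 3, 15, 12, 0), range_days=7)
print(start, end)
```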
2.2 Data Movement Charts
The number of files and data throughput for the system are shown
in Figures 2 and 3 as a function of time. These charts allow us to
observe the throughput for the various modes of access in the system.
Figure 2: Files throughput for the various modes of access.
Figure 3: Average system throughput in MBps for the various
modes of access.
Figure 4. Diagram of overall system throughput from Enstore, and to
all consumers. The disk cache is not currently enabled.
2.3 Histograms
Under the Histogram tab, histograms for several important SAM and
Enstore parameters are available.
Figure 5 shows the histogram viewing area and the pulldown menu with the
available options.
Figure 5. Histogram selection with pulldown menu of available
parameters.
The available options are:
- Authorization latency: Within the global optimizer is an algorithm
  which will eventually determine the relative number of file transfers
  allowed for each part of the system, or access mode. This algorithm is
  currently just a random wait time, which gives the system more realistic
  performance features. The distribution of authorization latencies is
  shown in Figure 5.
- Enstore File seek time: The time to seek the file on the tape
  after it has been mounted.
- Enstore Queue wait time: The time the request waits in the Enstore
  queue.
- Enstore overall latency: This includes contributions from tape mount,
  file seek, and queue wait, as well as other incidental factors.
- Consumer Filename Latency: The time a given consumer must wait from
  when a file is requested to when the file is ready for processing.
- File size: The size of each file transferred.
- Enstore mount time: The time required for the tape mount.
- Instantaneous Transfer rates: The transfer rates referred to here
  as "instantaneous" are the size of the file divided by the time
  spent transferring the file. These transfer rates represent the throughput
  of the source device, network, and destination device combination.
  Instantaneous transfer rates do not include the effect of latencies in
  the system.
Figure 6 shows the histogram display area with the available transfer
mode options.
Figure 6: Histogram selection with pulldown menu
of available data access modes.
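The definition of the instantaneous transfer rate given above (file size divided by time spent transferring) can be sketched as follows, together with a simple fixed-width binning of the kind a histogram display needs. The sample transfers, the bin width, and the function names are assumptions for illustration, not values or code from the SAM system.

```python
def instantaneous_rate_mbps(size_mb, transfer_seconds):
    """File size divided by the time spent transferring that file."""
    return size_mb / transfer_seconds

def histogram(values, bin_width):
    """Count values in fixed-width bins, keyed by each bin's lower edge."""
    counts = {}
    for v in values:
        edge = int(v // bin_width) * bin_width
        counts[edge] = counts.get(edge, 0) + 1
    return dict(sorted(counts.items()))

# Hypothetical transfers: (file size in MB, seconds spent transferring).
transfers = [(100.0, 20.0), (250.0, 25.0), (60.0, 10.0), (90.0, 9.0)]
rates = [instantaneous_rate_mbps(s, t) for s, t in transfers]
rate_histogram = histogram(rates, bin_width=5)
print(rates)
print(rate_histogram)
```

Queue wait, mount, and seek times do not enter this calculation, which is why the instantaneous rate excludes the effect of latencies.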
2.4 Summary Tables
In addition to the time charts and histograms, tabular summaries
of the system performance are available. There are currently two
tables and additional ones are in progress. A global summary of the system
is shown in Table 1. Summary information in each row of this
table is for a particular time duration and includes the number of
active stations, projects and consumers; the number of successful
and failed file transfers from Enstore to SAM; the average encp latency
time (with std. dev. after the /); the average file size (with std.
dev. after the /); the average file transfer (throughput) rate; and the
average instantaneous file transfer rate. The average transfer rate is
the total number of MB transferred over the duration of the period
divided by the time in the period. The average instantaneous transfer
rate is the average of the rates of the individual transfers, not
including the effect of latencies.
Table 1: Global Summary.
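The difference between the two rates in the global summary is easiest to see with numbers. The Python sketch below uses made-up transfers and a made-up period length. The use of the population standard deviation is also an assumption, since the document does not say which form the table uses.

```python
from statistics import mean, pstdev

# Hypothetical transfers in one period: (file size in MB, transfer seconds).
transfers = [(100.0, 10.0), (200.0, 40.0)]
period_seconds = 100.0  # assumed length of the summary period

sizes = [size for size, _secs in transfers]

# Average transfer rate: total MB moved divided by the whole period,
# so idle time and latencies pull it down.
average_rate = sum(sizes) / period_seconds

# Average instantaneous rate: mean of each file's size / transfer time.
average_instantaneous = mean(size / secs for size, secs in transfers)

# "mean/std. dev." in the style of the table columns (population std. dev.
# is an assumption made for this sketch).
summary = f"{mean(sizes):.1f}/{pstdev(sizes):.1f}"

print(average_rate, average_instantaneous, summary)
```

With these numbers the average rate is 3.0 MBps while the average instantaneous rate is 7.5 MBps, because only 50 of the 100 seconds were spent transferring.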
More detailed information about the performance of the system for each
project is contained in the Project Summary shown in Table 2.
This table provides information similar to Table 1, but broken down
by station and project for a single time period. This information
is useful to monitor the relative resource usage of the various stations
and projects active in the system. In the future, we
plan to test the operation of many stations and projects running simultaneously,
and this table will be useful in understanding the operation and tuning
the performance of the system, as well as spotting and diagnosing problems.
Table 2: Project Summary.
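A per-project breakdown of this kind is a grouping of transfer records by station and project. The sketch below shows one way to do that in Python; the record layout, field names, and sample values are hypothetical and do not reflect the info_server's data model.

```python
from collections import defaultdict

# Hypothetical records: (station, project, file size in MB, delivered OK?).
records = [
    ("station0", "project0", 100.0, True),
    ("station0", "project0", 50.0, False),
    ("station0", "project1", 80.0, True),
    ("station1", "project2", 120.0, True),
]

def project_summary(records):
    """Group transfers by (station, project) and tally counts and MB delivered."""
    summary = defaultdict(lambda: {"delivered": 0, "failed": 0, "mb": 0.0})
    for station, project, size_mb, ok in records:
        row = summary[(station, project)]
        if ok:
            row["delivered"] += 1
            row["mb"] += size_mb
        else:
            row["failed"] += 1
    return dict(summary)

table = project_summary(records)
for (station, project), row in sorted(table.items()):
    print(station, project, row)
```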
Another useful way of looking at the system is through the "System
Explorer" tree display shown in Figure 12. This allows one to drill down
the hierarchy for any active station to observe project and consumer status
and errors. In the example shown in the figure, the tree under the station
named "Station 0" has been expanded to show active projects, and
"Project0" has been expanded to show consumers. The status for Consumer
1 is shown on the right.
2.5 Cumulative Charts
Cumulative charts show the integrated values of data transferred
including files delivered, files authorized, and MBytes delivered.
These are shown in Figures 7, 8 and 9.
Figure 7. Total number of files delivered for each
access mode.
Figure 8. Cumulative number of files authorized by SAM.
Figure 9. Cumulative number of MBytes transferred for each mode of access.
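A cumulative chart is a running sum of the per-interval values. A minimal Python sketch, using invented per-interval counts rather than real SAM data:

```python
from itertools import accumulate

# Hypothetical per-interval values for one access mode.
files_per_interval = [3, 0, 5, 2]
mb_per_interval = [300.0, 0.0, 450.0, 150.0]

# Running totals: each point is the sum of all intervals up to that time.
cumulative_files = list(accumulate(files_per_interval))
cumulative_mb = list(accumulate(mb_per_interval))

print(cumulative_files)
print(cumulative_mb)
```

An interval with no transfers leaves the curve flat, so pauses in delivery show up as plateaus in the cumulative charts.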
3. Future Plans
As we use the system, we plan to add more functionality and make it easier
to use. More information will be available from the info_server,
including consumer file processing elapsed time, CPU time,
and additional error information. The system is currently able to
retrieve only recently logged information. After some period of time, a day
or a week, this data will need to be archived into the SAM database, and the
system will be improved so that it can also access the archived information.
More charts and tables will be added as needed to understand the system
and the correlations between its many parts.