Release 1.1 of the LISTSERV optional Line Monitor package (LMON) is now
available from [log in to unmask]. It can be retrieved from the registered
LISTSERV contact account via a "TELL LISTSERV AT CEARN GET LMON PACKAGE"
command.
You are also urged to AFD or FUI to the package, so that you are notified
of changes/fixes to the code.
*********************************
* Fixes and other minor changes *
*********************************
- All references to the SPLIT option and to the planned Transparent File
Splitting system have been removed. This package will not be developed.
- Information about held files is now automatically validated whenever
the held files are located on the local node. That is, if a file is
marked "held" by the line monitor and subsequently gets purged by an
operator, or dumped to tape and subsequently restored under another
spoolid, LISTSERV will realize that the file is no longer here and
delete it from the /LMON QUERY HELD output.
******************************************
* Performance and usability enhancements *
******************************************
- It is no longer necessary to COLD start LISTSERV to reset the line
monitor counters. A new command, /LMON RESET, has been provided for
that purpose. Although LMON V1.1 still behaves in the same way when
COLD-started, it is now possible to operate LMON 24 hours a day, 365
days a year, without ever having to COLD-start LISTSERV.
- A fast checkpoint procedure has been implemented to improve the
performance of the /LMON CKPT and LISTSERV STOP commands, which could
take over one minute on "hub" nodes. The time required to checkpoint
the line monitor data to disk has been reduced by a factor of 20
through the use of a GLOBALV PUT and a number of EXECIO DISKW
instructions instead of the GLOBALV PUTP command. Anybody who would
like to APAR the disastrous performance of GLOBALV xxxP can contact me
for more information.
- A new option, which will be the default for new V1.1 sites, has been
provided to allow routing information to be kept only in storage
GLOBALV. This improves the performance of LISTSERV by dramatically
reducing the size of the LASTING GLOBALV file, by avoiding costly
GLOBALV SETP commands and by decreasing the average number of variables
in the LMON GLOBALV group (and hence the number of DMSFREEd blocks in
storage). The obvious drawback is that the table will be reset every
time LISTSERV is rebooted, and that more commands will be issued to
RSCS if LISTSERV is rebooted relatively often.
- LMON is now able to automatically purge files which exceed a predefined
threshold, in much the same way as it can be made to hold them. This is
particularly useful for monitoring printer links, or links to "gateway"
nodes (like CERNVAX) which will not process files in excess of a
certain size. This facility must be explicitly activated (in much the
same way as automatic hold) and is disabled by default. The line
monitor will hold files whose size lies between the "hold" and the
"purge" thresholds.
- A new command, /LMON PURGE, has been provided to allow the operators to
manually purge files held or otherwise queued on a given link.
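
As a rough illustration of how the "hold" and "purge" thresholds interact,
here is a minimal Python sketch. It is not LMON code (LMON itself is a set
of EXECs); the function name classify_file, the size unit and the handling
of a file whose size exactly equals a threshold are all assumptions made
for the example.

```python
from typing import Optional


def classify_file(size: int,
                  hold_threshold: Optional[int] = None,
                  purge_threshold: Optional[int] = None) -> str:
    """Return "purge", "hold" or "pass" for a file of the given size.

    A threshold of None means the facility has not been activated,
    mirroring the fact that automatic purge is disabled by default.
    Treating only sizes strictly above a threshold as exceeding it
    is an assumption of this sketch.
    """
    # The purge threshold is checked first: it is the larger of the two.
    if purge_threshold is not None and size > purge_threshold:
        return "purge"
    # Files between the two thresholds are held rather than purged.
    if hold_threshold is not None and size > hold_threshold:
        return "hold"
    return "pass"


# Hypothetical thresholds: hold above 1000, purge above 10000.
for size in (500, 5000, 50000):
    print(size, classify_file(size, hold_threshold=1000, purge_threshold=10000))
```

With both thresholds left at None the function always returns "pass",
which corresponds to a site that has activated neither facility.
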
*****************************************************************
* New facilities requested by the EARN Network Operations Group *
*****************************************************************
The following new facilities have been added to LMON V1.1 at the request
of the EARN Network Operations Group. The reason why they were
implemented at all is that I had agreed to do so before I decided to
withdraw my support from EARN. Once the NOG has confirmed (by time-out)
that the new facilities do indeed correspond to what they had asked for,
I will provide no additional development for this new code (I may
maintain it, though, if enough BITNET/NetNorth nodes express some
interest). The following is an excerpt from a note sent to the
EARN-NOG list.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
Wrap test capability
--------------------
A new command, '/LMON WRAPTEST nodeid (<SIZE nnn> <RECORD>' has been
implemented to perform wrap tests. Operators and postmasters will be
allowed to use it, but they will be kindly asked not to specify the
RECORD option when they do so. The test will be performed, the person who
asked for the test will get the result, but it will not be recorded
anywhere.
In a number of judiciously chosen places, a number of '/LMON WRAPTEST
nodeid (RECORD SIZE nnn' should be put in the WAKEPARM FILE, at selected
times/intervals of time, with suitable values of nodeid. Deciding what
'selected', 'judicious' and 'nnn' mean IS a serious issue, but let's not
discuss it for now. My present problem is to make sure this is what you
asked for in Tel-Aviv, and we'll discuss the details when this has been
agreed to.
Anyway, we'll have a number of LISTSERVs performing wrap tests from time
to time, and logging the results to a disk file. Once a month, this file
will have to be sent to a central server (or maybe a human person), and
subsequently erased. This can be done either manually or from the
WAKEPARM FILE. Now, question number 1, what do you want to do with this
information? What kind of report do you want to generate? MTWTFSS curves,
time-of-the-day curves, delay as a function of nnn? My impression is that
we want to feed this stuff into SAS and play a bit with it on a graphics
terminal, to see what interesting results we can get. I will provide a
trivial exec to send this file with a standard header to a standard place
every month, but I will NOT provide any kind of SAS application to
process it.
Availability reports
--------------------
The new version of LMON lets you define an unlimited number of "reports",
each of which is basically a disk file into which the frozen values of
all the link counters are stored at regular intervals of time. The first
time you request a report after installing LMON V1.1, you will be told it
was never initialized, and it will be automatically initialized. Anytime
afterwards, you will be able to request information on what happened
since the last time this report was "frozen", and, optionally, you will
be allowed to "freeze" the counters again (this is called "checkpointing
the report") upon successful completion of the command. The idea is that
you should generate and checkpoint a WEEKLY report every week, a MONTHLY
one every month, etc. Each report has an "expected duration", expressed
as X +/- Y. If the difference between the time you requested the report
and the time it was last checkpointed, i.e. the period of time to which it
applies, does not lie within (X-Y,X+Y), you will get a warning message on
the output. The default reports are DAILY (24h +/- 1h), WEEKLY (7 days
+/- 1h), MONTHLY (29.5 days +/- 37h) and DEFAULT, which has no expected
duration and can therefore be used for very short periods without
generating a warning. You can create as many as you want, and you don't
have to use the ones supplied by default, but I suggest that WEEKLY and
MONTHLY reports should be produced at all the major nodes, maybe MONTHLY
reports only on relatively minor nodes.
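
The "expected duration" check described above can be sketched in a few
lines of Python. This is only an illustration of the X +/- Y rule, not the
actual LMON implementation; the function name duration_warning and the use
of None for a report with no expected duration (like DEFAULT) are
assumptions.

```python
from datetime import datetime, timedelta
from typing import Optional


def duration_warning(last_checkpoint: datetime,
                     requested: datetime,
                     expected: Optional[timedelta],
                     tolerance: Optional[timedelta]) -> bool:
    """Return True when the elapsed time is outside (X-Y, X+Y).

    expected is X and tolerance is Y. A report without an expected
    duration never produces a warning.
    """
    if expected is None or tolerance is None:
        return False
    elapsed = requested - last_checkpoint
    # The interval is open, as in the (X-Y,X+Y) notation of the text.
    return not (expected - tolerance < elapsed < expected + tolerance)


# A DAILY report (24h +/- 1h) requested 19 hours and 40 minutes after
# the last checkpoint.
print(duration_warning(datetime(1988, 11, 26, 0, 2),
                       datetime(1988, 11, 26, 19, 42),
                       timedelta(hours=24),
                       timedelta(hours=1)))
```

The dates in the usage example are those of the sample daily report shown
further down, where 19 hours and 40 minutes falls outside the expected
range and a warning is printed.
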
Each report can be generated in SHORT (#Obs, #Down, %Down) or FULL form
(everything else from /LMON QUERY OBSERVATIONS) for human readers, or in
INTERNAL form for programs. You can select which nodes get listed, so
that you can easily suppress local nodes and printers from the output. I
expect the selected LMON nodes to have entries in their WAKEPARM FILE,
where you would specify for example that a '/LMON REPORT WEEKLY EARN
(CKPT INTERN TO master-server' would be executed each Sunday around
midnight. Nodes which don't run LMON would have to somehow manage to
create a file in INTERNAL format and have it sent to the master server on
a regular basis. The master server would append all this info into a
file, and, question number 2, what do you want to do with this data? It
would be very easy to generate a series of human-format reports on this
master server, and somehow make them available to any EARN users, but
that's still a large amount of data and I'm not sure it's really useful.
What is it you really want reported?
-------------------------------------------------------------------------
Full daily Line Monitor report for node CEARN
---------------------------------------------
From: Sat, 26 Nov 1988 00:02
Until: Sat, 26 Nov 1988 19:42
--> Elapsed time: 19 hours and 40 minutes
Userid: LISTSERV@CEARN
Scope: ALL links
--> WARNING: Elapsed time is not within expected range (1 day +/- 1h)
Linkname   #Obs #Down  %Down #Held #Purged #Restart #Stall #Active #Inactive
---------- ---- ----- ------ ----- ------- -------- ------ ------- ---------
CMU_GENEVE  389   389 100.0%     0       0        0     37     389         0
MINT        389   314  80.7%     0       0        0     21     314         0
UKACRL      389   251  64.5%     0       0        0     23     251         0
BERNE       389    94  24.2%     0       0       13      0       0        94
USA         389    21   5.4%     0       0        0      0      21         0
UNICC       389    12   3.1%     0       0        4      0       2        10
FRMOP22     389     0   0.0%     0       0        0      0       0         0
SEARN       389     0   0.0%     0       0        0      0       0         0
NEUCHATEL   389     0   0.0%     0       0        0      0       0         0
AEARN       389     0   0.0%     0       0        0      0       0         0
UNI_GENEVE  389     0   0.0%     0       0        0      0       0         0
LAUSANNE    389     0   0.0%     0       0        0      0       0         0
CEARNV2     389     0   0.0%     0       0        0      0       0         0
CERNVM      389     0   0.0%     0       0        0      0       0         0
LEPICS      389     0   0.0%     0       0        0      0       0         0
ZURICH      389     0   0.0%     0       0        0      0       0         0
CRVXP173    389     0   0.0%     0       0        0      0       0         0
GEN         389     0   0.0%     0       0        0      0       0         0
DEARN       389     0   0.0%     0       0        0      0       0         0
<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
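
For readers who want to post-process such a report, the %Down column in
the sample above is consistent with #Down divided by #Obs, printed with
one decimal. The Python sketch below reproduces it under that assumption;
the function name percent_down and the behaviour for zero observations are
not taken from LMON and are purely illustrative.

```python
def percent_down(observations: int, down: int) -> str:
    """Format the share of observations in which a link was down.

    Assumes %Down = 100 * #Down / #Obs, which matches the sample
    report. Returning "0.0%" when there are no observations at all
    is an arbitrary choice made for this sketch.
    """
    if observations == 0:
        return "0.0%"
    return f"{100.0 * down / observations:.1f}%"


# (#Obs, #Down) pairs taken from the sample report for three links.
for link, obs, down in (("MINT", 389, 314), ("BERNE", 389, 94), ("USA", 389, 21)):
    print(f"{link:<10} {obs:>4} {down:>5} {percent_down(obs, down):>6}")
```
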
*****************************************
* Migration instructions for V1.0 sites *
*****************************************
Note: this does NOT apply to sites which were not previously running LMON
V1.0. New LMON sites should simply get the full package and follow
the instructions in the installation guide (LMON MEMO).
1. Make a copy of LSV$LMN* EXEC and LMON SYSVARS. Optionally, order the
new version of LMON MEMO (the installation guide). If you follow these
instructions carefully, you will not need it. The user's guide and
operator's guide have not yet been converted to V1.1, so you may need
to write a short note to your operators about the /LMON PURGE command.
2. Order LSV$LMNC EXEC, LSV$LMNI EXEC, LSV$LMNR EXEC, LMON$ MAILFORM and
LMON SYSVARS. When everything has arrived, stop LISTSERV and install
the files as indicated below.
3. LSV$LMNC EXEC, LSV$LMNI EXEC and LSV$LMNR EXEC must be placed 'as is'
on LISTSERV 191.
4. If you had any local mailforms, please append them to $LMON$ MAILFORM
before storing it back on LISTSERV 191; otherwise, you will not need
to change it.
5. Carefully observe the changes from your existing V1.0 LMON SYSVARS to
the new V1.1 one. Merge the old definitions into the new file, and
store it on LISTSERV 191.
6. Reboot LISTSERV. It may take more time than usual to start up the
first time it comes up with V1.1. When the message "Initialization
complete" has been printed, enter STOP, wait for the CMS prompt and
start LISTSERV again. The one-time-only code will have been executed,
and you will be able to observe normal startup and shutdown durations.
Good luck, Eric