Release 1.1 of the LISTSERV optional Line Monitor package (LMON) is now
available from [log in to unmask]. It can be retrieved from the
registered LISTSERV contact account via a "TELL LISTSERV AT CEARN GET
LMON PACKAGE". You are also urged to AFD or FUI to the package, so that
you are notified of changes/fixes to the code.

*********************************
* Fixes and other minor changes *
*********************************

- All references to the SPLIT option and to the planned Transparent File
  Splitting system have been removed. This package will not be
  developed.

- Information about held files is now automatically validated whenever
  the held files are located on the local node. That is, if a file is
  marked "held" by the line monitor and subsequently gets purged by an
  operator, or dumped to tape and subsequently restored under another
  spoolid, LISTSERV will realize that the file is no longer here and
  delete it from the /LMON QUERY HELD output.

******************************************
* Performance and usability enhancements *
******************************************

- It is no longer necessary to COLD start LISTSERV to reset the line
  monitor counters. A new command, /LMON RESET, has been provided for
  that purpose. Although LMON V1.1 still behaves in the same way when
  COLD-started, it is now possible to operate LMON 24 hours a day, 365
  days a year, without ever having to COLD-start LISTSERV.

- A fast checkpoint procedure has been implemented to improve the
  performance of the /LMON CKPT and LISTSERV STOP commands, which could
  take over one minute on "hub" nodes. The time required to checkpoint
  the line monitor data to disk has been reduced by a factor of 20
  through the use of a GLOBALV PUT and a number of EXECIO DISKW
  instructions instead of the GLOBALV PUTP command. Anybody who would
  like to APAR the disastrous performance of GLOBALV xxxP can contact me
  for more information.
- A new option, which will be the default for new V1.1 sites, has been
  provided to allow routing information to be kept only in storage
  GLOBALV. This improves the performance of LISTSERV by dramatically
  reducing the size of the LASTING GLOBALV file, by avoiding costly
  GLOBALV SETP commands and by decreasing the average number of
  variables in the LMON GLOBALV group (and hence the number of DMSFREEd
  blocks in storage). The obvious drawback is that the table will be
  reset every time LISTSERV is rebooted, and that more commands will be
  issued to RSCS if LISTSERV is rebooted relatively often.

- LMON is now able to automatically purge files which exceed a
  predefined threshold, in much the same way as it can be made to hold
  them. This is particularly useful for monitoring printer links, or
  links to "gateway" nodes (like CERNVAX) which will not process files
  in excess of a certain size. This facility must be explicitly
  activated (in much the same way as automatic hold) and is disabled by
  default. The line monitor will hold files whose size lies between the
  "hold" and the "purge" thresholds.

- A new command, /LMON PURGE, has been provided to allow the operators
  to manually purge files held or otherwise queued on a given link.

*****************************************************************
* New facilities requested by the EARN Network Operations Group *
*****************************************************************

The following new facilities have been added to LMON V1.1 at the request
of the EARN Network Operations Group. The reason why they were
implemented at all is that I had agreed to do so before I decided to
withdraw my support from EARN. Once the NOG has confirmed (by time-out)
that the new facilities do indeed correspond to what they had asked for,
I will provide no additional development for this new code (I may
maintain it, though, if enough BITNET/NetNorth nodes express some
interest).
The following is an excerpt from a note sent to the EARN-NOG list.

>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>

Wrap test capability
--------------------

A new command, '/LMON WRAPTEST nodeid (<SIZE nnn> <RECORD>' has been
implemented to perform wrap tests. Operators and postmasters will be
allowed to use it, but they will be kindly asked not to specify the
RECORD option when they do so. The test will be performed, the person
who asked for the test will get the result, but it will not be recorded
anywhere.

In a number of judiciously chosen places, a number of '/LMON WRAPTEST
nodeid (RECORD SIZE nnn' commands should be put in the WAKEPARM FILE, at
selected times/intervals of time, with suitable values of nodeid.
Deciding what 'selected', 'judicious' and 'nnn' mean IS a serious issue,
but let's not discuss it for now. My present problem is to make sure
this is what you asked for in Tel-Aviv, and we'll discuss the details
when this has been agreed to.

Anyway, we'll have a number of LISTSERVs performing wrap tests from time
to time, and logging the results to a disk file. Once a month, this file
will have to be sent to a central server (or maybe a human person), and
subsequently erased. This can be done either manually or from the
WAKEPARM FILE.

Now, question number 1, what do you want to do with this information?
What kind of report do you want to generate? MTWTFSS curves,
time-of-the-day curves, delay as a function of nnn? My impression is
that we want to feed this stuff into SAS and play a bit with it on a
graphics terminal, to see what interesting results we can get. I will
provide a trivial exec to send this file with a standard header to a
standard place every month, but I will NOT provide any kind of SAS
application to process it.
Availability reports
--------------------

The new version of LMON lets you define an unlimited number of
"reports", each of which is basically a disk file into which the frozen
values of all the link counters are stored at regular intervals of time.
The first time you request a report after installing LMON V1.1, you will
be told it was never initialized, and it will be automatically
initialized. Anytime afterwards, you will be able to request information
on what happened since the last time this report was "frozen", and,
optionally, you will be allowed to "freeze" the counters again (this is
called "checkpointing the report") upon successful completion of the
command. The idea is that you should generate and checkpoint a WEEKLY
report every week, a MONTHLY one every month, etc.

Each report has an "expected duration", expressed as X +/- Y. If the
difference between the time you requested the report and the time it was
last checkpointed, ie the period of time on which it applies, does not
lie within (X-Y,X+Y), you will get a warning message on the output. The
default reports are DAILY (24h +/- 1h), WEEKLY (7 days +/- 1h), MONTHLY
(29.5 days +/- 37h) and DEFAULT, which has no expected duration and can
therefore be used for very short periods without generating a warning.
You can create as many as you want, and you don't have to use the ones
supplied by default, but I suggest that WEEKLY and MONTHLY reports
should be produced at all the major nodes, maybe MONTHLY reports only on
relatively minor nodes.

Each report can be generated in SHORT (#Obs, #Down, %Down) or FULL form
(everything else from /LMON QUERY OBSERVATIONS) for human readers, or in
INTERNAL form for programs. You can select which nodes get listed, so
that you can easily suppress local nodes and printers from the output.
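To make the "expected duration" rule concrete, here is a small Python
sketch (not LMON code, which runs under CMS) of how the X +/- Y check
and the %Down figure could be computed. The function names and the
EXPECTED table are made up for illustration, and the %Down formula
(#Down as a percentage of #Obs) is an assumption that happens to match
the figures in the sample report in this note.

```python
from datetime import datetime, timedelta

# Expected durations (X, Y) of the default reports, as listed in the text.
# The table and the names used here are illustrative, not LMON's own.
EXPECTED = {
    "DAILY": (timedelta(hours=24), timedelta(hours=1)),
    "WEEKLY": (timedelta(days=7), timedelta(hours=1)),
    "MONTHLY": (timedelta(days=29.5), timedelta(hours=37)),
    "DEFAULT": None,  # no expected duration, so never a warning
}


def within_expected(report, last_ckpt, requested):
    """Return True if the elapsed period lies within (X-Y, X+Y)."""
    window = EXPECTED[report]
    if window is None:
        return True
    x, y = window
    elapsed = requested - last_ckpt
    return x - y < elapsed < x + y


def percent_down(n_obs, n_down):
    """Assumed definition: #Down as a percentage of #Obs, one decimal."""
    return round(100.0 * n_down / n_obs, 1) if n_obs else 0.0


# Period taken from the sample DAILY report: 19 hours and 40 minutes.
start = datetime(1988, 11, 26, 0, 2)
end = datetime(1988, 11, 26, 19, 42)
print(within_expected("DAILY", start, end))  # False, hence the WARNING
print(percent_down(389, 314))                # 80.7
```

A DEFAULT report never triggers the warning because it has no expected
duration, which is why it suits very short periods.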
I expect the selected LMON nodes to have entries in their WAKEPARM FILE,
where you would specify for example that a '/LMON REPORT WEEKLY EARN
(CKPT INTERN TO master-server' would be executed each Sunday around
midnight. Nodes which don't run LMON would have to somehow manage to
create a file in INTERNAL format and have it sent to the master server
on a regular basis. The master server would append all this info into a
file, and, question number 2, what do you want to do with this data? It
would be very easy to generate a series of human-format reports on this
master server, and somehow make them available to any EARN users, but
that's still a large amount of data and I'm not sure it's really useful.
What is it you really want reported?

-------------------------------------------------------------------------

Full daily Line Monitor report for node CEARN
---------------------------------------------

From:   Sat, 26 Nov 1988 00:02
Until:  Sat, 26 Nov 1988 19:42
--> Elapsed time: 19 hours and 40 minutes

Userid: LISTSERV@CEARN
Scope:  ALL links

--> WARNING: Elapsed time is not within expected range (1 day +/- 1h)

Linkname    #Obs #Down  %Down #Held #Purged #Restart #Stall #Active #Inactive
--------    ---- -----  ----- ----- ------- -------- ------ ------- ---------
CMU_GENEVE   389   389 100.0%     0       0        0     37     389         0
MINT         389   314  80.7%     0       0        0     21     314         0
UKACRL       389   251  64.5%     0       0        0     23     251         0
BERNE        389    94  24.2%     0       0       13      0       0        94
USA          389    21   5.4%     0       0        0      0      21         0
UNICC        389    12   3.1%     0       0        4      0       2        10
FRMOP22      389     0   0.0%     0       0        0      0       0         0
SEARN        389     0   0.0%     0       0        0      0       0         0
NEUCHATEL    389     0   0.0%     0       0        0      0       0         0
AEARN        389     0   0.0%     0       0        0      0       0         0
UNI_GENEVE   389     0   0.0%     0       0        0      0       0         0
LAUSANNE     389     0   0.0%     0       0        0      0       0         0
CEARNV2      389     0   0.0%     0       0        0      0       0         0
CERNVM       389     0   0.0%     0       0        0      0       0         0
LEPICS       389     0   0.0%     0       0        0      0       0         0
ZURICH       389     0   0.0%     0       0        0      0       0         0
CRVXP173     389     0   0.0%     0       0        0      0       0         0
GEN          389     0   0.0%     0       0        0      0       0         0
DEARN        389     0   0.0%     0       0        0      0       0         0

<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<

*****************************************
* Migration instructions for V1.0 sites *
*****************************************

Note: this does NOT apply to sites which were not previously running
LMON V1.0. New LMON sites should simply get the full package and follow
the instructions in the installation guide (LMON MEMO).

1. Make a copy of LSV$LMN* EXEC and LMON SYSVARS. Optionally, order the
   new version of LMON MEMO (the installation guide). If you follow
   these instructions carefully, you will not need it. The user's guide
   and operator's guide have not yet been converted to V1.1, so you may
   need to write a short note to your operators about the /LMON PURGE
   command.

2. Order LSV$LMNC EXEC, LSV$LMNI EXEC, LSV$LMNR EXEC, LMON$ MAILFORM and
   LMON SYSVARS. When everything has arrived, stop LISTSERV and install
   the files as indicated below.

3. LSV$LMNC EXEC, LSV$LMNI EXEC and LSV$LMNR EXEC must be placed 'as is'
   on LISTSERV 191.

4. If you had any local mailforms, please append them to $LMON$ MAILFORM
   before storing it back on LISTSERV 191; otherwise, you will not need
   to change it.

5. Carefully observe the changes from your existing V1.0 LMON SYSVARS to
   the new V1.1 one. Merge the old definitions into the new file, and
   store it on LISTSERV 191.

6. Reboot LISTSERV. It may take more time than usual to start up the
   first time it comes up with V1.1. When the message "Initialization
   complete" has been printed, enter STOP, wait for the CMS prompt and
   start LISTSERV again. The one-time-only code will have been executed,
   and you will be able to observe normal startup and shutdown
   durations.

Good luck,

  Eric