These crashes were caused by a table update which triggered a rather
nasty bug that ultimately resulted in a bad pointer reference. Depending
on the system, version and configuration, there was either no visible
impact or a crash (there was no impact on SEARN, or the update would
never have been sent). In most cases, LISTSERV will restart after the
crash and load the updated table that I sent after noticing the problem.
So far it appears that the server can fail to come up after being
restarted at VM sites running LISTSERV-TCP/IP (any version) and at a
handful of VM sites running LISTSERV-NJE. For LISTSERV-NJE it is hard to
say because of the small number of error reports, but the failure seems
to happen systematically with version 1.7f and at one site out of 10
with higher versions.
This does not seem to happen on non-VM sites unless the restart procedure
releases 'jobh' files, thereby causing the bad update to be re-executed
with the same results. If the server does not come up, you can try the
following:
1. ERASE INTPEERS NAMES, ERASE BITEARN LINKSUM2 and restart. This will
always work with LISTSERV-NJE servers. LISTSERV-TCP/IP servers may
stop on a REXX error after performing #1, or they may start, depending
on configuration details. The REXX error is in LSVCHK (startup checks)
and occurs because INTPEERS NAMES is not optional on
LISTSERV-TCP/IP servers. If you have disabled startup checks, the
server will come up.
or:
2. GET INTPEERS NAMES from (say) [log in to unmask], replace it on
LISTSERV 191, ERASE BITEARN LINKSUM2 and restart. This will always
work on all systems.
On non-VM systems the server will come up and fix itself unless you have
a procedure to release the jobh files; in that case, just disable the
procedure. You can also try the equivalent of step #2 for your operating
system.
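
For sites that script their recovery, the choice between the procedures
above can be sketched as follows. This is only an illustration in Python
and not part of LISTSERV: the function name, its parameters and the
wording of the returned steps are hypothetical, and the logic simply
restates the cases described in this note.

```python
# Hypothetical helper: picks a recovery procedure from the cases in this note.
# Nothing here talks to a real server; it only returns the steps as text.

OPTION_1 = [
    "ERASE INTPEERS NAMES",
    "ERASE BITEARN LINKSUM2",
    "restart",
]

OPTION_2 = [
    "GET INTPEERS NAMES from another server and replace it on LISTSERV 191",
    "ERASE BITEARN LINKSUM2",
    "restart",
]


def recovery_steps(platform, variant, startup_checks_disabled=False,
                   releases_jobh=False):
    """Return the suggested recovery steps as a list of strings.

    platform: "VM" or anything else (treated as non-VM)
    variant: "NJE" or "TCP/IP" (only meaningful on VM)
    startup_checks_disabled: True if LSVCHK startup checks are turned off
    releases_jobh: True if the restart procedure releases jobh files
    """
    if platform != "VM":
        steps = []
        if releases_jobh:
            # Releasing jobh files re-executes the bad update.
            steps.append("disable the procedure that releases jobh files")
        steps.append("restart; the server should come up and fix itself")
        return steps

    # Option 1 always works on LISTSERV-NJE, and on LISTSERV-TCP/IP it
    # works when startup checks are disabled.
    if variant == "NJE" or startup_checks_disabled:
        return list(OPTION_1)

    # Otherwise fall back to option 2, which works on all systems.
    return list(OPTION_2)
```

For example, recovery_steps("VM", "TCP/IP") selects the second procedure,
because the first one may stop on the LSVCHK error when startup checks
are enabled.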
I apologize for the inconvenience. Every table update is tested on SEARN
before being sent. The downside of supporting all versions of CMS
from 4 to 12, all the major VM architectures (SP, HPO, XA, ESA), 13
brands of unix with typically 3 major software levels each, NT, VMS, and
so forth, is that there are way too many systems and levels and
configurations to be able to test all possible cases.
Eric