The LISTS database, originally designed with the hope that some day with
a bit of luck we might reach a thousand lists, had become less and less
usable over the years as the number of lists increased and the "one disk
file per list" approach triggered non-linear behaviour in CMS. The data
files could also get out of sync with the database index if LISTSERV was
interrupted while rebuilding its index, which unfortunately was rather
frequent, due to the time this operation (entirely coded in REXX)
normally took.
A new driver has been written for the LISTS database. While the driver
solves the problems mentioned above and divides the disk space
requirements for the database by a factor of 2-3, it is not able to
automatically convert the files produced by the old driver. Manual
intervention will be required from sites running the LISTS database when
updating to 1.7f or running the beta-test code. The remainder of this
message contains instructions for migrating to the new driver.
The first thing to do is to allocate a new minidisk for the new files
(you will be able to remove the old minidisk after the migration, or if
you are short on disk space you can erase its contents and use it as the
new minidisk). The recommended minimum is 15 3380 cylinders or about 9M. The
disk must be accessed at a fixed filemode, for performance reasons, and
its filemode must be entered in LOCAL SYSVARS under the name LISTS_FM:
LISTS_FM = 'G1'
You can (and indeed should) do this in advance, so that your server is
ready when 1.7f is shipped. 1.7e will ignore this statement.
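
For sites sizing the new minidisk, the "15 cylinders or about 9M" figure can be checked with a little arithmetic. The sketch below is only an illustration: it assumes a 3380 formatted by CMS with 4K blocks (15 tracks per cylinder, 10 blocks per track), it ignores the few blocks CMS keeps for its own directory, and the function names are made up for the example.

```python
# Rough minidisk sizing sketch. Assumptions (not from the announcement):
# 3380 geometry of 15 tracks per cylinder, CMS 4K formatting giving
# 10 blocks per track. File system overhead is ignored.
TRACKS_PER_CYLINDER = 15
BLOCKS_PER_TRACK = 10
BLOCK_SIZE = 4096


def minidisk_bytes(cylinders: int) -> int:
    """Raw capacity in bytes of a minidisk of the given size."""
    return cylinders * TRACKS_PER_CYLINDER * BLOCKS_PER_TRACK * BLOCK_SIZE


def cylinders_needed(data_bytes: int) -> int:
    """Smallest whole number of cylinders that holds data_bytes."""
    per_cylinder = minidisk_bytes(1)
    return -(-data_bytes // per_cylinder)  # ceiling division


if __name__ == "__main__":
    print(minidisk_bytes(15))           # capacity of the recommended minimum
    print(cylinders_needed(8_000_000))  # e.g. 6-7M of data plus a 1M index
```

Under these assumptions the recommended minimum leaves only modest headroom over the data and index sizes quoted further down, so sites expecting growth may want to allocate more.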
After installing 1.7f or the beta version of the LISTS driver, you will
need to take the following steps:
1. (Beta only) Delete the LISTS DBNAMES and LISTS DBINDEX files from the
old driver's disk (FILEDISK).
2. (Beta only) Due to other changes you must also do 'ERASE * CACHE'
before starting the server with the new code.
3. Once the server is started, it will report a status of "empty so far"
if you attempt to access the LISTS database. This of course is because
the disk used by the new driver is empty. When your server is ready to
accept new files, you should contact me and I will send a number of
jobs to supply your server with the data. Note that we are talking
about a total of some 100,000 lines in jobs of about 10,000 lines
each. If your line is slow or backlogged, make sure to tell me when
you would like me to send the files.
Your server will reject a number of database entries, claiming they are
"outdated". This is normal, especially for your own entries. The files I
have are a snapshot of the database, which gets updated every night, and
some entries are indeed outdated. Eventually, the normal synchronization
process will make your new database converge towards the same number of
lists as GLOBLIST FILE. If you want to load as many entries as possible
no matter whether they are outdated, you can ERASE LUPDTIME FILE before
starting, but remember that GLOBLIST FILE and the LISTS database entries
are updated concurrently. If you want LISTSERV to leave GLOBLIST FILE
unchanged, make a copy before you start and restore it afterwards.
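
To make the "outdated" behaviour easier to picture, here is a minimal model of it. This is not LISTSERV's code (which is not Python), and everything in it is an assumption made for illustration: that each incoming entry carries an update timestamp, that the server remembers the last update time it has seen for each list, and that erasing LUPDTIME FILE amounts to starting with that memory empty. The names `Entry` and `load_entries` are hypothetical.

```python
from typing import Dict, List, NamedTuple, Tuple


class Entry(NamedTuple):
    list_name: str
    updated: int  # hypothetical update timestamp; larger means newer


def load_entries(
    incoming: List[Entry], last_update: Dict[str, int]
) -> Tuple[List[Entry], List[Entry]]:
    """Split incoming entries into (accepted, rejected as outdated).

    last_update stands in for whatever record of update times the server
    keeps (assumed here); it is updated in place for accepted entries.
    """
    accepted: List[Entry] = []
    rejected: List[Entry] = []
    for entry in incoming:
        known = last_update.get(entry.list_name)
        if known is not None and entry.updated < known:
            rejected.append(entry)  # server already has newer information
        else:
            last_update[entry.list_name] = entry.updated
            accepted.append(entry)
    return accepted, rejected


if __name__ == "__main__":
    snapshot = [Entry("LOCAL-L", 100), Entry("REMOTE-L", 250)]
    # Server already knows a newer version of its own list.
    print(load_entries(snapshot, {"LOCAL-L": 180}))
    # Starting with no recorded times: everything in the snapshot loads.
    print(load_entries(snapshot, {}))
```

In this model a site's own lists are the ones most likely to be rejected, since the local server always has the freshest information about them, which matches the behaviour described above.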
Performance... You are going to save a lot of CPU time and I/O, but this
still doesn't mean 9370-class machines can realistically run the
database. The new driver loads updates to the LISTS database and GLOBLIST
FILE faster than the old driver could update just GLOBLIST FILE, but of
course this requires additional I/O's. I haven't had the courage to time
the old index building process vs the new one, but it took less than 30
sec of TCPU on SEARN to build the entire index from scratch (meaning less
than 10 sec on a 3090). The procedure that updates the index no longer
needs enormous amounts of memory. However, nothing can be done about the
fact that the database holds 6-7M of data and that the index is over 1M.
Nor is the subscriber count supplied by the X-LUPD protocol, which cannot
be changed without the risk of breaking LISTEARN. It comes from a separate
set of commands, so the information ends up in a separate disk file, from
which it must be cross-referenced. On SEARN,
which is heavily I/O bound with a single disk unit for everything, a
"dumb" search such as "search compilers in lists" takes 26 seconds of
wall clock time. I suspect a 3090-class machine would complete the search
in 5-6 seconds.
Anyway, contact me if you want to beta-test.
Eric