From the 1.8c Maintainer's Release Notes:
*****************************************************************
* Performance: Enabling reverse indexing for database functions *
*****************************************************************
(VM) Note: the guidelines in this section only apply to the new database
functions. The original VM database functions are not currently affected
by the DBRINDEX configuration variable.
The new 1.8c database functions can be configured to work in one of two
modes: with forward indexing only, and with both forward and reverse
indexing. The forward index is the file called xxx.DBINDEX. With list
archive files and other plain-text based databases, this file tells the
LISTSERV database functions where a particular database entry begins and
ends. Thus, entry #17283 in the XYZ-L list might begin at line 281 of
XYZ-L.LOG9609 and extend until line 312. The forward index also contains
frequently accessed information, such as the subject of a message or the
date on which it was posted. However, it does not contain any information
that would help in locating all the entries that contain the word 'TEST'.
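As a rough illustration of this limitation, the Python sketch below models
a forward index as a table keyed by entry number, and contrasts a word
search that must scan every entry with a lookup in a word-to-entries table
(the kind of structure a reverse index provides). All names, field layouts
and sample data are hypothetical; the actual DBINDEX and DBRINDEX file
formats are not described in these notes.

```python
import re
from collections import defaultdict

# Toy archive: file name -> list of lines (line numbers are 1-based).
archive = {
    "XYZ-L.LOG9609": [
        "Subject: Server move",
        "The server moves on Friday.",
        "Subject: Test",
        "This is only a TEST message.",
    ],
}

# Hypothetical forward index: entry number -> location and cached fields.
forward_index = {
    1: {"file": "XYZ-L.LOG9609", "first": 1, "last": 2, "subject": "Server move"},
    2: {"file": "XYZ-L.LOG9609", "first": 3, "last": 4, "subject": "Test"},
}

def entry_words(entry_id):
    """Use the forward index to fetch an entry and split it into words."""
    loc = forward_index[entry_id]
    lines = archive[loc["file"]][loc["first"] - 1 : loc["last"]]
    return re.findall(r"[A-Z0-9']+", "\n".join(lines).upper())

def search_by_scan(word):
    """Forward index only: every entry has to be read and examined."""
    word = word.upper()
    return [eid for eid in sorted(forward_index) if word in entry_words(eid)]

def build_reverse_index():
    """Map each word to the set of entry numbers that contain it."""
    rindex = defaultdict(set)
    for eid in forward_index:
        for word in entry_words(eid):
            rindex[word].add(eid)
    return rindex

reverse_index = build_reverse_index()

def search_by_reverse_index(word):
    """With a reverse index: a single lookup, no archive text is read."""
    return sorted(reverse_index.get(word.upper(), set()))
```

Both search functions return the same entry numbers; the difference is
that the scan cost grows with the size of the archive, while the lookup
cost does not.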
The reverse index (xxx.DBRINDEX), when enabled by setting the
configuration variable DBRINDEX to 1 (or letting it default to this
value), provides this functionality. A reverse index will dramatically
speed up searches on large databases. However, it will not have much
effect on smaller databases. Without a reverse index, you can still
expect a search rate on the order of 1-3M per second (elapsed) on a
typical PC server. Many lists have archives in the 1-5M range and would
not really benefit from reverse indexing. The drawbacks of reverse
indexing are:
1. The reverse index file uses up (typically) a few megabytes of disk
space, even if the archives are relatively small. This is because,
even with a small list, tens of thousands of different words are
likely to be in use. This could be an issue for sites with thousands
of small lists.
2. Building and maintaining the reverse index uses some CPU time. This
could be a problem on time-sharing systems where CPU cycles are billed
by the hour and I/O accesses are comparatively cheap.
3. The current implementation of the reverse indexing code may require
significant amounts of virtual memory for large databases. On a system
with virtual memory quotas, this would require increasing the quotas
for the LISTSERV process. On a dedicated system, it could, in some
extreme cases, require a hardware upgrade.
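To put the trade-off in rough numbers, the Python sketch below estimates
how long a full scan takes without a reverse index, using the 1-3M per
second rate quoted earlier in this section, and how much disk space
reverse index files would take across many lists. The 3M per-list index
size is an assumption standing in for "a few megabytes", the 500M archive
is a hypothetical large list, and the function names are illustrative only.

```python
def scan_seconds(archive_mb, rate_mb_per_s):
    """Elapsed time to search an archive by scanning it end to end."""
    return archive_mb / rate_mb_per_s

def reverse_index_disk_mb(num_lists, index_mb_per_list=3):
    """Total disk space for reverse indexes; 3M per list is an assumption."""
    return num_lists * index_mb_per_list

# Scan times at the fast (3M/s) and slow (1M/s) ends of the quoted range.
for size_mb in (1, 5, 500):
    fastest = scan_seconds(size_mb, 3.0)
    slowest = scan_seconds(size_mb, 1.0)
    print(f"{size_mb}M archive: {fastest:.1f} to {slowest:.1f} seconds per scan")

# Disk cost for a site with thousands of small lists.
print(f"6000 lists: about {reverse_index_disk_mb(6000)}M of reverse index files")
```

Under these assumptions a 1-5M archive is scanned in a few seconds at
most, which is why small lists gain little, while the disk cost of the
index is paid by every list regardless of size.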
None of these problems is likely to be an issue with a typical dedicated
configuration. However, we have customers who run over 6,000 lists on a
single machine, who pay an outsourcing company for every hour of CPU time
used, or who run LISTSERV on machines with 8M of RAM, and it is for such
sites that we mention these issues. An operating-system-specific
discussion follows:
- NT (default = 1): L-Soft recommends leaving reverse indexing enabled
unless your system has less than 32-64M of RAM (depending on RISC/CISC
architecture and on the size of your largest archives) or is already
paging. Most dedicated PC servers have enough resources to enable this
option without impacting overall system performance, although in some
cases a RAM upgrade could be necessary.
- unix (default = 1): L-Soft recommends leaving reverse indexing enabled
unless your system has less than 32-64M of RAM (depending on RISC/CISC
architecture, presence or absence of X-Windows and size of your largest
archive) or is already swapping heavily. Most dedicated unix servers
have enough storage to enable this option without impacting overall
system performance.
- VM (default = 0): L-Soft recommends enabling reverse indexing on VM/XA
and VM/ESA unless all your lists have small archives. On S/370 systems,
only about 10-12M of virtual storage can be made available to LISTSERV,
which is sometimes insufficient even without reverse indexing. This
option can only be enabled safely on an XA/ESA system, after switching
the LISTSERV userid to an ESA/XA or XC machine and increasing its
VMSIZE to the recommended value of 64M (sites which already need more
than 32M due to the number of lists they are hosting should add another
32M). While this may seem excessive, LISTSERV is unlikely to actually
use all that storage. Database accesses last a few seconds at most, and
any peak in storage usage would be temporary.
- VMS (default = 0): L-Soft recommends enabling reverse indexing on
systems which have 64M of memory or more (possibly less for VAX systems
or systems without DECwindows), and which are not page-bound. Note,
however, that you will need to increase the PGFLQUO quota for the
LISTSERV process before enabling this option. Depending on the number
of lists on your system and on the size of the archives for your
largest list, you will want to increase PGFLQUO by 16-64M. VAX users
should note that the default PGFLQUO set by the 1.8b installation
procedure was found to be much too conservative (regardless of the
reverse indexing issue) and has been doubled for version 1.8c. Reverse
indexing is disabled by default on VMS as it is likely to lead to
virtual storage exhaustion unless the quotas are increased.
- Windows 95 (default = 1): Since Windows 95 uses the same executable as
Windows NT, reverse indexing is enabled by default. However, a typical
Windows 95 system with 8-16M of RAM cannot support reverse indexing, and
this option should be disabled. When installing with the graphical
installation program, your SITE.CFG file will be automatically modified
to disable reverse indexing. If you install the new version manually,
you will need to make this change yourself.
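The per-platform guidance above can be condensed into a small decision
helper. The Python sketch below is only an approximation: the single 64M
cutoff is an assumption taken from the upper end of the 32-64M range, the
function names are hypothetical, and it ignores the VM requirement for an
XA/ESA machine and the VMS PGFLQUO increase, both of which still apply.

```python
# Default DBRINDEX value per platform, as listed in this section.
DEFAULT_DBRINDEX = {"NT": 1, "unix": 1, "VM": 0, "VMS": 0, "Windows 95": 1}

def suggest_dbrindex(ram_mb, paging=False, min_ram_mb=64):
    """Suggest DBRINDEX=1 only with enough memory and no heavy paging.

    min_ram_mb is an assumed cutoff, not a value fixed by these notes.
    """
    return 0 if (ram_mb < min_ram_mb or paging) else 1

def needs_config_change(platform, ram_mb, paging=False, min_ram_mb=64):
    """True if the suggestion differs from the platform default."""
    suggested = suggest_dbrindex(ram_mb, paging, min_ram_mb)
    return suggested != DEFAULT_DBRINDEX[platform]

# Example: a 16M Windows 95 machine should override the default of 1.
print(needs_config_change("Windows 95", 16))
```

A True result means the DBRINDEX setting has to be changed by hand (or by
the installer) rather than left at its default.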