LSTSRV-L Archives

LISTSERV Site Administrators' Forum

Mark L Hunnibell <[log in to unmask]>
Fri, 8 Sep 1995 00:22:37 -0400
Hello:
 
I have been working on a means to make individual message Web Pages from
some LISTSERV LOG archive files.  Unfortunately, the LOG files are on a
machine a few thousand miles from the Web Server, so I had to get a little
creative.
 
My goals were:
        1. Use a Web Page naming convention that incorporated numbers
identical to the 'index' number you would get for the message doing a
LISTSERV database search.
        2. Offer the entire archives for the list, with several index
types (Date, Author, and Subject) for the most recent 10, 50, 100, and 200
messages.
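
The two goals above can be sketched roughly as follows. This is an
illustration in Python, not the Perl 4 program described in this message.
The "msgNNNNNN.html" file name pattern, the Message record, and the
function names are all hypothetical; the only things taken from the post
are that the page name carries the message's index number and that indexes
by date, author, and subject are built for the most recent 10, 50, 100,
and 200 messages.

```python
from typing import NamedTuple


class Message(NamedTuple):
    index: int    # running number of the message within the archive
    date: str     # ISO-style date string so plain string sorting works
    author: str
    subject: str


# Index sizes named in the post.
INDEX_SIZES = (10, 50, 100, 200)


def page_name(index: int) -> str:
    """Hypothetical naming scheme: zero-padded index number plus .html."""
    return f"msg{index:06d}.html"


def recent_indexes(messages):
    """Return {(sort_key, size): [page names]} for the most recent messages."""
    by_recency = sorted(messages, key=lambda m: m.index)
    result = {}
    for size in INDEX_SIZES:
        recent = by_recency[-size:]  # fewer than `size` if the archive is small
        for key in ("date", "author", "subject"):
            ordered = sorted(recent, key=lambda m: getattr(m, key).lower())
            result[(key, size)] = [page_name(m.index) for m in ordered]
    return result
```

Because the page name is derived only from the index number, any miscount
while splitting the LOG file shifts every later page name, which is why
the delimiter handling matters so much.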
 
I now have a program set that will do what I want.  I need to put a few
parts together and make a crontab entry for it, but I'm comfortable with
the results so far.
 
The downside is that I ran into a few snags along the way involving
message delimitation. My assumption based on discussion with the L-Soft
folks some time back was that the messages in LISTSERV LOG files are
delimited with a string of 73 equal signs starting in the left column.  I
asked what would happen if someone happened to send a message to a list
with 73 equal signs in the body of their message.  They told me that
LISTSERV would move it over one or something like that so that it wouldn't
be misinterpreted.  This may be a recent revision to LISTSERV, but I found
several messages in my source LOG files that had 73 equal signs starting
in the left column.
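
To make the problem concrete, here is a minimal Python sketch of a
splitter built on that assumption (again, not the Perl 4 code from this
post). It assumes each message is preceded by a line of exactly 73 equal
signs starting in the left column; the name split_messages is made up for
the example.

```python
DELIMITER = "=" * 73


def split_messages(log_text):
    """Split LOG text into messages on lines of exactly 73 equal signs."""
    messages = []
    current = None  # None until the first delimiter has been seen
    for line in log_text.splitlines():
        if line.rstrip() == DELIMITER:
            if current is not None:
                messages.append("\n".join(current))
            current = []
        elif current is not None:
            current.append(line)
    if current is not None:
        messages.append("\n".join(current))
    return messages
```

A body line that happens to consist of 73 equal signs is indistinguishable
from a delimiter here, so one real message comes out as two and every
later message number is off by one. A line shifted right by one space
would not match, which is presumably the point of the behavior L-Soft
described.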
 
The *other* (very strange) thing I found was three entries in the 1991
archives that had *72* equal signs, followed by a space, and then
a number, like so:
 
 ======================================================================= 41
 
Both of these discrepancies threw off the message count I use for the file
names, and the parser ended up producing some bogus messages.  It took me a
while to find the source of both problems.
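
A scan along the following lines could help locate both kinds of line in
a LOG file. It is a Python sketch under stated assumptions, not the
routine used here: it assumes that a genuine delimiter is followed (after
any blank lines) by a mail header line of the form "Name: value", so a
73-sign line followed by anything else is flagged as suspect. It does not
try to decide what the 72-sign lines mean; it only reports them. The
function name audit_log is hypothetical.

```python
import re

DELIMITER = "=" * 73
# 72 equal signs, one space, then a number (the 1991 oddity).
NUMBERED = re.compile(r"^={72} \d+$")
# Assumption: the first line of a real message looks like "Date: ...".
HEADER = re.compile(r"^[A-Za-z][A-Za-z0-9-]*:")


def audit_log(lines):
    """Return (line_number, description) pairs for lines worth a second look."""
    findings = []
    for i, line in enumerate(lines):
        stripped = line.rstrip()
        if NUMBERED.match(stripped):
            findings.append((i + 1, "72 equal signs plus number"))
        elif stripped == DELIMITER:
            # Find the next non-blank line after the delimiter candidate.
            j = i + 1
            while j < len(lines) and not lines[j].strip():
                j += 1
            following = lines[j] if j < len(lines) else ""
            if not HEADER.match(following):
                findings.append((i + 1, "73 equal signs not followed by a header"))
    return findings
```

Running this over a LOG file before splitting gives a short list of line
numbers to inspect by hand, which is cheaper than discovering the problem
from a wrong page count afterwards.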
 
In the end, I settled on a pre-processing routine for the LOG files that
corrects these two file "defects" and I just had a perfect run. The
program created over 26,000 individual Web Pages in about 33 minutes
(about 13 pages created each second). I have not yet made a searching
interface for these pages, but a simple one will not be hard.
 
I am not ready to release this program to the world, but I can see it
would be useful to others and my intent is to make it available to anyone
who wants to use it.  It is written in perl 4.  My real purpose in writing
this message was to ask if anyone knew what this "72 equal sign plus a
number" thing was all about and then to ask about the filtering of the 73
equal sign lines in the messages... is that a new feature?
 
If you have any questions, please let me know.
 
Cheers
 
Mark Hunnibell                  Email: [log in to unmask]
KIDLINK Gopher/WWW Coordinator  http://www.connix.com/~markh/index.html
