LISTSERV - LSTSRV-L Archives - COMMUNITY.EMAILOGY.COM

Most of  DISTRIBUTE has  been rewritten  in PASCAL  for release  1.7f, in
order to  provide a performance  boost to  machines in regions  where the
small number of systems running LISTSERV  causes most of the load to fall
on one or two systems.
 
The following table  shows the difference in performance  between the old
and new DISTRIBUTE.  The old DISTRIBUTE was labelled "1.7e",  but in fact
it was  already slightly faster than  the code released with  1.7e due to
miscellaneous improvements both to LSVDIST2 and to the PREXX library. The
"small" jobs have a dozen recipients with a typical mix of BSMTP and
non-BSMTP delivery (2/3 BSMTP), and all recipients are served locally.
The "large" jobs have about 500 recipients, of which 100 are served
locally and the rest are forwarded to a dozen other servers. The
"relayed" jobs were originated by a server other than the one on
which they are being benchmarked; all recipient assignments and path
calculations have already been done by the originating server (for
non-leaf nodes, most jobs will be relayed ones). "Generated" jobs on the other
hand require the local server to calculate paths; typically, they are the
result of a message posted to a local mailing list. Here are the figures:
 
+-----------------+-----------------+-----------------+
|    Job type     | Ratio (old/new) | TCPU, old & new |
+-----------------+-----------------+-----------------+
| Small/relayed   |      2.40       |  0.658 & 0.274  |
| Small/generated |      2.42       |  0.687 & 0.284  |
+-----------------+-----------------+-----------------+
| Large/relayed   |      8.10       |  5.244 & 0.647  |
| Large/generated |      7.01       |  7.967 & 1.136  |
+-----------------+-----------------+-----------------+
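
As a quick check of the figures, the ratio column can be recomputed from
the two TCPU values in each row. The Python sketch below assumes that the
first value of each pair belongs to the old DISTRIBUTE and the second to
the new one; the names "tcpu" and "speedup" are illustrative only and are
not part of LISTSERV.

```python
# Old and new TCPU figures copied from the table (assumed order: old, new).
# The dictionary and function names are illustrative only.
tcpu = {
    "Small/relayed":   (0.658, 0.274),
    "Small/generated": (0.687, 0.284),
    "Large/relayed":   (5.244, 0.647),
    "Large/generated": (7.967, 1.136),
}


def speedup(old, new):
    """Ratio of old to new TCPU; values above 1 mean the new code is faster."""
    return old / new


ratios = {job: speedup(old, new) for job, (old, new) in tcpu.items()}

for job, ratio in ratios.items():
    print(f"{job:<16} {ratio:5.2f}")
```

The recomputed ratios agree with the table to within rounding of the
last digit.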
 
Generally speaking,  the larger the  job, the  more you save  (I couldn't
find a  real-life job with  more than 500  recipients, but I  suspect the
ratio would quickly reach 20). For a given number of recipients, you save
a lot more  if you forward many recipients to  other servers, because the
code that takes  care of local delivery (for mail)  was already in PASCAL
in 1.7e.  Furthermore the difference  is much higher if  local recipients
are handled via BSMTP.  For the small relayed job, about  70% of the time
is spent sending  copies of the messages  to the mailer (and  one half of
that  is TCPU-VCPU  overhead).  In other  words,  a carefully  optimized,
all-assembler program couldn't  possibly be more than 3  times faster for
small jobs, and probably no more than 2, since a number of calls must be
made to system routines whose performance is not open to optimization.
So, while a factor of 2-3 for  typical jobs may seem a bit disappointing,
this isn't because the conversion wasn't done carefully but because local
delivery does  cost a  lot in  overhead (T-V)  CPU time  and we  have now
almost reached the limit.
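
One way to read the bounds quoted above is as an Amdahl's-law style
estimate: if a fraction f of the run time cannot be reduced, the speedup
cannot exceed 1/f. The sketch below is an interpretation, not something
stated in the text: it assumes that the T-V overhead (half of the 70%
spent sending copies to the mailer, i.e. about 35% of the small relayed
job) is the irreducible part. The function name "max_speedup" is
illustrative only.

```python
def max_speedup(irreducible_fraction):
    """Upper bound on speedup when this fraction of run time cannot be reduced."""
    if not 0 < irreducible_fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    return 1.0 / irreducible_fraction


# Share of time spent handing copies to the mailer (figure from the text).
mailer_fraction = 0.70
# The T-V half of that share, assumed here to be irreducible.
overhead_fraction = mailer_fraction / 2

# Bound if only the T-V overhead is treated as fixed.
print(round(max_speedup(overhead_fraction), 2))
# Bound if all of the mailer time is treated as fixed.
print(round(max_speedup(mailer_fraction), 2))
```

With only the T-V overhead treated as fixed, the bound is a little under
3; treating all of the mailer time as fixed gives about 1.4, so the
"probably no more than 2" estimate sits between the two.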
 
The drawback of  this rewrite is of  course that 1200 lines  of REXX were
turned into  about 3000  lines of  new PASCAL  code which  now has  to be
beta-tested very carefully.  I have developed a DISTRIBUTE  test suite to
look for the most obvious errors,  but there are too many combinations of
options to  test everything (writing a  program to generate tons  of test
cases  is easy,  but then  you get  to examine  the results  manually and
decide if they are correct :-) ). Furthermore there are small differences
in behaviour  in cases where the  decision doesn't matter (due  to taking
the first vs the last enqueued recipient and so on), so the output cannot
be compared  directly with  that of  the old  code. So  I am  looking for
people with serious CPU problems and  not too much traffic who could help
with this testing.
 
  Eric