From June:
>What I don't understand is why when BYUADMIN determines that there is no
>closer LISTSERV machine to us than itself and that it must therefore pass off
>the transmission as regular mail, that it doesn't do so in the same way that
>MARIST would have done in the old days, namely by passing all four names to
>MAILER thus allowing them to be transmitted by a single BSMTP transmission
>from MAILER@BYUADMIN to [log in to unmask]. Instead, it is causing four separate
>pieces of mail to be generated at that point, one for each subscriber. Thus,
>instead of one transmission from MARIST, we are now getting four from
>BYUADMIN. Total network load may be down but our spool consumption is up.
Now I understand what you meant; it is an issue of DIST vs non-DIST, and not
DIST2 vs DIST1. First you should note that, even with non-DIST mail, the
number of recipients per message is limited to 5, to avoid generating large
headers with hundreds of people on the same node.
I can perfectly understand your point, and it can be implemented. The only
problem is performance. Lists are sorted by nodeid when they are updated,
which doesn't happen very often and doesn't cost so much CPU time, as it is
done via an XEDIT command (i.e. in assembler). The REXX code can then decide
whether or not to "close" the current set of recipients by comparing the
nodeid with that of the next recipient in the list.
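The "close the current set" logic described above can be sketched as follows. This is only an illustration in Python, not the actual REXX code; the function name, the address format and the sample node names are assumptions made up for the example. The only facts taken from the text are the limit of 5 recipients per message and the comparison of adjacent nodeids in a list already sorted by node.

```python
from typing import Iterable, Iterator

MAX_RCPTS = 5  # per-message recipient limit mentioned in the text


def batch_by_node(recipients: Iterable[str],
                  limit: int = MAX_RCPTS) -> Iterator[list[str]]:
    """Yield sets of adjacent recipients sharing a nodeid, at most `limit` each.

    Assumes `recipients` is already sorted by nodeid (hypothetical
    userid@node strings); only neighbours are ever compared.
    """
    batch: list[str] = []
    batch_node = ""
    for addr in recipients:
        node = addr.rsplit("@", 1)[-1].upper()
        # Close the current set when it is full or the nodeid changes.
        if batch and (len(batch) >= limit or node != batch_node):
            yield batch
            batch = []
        batch.append(addr)
        batch_node = node
    if batch:
        yield batch


if __name__ == "__main__":
    sample = ["ANN@NODEA", "BOB@NODEA", "CAT@NODEB"]
    for group in batch_by_node(sample):
        print(group)
```

Because only neighbours are compared, the pass is a single loop over the list, which is why it depends on the list having been sorted beforehand.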
Now with DIST2, recipients can be in any order, especially when sub-lists get
exploded and appended to the distribution. Fine, you just have to sort the
list before distributing the files, or something like that. Well the problem
is you can have really large lists of recipients (take the case of LINKFAIL
for example), and sorting anything in REXX costs a huge amount of time. For
example, it is faster to re-ACCESS the 191 minidisk and let CMS sort the FST's
than to sort the 30-or-so lists in REXX when you do a LISTS command. Sorting
500 entries is simply out of the question. Calling XEDIT to do it would be faster,
but it is ridiculous and implies either useless disk I/O or a bunch of SVC
202's to stack the data and get the results back.
To summarize, it can be done, it is not that difficult, it is very desirable,
but it could make the code twice as slow, or even worse (sorting is usually a
non-linear performance degradation :-) ).
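To illustrate why the order matters, here is a small sketch (in Python, with made-up addresses) that counts how many mail files the adjacent-nodeid comparison would generate for an unsorted DIST2-style recipient list versus the same list sorted by node. It says nothing about the cost of sorting in REXX, which is the real concern above; Python's built-in sort is used only to show the effect on the message count.

```python
def node_of(addr: str) -> str:
    """Return the nodeid part of a hypothetical userid@node address."""
    return addr.rsplit("@", 1)[-1].upper()


def count_messages(recipients, limit=5):
    """Count the mail files produced when only adjacent nodeids are compared."""
    messages = 0
    current_node, current_size = None, 0
    for addr in recipients:
        node = node_of(addr)
        # A new mail file starts when the node changes or the set is full.
        if node != current_node or current_size >= limit:
            messages += 1
            current_node, current_size = node, 0
        current_size += 1
    return messages


if __name__ == "__main__":
    # Recipients in arbitrary order, e.g. after sub-lists were appended.
    unsorted_list = ["A@NODEX", "B@NODEY", "C@NODEX", "D@NODEY"]
    print(count_messages(unsorted_list))                       # one file each
    print(count_messages(sorted(unsorted_list, key=node_of)))  # one per node
```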
From Marty:
> DIST2 cuts down on the number of copies flowing between LISTSERVs but
>increases the number fanning out to the final destination. At least I think
>that is the view from a non-LISTSERV site.
That is perfectly correct.
> I think part of the problem arises in determining what is the
>responsibility of the LISTSERVer versus the Mail Transfer Agent. It might be
>nice for LISTSERV to "know" more about mail transfer agent and system. For
>example, one might be able to use BSMTP to encapsulate mail for users all
>served by a single gateway (but then what do you do with the To: field...
>;-).
That is a very significant point, especially when it comes to slow gateways.
Whenever a large number of people subscribe to a distribution list under their
domain addresses, and they don't have a specific gateway defined in DOMAIN
NAMES, the poor CUNYVM mailer is going to have to process a real bunch of mail
files... The problem is even worse with BITNET-based lists with a bunch of
internet subscribers, as the SMTP gateway itself is going to process a number
of items that grows linearly with the list.
If LISTSERV "knew" more about the mailers and gateways, it could bundle
everything in a single BSMTP envelope, *provided* that the mail items were
identical. Presently this is not the case, but it could be made a SET-table
option that would default to "anonymous header" when the subscription comes
from a domain address.
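A rough sketch of what such bundling could look like: one envelope, several RCPT TO lines, one copy of the message. The command sequence follows the usual SMTP ordering (HELO, MAIL FROM, RCPT TO, DATA, QUIT); the function name, host names and addresses are hypothetical, and the exact conventions expected by any particular mailer are not checked here. The "anonymous header" is represented by a To: field naming the list rather than each subscriber, which is what makes the mail items identical.

```python
def bsmtp_envelope(helo_host, sender, recipients, message_lines):
    """Build one batch-SMTP style envelope for several recipients.

    All names passed in are illustrative; nothing here is taken from
    the LISTSERV or MAILER code.
    """
    lines = [f"HELO {helo_host}", f"MAIL FROM:<{sender}>"]
    lines += [f"RCPT TO:<{rcpt}>" for rcpt in recipients]
    lines.append("DATA")
    # SMTP transparency: a body line starting with "." gets an extra ".".
    lines += ["." + ln if ln.startswith(".") else ln for ln in message_lines]
    lines += [".", "QUIT"]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    message = [
        "From: poster@nodea.example",
        "To: SOME-LIST@nodeb.example",  # list address, not each subscriber
        "Subject: test",
        "",
        "Same text for everybody.",
    ]
    print(bsmtp_envelope(
        "nodeb.example",
        "owner-some-list@nodeb.example",
        ["u1@site.example", "u2@site.example", "u3@site.example"],
        message,
    ))
```

If each recipient's name had to appear in the To: field, the bodies would differ and one envelope per recipient would be needed again.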
> And then there is the question of how LISTSERV fits into a TCP/IP network
>where you have point-to-point SMTP connections...
One of the major dangers of point-to-point (PTP) networks is that people
invariably tend to forget about the real topology. They don't realize that all
these identical files, which are being sent "directly" through a PTP
connection to their destination, are in fact flowing through the same physical
link and consuming 10 times the bandwidth that an old-fashioned S&F network
with a decently dense LISTSERV backbone would have required.
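The arithmetic behind this can be sketched with a toy topology. Everything below is hypothetical: the link names, the ten destination hosts, and the idealized assumption that a relaying server sits at the far end of every link, so that each link carries a file only once. It is meant to show where a factor such as 10 comes from, not to model any real network.

```python
from collections import Counter


def link_load_direct(paths):
    """Copies per link when every destination gets its own 'direct' send."""
    load = Counter()
    for links in paths.values():
        load.update(links)  # each send crosses every link on its path
    return load


def link_load_relayed(paths):
    """Copies per link in the idealized store-and-forward case:
    one copy per link, fanned out by a server on the far side."""
    load = Counter()
    for link in {ln for links in paths.values() for ln in links}:
        load[link] = 1
    return load


if __name__ == "__main__":
    # Ten hypothetical hosts, all reached through the same shared link.
    paths = {f"HOST{i}": ["shared-link", f"tail-{i}"] for i in range(10)}
    direct = link_load_direct(paths)
    relayed = link_load_relayed(paths)
    print(direct["shared-link"], relayed["shared-link"])
```

The tail links carry one copy either way; the whole difference is on the link that the "direct" sends share without knowing it.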
This is also a very important point in the "domain" discussions - my personal
feeling is that, even if S&F and flat addressing were to disappear,
information about the network topology should STILL be available as it is
today, so that software like LISTSERV can keep using things like DIST2 to
avoid the nonsense of multiple "direct" sends through a "halo of
interconnected links" that does get you to the destination host but in a
completely unknown and unpredictable way.
From Fred:
>Well I kinda like not having a lot of addresses on the "To" line. It makes
>the mail look better. Since we are a backbone site having stuff sent via
>DIST2 is a lot better in terms of overall network load. The fact that there
>are a variety of mailers on the network can make any attempt to send single
>files based on what mailer is in use a very hard job. TCP/IP with SMTP adds
>more to the problem.
Well, one has to agree on a common interface, like BSMTP. People who want to
run another, non-conformant mailer, would simply be on their own with this,
and might have to waste hours making and maintaining local changes to solve
the problem.
The ideal solution is, of course, to increase the density of backbone servers,
especially in those areas where there is a huge amount of traffic and very
few servers (win an electronic joke-file if you can name at least
one of these places :-) )
Eric