In the last few months, enough sites have dropped out of the backbone or
announced their intent to do so in the near future to put the entire
backbone and at least two BITNET regions in jeopardy. In one of these
regions the impact has been such that a core site is seriously
considering leaving the network if the problem cannot be solved. The
reasons most often given, from most to least common, are:

1. We have no time to keep BITEARN NODES up to date, and your monitor is
   wasting more of our time every week with a new complaint about our
   tables.
2. Management decided to dump VM.
3. DISTRIBUTE takes too much CPU time, and we haven't got time to install
   1.7f beta or LMail to cut both CPU bills by a factor of 3.
4. We haven't got time to install LMail or R2.10.
5. Management decided to keep VM, but leave BITNET.

Sites which pick up the corresponding load have been complaining about a
serious design flaw in the backbone - that you have no control over the
traffic you get. While the problem is real and serious, the design flaw
is elsewhere - you have no control over the non-DISTRIBUTE traffic you
get. When a leaf site leaves the backbone because they haven't got the
manpower to install new tables once a month, the sites on the way get a
lot more RSCS traffic. Those which also run LISTSERV see this as an
increase in DISTRIBUTE traffic, which they would like to be able to say
"no thanks" to. However, the hard fact is that this is not so much
DISTRIBUTE traffic as RSCS traffic. If you say "no thanks" to this
DISTRIBUTE traffic, you still get the same pile of RSCS files to the leaf
node; the difference is that you also get the same number of incoming
files from your upstream node, which costs you MORE resources to pull in
(assuming 1.7f vs RSCS+VMNET).
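
To put rough numbers on this argument, here is a minimal sketch in Python.
It is only an illustrative model, not LISTSERV or RSCS code: the function
name, the "one RSCS file per recipient" assumption, and the figure of 50
recipients are all hypothetical.

```python
def files_through_transit(recipients_at_leaf, leaf_on_backbone, transit_on_backbone):
    """Return (incoming, outgoing) file counts at the transit site for one posting.

    Assumption: without DISTRIBUTE fan-out at the leaf, each recipient on the
    leaf node costs one separate RSCS file.
    """
    if leaf_on_backbone:
        # A single DISTRIBUTE job passes through; the leaf's LISTSERV fans it out.
        return (1, 1)
    if transit_on_backbone:
        # The transit site receives one job and does the fan-out itself.
        return (1, recipients_at_leaf)
    # The upstream node does the fan-out, so every per-recipient file
    # has to come in through the transit site and go out again.
    return (recipients_at_leaf, recipients_at_leaf)


for leaf, transit in [(True, True), (False, True), (False, False)]:
    print(leaf, transit, files_through_transit(50, leaf, transit))
```

Under these assumptions, leaving the backbone does not reduce the outgoing
files to the leaf at all; it only adds the same number of incoming files.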
Unfortunately, this scenario has repeated itself a number of times. A leaf
node drops out to avoid having to update tables. The site behind it
identifies the extra load as DISTRIBUTE traffic due to its backbone
status, and refuses to believe my explanations, demanding to be removed
from the backbone as they think this will remove this extra traffic. So I
remove them as well, but their load only increases. That's when they
start talking about leaving BITNET.
With the formal/legal organization we have, there are two things which we
can do about such problems. First, you have to remember that, while the
CREN charter requires you to let others connect through your system, it
certainly doesn't require you to let any number of sites connect under
the conditions of their choice. If a particular site is causing you
serious operational problems for what appear to be unjustified reasons,
you can put pressure on them to stay on the backbone or find another
connection point. That may lead them to leave BITNET and lose LISTSERV,
but after all the problem is on their end - not yours. In fact, you may
find that if you approach them with a cooperative proposal ("we will
update your tables if you want, it only takes 5 minutes and we have
procedures to do that"), their desire to stay off the backbone will
vanish in a matter of seconds :-)
That was the first approach, something that requires a bit of talking and
maybe some lobbying in all corners of the network. The second approach is
a change I am going to make to 1.7f to help relieve transit sites: if you
run LISTSERV, you will now have to accept responsibility for deliveries
to your local node users. That is, the leaf sites which drop out will
start getting DISTRIBUTE jobs for all users on the LISTSERV host. Since
the local nodeid is a configuration constant, they will be able to
deliver without problem even if their tables date back to 88. If the
version of LISTSERV they are running is too slow or buggy or if their
server is only logged on every other day, that will just give them an
incentive to clean up their own house and not depend on the rest of the
world to take care of their problems for them. This change will be
implemented in beta 1.7f-7 which I will probably release tomorrow.
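
The delivery rule described above can be sketched roughly as follows. This
is a hypothetical Python illustration, not the 1.7f implementation: the
function name, the user@node address format, and the set of LISTSERV hosts
are assumptions made for the example.

```python
from collections import defaultdict


def plan_deliveries(recipients, listserv_nodes):
    """Split recipients into per-host DISTRIBUTE jobs and direct deliveries.

    recipients:     iterable of "USER@NODE" strings (assumed format)
    listserv_nodes: set of node names that run LISTSERV, whether or not
                    they are still on the backbone (hypothetical input)
    """
    jobs = defaultdict(list)   # node -> local users handled by that node's LISTSERV
    direct = []                # addresses delivered as individual files
    for address in recipients:
        user, node = address.split("@", 1)
        if node in listserv_nodes:
            # The LISTSERV host is responsible for its own local users.
            jobs[node].append(user)
        else:
            direct.append(address)
    return dict(jobs), direct


jobs, direct = plan_deliveries(["A@LEAF", "B@LEAF", "C@OTHER"], {"LEAF"})
print(jobs, direct)
```

The point of the design is that the leaf only needs to know its own nodeid
to deliver to local users, so out-of-date tables do not get in the way.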
Eric