LSTSRV-L Archives

LISTSERV Site Administrators' Forum

Eric Thomas
Wed, 19 Sep 90 01:20:15 O
I suppose I will be flamed for saying  what comes next, but I have a very
good  DISCARD key.  The situation  at CEARN  is getting  out of  control.
LISTSERV has 900 files in its reader right now. A few hours ago I sent it
a command via CP MSG straight from CEARN, and it took 25 minutes to
execute. When it finally did, I checked the interactive message
queue (LSVIUCV DEPTH) and  it said 416. It was getting  about as much CPU
time as it  needed to discard SENT FILE messages  and issue an occasional
RSCS START command, but no time to actually look at its reader.
 
This is an  old problem, and I  suspect there are a few  other sites with
similar  problems, although  probably  not as  critical.  The problem  of
course is that it pisses off people who are downstream of the server and
have the  CPU capacity to  handle the files  destined to them,  yet since
they are farther away topologically speaking  they have to go through the
bottleneck node. Removing the bottleneck server from the backbone might
mean exploding on the order of 50 files per job on the far side of a
link that is perhaps not saturated but not exactly idle, and is therefore
not an option. With the stupid topology we have in Europe there are not many
alternatives either, so I looked at the  code to see what I could do with
simple changes.  The result is  development fix  16E-009D; this is  a new
type of  fix, which you  need LFIX release 1.1  to install and  which you
should  NOT  install as  preventive  service.  As  soon  as you  order  a
development fix, you  are automatically AFD'ed to it and  are expected to
re-install the  updates you get as  you receive them. Unlike  other fixes
they can be re-installed over and over as many times as necessary, but to
save disk space only one version of the updated files (the one before you
installed the  development fix the  first time) is  kept; if you  want to
keep each  and every  update when  you refresh  the fix,  you must  do so
manually. In other words, the "back  off" procedure for a development fix
is to  remove the fix completely,  rather than fall back  to its previous
incarnation (which I will not support because I will not keep a copy).
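
The retention rule above can be illustrated with a minimal sketch in
Python. This is not LFIX itself; the class DevelopmentFix and its methods
install() and back_off() are hypothetical names, and files are modelled
as a plain dict of name -> contents. It only shows that the single saved
copy is the pre-fix state, so backing off removes the fix entirely.

```python
class DevelopmentFix:
    """Hypothetical model of a re-installable development fix."""

    def __init__(self):
        # Snapshot of the files as they were before the FIRST install.
        self.baseline = None

    def install(self, current_files, updated_files):
        """Apply (or re-apply) the fix and return the new file set."""
        if self.baseline is None:
            # Only the very first install saves a backup copy.
            self.baseline = dict(current_files)
        merged = dict(current_files)
        merged.update(updated_files)
        return merged

    def back_off(self):
        """Remove the fix completely by restoring the pre-fix files."""
        if self.baseline is None:
            raise RuntimeError("fix was never installed")
        restored = dict(self.baseline)
        self.baseline = None
        return restored
```

Refreshing the fix twice and then backing off returns the original
files, not the first refresh; intermediate versions must be saved by
hand if they are wanted.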
 
Anyway the  DISTRIBUTE change  in question  implements a  new ':backbone'
option  in  PEERS NAMES,  'DISTRIBUTE(YES,LOCAL)'  (treated  as a  normal
':backbone.YES' tag  by servers not  running 16E-009D). This  defines the
entry in question as a "non-routing" DISTRIBUTE server, i.e. one that
agrees to receive distributions for recipients in its "service area"
like a normal  server but that does  not want to be used  as a switchyard
for other recipients, even though doing so might save bandwidth. In other
words, it is a statement that bandwidth in this "area" is good enough (or
CPU time is scarce enough) that you will actually waste wall-clock time
in your attempt to save bandwidth, and  you should therefore not do so. I
have no  idea whether  or not  this will be  sufficient in  practice, but
that's the only "simple" change I could come up with.
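
To make the "non-routing" rule concrete, here is a minimal sketch in
Python. It is not the actual DISTRIBUTE code, and it ignores everything
except the routing decision. Peer, first_hop, split_job, the 'paths'
table and the node names are all hypothetical; routing=False stands in
for an entry tagged 'DISTRIBUTE(YES,LOCAL)'.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Peer:
    name: str
    service_area: frozenset  # nodes this server delivers to directly
    routing: bool = True     # False models 'DISTRIBUTE(YES,LOCAL)'


def first_hop(node, path, peers):
    """Return the first server on the path that will accept this node.

    'path' lists server names from the sender towards the recipient.
    A routing server accepts anything and forwards it onward; a
    non-routing server accepts only nodes in its own service area.
    """
    for name in path:
        peer = peers[name]
        if peer.routing or node in peer.service_area:
            return name
    return None  # no server on the path will take this recipient


def split_job(nodes, paths, peers):
    """Group recipient nodes by the server that receives them first."""
    jobs = {}
    for node in nodes:
        jobs.setdefault(first_hop(node, paths[node], peers), []).append(node)
    return jobs
```

With a routing hub, every recipient behind it lands in the hub's job and
the hub has to explode it. Marking the hub non-routing leaves it only
its own recipients, and the rest go straight to the downstream server,
at the cost of sending more copies across the link.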
 
I will be  distributing a modified PEERS NAMES  with modified ':backbone'
tags tomorrow (I have  to run home now), to allow sites  that want to try
the new algorithm to see whether  or not it improves things. The modified
PEERS NAMES will of course work "normally" with unmodified servers.
 
  Eric
