> >> Why don't you take a look at your console logs and see where the bulk
> >> of the jobs are coming from :-)
> >
> >I took a look at one log file that I happened to have handy and didn't
> >see why a reasonable limit would be a problem.
>
> There should be a lot of requests coming from LISTSERV@XYZ, from
> NETNEWS@AUVM, and possibly from local addresses (either users or
> servers). For instance one site has an alert list which receives a lot of
> mail from various DVMs. Another site has a setup where a phone book
> server updates lists to keep them in synch with the campus phone book.
I don't see any files incoming from NETNEWS@AUVM, even though we do have
at least one list that's gatewayed through them. I presume the reason is
that those files go through Mailer.
I didn't spell out that requests coming in from other LISTSERVs would be an
exception to the rule; I thought that was too obvious to mention.
The alert list doesn't count because it's a list. I'm suggesting only
messages to LISTSERV itself be limited. Lists already have quite
adequate controls (thanks to you).
The phone book server might be a valid example if it doesn't batch its
updates. I guess that would require an exception mechanism, which, of
course, makes the idea more work to implement.
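To make the idea concrete, here is a rough sketch in Python of the kind
of check I have in mind. It is not anything LISTSERV implements; the class
name, the limit of 200, and the way peer servers are recognized (by the
LISTSERV@ userid) are all assumptions made up for illustration.

```python
from collections import Counter


class CommandLimiter:
    """Hypothetical per-sender daily cap on jobs sent to LISTSERV itself."""

    def __init__(self, limit=200, exempt=()):
        self.limit = limit
        # Explicit exceptions (e.g. a phone book server), case-insensitive.
        self.exempt = {s.upper() for s in exempt}
        self.counts = Counter()

    def is_exempt(self, sender):
        s = sender.upper()
        # Assumption: other LISTSERVs are identified by their userid.
        return s.startswith("LISTSERV@") or s in self.exempt

    def accept(self, sender, to_list=False):
        """Return True to process the job, False to hold it."""
        # Mail to a list is left to the list's own controls.
        if to_list or self.is_exempt(sender):
            return True
        key = sender.upper()
        self.counts[key] += 1
        return self.counts[key] <= self.limit

    def reset(self):
        """Call once a day to start a new counting window."""
        self.counts.clear()
```

The exception list is the part that adds the extra work: somebody has to
maintain it for each site.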
> >If you are saying that someone should be manually scanning the logs
> >every day to check for loops,
>
> No, I'm saying that you should have a program to monitor the size of the
> LISTSERV queue given the speed of your machine. And not just because of
> the possibility of loops :-)
I agree, although I'm not aware of any such programs available for VM.
We'll probably build some things like that for the Unix system that's
replacing the VM system.
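For the Unix side, something along these lines is what I imagine. This is
only a sketch under assumptions: that the queue is a spool directory with
one file per job, and that we know roughly how many jobs per minute the
machine can process. The function names and the 30-minute threshold are
invented for the example.

```python
import os


def queue_length(spool_dir):
    """Count the files (queued jobs) directly inside spool_dir."""
    with os.scandir(spool_dir) as entries:
        return sum(1 for e in entries if e.is_file())


def backlog_minutes(queued, jobs_per_minute):
    """Estimate how long the current queue would take to drain."""
    return queued / jobs_per_minute


def queue_alarm(spool_dir, jobs_per_minute, max_minutes=30):
    """Return (alarm, queued); alarm is True if the backlog is too long."""
    queued = queue_length(spool_dir)
    return backlog_minutes(queued, jobs_per_minute) > max_minutes, queued
```

Scaling the threshold by the processing rate is the point: a queue that is
harmless on a fast machine can be a real backlog on a slow one.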
> >> This is a separate problem that calls for a different solution. For
> >> what it's worth, most of these people send a small number of jobs with
> >> a large number of requests each.
> >
> >Even the ones who can't supply an address that can be replied to?
>
> People who collect addresses a la IAF certainly use a working address so
> they can receive the info they need.
Yes, I expect that IAF does. But I have seen multiple instances of
botched subscribe requests that come in with From addresses that can't
be replied to. That's why I see them -- LISTSERV sends me the bounces.
I'm not certain what they are doing; maybe they are trying to mail bomb
someone by subscribing them to lots of lists. Usually by the time I see the bounces
it's too late to serve the offending address out, so I don't see the
original messages. That makes it impossible to complain to the offender's
Internet provider. Anyway, this problem is not my main interest.
> >At the time we figured out the problem, there were 4-5000 files in
> >Mailer's reader. Something like 4000 of those were due to the loop. Due
> >to the lack of better tools, Bruce purged the ones whose sizes indicated
> >they were part of the loop (4 different sizes, if memory serves). Of
> >course, we most likely purged some legitimate messages too, but we had
> >no other reasonable choice.
>
> I'm sorry that you've had to purge all these messages, but let's face it,
> on a PC the impact of letting these 4000 messages go through would have
> been minimal. It would probably have taken all of 10 minutes to input
> them to the SMTP queue. If I could wave a magic wand to make this problem
> not happen, I would, but I can't.
In our situation, the exponential expansion in the number of messages meant
that it was going to grow until something broke. Unfortunately, in this
case it was our machine. When we get our Sparc 20 into production, maybe we
can last until the other guy's machine dies. :-)
> >A limit of 100 or 200 would have caught the problem in time for us, I
> >think.
>
> Yes, a limit of 100 or 200 would have caught this problem for you. It
> would also have caused a number of other problems for other people
> (byebye news gateway) and then it wouldn't necessarily have caught other
> loops. Making the INFO change incidentally would also have caught this
> problem for you.
>
> Eric
Bruce did install the INFO change, and it immediately caught a loop with
another Groupwise system. It wasn't the exponentially growing sort, though.
I wasn't aware of that change before. It should help a lot.