Your message dated: Thu, 08 May 1997 01:16:19 +0200

> >The Internet side is typically about half the total volume of the AOL
> >mail system as a whole.
>
> Internal mail from AOL user 1 to AOL user 2 doesn't go through SMTP and I
> don't see how it is relevant to this discussion, other than conveniently
> providing the 23M of deliveries you were missing :-)

The problem is that the Internet side of the equation is not the limiting factor here. If everything else goes smoothly, the Internet side can *easily* handle that kind of load without breaking a sweat. I'm confident it could handle twice or even three times that load without much difficulty, if there weren't problems elsewhere.

But there are problems elsewhere, and to protect the mail system as a whole, we have to use the Internet side to aggressively filter out illegitimate mail (since virtually all illegitimate mail comes from the Internet, and illegitimate mail as a whole poses more of a risk to the system due to certain aspects of its nature).

> > Now, how many million messages did you receive yesterday?
>
> Receive, not that many. There are the bounces of course, but like any
> other large mailing list shop, we receive a lot less than we send.

You use the measure you want; we'll use the measure we want.

> I imagine AOL is the opposite and receives a lot more than it sends.

Typically, we receive about twice as much as we send. That's still a boatload of mail that we send, but since the expansion factor per outbound email message is low (on average, about two recipients per message), we don't get your economies of scale, with hundreds or thousands of recipients from a single message all served by the same set of MXes, etc.

Nevertheless, our outbound system is not the problem here.

> Anyway, I think what you really want to know is the number of SMTP
> transactions that we've made, regardless of the recipient count, right?

Not exactly.
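To make the "expansion factor" arithmetic concrete, here's a minimal sketch. The numbers are purely illustrative (they are not actual AOL or mailing-list figures); the point is that a fan-out of ~2 recipients per message and a fan-out of thousands can produce the same total delivery count from wildly different message counts:

```python
# Illustrative sketch of the expansion-factor arithmetic.
# All figures below are made up for illustration only.

def recipient_deliveries(messages_sent: int, expansion_factor: float) -> int:
    """Total recipient deliveries, given average recipients per message."""
    return round(messages_sent * expansion_factor)

# A low expansion factor (about two recipients per message) means
# delivery volume grows roughly linearly with message count...
low_fanout = recipient_deliveries(10_000_000, 2.0)

# ...while a mailing-list shop fans one message out to thousands of
# recipients, often all served by the same set of MXes.
list_shop = recipient_deliveries(10_000, 2_000.0)

print(low_fanout, list_shop)  # both work out to 20,000,000 deliveries
```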
Since virtually all mail that is sent is transmitted as soon as it is received by the other end, it makes very little use of connection caching. Since much of the expense of an SMTP transaction is setting up and tearing down the connection (otherwise connection caching wouldn't be an issue), connection caching typically only comes into play when someone is delivering a large number of previously queued messages, perhaps from a mailing list.

Since you've got extremely large economies of scale due to large numbers of recipients typically served by a single set of MXes, and you've surely optimized your delivery to make maximum use of connection caching, the actual number of SMTP connections you make is probably quite small. It is considerably smaller than the total number of message envelopes delivered, which in turn is considerably smaller than the total number of recipients.

Making outbound SMTP connections is relatively cheap to control, since you can choose whether or not to fork off another queue runner (or, if you pre-fork a set of worker processes, whether any of them are currently idle), so you can decide how much load sending mail places on your system. You can also control how long you wait for various things to happen before you time out, so that you deliver large quantities of mail to fast servers in a very short period of time, while slower servers end up relegated to the bottom of the list.

Receiving mail, by contrast, is inherently interrupt-driven, and you can't force the other end to make use of connection caching (you can only sit there and wait to accept multiple messages per connection, if the other end chooses to send them that way). So what I want to count is the total number of SMTP connections you receive per day. Everything else is superfluous.
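The connections < envelopes < recipients ordering can be sketched with a toy model. This is a hypothetical illustration (not how any particular MTA implements its queue): if queued envelopes are grouped by destination MX and delivered over one cached connection per host, the connection count collapses to the number of distinct hosts:

```python
from collections import defaultdict

# Hypothetical delivery queue: each entry is (mx_host, recipient list).
queue = [
    ("mx.example.net", ["a@example.net", "b@example.net"]),
    ("mx.example.net", ["c@example.net"]),
    ("mx.example.org", ["d@example.org"]),
    ("mx.example.net", ["e@example.net"]),
]

# Without connection caching: one SMTP connection per message envelope.
connections_uncached = len(queue)

# With caching: one connection per distinct MX host, reused for every
# envelope bound for that host.
by_mx = defaultdict(list)
for mx, rcpts in queue:
    by_mx[mx].extend(rcpts)
connections_cached = len(by_mx)

total_recipients = sum(len(rcpts) for _, rcpts in queue)

# connections (2) < envelopes (4) < recipients (5)
print(connections_cached, connections_uncached, total_recipients)
```

The model reflects the argument above: caching only pays off when many queued envelopes share a destination, which is exactly the mailing-list delivery pattern and not the interrupt-driven receiving pattern.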
Programming something like delivering large quantities of mail out of a queue is relatively easy, since you don't have to accept connections (or refuse them, if the system load is too high) and then pass them off to child/worker processes. That accepting mechanism is inherently fork/exec style, though it can be programmed (with no small amount of difficulty) in a pre-fork/worker-process style. Doing so is at least as hard, if not harder, than writing a program to solve an inherently recursive problem in a non-recursive manner. At least programmers doing that sort of work have an extensive body of pre-existing work showing how to use stacks to simulate recursion, so there's relatively little "new" work required to "unroll" an inherently recursive process into an iterative one.

> I have no intention of either giving up my day job, moving to the US or
> joining AOL, nor do I see any reason why this would be necessary in order
> to accomplish the stated goals. Nevertheless, I was making a serious
> business proposal.

I will pass on to the mail systems development management that you want to re-write the AOL Internet mail gateway system using LSMTP. If that's something we can do in parallel with our other efforts (and without a great deal of support required to teach you how the back end works), then they might be willing to listen. However, I am not in development, and applying development solutions to operational problems is not a method available to me. Only the development folks can decide whether or not that is a solution they can support (though I think it unlikely, given how thinly they're already stretched).

We did previously look at using PMDF as the basis for our gateway system, but rejected it once we realized what the API was, and the amount of programming that would be required on our part to get messages out of their proprietary internal database and into ours.
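The stacks-simulate-recursion point can be shown with a toy example (deliberately unrelated to mail): the same computation written in its natural recursive shape, then "unrolled" into an iterative loop with an explicit stack standing in for the call stack:

```python
# Toy illustration of unrolling recursion: summing a nested structure.

# The recursive version mirrors the problem's natural shape...
def total_recursive(node):
    if isinstance(node, int):
        return node
    return sum(total_recursive(child) for child in node)

# ...and the iterative version replaces the implicit call stack with
# an explicit one, the standard technique from the pre-existing body
# of work on simulating recursion.
def total_iterative(node):
    total, stack = 0, [node]
    while stack:
        current = stack.pop()
        if isinstance(current, int):
            total += current
        else:
            stack.extend(current)
    return total

tree = [1, [2, 3], [[4], 5]]
print(total_recursive(tree), total_iterative(tree))  # 15 15
```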
If we're going to do that level of programming anyway, we might as well write the thing from scratch.

> I see that Matt Korn is still your VP of Operations, so it
> looks like I may be preaching to the choir :-)

He keeps remarking on occasion how much mail could be handled by VM SMTP, but he's changed his tune a bit since we found a bug in that code with regard to the way it handles MX RRs. Especially since we were forced to work that out the hard way: no one at IBM was willing to work directly with us, and the IBM customers they were willing to work with didn't have enough information about the problem to describe it sufficiently well. After we'd sufficiently "black-boxed" the thing, and worked with those customers through multiple rounds of "Okay, we've installed this patch on our mainframe, does it work now? No...", we finally got that one worked out.

We've also pointed out to him how expensive mainframes still are, how much power, space, and cooling they require, and how many of them we'd need to do the job. Besides, we'd be replacing one sort of mainframe with another (as part of the overall system), and we know that mainframes inherently do not scale to the size of the operation we have today, much less where we need to be.

--
Brad Knowles                    MIME/PGP: [log in to unmask]
Senior Unix Administrator       <http://www.his.com/~brad/>
<http://swissnet.ai.mit.edu:11371/pks/lookup?op=get&search=0xE38CCEF1>