On Sep 15, UIUC reported on the core operators' list that their core system, a dedicated 4381 running the UIUCVM42 core node, could no longer handle the load it was subjected to. It was quickly established that the machine is simply out of steam. It is one of the smallest machines on the core, it is 100% busy 24 hours a day, and it has reached a point where, even at its quietest, in the middle of the night on Sundays, LISTSERV's input queue still holds on the order of 3000 files.

This machine and the manpower to operate it are provided by UIUC on a volunteer basis, and we can only thank UIUC for their continuing support, dedication, and generosity. However, this problem does need to be solved. Several L-Soft customers have complained about LISTSERV delivery delays of up to EIGHT DAYS. Needless to say, this is totally unacceptable to the average user.

At first, we told people that the UIUCVM42 issue was being investigated. The issue was being discussed on the core operators' list, and we hoped that a solution would be found shortly. Unfortunately, this has not happened. What's worse, nobody seems to have taken ownership of the problem. We can't even tell our customers that a solution is being actively implemented and is expected to be ready by a certain date, because, to the best of our knowledge, nothing at all is being implemented.

This is intolerable. Our customers are not interested in our explanations of the delicate, volunteer-based core support structure. They pay us good money for software which happens to use the core. They demand service. In their opinion, if the core needs 8 days to process a LISTSERV distribution, the core should be either fixed or terminated, because a structure that needs 8 days to deliver mail is simply not useful. And they are right.

In order to ensure that our customers do receive a decent level of service, we have had no option but to remove UIUCVM42 from the LISTSERV backbone (and set UIUCVMD to LOCAL distribution mode, to avoid having it attract the workload of UIUCVM42). This will bypass the LISTSERV@UIUCVM42 backlog and restore the expected level of service.

This is not a satisfactory solution. In fact, it is a last-resort solution, and this is why we waited 2 weeks before making this decision. There was simply no other option.

Removing UIUC from the backbone will increase the level of traffic on the core, and break the INTERBIT symmetry. We do not expect any major disaster, and there is no cause for panic. This change simply puts UIUC in the same situation as Cornell, back when it used to be a core site not running LISTSERV. We expect that this change will solve the problem in the near future, at the expense of additional traffic that the core structure can support today.

However, we also expect that other sites will find themselves in a situation similar to UIUC's over the next 6 months. Removing a core site from the backbone increases traffic in proportion to the number of remaining core sites on the backbone. That is, it is bearable the first few times you do it, but every additional removal becomes more expensive than the previous one. And, since each removal increases traffic and contributes to saturating machines and requiring another removal, this is a very dangerous situation which could get out of control in no time.

Again, UIUC is not to blame for this problem. They are not being paid for this service, which costs them real money. The machine is out of steam; the traffic simply has to be moved elsewhere.

There are several ways to move SMTP/INTERBIT traffic from a VM system to a workstation. These are not experimental mechanisms. SUNET has been running its SMTP service on a workstation since March 1994. Others have started offloading their mainframes in a similar fashion.

The technology is available, today, to solve problems such as UIUC's. And, if it is not deployed today, we will not have a core for long. The only obstacle to this deployment is that software cannot run on thin air, and someone has to purchase the workstations in question. Some core sites are willing to spend $10-20k to buy a workstation for the core service, others aren't, and we can't really blame them for that.

This problem has to be solved by the NJE connectivity providers, who are getting paid by the participating organizations for the provision of services that their users find useful - when the turnaround time is within reasonable bounds, that is.

A comprehensive solution would probably cost around $200k, and a minimal solution $50k. These are mostly one-time charges for the purchase of equipment. It is estimated that the NJE connectivity providers collect on the order of 2 million US dollars a year (worldwide). Thus, the comprehensive solution would cost about 10% of the yearly dues, and again most of that is a one-time charge.

Since the NJE connectivity providers have a monopoly, it is not possible for other companies to offer more competitive or better operated NJE services. Your only option, as a representative of your dissatisfied users, is to complain to your NJE provider, and seek alternate solutions if you do not receive a satisfactory answer.

Eric