On Thu, 05 Dec 2002 10:56:46 EST, Tim Parker <[log in to unmask]> said:

> Is there any easy way to get an output of the top domains that are on my
> listserv lists? I am working on tweaking our LSMTP deliver and for the
> destinations I am not sure of the right %'s for the different domain names.
> Is there anything simple that can give the top ones?
>
> I know I can download the list and sort and use excel, but....

Sorting the list would be the totally wrong thing to do.

Almost certainly, what you care about is total traffic to domains. So for
instance, if you have 3 lists that each have 2,500 recipients for FOO.COM,
but those three lists only see traffic once a week, it's not as important to
tune for that case(*) as if you have 20 lists that have 100 recipients for
BAR.COM which get 100 postings a day.

What you *probably* want to do is take a week's worth of Listserv logs,
take out all the 'Mail posted via SMTP to addr@host' lines, and do your
statistics on *that*, so you're looking at *traffic*, not at *subscribers*.

(*) Of course, if you have a very large list that is expected to deliver
very fast throughput, tune for that. But you already know which lists those
are if you have any... ;)

I just chunked out the quick stats - first, let's run across all the lists
and see what non-VT subscribers we have. So we get the admittedly ugly
shell one-liner:

[/home/listserv/home]1 find . -name '*.list' | xargs listview -s | \
    egrep -i -v '^\*|vt.edu|^$|^File ' | sed 's/^.*@\([^ ]*\).*/\1/' | \
    rev | cut -f1-2 -d. | sort | rev | uniq -c

(The 'rev | cut | sort | rev' paradigm is quite useful sometimes).

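For anybody who hasn't seen the 'rev | cut | sort | rev' trick before, here
is a minimal sketch of it on made-up input. The function name top_domains
and the sample hostnames are hypothetical, and the case folding and the
final 'sort -rn' ranking are my additions for the illustration, not part of
the one-liner above:

```shell
# Hypothetical helper: reduce hostnames on stdin to their last two labels
# and count them, most frequent first.
top_domains() {
    tr 'A-Z' 'a-z' |    # assumption: fold case so AOL.COM and aol.com merge
    rev |               # "mail.aol.com" -> "moc.loa.liam"
    cut -d. -f1-2 |     # keep the first two reversed labels: "moc.loa"
    sort |              # put identical domains next to each other
    rev |               # restore normal order: "aol.com"
    uniq -c |           # prefix each domain with its count
    sort -rn            # rank by count, highest first
}

# Made-up sample input, one hostname per line.
printf '%s\n' mail.aol.com AOL.COM mx1.hotmail.com smtp.aol.com lists.vt.edu |
    top_domains
```

The point of reversing is that cut can only count fields from the left, and
hostnames have a variable number of labels, so flipping each line puts the
top-level domain in field 1 no matter how deep the hostname is.
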
So we find we have some 15,631 different second-level domains represented,
and the top ones are:

  11684 AOL.COM
   5415 HOTMAIL.COM
   3490 YAHOO.COM
   3283 VA.US
   2296 EROLS.COM
   1591 JUNO.COM
   1128 MSN.COM
    962 ATT.NET
    925 MINDSPRING.COM
    845 EARTHLINK.NET
    829 RADFORD.EDU
    528 VIRGINIA.EDU
    519 COMPUSERVE.COM
    503 INFI.NET
    475 NAVY.MIL
    385 PSU.EDU
    349 RR.COM
    332 NASA.GOV
    328 NCSU.EDU
    320 SWVA.NET
    310 AC.UK
    306 PRODIGY.NET
    303 RUNET.EDU

Now let's go look at a week's traffic...

[/var/logs]1 egrep 'relay=.*stat=Sent' maillog* | grep -v 198.82.161.196 | \
    sed 's/^.*relay=\([^ ]*\).*/\1/' | rev | cut -f2-3 -d'.' | sort | rev | uniq -c

Only get 4,113 second-levels in a week, and the top ones are:

   9175 american.edu
   7014 hotmail.com
   5616 yahoo.com
   5273 psmtp.com
   4469 earthlink.net
   3359 lsoft.com
   2699 rr.com
   2340 va.us
   2063 sas.com
   1804 aol.com
   1785 msn.com
   1623 prodigy.net
   1556 army.mil
   1279 washburn.edu
   1278 serena.com
   1234 nodak.edu
   1165 edu.au
   1078 msu.edu
   1063 frb.org
   1058 com.cn
   1021 mindspring.com
   1004 adelphia.net
   1001 outblaze.com
    997 criticalpath.net

The american.edu, lsoft.com entries are due to the DIST2 network jobs...

Notice that the two lists are *not* that similar....

(The wonders of shell one-liners.. ;)
--
    Valdis Kletnieks
    Computer Systems Senior Engineer
    Virginia Tech