December 2002 comp.lang.* stats

Erik Max Francis max at alcyone.com
Sun Jan 26 02:47:14 CET 2003


Peter Hansen wrote:

> Spam is probably a problem best ignored.  It would probably
> affect all those groups equally anyway.

Actually, that's one of the problems with his collapsing hierarchies
into a single number.  To first order, spammers would probably post to
every comp.* group with the same frequency.  So if a hierarchy contains
six groups, its summed total will likely overcount spam by roughly a
factor of six compared to a solitary newsgroup.
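
To put a rough number on that, here is a minimal sketch in Python.  The
counts are made up purely for illustration (they are not taken from the
December 2002 stats), and raw_count is a hypothetical helper, not
anything from the original tally.

```python
def raw_count(legit_posts, spam_per_group, n_groups):
    """Total posts seen when a hierarchy is collapsed into one number.

    Assumes every group in the hierarchy receives the same amount of
    spam (the first-order approximation), so spam scales with the
    number of groups while legitimate traffic is given as a total.
    """
    return legit_posts + spam_per_group * n_groups


# Illustrative figures only.
LEGIT = 1000          # legitimate posts, same total in both cases
SPAM_PER_GROUP = 50   # spam posts landing in each individual group

solitary = raw_count(LEGIT, SPAM_PER_GROUP, 1)    # one lone newsgroup
hierarchy = raw_count(LEGIT, SPAM_PER_GROUP, 6)   # six-group hierarchy

spam_ratio = (hierarchy - LEGIT) / (solitary - LEGIT)
print(solitary, hierarchy, spam_ratio)  # 1050 1300 6.0
```

With identical legitimate traffic, the six-group hierarchy looks about
24% busier in the raw totals, and all of that difference is spam.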

To second order, there's probably an additional effect: newsgroups
whose names sort lexicographically early get more spam, since many
spammers post to groups sequentially, and those that get forcibly
stopped partway through will be less likely to hit comp.lang.z than
comp.lang.a.
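
The ordering effect can be sketched the same way.  This is a toy model
under a loud assumption: each spammer walks the sorted group list and is
cut off after a uniformly chosen number of posts.  The group names and
the spam_hits function are hypothetical, chosen only to illustrate the
idea.

```python
def spam_hits(groups):
    """Count how often each group is hit across all cut-off points.

    Assumption: one spammer per possible cut-off (stopped after 1, 2,
    ..., n posts), each posting to groups in lexicographic order.
    """
    ordered = sorted(groups)
    hits = {name: 0 for name in ordered}
    for stop_after in range(1, len(ordered) + 1):
        # This spammer reaches only the first stop_after groups.
        for name in ordered[:stop_after]:
            hits[name] += 1
    return hits


hits = spam_hits(["comp.lang.z", "comp.lang.m", "comp.lang.a"])
print(hits)  # {'comp.lang.a': 3, 'comp.lang.m': 2, 'comp.lang.z': 1}
```

Under that model the earliest-sorting group is hit by every spammer and
the last one only by those that run to completion.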

-- 
 Erik Max Francis / max at alcyone.com / http://www.alcyone.com/max/
 __ San Jose, CA, USA / 37 20 N 121 53 W / &tSftDotIotE
/  \ Walk a mile in my shoes / And you'd be crazy too
\__/ Tupac Shakur
    Crank Dot Net / http://www.crank.net/
 Cranks, crackpots, kooks, & loons on the Net.



