More usenet usage statistics, by programming language

Aaron K. Johnson akjmicro at yahoo.com
Sat Jan 25 06:28:30 CET 2003


In message <3E3201BA.BB87EA3 at engcorp.com>, Peter Hansen wrote:
> "Aaron K. Johnson" wrote:
> > 
> > In message <3E31BACB.EA3D8CDA at engcorp.com>, Peter Hansen wrote:
> > > "Aaron K. Johnson" wrote:
> > > >
> > > > In message <v339gg9p1rlb3e at news.supernews.com>, "John Roth" wrote:
> > > > >
> > > > > I don't understand. Number of unique posters in the last 200 posts
> > > > > to a newsgroup I understand, and 647 to the (one) Python newsgroup
> > > > > I understand, but I don't understand how you get 647 different
> > > > > posters out of the last 200 posts.
> > > > >
> > > > > Oh, and Clipper is an old data base language, somewhere in the dbase
> > > > > family.
> > > >
> > > > oops, sorry....I meant 2000!
> > >
> > > So, among other problems, this means if a given newsgroup had a single
> > > large thread with a half-dozen regulars posting ten times each in a
> > > big argument, that particular language would appear less popular...
> > >
> > > -Peter
> > 
> > Peter,
> > 
> > Each poster is counted only once.
> 
> I understand that most basic point.  Let me try out an example
> to help clarify *my point*.
> 
> There are 2000 posts retrieved from comp.lang.noisy.  There is 
> a recent thread involving five people who each contributed 201
> messages.  That means 1000 of those 2000 messages are eliminated
> instantly by your filtering of non-unique posters.  That leaves
> only 1000 posts from which to measure the number of unique
> authors, aside from these prolific five.
> 
> Does that help?  The comments about needing to examine across
> a fixed duration are probably reasonable...
> 
> -Peter

Yes, thanks.

I'm now exploring some features of nntplib, in particular xhdr, to find and
organize date data. Your comments have been helpful! I'll post the results as I
complete the work!

Thanks,
Aaron.
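
To make Peter's example concrete, here is a minimal sketch comparing the two
ways of counting. It is not the script used for the posted statistics. The
function names, the (author, date) tuple layout, and the sample data are all
hypothetical. In practice the author and date strings might come from
something like nntplib's xhdr, but the sketch uses synthetic data so it runs
on its own. The noisy group holds five regulars with 201 posts each (1005
posts) plus 995 one-time posters; the quiet group holds 2000 one-time posters
spread over roughly two weeks.

```python
from datetime import datetime, timedelta


def unique_posters_last_n(posts, n):
    """Count distinct authors among the n most recent posts."""
    recent = sorted(posts, key=lambda p: p[1])[-n:]
    return len({author for author, _ in recent})


def unique_posters_in_window(posts, end, days):
    """Count distinct authors who posted in the `days` days up to `end`."""
    start = end - timedelta(days=days)
    return len({author for author, when in posts if start < when <= end})


# Hypothetical data: each post is an (author, date) pair.
end = datetime(2003, 1, 25)

# comp.lang.noisy: 995 one-time posters, one post every 10 minutes...
noisy = [(f"single{i}", end - timedelta(minutes=10 * i)) for i in range(995)]
# ...plus one big thread: five regulars with 201 posts each.
noisy += [
    (f"regular{j}", end - timedelta(minutes=k))
    for j in range(5)
    for k in range(201)
]

# A quiet group: 2000 one-time posters, one post every 10 minutes.
quiet = [(f"single{i}", end - timedelta(minutes=10 * i)) for i in range(2000)]

if __name__ == "__main__":
    for name, posts in (("noisy", noisy), ("quiet", quiet)):
        print(
            name,
            "last 2000 posts:", unique_posters_last_n(posts, 2000),
            "| last 7 days:", unique_posters_in_window(posts, end, 7),
        )
```

Counting over the last 2000 posts makes the noisy group look half as popular
as the quiet one, even though both attract one-time posters at the same rate.
Counting over a fixed seven-day window puts the two groups nearly level, which
is the point of Peter's suggestion to measure across a fixed duration.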