More usenet usage statistics, by programming language

Peter Hansen peter at engcorp.com
Sat Jan 25 04:17:14 CET 2003


"Aaron K. Johnson" wrote:
> 
> In message <3E31BACB.EA3D8CDA at engcorp.com>, Peter Hansen wrote:
> > "Aaron K. Johnson" wrote:
> > >
> > > In message <v339gg9p1rlb3e at news.supernews.com>, "John Roth" wrote:
> > > >
> > > > I don't understand. Number of unique posters in the last 200 posts to a
> > > > newsgroup I understand,
> > > > and 647 to the (one) Python newsgroup I understand, but I don't
> > > > understand how you get
> > > > 647 different posters out of the last 200 posts.
> > > >
> > > > Oh, and Clipper is an old data base language, somewhere in the dbase
> > > > family.
> > >
> > > oops, sorry....I meant 2000!
> >
> > So, among other problems, this means if a given newsgroup had a single
> > large thread with a half-dozen regulars posting ten times each in a
> > big argument, that particular language would appear less popular...
> >
> > -Peter
> 
> Peter,
> 
> Each poster is counted only once.

I understand that most basic point.  Let me try out an example
to help clarify *my point*.

There are 2000 posts retrieved from comp.lang.noisy.  There is 
a recent thread involving five people who each contributed 201
messages.  That means 1000 of those 2000 messages are eliminated
instantly by your filtering of non-unique posters.  That leaves
only 1000 posts (five of them from the prolific five, 995 from
everyone else) from which to measure the number of unique authors.
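
Here is a rough sketch of that arithmetic in Python.  It assumes the
survey simply counts distinct author names among the last 2000 posts;
the function name and the made-up poster names are only for
illustration, not taken from the actual script.

```python
def unique_posters(authors):
    """Count distinct author names, given one name per post."""
    return len(set(authors))

WINDOW = 2000  # number of recent posts sampled per newsgroup

# Hypothetical quiet group: every post comes from a different person.
quiet = ["poster%d" % i for i in range(WINDOW)]

# Hypothetical comp.lang.noisy: five regulars post 201 times each,
# and one-time posters fill the rest of the window.
regulars = ["regular%d" % i for i in range(5)] * 201
others = ["poster%d" % i for i in range(WINDOW - len(regulars))]
noisy = regulars + others

print(unique_posters(quiet))  # 2000
print(unique_posters(noisy))  # 5 regulars + 995 others = 1000
```

Both groups have the same number of posts in the window, yet the
noisy one scores half as many unique posters just because of one
long thread.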

Does that help?  The comments about needing to examine across
a fixed duration are probably reasonable...

-Peter



