[Tutor] Top posters for 2009
Christian Witts
cwitts at compuscan.co.za
Mon Mar 1 07:57:06 CET 2010
Kent Johnson wrote:
> It's not really about keeping score :-), but once again I've compiled
> a list of the top 20 posters to the tutor list for the last year. For
> 2009, the rankings are
>
> 2009 (7730 posts, 709 posters)
> ====
> Alan Gauld 969 (12.5%)
> Kent Johnson 804 (10.4%)
> Dave Angel 254 (3.3%)
> spir 254 (3.3%)
> Wayne Watson 222 (2.9%)
> bob gailer 191 (2.5%)
> Lie Ryan 186 (2.4%)
> David 127 (1.6%)
> Emile van Sebille 115 (1.5%)
> Wayne 112 (1.4%)
> Sander Sweers 111 (1.4%)
> Serdar Tumgoren 100 (1.3%)
> Luke Paireepinart 99 (1.3%)
> wesley chun 99 (1.3%)
> W W 74 (1.0%)
> Marc Tompkins 72 (0.9%)
> A.T.Hofkamp 71 (0.9%)
> Robert Berman 68 (0.9%)
> vince spicer 63 (0.8%)
> Emad Nawfal 62 (0.8%)
>
> Alan, congratulations, you pulled ahead of me for the first time in
> years! You posted more than in 2008, I posted less. Overall posts are
> up from last year, which was the slowest year since I started
> measuring (2003).
>
> Thank you to everyone who asks and answers questions here!
>
> The rankings are compiled by scraping the monthly author pages from
> the tutor archives, using Beautiful Soup to extract author names. I
> consolidate counts for different capitalizations of the same name but
> not for different spellings. The script is below.
>
> Kent
>
> ''' Counts all posts to Python-tutor by author'''
> # -*- coding: latin-1 -*-
> from datetime import date, timedelta
> import operator, urllib2
> from BeautifulSoup import BeautifulSoup
>
> today = date.today()
>
> for year in range(2009, 2010):
>     startDate = date(year, 1, 1)
>     endDate = date(year, 12, 31)
>     thirtyOne = timedelta(days=31)
>     counts = {}
>
>     # Collect all the counts for a year by scraping the monthly author
>     # archive pages
>     while startDate < endDate and startDate < today:
>         dateString = startDate.strftime('%Y-%B')
>
>         url = ('http://mail.python.org/pipermail/tutor/%s/author.html'
>                % dateString)
>         data = urllib2.urlopen(url).read()
>         soup = BeautifulSoup(data)
>
>         li = soup.findAll('li')[2:-2]
>
>         for l in li:
>             name = l.i.string.strip()
>             counts[name] = counts.get(name, 0) + 1
>
>         startDate += thirtyOne
>
>     # Consolidate names that vary by case under the most popular spelling
>     nameMap = dict() # Map lower-case name to most popular name
>
>     # Use counts.items() so we can delete from the dict.
>     for name, count in sorted(counts.items(),
>                               key=operator.itemgetter(1), reverse=True):
>         lower = name.lower()
>         if lower in nameMap:
>             # Add counts for a name we have seen already and remove the duplicate
>             counts[nameMap[lower]] += count
>             del counts[name]
>         else:
>             nameMap[lower] = name
>
>     totalPosts = sum(counts.itervalues())
>     posters = len(counts)
>
>     print
>     print '%s (%s posts, %s posters)' % (year, totalPosts, posters)
>     print '===='
>     for name, count in sorted(counts.iteritems(),
>                               key=operator.itemgetter(1), reverse=True)[:20]:
>         pct = round(100.0*count/totalPosts, 1)
>         print '%s %s (%s%%)' % (name.encode('utf-8',
>                                 'xmlcharrefreplace'), count, pct)
>     print
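For anyone who wants to try the case-consolidation step without hitting the archives, here is a minimal sketch in Python 3 (the script above is Python 2). The function name consolidate_by_case and the sample names and counts are made up for illustration; they are not part of Kent's script or real list statistics.

```python
import operator


def consolidate_by_case(counts):
    """Merge names that differ only by case under the most frequent spelling.

    Hypothetical helper; it mirrors the nameMap loop in the script above.
    """
    merged = dict(counts)  # work on a copy so the input is left untouched
    name_map = {}          # lower-case name -> most popular spelling

    # Visit names from most to fewest posts, so the first spelling seen
    # for a given lower-case form is the most popular one.
    for name, count in sorted(counts.items(),
                              key=operator.itemgetter(1), reverse=True):
        lower = name.lower()
        if lower in name_map:
            # Seen already: fold the count into the popular spelling.
            merged[name_map[lower]] += count
            del merged[name]
        else:
            name_map[lower] = name
    return merged


# Made-up sample data, not real list statistics.
sample = {'Jane Doe': 5, 'jane doe': 2, 'John Roe': 3}
merged = consolidate_by_case(sample)
total = sum(merged.values())

for name, count in sorted(merged.items(),
                          key=operator.itemgetter(1), reverse=True):
    pct = round(100.0 * count / total, 1)
    print('%s %s (%s%%)' % (name, count, pct))
```

Iterating over a sorted copy of the items is what makes it safe to delete from the dict inside the loop, which is the same reason the original uses counts.items() rather than iteritems() at that point.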
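One detail of the script that is easy to miss is the 31-day step used to walk through the months. The sketch below (Python 3) just builds the URLs without fetching anything, to show that for 2009 the step lands in each month exactly once. The function name month_urls and its today parameter are assumptions added for illustration; the URL pattern is the one from the script. Note that %B depends on the locale, so the month names come out in English only under the default or an English locale.

```python
from datetime import date, timedelta


def month_urls(year, today=None):
    """Return the monthly author-page URLs for one year (hypothetical helper)."""
    today = today or date.today()
    start = date(year, 1, 1)
    end = date(year, 12, 31)
    step = timedelta(days=31)  # no month is longer than 31 days

    urls = []
    while start < end and start < today:
        # '%Y-%B' gives e.g. '2009-January' in the default locale.
        date_string = start.strftime('%Y-%B')
        urls.append('http://mail.python.org/pipermail/tutor/%s/author.html'
                    % date_string)
        start += step
    return urls


# Fix "today" so the result does not depend on when this is run.
urls = month_urls(2009, today=date(2010, 3, 1))
for url in urls:
    print(url)
```

Starting from January 1, the dates visited drift forward by a few days each month (February 1, March 4, April 4 and so on up to December 8), which is why the loop never skips or repeats a month within the year.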
Nice script, Kent.
Keep up the good signal-to-noise ratio, guys.
--
Kind Regards,
Christian Witts