[Tutor] Top posters for 2009

Christian Witts cwitts at compuscan.co.za
Mon Mar 1 07:57:06 CET 2010


Kent Johnson wrote:
> It's not really about keeping score :-), but once again I've compiled
> a list of the top 20 posters to the tutor list for the last year. For
> 2009, the rankings are
>
> 2009 (7730 posts, 709 posters)
> ====
> Alan Gauld 969 (12.5%)
> Kent Johnson 804 (10.4%)
> Dave Angel 254 (3.3%)
> spir 254 (3.3%)
> Wayne Watson 222 (2.9%)
> bob gailer 191 (2.5%)
> Lie Ryan 186 (2.4%)
> David 127 (1.6%)
> Emile van Sebille 115 (1.5%)
> Wayne 112 (1.4%)
> Sander Sweers 111 (1.4%)
> Serdar Tumgoren 100 (1.3%)
> Luke Paireepinart 99 (1.3%)
> wesley chun 99 (1.3%)
> W W 74 (1.0%)
> Marc Tompkins 72 (0.9%)
> A.T.Hofkamp 71 (0.9%)
> Robert Berman 68 (0.9%)
> vince spicer 63 (0.8%)
> Emad Nawfal 62 (0.8%)
>
> Alan, congratulations, you pulled ahead of me for the first time in
> years! You posted more than in 2008; I posted less. Overall posts are
> up from last year, which was the slowest year since I started
> measuring (2003).
>
> Thank you to everyone who asks and answers questions here!
>
> The rankings are compiled by scraping the monthly author pages from
> the tutor archives, using Beautiful Soup to extract author names. I
> consolidate counts for different capitalizations of the same name but
> not for different spellings. The script is below.
>
> Kent
>
> ''' Counts all posts to Python-tutor by author'''
> # -*- coding: latin-1 -*-
> from datetime import date, timedelta
> import operator, urllib2
> from BeautifulSoup import BeautifulSoup
>
> today = date.today()
>
> for year in range(2009, 2010):
>     startDate = date(year, 1, 1)
>     endDate = date(year, 12, 31)
>     thirtyOne = timedelta(days=31)
>     counts = {}
>
>     # Collect all the counts for a year by scraping the monthly author archive pages
>     while startDate < endDate and startDate < today:
>         dateString = startDate.strftime('%Y-%B')
>
>         url = 'http://mail.python.org/pipermail/tutor/%s/author.html' % dateString
>         data = urllib2.urlopen(url).read()
>         soup = BeautifulSoup(data)
>
>         li = soup.findAll('li')[2:-2]
>
>         for l in li:
>             name = l.i.string.strip()
>             counts[name] = counts.get(name, 0) + 1
>
>         startDate += thirtyOne
>
>     # Consolidate names that vary by case under the most popular spelling
>     nameMap = dict() # Map lower-case name to most popular name
>
>     # Use counts.items() so we can delete from the dict.
>     for name, count in sorted(counts.items(), key=operator.itemgetter(1), reverse=True):
>         lower = name.lower()
>         if lower in nameMap:
>             # Add counts for a name we have seen already and remove the duplicate
>             counts[nameMap[lower]] += count
>             del counts[name]
>         else:
>             nameMap[lower] = name
>
>     totalPosts = sum(counts.itervalues())
>     posters = len(counts)
>
>     print
>     print '%s (%s posts, %s posters)' % (year, totalPosts, posters)
>     print '===='
>     for name, count in sorted(counts.iteritems(), key=operator.itemgetter(1), reverse=True)[:20]:
>         pct = round(100.0*count/totalPosts, 1)
>         print '%s %s (%s%%)' % (name.encode('utf-8', 'xmlcharrefreplace'), count, pct)
>     print
Nice script, Kent.
Keep up the good signal-to-noise ratio, guys.
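
For anyone who wants to try this on Python 3, here's a rough, untested sketch of
the same approach using urllib.request and bs4. It assumes the pipermail author
pages still have the layout Kent describes, and it skips the name-case
consolidation step from his script:

''' Rough Python 3 sketch: count Tutor posts per author for one year '''
from datetime import date, timedelta
from urllib.request import urlopen
from bs4 import BeautifulSoup

counts = {}
start, end = date(2009, 1, 1), date(2009, 12, 31)
while start < end:
    # Monthly author index, e.g. .../tutor/2009-January/author.html
    url = 'http://mail.python.org/pipermail/tutor/%s/author.html' % start.strftime('%Y-%B')
    soup = BeautifulSoup(urlopen(url).read(), 'html.parser')
    # The first two and last two <li> items are navigation links, not posts
    for item in soup.find_all('li')[2:-2]:
        name = item.i.string.strip()
        counts[name] = counts.get(name, 0) + 1
    start += timedelta(days=31)   # moves into the next month each pass

# No name-case consolidation here; see Kent's script above for that step
total = sum(counts.values())
for name, count in sorted(counts.items(), key=lambda kv: kv[1], reverse=True)[:20]:
    print('%s %s (%.1f%%)' % (name, count, 100.0 * count / total))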

-- 
Kind Regards,
Christian Witts



