Nice. Kudos to all the top posters. Maybe I should look up my own rank ;)<br><br>~l0nwlf<br><br><div class="gmail_quote">On Fri, Feb 26, 2010 at 8:23 AM, Kent Johnson <span dir="ltr"><<a href="mailto:kent37@tds.net">kent37@tds.net</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="margin: 0pt 0pt 0pt 0.8ex; border-left: 1px solid rgb(204, 204, 204); padding-left: 1ex;">It's not really about keeping score :-), but once again I've compiled<br>
a list of the top 20 posters to the tutor list for the last year. For<br>
2009, the rankings are<br>
<br>
2009 (7730 posts, 709 posters)<br>
====<br>
Alan Gauld 969 (12.5%)<br>
Kent Johnson 804 (10.4%)<br>
Dave Angel 254 (3.3%)<br>
spir 254 (3.3%)<br>
Wayne Watson 222 (2.9%)<br>
bob gailer 191 (2.5%)<br>
Lie Ryan 186 (2.4%)<br>
David 127 (1.6%)<br>
Emile van Sebille 115 (1.5%)<br>
Wayne 112 (1.4%)<br>
Sander Sweers 111 (1.4%)<br>
Serdar Tumgoren 100 (1.3%)<br>
Luke Paireepinart 99 (1.3%)<br>
wesley chun 99 (1.3%)<br>
W W 74 (1.0%)<br>
Marc Tompkins 72 (0.9%)<br>
A.T.Hofkamp 71 (0.9%)<br>
Robert Berman 68 (0.9%)<br>
vince spicer 63 (0.8%)<br>
Emad Nawfal 62 (0.8%)<br>
<br>
Alan, congratulations, you pulled ahead of me for the first time in<br>
years! You posted more than in 2008, I posted less. Overall posts are<br>
up from last year, which was the slowest year since I started<br>
measuring (2003).<br>
<br>
Thank you to everyone who asks and answers questions here!<br>
<br>
The rankings are compiled by scraping the monthly author pages from<br>
the tutor archives, using Beautiful Soup to extract author names. I<br>
consolidate counts for different capitalizations of the same name but<br>
not for different spellings. The script is below.<br>
<br>
Kent<br>
<br>
''' Counts all posts to Python-tutor by author'''<br>
# -*- coding: latin-1 -*-<br>
from datetime import date, timedelta<br>
import operator, urllib2<br>
from BeautifulSoup import BeautifulSoup<br>
<br>
today = date.today()<br>
<br>
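for year in range(2009, 2010):<br>
&nbsp;&nbsp;&nbsp;&nbsp;startDate = date(year, 1, 1)<br>
&nbsp;&nbsp;&nbsp;&nbsp;endDate = date(year, 12, 31)<br>
&nbsp;&nbsp;&nbsp;&nbsp;thirtyOne = timedelta(days=31)<br>
&nbsp;&nbsp;&nbsp;&nbsp;counts = {}<br>
<br>
&nbsp;&nbsp;&nbsp;&nbsp;# Collect all the counts for a year by scraping the monthly<br>
&nbsp;&nbsp;&nbsp;&nbsp;# author archive pages<br>
&nbsp;&nbsp;&nbsp;&nbsp;while startDate < endDate and startDate < today:<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;dateString = startDate.strftime('%Y-%B')<br>
<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;url = '<a href="http://mail.python.org/pipermail/tutor/%s/author.html" target="_blank">http://mail.python.org/pipermail/tutor/%s/author.html</a>' % dateString<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;data = urllib2.urlopen(url).read()<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;soup = BeautifulSoup(data)<br>
<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;li = soup.findAll('li')[2:-2]<br>
<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;for l in li:<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;name = l.i.string.strip()<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;counts[name] = counts.get(name, 0) + 1<br>
<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;startDate += thirtyOne<br>
<br>
&nbsp;&nbsp;&nbsp;&nbsp;# Consolidate names that vary by case under the most popular spelling<br>
&nbsp;&nbsp;&nbsp;&nbsp;nameMap = dict() # Map lower-case name to most popular name<br>
<br>
&nbsp;&nbsp;&nbsp;&nbsp;# Use counts.items() so we can delete from the dict.<br>
&nbsp;&nbsp;&nbsp;&nbsp;for name, count in sorted(counts.items(),<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;key=operator.itemgetter(1), reverse=True):<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;lower = name.lower()<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;if lower in nameMap:<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;# Add counts for a name we have seen already and remove the duplicate<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;counts[nameMap[lower]] += count<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;del counts[name]<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;else:<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;nameMap[lower] = name<br>
<br>
&nbsp;&nbsp;&nbsp;&nbsp;totalPosts = sum(counts.itervalues())<br>
&nbsp;&nbsp;&nbsp;&nbsp;posters = len(counts)<br>
<br>
&nbsp;&nbsp;&nbsp;&nbsp;print<br>
&nbsp;&nbsp;&nbsp;&nbsp;print '%s (%s posts, %s posters)' % (year, totalPosts, posters)<br>
&nbsp;&nbsp;&nbsp;&nbsp;print '===='<br>
&nbsp;&nbsp;&nbsp;&nbsp;for name, count in sorted(counts.iteritems(),<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;key=operator.itemgetter(1), reverse=True)[:20]:<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;pct = round(100.0*count/totalPosts, 1)<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;print '%s %s (%s%%)' % (name.encode('utf-8',<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;'xmlcharrefreplace'), count, pct)<br>
&nbsp;&nbsp;&nbsp;&nbsp;print<br>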
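<br>
The script is written for Python 2 (urllib2, print statements, BeautifulSoup 3). As a rough illustration only, the case-consolidation and top-20 ranking steps might look something like this in Python 3, assuming the per-author counts have already been collected into a plain dict. The function names consolidate_by_case and top_posters are hypothetical and not part of the script; the scraping step is left out.<br>

```python
from collections import Counter


def consolidate_by_case(counts):
    """Merge counts for names that differ only by case.

    The spelling with the most posts is kept as the canonical name.
    Different spellings of the same name are NOT merged.
    """
    merged = Counter()
    canonical = {}  # lower-case name -> most popular spelling

    # Visit names from most to fewest posts so the first spelling seen
    # for each lower-case key is the most popular one.
    ranked = sorted(counts.items(), key=lambda item: item[1], reverse=True)
    for name, count in ranked:
        key = canonical.setdefault(name.lower(), name)
        merged[key] += count
    return dict(merged)


def top_posters(counts, limit=20):
    """Return (name, count, percent) tuples for the most active posters."""
    total = sum(counts.values())
    ranked = sorted(counts.items(), key=lambda item: item[1], reverse=True)
    return [(name, count, round(100.0 * count / total, 1))
            for name, count in ranked[:limit]]


if __name__ == '__main__':
    # Made-up sample data, purely for illustration.
    sample = {'Ann Example': 5, 'ann example': 2, 'Bo Sample': 3}
    for name, count, pct in top_posters(consolidate_by_case(sample)):
        print('%s %s (%s%%)' % (name, count, pct))
```

Sorting before merging is what makes the most popular spelling win, which mirrors the sorted(counts.items(), ...) loop in the script.<br>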
</blockquote></div><br>