Using the nntplib module to count Google Groups users
Zero Piraeus
z at etiol.net
Sun Oct 27 02:37:29 EDT 2013
On Sun, Oct 27, 2013 at 03:35:40PM +1100, Chris Angelico wrote:
> On Sun, Oct 27, 2013 at 2:32 PM, Steven D'Aprano
> <steve+comp.lang.python at pearwood.info> wrote:
> > If anyone wants to modify the script to determine the ratio of posters,
> > rather than posts, using GG, be my guest.
>
> And if anyone does, do please post the result on-list.
Taking a different tack, since I happen to have a complete[1] local
archive of python-list going back a few years ... here's a quick and
dirty script to count unique senders and Google Groups users for this
year:
- - -
import os
from email.parser import HeaderParser

LIST = "python-list@python.org"
MAILDIR = "/path/to/mail/archive/cur"
YEAR = "2013"

parser = HeaderParser()
found = set()    # unique From: headers seen so far
gg_users = 0

for filename in os.listdir(MAILDIR):
    with open(os.path.join(MAILDIR, filename)) as message:
        headers = parser.parse(message)
    sender = headers.get("from", "")
    dest = headers.get("to", "")
    date = headers.get("date", "")
    # skip mail not sent to the list, not from this year, or from a known sender
    if (LIST not in dest) or (YEAR not in date) or (sender in found):
        continue
    found.add(sender)
    # Google Groups posts carry this abuse address in Complaints-To
    if "groups-abuse@google.com" in headers.get("complaints-to", ""):
        gg_users += 1
        print("GG user:")
        print(sender)

print("Senders: %d" % len(found))
print("GG users: %d" % gg_users)
print("---")
- - -
It's obviously not very robust, but I reckon it's good enough to get an
idea what's going on.
The results:
Senders: 1701
GG users: 879
... so just over 50%.
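
For anyone who wants the exact figure, the arithmetic is simple enough to check. This is only a sketch using the two counts above; the variable names are mine, not from the script:

```python
# Counts taken from the results above.
senders = 1701
gg_users = 879

# Share of unique senders whose first counted post came via Google Groups.
gg_share = 100.0 * gg_users / senders
print("GG share: %.1f%%" % gg_share)
```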
If anyone wants the complete output, just let me know and I'll email it
privately.
-[]z.
[1] except for spam filtered out by Gmail.
--
Zero Piraeus: ad referendum
http://etiol.net/pubkey.asc