[Email-SIG] Feed parser recipe
Matthew Dixon Cowles
matt at mondoinfo.com
Sat Aug 21 22:02:43 CEST 2004
Alex and Anna Martelli are working on a new edition of O'Reilly's
Python Cookbook.
As you may know, it's largely based on the recipes in ActiveState's
Python Cookbook site:
Alex asked me if I had anything new to offer, especially anything
that uses the new features from Python 2.4. I have a bit of code that
uses the new feed parser that may be a useful example.
Since the deadline for submissions that use 2.4's features is
September 24, I guess it's pretty unlikely that I'll be able to test
it against a release version, and maybe even unlikely that I'll be
able to test against a beta. So I thought I'd ask here if anyone
thinks that the feed parser is likely to change significantly between
now and 2.4 final.
In particular, the code solves one problem I've seen in real life
when using the feed parser. I've seen at least a few spam mails that
have a content-type of multipart/<something> but which contain only a
single part. The feed parser can parse them, but the resulting
Message object is internally inconsistent: get_main_type() returns
"multipart" but is_multipart() returns False. In that case, my code
applies some messy heuristics in an attempt to figure out what the
right content-type is.
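For anyone who wants to see the inconsistency for themselves, here is a
minimal sketch of how it can be detected. It is not the recipe itself.
It is written against the current email package, where get_main_type()
has been replaced by get_content_maintype(). The sample message, the
helper names (parse, is_inconsistent, guess_content_type) and the very
crude HTML sniffing are all made up for illustration and stand in for
the real heuristics.

```python
from email.feedparser import FeedParser

# Hypothetical sample: the header claims multipart, but the body holds a
# single part and never uses the declared boundary.
RAW = (
    "From: someone@example.com\n"
    "Subject: test\n"
    "MIME-Version: 1.0\n"
    'Content-Type: multipart/alternative; boundary="XYZ"\n'
    "\n"
    "<html><body>Buy now</body></html>\n"
)


def parse(raw):
    """Parse a message string with the feed parser."""
    parser = FeedParser()
    parser.feed(raw)
    return parser.close()


def is_inconsistent(msg):
    """True if the header says multipart but the payload is not a list of parts."""
    return msg.get_content_maintype() == "multipart" and not msg.is_multipart()


def guess_content_type(msg):
    """Return a usable content type for msg.

    The fallback below is an illustrative assumption, far simpler than
    any real-world heuristic: treat the body as HTML if it contains an
    <html tag, otherwise as plain text.
    """
    if not is_inconsistent(msg):
        return msg.get_content_type()
    body = msg.get_payload()  # a string here, since the message is not multipart
    if "<html" in body.lower():
        return "text/html"
    return "text/plain"


msg = parse(RAW)
print(msg.get_content_maintype(), msg.is_multipart())
print(guess_content_type(msg))
```

The parser also records what went wrong in msg.defects, which may be a
better signal than comparing the two methods by hand.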
(Anyone is welcome to the code, but I haven't posted it here because
I doubt that the standard library is the right place for a bunch of
heuristics.)
I'd be glad if anyone who has an opinion about whether that would be
a useful example for the book would let me know.