Hmm, I'll have a bash at it.
It's better than no protection at all, and it's easy enough to do.
A function that resolves entities isn't hard to write though, here's one:
import re
from htmlentitydefs import entitydefs

rentity = re.compile(r"&(\w+);")
rhashentity = re.compile(r"&#(\d+);")

# Resolves entities of the form '&#999;' and '&aaa;' into ASCII strings.
def unentity(s):
    # Named entities; unknown names are left untouched.
    s = rentity.sub(lambda m: entitydefs.get(m.group(1), m.group()), s)
    # Numeric entities; chr() raises ValueError for values above 255.
    s = rhashentity.sub(lambda m: chr(int(m.group(1))), s)
    return s
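For anyone reading this on Python 3, here is a rough sketch of the same idea. It assumes the `html.entities` module (the later name for `htmlentitydefs`) and folds both patterns into one regex; the helper name `resolve` is just something made up for the illustration. The standard library's `html.unescape` covers more cases and is probably the better choice in real code.

```python
import re
from html.entities import name2codepoint

# Matches decimal entities such as '&#106;' and named ones such as '&amp;'.
ENTITY = re.compile(r"&(?:#(\d+)|(\w+));")

def unentity(s):
    """Resolve decimal and named HTML entities in a single pass."""
    def resolve(m):
        number, name = m.groups()
        if number is not None:
            return chr(int(number))
        if name in name2codepoint:
            return chr(name2codepoint[name])
        return m.group()  # unknown name: leave it untouched
    return ENTITY.sub(resolve, s)

print(unentity("&#106;oe&#64;example.com"))  # joe@example.com
print(unentity("a &amp; b &bogus; c"))       # a & b &bogus; c
```

Doing it in one pass means text produced by a named entity is not scanned again for numeric ones, which the two-step version would do.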
It's a shame you can't define your own entities in HTML.
-----Original Message-----
From: Barry A. Warsaw [mailto:barry@zope.com]
Sent: Thursday, 28 February 2002 23:01
To: Paul Schreiber
Cc: Damien Morton; mailman-developers@python.org
Subject: RE: [Mailman-Developers] Protecting email addresses from spam harvesters
"PS" == Paul Schreiber cheesefactory@yahoo.com writes:
PS> Yes, but you can encode the "mailto:" as well, like so:
PS> <a href="mailto:jo 1;@aol.com">me</a>
PS> I would guess most spambots are pretty dumb, probably using a
PS> silly regex like <a href="mailto:([^"]+)">.
This /is/ kind of interesting. Anybody want to write a patch?
-Barry
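
The encoding Paul describes could be sketched roughly as below. This is only an illustration, not a Mailman patch: `encode_entities` and `obscured_mailto` are hypothetical names, the address is a placeholder, and the harvester pattern is the guessed one from the quoted message. It writes every character of the href, including the "mailto:" scheme, as a decimal character reference and then checks that the naive regex no longer matches.

```python
import re

def encode_entities(text):
    """Write every character as a decimal character reference."""
    return "".join("&#%d;" % ord(ch) for ch in text)

def obscured_mailto(address, label):
    # Hypothetical helper: encodes the "mailto:" scheme along with the address.
    # The label is inserted as-is, so it is assumed to be safe HTML already.
    return '<a href="%s">%s</a>' % (encode_entities("mailto:" + address), label)

# The kind of naive harvester pattern guessed at in the quoted message.
NAIVE = re.compile(r'<a href="mailto:([^"]+)">')

plain = '<a href="mailto:joe@example.com">me</a>'
hidden = obscured_mailto("joe@example.com", "me")

print(NAIVE.search(plain) is not None)   # True
print(NAIVE.search(hidden) is not None)  # False
```

A browser resolves the references before following the link, so the link still works for readers; a harvester that resolves entities first would of course still find the address.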