[Python-Dev] RE: [Python-checkins] python/dist/src/Lib/email Parser.py, 1.20, 1.20.12.1

Josiah Carlson jcarlson at uci.edu
Sat May 15 13:39:30 EDT 2004


> >>Yadda yadda yadda.  At this point, the more obvious solution is to ignore
> >>all that MIME crap, and just guess a good structure for the email.  The
> >>spambayes Outlook addin does that all the time <wink>.
> > 
> > 
> > Structure's for monkeyboys and pointy-bracket lovers, not for studly
> > w1nd0z 1nst4ll3r d00dz and 3m41l l33t h4x0rz.
> 
> Hmm, I wonder if we should add another fun codec to the encodings
> package ?! Anyone have a reference ?

The only problem is that the per-character mapping from English to 1337,
or from 1337 back to English, is not one-to-one, so:
s == s.encode('leet').decode('leet')
... would be false in general.
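
To make the round-trip failure concrete, here is a minimal sketch. There is
no 'leet' codec in the encodings package, so the two tables and the
leet_encode/leet_decode helpers are hypothetical stand-ins, and the
substitutions are just ones I picked for illustration:

```python
# Hypothetical per-character tables; not part of any real codec.
ENCODE = {"i": "1", "l": "1", "e": "3", "a": "4", "o": "0"}
# Both 'i' and 'l' encode to '1', so the decoder has to pick one of them.
DECODE = {"1": "i", "3": "e", "4": "a", "0": "o"}


def leet_encode(s):
    """Replace each character that has an entry in ENCODE."""
    return "".join(ENCODE.get(c, c) for c in s)


def leet_decode(s):
    """Replace each character that has an entry in DECODE."""
    return "".join(DECODE.get(c, c) for c in s)


word = "hello"
round_trip = leet_decode(leet_encode(word))
print(leet_encode(word))   # h3110
print(round_trip)          # heiio
print(round_trip == word)  # False
```

Words that avoid the colliding letters (e.g. "tea") survive the round trip,
which is exactly why the failure is easy to miss in casual testing.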

Generating the encoder would be relatively easy, but it would only really
produce interesting results if each letter had several possible
substitutions, for example i -> (l, 1, !, |, ;, :), and each choice was
made in the context of the other choices already made in the current word.
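
A rough sketch of such an encoder follows. The CHOICES table is a made-up
subset of the substitutions above, and the "in-context" rule is only one
possible reading: within a word, a letter always gets the same symbol, and
two different letters never share a symbol. The encode_word name is
hypothetical.

```python
import random

# Hypothetical table: each letter maps to several candidate symbols.
CHOICES = {
    "i": ["1", "!", "|"],
    "l": ["1", "|"],
    "e": ["3"],
    "a": ["4", "@"],
    "o": ["0"],
    "s": ["5", "$"],
    "t": ["7", "+"],
}


def encode_word(word, rng):
    """Encode one word, keeping the choices consistent inside that word."""
    picked = {}  # source letter -> symbol chosen for it in this word
    used = {}    # symbol -> source letter that already claimed it
    out = []
    for ch in word:
        key = ch.lower()
        if key not in CHOICES:
            out.append(ch)  # no substitution known, pass through
            continue
        if key not in picked:
            # Only consider symbols not claimed by a different letter.
            free = [s for s in CHOICES[key] if used.get(s, key) == key]
            sym = rng.choice(free) if free else ch
            picked[key] = sym
            used[sym] = key
        out.append(picked[key])
    return "".join(out)


rng = random.Random(0)  # seeded only so that runs are repeatable
print(encode_word("illicit", rng))
```

With this rule 'i' and 'l' can never both become '1' in the same word,
which removes one source of ambiguity but by no means all of them.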

The decoder would be an exercise in frustration, even assuming we don't
care about getting something exact, just something 'close enough'.
Assuming purely English text, one would need to generate every possible
decoding of each word, then feed those candidates into a phonetic spell
checker, which would need to decide which candidate was likely the
correct one. Sounds like a fun project, though as:
http://cockeyed.com/lessons/viagra/viagra.html
... has shown, there may not be a workable solution to the problem.
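
For what it's worth, the candidate-generation half is easy to sketch. In
the code below the REVERSE table and the WORDS set are invented for
illustration, and a plain set lookup stands in for the phonetic spell
checker, which is the genuinely hard part and is not attempted here:

```python
from itertools import product

# Hypothetical reverse table: each symbol lists the letters it may stand for.
REVERSE = {
    "1": ["i", "l"],
    "|": ["i", "l"],
    "!": ["i"],
    "3": ["e"],
    "4": ["a"],
    "0": ["o"],
    "5": ["s"],
    "7": ["t"],
}

# Toy dictionary standing in for a real (phonetic) spell checker.
WORDS = {"hello", "list", "tail", "tall"}


def candidates(token):
    """Return every possible decoding of one token."""
    options = [REVERSE.get(ch, [ch]) for ch in token.lower()]
    return ["".join(p) for p in product(*options)]


def decode_word(token):
    """Return the candidates that are known words.

    An empty list means no decoding was recognised; more than one entry
    means the token is ambiguous and needs context to resolve.
    """
    return [c for c in candidates(token) if c in WORDS]


print(candidates("h3110"))   # ['heiio', 'heilo', 'helio', 'hello']
print(decode_word("h3110"))  # ['hello']
print(decode_word("7411"))   # ['tail', 'tall']
```

The number of candidates grows as the product of the alternatives per
character, and the last line shows that even a dictionary cannot always
settle on a single answer.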

Pity though, it would be fun.

 - Josiah



