[Tutor] fnord [the red queen and email disguising]

Danny Yoo dyoo@hkn.eecs.berkeley.edu
Wed, 7 Aug 2002 00:20:27 -0700 (PDT)


> > OK, it just came across my CRT that people scrape emails off of
> > publicly accessible archives, for later spamming. Hmmm, interesting. I
> > may write a subroutine that MUNGS the submitter's email address when
> > writing it to the archive, in a way that HUMAN intelligence could
> > easily detect and correct, but which should fry a poor bot's MIND.

Hi Kirk,

###
try:
    gatherEmailAddresses()
except FryingMindError:
    pass
###

Unfortunately (or fortunately), programs don't have minds to fry.  To tell
the truth, I feel some sympathy for the poor deterministic computer program:
it doesn't seem fair that a spam-gathering program should bear
responsibility for the actions of a parasitic user.


Munging an email address with a function is a good idea, and
it may work for a while.  But it's very likely that someone can write a
function that undoes the munging, given enough time.  For example, if we
wrote something that translated '@' to " at " and '.' to " dot " in an email
address:

###  Hypothetical example:
>>> email_encode("matt_ridley@theredqueen.org")
'matt_ridley at theredqueen dot org'
###

then that's still something a regular expression engine can pick up with
ease.  So it has to be a bit more sophisticated than simple text
substitution.  It's unfortunate, but being a programmer doesn't imply
being virtuous, and we have to assume that some spammers have brains, even
if they lack moral qualms.
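
To make that concrete, here is a rough sketch of both halves.  The names
email_encode, email_decode and MUNGED are made up for illustration, and the
pattern is only a guess at what a harvester might try, not something taken
from a real scraper:

```python
import re

def email_encode(address):
    """Hypothetical munger: spell out '@' and '.' as words."""
    return address.replace("@", " at ").replace(".", " dot ")

# A pattern a harvester might plausibly use to spot the munged form
# (an assumption for illustration, deliberately simple).
MUNGED = re.compile(
    r"([\w+-]+(?: dot [\w+-]+)*) at ([\w-]+(?: dot [\w-]+)+)"
)

def email_decode(text):
    """Undo the munging wherever the pattern shows up in text."""
    def restore(match):
        local, domain = match.groups()
        return local.replace(" dot ", ".") + "@" + domain.replace(" dot ", ".")
    return MUNGED.sub(restore, text)

munged = email_encode("matt_ridley@theredqueen.org")
print(munged)
print(email_decode("write to " + munged + " today"))
```

The first print shows the munged address and the second shows it restored
in the middle of a sentence.  A pattern this loose will also mangle innocent
phrases like "look at example dot com", but a harvester probably doesn't
care much about a few false positives.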


We can't make the munging too hard: otherwise, would a human be able to
decode it?  If you can strike a good balance between making the address hard
for programs to extract, but easy for humans to read --- and do it
programmatically! --- a lot of people may name their next of kin after you.


But back to 'The Red Queen' for me.  Talk to you later!