[Spambayes] Exceptionally well-done identity-theft spam

Tim Peters tim.one at comcast.net
Mon Dec 29 16:46:41 EST 2003

[Skip Montanaro]
> Yeah, this is a stinker.  I get them all the time.  Interestingly
> enough, your message scored 0.69 for me.  It probably would have
> scored as spam except it came from you. ;-)

Have you trained on any real msgs from PayPal as ham?  That's the kind of
commercial HTML email that *does* look quite spammy unless you've trained on
it first.  This scam embedded legitimate PayPal URLs too, which showed up as
pure ham clues for me (since PayPal includes them in their legit email,
which I've trained on as ham).

> The real kicker here is this URL:
> http://www.paypal.com%65%6B%6A%68%61%73%6B%6A%71%70%77%6F%70%77%6F@%
> 32%31%31.%36%33.%31%36%32.%39%33:%37%33%30%31/%70%61%79%70%61%6C.%68
> %74%6D
> which unmangles to:
>     http://www.paypal.comekjhaskjqpwopwo@211.63.162.93:7301/paypal.htm
> I'm not about to visit that URL, but I'm almost certain it will look
> just like a PayPal page and that 211.63.162.93 is not in PayPal's
> universe.
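
A minimal sketch of the unmangling step, assuming Python 3 and its standard
urllib.parse module. The variable names are illustrative only. Nothing here
fetches the URL; it only decodes the percent-escapes and splits the result:

```python
from urllib.parse import unquote, urlsplit

# The mangled URL exactly as quoted above, rejoined across the line wraps.
mangled = (
    "http://www.paypal.com%65%6B%6A%68%61%73%6B%6A%71%70%77%6F%70%77%6F@%"
    "32%31%31.%36%33.%31%36%32.%39%33:%37%33%30%31/%70%61%79%70%61%6C.%68"
    "%74%6D"
)

# Replace each %XX escape with the character it encodes.
decoded = unquote(mangled)
parts = urlsplit(decoded)

print(decoded)
print("userinfo:", parts.username)  # everything before the "@"
print("host:    ", parts.hostname)  # where the request would really go
print("port:    ", parts.port)
```

Everything before the "@" is parsed as userinfo, so the "www.paypal.com"
prefix is decoration and the real destination is the numeric host after it.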

Ya, I didn't click on it either.  Maybe worse, what the email *displays* for
that link (what's rendered on the user's screen) is


and that's a correct (legitimate) URL for a PayPal login.  spambayes sees
that too, of course.  At least Outlook shows you the unmangled href (in the
UI's status line) if you hover the mouse over the link.  The instant tip-off
there was that the displayed URL claimed to use https but the actual href
used http -- real PayPal email never does that.
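
That tip-off can be checked mechanically. Below is a rough sketch, assuming
Python 3's standard html.parser and urllib.parse. LinkCollector and
scheme_mismatches are hypothetical names invented for this example, not
anything in spambayes, and the sample link uses placeholder addresses rather
than the ones from the scam:

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit


class LinkCollector(HTMLParser):
    """Collect (href, displayed text) pairs for each <a href=...> element."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None   # href of the <a> currently open, if any
        self._text = []     # text fragments seen inside that <a>

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def scheme_mismatches(html):
    """Return (displayed, href) pairs where the shown scheme differs from the real one."""
    collector = LinkCollector()
    collector.feed(html)
    collector.close()
    found = []
    for href, text in collector.links:
        shown = urlsplit(text).scheme
        actual = urlsplit(href).scheme
        # Only link text that itself looks like an http(s) URL is compared.
        if shown in ("http", "https") and actual and shown != actual:
            found.append((text, href))
    return found


# Placeholder link: the text claims https, the href uses plain http.
sample = '<a href="http://192.0.2.1:7301/login.htm">https://www.example.com/login</a>'
print(scheme_mismatches(sample))
```

Link text that isn't itself a URL ("click here") is ignored, so this only
catches the case where the displayed text makes a claim the href contradicts.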

> This suggests some more possible things to try:
>     * URLs which have usernames in them
>     * URLs which refer to non-standard ports
>     * URLs with IP addresses instead of hostnames (in addition to
>       specific hosts or networks)
> I haven't looked to see if any of these are already recognized, but
> all three techniques seem to be prevalent or required by such scams.
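
As a sketch of what those three checks might look like, assuming Python 3's
standard urllib.parse and ipaddress modules and a URL that has already been
unquoted. The url_clues function and the clue names it emits are made up for
illustration and are not existing spambayes tokens:

```python
import ipaddress
from urllib.parse import urlsplit

# Ports that are unremarkable for each scheme.
DEFAULT_PORTS = {"http": 80, "https": 443}


def url_clues(url):
    """Return hypothetical clue strings for the three URL traits listed above."""
    parts = urlsplit(url)
    clues = []

    # 1. A username (userinfo) in front of the host.
    if parts.username is not None:
        clues.append("url:has-userinfo")

    # 2. An explicit port other than the scheme's default.
    port = parts.port
    if port is not None and port != DEFAULT_PORTS.get(parts.scheme):
        clues.append("url:nonstandard-port")

    # 3. A literal IP address where a hostname would normally be.
    try:
        ipaddress.ip_address(parts.hostname or "")
    except ValueError:
        pass
    else:
        clues.append("url:ip-host")

    return clues


# Placeholder URL with the same shape as the scam's link.
print(url_clues("http://www.example.com.abc@192.0.2.1:7301/login.htm"))
print(url_clues("https://www.example.com/login"))
```

Whether such clues would actually help is a question for testing against
real ham and spam; the sketch only shows that they are cheap to extract.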

The pieces of the URL get broken out and tagged as such (with a "url:"
prefix), but there's no semantic analysis.  Even if there were, the damnable
thing about this spam is that this specific URL is about the *only* thing in
it you won't find in a real PayPal email.  Even the images in it come from
PayPal's real home:

    <img src="http://images.paypal.com/images/pixel.gif" ...
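
To illustrate why that hurts, here is a simplified sketch of "url:"-prefixed
tokenizing in Python 3. It is an assumption-laden stand-in, not the actual
spambayes tokenizer: the url_tokens name and the split-on-non-alphanumerics
rule are chosen here just to show the idea.

```python
import re

# Split on any run of characters that isn't a letter or digit.
_SPLIT = re.compile(r"[^A-Za-z0-9]+")


def url_tokens(url):
    """Break a URL into lowercase pieces, each tagged with a "url:" prefix."""
    _scheme, sep, rest = url.partition("://")
    if not sep:
        # No scheme present; tokenize the whole string.
        rest = url
    return ["url:" + piece.lower() for piece in _SPLIT.split(rest) if piece]


print(url_tokens("http://images.paypal.com/images/pixel.gif"))
```

The scam's image URL produces exactly the tokens a real PayPal message would
produce, so to a word-counting classifier they can only look like ham.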

I don't think a statistical word analyzer (like ours) is going to do much
good against well-done identity-theft scams, and *some* of those have been
getting much better over the last year.  This one was also remarkable for
its good spelling and grammar (still rare in "the typical" scam of this
