Using non-ascii symbols

Neil Hodgson nyamatongwe+thunder at gmail.com
Fri Jan 27 17:29:20 EST 2006


    Having a bit of a play with some of my spam reduction code.

Original:

def isMostlyCyrillic(u):
     if type(u) != type(u""):
         u = unicode(u, "UTF-8")
     cnt = float(sum(0x400 <= ord(c) < 0x500 for c in u))
     return (cnt > 1) and ((cnt / len(u)) > 0.5)

Using more mathematical operators:

def isMostlyCyrillic(u):
     if type(u) ≠ type(u""):
         u ← unicode(u, "UTF-8")
     cnt ← float(∑(0x400 ≤ ord(c) < 0x500 ∀ c ∈ u))
     return (cnt > 1) ∧ ((cnt ÷ len(u)) > 0.5)

    The biggest win for me is "≠", with "←" also an improvement. I'm so
used to "/" for division that "÷" now looks strange.
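
    For anyone trying the original on Python 3, where the `unicode` type no longer exists, a rough equivalent might look like the sketch below. It is not part of the original post. It assumes, as the original does, that byte input is UTF-8. The snake_case name, the `encoding` parameter and the `isinstance` check are substitutions made for illustration.

```python
def is_mostly_cyrillic(text, encoding="utf-8"):
    """Return True if more than half of the characters are in U+0400..U+04FF."""
    # Python 3 has no `unicode` type; decode bytes explicitly instead.
    # Assumption: byte input is UTF-8 unless the caller says otherwise.
    if isinstance(text, bytes):
        text = text.decode(encoding)
    # Count code points in the basic Cyrillic block, as the original does.
    count = sum(0x400 <= ord(ch) < 0x500 for ch in text)
    # `/` is true division in Python 3, so the float() conversion is not needed.
    # The `count > 1` test short-circuits, so an empty string never divides by zero.
    return count > 1 and count / len(text) > 0.5
```

    Called as is_mostly_cyrillic("привет") this should give True, and for plain ASCII text False. As in the original, a string with only a single Cyrillic character is rejected by the `count > 1` test.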

    Neil


