Using non-ascii symbols
Neil Hodgson
nyamatongwe+thunder at gmail.com
Fri Jan 27 17:29:20 EST 2006
Having a bit of a play with some of my spam reduction code.
Original:
def isMostlyCyrillic(u):
    if type(u) != type(u""):
        u = unicode(u, "UTF-8")
    cnt = float(sum(0x400 <= ord(c) < 0x500 for c in u))
    return (cnt > 1) and ((cnt / len(u)) > 0.5)
Using more mathematical operators:
def isMostlyCyrillic(u):
    if type(u) ≠ type(u""):
        u ← unicode(u, "UTF-8")
    cnt ← float(∑(0x400 ≤ ord(c) < 0x500 ∀ c ∈ u))
    return (cnt > 1) ∧ ((cnt ÷ len(u)) > 0.5)
The biggest win for me is "≠", with "←" also an improvement. I'm so
used to "/" for division that "÷" now looks strange.
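
For anyone trying the original check on a current interpreter, here is a
rough Python 3 sketch of the same logic in plain ASCII operators. The name
is_mostly_cyrillic and the encoding parameter are assumptions for
illustration and are not part of the code above; bytes stands in for the
Python 2 byte string, and "/" is already true division in Python 3, so the
float() call is no longer needed.

```python
def is_mostly_cyrillic(text, encoding="utf-8"):
    """Return True when more than half of the characters are Cyrillic.

    Hypothetical Python 3 port of the original function; the name and
    the encoding parameter are illustrative assumptions.
    """
    # Decode byte strings first, as the original did with unicode(u, "UTF-8").
    if isinstance(text, bytes):
        text = text.decode(encoding)
    # Count characters in the Cyrillic block, U+0400 through U+04FF.
    count = sum(0x400 <= ord(c) < 0x500 for c in text)
    # Same thresholds as the original: more than one Cyrillic character
    # and a Cyrillic share above one half. The first test short-circuits,
    # so an empty string never reaches the division.
    return count > 1 and count / len(text) > 0.5
```

Calling is_mostly_cyrillic("Привет, мир") should give True and
is_mostly_cyrillic("hello") should give False; UTF-8 encoded bytes are
decoded before counting.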
Neil