Detect string has non-ASCII chars without checking each char?
John Machin
sjmachin at lexicon.net
Sun Aug 22 06:57:04 EDT 2010
On Aug 22, 5:07 pm, "Michel Claveau - MVP"<enleverLesX_XX... at XmclavXeauX.com.invalid> wrote:
> Hi!
>
> Another way :
>
> # -*- coding: utf-8 -*-
>
> import unicodedata
>
> def test_ascii(struni):
>     strasc=unicodedata.normalize('NFD', struni).encode('ascii','replace')
>     if len(struni)==len(strasc):
>         return True
>     else:
>         return False
>
> print test_ascii(u"abcde")
> print test_ascii(u"abcdê")
-1
Try your code with u"abcd\xa1" ... it says it's ASCII.
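The reason, as far as I can tell: 'replace' substitutes exactly one '?' for each unencodable character, so the two lengths differ only when NFD normalization splits a character into a base letter plus a combining mark. U+00A1 has no decomposition, so nothing changes length. A small sketch of that (the helper name nfd_replace_len is made up for illustration, and the code is written so that it also runs on Python 3):

```python
import unicodedata

def nfd_replace_len(s):
    # Hypothetical helper: length after the NFD + encode('ascii', 'replace') step.
    return len(unicodedata.normalize('NFD', s).encode('ascii', 'replace'))

# u'\xea' (e with circumflex) decomposes into 'e' + U+0302, so the length grows.
print(len(u"abcd\xea"), nfd_replace_len(u"abcd\xea"))

# u'\xa1' (inverted exclamation mark) has no decomposition:
# one character in, one '?' out, so the lengths match.
print(len(u"abcd\xa1"), nfd_replace_len(u"abcd\xa1"))
```

In other words, the length comparison only catches characters that happen to decompose under NFD.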
Suggestions:
test_ascii = lambda s: len(s.decode('ascii', 'ignore')) == len(s)
(that one is for a byte str; for a unicode object, use encode instead of decode)
or
test_ascii = lambda s: all(c < u'\x80' for c in s)
or
use try/except
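A minimal sketch of the try/except variant, assuming s is a text (unicode) string; the name is_ascii is only illustrative. Catching UnicodeError covers both UnicodeEncodeError and UnicodeDecodeError, since both are subclasses of it:

```python
def is_ascii(s):
    # Sketch only: assumes s is a text (unicode) string.
    # A strict ASCII encode raises on the first non-ASCII character.
    try:
        s.encode('ascii')
    except UnicodeError:
        return False
    return True

print(is_ascii(u"abcde"))
print(is_ascii(u"abcd\xa1"))
```

This leaves the scanning to the codec, and it stops at the first offending character.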
Also:
    if a == b:
        return True
    else:
        return False
is a horribly bloated way of writing
    return a == b