Detect string has non-ASCII chars without checking each char?
John Machin
sjmachin at lexicon.net
Sun Aug 22 18:13:57 EDT 2010
On Aug 23, 1:10 am, "Michel Claveau - MVP" <enleverLesX_XX... at XmclavXeauX.com.invalid> wrote:
> Re !
>
> > Try your code with u"abcd\xa1" ... it says it's ASCII.
>
> Ah? On my computer, it says "False"
Perhaps your computer has a problem. Mine does this with both Python
2.7 and Python 2.3 (which introduced the unicodedata.normalize
function):
>>> import unicodedata
>>> t1 = u"abcd\xa1"
>>> t2 = unicodedata.normalize('NFD', t1)
>>> t3 = t2.encode('ascii', 'replace')
>>> [t1, t2, t3]
[u'abcd\xa1', u'abcd\xa1', 'abcd?']
>>> map(len, _)
[5, 5, 5]
>>>
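A likely reason the length comparison is fooled: U+00A1 has no canonical decomposition, so NFD leaves the string unchanged, and the 'replace' handler substitutes exactly one '?' per unencodable character. All three lengths therefore match. A minimal sketch of a check that does not depend on length follows; the helper name is_ascii is my own for illustration, and it assumes the input is a unicode string, not a byte string:

```python
import unicodedata

def is_ascii(text):
    """Return True if the unicode string `text` contains only ASCII characters.

    Hypothetical helper, not taken from the thread. It relies on the default
    'strict' error handler, which raises on the first unencodable character.
    """
    try:
        text.encode('ascii')
    except UnicodeEncodeError:
        return False
    return True

t1 = u"abcd\xa1"
t2 = unicodedata.normalize('NFD', t1)   # U+00A1 does not decompose
t3 = t2.encode('ascii', 'replace')      # one '?' replaces the one bad char

print(len(t1) == len(t3))   # lengths agree, so a length test says "ASCII"
print(is_ascii(t1))         # the strict encode reports the non-ASCII char
print(is_ascii(u"abcd"))
```

The encode still walks the string internally, but the loop runs in C and not in Python code, which is usually what "without checking each char" is after.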