Detect string has non-ASCII chars without checking each char?

Vlastimil Brom vlastimil.brom at gmail.com
Sat Aug 21 16:21:10 EDT 2010


2010/8/21  <python at bdurham.com>:
> Python 2.6: Is there a built-in way to check if a Unicode string has
> non-ASCII chars without having to check each char in the string?
>
> Here's my use case: I have a section of code that makes frequent calls to
> hasattr. The attribute name being tested is derived from incoming data which
> at times can contain international content.
>
> hasattr() raises an exception when passed a Unicode attribute name. I would
> have expected a simple True/False return value vs. an encoding error.
>
> UnicodeEncodeError: 'ascii' codec can't encode character u'\u012c' in
> position 0: ordinal not in range(128)
>
> Is this behavior by design or could I be encoding the string I'm passing
> hasattr() incorrectly?
>
> If it's by design, I'm thinking the best approach for me would be to write a
> hasattr_enhanced() function that traps the Unicode encoding exception and
> returns False and use this function in place of hasattr(). Any thoughts on
> this strategy?
>
> Thank you,
> Malcolm
>
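Hi,
I can't comment on the use case you describe, but to check whether a
unicode string contains only ASCII characters, one can use a simple hack
(I'm not sure about possible drawbacks): encode the string to ASCII while
ignoring errors, decode it back, and compare the result with the original.
Internally this probably still processes every character, but it is
perhaps more straightforward to write.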

>>> a = u"abc"
>>> b = u"abc\u012c"
>>> a.encode("ascii", "ignore").decode("ascii") == a
True
>>> b.encode("ascii", "ignore").decode("ascii") == b
False
>>>
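
A possible variant of the same idea, as a rough sketch only: with the
default "strict" error handling, encode() raises UnicodeEncodeError
instead of silently dropping characters, so the round trip and the
comparison are not needed. The name is_ascii is made up for illustration,
and the sketch assumes the argument is always a unicode object (a Python 2
byte string holding non-ASCII bytes would fail with a different error).

```python
def is_ascii(text):
    """Return True if the unicode string `text` holds only ASCII characters.

    Hypothetical helper; assumes `text` is a unicode object, not a byte string.
    """
    try:
        # Strict mode (the default) raises as soon as a character
        # outside the ASCII range is encountered.
        text.encode("ascii")
    except UnicodeEncodeError:
        return False
    return True
```

The characters are still examined one by one inside the codec, only not
in Python-level code.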

Others may supply more general/elegant/... approaches.
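
For what it's worth, the wrapper described in the original question might
look roughly like the sketch below. It is untested against the real code
and only assumes what the question states: that hasattr() raises
UnicodeEncodeError when the attribute name cannot be encoded as ASCII.
The Record class is a made-up stand-in used only to show the call.

```python
def hasattr_enhanced(obj, name):
    """Like hasattr(), but report a name that cannot be encoded as missing."""
    try:
        return hasattr(obj, name)
    except UnicodeEncodeError:
        # Per the question, Python 2 fails here for non-ASCII unicode names.
        # Such a name is treated as "attribute not present".
        return False


class Record(object):
    """Hypothetical stand-in for whatever object is being inspected."""
    title = "example"


record = Record()
found_plain = hasattr_enhanced(record, u"title")
found_international = hasattr_enhanced(record, u"\u012c")
```

Catching only UnicodeEncodeError keeps any other unexpected error visible
instead of silently turning it into False.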

vbr


