Detect string has non-ASCII chars without checking each char?
vlastimil.brom at gmail.com
Sat Aug 21 22:21:10 CEST 2010
2010/8/21 <python at bdurham.com>:
> Python 2.6: Is there a built-in way to check if a Unicode string has
> non-ASCII chars without having to check each char in the string?
> Here's my use case: I have a section of code that makes frequent calls to
> hasattr. The attribute name being tested is derived from incoming data which
> at times can contain international content.
> hasattr() raises an exception when passed a Unicode attribute name. I would
> have expected a simple True/False return value vs. an encoding error.
> UnicodeEncodeError: 'ascii' codec can't encode character u'\u012c' in
> position 0: ordinal not in range(128)
> Is this behavior by design or could I be encoding the string I'm passing
> hasattr() incorrectly?
> If it's by design, I'm thinking the best approach for me would be to write a
> hasattr_enhanced() function that traps the Unicode encoding exception and
> returns False and use this function in place of hasattr(). Any thoughts on
> this strategy?
> Thank you,
I can't comment on the use case you mention, but to check whether a
unicode string contains only basic ASCII characters, one can maybe use
a simple hack (I'm not sure about possible drawbacks ...).
It is likely working with all characters too, but maybe in a more
>>> a = u"abc"
>>> b = u"abc\u012c"
>>> a.encode("ascii", "ignore").decode("ascii") == a
True
>>> b.encode("ascii", "ignore").decode("ascii") == b
False
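A variant of the same idea, offered only as a rough sketch: instead of the encode/decode round trip, one could let a strict encode fail and catch the exception. The name is_ascii is made up for illustration, and the function assumes it is handed a unicode string (a Python 2 byte string containing non-ASCII bytes would raise UnicodeDecodeError instead, which this does not handle).

```python
def is_ascii(text):
    """Return True if the unicode string holds only ASCII characters.

    is_ascii is a hypothetical helper name; text is assumed to be a
    unicode string, not a byte string.
    """
    try:
        # Strict encoding (the default) raises on the first non-ASCII char.
        text.encode("ascii")
    except UnicodeEncodeError:
        return False
    return True


plain = u"abc"
accented = u"abc\u012c"
print(is_ascii(plain))
print(is_ascii(accented))
```

The characters are still inspected, but inside the codec rather than in a Python-level loop.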
Others may supply more general/elegant/... approaches.
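For what it's worth, the hasattr_enhanced() idea from the original question could look roughly like the sketch below. It only assumes what the question states: that hasattr() raises UnicodeEncodeError for a non-ASCII unicode name on Python 2.6. The Record class is just a placeholder for testing.

```python
def hasattr_enhanced(obj, name):
    """Like hasattr(), but return False instead of raising when the
    attribute name cannot be encoded as ASCII (the Python 2 behaviour
    described in the question)."""
    try:
        return hasattr(obj, name)
    except UnicodeEncodeError:
        # The name contains non-ASCII characters, so treat the
        # attribute as absent.
        return False


class Record(object):
    # Placeholder class used only to exercise the helper.
    title = "example"


print(hasattr_enhanced(Record(), u"title"))
print(hasattr_enhanced(Record(), u"\u012ctitle"))
```

Catching only UnicodeEncodeError keeps other errors visible, which seems safer than a bare except.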