String is ASCII or UTF-8?
C. Benson Manica
cbmanica at gmail.com
Tue Mar 9 12:17:58 EST 2010
On Mar 9, 12:07 pm, Tim Golden <m... at timgolden.me.uk> wrote:
> You can't. You can apply one or more heuristics, depending on exactly
> what your requirement is. But any valid ASCII text is also valid
> UTF-8-encoded text since UTF-8 isn't "two bytes per char" but a variable
> number of bytes per char.
Hm, well that's very unfortunate. I'm using a database library which
seems to assume that all strings passed to it are ASCII, and I'm
attempting to use it on two different systems - one where all strings
are ASCII, and one where they seem to be UTF-8. The strings come from
the same place, i.e. they're exclusively normal ASCII characters.
What I want is to check once whether the strings passed to
function foo() are ASCII or UTF-8, and if they are UTF-8, to assume that
all strings need to be decoded. So that's not possible?
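
To illustrate the point quoted above, here is a minimal sketch of the kind of heuristic check being discussed. It assumes Python 3 and byte strings as input. The names classify() and to_text() are hypothetical and are not part of the database library in question. Because pure-ASCII bytes decode to the same text under either codec, the sketch cannot tell the two apart when no non-ASCII byte is present, and it reports such input as "ascii".

```python
def classify(raw: bytes) -> str:
    """Return 'ascii', 'utf-8', or 'unknown' for a byte string.

    Hypothetical helper: pure-ASCII input is reported as 'ascii'
    even though it is also valid UTF-8.
    """
    try:
        raw.decode("ascii")
        return "ascii"
    except UnicodeDecodeError:
        pass
    try:
        raw.decode("utf-8")
        return "utf-8"
    except UnicodeDecodeError:
        # Neither codec accepts these bytes (e.g. Latin-1 data).
        return "unknown"


def to_text(value):
    """Decode bytes as UTF-8; pass already-decoded str through unchanged.

    Hypothetical helper: safe for ASCII-only data because ASCII is a
    subset of UTF-8.
    """
    if isinstance(value, bytes):
        return value.decode("utf-8")
    return value
```

If the strings really are exclusively ASCII characters, decoding them as UTF-8 on both systems should give the same result, so a per-system check may not be needed at all.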