byte count unicode string
sjmachin at lexicon.net
Wed Sep 20 12:12:54 CEST 2006
> John Machin:
> >You are confusing the hell out of yourself. You say that your web app
> >deals only with UTF-8 strings. Where do you get "the unicode string"
> >from??? If name is a utf-8 string, as your comment says, then len(name)
> >is all you need!!!
> # I'll go ahead and concede defeat since you appear to be on the
> # verge of a heart attack :)
> # I can see that I lack clarity so I don't blame you.
All you have to do is use terminology like "Python str object, encoded
in utf-8" and "Python unicode object".
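
To make the two terms concrete, here is a minimal sketch. It is written for Python 3, where the text type is called str and the byte type is called bytes; in the Python 2 of this thread those are the unicode object and the str object respectively. The sample value is made up for illustration.

```python
# Python 3 naming: str is the text type (Python 2 "unicode"),
# bytes is the byte-string type (Python 2 "str").
text = "caf\u00e9"                # text object holding 4 characters
encoded = text.encode("utf-8")    # byte string, encoded in UTF-8

char_count = len(text)            # len() on text counts characters
byte_count = len(encoded)         # len() on the encoded form counts bytes

print(char_count, byte_count)
```

The last character takes two bytes in UTF-8, so the two lengths differ. Which one you want depends on which kind of object you are actually holding.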
> # By UTF-8 string, I mean a unicode object with UTF-8 encoding:
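There is no such animal as a "unicode object with UTF-8 encoding".
A unicode object holds characters; UTF-8 only comes into it when you
encode those characters into a str. Don't make up terminology as you go.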
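
A rough illustration of why the phrase does not hold together, again in Python 3 spelling (str for text, bytes for encoded data) and with an invented sample value: the same text object can be encoded several ways, and each encoding gives a different byte string.

```python
text = "na\u00efve"                    # one text object, no encoding attached

as_utf8 = text.encode("utf-8")         # 6 bytes: the accented letter takes two
as_latin1 = text.encode("latin-1")     # 5 bytes: every character takes one

# Decoding each byte string with its own codec gives back the same text.
round_trip_utf8 = as_utf8.decode("utf-8")
round_trip_latin1 = as_latin1.decode("latin-1")

print(as_utf8 != as_latin1, round_trip_utf8 == round_trip_latin1 == text)
```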
> <type 'unicode'>
> >>> repr(ustr)
Sigh. I suppose we have to infer that "ustr" is the same as the "name"
that you were getting as post data. Is that correct?
> # The database API expects unicode objects:
> # A template query, then a variable number of values.
> # Perhaps I'm a victim of arbitrary design decisions :)
And the database will encode those unicode objects as utf-8, silently
truncating any that are too long -- just as Duncan feared? "Arbitrary"
is not the word for it.
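
If the database does store UTF-8 with a byte limit, one way to avoid relying on its behaviour is to check the encoded length before handing the value over. The sketch below is only that: the limit MAX_NAME_BYTES and the helper names are hypothetical, nothing here comes from the database API under discussion, and it uses Python 3 spelling (str for text, bytes for encoded data).

```python
# Hypothetical limit: assume the column holds at most this many bytes.
MAX_NAME_BYTES = 8

def utf8_length(text):
    """Return how many bytes text occupies once encoded as UTF-8."""
    return len(text.encode("utf-8"))

def truncate_utf8(text, max_bytes):
    """Cut text so its UTF-8 form fits in max_bytes without splitting a character."""
    encoded = text.encode("utf-8")[:max_bytes]
    # The slice may end in the middle of a multi-byte character;
    # errors="ignore" drops that incomplete trailing sequence.
    return encoded.decode("utf-8", errors="ignore")

name = "\u00e9" * 5                       # 5 characters, 2 bytes each in UTF-8

if utf8_length(name) > MAX_NAME_BYTES:
    name_to_store = truncate_utf8(name, MAX_NAME_BYTES)
else:
    name_to_store = name

print(len(name), utf8_length(name), len(name_to_store))
```

Whether to truncate, reject, or widen the column is a design decision for the application; the point is only that the count that matters here is bytes, not characters.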