[Mailman-Developers] "Orignal" MySQL Member Adaptor - 1.71

Mark Sapiro mark at msapiro.net
Wed Jan 7 17:18:28 CET 2009


kyrian (List) wrote:
>
>The gist of the problem seems to be that you need to treat the strings 
>as utf-8 or iso-8859-1 encoded 'objects' rather than standard ASCII 
>string types within the code, and I don't know for sure how to do that.

And you have to know which, because there are iso-8859-1 encoded
characters which aren't valid utf-8 sequences, and there are utf-8 encoded
characters which get silently garbled if decoded as iso-8859-1.
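
A quick illustration of both failure modes, as a sketch only. It is
written in Python 3 syntax (bytes.decode() rather than the Python 2
unicode() call), and the sample text is made up for the example:

```python
# "café" encoded both ways; the sample text is only an illustration.
latin1_bytes = "caf\u00e9".encode("iso-8859-1")   # b'caf\xe9'
utf8_bytes = "caf\u00e9".encode("utf-8")          # b'caf\xc3\xa9'

# Failure mode 1: iso-8859-1 data is not necessarily valid utf-8,
# so decoding it as utf-8 can raise.
try:
    latin1_bytes.decode("utf-8")
    latin1_as_utf8_ok = True
except UnicodeDecodeError:
    latin1_as_utf8_ok = False

# Failure mode 2: every byte is a valid iso-8859-1 character, so utf-8
# data decodes without any exception, but to the wrong characters.
garbled = utf8_bytes.decode("iso-8859-1")
```

The second case is the nastier one, since nothing signals that the
result is wrong.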

Thus, code like

        try:
            unicode(value, "ascii")
        except UnicodeError:
            value = unicode(value, "utf-8")
        else:
            # value was valid ASCII data
            pass

which I think is no different from simply

        value = unicode(value, "utf-8")

since if value is ASCII to begin with, decoding it as utf-8 gives the
same result (ASCII is a subset of utf-8),

doesn't work if value is actually iso-8859-1 encoded and contains bytes
which aren't valid utf-8 or which decode to different characters as utf-8.
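
To make that concrete, here is a rough sketch in Python 3 syntax. The
names decode_with_fallback and decode_value are hypothetical and are not
part of the adaptor or of Mailman; the first one returns the decoded text
in both branches, which is a slight simplification of the pattern above.
The second assumes the caller already knows the charset, e.g. from the
database or list configuration:

```python
def decode_with_fallback(value):
    """Python 3 rendering of the ascii-then-utf-8 pattern (hypothetical name)."""
    try:
        return value.decode("ascii")
    except UnicodeError:
        return value.decode("utf-8")


def decode_value(value, charset):
    """Hypothetical helper: decode using a charset the caller already knows."""
    return value.decode(charset)


# For pure ASCII input, the fallback adds nothing over a plain utf-8 decode.
ascii_bytes = b"plain text"
same_for_ascii = decode_with_fallback(ascii_bytes) == ascii_bytes.decode("utf-8")

# For iso-8859-1 input, the fallback fails exactly as a plain utf-8 decode would.
latin1_bytes = b"caf\xe9"
try:
    decode_with_fallback(latin1_bytes)
    fallback_failed = False
except UnicodeDecodeError:
    fallback_failed = True

# Decoding with the known charset gives the intended text.
correct = decode_value(latin1_bytes, "iso-8859-1")
```

The point is only that the charset has to come from somewhere other
than the bytes themselves.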

-- 
Mark Sapiro <mark at msapiro.net>        The highway is for gamblers,
San Francisco Bay Area, California    better use your sense - B. Dylan


