Problem Regarding Handling of Unicode string

Ulrich Eckhardt eckhardt at satorlaser.com
Mon Aug 10 07:41:01 EDT 2009


joy99 wrote:
> [...] it is giving me output like:
> '\xef\xbb\xbf\xe0\xa6\x85\xe0\xa6\xa8\xe0\xa7\x87\xe0\xa6\x95'
   ^^^^^^^^^^^^

These three bytes encode the byte order mark (BOM, U+FEFF) as UTF-8.
The remaining bytes are UTF-8 as well, starting with code points U+0985
and U+09A8 (look them up on unicode.org to see what they are).
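
To check this yourself, here is a small sketch that decodes the quoted
bytes and lists every code point in them. It assumes Python 3 syntax (in
Python 2 the literal would be written without the b prefix), and the
variable names are only illustrative:

```python
import codecs
import unicodedata

# The byte string quoted above, written as a Python 3 bytes literal.
raw = b'\xef\xbb\xbf\xe0\xa6\x85\xe0\xa6\xa8\xe0\xa7\x87\xe0\xa6\x95'

# codecs.BOM_UTF8 is the three-byte UTF-8 encoding of U+FEFF.
has_bom = raw.startswith(codecs.BOM_UTF8)

# Plain 'utf-8' decoding keeps the BOM as the first character.
text = raw.decode('utf-8')

# One 'U+XXXX' label and one Unicode character name per decoded character.
codepoints = ['U+%04X' % ord(ch) for ch in text]
names = [unicodedata.name(ch, '<unnamed>') for ch in text]

print('starts with BOM:', has_bom)
for cp, name in zip(codepoints, names):
    print(cp, name)
```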

In any case, if this is what you get as output, an encoding or decoding
step is missing somewhere. You mentioned that it works in one case but
not in another. Since you didn't provide any information on how to
reproduce what you saw, any further help is guesswork at best.
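
As one such guess: if the bytes come from a file or another source that
writes a leading BOM, decoding them with the 'utf-8-sig' codec would give
you a Unicode string without the BOM. This is only a sketch under that
assumption, again in Python 3 syntax, and not a diagnosis of your actual
program:

```python
# The byte string from the original post, as a Python 3 bytes literal.
raw = b'\xef\xbb\xbf\xe0\xa6\x85\xe0\xa6\xa8\xe0\xa7\x87\xe0\xa6\x95'

# 'utf-8-sig' decodes UTF-8 and drops a leading BOM if one is present.
text = raw.decode('utf-8-sig')

# Input without a BOM decodes the same way as with plain 'utf-8'.
plain = b'abc'.decode('utf-8-sig')

print(len(text))   # number of characters left after the BOM is removed
print(repr(text))
```

Whether this is the right fix depends on where the bytes come from and
where the output goes, which is exactly the information that is missing.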

Uli

-- 
Sator Laser GmbH
Geschäftsführer: Thorsten Föcking, Amtsgericht Hamburg HR B62 932



