[Chicago] understanding unicode problems
Carl Karsten
carl at personnelware.com
Fri Nov 16 16:07:40 CET 2007
Kumar McMillan wrote:
> On Nov 15, 2007 4:13 PM, Carl Karsten <carl at personnelware.com> wrote:
>> Of course, now a Unicode problem just hit me.
>>
>> I use the Django admin to enter Ivan Krstic'
>> and reportlab spits out: http://dev.personnelware.com/carl/a/IvanK1.pdf
>>
>> So it is pretty much 100% Python.
>>
>> I am told:
>>
>> > Make sure that you are using utf-8 and not some other encoding, such as
>> > latin-1.
>>
>> But I really don't know what that means, nor do I even know how to debug this.
>
> I wrote up a little something about it when it finally clicked for me:
> http://farmdev.com/thoughts/23/what-i-thought-i-knew-about-unicode-in-python-amounted-to-nothing/
> (I was in the same spot, I knew I *should* use UTF-8 but wasn't sure
> how or why or what that even implied)
"However, it's not always possible to work with unicode all the time because not
everything supports it. As just one example, you'll need to create a wrapper
that temporarily encodes / decodes data when reading a csv file using the
standard csv module."
Is there a standard way of encoding?
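For reference, here is a minimal sketch of the kind of wrapper the quoted passage describes. It assumes Python 3, where the csv module works on text, so the decoding happens once at the file boundary. In the Python 2 of this thread the csv module worked on byte strings, so the wrapper would encode to UTF-8 going in and decode coming out. The function name unicode_csv_rows and the sample row are made up for illustration.

```python
import csv
import io


def unicode_csv_rows(byte_stream, encoding="utf-8"):
    """Yield rows of text (str) from a binary stream of CSV data.

    Hypothetical helper: the name and signature are illustrative only.
    """
    # Decode bytes to text at the boundary; csv.reader then only sees text.
    text_stream = io.TextIOWrapper(byte_stream, encoding=encoding, newline="")
    for row in csv.reader(text_stream):
        yield row


# Sample data (made up): CSV bytes as they might sit in a UTF-8 file on disk.
raw = "name,city\nIvan Krstić,Chicago\n".encode("utf-8")
rows = list(unicode_csv_rows(io.BytesIO(raw)))
```

The encoding is an explicit parameter here because the bytes themselves do not say which encoding produced them; the caller has to know.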
As I understand it, a string (Unicode or not) is a bunch of bytes, and Unicode
characters may use more than one byte. What I don't understand is why I need to
encode / decode at all. I get the feeling the error is raised as a reminder, "so
that you know that you need to do the other operation later."
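A rough way to poke at this in an interpreter, assuming Python 3 (where str is text and bytes is encoded data) and assuming the name above is meant to be spelled with a ć. The variable names are just for illustration.

```python
name = "Ivan Krstić"                 # text: a sequence of Unicode characters

utf8 = name.encode("utf-8")          # the same text as UTF-8 bytes
latin2 = name.encode("iso-8859-2")   # the same text as ISO-8859-2 bytes

# The byte representation depends on the encoding chosen:
# "ć" takes two bytes in UTF-8 but one byte in ISO-8859-2.
lengths = (len(name), len(utf8), len(latin2))

# latin-1 has no "ć" at all, so encoding to it fails outright.
try:
    name.encode("latin-1")
    latin1_ok = True
except UnicodeEncodeError:
    latin1_ok = False

# Decoding with the wrong encoding does not fail here; it silently
# produces different (garbled) text.
garbled = utf8.decode("latin-1")
round_trip = utf8.decode("utf-8")
```

Only the decode that matches the original encode gives the name back, which seems to be the point of being told to use utf-8 consistently rather than latin-1.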
Carl K