python 3.1 unicode question
clp2 at rebertia.com
Wed Sep 16 07:07:40 CEST 2009
On Tue, Sep 15, 2009 at 9:48 PM, jeffunit <jeff at jeffunit.com> wrote:
> At 09:25 PM 9/15/2009, Mark Tolonen wrote:
>> "jeffunit" <jeff at jeffunit.com> wrote in message
>> news:20090915144123964.LJKA6569 at cdptpa-omta01.mail.rr.com...
>>> I wrote a program that diffs files and prints out matching file names.
>>> I will be executing the output with sh, to delete select files.
>>> Most of the file names are plain ASCII, but about 10% of them have non-ASCII
>>> characters in them. When I try to print the string containing the name, I get
>>> an exception:
>>> 'ascii' codec can't encode character '\udce9'
>>> in position 37: ordinal not in range(128)
>>> The string is:
>>> This is on a Windows XP system, using Python 3.1, which I compiled
>>> with the Cygwin Linux compatibility layer.
>>> Can you tell me what encoding I need to print \udce9 and how to set
>>> Python to that encoding mode?
>> That looks like a "surrogate escape" (See PEP 383)
>> http://www.python.org/dev/peps/pep-0383/. It indicates the wrong encoding
>> was used to decode the filename.
> That seems likely. How do I set the encoding to something correct to decode
> the filename?
> Clearly Windows knows how to display it.
> I suspect that since I compiled Python with Cygwin, it is using a POSIX
> rather than a Windows-specific standard. Of course, ideally I would like my
> code to work on Linux as well as Windows, as I back up all of my data to a
> Linux machine.
Have you perhaps tried using the native Windows version of Python?
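
In case it helps to see what the '\udce9' actually is, here is a minimal
sketch of the PEP 383 round trip. The file name, the cp1252 guess and the
printable() helper are all made up for illustration; I don't know what
encoding your names really use, so treat this as something to experiment
with rather than a fix.

```python
# Hypothetical file name as raw bytes: "café.txt" written in cp1252/Latin-1.
raw = b"caf\xe9.txt"

# Decoding as UTF-8 with the PEP 383 error handler turns the undecodable
# byte 0xE9 into the lone surrogate U+DCE9, as in the traceback above.
name = raw.decode("utf-8", "surrogateescape")

# Encoding with the same handler recovers the original bytes unchanged.
recovered = name.encode("utf-8", "surrogateescape")

# Re-decoding with the encoding the name was really written in
# (cp1252 is only an assumption here) yields proper text.
fixed = recovered.decode("cp1252")


def printable(s, encoding="ascii"):
    """Return s with characters the target encoding can't represent
    replaced by backslash escapes, so that print() will not raise."""
    return s.encode(encoding, "backslashreplace").decode(encoding)


print(printable(name))
```

The printable() step only makes the name safe to display. If the output is
going to be fed to sh, the escaped form no longer names the real file, so
the recovered bytes are probably what you would want to write out instead.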