Help with Latin Characters
tjreedy at udel.edu
Sun Jul 24 20:30:10 CEST 2011
On 7/24/2011 11:15 AM, Joao Jacome wrote:
> list = os.listdir(dir)
While somewhat natural, using 'list' as a local name and masking the
builtin list function is a *very bad* idea. Someday you will do this and
then use 'list(args)' expecting to call the list function, and it will
fail.
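A minimal sketch of how that failure shows up. The function names and file names here are made up for illustration, and the code is written so it also runs on Python 3:

```python
def shadowed():
    # Hypothetical example: the local name 'list' hides the builtin
    # for the rest of this function.
    list = ["a.txt", "b.txt"]
    try:
        return list("abc")   # calls the local list object, not the builtin
    except TypeError as exc:
        return str(exc)


def renamed():
    # Same data under a different name; the builtin stays reachable.
    names = ["a.txt", "b.txt"]
    return list("abc")


shadow_result = shadowed()
renamed_result = renamed()
print(shadow_result)
print(renamed_result)
```

Picking a name such as 'names' or 'entries' for the os.listdir() result avoids the problem.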
> When the script reaches a file with latin characters (ê é ã etc) it crashes.
> Traceback (most recent call last):
> File "C:\backup\ORGANI~1\teste.py", line 37, in <module>
> File "C:\backup\ORGANI~1\teste.py", line 25, in Retrieve
> File "C:\backup\ORGANI~1\teste.py", line 18, in Retrieve
> print l
> File "C:\Python27\lib\encodings\cp850.py", line 12, in
> return codecs.charmap_encode(input,errors,encoding_map)
> UnicodeEncodeError: 'charmap' codec can't encode character u'\x8a' in
> position 4
> 3: character maps to <undefined>
'\x8a' *is* the cp850 encoded byte for e with a grave accent: è
But your program treats it as a unicode value, where it is a control char
(Line Tabulation Set), and tries to encode it to cp850, which is not
possible.
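The two readings of 0x8a can be checked directly. This is only a sketch, using Python 3 syntax (bytes and str literals) rather than the 2.7 of the traceback, and the variable names are mine:

```python
import unicodedata

# Read as a cp850 *byte*, 0x8a decodes to e with a grave accent.
raw = b"\x8a"
decoded = raw.decode("cp850")
decoded_name = unicodedata.name(decoded)

# Read as a unicode *code point*, U+008A is a control character ...
wrong = u"\x8a"
wrong_category = unicodedata.category(wrong)   # 'Cc' means control

# ... and cp850 has no byte for it, so encoding raises the same
# kind of error as in the traceback.
try:
    wrong.encode("cp850")
    encode_error = None
except UnicodeEncodeError as exc:
    encode_error = exc

print(decoded_name, wrong_category, type(encode_error).__name__)
```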
I suspect this has something to do with defining the rootdir as a
unicode string: rootdir = u"D:\\ghostone"
Perhaps if you removed the 'u', your program would work.
Or perhaps you should explicitly decode the values in os.listdir(dir)
before joining them to the rootdir and re-encoding.
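A rough sketch of the explicit-decode route, in Python 3 syntax. The helper decode_names, the sample file names, and the choice of cp850 as the encoding of the raw names are all assumptions for illustration; the real encoding on your system may differ. ntpath is used instead of os.path only so the Windows-style join behaves the same on any platform:

```python
import ntpath


def decode_names(raw_names, encoding="cp850"):
    # Hypothetical helper: turn byte names into unicode, leaving
    # names that are already unicode untouched.
    return [n.decode(encoding) if isinstance(n, bytes) else n
            for n in raw_names]


rootdir = u"D:\\ghostone"

# Stand-in for what os.listdir() might return as bytes (assumed data).
raw_names = [b"caf\x82.txt", b"m\x8ae.txt"]

# Decode first, then join unicode with unicode.
paths = [ntpath.join(rootdir, name) for name in decode_names(raw_names)]

# Re-encode only at the output boundary; errors="replace" substitutes
# '?' for anything the console code page cannot represent.
console_bytes = [p.encode("cp850", errors="replace") for p in paths]
```

The idea is to keep everything unicode inside the program and convert only at the edges, so a byte value is never mistaken for a code point.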
This sort of thing sometimes works better with Python 3.
> Does someone knows how to fix this?
> Thank you!
> João Victor Sousa Jácome
Terry Jan Reedy