[Tutor] close, but no cigar

Mon Jul 22 23:11:07 CEST 2013

On Mon, Jul 22, 2013 at 1:55 PM, Jim Mooney <cybervigilante at gmail.com> wrote:

> On 22 July 2013 13:45, Marc Tompkins <marc.tompkins at gmail.com> wrote:
>
>>
>>     inFileName = "/Users/Marc/Desktop/rsp/tree.txt"
>>     with open(inFileName, 'r') as inFile:
>>         inString = inFile.read().decode('cp437')
>>         print inString
>>
> I already tried something similar and got an error:
>
> with open('../pytree.txt') as pytree, open('newpytree.txt','w') as pyout:
>     for line in pytree:
>         for char in line:
>             newchar = char.decode('cp437')
>             pyout.write(newchar)
>
> UnicodeEncodeError: 'ascii' codec can't encode character u'\u2502' in
> position 0: ordinal not in range(128)
>
> What's my error here, since I want to write to a file. I have python(x,y)
> so printing that is kind of time-taking and not too illuminating ;')
>

The error's in your error message: Python has decoded the string properly,
but you're then writing the resulting unicode object to a file, and (since
you haven't specified an encoding) Python is trying to encode it to the
default, which in Python < 3 is 'ascii'... which has a great big blank
space where all characters from 128 up should be.
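Here's a minimal sketch that reproduces the same failure without any files, assuming the character in question came from cp437 byte 0xB3 (the vertical bar that TREE draws). The variable names are just for illustration:

```python
# cp437 byte 0xB3 is the box-drawing vertical bar, U+2502.
box = b'\xb3'.decode('cp437')
assert box == u'\u2502'

# Decoding worked. Encoding to ASCII is the step that fails,
# because ASCII only covers code points 0 through 127.
try:
    box.encode('ascii')
except UnicodeEncodeError as err:
    message = str(err)

print(message)
```

In Python 2, pyout.write() performs that ASCII encode implicitly when it is handed a unicode object, which is why the traceback points at the write.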

One way to deal with this is to specify an encoding:
    newchar = char.decode('cp437').encode('utf-8')
which works just fine for me.
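A variation on the same idea, as a rough sketch rather than the one true way: let io.open() do the decoding and encoding, so the loop only ever handles text. The function name convert_cp437_to_utf8, the file names, and the sample bytes are made up for illustration; io.open() itself is in the standard library for both Python 2.6+ and 3.

```python
import io
import os
import tempfile

def convert_cp437_to_utf8(src_path, dst_path):
    """Read src_path as cp437 text and write it back out as UTF-8."""
    # newline='' keeps the original line endings untouched.
    with io.open(src_path, 'r', encoding='cp437', newline='') as src, \
         io.open(dst_path, 'w', encoding='utf-8', newline='') as dst:
        for line in src:          # each line is already decoded text
            dst.write(line)       # encoded to UTF-8 on the way out

# Small self-check with two fake TREE-style lines (hypothetical sample data).
workdir = tempfile.mkdtemp()
src_path = os.path.join(workdir, 'tree.txt')
dst_path = os.path.join(workdir, 'newtree.txt')
with open(src_path, 'wb') as f:
    f.write(b'\xc3\xc4\xc4 docs\n\xc0\xc4\xc4 src\n')

convert_cp437_to_utf8(src_path, dst_path)

with open(dst_path, 'rb') as f:
    converted = f.read()
```

This also avoids decoding one byte at a time, which happens to work for a single-byte code page like cp437 but would break on a multi-byte encoding.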

Another way is to tell Python what it should do in case of encoding errors:
-  ignore (just drop the offending characters from the output):
    newchar = char.decode('cp437').encode('ascii', 'ignore')

-  replace (all offending characters get replaced with something
non-offending, a '?' when encoding - like TREE did with my Cyrillic filename):
    newchar = char.decode('cp437').encode('ascii', 'replace')

-  xmlcharrefreplace (replace offending characters with their XML numeric
character references, e.g. &#9474; for u'\u2502'):
    newchar = char.decode('cp437').encode('ascii', 'xmlcharrefreplace')
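To compare the three handlers side by side, here is a short sketch using a made-up TREE-style fragment (cp437 bytes 0xC3 and 0xC4, followed by a plain ASCII filename). The variable names are only for illustration:

```python
# Two box-drawing characters followed by plain ASCII text.
text = b'\xc3\xc4 a.txt'.decode('cp437')      # u'\u251c\u2500 a.txt'

dropped  = text.encode('ascii', 'ignore')             # offenders removed
replaced = text.encode('ascii', 'replace')            # offenders become '?'
xmlrefs  = text.encode('ascii', 'xmlcharrefreplace')  # offenders become &#NNNN;

print(repr(dropped))    # ' a.txt'
print(repr(replaced))   # '?? a.txt'
print(repr(xmlrefs))    # '&#9500;&#9472; a.txt'
```

Only xmlcharrefreplace keeps enough information to get the original characters back; 'ignore' and 'replace' are lossy.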

This page is your friend: http://docs.python.org/2/howto/unicode.html
and so is this one:
http://docs.python.org/2/library/codecs.html#standard-encodings