<div class="gmail_quote">On Thu, Dec 22, 2011 at 12:42 PM, Peter Otten <span dir="ltr"><__<a href="mailto:peter__@web.de">peter__@web.de</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div class="im">The file is probably encoded in ISO-8859-1, ISO-8859-15, or cp1252 then:</div>
<br>
>>> print "\xe1".decode("iso-8859-1")<br>
á<br>
>>> print "\xe1".decode("iso-8859-15")<br>
á<br>
>>> print "\xe1".decode("cp1252")<br>
á<br>
<br>
Try codecs.open() with one of these encodings.<br></blockquote><div><br></div><div>I'm baffled. I duplicated your print statements, but when I run this code (with any of the 3 encodings):</div><div><br></div><div><div>file = codecs.open(p + "2.txt", "r", "cp1252")</div>
<div>#file = codecs.open(p + "2.txt", "r", "utf-8")</div><div>for line in file:</div><div>&nbsp;&nbsp;&nbsp;&nbsp;print line</div><div><br></div></div><div>I get this error:</div><div><br></div><div><span class="text"><strong>UnicodeEncodeError</strong>: 'ascii' codec can't encode character u'\xe1' in position 48: ordinal not in range(128)
<br><tt><small> </small> </tt>args =
('ascii', u'<i>Noticia: Este sitio web entre este portal est...r<font color="#c040c0">\xe1</font>pidamente va a salir de aqu<font color="#c040c0">\xed</font>.</i><br /><br /><font color="#c040c0">\r\n</font>', 48, 49, 'ordinal not in range(128)')
<br><tt><small> </small> </tt>encoding =
'ascii'
<br><tt><small> </small> </tt>end =
49
<br><tt><small> </small> </tt>object =
u'<i>Noticia: Este sitio web entre este portal est...r<font color="#c040c0">\xe1</font>pidamente va a salir de aqu<font color="#c040c0">\xed</font>.</i><br /><br /><font color="#c040c0">\r\n</font>'
<br><tt><small> </small> </tt>reason =
'ordinal not in range(128)'
<br><tt><small> </small> </tt>start =
48</span></div><div><br></div><div><span class="text">Please advise. TIA,</span></div><div><span class="text">Stan </span></div></div>
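<div><br></div><div>One possible reading of the traceback above, offered as a sketch rather than a diagnosis: the error is a Unicode<b>Encode</b>Error with encoding = 'ascii', which suggests the file was decoded fine and the failure happens when the already-decoded line is written out through an ASCII output stream. The snippet below reproduces that split under stated assumptions: the sample text and the temporary file path are hypothetical stand-ins for the real 2.txt, and UTF-8 is only an assumed choice of output encoding.</div>

```python
# -*- coding: utf-8 -*-
import codecs
import io
import os
import tempfile

# Hypothetical sample: one line of Spanish text, standing in for the real file.
sample = u"r\xe1pidamente va a salir de aqu\xed.\r\n"

# Write the sample as cp1252 bytes to a temporary file (hypothetical path).
path = os.path.join(tempfile.mkdtemp(), "2.txt")
with io.open(path, "wb") as handle:
    handle.write(sample.encode("cp1252"))

# Reading side: codecs.open() decodes each line into a unicode string.
with codecs.open(path, "r", "cp1252") as source:
    lines = list(source)

# Output side: ASCII has no representation for u'\xe1', so encoding fails.
try:
    lines[0].encode("ascii")
    ascii_failed = False
except UnicodeEncodeError:
    ascii_failed = True

# Encoding explicitly to an encoding that covers these characters succeeds.
# UTF-8 is an assumption here; the right choice depends on the consumer.
encoded = lines[0].encode("utf-8")
```

<div>If that reading holds, printing the explicitly encoded bytes in the original loop (for example, print line.encode("utf-8")) would avoid the implicit ASCII step. Whether UTF-8 is the right target depends on what reads the script's output.</div>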