Spanish Accents
Peter Otten
__peter__ at web.de
Thu Dec 22 11:42:09 EST 2011
Stan Iverson wrote:
> On Thu, Dec 22, 2011 at 11:30 AM, Rami Chowdhury
> <rami.chowdhury at gmail.com> wrote:
>
>> Could you try using the 'open' function from the 'codecs' module?
>>
>
> I believe this is what you meant:
>
> file = codecs.open(p + "2.txt", "r", "utf-8")
> for line in file:
>     print line
>
> but got this error:
>
> *UnicodeDecodeError*: 'utf8' codec can't decode bytes in position 0-2:
> invalid data
> args = ('utf8', '\xe1 intentado para ellos bastante sabios para
> discernir lo obvio. Tales perso', 0, 3, 'invalid data')
> which is the letter á (a with accent).
A lone "\xe1" byte followed by a space is not valid UTF-8, so the file is
probably encoded in ISO-8859-1, ISO-8859-15, or cp1252 then. All three map
that byte to á:
>>> print "\xe1".decode("iso-8859-1")
á
>>> print "\xe1".decode("iso-8859-15")
á
>>> print "\xe1".decode("cp1252")
á
Try codecs.open() with one of these encodings.
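
If you don't know in advance which files are UTF-8 and which are not, one
option is to try the encodings in turn. The following is only a rough sketch:
the helper name read_lines(), the fallback order, and the temporary sample
file are my assumptions, not anything taken from your script.

```python
import codecs
import os
import tempfile


def read_lines(path, encodings=("utf-8", "cp1252")):
    """Return (encoding, lines) for the first encoding that decodes the file.

    Hypothetical helper: the name and the default fallback order are
    assumptions made for this illustration.
    """
    for encoding in encodings:
        try:
            with codecs.open(path, "r", encoding) as f:
                return encoding, f.read().splitlines()
        except UnicodeDecodeError:
            # This encoding does not fit the bytes in the file; try the next.
            continue
    raise ValueError("none of the encodings fit: %r" % (encodings,))


# Build a small sample file that starts with the byte from the traceback.
fd, sample_path = tempfile.mkstemp(suffix=".txt")
with os.fdopen(fd, "wb") as f:
    f.write(b"\xe1 intentado\n")

used_encoding, lines = read_lines(sample_path)
os.remove(sample_path)

print(used_encoding)
print(repr(lines))
```

UTF-8 goes first because it rejects most text written in a single-byte
encoding. Keep in mind that ISO-8859-1 accepts every possible byte, so a
successful decode with it does not prove that the guess was right.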