Problem reading file with umlauts
Stefan Behnel
stefan_ml at behnel.de
Tue Jul 7 10:04:09 EDT 2009
Claus Hausberger wrote:
> Hello
>
> I have a text file which is encoded in Latin1 (ISO-8859-1). I can't change that as I do not create those files myself.
>
> I have to read those files and convert the umlauts like ö to stuff like &ouml; as the text files should become HTML files.
>
> I have this code:
>
>
> #!/usr/bin/python
> # -*- coding: latin1 -*-
>
> import codecs
>
> f = codecs.open('abc.txt', encoding='latin1')
>
> for line in f:
>     print line
>     for c in line:
>         if c == "ö":
You are reading Unicode strings, so you have to compare each character to a
unicode string, as in

    if c == u"ö":
>             print "oe"
>         else:
>             print c
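For the HTML conversion itself, here is a minimal sketch of one way to do it. It assumes a small hand-written mapping; the names UMLAUT_ENTITIES, umlauts_to_entities and convert_file are made up for illustration, and the temporary file only stands in for abc.txt. It uses io.open rather than codecs.open so that the same code should run on Python 2.7 and 3.3+.

```python
# -*- coding: utf-8 -*-
from __future__ import print_function

import io
import os
import tempfile

# Hypothetical mapping (not from the original post); extend it as needed.
UMLAUT_ENTITIES = {
    u"ä": u"&auml;", u"ö": u"&ouml;", u"ü": u"&uuml;",
    u"Ä": u"&Auml;", u"Ö": u"&Ouml;", u"Ü": u"&Uuml;",
    u"ß": u"&szlig;",
}


def umlauts_to_entities(text):
    """Replace each mapped character of a unicode string with its HTML entity."""
    return u"".join(UMLAUT_ENTITIES.get(c, c) for c in text)


def convert_file(path):
    """Decode a Latin-1 file and return its lines with umlauts replaced."""
    with io.open(path, encoding="latin1") as f:
        return [umlauts_to_entities(line) for line in f]


# Demo: write a small Latin-1 file (a stand-in for abc.txt) and convert it.
fd, demo_path = tempfile.mkstemp(suffix=".txt")
os.close(fd)
with io.open(demo_path, "w", encoding="latin1") as f:
    f.write(u"Schöne Grüße\n")
converted = convert_file(demo_path)
os.remove(demo_path)

print(converted[0], end="")
```

Because both the dictionary keys and the characters read from the file are unicode strings, the lookup matches, which is the same point as the u"ö" comparison above.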
Note that printing non-ASCII characters may not always work, depending on
your terminal.
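If the terminal cannot display the characters, one possible workaround (a sketch, not the only option) is to make the text pure ASCII before printing. The "xmlcharrefreplace" error handler of the standard codecs turns every non-ASCII character into a numeric character reference, which is also valid in HTML. The helper name ascii_safe is made up for this example.

```python
# -*- coding: utf-8 -*-
from __future__ import print_function


def ascii_safe(text):
    """Return text with each non-ASCII character as a numeric character reference."""
    # encode() yields bytes; decode back so print behaves the same on Python 2 and 3.
    return text.encode("ascii", "xmlcharrefreplace").decode("ascii")


line = u"Sch\u00f6ne Gr\u00fc\u00dfe"
safe = ascii_safe(line)
print(safe)
```

This gives numeric references such as &#246; rather than named entities like &ouml;, so it fits best where either form is acceptable in the output.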
Stefan