2012/2/11 Paul Moore <p.f.moore@gmail.com>
On 11 February 2012 00:07, Terry Reedy <tjreedy@udel.edu> wrote:
Nor is there in 3.x.
I view that claim as FUD, at least for many users, and at least until the persons making the claim demonstrate it. In particular, I claim that people who use Python2 knowing nothing of unicode do not need to know much more to do the same things in Python3.
Concrete example, then.
I have a text file, in an unknown encoding (yes, it does happen to me!) but opening in an editor shows it's mainly-ASCII. I want to find all the lines starting with a '*'. The simple
    with open('myfile.txt') as f:
        for line in f:
            if line.startswith('*'):
                print(line)
fails with encoding errors. What do I do? Short answer, grumble and go and use grep (or in more complex cases, awk) :-(
Paul.
I just looked at the Python 3 documentation (http://docs.python.org/release/3.1.3/library/functions.html#open): the open function has an "errors" parameter. Setting it to "ignore" or "replace" will solve your problem.

Another way is to guess the encoding programmatically (I found the chardet module, http://pypi.python.org/pypi/chardet) and use the guess to decode your file.

Then why not make "auto" an accepted value for the "encoding" parameter, so that "open" calls a detector before opening the file and raises an error when the confidence of the guess is below a certain percentage?

Gabriel AHTUNE
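
For reference, here is a minimal sketch of the "errors" suggestion applied to Paul's example. The helper name starred_lines and the explicit encoding="ascii" are assumptions made for illustration (they are not from the thread); the ASCII choice only reflects the "mainly-ASCII" description of the file. With errors="replace", each undecodable byte becomes U+FFFD instead of raising UnicodeDecodeError; with errors="ignore", it is silently dropped.

```python
import os
import tempfile


def starred_lines(path, errors="replace"):
    """Return the lines of `path` that start with '*', without newlines.

    Hypothetical helper: decodes as ASCII (an assumption for a
    mainly-ASCII file) and lets `errors` decide what happens to
    bytes that cannot be decoded.
    """
    with open(path, encoding="ascii", errors=errors) as f:
        return [line.rstrip("\n") for line in f if line.startswith("*")]


# Build a small mainly-ASCII file containing one non-ASCII byte (0xE9).
fd, demo_path = tempfile.mkstemp(suffix=".txt")
with os.fdopen(fd, "wb") as f:
    f.write(b"* caf\xe9\nplain line\n* second\n")

# repr() keeps the output printable whatever the console encoding is.
for line in starred_lines(demo_path):
    print(repr(line))

os.remove(demo_path)
```

This only helps when losing or replacing the non-ASCII characters is acceptable, as in the "find lines starting with '*'" case; it does not recover the original text.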