On 2012-02-11, at 13:33 , Stefan Behnel wrote:
Paul Moore, 11.02.2012 11:47:
On 11 February 2012 00:07, Terry Reedy wrote:
Nor is there in 3.x.
I view that claim as FUD, at least for many users, and at least until the persons making the claim demonstrate it. In particular, I claim that people who use Python2 knowing nothing of unicode do not need to know much more to do the same things in Python3.
Concrete example, then.
I have a text file, in an unknown encoding (yes, it does happen to me!) but opening in an editor shows it's mainly-ASCII. I want to find all the lines starting with a '*'. The simple
    with open('myfile.txt') as f:
        for line in f:
            if line.startswith('*'):
                print(line)
fails with encoding errors. What do I do? Short answer, grumble and go and use grep (or in more complex cases, awk) :-(
Or just use the ISO-8859-1 encoding.
It's true that this requires handling encodings upfront, where Python 2 allowed you to play fast and loose, though. And using latin-1 in that context looks and feels weird/icky: the file is not encoded in latin-1, the encoding just happens to work for manipulating the bytes as ASCII text plus non-ASCII stuff.
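
For what it's worth, here is a minimal sketch of the latin-1 workaround applied to the example above. The helper name `starred_lines` and the sample bytes are made up for illustration; only `open(..., encoding='latin-1')` comes from the suggestion itself. It works because latin-1 maps every byte value to the code point with the same number, so decoding cannot fail, though non-ASCII characters may display wrongly if the file is really in some other encoding.

```python
import os
import tempfile


def starred_lines(path):
    """Return the lines starting with '*' from a file of unknown encoding.

    Hypothetical helper: latin-1 is used only as a byte-transparent
    decoder, not because the file is known to be latin-1.
    """
    with open(path, encoding='latin-1') as f:
        return [line.rstrip('\n') for line in f if line.startswith('*')]


# Made-up sample: mostly ASCII, plus a lone 0xE9 byte that is not valid UTF-8.
sample = b"* first item\nplain line\n* caf\xe9\n"
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(sample)

result = starred_lines(tmp.name)
os.remove(tmp.name)
print(result)

# Every possible byte value survives a latin-1 decode/encode round trip,
# which is why this decoding never raises UnicodeDecodeError.
all_bytes = bytes(range(256))
round_trip_ok = all_bytes.decode('latin-1').encode('latin-1') == all_bytes
print(round_trip_ok)
```

The ASCII test (`startswith('*')`) is unaffected by whatever the non-ASCII bytes really mean, which is the only property the original task needs.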