On 2012-02-11, at 13:53 , Stefan Behnel wrote:
Well, except for the cases where that didn't work. Remember that implicit encoding behaves in a platform-dependent way in Python 2, so even if your code runs on your machine, that doesn't mean it will work for anyone else.
Sure, I said it allowed you, not that this allowance actually worked.
And using latin-1 in that context looks and feels weird/icky: the file is not encoded using latin-1, the encoding just happens to work for manipulating bytes as ASCII text + non-ASCII stuff.
Correct. That's precisely the use case described above.
Yes, but now instead of just ignoring that stuff you have to actively and knowingly lie to Python to get it to shut up.
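For concreteness, a minimal sketch of the latin-1 trick being discussed. The sample bytes and the helper name `edit_as_latin1` are made up for illustration. It only relies on latin-1 mapping each byte 0x00-0xFF to the code point with the same number, so decoding never fails and re-encoding restores the untouched bytes:

```python
def edit_as_latin1(raw, old, new):
    """Hypothetical helper: edit the ASCII parts of arbitrary bytes via str methods."""
    # The data is NOT really latin-1; the codec is used only because it
    # maps every byte value to exactly one code point and back.
    text = raw.decode("latin-1")
    edited = text.replace(old, new)
    # Encoding back turns every code point below U+0100 into its original byte.
    return edited.encode("latin-1")


# ASCII text mixed with bytes that are not valid UTF-8.
raw = b"name=caf\xe9\xff\x00;size=10\n"
result = edit_as_latin1(raw, "size=10", "size=20")
```

The non-ASCII bytes come out the other end unchanged, which is exactly why the trick works and also why it amounts to declaring an encoding the file does not have. Note that the replacement text itself has to stay below U+0100, otherwise the final `encode` raises `UnicodeEncodeError`.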
Besides, it's perfectly possible to process bytes in Python 3. You just have to open the file in binary mode and do the processing at the byte string level.
I think that's the route which should be taken, but (and I'll readily admit not to have followed the current state of this story) I'd understood manipulations of bytes-as-ascii-characters-and-stuff to be far more annoying (in Python 3) than string manipulation even for simple use cases.
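As a rough sketch of what the binary-mode route can look like for a simple case (the file contents and the function name `uppercase_keys` are invented for this example, not taken from any real code):

```python
import os
import tempfile


def uppercase_keys(path):
    """Upper-case the part of each line before b'=', leaving the rest untouched.

    Lines without b'=' are upper-cased as a whole.
    """
    out = []
    with open(path, "rb") as f:  # binary mode: lines are bytes, nothing is decoded
        for line in f:
            key, sep, value = line.partition(b"=")
            # bytes.upper() only changes ASCII letters; other bytes pass through.
            out.append(key.upper() + sep + value)
    return b"".join(out)


# Demo on a throwaway file holding ASCII keys and non-ASCII values.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"host=caf\xe9\nport=\xff\x00\n")
try:
    result = uppercase_keys(tmp.name)
finally:
    os.remove(tmp.name)
```

For something this simple the bytes methods mirror the str ones closely. The friction tends to appear elsewhere: every literal needs a `b` prefix, indexing a bytes object yields an int rather than a one-byte string, and bytes and str never mix implicitly.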