Paul Moore writes:
 > I'm now 100% convinced that encoding="ascii",errors="surrogateescape" is the way to say this in code.
It probably is, for you. If that ever gives you a UnicodeError, you know how to find out how to deal with it. And it probably won't.<wink/>

That may also be a good universal default for Python 3, as it will pass through non-ASCII text unchanged, while raising an error if the program tries to manipulate it (or hand it to a module that validates). (encoding='latin-1' definitely is not a good default.) But I'm not sure of that, and the current approach of using the preferred system encoding is probably better.

I don't think either argument applies to everybody who needs such a recipe, though. Many will be best served with encoding='latin-1' by some name.
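
For anyone following along, here is a rough sketch of the difference between the two recipes. The sample bytes and variable names are made up for illustration; only the codec and error-handler names come from the thread. Both decodings round-trip the original bytes, but only the surrogateescape one complains when the smuggled bytes reach a strict encoder.

```python
# Hypothetical sample: UTF-8 for "café", then a byte (0xFF) that is invalid in UTF-8.
raw = b"caf\xc3\xa9 \xff\n"

# Recipe 1: ASCII with surrogateescape. Each non-ASCII byte is smuggled
# through as a lone surrogate in the range U+DC80..U+DCFF.
escaped = raw.decode("ascii", errors="surrogateescape")

# Encoding with the same codec and handler restores the bytes exactly.
roundtrip_escaped = escaped.encode("ascii", errors="surrogateescape")

# A strict encoder (a stand-in for "a module that validates") rejects the
# lone surrogates instead of silently emitting garbage.
try:
    escaped.encode("utf-8")
    strict_encode_failed = False
except UnicodeEncodeError:
    strict_encode_failed = True

# Recipe 2: latin-1. Every byte maps to a code point, so decoding never
# fails and the bytes also round-trip, but nothing flags the text as suspect.
as_latin1 = raw.decode("latin-1")
roundtrip_latin1 = as_latin1.encode("latin-1")
```

Which behaviour you want is exactly the judgment call above: surrogateescape fails loudly once the data hits something strict, while latin-1 never fails and leaves it to you to know that the text may be mojibake.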