On Tue, Sep 10, 2019 at 8:38 PM Matt Billenstein <matt@vazor.com> wrote:
On Tue, Sep 10, 2019 at 10:42:52AM -0400, Daniel Holth wrote:
I stopped using Python 3 after learning about str(bytes) by finding it in my corrupted database. Ever since then I've been anxious about changing to the new language, since it makes it so easy to convert from bytes to unicode by accident without specifying a valid encoding. So I would like to see a future where str(bytes) is effectively removed.

I started working on a pull request that adds an API to toggle str(bytes) at runtime with a thread local (instead of requiring a command line argument), so you could do `with no_str_bytes():` if you were worried about the feature, but got a bit stuck in the internals.
How is this different from all the str -> unicode bugs we had in Python 2? If you have special needs, you can always monkey-patch it in plain Python code by overriding __builtins__.str with something that asserts the given arg is not bytes.
m
-- Matt Billenstein matt@vazor.com http://www.vazor.com/
It's different. One hint is that there's already an option to disable the feature. The old style of error will occasionally reveal itself with decode errors, but the new style of error happens silently: you discover it somehow, then enable the -bb option, track down the source of the error, and deal with the fallout.

The proposed change would allow `print(bytes)` for (de)bugging by letting you toggle `python3 -bb` behavior at runtime instead of only at the command line. Or you could debug more explicitly with `print(bytes.decode('ebcdic'))` or `print(repr(bytes))`.

I didn't realize you could override __builtins__.str. That's interesting.
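To make the difference concrete, here is a small sketch of the silent conversion, the explicit alternatives, and the existing `-bb` switch. The sample payload and the `run_with_bb` helper are illustrative assumptions. The `-bb` check runs in a child interpreter because the flag can only be given at startup.

```python
import subprocess
import sys

payload = "caf\u00e9".encode("utf-8")  # b'caf\xc3\xa9'

# Silent: str() on bytes returns the repr of the bytes, not decoded text.
silent = str(payload)

# Explicit alternatives: decode with a named codec, or ask for the repr.
decoded = payload.decode("utf-8")
debug_view = repr(payload)

# A wrong codec fails loudly, like the "old style" of error.
try:
    payload.decode("ascii")
    decode_failed = False
except UnicodeDecodeError:
    decode_failed = True


def run_with_bb(snippet):
    """Hypothetical helper: run a snippet in a child interpreter with -bb."""
    return subprocess.run(
        [sys.executable, "-bb", "-c", snippet],
        capture_output=True,
        text=True,
    )


# Under -bb, str(bytes) raises BytesWarning, so the child exits with an error.
result = run_with_bb("str(b'abc')")
```

Here `silent` and `debug_view` are the same string, which is why the accidental conversion goes unnoticed until the `b'...'` text shows up in stored data.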