
Been lurking here, no cows in the fire, no irons in the race, or whatever, except wanting Twisted to be perfect and easy to use and being perennially confused by text encoding, but I did notice this:

On 11/22/2016 9:03 PM, Glyph Lefkowitz wrote:
[...]
Okay. So.
The rule for reverts like this is: if you do something today, which is correct usage of the API and produces an observably correct result, will that be broken in the future if we fix it? If so, then we need to revert because the interface as released is unsupportable.
As it stands, we have a matrix of 4 behaviors:
        bytes      text(ascii)   text(nonascii)
py2     works      works         UnicodeDecodeError
py3     garbage    works         works
This... is actually... fine, surprisingly.
The /right/ thing to do is to write code that passes text all the time. If you do that right now, it'll work on py3 and raise an exception on py2, unless it /happens/ to be ASCII, in which case it'll work.
If you write code that passes bytes on py3, it'll just be garbage. But, we want to deprecate that anyway, and you can't get correct, usable behavior out of it, no matter what workarounds you stuff in; so it's a bug, and can be fixed like any bug.
Similarly if you pass non-ascii text on py3, you'll get a UnicodeDecodeError.
Shouldn't this be "if you pass non-ascii text on *py2*, you'll get ..."?
[...]
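
For anyone else who gets confused by this, here is a rough sketch of the py3 row of the matrix, run on Python 3. It assumes the "garbage" comes from bytes being formatted implicitly as text; the thread doesn't say what the actual mechanism is. `to_text` is a made-up helper for illustration, not a Twisted API.

```python
def to_text(value, encoding="utf-8"):
    """Hypothetical helper: return text, decoding bytes explicitly."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


# py3 + bytes, formatted implicitly: the bytes repr leaks into the text.
garbage = "%s" % (b"caf\xc3\xa9",)

# py3 + bytes, decoded explicitly first: readable text.
fixed = "%s" % (to_text(b"caf\xc3\xa9"),)

# Non-ASCII text cannot go through the ASCII codec, which is the codec
# py2 used for implicit conversions between text and bytes.
try:
    "caf\u00e9".encode("ascii")
    ascii_ok = True
except UnicodeError:
    ascii_ok = False
```

Passing text everywhere, as the quoted message recommends, avoids needing a helper like this in the first place.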
-glyph
Pedantically yours,
-- 
John Santos
Evans Griffiths & Hart, Inc.
781-861-0670 ext 539