[Python-Dev] bytes.from_hex()
Greg Ewing
greg.ewing at canterbury.ac.nz
Sat Feb 25 01:20:25 CET 2006
Stephen J. Turnbull wrote:
> the kind of "text" for which Unicode was designed is normally produced
> and consumed by people, who wll pt up w/ ll knds f nnsns. Base64
> decoders will not put up with the same kinds of nonsense that people
> will.
The Python compiler won't put up with that sort of
nonsense either. Would you consider that makes Python
source code binary data rather than text, and that
it's inappropriate to represent it using a unicode
string?
> You're basically assuming that the person who implements the code that
> processes a Unicode string is the same person who implemented the code
> that converts a binary object into base64 and inserts it into a
> string.
No, I'm assuming the user of base64 knows the
characteristics of the channel he's using. You
can only use base64 if you know the channel
promises not to munge the particular characters
that base64 uses. If you don't know that, you
shouldn't be trying to send base64 through that
channel.
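As a rough sketch of that point, the check below asks whether base64 text survives a given channel. The `channel` argument is a hypothetical stand-in (any function from str to str) and the helper name `survives_channel` is made up for illustration; it assumes the Python 3 standard library `base64` module.

```python
import base64


def survives_channel(channel, payload: bytes) -> bool:
    """Return True if payload round-trips through channel as base64 text.

    `channel` is a hypothetical text channel modelled as a str -> str function.
    """
    text = base64.b64encode(payload).decode("ascii")
    try:
        # validate=True rejects characters outside the base64 alphabet
        return base64.b64decode(channel(text), validate=True) == payload
    except ValueError:
        # binascii.Error is a subclass of ValueError
        return False


# A channel that leaves the text alone is safe for base64.
clean = survives_channel(lambda s: s, b"hello")

# A channel that folds case munges characters base64 depends on.
munged = survives_channel(str.lower, b"hello")
```

Case folding is only one example of munging; the same check applies to any channel whose behaviour is in doubt.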
> In most environments, it should be possible to hide bytes<->unicode
> codecs almost all the time,
But it *is* hidden in the situation I'm talking
about, because all the Unicode encoding/decoding
takes place inside the implementation of the
text channel, which I'm taking as a given.
> I don't think it's a good idea to gratuitously introduce
> wire protocols as unicode codecs,
I am *not* saying that base64 is a unicode codec!
If that's what you thought I was saying, it's no
wonder we're confusing each other.
It's just a transformation from bytes to
text. I'm only calling it unicode because all
text will be unicode in Py3k. In Py2.x it could
just as well be a str -- but a str interpreted
as text, not binary.
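A minimal sketch of base64 viewed as a bytes-to-text transformation. The helper names are hypothetical. It assumes the Python 3 standard library as it exists today, where `base64.b64encode` returns bytes, so the ASCII decode step is what yields a character string.

```python
import base64


def bytes_to_base64_text(data: bytes) -> str:
    """Turn binary data into base64 text (hypothetical helper)."""
    return base64.b64encode(data).decode("ascii")


def base64_text_to_bytes(text: str) -> bytes:
    """Recover the binary data from base64 text (hypothetical helper)."""
    return base64.b64decode(text.encode("ascii"))


encoded = bytes_to_base64_text(b"\x00\xffbinary")
decoded = base64_text_to_bytes(encoded)
```

No character encoding of the original data is involved here: the input is arbitrary bytes, and only the output is text.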
> What do you think the email module does?
> Assuming conforming MIME messages
But I'm not assuming MIME in the first place. If I
have a mail interface that will accept chunks of
binary data and encode them as a MIME message for
me, then I don't need to use base64 at all.
The only time I need to use something like base64
is when I have something that will only accept
text. In Py3k, "accepts text" is going to mean
"takes a character string as input", where
"character string" is a distinct type from
"binary data". So having base64 produce anything
other than a character string would be awkward
and inconvenient.
I phrased that paragraph carefully to avoid using
the word "unicode" anywhere. Does that make it
clearer what I'm getting at?
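To make the "accepts text" case concrete, here is a small sketch that uses `io.StringIO` as a stand-in for a text-only channel. That choice is an assumption for illustration, not something from the thread. In Python 3 it rejects binary data outright, so the payload has to be turned into a character string first.

```python
import base64
import io

# Stand-in for a channel that only takes character strings.
channel = io.StringIO()
payload = bytes(range(8))

try:
    channel.write(payload)  # binary data is not accepted
    rejected = False
except TypeError:
    rejected = True

# base64 text goes through, because it is a character string.
channel.write(base64.b64encode(payload).decode("ascii"))
received = base64.b64decode(channel.getvalue())
```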
--
Greg