ellisonbg at gmail.com
Tue Jul 27 15:23:37 EDT 2010
On Tue, Jul 27, 2010 at 11:34 AM, Fernando Perez <fperez.net at gmail.com> wrote:
> On Tue, Jul 27, 2010 at 11:14 AM, Brian Granger <ellisonbg at gmail.com> wrote:
> > Yes, I hadn't thought about the fact that unicode objects are buffers as
> > well. But, we could raise a TypeError when a user tries to send a unicode
> > object (str in python 3). IOW, don't treat unicode as buffers and force
> > them to encode/decode. Does this make sense or should we allow unicode to
> > be sent as buffers?
> Well, the problem I explained about a possible mismatch in internal
> unicode storage format rears its ugly head if we allow
> unicode-as-buffer. I was precisely worried about sending 3.x strings
> as buffers, since the two ends may not agree on what the buffer means.
> I may be worrying about a non-problem, but at some point it might be
> worth verifying this. The test is a bit cumbersome to set up, because
> you have to build two versions of Python, one with ucs-2 and one with
> ucs-4, and see what happens if they try to send each other stuff. But
> I think it's a test worth making, so we know for sure whether this is
> a problem or not, as it will dictate design decisions for 3.x on all
> string handling.
This is definitely an issue. Also, someone could set their own custom
unicode encoding by hand and that would mess this up as well.
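
A rough illustration of the mismatch, not a dump of real interpreter memory. It assumes little-endian UTF-16 and UTF-32 are reasonable stand-ins for the internal layouts of ucs-2 and ucs-4 builds; the variable names are made up for the example:

```python
# Stand-ins for the raw buffers a narrow (ucs-2) and a wide (ucs-4) build
# might expose. Assumption: UTF-16-LE / UTF-32-LE approximate those layouts.
text = "héllo"

narrow_buf = text.encode("utf-16-le")  # 2 bytes per character for this text
wide_buf = text.encode("utf-32-le")    # 4 bytes per character

# The same text produces buffers of different sizes.
sizes = (len(narrow_buf), len(wide_buf))

# A receiver that assumes the other layout silently gets different text.
misread = wide_buf.decode("utf-16-le")

# An explicit, agreed-upon encoding round-trips regardless of the build.
roundtrip = text.encode("utf-8").decode("utf-8")
```

The misread case does not raise an error here, which is what makes sending raw unicode buffers risky: the receiver just ends up with the wrong string.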
> If it is a problem, then there are some options:
> - disallow communication between ucs 2/4 pythons.
But this doesn't account for other encoding/decoding setups.
> - detect a mismatch and encode/decode all unicode strings to utf-8 on
> send/receive, but allow raw buffer sending if there's no mismatch.
This will be tough though if users set their own encoding.
> - *always* encode/decode.
I think this is the option that I prefer (having users do this in their
> The middle option seems appealing because it avoids the overhead of
> encoding/decoding on all sends, but I'm worried it may be too brittle.
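
A minimal sketch of the "always encode/decode" option combined with the TypeError idea from earlier in the thread. The helper names (`pack_message`, `encode_text`, `decode_text`) are hypothetical and not part of any existing API, and utf-8 as the default is an assumption:

```python
def pack_message(msg):
    """Return bytes ready to send; refuse unicode text (hypothetical helper)."""
    if isinstance(msg, str):
        # Never treat unicode as a raw buffer: the sender must pick an encoding.
        raise TypeError(
            "unicode objects cannot be sent directly; "
            "encode first, e.g. msg.encode('utf-8')"
        )
    # Anything exposing the buffer interface (bytes, bytearray, ...) is copied out.
    return bytes(memoryview(msg))


def encode_text(text, encoding="utf-8"):
    """Sender side: turn text into bytes with an explicit encoding."""
    return text.encode(encoding)


def decode_text(data, encoding="utf-8"):
    """Receiver side: turn received bytes back into text."""
    return bytes(data).decode(encoding)


# Usage: the encoding is chosen explicitly on both ends.
wire = pack_message(encode_text("héllo"))
received = decode_text(wire)
```

This costs an encode and a decode on every send of text, but the bytes on the wire mean the same thing on both ends no matter how either Python was built.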
Brian E. Granger, Ph.D.
Assistant Professor of Physics
Cal Poly State University, San Luis Obispo
bgranger at calpoly.edu
ellisonbg at gmail.com