[Numpy-discussion] Bytes vs. Unicode in Python3

Pauli Virtanen pav at iki.fi
Fri Nov 27 10:41:04 EST 2009


On Fri, 2009-11-27 at 16:33 +0100, Francesc Alted wrote:
> On Friday 27 November 2009 15:09:00 René Dudfield wrote:
> > On Fri, Nov 27, 2009 at 1:49 PM, Francesc Alted <faltet at pytables.org> wrote:
> > > Correct.  But, in addition, we are going to need a new 'bytes' dtype for
> > > NumPy for Python 3, right?
> > 
> > I think so.  However, I think S is probably closest to bytes... and
> > maybe S can be reused for bytes... I'm not sure though.
> 
> That could be a good idea because that would ensure compatibility with
> existing NumPy scripts (i.e. old 'string' dtypes are mapped to 'bytes', as they
> should be).  The only thing that I don't like is that 'S' seems to be the
> initial letter for 'string', which is actually 'unicode' in Python 3 :-/
> But, for the sake of compatibility, we can probably live with that.

Well, we can "deprecate" 'S' (i.e. never show it in repr, and always show
only 'B' or 'U').

> > Also, what will a bytes dtype mean within a py2 program context?  Does
> > it matter if the bytes dtype just fails somehow if used in a py2
> > program?
> 
> Mmh, I'm of the opinion that the new 'bytes' type should be available only 
> with NumPy for Python 3.  Would that be possible?

I don't see a problem in making a bytes_ scalar type available for
Python 2. In fact, it would make upgrading to Python 3 easier.
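
For readers following the 'S' versus 'U' distinction, here is a minimal
sketch of how it looks in NumPy releases running on Python 3. It reflects
later NumPy behaviour, not anything decided in this thread, and the
variable names are only illustrative.

```python
import numpy as np

# Arrays built from bytes get an 'S' (byte-string) dtype; arrays built
# from str get a 'U' (Unicode) dtype.
byte_arr = np.array([b"abc", b"de"])
text_arr = np.array(["abc", "de"])

print(byte_arr.dtype.kind, byte_arr.dtype.itemsize)
print(text_arr.dtype.kind, text_arr.dtype.itemsize)

# The scalar type behind 'S' is np.bytes_, a subclass of the built-in bytes.
first = byte_arr[0]
print(type(first) is np.bytes_, isinstance(first, bytes))

# Moving between the two kinds requires an explicit encoding.
decoded = np.char.decode(byte_arr, "ascii")
print(decoded.dtype.kind)
```

Here 'S' keeps its old letter but holds bytes, which is the compatibility
trade-off discussed above.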

-- 
Pauli Virtanen




