[Numpy-discussion] Python 3 porting

Pauli Virtanen pav at iki.fi
Mon Feb 15 18:16:05 EST 2010


Mon, 2010-02-15 at 15:51 -0700, Charles R Harris wrote:
[clip]

> A lot of the remaining failures are of this sort:
>
>  x: array([b'pi', b'pi', b'pi', b'four', b'five'], 
>       dtype='|S8')
>  y: array(['pi', 'pi', 'pi', 'four', 'five'], 
>       dtype='<U4')
>  
>
> This looks fixable by specifying the dtype

Specifying the dtype in the test changes the meaning of the test.
Rather, the expected results should be made bytes on Py3. This is what
I've done so far.

There are asbytes() and asbytes_nested() helper functions available in
numpy.compat that can be used to portably get bytes literals.
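
As a rough sketch of what such helpers do on Py3: the functions below are
local stand-ins with hypothetical names (to_bytes, to_bytes_nested), not the
numpy.compat source, and the latin1 encoding is an assumption on my part.
They show how the expected values in a test can be turned into bytes without
writing b'...' literals.

```python
def to_bytes(s):
    """Hypothetical stand-in for asbytes(): pass bytes through, encode str."""
    if isinstance(s, bytes):
        return s
    # Assumption: latin1, so every code point below 256 maps to one byte.
    return str(s).encode("latin1")


def to_bytes_nested(x):
    """Hypothetical stand-in for asbytes_nested(): convert nested sequences."""
    if hasattr(x, "__iter__") and not isinstance(x, (bytes, str)):
        return [to_bytes_nested(item) for item in x]
    return to_bytes(x)


# Expected values for a test, written once as plain string literals.
expected = to_bytes_nested(["pi", "pi", "pi", "four", "five"])
```

The test source stays valid on Python versions that lack the b'' prefix,
while the expected values are bytes on Py3.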

> >>> np.array([b'pi'])
> array([b'pi'], 
>       dtype='|S2')
> >>> np.array(['pi'])
> array(['pi'], 
>       dtype='<U2')
> >>> np.array(['pi'], dtype='|S2')
> array([b'pi'], 
>       dtype='|S2')
>
> I expect we will break a lot of code if b'pi' can't somehow be made
> the default. 

I don't think we should make the unicode str map to the bytes_ dtype; it's
just too magical. Any Python code being ported to Py3 will need to go
through the str vs. bytes transition anyway, so there will be breakage in
any case.

> Hmm. The 'b' prefix is an undocumented feature of python 2.6 but
> doesn't work for earlier versions. But these tests can be fixed by
> being a bit more explicit about the type.

I think the doctests can be partly fixed by using asstr() from
numpy.compat. Probably not completely, though -- I've seen some
complaints that doctests are a lot of work to convert to Py3.
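
For illustration, the idea is roughly the following. The to_str function is
a hypothetical local stand-in, not the numpy.compat source, and the latin1
decoding is an assumption. Passing a value through it makes the repr that a
doctest compares against free of the b'' prefix.

```python
def to_str(s):
    """Hypothetical stand-in for asstr(): decode bytes, pass str through."""
    if isinstance(s, bytes):
        # Assumption: latin1, which can decode any byte value.
        return s.decode("latin1")
    return str(s)


# A doctest compares the printed repr, so a bytes value would show up
# with a b'' prefix on Py3; converting it first avoids that.
value = to_str(b"pi")
shown = repr(value)
```

This only helps where the doctest output is a plain string; reprs of whole
arrays still differ between bytes and unicode dtypes.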

	Pauli






