[pypy-dev] PyPy 2 unicode class

Armin Rigo arigo at tunes.org
Wed Jan 22 18:56:32 CET 2014


Hi Johan,

On Wed, Jan 22, 2014 at 8:01 AM, Johan Råde <johan.rade at gmail.com> wrote:
> (I hope this makes more sense than my ramblings on IRC last night.)

All versions you gave make sense as far as I'm concerned :-)  But this
last one is indeed the clearest.

It seems that Python 3 went that way too, and exposes the same
"natural" interface on all platforms, including Windows.

I'm not saying it's a strong argument for us doing the same thing in
PyPy *2*, but it's certainly an extra argument for it.  I'd be
prepared to say that no portable program should crash because it runs
on the "wide" version of Python, even if Windows-only programs are not
portable in that sense.  The argument that it might actually *fix*
more programs than it breaks is interesting too.
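
To make the "narrow" versus "wide" difference concrete, here is a small
Python 3 sketch.  The helper name utf16_length is hypothetical; it only
emulates what len() would report on a build that stores UTF-16 code
units, and is not taken from any PyPy or CPython source.

```python
def utf16_length(text):
    """Count UTF-16 code units, emulating len() on a "narrow" build.

    Hypothetical helper for illustration only. "surrogatepass" lets
    lone surrogates through instead of raising an encoding error.
    """
    return len(text.encode("utf-16-le", "surrogatepass")) // 2


# One character outside the Basic Multilingual Plane between two ASCII letters.
sample = "a\U0001F600b"

wide_view = len(sample)             # code points, as on a "wide" build
narrow_view = utf16_length(sample)  # code units, as on a "narrow" build

# On the "narrow" view the non-BMP character occupies two positions
# (a surrogate pair), so every index after it is shifted by one.
print(wide_view, narrow_view)
```

A program that indexes or slices by position therefore sees different
characters on the two kinds of build as soon as a non-BMP character
appears in the string.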

As far as I'm concerned I'd vote for going with it tentatively (i.e.
implementing unicodes as utf-8 strings internally, with an indexing
cache).  If it's really needed we can always add another layer on top
of the implementation to give the utf-16 image of the world again.
Anyway, it's a trade-off between raw performance (how can the indexing
cache be made fast?) and memory usage, with a side note on RPython
getting rid of the not-really-reusable unicode type.
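
As a rough illustration of the idea, here is a minimal Python 3 sketch
of a string stored as UTF-8 bytes with a sparse indexing cache.  The
class name Utf8String, the stride parameter and the checkpoint layout
are all assumptions made for this example; this is not how PyPy does
or would implement it, only one possible shape of the trade-off.

```python
class Utf8String:
    """Hypothetical sketch: UTF-8 storage plus a sparse index cache."""

    def __init__(self, text, stride=64):
        self._data = text.encode("utf-8")
        self._stride = stride
        # _checkpoints[k] is the byte offset of code point number k * stride.
        self._checkpoints = []
        count = 0
        for offset, byte in enumerate(self._data):
            if byte & 0xC0 != 0x80:  # not a continuation byte: a new code point
                if count % stride == 0:
                    self._checkpoints.append(offset)
                count += 1
        self._length = count

    def __len__(self):
        return self._length

    @staticmethod
    def _char_width(lead):
        # Byte length of a UTF-8 sequence, derived from its lead byte.
        if lead < 0x80:
            return 1
        if lead < 0xE0:
            return 2
        if lead < 0xF0:
            return 3
        return 4

    def _byte_offset(self, index):
        # Jump to the nearest checkpoint at or before index, then walk
        # forward at most stride - 1 code points.
        offset = self._checkpoints[index // self._stride]
        for _ in range(index % self._stride):
            offset += self._char_width(self._data[offset])
        return offset

    def __getitem__(self, index):
        if not isinstance(index, int):
            raise TypeError("only integer indexing is sketched here")
        if index < 0:
            index += self._length
        if not 0 <= index < self._length:
            raise IndexError("string index out of range")
        start = self._byte_offset(index)
        end = start + self._char_width(self._data[start])
        return self._data[start:end].decode("utf-8")


text = "h\u00e9\U0001F600llo w\u00f6rld"
s = Utf8String(text, stride=4)
print(len(s), s[2] == "\U0001F600", s[-1])
```

A smaller stride makes indexing faster but the cache larger, and a
larger stride does the opposite, which is the performance versus memory
trade-off in miniature.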


A bientôt,

Armin.

