[Python-Dev] PEP 393 Summer of Code Project

Wed Aug 24 10:18:20 CEST 2011

> So am I correctly reading between the lines when, after reading this
> thread so far, and the complete issue discussion so far, that I see a
> PEP 393 revision or replacement that has the following characteristics:
> 
> 1) Narrow builds are dropped.

PEP 393 already drops narrow builds.

> 2) There are more, or different, internal kinds of strings, which affect
> the processing patterns.

This is the basic idea of PEP 393.

> a) all ASCII
> b) latin-1 (8-bit codepoints, the first 256 Unicode codepoints) This
> kind may not be able to support a "mostly" variation, and may be no more
> efficient than case b).  But it might also be popular in parts of Europe

These two cases are already in PEP 393.

> c) mostly ASCII (utf8) with clever indexing/caching to be efficient
> d) UTF-8 with clever indexing/caching to be efficient

I see neither a need nor a means to consider these.

> e) 16-bit codepoints

These are in PEP 393.

> f) UTF-16 with clever indexing/caching to be efficient

Again, -1.

> g) 32-bit codepoints

This is in PEP 393.
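
To illustrate the cases that PEP 393 does cover, here is a rough sketch of how the storage width follows from the largest code point in a string. The helper name pep393_width is hypothetical and is not a CPython API; it only mirrors the 1-, 2-, and 4-byte kinds described in the PEP, with ASCII and Latin-1 both taking one byte per character.

```python
def pep393_width(s):
    """Bytes per character that a PEP 393 string would need for s.

    Hypothetical helper for illustration only; not part of CPython.
    """
    max_cp = max(map(ord, s), default=0)
    if max_cp < 0x100:
        return 1  # ASCII (a) and Latin-1 (b): one byte per character
    if max_cp < 0x10000:
        return 2  # 16-bit code points (e)
    return 4      # 32-bit code points (g)
```

A single character outside the current range is enough to move the whole string to the next wider kind.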

> h) UTF-32

What's that, as opposed to g)?
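
As a small illustration of why the two look the same: UTF-32 is a fixed-width encoding with one 32-bit code unit per code point, so decoding the code units gives back exactly the code points. The function name utf32_units is made up for this sketch, and it assumes a build where each str element is a full code point.

```python
import struct

def utf32_units(s):
    """Return the 32-bit code units of s encoded as little-endian UTF-32."""
    data = s.encode("utf-32-le")  # the -le codec writes no byte order mark
    return list(struct.unpack("<%dI" % (len(data) // 4), data))

s = "a\u20ac\U0001F600"
# Each UTF-32 code unit is simply the code point itself.
assert utf32_units(s) == [ord(c) for c in s]
```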

I'm not open to revising PEP 393 in the direction of adding more
representations.

Regards,
Martin