[Python-3000] Making more effective use of slice objects in Py3k
fredrik at pythonware.com
Thu Aug 31 10:21:00 CEST 2006
Jack Diederich wrote:
> That said can you guys expand on what polymorphic means here in particular?
> Python wise I can only think of the str/unicode/buffer split. If the
> fraternity of strings doesn't include views (which I haven't needed either)
> what are you considering for the other kinds?
the idea is to allow a given string object to use different kinds of
storage depending on what data it contains, and how it's being used.
off the top of my head, I'd imagine using at least:
    wide unicode (32-bit)
and possibly also one or more of
    narrow unicode (16-bit)
    8-bit encoded (arbitrary 8-bit encodings)
    selected asian encodings
all these look and behave the same at the Python level, as well as when
using "high-level" C APIs. ob_type may differ (also during an object's
lifetime), but type(s) is always the same.
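
as a rough illustration of the idea (a pure-Python sketch, not the proposed C-level design), the class below picks one of three fixed-width storages from the widest code point in the text. the name PolyStr, the "kind" attribute and the choice of latin-1 as the 8-bit encoding are all assumptions made for this example. lone surrogates are not handled.

```python
class PolyStr:
    """Hypothetical string type whose internal storage depends on its contents."""

    def __init__(self, text):
        # Pick the narrowest fixed-width storage that can hold every character.
        top = max(map(ord, text), default=0)
        if top < 0x100:
            self.kind, self._codec, self._width = "8-bit", "latin-1", 1
        elif top < 0x10000:
            self.kind, self._codec, self._width = "narrow", "utf-16-le", 2
        else:
            self.kind, self._codec, self._width = "wide", "utf-32-le", 4
        self._data = text.encode(self._codec)

    def __len__(self):
        # Every storage here is fixed-width, so the length is a simple division.
        return len(self._data) // self._width

    def __getitem__(self, index):
        if index < 0:
            index += len(self)
        if not 0 <= index < len(self):
            raise IndexError("PolyStr index out of range")
        start = index * self._width
        return self._data[start:start + self._width].decode(self._codec)

    def __iter__(self):
        # Forward iteration works the same way for every storage kind.
        return (self[i] for i in range(len(self)))

    def __str__(self):
        return self._data.decode(self._codec)


ascii_text = PolyStr("hello")
cjk_text = PolyStr("日本語")
astral_text = PolyStr("a\U0001F600")
```

all three objects report the same type and answer len(), indexing and iteration identically; only the private storage (and the illustrative "kind" tag) differs.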
this approach gives you lots of advantages:
- lots of operations can be carried out without having to convert the
data (all the formats listed above support forward iteration, and
most text-level operations).
- you'll save tons of memory in applications that use text mostly in a
few character sets (and less memory means more speed).
- adding (or removing) specific string implementations becomes trivial,
both for the core developers and extension writers.
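
to put a rough number on the memory point, the snippet below measures how many bytes the same text needs under each fixed-width storage. storage_sizes is a hypothetical helper written for this example, the codecs are just stand-ins for the storage kinds listed earlier, and the text is assumed to fit in latin-1.

```python
def storage_sizes(text):
    """Return the payload size in bytes of `text` under three storages."""
    return {
        "wide": len(text.encode("utf-32-le")),    # 4 bytes per character
        "narrow": len(text.encode("utf-16-le")),  # 2 bytes per character (BMP text)
        "8-bit": len(text.encode("latin-1")),     # 1 byte per character
    }


sample = "polymorphic strings " * 1000
sizes = storage_sizes(sample)
```

for text that fits in a single 8-bit encoding, the wide storage takes four times the space of the 8-bit one, before counting any per-object overhead.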
the main disadvantage is that it becomes a bit more difficult to deal
with strings at the C level (but properly dealing with both 8-bit and
Unicode strings is already a pain in the ass, and I'm not sure this has
to be any harder. just slightly different).
for some details on apple's implementation (thanks bob!), see: