[Python-Dev] Subclassing varying length types (What's a PyStructSequence ?)

Tim Peters tim.one@home.com
Mon, 10 Dec 2001 00:22:15 -0500


[MAL]
> Have you tried disabling all free lists and using pymalloc
> instead ?

No, but I haven't tried anything -- it's a 2.3 issue.

> If this pays off, I agree, we should get rid of all of them.

When I do try it <wink>, it will be slower but more memory-efficient (both
data and code) than the type-specific free lists, and faster and much more
memory-efficient than using malloc().

> ...
> I would consider moving from 8-bit strings to Unicode an
> improvement in flexibility.

Sure.  Moving from one malloc to two is orthogonal.

> It also results in better algorithms (== simpler, less error-prone,
> etc. in this case).

Unclear what "it" means; assuming it means using two mallocs instead of one
for a Unicode string object, the 8-bit string algorithms haven't been a
particular source of bugs.  People mutating strings at the C level has been.

> As I said, it's a tradeoff flexibility vs. memory consumption.
> Whether it pays off depends on your application environment. It
> certainly does for companies like Micron and pays off stock-wise
> for a lot of people... uhm, getting off-topic here :-)

I've got nothing against Unicode (apart from the larger issue that the whole
world would obviously be a lot better off if they switched to American
English <wink>).

>> Subclassing seems easy enough to me from the Python level; I
>> don't have time to revisit C-level subclassing here (and I don't
>> know that it's hackish there either, but do think it's in need of
>> docs).

> It is beautifully easy for non-varying-length types. Unfortunately,
> it happens that some of the basic types which would be attractive
> for subclassing are varying length types (such as string and
> tuples).

It's easy to subclass from str and tuple in Python -- even to add your own
instance data.

> In my case, I'm looking for a way to subclass strings, but I haven't
> yet found an elegant solution to the problem of adding extra
> data to the instances.

It's easy if you're willing to use a dict:

class STR(str):
    def __new__(cls, strguts, n):
        self = str.__new__(cls, strguts)
        self.n = n
        return self

s = STR('abc', 42)
print s    # abc
print s.n  # 42

__slots__ doesn't work here, though.
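
For comparison, here is a sketch of the tuple analogue.  TUP and
slots_rejected are made-up names for illustration, not anything in the
library.  It assumes an interpreter where tuple is a varying-length type, so
a nonempty __slots__ is refused when the class is created, while the
dict-based version goes through:

```python
class TUP(tuple):
    # Hypothetical tuple analogue of STR: the extra data lives in the
    # instance dict, not in the tuple's own storage.
    def __new__(cls, items, n):
        self = tuple.__new__(cls, items)
        self.n = n
        return self

t = TUP((1, 2, 3), 42)
print(t)      # (1, 2, 3)
print(t.n)    # 42

def slots_rejected():
    # Try to declare a nonempty __slots__ on a tuple subclass and report
    # whether class creation raised TypeError.
    try:
        class Slotted(tuple):
            __slots__ = ('n',)
    except TypeError:
        return True
    return False

print(slots_rejected())
```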

I admit I personally don't see much attraction to subclassing from str and
tuple, apart from adding additional *methods*.  I suppose someone could code
up two-malloc variants ...
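
The methods-only case mentioned above might look like the sketch below.
Shouty and shout are hypothetical names, and the class adds behavior only, no
extra per-instance data:

```python
class Shouty(str):
    # Hypothetical str subclass that adds a method and nothing else.
    def shout(self):
        return self.upper() + "!"

s = Shouty("abc")
print(s.shout())    # ABC!
# Inherited operations hand back a plain str, not a Shouty.
print(type(s + "def") is str)    # True
```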