On Tue, Mar 29, 2022 at 6:37 AM Joseph Martinot-Lagarde <contrebasse@gmail.com> wrote:
Both suggested implementations are very interesting. If I'm not wrong they have exactly the same behavior (except for __dict__).

Pretty close, yes -- probably some differences with pickling, and if you did any type checking, very different.

Performance-wise with a quick test I get the following results:

import sys
import timeit
from collections import namedtuple

t = (1, 2)
print("sizeof", sys.getsizeof(t))  # 56
print("sizeof type", sys.getsizeof(type(t)))  # 408
print(timeit.timeit("(1, 2)"))  # ~0.01

nt = namedtuple("nt", ["x", "y"])
print("sizeof", sys.getsizeof(nt(1, 2)))  # 56
print("sizeof type", sys.getsizeof(type(nt(1, 2))))  # 896
print(timeit.timeit("nt(1, 2)", setup="from __main__ import nt"))  # ~0.2

pt = atuple(x=1, y=2)
print("sizeof", sys.getsizeof(pt))  # 56
print("sizeof type", sys.getsizeof(type(pt)))  # 896
print(timeit.timeit("atuple(x=12, y=16)", setup="from __main__ import atuple"))  # ~0.8

point = TupleWithNames(x=1, y=2)
print("sizeof", sys.getsizeof(point))  # 64
print("sizeof type", sys.getsizeof(type(point)))  # 1064
print(timeit.timeit("TupleWithNames(x=12, y=16)", setup="from __main__ import TupleWithNames"))  # ~0.8

You have to be careful with getsizeof() -- it only counts the object itself and doesn't dig into nested objects. Anyway, it would certainly require a close look.
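To illustrate the getsizeof() caveat, here is a minimal sketch of a recursive size estimate. The helper name total_size is made up for this example, and it only walks the common built-in containers and instance __dict__s, so treat its numbers as a rough lower bound rather than a real memory profile.

```python
import sys


def total_size(obj, seen=None):
    """Rough recursive size: obj plus the objects its containers reference."""
    if seen is None:
        seen = set()
    if id(obj) in seen:  # count shared objects only once
        return 0
    seen.add(id(obj))
    size = sys.getsizeof(obj)
    if isinstance(obj, dict):
        size += sum(total_size(k, seen) + total_size(v, seen) for k, v in obj.items())
    elif isinstance(obj, (tuple, list, set, frozenset)):
        size += sum(total_size(item, seen) for item in obj)
    if hasattr(obj, "__dict__"):  # instance attributes, if any
        size += total_size(vars(obj), seen)
    return size


small = (1, 2)
big = ("x" * 10_000, "y" * 10_000)

# getsizeof() sees two 2-tuples and reports the same size for both.
print(sys.getsizeof(small), sys.getsizeof(big))
# The recursive estimate also counts what the tuples refer to.
print(total_size(small), total_size(big))
```

So the identical "sizeof" numbers in the test above only say that the tuple part of each object is the same size; anything hanging off a __dict__ is not counted.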

The timing performance of both solutions is roughly the same given the measurement variation,

That surprises me -- atuple has to call namedtuple, and there's a lot of work done in there -- very odd. I guess that's why we have to profile, but I'd still give that a closer look.
 
but way slower than tuple and namedtuple.

well, with namedtuple, you aren't taking into account the creation of the class -- so if you make thousands of the same one, yes, but if you make only a couple, then not so much.
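One way to check this is to time the class creation together with the instantiation. The following is only a sketch: the two helper functions are made up for this example, and the actual numbers will vary by machine and Python version.

```python
import timeit
from collections import namedtuple


def make_class_and_instance():
    # What you pay if every call site builds its own namedtuple class.
    nt = namedtuple("nt", ["x", "y"])
    return nt(1, 2)


nt = namedtuple("nt", ["x", "y"])  # class built once, up front


def make_instance_only():
    # What the benchmark above measures for namedtuple.
    return nt(1, 2)


n = 1000
with_class = timeit.timeit(make_class_and_instance, number=n)
instance_only = timeit.timeit(make_instance_only, number=n)
print(f"class + instance: {with_class / n * 1e6:.1f} us per call")
print(f"instance only:    {instance_only / n * 1e6:.1f} us per call")
```

The first figure is the fairer comparison against atuple and TupleWithNames when only a handful of instances of each shape are created.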
 
TupleWithNames is a bit more memory hungry than atuple and namedtuple, but there is only one type so if I understand correctly it would be a win for multiple instances.

I think so -- at least compared to atuple -- though one could create a cache of namedtuples so that atuple would reuse an existing one if it was already there.
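A minimal sketch of that caching idea, assuming atuple is a thin wrapper that builds a namedtuple from its keyword arguments. This is a reconstruction for illustration, not the implementation posted earlier in the thread, and the helper name _class_for is hypothetical.

```python
from collections import namedtuple
from functools import lru_cache


@lru_cache(maxsize=None)
def _class_for(field_names):
    # Hypothetical helper: one namedtuple class per ordered tuple of names.
    return namedtuple("atuple", field_names)


def atuple(**kwargs):
    # Keyword order is preserved, so (x, y) and (y, x) get different classes.
    cls = _class_for(tuple(kwargs))
    return cls(*kwargs.values())


a = atuple(x=1, y=2)
b = atuple(x=12, y=16)
print(a, b)
print(type(a) is type(b))  # both instances share the cached class
```

With the cache in place, the expensive namedtuple() call happens once per distinct set of field names, and the per-type memory cost is shared across instances.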

Maybe there is room for optimisation in both cases ?

I'm sure -- namedtuple uses sys.intern on the names, that would help. And I tried to use __slots__ in TupleWithNames (sorry for that horrible name ;-) ), but apparently you can't use non-empty __slots__ in a tuple subclass (tuple is a variable-length type, and those don't support them) -- but that could be done in a builtin. Then it wouldn't need a __dict__.
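A quick way to see the __slots__ behaviour in CPython is sketched below. The class names are made up for this example. Non-empty __slots__ are rejected for a tuple subclass, while empty __slots__ are accepted and do remove the per-instance __dict__, which leaves nowhere to store per-instance names in pure Python.

```python
def try_nonempty_slots():
    """Return the error message from defining a slotted tuple subclass."""
    try:
        class Bad(tuple):
            __slots__ = ("names",)  # a named slot on a tuple subclass
    except TypeError as exc:
        return str(exc)
    return None


class NoDict(tuple):
    __slots__ = ()  # empty slots are allowed: no per-instance __dict__


error = try_nonempty_slots()
p = NoDict((1, 2))
print(error)
print(hasattr(p, "__dict__"))
```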

There's also various options for storing the fields -- I only tried the first one I thought of.

This reminds me -- it would be kinda cool if there was an easier, more robust way to make an immutable in Python -- maybe a frozendict for __dict__? 

Anyway, if someone wants to take this further, I'd be glad to help.

-CHB

 

--
Christopher Barker, PhD (Chris)

Python Language Consulting
  - Teaching
  - Scientific Software Development
  - Desktop GUI and Web Development
  - wxPython, numpy, scipy, Cython