On Tue, Mar 29, 2022 at 6:37 AM Joseph Martinot-Lagarde <contrebasse@gmail.com> wrote:
Both suggested implementations are very interesting. If I'm not wrong they have exactly the same behavior (except for __dict__).
pretty close, yes -- probably differences with pickling, and if you did any type checking, very different.

Performance-wise with a quick test I get the following results:
t = (1, 2)
print("sizeof", sys.getsizeof(t))  # 56
print("sizeof type", sys.getsizeof(type(t)))  # 408
print(timeit.timeit("(1, 2)"))  # ~0.01

nt = namedtuple("nt", ["x", "y"])
print("sizeof", sys.getsizeof(nt(1, 2)))  # 56
print("sizeof type", sys.getsizeof(type(nt(1, 2))))  # 896
print(timeit.timeit("nt(1, 2)", setup="from __main__ import nt"))  # ~0.2

pt = atuple(x=1, y=2)
print("sizeof", sys.getsizeof(pt))  # 56
print("sizeof type", sys.getsizeof(type(pt)))  # 896
print(timeit.timeit("atuple(x=12, y=16)", setup="from __main__ import atuple"))  # ~0.8

point = TupleWithNames(x=1, y=2)
print("sizeof", sys.getsizeof(point))  # 64
print("sizeof type", sys.getsizeof(type(point)))  # 1064
print(timeit.timeit("TupleWithNames(x=12, y=16)", setup="from __main__ import TupleWithNames"))  # ~0.8
you have to be careful with getsizeof() -- I don't think it digs into nested objects. Anyway, it would certainly require a close look.

The timing performance of both solutions is roughly the same given the measurement variation,
That surprises me -- atuple has to call namedtuple, and there's a lot of work done in there -- very odd. I guess that's why we have to profile, but I'd still give that a closer look.
but way slower than tuple and namedtuple.
well, with namedtuple, you aren't taking into account the creation of the class -- so if you make thousands of the same one, yes, but if you make only a couple, then not so much.
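To put rough numbers on that point, here is a minimal sketch that times the creation of a namedtuple class separately from the creation of an instance. The helper names (time_class_creation, time_instance_creation) are made up for illustration, and the absolute figures will vary by machine and Python version.

```python
import timeit
from collections import namedtuple


def time_class_creation(number=200):
    """Average seconds needed to build a fresh namedtuple class."""
    total = timeit.timeit(lambda: namedtuple("nt", ["x", "y"]), number=number)
    return total / number


def time_instance_creation(number=200_000):
    """Average seconds needed to instantiate an already existing class."""
    nt = namedtuple("nt", ["x", "y"])  # built once, outside the timed call
    total = timeit.timeit(lambda: nt(1, 2), number=number)
    return total / number


per_class = time_class_creation()
per_instance = time_instance_creation()
print(f"class creation:    {per_class:.2e} s each")
print(f"instance creation: {per_instance:.2e} s each")
```

If only a couple of instances are made per class, the class-creation cost dominates, which is the case the nt(1, 2) timing above leaves out.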
TupleWithNames is a bit more memory hungry than atuple and namedtuple, but there is only one type so if I understand correctly it would be a win for multiple instances.
I think so -- at least over atuple -- though one could create a cache of namedtuples so that atuple would reuse an existing one if it was already there.

Maybe there is room for optimisation in both cases?
I'm sure -- namedtuple uses sys.intern on the names, that would help. And I tried to use __slots__ in TupleWithNames (sorry for that horrible name ;-) ), but apparently you can't use __slots__ in a tuple subclass ('cause tuple's already using it ??) -- but that could be done in a builtin. Then it wouldn't need a __dict__.

There's also various options for storing the fields -- I only tried the first one I thought of.

This reminds me -- it would be kinda cool if there was an easier, more robust way to make an immutable in Python -- maybe a frozendict for __dict__?

Anyway, if someone wants to take this further, I'd be glad to help.

-CHB

--
Christopher Barker, PhD (Chris)

Python Language Consulting
- Teaching
- Scientific Software Development
- Desktop GUI and Web Development
- wxPython, numpy, scipy, Cython
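On the __slots__ remark above: in CPython, a tuple subclass may declare an empty __slots__ (which is what namedtuple does to avoid a per-instance __dict__), but a nonempty __slots__ is rejected when the class is created. The class names below are hypothetical. Assuming TupleWithNames keeps its field names on each instance (its code is not shown here), the empty form alone would not be enough for it.

```python
class WithDict(tuple):
    """Plain tuple subclass: every instance gets its own __dict__."""


class WithoutDict(tuple):
    """Empty __slots__ is allowed and suppresses the per-instance __dict__."""
    __slots__ = ()


try:
    # A nonempty __slots__ on a tuple subclass fails at class-creation time.
    class Broken(tuple):
        __slots__ = ("names",)
    nonempty_slots_error = None
except TypeError as exc:
    nonempty_slots_error = exc

print(hasattr(WithDict((1, 2)), "__dict__"))
print(hasattr(WithoutDict((1, 2)), "__dict__"))
print(nonempty_slots_error)
```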