[Python-ideas] namedtuple literals [Was: RE a new namedtuple]

Nick Coghlan ncoghlan at gmail.com
Thu Jul 27 07:48:39 EDT 2017


On 27 July 2017 at 10:38, Steven D'Aprano <steve at pearwood.info> wrote:
> On Thu, Jul 27, 2017 at 11:46:45AM +1200, Greg Ewing wrote:
>> Nick Coghlan wrote:
>> >The same applies to the ntuple concept, expect there it's the fact
>> >that it's a *tuple* that conveys the "order matters" expectation.
>>
>> That assumes there's a requirement that it be a tuple in
>> the first place. I don't see that requirement in the use
>> cases suggested here so far.
>
> This is an excellent point. Perhaps we should just find a shorter name
> for SimpleNamespace and promote it as the solution.
>
> I'm not sure about other versions, but in Python 3.5 it will even save
> memory for small records:
>
> py> from types import SimpleNamespace
> py> spam = SimpleNamespace(flavour='up', charge='1/3')
> py> sys.getsizeof(spam)
> 24

sys.getsizeof() isn't recursive, so this is only measuring the
overhead of CPython's per-object bookkeeping. The actual storage
expense is incurred via the instance dict:

    >>> sys.getsizeof(spam.__dict__)
    240
    >>> data = dict(charge='1/3', flavour='up')
    >>> sys.getsizeof(data)
    240

Note: this is a 64-bit system, so the per-instance overhead is also
higher (48 bytes rather than 24), and tuple incur a cost of 8 bytes
per item rather than 4 bytes.

It's simply not desirable to rely on dicts for this kind of use case,
as the per-instance cost of their bookkeeping machinery is overly high
for small data classes and key-sharing only mitigates that problem, it
doesn't eliminate it.

By contrast, tuples are not only the most memory efficient data
structure Python offers, they're also one of the fastest to allocate:
since they're fixed length, they can be allocated as a single
contiguous block, rather than requiring multiple memory allocations
per instance (and that's before taking the free list into account).

As a result, "Why insist on a tuple?" has three main answers:

- lowest feasible per-instance memory overhead
- lowest feasible runtime allocation cost overhead
- backwards compatibility with APIs that currently return a tuple
without impacting either of the above benefits

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia


More information about the Python-ideas mailing list