Re: [Python-ideas] namedtuple literals [Was: RE a new namedtuple]

July 27, 2017

      ...
To avoid introducing a new built-in, we could do object.bag =
SimpleNamespace

I am liking the idea of making SimpleNamespace more accessible, but maybe
we need to think a bit more about why one might want a tuple-with-names,
rather than just an easy way to create an object-with-just-attributes.

That is -- how often do folks use a namedtuple rather than a
SimpleNamespace just because they know about it, rather than because
they really need it? I know that is often the case...

But here are some reasons to want an actual tuple (or an actual
immutable sequence):

1) Backward compatibility with tuples.
    This may have been a common use case when they were new, and maybe
still is, but if we are future-looking, I don't think this is the primary
use case. But maybe some of the features you get from that are important.

2) order-preserving: this makes them a good match for "records" from a
DB or CSV file or something.

3) unpacking: x, y = a_point

4) iterating: for coord in a_point:
                          ...

5) immutability: being able to use them as a key in a dict.
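
For concreteness, here is a minimal sketch of the five features above using
the existing collections.namedtuple. The Point class and the a_point value
are made up for illustration only.

```python
from collections import namedtuple

# Hypothetical record type, used only for illustration.
Point = namedtuple("Point", ["x", "y"])
a_point = Point(x=1, y=2)

# 1) tuple compatibility: it really is a tuple
is_tuple = isinstance(a_point, tuple)

# 2) order-preserving: fields keep their declared order
fields = a_point._fields

# 3) unpacking
x, y = a_point

# 4) iterating
coords = [coord for coord in a_point]

# 5) immutability: hashable, so usable as a dict key
lookup = {a_point: "some label"}
```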

What else?

So the question is -- If we want an easier way to create a namedtuple-like
object -- which of these features are desired?

Personally, I think an immutable SimpleNamespace would be good. And if you
want the other stuff, use a NamedTuple. And a quick and easy way to make
one would be nice.
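
As a rough sketch of what an "immutable SimpleNamespace" could look like:
the class below is hypothetical (FrozenNamespace is not in the stdlib), and
it assumes that equality should ignore attribute order and that all values
are hashable.

```python
class FrozenNamespace:
    """Hypothetical immutable, hashable attribute bag (illustration only)."""

    def __init__(self, **kwargs):
        # Fill the instance dict directly, bypassing our own __setattr__.
        self.__dict__.update(kwargs)

    def __setattr__(self, name, value):
        raise AttributeError("FrozenNamespace is immutable")

    def __delattr__(self, name):
        raise AttributeError("FrozenNamespace is immutable")

    def __eq__(self, other):
        if not isinstance(other, FrozenNamespace):
            return NotImplemented
        # Dict comparison ignores order: names and values must match.
        return self.__dict__ == other.__dict__

    def __hash__(self):
        # Order-insensitive hash, consistent with __eq__ above.
        return hash(frozenset(self.__dict__.items()))

    def __repr__(self):
        args = ", ".join(f"{k}={v!r}" for k, v in self.__dict__.items())
        return f"FrozenNamespace({args})"
```

Note that this only blocks normal attribute assignment; someone can still
poke at the instance __dict__ directly, so it is a convention more than a
guarantee.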

I understand that the ordering could be confusing to folks, but I'm still
thinking yes -- in the spirit of duck-typing, I think having to think
about the Type is unfortunate.

And will people really get confused if:

ntuple(x=1, y=2) == ntuple(y=2, x=1)

returns False?

If so -- then, if we are willing to introduce new syntax, we can make
that clearer.

Note that:

ntuple(x=1, y=2) == ntuple(z=1, w=2)

Should also be False.

and

ntuple(x=1, y=2) == (1, 2)

also False (this is losing tuple-compatibility)

That is, the names, and the values, and the order are all fixed.

If we use the tuple of field names to define the "type", e.g. ('x', 'y'),
then it's easy enough to cache and compare based on that. If, indeed, you
need to cache at all.
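
A minimal sketch of those semantics, assuming Python 3.6+ for keyword
ordering. The ntuple function, the _ntuple_types cache, and the
StrictNTuple class name are all hypothetical; this is one possible way to
get the comparisons above, not a proposed implementation.

```python
from collections import namedtuple

_ntuple_types = {}  # cache: tuple of field names -> generated class


def ntuple(**kwargs):
    """Hypothetical helper, not a real stdlib function."""
    fields = tuple(kwargs)  # keyword order is kept on Python 3.6+
    cls = _ntuple_types.get(fields)
    if cls is None:
        base = namedtuple("ntuple", fields)

        class StrictNTuple(base):
            __slots__ = ()

            def __eq__(self, other):
                # Equal only to the same generated type, i.e. the same
                # names in the same order, and then the same values.
                return type(other) is type(self) and tuple.__eq__(self, other)

            def __ne__(self, other):
                return not self == other

            # Defining __eq__ would otherwise disable hashing.
            __hash__ = tuple.__hash__

        cls = _ntuple_types[fields] = StrictNTuple
    return cls(**kwargs)
```

With this sketch, unpacking and iteration still work, but comparison with a
plain tuple is deliberately given up.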

BTW, I think we need to be careful about what assumptions we are making in
terms of "dicts are order-preserving". My understanding is that the fact
that the latest dict in CPython is order-preserving should be considered an
implementation detail, and not relied on.

But we CAN count on **kwargs being order-preserving. That is, **kwargs
is an order-preserving mapping, but the fact that it IS a dict is an
implementation detail.

Have I got that right?
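
For reference, PEP 468 (Python 3.6) is what specifies that **kwargs keeps
the order in which keywords were passed. A tiny check, with field_order
being a made-up helper name:

```python
def field_order(**kwargs):
    # Iterating over kwargs yields names in call order (PEP 468, Python 3.6+).
    return tuple(kwargs)


first = field_order(x=1, y=2)
second = field_order(y=2, x=1)
```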

Of course, this will make it hard to back-port an "ntuple" implementation,
since the **kwargs ordering is only guaranteed from Python 3.6 on...

And

ntuple(('x', 2), ('y', 3))

is unfortunate.
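
For comparison, a back-portable spelling along those lines might look like
the sketch below. ntuple_from_pairs is a hypothetical name; it does not
depend on keyword ordering, at the cost of the clunkier call.

```python
from collections import namedtuple


def ntuple_from_pairs(*pairs):
    """Hypothetical helper: build a named tuple from (name, value) pairs."""
    names = tuple(name for name, _ in pairs)
    values = tuple(value for _, value in pairs)
    return namedtuple("ntuple", names)(*values)


record = ntuple_from_pairs(("x", 2), ("y", 3))
```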

-CHB

On Thu, Jul 27, 2017 at 4:48 AM, Nick Coghlan <ncoghlan@gmail.com> wrote:
...
On 27 July 2017 at 10:38, Steven D'Aprano <steve@pearwood.info> wrote:
...
On Thu, Jul 27, 2017 at 11:46:45AM +1200, Greg Ewing wrote:
...
Nick Coghlan wrote:
...
The same applies to the ntuple concept, except there it's the fact
that it's a *tuple* that conveys the "order matters" expectation.

That assumes there's a requirement that it be a tuple in
the first place. I don't see that requirement in the use
cases suggested here so far.

This is an excellent point. Perhaps we should just find a shorter name
for SimpleNamespace and promote it as the solution.

I'm not sure about other versions, but in Python 3.5 it will even save
memory for small records:

    py> import sys
    py> from types import SimpleNamespace
    py> spam = SimpleNamespace(flavour='up', charge='1/3')
    py> sys.getsizeof(spam)
    24

sys.getsizeof() isn't recursive, so this is only measuring the
overhead of CPython's per-object bookkeeping. The actual storage
expense is incurred via the instance dict:

    >>> sys.getsizeof(spam.__dict__)
    240
    >>> data = dict(charge='1/3', flavour='up')
    >>> sys.getsizeof(data)
    240

Note: this is a 64-bit system, so the per-instance overhead is also
higher (48 bytes rather than 24), and tuples incur a cost of 8 bytes
per item rather than 4 bytes.
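
A small script along these lines can repeat the comparison on any
interpreter. The Quark type is made up for illustration, and the exact byte
counts vary with Python version and platform, so none are quoted here; the
sketch only assumes CPython, where the namespace pays for a separate
instance dict.

```python
import sys
from collections import namedtuple
from types import SimpleNamespace

# Hypothetical record type, used only for illustration.
Quark = namedtuple("Quark", ["flavour", "charge"])

ns = SimpleNamespace(flavour="up", charge="1/3")
nt = Quark(flavour="up", charge="1/3")

# getsizeof() is shallow, so add the instance dict for the namespace.
ns_total = sys.getsizeof(ns) + sys.getsizeof(ns.__dict__)
# A namedtuple instance stores its item pointers inline, with no dict.
nt_total = sys.getsizeof(nt)

print("SimpleNamespace + __dict__:", ns_total, "bytes")
print("namedtuple:                ", nt_total, "bytes")
```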

It's simply not desirable to rely on dicts for this kind of use case,
as the per-instance cost of their bookkeeping machinery is overly high
for small data classes and key-sharing only mitigates that problem, it
doesn't eliminate it.

By contrast, tuples are not only the most memory efficient data
structure Python offers, they're also one of the fastest to allocate:
since they're fixed length, they can be allocated as a single
contiguous block, rather than requiring multiple memory allocations
per instance (and that's before taking the free list into account).

As a result, "Why insist on a tuple?" has three main answers:

- lowest feasible per-instance memory overhead
- lowest feasible runtime allocation cost overhead
- backwards compatibility with APIs that currently return a tuple
  without impacting either of the above benefits
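
As one illustration of the backwards-compatibility point, the stdlib's
urllib.parse.urlsplit returns a named tuple, so code that treats the result
as a plain 5-tuple and code that uses attribute names both work. The URL is
just the list address from this thread.

```python
from urllib.parse import urlsplit

parts = urlsplit("https://mail.python.org/mailman/listinfo/python-ideas")

# Tuple-style access: positional unpacking and indexing.
scheme, netloc, path, query, fragment = parts
second_item = parts[1]

# Name-style access to the same data.
named_host = parts.netloc
```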

Cheers,
Nick.
--
Nick Coghlan   |   ncoghlan@gmail.com   |   Brisbane, Australia
-- 

Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker@noaa.gov
