
To avoid introducing a new built-in, we could do object.bag = SimpleNamespace
I am liking the idea of making SimpleNamespace more accessible, but maybe we need to think a bit more about why one might want a tuple-with-names, rather than just an easy way to create an object-with-just-attributes.

That is -- how many times do folks use a namedtuple rather than SimpleNamespace just because they know about it, rather than because they really need it? I know that is often the case... but here are some reasons to want an actual tuple (or, an actual ImmutableSequence):

1) Backward compatibility with tuples. This may have been a common use case when they were new, and maybe still is, but if we are future-looking, I don't think this is the primary use case. But maybe some of the features you get from that are important.

2) order-preserving: this makes them a good match for "records" from a DB or CSV file or something.

3) unpacking: x, y = a_point

4) iterating: for coord in a_point: ...

5) immutability: being able to use them as a key in a dict.

What else?

So the question is -- if we want an easier way to create a namedtuple-like object -- which of these features are desired?

Personally, I think an immutable SimpleNamespace would be good. And if you want the other stuff, use a NamedTuple. And a quick and easy way to make one would be nice.

I understand that the ordering could be confusing to folks, but I'm still thinking yes -- in the spirit of duck-typing, I think having to think about the Type is unfortunate.

And will people really get confused if:

    ntuple(x=1, y=2) == ntuple(y=2, x=1)

returns False?

If so -- then, if we are willing to introduce new syntax, we can make that more clear.

Note that:

    ntuple(x=1, y=2) == ntuple(z=1, w=2)

should also be False, and

    ntuple(x=1, y=2) == (1, 2)

also False (this is losing tuple-compatibility).

That is, the names, and the values, and the order are all fixed.

If we use a tuple to define the "type" == ('x','y') then it's easy enough to cache and compare based on that. If, indeed, you need to cache at all.
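To make those comparison rules concrete, here is a minimal sketch of one way it could work. This is only an illustration, not a proposed implementation: the helper _ntuple_type and the class NTuple are hypothetical names, the cache is simply functools.lru_cache keyed on the tuple of field names, and it assumes Python 3.6+ so that **kwargs keeps call order.

```python
from collections import namedtuple
from functools import lru_cache


@lru_cache(maxsize=None)
def _ntuple_type(names):
    """Build (and cache) one class per ordered tuple of field names."""
    base = namedtuple("ntuple", names)

    class NTuple(base):
        __slots__ = ()

        def __eq__(self, other):
            # Same cached class means same names in the same order;
            # anything else, including a plain tuple, compares unequal.
            if type(other) is not type(self):
                return False
            return tuple.__eq__(self, other)

        def __ne__(self, other):
            # tuple defines its own __ne__, so override it to stay consistent.
            return not self.__eq__(other)

        def __hash__(self):
            # Keep instances usable as dict keys; mix the names into the hash.
            return hash((self._fields, tuple(self)))

    return NTuple


def ntuple(**kwargs):
    # Relies on **kwargs preserving call order (guaranteed since Python 3.6).
    return _ntuple_type(tuple(kwargs))(*kwargs.values())


p = ntuple(x=1, y=2)
x, y = p                         # unpacking still works
print(p == ntuple(x=1, y=2))     # True
print(p == ntuple(y=2, x=1))     # False: same names, different order
print(p == ntuple(z=1, w=2))     # False: different names
print(p == (1, 2))               # False: plain tuples never match
```

Note that returning False (rather than NotImplemented) for a foreign type is a deliberate choice in this sketch; otherwise Python would fall back to plain tuple comparison and (1, 2) would compare equal.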
BTW, I think we need to be careful about what assumptions we are making in terms of "dicts are order-preserving". My understanding is that the fact that the latest dict in CPython is order-preserving should be considered an implementation detail, and not relied on. But that we CAN count on **kwargs being order-preserving. That is, **kwargs is an order-preserving mapping, but the fact that it IS a dict is an implementation detail. Have I got that right?

Of course, this will make it hard to back-port an "ntuple" implementation....

And

    ntuple(('x', 2), ('y', 3))

is unfortunate.

-CHB

On Thu, Jul 27, 2017 at 4:48 AM, Nick Coghlan <ncoghlan@gmail.com> wrote:
On 27 July 2017 at 10:38, Steven D'Aprano <steve@pearwood.info> wrote:
On Thu, Jul 27, 2017 at 11:46:45AM +1200, Greg Ewing wrote:
Nick Coghlan wrote:
The same applies to the ntuple concept, except there it's the fact that it's a *tuple* that conveys the "order matters" expectation.
That assumes there's a requirement that it be a tuple in the first place. I don't see that requirement in the use cases suggested here so far.
This is an excellent point. Perhaps we should just find a shorter name for SimpleNamespace and promote it as the solution.
I'm not sure about other versions, but in Python 3.5 it will even save memory for small records:
py> import sys
py> from types import SimpleNamespace
py> spam = SimpleNamespace(flavour='up', charge='1/3')
py> sys.getsizeof(spam)
24
sys.getsizeof() isn't recursive, so this is only measuring the overhead of CPython's per-object bookkeeping. The actual storage expense is incurred via the instance dict:
>>> sys.getsizeof(spam.__dict__)
240
>>> data = dict(charge='1/3', flavour='up')
>>> sys.getsizeof(data)
240
Note: this is a 64-bit system, so the per-instance overhead is also higher (48 bytes rather than 24), and tuples incur a cost of 8 bytes per item rather than 4 bytes.
It's simply not desirable to rely on dicts for this kind of use case, as the per-instance cost of their bookkeeping machinery is overly high for small data classes, and key-sharing only mitigates that problem; it doesn't eliminate it.
By contrast, tuples are not only the most memory efficient data structure Python offers, they're also one of the fastest to allocate: since they're fixed length, they can be allocated as a single contiguous block, rather than requiring multiple memory allocations per instance (and that's before taking the free list into account).
As a result, "Why insist on a tuple?" has three main answers:
- lowest feasible per-instance memory overhead
- lowest feasible runtime allocation cost overhead
- backwards compatibility with APIs that currently return a tuple without impacting either of the above benefits
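The memory point can be checked roughly on any interpreter. The following is only a sketch: Quark is a hypothetical record type reusing the field names from the example above, and the exact byte counts depend on the Python version and platform, so none are shown here.

```python
import sys
from collections import namedtuple
from types import SimpleNamespace

# Hypothetical record type, reusing the fields from the example above.
Quark = namedtuple("Quark", ["flavour", "charge"])

as_tuple = Quark(flavour="up", charge="1/3")
as_namespace = SimpleNamespace(flavour="up", charge="1/3")

# A tuple stores its item pointers inline, so one getsizeof() call covers it.
tuple_size = sys.getsizeof(as_tuple)

# getsizeof() isn't recursive: add the instance dict to the object header.
namespace_size = sys.getsizeof(as_namespace) + sys.getsizeof(as_namespace.__dict__)

print("namedtuple:      ", tuple_size, "bytes")
print("SimpleNamespace: ", namespace_size, "bytes (object + __dict__)")
```

Neither figure counts the attribute values themselves, which are the same objects in both cases.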
Cheers, Nick.
--
Nick Coghlan   |   ncoghlan@gmail.com   |   Brisbane, Australia
_______________________________________________
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/
--
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker@noaa.gov