I too have sometimes proposed what I think of as "minor quality-of-life"
enhancements, and had them shot down. It stings a bit, and can be
frustrating, but remember it's not personal.
I don't mind the shooting down, as long as the arguments make sense :D.
It seems like we're both in agreement that the cost of implementing & maintaining the change is non-zero.
I'm asserting that the end benefit of this change is also non-zero, and in my opinion higher than the cost. But I also acknowledge that the benefit may not be enough to overcome the inertia behind getting a change made.
The reason I'm persevering is to try to weed out the immaterial or incorrect reasons for not making this change, so that hopefully we're left with a good understanding of the pros and cons.
The difficulty is that our QOL enhancement is someone else's bloat.
Every new feature is something that has to be not just written once, but
maintained, documented, tested and learned. Every new feature steepens
the learning curve for the language; every new feature increases the
size of the language, increases the time it takes to build, increases
the time it takes for the tests to run.
Yeah, I can see that more code = more code overhead, and that's got to be justified.
I don't believe that this feature would steepen the language learning curve, however; I think it would actually flatten it slightly (explained more below).
This one might only be one new method on three classes, but it all adds
up, and we can't add *everything*.
(I recently started writing what was intended to be a fairly small
class, and before I knew it I was up to six helper classes, nearly 200
methods, and approaching 1500 LOC, for what was conceptually intended to
be a *lightweight* object. I've put this aside to think about it for a
while, to decide whether to start again from scratch with a smaller API,
or just remove the word "lightweight" from the description :-)
Absolute method count is seldom a standalone indicator of a dead end.
Often classes with many methods (especially if they're accompanied by lots of helpers)
are a side-effect of some abstraction failure. Usually when I'm consulting on fixing projects
with these characteristics, it's a case of the developers not correctly choosing their abstractions, or letting them leak.
It sounds like you let that one get away from you; chalk it up to a learning experience.
The "It walks like a zoo, squaks/lows/grunts/chitters like a zoo" problem is very real.
This is more of a "It used to be a duck. Now it walks like a duck, but doesn't sound like a duck because it's a coot" problem.
So each new feature has to carry its own weight. Even if the weight in
effort to write, effort to learn, code, tests and documentation is
small, the benefit gained must be greater or it will likely be rejected.
"Nice to have" is unlikely to be enough, unless you happen to be one of
the most senior core devs scratching your own itch, and sometimes not
even then.
> >>> import numpy as np
> >>> mapping_table = np.array(BIG_LOOKUP_DICT.items())
> [[1, 99],
> [2, 23],
> ...
> ]
That worked in Python 2 by making a copy of the dict items into a list.
It will equally work in Python 3 by making a copy of the items into a
list.
And I expect that even if dict.items() were indexable, numpy would
still have to copy the items. I don't know how numpy works in detail,
but I doubt that it will be able to use a view of a hash table internals
as a fast array without copying.
Bottom line here is that adding indexing to dict views won't save you
either time or memory or avoid making a copy in this example. All it
will save you is writing an explicit call to `list`. And we know what
the Zen says about being explicit.
What making dict_* types a Sequence will do is make this code (as written) behave:
1. like it used to do
2. like most people seem to expect it to.
Currently numpy does something with that code that I consider unexpected. (I'm sure, given your previous responses, you'd disagree with this, but from canvassing Python devs I feel that many people share my opinion here.)
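To illustrate what I mean, here's roughly what I see with current CPython and numpy (a sketch; exact reprs may differ between numpy versions):

import numpy as np

d = {1: 99, 2: 23}

# Today, passing the view directly gives a 0-d object array that merely
# wraps the view, because dict_items doesn't support the sequence protocol:
a = np.array(d.items())
print(a.shape, a.dtype)      # () object

# The explicit copy is what produces the array people expect:
b = np.array(list(d.items()))
print(b)                     # [[ 1 99]
                             #  [ 2 23]]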
> >>> import sqlite3
> >>> conn = sqlite3.connect(":memory:")
> >>> params = {'a': 1, 'b': 2}
> >>> placeholders = ', '.join(f':{p}' for p in params)
> >>> statement = f"select {placeholders}"
> >>> print(f"Running: {statement}")
> Running: select :a, :b
> >>> cur=conn.execute(statement, params.values())
> >>> cur.fetchall()
> [(1, 2)]
Why are you passing a values view when you could pass the dict
itself? Is there some reason you don't do this?
# statement = "select :a, :b"
py> cur=conn.execute(statement, params)
py> cur.fetchall()
[(1, 2)]
I'm not an expert on sqlite, so I might be missing something here, but I
would have expected that this is the preferred solution. It matches the
example in the docs, which uses a dict.
You're right. This was a version of code that I'd written before for a different database driver (one which didn't support named parameters), and sqlite3 does support them, so that's my mistake. As mentioned elsewhere, producing bullet-proof use-cases on demand can be tough.
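For the record, here's roughly what both styles look like (a sketch using sqlite3; the qmark version stands in for a driver without named-parameter support):

import sqlite3

conn = sqlite3.connect(":memory:")
params = {'a': 1, 'b': 2}

# Named style: pass the mapping directly, as the sqlite3 docs suggest.
cur = conn.execute("select :a, :b", params)
print(cur.fetchall())    # [(1, 2)]

# qmark style (what a driver without named parameters forces you into):
# execute() wants a sequence here, so today an explicit list() of the
# values view is needed.
cur = conn.execute("select ?, ?", list(params.values()))
print(cur.fetchall())    # [(1, 2)]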
> # This currently works, but is deprecated in 3.9
> >>> dict(random.sample({'a': 1, 'b': 2}.items(), 2))
> {'b': 2, 'a': 1}
I suspect that even if dict items were indexable, Raymond Hettinger
would not be happy with random.sample on dict views.
I don't see why. I can understand deprecating sets here, as they're unordered, so the results are not consistent even when seed() has been called.
I don't see why Raymond would object to allowing sampling an ordered container, one from which the results will be reproducible.
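For example (a rough sketch; the particular items drawn depend on the seed):

import random

d = {'a': 1, 'b': 2, 'c': 3}

random.seed(1234)
# Dicts preserve insertion order, so with a fixed seed this sample is
# reproducible from run to run; sampling from a set would also depend on
# hash ordering, which is why that case was deprecated.
print(random.sample(list(d.items()), 2))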
> >>> def min_max_keys(d):
> >>> min_key, min_val = d.items()[0]
> >>> max_key, max_val = min_key, min_val
> >>> for key, value in d.items():
Since there's no random access to the items required, there's not really
any need for indexing. You only need the first item, then iteration. So
the natural way to write that is with iter() and next().
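Something like this, say (a rough, untested sketch):

def min_max_keys(d):
    items = iter(d.items())
    min_key, min_val = next(items)       # first item, no indexing needed
    max_key, max_val = min_key, min_val
    for key, value in items:             # then plain iteration
        if value < min_val:
            min_key, min_val = key, value
        if value > max_val:
            max_key, max_val = key, value
    return (min_key, min_val), (max_key, max_val)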
Yeah, it's possible to write this in any number of ways.
I canvassed some opinions from Python developers on how to do this sort of thing, and 4 out of the 5 different responses I got wouldn't currently work, because their suggested implementations relied on `.keys()` being indexable, or being a Sequence.
I suspect that the difference in perspective here is that (perhaps?) you
still think of concrete sequences and indexing as fundamental, while
Python 3 has moved in the direction of making the iterator protocol and
iterators fundamental.
This is the proposal: I want to make these things Sequences.
These things (the results of dict.keys(), for example) used to look like, and act like, nails. Then suddenly they looked and acted like screws (for good reasons), but let's say screws with smooth heads, since many people still think they are nails. Now there's a simple way to make them act like nails again, so using "but they're screws, so they can't be nails" as a counter-argument doesn't hold water.
You have a hammer (indexing), so you want views to be nails so you can
hammer them. But views are screws, and need a screwdriver (iter and
next).
I have a proposal: make these things indexable, so people can hammer them in if they desire.
...not that we can jump to the 350th key without
stepping through the previous 349 keys.
The existing dictionary memory layout doesn't support direct indexing (without stepping), so constant-time random access is not being asked for as a requirement here.
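To be concrete, indexing a keys view could be specified as nothing more than stepping through the existing iterator, roughly like this (an illustrative sketch only; keys_getitem is not a real or proposed API):

from itertools import islice

def keys_getitem(d, index):
    # O(n) indexing by stepping, which is the only guarantee needed.
    n = len(d)
    if index < 0:
        index += n
    if not 0 <= index < n:
        raise IndexError("dictionary index out of range")
    return next(islice(iter(d), index, None))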
Dicts have gone through a number of major redesigns and many careful
tweaks over the years to get the best possible performance. The last
major change was to add *order-preserving* behaviour, not indexing. The
fact that they can be indexed in reasonable time is not part of the
design, just an accident of implementation, and being an accident, it
could change in the future.
To throw the request back: what's the use case you're considering here? Why would dictionary iteration be made slower in the future?
This feature would require upgrading that accident of implementation to
a guarantee. If the Python world were awash with dozens of compelling,
strong use-cases for indexing dicts, then we would surely be willing to
make that guarantee. But the most compelling use-case we've seen so far
is awfully weak indeed: choose a random item from a dict.
So the cost-benefit calculation goes (in my opinion) something like
this.
1. Risk of eliminating useful performance enhancements in the
future: small.
No use cases have been offered for how/why this would be a thing.
2. Benefit gained: even smaller.
Some (admittedly weak) use cases have been offered for why this would help.
That's not FUD. It's just a simple cost-benefit calculation. You can
counter it by finding good use-cases that are currently difficult and
annoying to solve. Using an explicit call to list is neither difficult
nor annoying :-)