> >>> import numpy as np
> >>> mapping_table = np.array(BIG_LOOKUP_DICT.items())
One note on numpy: the numpy array() function is very much designed for Sequences, partly due to history, but also for convenience and performance -- it needs to know the size and data type of the array it is going to create before it creates it.
And honestly, I'm not sure that array() would work with the dict views anyway if we added indexing -- we'd have to look at the logic inside array()
And numpy has fromiter() for working with iterators.
In short: "it would work with numpy" is NOT a reason to add this feature :-)
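For what it's worth, here is a rough sketch of how array() and fromiter() treat dict views today, as far as I understand it. The small `lookup` dict is just a stand-in I made up for the big lookup dict in the quoted example.

```python
import numpy as np

lookup = {1: 10, 2: 20, 3: 30}  # made-up stand-in for a big lookup dict

# array() does not treat the view as a sequence; it wraps the whole
# view as one Python object in a zero-dimensional array.
wrapped = np.array(lookup.items())
print(wrapped.shape, wrapped.dtype)

# Copying into a list first gives array() a real sequence to work with.
table = np.array(list(lookup.items()))
print(table.shape)

# fromiter() consumes any iterable, given a dtype (and optionally a count).
values = np.fromiter(lookup.values(), dtype=np.int64, count=len(lookup))
print(values)
```

Either way, numpy copies the values into its own buffer, so indexable views would not save that copy.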
> And I expect that even if dict.items() was indexable, numpy would
> still have to copy the items. I don't know how numpy works in detail,
> but I doubt that it will be able to use a view of a hash table internals
> as a fast array without copying.
Of course not -- but it makes a copy of the items in a list too -- so the extra copy for the list is still there.
(numpy works with homogeneous lower-level data types -- the actual bytes of the C datatype -- so it is always copying the values when it makes an array out of Python types, except for the numpy object dtype, but that's a special case.)
> What making dict_* types a Sequence will do is make this code (as written) behave:
For my part, I'm not asking for the dict views to be full blown Sequences -- I think that *would* be an attractive nuisance. I'm thinking of only adding indexing.
> still think of concrete sequences and indexing as fundamental, while
> Python 3 has moved in the direction of making the iterator protocol and
> iterators as fundamental.
That is indeed a change in Python over the years, but I think it was a practicality-driven change; in short: don't make copies you don't need to make. So I don't think we should use "Iterators are fundamental to Python" as a reason to NOT add Sequence-like behavior.
> You have a hammer (indexing), so you want views to be nails so you can
> hammer them. But views are screws, and need a screwdriver (iter and
> next).
But there are, in carpentry, many places where you can use either a screw or a nail, and some of us have even been known to hammer a screw in, even if we had a screwdriver handy, and knew what the heck we were doing. That is the argument here: when the screw can be well used, in a particular case, by hitting it with a hammer, then why not let me do that. To take the analogy way too far: don't take the hammer out of my toolbox just because there are some screwdrivers in there.
> The existing dictionary memory layout doesn't support direct indexing (without stepping), so this functionality is not being added as a requirement.
But it does make it much more efficient if the stepping is done inside the dict object by code that knows its internal structure, both because it can be in C and because it can be done without any additional references or copying. Yes, it's all O(n), but with a very different constant.
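To make the comparison concrete, here is a minimal sketch of the two ways to get the n-th item from pure Python today. The helper name `nth_item` is made up for illustration; it is not a proposed API.

```python
from itertools import islice

def nth_item(d, n):
    """Return the n-th (key, value) pair of dict d in insertion order."""
    if not 0 <= n < len(d):
        raise IndexError("dict item index out of range")
    # islice steps through the view lazily; no intermediate list is built.
    return next(islice(d.items(), n, None))

d = {"a": 1, "b": 2, "c": 3}

# Stepping: O(n) time, no extra memory.
stepped = nth_item(d, 1)

# Copy first, then index: O(n) time *and* O(n) extra memory.
copied = list(d.items())[1]

print(stepped, copied)
```

Both are O(n), but the stepping version still runs through the iterator protocol one item at a time, which is the overhead that indexing inside the dict could avoid.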
> fact that they can be indexed in reasonable time is not part of the
> design, just an accident of implementation, and being an accident, it
> could change in the future.
It *could*, but I can't imagine how you could have an efficient order-preserving data structure that could not be indexed reasonably -- in particular, more efficiently than making a full list copy first. And even so -- fine: performance characteristics are not guaranteed anyway.
> If random.choice should support non-sequence ordered container,
> just propose it to random.choice.
That would indeed solve the usability issue, and so may be a good idea.
The problem here is that there is no way for random.choice to efficiently work with generic Mappings. This whole discussion started because now that dicts preserve order, there is both a logical reason and a practical implementation for indexing. But if that is not exposed, then neither random.choice() nor any other function can take advantage of it.
Which would lead to adding a random_choice protocol -- but THAT sure seems like overkill.
(OK, you could have the builtin random.choice check for an actual dict, and then use custom code to make a random selection, but that would really be a micro-optimization!)
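Here is a small sketch of where things stand today, assuming current CPython behavior: the view has a length but cannot be indexed, so random.choice() rejects it. The helper `choice_from_dict` is hypothetical, just to show what a caller has to write instead.

```python
import random
from itertools import islice

d = {"spam": 1, "eggs": 2, "ham": 3}

# The view itself cannot be passed in: it has a length but no indexing.
try:
    random.choice(d.items())
    view_worked = True
except TypeError:
    view_worked = False

# Common workaround: copy everything into a list first.
item = random.choice(list(d.items()))

def choice_from_dict(d):
    """Hypothetical helper: pick a random item without building a list."""
    if not d:
        raise IndexError("cannot choose from an empty dict")
    index = random.randrange(len(d))
    # Still O(n) stepping, but done through the iterator protocol.
    return next(islice(d.items(), index, None))

print(view_worked, item, choice_from_dict(d))
```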
> but they can't be Sequences, since they are already Sets. They would
> have to be a hybrid of the two, and that, I feel, comes with more
> baggage than just being one or the other.
I think this is where I fundamentally disagree, as far as language design and Python philosophy is concerned. I've been using Python for 20+ years (terrifying!) and I have always really liked the duck typing concept. In fact, even one better: it doesn't have to look, walk, and quack like a duck to be a duck -- if I only need it to quack, I don't care how it looks and walks.
Since those pre-2.0 days, Python has grown a lot more "structure" to its typing, notably ABCs and now facilities for static type checking. So far, those *enable* more formal typing, but don't *require* it. But as more folks start to use them, I'm going to have to start writing more strictly typed code if I want to use other libraries -- I'm hoping it won't come to that, but we'll see.
To bring this back to the case at hand:
I haven't looked at the code, but I'm pretty sure that random.choice() does not check for the Sequence ABC: it simply tries to get the length, and then indexes the object to get a random item. If that works, then it works. This is shown by passing it a dict with integer keys in the right range:
In [28]: d
Out[28]: {0: 0, 1: 1, 2: 2, 3: 3, 4: 4, 5: 5, 6: 6, 7: 7, 8: 8, 9: 9}
In [29]: random.choice(d)
Out[29]: 9
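The same point as a self-contained sketch, assuming random.choice() keeps behaving the way it does in current CPython. The class `OnlyLenAndGetitem` is made up for illustration; it is not a Sequence, it only has a length and indexing.

```python
import random
from collections.abc import Sequence

class OnlyLenAndGetitem:
    """Not a Sequence subclass; it only provides __len__ and __getitem__."""

    def __init__(self, data):
        self._data = list(data)

    def __len__(self):
        return len(self._data)

    def __getitem__(self, index):
        return self._data[index]

quacker = OnlyLenAndGetitem("abc")
picked = random.choice(quacker)  # works: length plus indexing is enough

# A dict whose keys are not 0..len-1 fails at the indexing step instead.
try:
    random.choice({"a": 1})
    dict_worked = True
except KeyError:
    dict_worked = False

print(isinstance(quacker, Sequence), picked, dict_worked)
```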
I LIKE this -- so the argument that dict views shouldn't support indexing because they are a Set and can't be a proper Sequence is exactly backwards from how I think Python should work:
If a feature is useful, and doesn't conflict with another feature, then we can add it.
In the end though, while I think there is very little reason NOT to add indexing to dict views, unless someone comes up with a good use case beyond random.choice(), it may not be worth the churn.
-CHB