Negative array indices and slice()
Andrew Robinson
andrew3 at r3dsolutions.com
Thu Nov 1 18:25:51 EDT 2012
On 11/01/2012 12:07 PM, Ian Kelly wrote:
> On Thu, Nov 1, 2012 at 5:32 AM, Andrew Robinson
> <andrew3 at r3dsolutions.com> wrote:
>> Hmmmm.... was that PEP the active state of Python, when Tim rejected the bug report?
> Yes. The PEP was accepted and committed in March 2006 for release in
> Python 2.5. The bug report is from June 2006 has a version
> classification of Python 2.5, although 2.5 was not actually released
> until September 2006.
That explains Peter's remark. Thank you. He looks *much* smarter now.
>
>> Pep 357 merely added cruft with index(), but really solved nothing. Everything index() does could be implemented in __getitem__ and usually is.
> No. There is a significant difference between implementing this on
> the container versus implementing it on the indexes. Ethan
> implemented his string-based slicing on the container, because the
> behavior he wanted was specific to the container type, not the index
> type. Custom index types like numpy integers on the other hand
> implement __index__ on the index type, because they apply to all
> sequences, not specific containers.
Hmmm...
D'Aprano didn't like the monkey patch; sub-classing was his fix-all.
Part of my summary is based on that conversation with him, and you
touched on one of the unfinished points; I responded to him that I
thought __getitem__ was under-developed. The slice() object has no
knowledge of the size of the sequence; nor can it get that size on its
own, but must passively wait for it to be given to it.
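To show the dance I mean: the slice object resolves its endpoints only when the container hands it a length via slice.indices(). Demo below is just an illustrative container doing what a typical __getitem__ does:

```python
# slice() by itself cannot resolve negative or None endpoints; the
# container must supply the length via slice.indices().
s = slice(-3, None)      # the object created by seq[-3:]
print(s.indices(10))     # (7, 10, 1) -- resolved against length 10
print(s.indices(2))      # (0, 2, 1)  -- same slice, resolved against length 2

class Demo:
    """Illustrative container; its __getitem__ does the usual dance."""
    def __init__(self, data):
        self.data = data
    def __getitem__(self, key):
        if isinstance(key, slice):
            # The container passes its own len() to the slice -- every time.
            start, stop, step = key.indices(len(self.data))
            return [self.data[i] for i in range(start, stop, step)]
        return self.data[key]

d = Demo([0, 1, 2, 3, 4])
print(d[-3:])            # [2, 3, 4]
```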
The bottom line is: __getitem__ must *pass* len(seq) to the slice()
object each *time* the slice object is used. Since this is the case,
it would have been better to have list itself own a default method
which takes the raw slice indices and does the conversion itself. The
size would not need to be duplicated or passed around -- both a memory
savings and a speed savings...
I'm just clay-pigeoning an idea out here....
Let's apply D'Aprano's logic to numpy: Numpy could just have subclassed
*list*, so let's ignore pure Python as a reason to do anything on
Numpy's behalf.
Then let's consider all third-party classes; these are where
subclassing becomes a pain -- BUT: I think those could all have been
injected.
>>> class ThirdParty( list ): # Pretend this is someone else's...
... def __init__(self): return
... def __getitem__(self,aSlice): return aSlice
...
We know it will work like this by default:
>>> a=ThirdParty()
>>> a[1:2]
slice(1, 2, None)
# So, here's an injection... (save the original __getitem__ first)
>>> ThirdParty.superOnlyOfNumpy__getitem__ = ThirdParty.__getitem__
>>> ThirdParty.__getitem__ = lambda self, aSlice: ( 1, 3,
...     self.superOnlyOfNumpy__getitem__( aSlice ).step )
>>> a[5:6]
(1, 3, None)
Numpy could have exported a (workable) function that would modify other
list functions to affect ONLY numpy data types (e.g. a filter). This
allows users creating their own classes to inject them with Numpy's
filter only when they desire;
Recall Tim Peters' "explicit is better than implicit" Zen?
Most importantly, normal programs not using Numpy wouldn't have had to
carry around an extra API check for index() *every* single time the
heavily used [::] happened. Memory & speed, both.
It's also a monkey patch itself, in that index() allows *conflicting*
assumptions -- the very unexpected-interaction worry raised against
monkey patching.
e.g.: Numpy *CAN* release an __index__() method on their floats -- at
which point a basic no-touch class (list itself) will accept a float
as an index, in direct contradiction of PEP 357's comment on floats...
see?
My point isn't that this particular implementation I have shown is the
best (or even really safe, I'd have to think about that for a while).
Go ahead and shoot it down...
My point is that the methods found in slice() and index() have moved
all the code regarding a sequence *out* of the object which has the
information on that sequence. It smacks of legacy.
The Python parser takes values from many other syntactical constructions
and passes them directly to their respective objects -- but in the case
of list(), we have a complicated relationship, and not for any reason
that can't be handled in a simpler way.
Don't consider the present API legacy for a moment, I'm asking
hypothetical design questions:
How many users actually keep slice() around from every instance of [::]
they use?
If it is rare, why create the slice() object in the first place, and
constantly allocate and de-allocate memory twice over (once for the
original, and once for the repeated method call which computes the
dynamic values)? Would a single mutable object have less overhead,
since it is destroyed anyway?