Negative array indices and slice()
steve+comp.lang.python at pearwood.info
Fri Nov 2 03:14:55 CET 2012
On Thu, 01 Nov 2012 15:25:51 -0700, Andrew Robinson wrote:
> On 11/01/2012 12:07 PM, Ian Kelly wrote:
>>> Pep 357 merely added cruft with index(), but really solved nothing.
>>> Everything index() does could be implemented in __getitem__ and
>>> usually is.
>> No. There is a significant difference between implementing this on the
>> container versus implementing it on the indexes. Ethan implemented his
>> string-based slicing on the container, because the behavior he wanted
>> was specific to the container type, not the index type. Custom index
>> types like numpy integers on the other hand implement __index__ on the
>> index type, because they apply to all sequences, not specific
> D'Aprano didn't like the monkey patch; and sub-classing was his fix-all.
I pointed out that monkey-patching is a bad idea, even if it worked. But
it doesn't work -- you simply cannot monkey-patch built-ins in Python.
Regardless of whether "I like" the m-p or not, *you can't use it* because
you can't patch built-in list methods.
The best you could do is subclass list, then shadow the built-in name
"list" with your subclass. But that gives all sorts of problems too, in
some ways even worse than monkey-patching.
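
To make the point concrete, here is a minimal sketch of what happens when
you try to patch a method on the built-in list type. It assumes CPython;
the exact wording of the error message varies between versions, so the
sketch only relies on the exception type.

```python
# Try to replace a method on the built-in list type.
# On CPython the assignment is refused with a TypeError.
try:
    list.__getitem__ = lambda self, index: None
    patched = True
except TypeError as exc:
    patched = False
    message = str(exc)  # wording differs between Python versions

print(patched)
```

The built-in type is left untouched, so ordinary indexing such as
[1, 2, 3][0] keeps working exactly as before.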
You started this thread with a question about slicing. You believe that
one particular use-case for slicing, which involves interpreting lists as
circular rather than linear, is the use-case that built-in list slicing
should have supported.
Fine, you're entitled to your opinion. But that boat sailed about 20
years ago. Python didn't make that choice, and it won't change now. If
you write up a PEP, you could aim to have the built-in behaviour changed
for Python 4 in perhaps another 10-15 years or so. But for the time
being, that's not what lists, tuples, strings, etc. do. If you want that
behaviour, if you want a circular list, then you have to implement it
yourself, and the easiest way to do so is with a subclass.
That's not a "fix-all". I certainly don't believe that subclassing is the
*only* way to fix this, nor that it will fix "all" things. But it might
fix *some* things, such as you wanting a data type that is like a
circular list rather than a linear list.
If you prefer to create a circular-list class from scratch, re-
implementing all the list-like behaviour, instead of inheriting from an
existing class, then by all means go right ahead. If you have a good
reason to spend days or weeks writing, testing, debugging and fine-tuning
your new class, instead of about 15 minutes with a subclass, then I'm
certainly not going to tell you not to.
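
As a rough illustration of the subclass approach, here is a sketch of a
circular list. The class name CircularList is made up for this example,
and I have assumed that only integer indexes wrap around while slices keep
their ordinary linear meaning; a real design would have to decide what a
circular slice should return.

```python
import operator


class CircularList(list):
    """Hypothetical list subclass whose integer indexes wrap around."""

    def __getitem__(self, index):
        if isinstance(index, slice):
            # Assumption: slices behave as for a plain list, but the
            # result is returned as a CircularList.
            return type(self)(super().__getitem__(index))
        if not self:
            raise IndexError("cannot index an empty CircularList")
        # Accept anything usable as an index, then wrap it into range.
        return super().__getitem__(operator.index(index) % len(self))


c = CircularList([10, 20, 30])
print(c[4], c[-1], c[3])
```

Everything else (append, sort, iteration, len and so on) is inherited
from list without any extra work.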
> Part of my summary is based on that conversation with him, and you
> touched on one of the unfinished points; I responded to him that I
> thought __getitem__ was under-developed. The object slice() has no
> knowledge of the size of the sequence; nor can it get that size on its
> own, but must passively wait for it to be given to it.
That's because the slice object is independent of the sequence. As I
demonstrated, you can pass a slice object to multiple sequences. This is
a feature, not a bug.
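
A short sketch of what that independence looks like in practice: one
slice object is created once and then applied to three sequences of
different types and lengths. The variable names are only for
illustration.

```python
# One slice object, created once...
s = slice(1, 10, 2)

# ...applied to sequences of different types and lengths.
text = "abcdefghijkl"[s]
numbers = list(range(100))[s]
pair = (0, 1, 2, 3)[s]

print(text)     # bdfhj
print(numbers)  # [1, 3, 5, 7, 9]
print(pair)     # (1, 3)

# The slice only learns a length when it is handed one explicitly.
print(s.indices(5))  # (1, 5, 2)
```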
> The bottom line is: __getitem__ must always *PASS* len( seq ) to
> slice() each *time* the slice() object is-used.
The bottom line is: even if you are right, so what?
The slice object doesn't know what the length of the sequence is. What
makes you think that __getitem__ passes the length to slice()? Why would
it need to recreate a slice object that already exists?
It is the *sequence*, not the slice object, that is responsible for
extracting the appropriate items when __getitem__ is called. __getitem__
gets a slice object as argument, it doesn't create one. It no more
creates the slice object than mylist[5] creates the int 5.
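
Here is a sketch of that division of labour, using a made-up class MySeq.
Its __getitem__ records whatever index object it is handed, and for a
slice it supplies its own length through the standard slice.indices()
method. The class and the last_index attribute are assumptions for the
example, not anything built in.

```python
import operator


class MySeq:
    """Hypothetical sequence that shows what __getitem__ receives."""

    def __init__(self, items):
        self._items = list(items)
        self.last_index = None

    def __len__(self):
        return len(self._items)

    def __getitem__(self, index):
        # The index arrives ready-made; nothing is created here.
        self.last_index = index
        if isinstance(index, slice):
            # The sequence supplies its own length to the slice.
            start, stop, step = index.indices(len(self))
            return [self._items[i] for i in range(start, stop, step)]
        return self._items[operator.index(index)]


seq = MySeq("abcdef")
print(seq[1:5:2])      # ['b', 'd']
print(seq.last_index)  # slice(1, 5, 2)
```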
> Since this is the case,
But it isn't.
> it would have been better to have list, itself, have a default member
> which takes the raw slice indices and does the conversion itself. The
> size would not need to be duplicated or passed -- memory savings, &
> speed savings...
We have already demonstrated that slice objects are smaller than (x)range
objects and three-item tuples. In Python 3.3:
py> import sys
py> sys.getsizeof(range(1, 10, 2))  # xrange was renamed range in Python 3
py> sys.getsizeof((1, 10, 2))
py> sys.getsizeof(slice(1, 10, 2))
It might help you to be taken seriously if you base your reasoning on
Python as it actually is, rather than counter-factual assumptions.
> I'm just clay pigeoning an idea out here.... Let's apply D'Aprano's
> logic to numpy; Numpy could just have subclassed *list*;
Sure they could have, if numpy arrays were intended to be a small
variation on Python lists. But they weren't, so they didn't.
> so let's ignore
> pure python as a reason to do anything on behalf of Numpy:
> Then, let's consider all third party classes; These are where
> subclassing becomes a pain -- BUT: I think those could all have been
> >>> class ThirdParty( list ): # Pretend this is someone else's...
> ... def __init__(self): return
> ... def __getitem__(self,aSlice): return aSlice
Strange and bizarre semantics for slicing, but okay.
> We know it will default work like this:
> >>> a=ThirdParty()
> >>> a[1:2]
> slice(1, 2, None)
> # So, here's an injection...
> >>> ThirdParty.superOnlyOfNumpy__getitem__ = MyClass.__getitem__
> >>> ThirdParty.__getitem__ = lambda self,aSlice: ( 1, 3,
> self.superOnlyOfNumpy__getitem__(aSlice ).step )
> >>> a[5:6]
> (1, 3, None)
> Numpy could have exported a (workable) function that would modify other
> list functions to affect ONLY numpy data types (eg: a filter). This
> allows user's creating their own classes to inject them with Numpy's
> filter only when they desire;
Sure, the numpy people could have done this, if they were smoking crack.
Have you actually programmed before? Judging from the techniques you seem
to prefer for everyday use (monkey-patching other classes) and techniques
you seem to hate (subclassing), I'm getting the impression you've read
about bleeding edge programming hacks but never actually written code.
Sort of like somebody who has never driven a car, but fantasises about
doing the sort of extreme stunt driving that kills people in real life
and occasionally even stunt drivers. And now you are *insisting* that
everyone should drive like that, *all the time*, because stopping at
traffic lights is so inefficient.
Of course, I could be wrong. Maybe you've been programming for years and
know exactly what you are doing. But if so, you are coming across as
exactly the kind of cowboy coder that I pray to all the gods I never have
to deal with in real life.
> Don't consider the present API legacy for a moment, I'm asking
> hypothetical design questions:
> How many users actually keep slice() around from every instance of [::]
> they use?
Does it matter? It is supported behaviour, so even *one* user is enough.
> If it is rare, why create the slice() object in the first place and
> constantly be allocating and de-allocating memory, twice over? (once for
> the original, and once for the repetitive method which computes dynamic
Huh? As opposed to what? Creating an xrange() object, and constantly
allocating and de-allocating memory? Or a tuple? Same again. Some sort of
object has to be created.
And I have no idea what you are talking about "twice over". What
"repetitive method which computes dynamic values"?
In any case, I return to my comment earlier in this thread: if you have
profiled your application and have hard evidence that creating slice
objects is a bottleneck, then we can talk about optimizing the slice
objects. Until then, you are wasting your time and ours by prematurely
optimizing the wrong parts of your code.
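
If you do want numbers, a measurement along these lines would be a
starting point. It is only a sketch using the standard timeit module: the
list size and repeat count are arbitrary choices of mine, and the results
will depend on your machine and Python version, so I make no claim about
which variant comes out ahead.

```python
import timeit

REPEATS = 100000  # arbitrary; raise it for steadier numbers

# Slicing with the literal syntax.
literal = timeit.timeit(
    "data[1:10:2]",
    setup="data = list(range(1000))",
    number=REPEATS,
)

# Slicing with a slice object that was created once and reused.
reused = timeit.timeit(
    "data[s]",
    setup="data = list(range(1000)); s = slice(1, 10, 2)",
    number=REPEATS,
)

print("literal: {:.4f}s  reused: {:.4f}s".format(literal, reused))
```

Micro-benchmarks like this say nothing about whether slicing matters in
your application; only profiling the application itself can show that.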
> Would a single mutable have less overhead, since it is
> destroyed anyway?
What? This question makes no sense. Why do you think that mutable objects
have "less overhead" than immutable ones?