[Python-3000] Making more effective use of slice objects in Py3k

Sun Aug 27 17:50:39 CEST 2006

On 8/26/06, Jim Jewett <jimjjewett at gmail.com> wrote:
> > > As I understand it, Nick is suggesting that slice
> > > objects be used as a sequence (not just string)
> > > view.
>
> > I have a hard time parsing this sentence. A slice is
> > an object with three immutable attributes -- start,
> > stop, step. How does this double as a string view?
>
> Poor wording on my part; it is (the application of a slice to a
> specific sequence) that could act as a copyless view.
>
> For example, you wanted to keep the rarely used optional arguments to
> find because of efficiency.

I don't believe they are rarely used. They are (currently) essential
for code that searches a long string for a short substring repeatedly.
If you believe that is a rare use case, why bother coming up with a
whole new language feature to support it?
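
To make that use case concrete, here is a minimal sketch of the
repeated-search loop. The helper name find_all is made up for
illustration; the only real API it relies on is str.find with its
optional start argument.

```python
def find_all(s, t):
    """Return the start index of every non-overlapping occurrence of t in s.

    find_all is a hypothetical helper, not part of the str API.
    """
    if not t:
        raise ValueError("empty substring")
    positions = []
    pos = s.find(t)
    while pos != -1:
        positions.append(pos)
        # Resume just past the match; s itself is never sliced or copied.
        pos = s.find(t, pos + len(t))
    return positions
```

Writing the same loop with s[pos:].find(t) would copy the tail of the
string on every iteration, which is what the optional argument avoids.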

>     s.find(prefix, start, stop)
>
> does not copy.

That's still really poor wording. If you want to make your case you
should take more time explaining it right.

> If slices were less eager at copying, this could be
> rewritten as
>
>     view=slice(start, stop, 1)
>     view(s).find(prefix)

Now you're postulating that calling a slice will take a slice of an
object? Any object? And how is that supposed to work for arbitrary
objects? I would think that it ought to be a method on the string
object -- surely a view on a string will have to be a different type
of object than a view on a list and that ought to be different again
from a view on a unicode string. Also you're postulating that the
slice object somehow has the same methods as the thing it slices? How
are you expecting to implement that? (Don't tell me that you haven't
thought about implementation yet. Without an implementation plan there
is no feature.)
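
As a rough sketch of what such a plan would have to spell out, here is
a separate view type for str that records its bounds and forwards find
to the existing start/stop arguments. StringView and all of its methods
are hypothetical names for illustration; no such type exists, and every
string method would have to be written out by hand in the same way.

```python
class StringView:
    """Hypothetical copy-free view on a str (illustration only)."""

    def __init__(self, s, start=0, stop=None):
        self._s = s
        # Normalize the bounds the way slicing would, with step fixed at 1.
        self._start, self._stop, _ = slice(start, stop, 1).indices(len(s))
        if self._stop < self._start:
            self._stop = self._start  # an empty view, like s[5:2]

    def __len__(self):
        return self._stop - self._start

    def find(self, sub):
        # Delegate to str.find with explicit bounds; no substring is built.
        pos = self._s.find(sub, self._start, self._stop)
        # Report the position relative to the view, as a real slice would.
        return -1 if pos == -1 else pos - self._start

    def __str__(self):
        # Materializing the view is the one place where a copy is made.
        return self._s[self._start:self._stop]
```

Even this toy version only handles str and only one method; a view on a
list or on a unicode string would need its own type.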

> or perhaps even as
>
>     s[start:stop].find(prefix)

That will never fly. NumPy may get away with non-copying slices, but
for built-in objects this would be too big of a departure from current
practice. (If you don't stop about this I'll have to add it to PEP
3099. :-)

> I'm not sure these look better, but they are less surprising, because
> they don't depend on optional arguments that most people have
> forgotten about.

Because they're not that important except to the few people who really
need the optimization. Also they're easily looked up.

> > Maybe the idea is that instead of
>
> >   pos = s.find(t, pos)
>
> > we would write
>
> >   pos += stringview(s)[pos:].find(t)
>
> > ???
>
> With stringviews, you wouldn't need to be reindexing from the start of
> the original string.  The idiom would instead be a generalization of
> "for line in file:"
>
>     while data:
>         chunk, sep, data = data.partition()
>
> but the partition call would not need to copy the entire string; it
> could simply return three views.

That depends. I can imagine situations where the indices are needed
regardless of how you code it.

> Yes, this does risk keeping all of data alive because one chunk was
> saved.  This might be a reasonable tradeoff to avoid the copying.  If
> not, perhaps the gc system could be augmented to shrink bloated views
> during idle moments.

Keep dreaming on. It really seems you have no clue about
implementation issues; you just keep postulating random solutions
whenever you're faced with an objection.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)