[Python-3000] Making more effective use of slice objects in Py3k

Guido van Rossum guido at python.org
Tue Aug 29 17:42:18 CEST 2006


On 8/28/06, Josiah Carlson <jcarlson at uci.edu> wrote:
>
> "Guido van Rossum" <guido at python.org> wrote:
> > Those are all microbenchmarks. It's easy to prove the superiority of
> > an approach that way. But what about realistic applications? What if
> > your views don't end up saving memory or time for an application, but
> > still cost in terms of added complexity in all string operations?
>
> At no point has anyone claimed that every operation on views will always
> be faster than on strings.  Nor has anyone claimed that it will always
> reduce memory consumption.  However, for a not insignificant number of
> operations, views can be faster, offer better memory use, etc.
>
>
> I agree with Jean-Paul Calderone:
>
> "If the goal is to avoid speeding up Python programs because views are
> too complex or unpythonic or whatever, fine.  But there isn't really any
> question as to whether or not this is a real optimization."

And without qualification that is as false as anything you've said.

> "I don't think we see people overusing buffer() in ways which damage
> readability now, and buffer is even a builtin.  Tossing something off
> into a module somewhere shouldn't really be a problem.  To most people
> who don't actually know what they're doing, the idea to optimize code
> by reducing memory copying usually just doesn't come up."

Another "yes they do -- no they don't" argument. As I've said
repeatedly before, optimizations are likely to be copied without being
understood by newbies. The buffer() built-in has such a poor
reputation and API that it doesn't get much play; but a new "views"
feature that will magically make all your string processing go faster
surely will.

> While there are examples where views can be slower, this is no different
> than the cases where deque is slower than list; sometimes some data
> structures are more applicable to the problem than others.  As we have
> given users the choice to use a structure that has been optimized for
> certain behaviors (set and deque being primary examples), this is just
> another structure that offers improved performance for some operations.

As long as it is very carefully presented as such I have much less of
a problem with it.

Earlier proposals were implying that all string ops should return
views whenever possible. That, I believe, is never going to fly, and
that's where my main objection lies.

Having views in a library module alleviates many of my objections.
While I still worry that it will be overused, deque doesn't seem to be
overused, so perhaps I should relax.

> > Then I ask you to make it so that string views are 99.999%
> > indistinguishable from strings -- they have all the same methods, are
> > usable everywhere else, etc.
>
> For reference, I'm about 2 hours into it (including re-reading the
> documentation for Pyrex), and I've got [r]partition, [r]find, [r]index,
> [r|l]strip. I don't see significant difficulty implementing all other
> methods on views.
>
> Astute readers of the original implementation will note that I never
> check that the argument being passed in is a string; I use the buffer
> interface, so anything offering the buffer interface can be seen as a
> read-only view with string methods attached.  Expect a full
> implementation later this week.

Good luck!
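
For readers following the thread, here is a minimal sketch of the
trade-off under discussion. It is not Josiah's Pyrex implementation: it
assumes the memoryview built-in, which arrived in later Python versions,
as a stand-in for the proposed string views, and the variable names are
made up for illustration. It only shows that slicing a view shares the
underlying buffer while slicing the object itself copies.

```python
# Illustrative only: memoryview stands in for the proposed string views.
data = bytearray(b"key=value; other=stuff")
view = memoryview(data)

# Slicing a memoryview creates a new view object, not a copy of the bytes.
value_view = view[4:9]
# Slicing the bytearray itself copies the selected bytes.
value_copy = data[4:9]

# Overwrite the underlying buffer in place (same length, so no resize).
data[4:9] = b"VALUE"

seen_by_view = value_view.tobytes()  # reflects the in-place change
seen_by_copy = bytes(value_copy)     # still holds the old contents

# Anything exposing the buffer interface can be wrapped the same way;
# a view over an immutable object is read-only.
readonly_view = memoryview(b"immutable bytes")[:9]
```

The sharing cuts both ways: the view avoids a copy, but it also keeps
the whole underlying buffer alive and sees later changes to it, which is
part of why a view may or may not help a given application.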

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)