[Python-Dev] Question regarding: Lib/_markupbase.py

Mon Feb 11 20:02:04 CET 2013

Warning: see http://bugs.python.org/issue17170. Depending on the length of
the string being scanned and the probability of finding the specific
character, the proposed change could actually be a *pessimization*. OTOH if
the character occurs many times, the slice will actually cause O(N**2)
behavior. So yes, it depends greatly on the distribution of the input data.
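
A minimal sketch of the kind of experiment asked for further down the thread, in case it helps. The helper names (end_of_decl_slice, end_of_decl_find, compare), the input sizes and the -1 fallback are assumptions made for illustration only, not code from Lib/_markupbase.py. No timings are claimed here; they will vary with the interpreter version and the input.

```python
import timeit


def end_of_decl_slice(rawdata, j):
    """Pattern quoted in the thread: test a slice copy, then search again."""
    if '>' in rawdata[j:]:          # copies rawdata[j:] before scanning it
        return rawdata.find(">", j) + 1
    return -1                       # assumed fallback for this sketch


def end_of_decl_find(rawdata, j):
    """Proposed pattern: a single find() call, no slice copy."""
    pos = rawdata.find(">", j)
    if pos != -1:
        return pos + 1
    return -1                       # assumed fallback for this sketch


def compare(rawdata, j, number=200):
    """Return (seconds for slice variant, seconds for find variant)."""
    t_slice = timeit.timeit(lambda: end_of_decl_slice(rawdata, j), number=number)
    t_find = timeit.timeit(lambda: end_of_decl_find(rawdata, j), number=number)
    return t_slice, t_find


if __name__ == "__main__":
    n = 100_000  # arbitrary size chosen for the sketch
    cases = {
        "'>' right after j": ">" + "x" * n,
        "'>' at the very end": "x" * n + ">",
        "no '>' at all": "x" * n,
    }
    for label, data in cases.items():
        t_slice, t_find = compare(data, 0)
        print(f"{label:22s} slice+find: {t_slice:.6f}s  find only: {t_find:.6f}s")
```

Varying where '>' sits relative to j, and how often the function is called on the same buffer, is what exercises the dependence on the input distribution described above.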

On Mon, Feb 11, 2013 at 4:37 AM, Oleg Broytman <phd at phdru.name> wrote:

> On Mon, Feb 11, 2013 at 12:16:48PM +0000, Developer Developer
> <just_another_developer at yahoo.de> wrote:
> > I was having a look at the file: Lib/_markupbase.py (@ 82151), function:
> > "_parse_doctype_element" and have seen something that has caught my
> > attention:
> >
> > if '>' in rawdata[j:]:
> >     return rawdata.find(">", j) + 1
> >
> >
> > Wouldn't it be better to do the following?
> > pos = rawdata.find(">", j)
> > if pos != -1:
> >     return pos + 1
> >
> > Otherwise I think we are scanning rawdata[j:] twice.
>
>    Is it really a significant optimization? Can you do an experiment and
> show figures?
>
> Oleg.
> --
>      Oleg Broytman            http://phdru.name/            phd at phdru.name
>            Programmers don't die, they just GOSUB without RETURN.

-- 
--Guido van Rossum (python.org/~guido)