[Python-Dev] Slices and "==" optimization

M.-A. Lemburg mal@lemburg.com
Tue, 30 Oct 2001 21:07:21 +0100


"Martin v. Loewis" wrote:
> 
> > > That's why I would like the simple
> > >
> > >     if (v == w) return 0;
> > >
> > > integrated into the ceval loop right along the INT-compare
> > > optimization.
> >
> > Maybe this could be done as follows:
> >
> >     if (v == w && PyString_CheckExact(v)) return 0;
> 
> Maybe I'm missing some context here: where is this fragment supposed
> to go to? into ceval.c:COMPARE_OP? What is the "return 0;" doing then?

It's pseudocode I made up. The real code will look very much
like it, though.
 
> In any case, I think measurements should show how much speed
> improvement is gained by taking this short-cut. It sounds nice in
> theory, but ...
> 
> I just added the block
> 
>                 else if ((v == w) && (oparg == EQ) && PyString_CheckExact(v)) {
>                         x = Py_True;
>                         Py_INCREF(x);
>                 }

You should first test for oparg == EQ, then for v == w and
PyString_CheckExact().
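
To illustrate the test order and why the exact-type check matters,
here is a rough Python model of the fast path. It is only a sketch,
not the ceval.c code: compare_op(), the EQ marker and the Odd class
are names made up for this example.

```python
EQ = "=="  # hypothetical stand-in for the COMPARE_OP argument


def compare_op(v, w, oparg):
    """Model of the proposed shortcut, followed by the generic compare."""
    # Test the operator first, then identity, then the exact type.
    if oparg == EQ and v is w and type(v) is str:
        return True, "fast"
    if oparg == EQ:
        return v == w, "slow"
    raise NotImplementedError(oparg)


class Odd(str):
    """str subclass whose instances never compare equal, even to themselves."""

    def __eq__(self, other):
        return False

    __hash__ = str.__hash__


s = "spam"
o = Odd("spam")

plain_result = compare_op(s, s, EQ)  # exact str: shortcut applies
odd_result = compare_op(o, o, EQ)    # subclass: must call its __eq__
```

Without the exact-type test, the shortcut would report o == o as
true and silently change the semantics of the subclass.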
 
> into the code. In an application that almost exclusively does
> COMPARE_OPs on identical strings, I got a 30% speed-up. 

Nice :-)

> OTOH, this
> same code caused a 10% slowdown if I converted the "==" into "<>".

This sounds like an alignment problem caused by the compiler.
Speedups or slowdowns of 10% can easily be produced by randomly
moving a few cases around in the ceval loop. It all depends on the
platform and compiler, though.
 
> In a real application, total speed-up will depend on two things:
> - how many COMPARE_OPs are done in the code?
> - how many of those compare identical strings for equality?

As I explained in my other reply, I have code which does:

	if variable == 'constant string': ...

Since the compiler interns the 'constant string' and I can
make sure that variable is interned too (by calling intern()),
I can easily take advantage of the optimization without
giving up any semantics.
 
If variable happens to be some other constant which is
used in a Python module (e.g. some kind of flag or parameter),
then it will already be interned as well.
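
A small sketch of the interning idea, written for a current Python
where the function is sys.intern() rather than the builtin intern().
The names FLAG and make_runtime_copy() are made up for the example,
and FLAG is interned explicitly so the sketch does not depend on
which literals a particular compiler version interns on its own.

```python
import sys

# Assumption: interned explicitly instead of relying on the compiler.
FLAG = sys.intern("constant string")


def make_runtime_copy():
    # Build an equal string at run time, so it starts out as a
    # separate object from FLAG.
    return "".join(["constant", " ", "string"])


raw = make_runtime_copy()
variable = sys.intern(raw)  # now the very same object as FLAG

same_value = (raw == FLAG)          # equality always holds
same_object = (variable is FLAG)    # identity holds after interning
```

Once variable is interned, "variable == FLAG" compares an object
with itself, which is exactly the case the shortcut targets.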

Note that the dictionary implementation relies heavily
on the same optimization (interning was added to enhance
string compare performance in dict lookups).
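
The identity shortcut in dict lookups can be observed from Python.
This is a sketch against a current CPython, with a made-up Key class
that counts calls to its __eq__ method. It shows the pointer check
only, not the 2001 implementation itself.

```python
class Key:
    """Hypothetical key type that counts how often __eq__ is called."""

    eq_calls = 0

    def __init__(self, name):
        self.name = name

    def __hash__(self):
        return hash(self.name)

    def __eq__(self, other):
        Key.eq_calls += 1
        return isinstance(other, Key) and self.name == other.name


k = Key("tag")
table = {k: 1}

Key.eq_calls = 0
value_identical = table[k]          # same object: identity check suffices
calls_identical = Key.eq_calls

twin = Key("tag")
value_equal = table[twin]           # equal but distinct: __eq__ is needed
calls_equal = Key.eq_calls - calls_identical
```

Looking up the identical key object finds the entry without calling
__eq__ at all, while the equal-but-distinct key has to go through
the full comparison.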

> Running the PyXML test suite, I counted 120000 cases where
> slow_compare was done, and only 700 cases where identical strings
> were compared for equality.

As Fred mentioned, this is probably due to the fact that
Unicode objects are not interned. Neither are typical
XML tag names; that could make a difference, but I don't know.

I believe that PyXML spends most of the time in Python 
calls and not so much in string compares (and that's what I'm 
trying to avoid in my XML parser approach ;-).

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Consulting & Company:                           http://www.egenix.com/
Python Software:                        http://www.lemburg.com/python/