[Python-Dev] other "magic strings" issues

Alex Martelli aleaxit at yahoo.com
Fri Nov 7 17:44:46 EST 2003


On Friday 07 November 2003 22:16, Martin v. Löwis wrote:
> Alex Martelli <aleaxit at yahoo.com> writes:
> > Very interesting!  To me, this suggests fixing this performance bug
> > -- there is no reason that I can see why the .is* methods should be
> > _slower_.  Would a performance bugfix (no implementation change,
> > just a speedup) be OK for 2.3.3, I hope?
>
> Yes, but I doubt you can do much about it. I also fail to see how it is

I dunno -- it seems that (on a toy case where an 'in' test takes 0.25 usec
and an .isdigit takes 0.52 to 0.55) we can shave the time to 0.39, about
in-between, by avoiding the generation of a bound-method.  Now of course
saving 25% or so isn't huge, but maybe it's still worth it...?
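A rough way to reproduce this kind of comparison is sketched below with the standard timeit module. The helper name time_digit_tests and the choice of xx = '5' are assumptions made for illustration, not the benchmark actually used for the figures above, and absolute numbers will differ across interpreters and versions (newer CPython releases optimize plain method calls, so the gap may be much smaller).

```python
import timeit


def time_digit_tests(number=200_000):
    """Return microseconds per call for three ways of testing a one-char string.

    Hypothetical helper: the statements mirror the idioms discussed in the
    thread, but this is not the original benchmark.
    """
    setup = (
        "import string\n"
        "xx = '5'\n"
        "digits = string.digits\n"
        "bound = xx.isdigit  # attribute lookup done once, outside the timed code\n"
    )
    statements = {
        "in test": "xx in digits",
        "lookup + call": "xx.isdigit()",
        "pre-bound call": "bound()",
    }
    results = {}
    for label, stmt in statements.items():
        total = timeit.timeit(stmt, setup=setup, number=number)
        results[label] = total / number * 1e6  # seconds -> microseconds per call
    return results


if __name__ == "__main__":
    for label, usec in time_digit_tests().items():
        print("%-15s %.3f usec" % (label, usec))
```

The "pre-bound call" row only isolates the cost of the attribute lookup; it is not the same thing as the "return a function" trick, which changes what the lookup itself produces.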

> relevant to ascii_letters. .islower is locale-aware, so it is your C
> library which does the bulk of the work.

Ah -- interesting point!  So, for example:

f = xx.islower
print f()
# insert locale change here
print f()

should be able to print two distinct values for appropriate values of
xx and locale changes, right?  Hmmm -- if supporting this usage is crucial
then indeed we can't avoid generating a boundmethod (for .islower and
other locale-aware .is* methods), because the "return a function" approach
is basically evaluating the function at attribute-access time... if locale
changes between the attribute-access time and the moment of the call,
then the result may not be as desired.  Funny, among the deleterious effects
of locale-changing's "subterraneous global effects" I had not considered
this one -- it breaks nice conditions we might otherwise have counted on
thanks to strings' immutability and the parameterless nature of the .is...()
methods.  Oh well, I guess the trick is not worth pursuing just for the sake
of .isdigit and .isspace, then, if "locale change between access and call"
must be supported.  Pity, because despite the C library's amount of work,
the overhead of the bound-method generation is not trivial, as Fred 
mentioned.

So, if the fast idiom is _inevitably_ "if xx in ...:" (thanks in part to the 
fact that we _don't_ have to support locale changes in the middle of
things in this case), then perhaps we should stop touting xx.is...() as
superior, and see about offering the best possible support for the
'in' case -- where my remarks about "accidental successes" of, e.g.,
"if xx in ...digits...:" when xx=="23" but not when xx=="32" stand.  We
can't break "if xx in string.digits:" (maybe somebody's relying on the
test succeeding when xx is a sequence of adjacent increasing digits?)
but we can surely choose, if we wish, to define the semantics of
"if xx in str.digits:" in an (IMHO) more helpful-against-errors way....
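To make the "accidental success" concrete, and to show one possible shape for stricter semantics, here is a minimal sketch. The CharSet class and its raise-on-anything-but-one-character behaviour are purely hypothetical; the thread does not specify what the semantics of str.digits should be, and this is only one of several reasonable choices.

```python
import string


def naive_is_digit(xx):
    # Substring test: succeeds for any run of adjacent increasing digits,
    # and also for the empty string.
    return xx in string.digits


class CharSet:
    """Hypothetical container whose 'in' test accepts single characters only."""

    def __init__(self, chars):
        self._chars = frozenset(chars)

    def __contains__(self, item):
        if not isinstance(item, str) or len(item) != 1:
            # Fail loudly instead of succeeding or failing by accident.
            raise ValueError("expected a single character, got %r" % (item,))
        return item in self._chars


digits = CharSet(string.digits)  # stand-in for the proposed str.digits

if __name__ == "__main__":
    print(naive_is_digit("23"), naive_is_digit("32"))
    print("5" in digits, "x" in digits)
```

With such a container, "if xx in digits:" raises for a multi-character xx rather than quietly depending on the order of the digits.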


Alex



