[Python-Dev] other "magic strings" issues

Alex Martelli aleaxit at yahoo.com
Sat Nov 8 11:09:34 EST 2003


On Saturday 08 November 2003 10:53, Martin v. Löwis wrote:
> Alex Martelli <aleaxit at yahoo.com> writes:
> > I dunno -- it seems that (on a toy case where an 'in' test takes 0.25
> > usec and an .isdigit takes 0.52 to 0.55) we can shave the time to 0.39,
> > about in-between, by avoiding the generation of a bound-method.  Now of
> > course saving 25% or so isn't huge, but maybe it's still worth it...?
>
> If you can avoid creating bound methods in the general case, that
> would be a good thing. Even avoiding them for strings only would
> be valuable, although I would then ask that you extend your strategy
> to lists.

Lists are mutable, which makes "creating bound methods" (or the equivalent 
thereof) absolutely unavoidable -- e.g.:
    xxx = somelist.somemethod
    " alter somelist at will "
    yyy = xxx( <args if needed> )

xxx needs to be able to refer back to somelist at call time, clearly.
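A concrete version of the snippet above, as a minimal sketch in current Python 3 syntax (the particular list and the choice of count are only for illustration, not anything from the original discussion):

```python
somelist = [1, 2, 3]

# Fetching the attribute builds a bound method that remembers somelist itself.
xxx = somelist.count

# Alter somelist at will, after the fetch but before the call.
somelist.extend([1, 1])

# The call sees the list as it is *now*, so it must still hold a reference.
yyy = xxx(1)

# The bound method exposes the object it refers back to.
refers_back = xxx.__self__ is somelist
```

Here yyy reflects the list contents at call time, not at fetch time, which is why the reference back to somelist cannot be dropped.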

This problem doesn't necessarily apply to method calls _on immutable
objects_ -- as long as their results are not affected by other mutable
"global" aspects of "the environment" in ways which also depend on
the object they were originally called on.

The is... methods of strings would be just perfect -- were it not for the
influence of locale.  Consider isdigit, which isn't influenced by locale.
When x.isdigit is ACCESSED, we can direct that access through a
getter, which, upon examining x's value at that time, KNOWS what the
call will have to return -- whenever the call happens.  So, the getter can
return a callable that always returns True when called, or one that
always returns False when called -- no need to create *new* callable
objects for either, we can just keep two callables around for the purpose
and incref them as needed.

Few situations are as favourable as this one -- immutable object, no
arguments, just two possible constant-returning callables needed.  I
just think it might be worth taking advantage of these rare circumstances,
where feasible, to avoid wasting a little bit of performance.  I think that
this can be done in 2.3.* without changing Python-observable behavior
in any way whatsoever -- just that if, e.g., we do it for both isdigit and
isspace (the two non-locale-dependent string is* methods, I believe),
we'll need 4 callables rather than 2 so that their __name__ and __doc__
attributes can be indistinguishable from the current versions thereof.
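To make the "4 callables" point concrete, here is a small sketch, again in pure Python and only under the assumption that copying __name__ and __doc__ is what matters. The helper make_constant and the CONSTANTS table are hypothetical names for this example.

```python
def make_constant(name, doc, value):
    """Build a zero-argument callable that always returns value."""
    def constant():
        return value
    # Make the callable look like the method it stands in for.
    constant.__name__ = name
    constant.__doc__ = doc
    return constant


# One True-returning and one False-returning callable per method: 2 x 2 = 4.
CONSTANTS = {
    (name, value): make_constant(name, getattr(str, name).__doc__, value)
    for name in ("isdigit", "isspace")
    for value in (True, False)
}

example = CONSTANTS[("isspace", True)]
```

A getter for isspace would then pick between CONSTANTS[("isspace", True)] and CONSTANTS[("isspace", False)], and likewise for isdigit.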


> It's not crucial, but it would be an incompatible change to change it.
>
> However, this is irrelevant with respect to bound methods. The
> locale-awareness is in the code of the function, so if you manage to
> invoke that at the point of the call (instead of caching its result),
> then it would still be compatible.

Nope, because the locale-dependent part needs to be applied to
the actual string on which, e.g., isupper is being called.  Therefore,
since locale-dependency applies at call-time, we need a way _at
call-time_ to get to the actual string... i.e., a bound-method or its
equivalent, alas.  The above optimization is feasible only when
attribute-fetch-time behavior can be substituted for call-time behavior.


Alex



