[Python-Dev] Alternative implementation of interning, take 2

M.-A. Lemburg mal@lemburg.com
Fri, 12 Jul 2002 23:04:15 +0200


Tim Peters wrote:
> [M.-A. Lemburg, on my "does ii help at all?" ii patch]
> 
>>Cool, I'll try that... hmm, I'll have to backport it to Python 2.1.3
>>though ;-)
> 
> 
> Your codebase doesn't run under current CVS?  If so, I would have guessed
> you would have mentioned that before this <wink>.

I don't test against the current CVS -- no time for that.

>>Scanning the source code: I hardly use PyDict_SetItem(); most usages
>>are PyDict_SetItemString().
> 
> 
> That's why you shouldn't try to guess.  The latter calls the former, and the
> real target here is actually indirect optimization of different ways to
> spell setattr.  They all end up in PyDict_SetItem; it doesn't matter whether
> you call that directly.

Sure, but SetItemString() does some extra magic: it interns the
key for me.

>>>+ PyString_InternInPlace:  Whenever it pays here, the patch spits
>>>
>>>    ii paid on an InternInPlace
>>
> 
>>I do use this API, but only in mxURL and mxXMLTools (which is
>>closed source and works with the evil code below I mentioned ;-).
> 
> 
> As mentioned before, the optimization in this doesn't do you any good
> overall unless it triggers in PyDict_SetItem() later.  If it doesn't trigger
> in the latter, your code will run faster overall if we removed the
> optimization from PyString_InternInPlace (although probably not measurably
> faster in this routine; a never-pays anti-optimization in PyDict_SetItem is
> a much more serious matter).

I only use PyString_InternInPlace() on strings which will be
used as dict keys or for string compares in tokenizers and
parsers.
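
To illustrate the dict-key case, here is a minimal Python-level sketch of the same idea, not the C code in the mx extensions. It assumes a modern Python 3, where the intern() builtin lives at sys.intern(); make_key() and the sample strings are hypothetical.

```python
import sys

def make_key(parts):
    """Build a string at runtime and intern it (hypothetical helper)."""
    return sys.intern("".join(parts))

# Two equal strings assembled from different pieces at runtime.
k1 = make_key(["hre", "f"])
k2 = make_key(["h", "ref"])

# After interning, equal strings are the same object, so a dict lookup
# can succeed on the identity check without comparing characters.
table = {k1: "attribute"}
same_object = k1 is k2
value = table[k2]
```

Interning costs one extra table lookup when the key is created, so it only pays off if the same key is later looked up or compared more than once.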

>>>Ya, while that's evil, it's not affected by indirect interning.
>>
> 
>>Cool :-)
>>
>>If Guido should ever decide to rip this out,
> 
> 
> He won't, but it's quite likely to either not do you any good, or actually
> do you harm, in an alternate implementation of Python (e.g., I doubt-- but
> don't know --that Jython bothers with this).

Yeah, yeah... as soon as PEP 275 is implemented I won't have to
worry any more :-)

>>I can always switch to a different technique, e.g. use my own interning
>>token type.
> 
> 
> Or you could call intern() explicitly.  That's what I usually do.
> 
> IF_TOKEN, ELSE_TOKEN, ... = map(intern, "if else ...".split())

True, but Python's compiler already does this for me. You're right,
though; I should make this explicit...
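
A rough sketch of what the explicit spelling might look like in a tokenizer, assuming a modern Python 3 (sys.intern() instead of the old intern() builtin). The token set and the classify() helper are made up for illustration.

```python
import sys

# Hypothetical token constants, interned explicitly instead of relying on
# the compiler to intern identifier-like string literals.
IF_TOKEN, ELSE_TOKEN, WHILE_TOKEN = map(sys.intern, "if else while".split())

def classify(word):
    """Label a word by comparing interned strings by identity."""
    # Strings produced at runtime must be interned before `is` is safe.
    word = sys.intern(word)
    if word is IF_TOKEN:
        return "IF"
    if word is ELSE_TOKEN:
        return "ELSE"
    if word is WHILE_TOKEN:
        return "WHILE"
    return "NAME"

labels = [classify(w) for w in "if x else y".split()]
```

The identity tests are only valid because both sides went through the same intern table; comparing a non-interned string with `is` would be a bug, and `==` stays the safe default.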

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
_______________________________________________________________________
eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,...
Python Consulting:                               http://www.egenix.com/
Python Software:                    http://www.egenix.com/files/python/