Is there a document that details which objects are cached in memory (to not create the same object multiple times, for performance)? If not, could somebody please point me to where this is implemented for strings? Thank you!
. Facundo
Blog: http://www.taniquetil.com.ar/plog/
PyAr: http://www.python.org/ar/
[Facundo Batista]
Is there a document that details which objects are cached in memory (to not create the same object multiple times, for performance)?
The caches get cleaned up before Python exits, so you can find them all listed together in the code in Python/pythonrun.c:
    /* Sundry finalizers */
    PyMethod_Fini();
    PyFrame_Fini();
    PyCFunction_Fini();
    PyTuple_Fini();
    PyList_Fini();
    PyString_Fini();
    PyInt_Fini();
    PyFloat_Fini();
    #ifdef Py_USING_UNICODE
    /* Cleanup Unicode implementation */
    _PyUnicode_Fini();
    #endif
Raymond Hettinger
Facundo Batista wrote:
Is there a document that details which objects are cached in memory (to not create the same object multiple times, for performance)?
why do you think you need to know?
If not, could somebody please point me to where this is implemented for strings?
Objects/stringobject.c (where else? ;-) </F>
On 4/22/05, Fredrik Lundh wrote:
Is there a document that details which objects are cached in memory (to not create the same object multiple times, for performance)?
why do you think you need to know?
I was in my second class of the Python workshop I'm giving at a university here in Argentina, and I was explaining how to think using name/object and not variable/value. Using id() to be pedagogic about the objects, the kids saw that id(3) was always the same, but id([]) was not. I explained to them that Python, in some circumstances, caches the object, and I kept them happy enough. But I really don't know which objects, or in which circumstances.
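The name/object way of thinking mentioned above can be illustrated with a short sketch. This is only an illustration, not the workshop's actual material; the function name `demo_names_and_objects` is hypothetical.

```python
def demo_names_and_objects():
    """Show that names are bound to objects rather than holding values."""
    a = [1, 2]       # bind the name a to a new list object
    b = a            # bind a second name to the very same object
    b.append(3)      # mutate the object through b
    c = list(a)      # build a separate object with equal contents
    return {
        "same_object": a is b,           # both names refer to one object
        "same_id": id(a) == id(b),       # id() agrees with the is test
        "a_after_append": a,             # the change is visible through a
        "copy_equal": c == a,            # equal contents
        "copy_is_distinct": c is not a,  # but a different object
    }


if __name__ == "__main__":
    for key, value in demo_names_and_objects().items():
        print(key, value)
```

Comparing `id()` values here says nothing about caching; it only shows which names refer to the same object.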
I was in my second class of the Python workshop I'm giving at a university here in Argentina, and I was explaining how to think using name/object and not variable/value.
Using id() to be pedagogic about the objects, the kids saw that id(3) was always the same, but id([]) was not. I explained to them that Python, in some circumstances, caches the object, and I kept them happy enough.
But I really don't know which objects, or in which circumstances.
Aargh! Bad explanation. Or at least you're missing something: *mutable* objects (like lists) can *never* be cached, because they have explicit object semantics. For example each time the expression [] is evaluated it *must* produce a fresh list object (though it may be recycled from a GC'ed list object -- or any other GC'ed object, for that matter).
But for *immutable* objects (like numbers, strings and tuples) the implementation is free to use caching. In practice, I believe ints between -5 and 100 are cached, and 1-character strings are often cached (but not always).
Hope this helps! I would think this is in the docs somewhere but probably not in a place where one would ever think to look...
--
--Guido van Rossum (home page: http://www.python.org/~guido/)
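The distinction drawn above can be sketched in Python. This is a minimal sketch under stated assumptions: the function names `fresh_lists` and `int_identity` are hypothetical, and the result of `int_identity` is a CPython implementation detail that may differ between versions and implementations, so no expected output is given for it.

```python
def fresh_lists():
    """Evaluating [] twice must give two distinct list objects."""
    a = []
    b = []
    a.append(1)          # mutating a must not affect b
    return a is b, a, b


def int_identity(n):
    """Report whether two ints built at run time are the same object.

    The answer depends on the implementation's caching, so treat it as
    an observation and never rely on it in real code.
    """
    x = int(str(n))      # built at run time, not a shared compile-time constant
    y = int(str(n))
    return x is y


if __name__ == "__main__":
    print(fresh_lists())         # the first element is guaranteed to be False
    print(int_identity(3))       # implementation dependent
    print(int_identity(10**6))   # implementation dependent
```

Only the list result is guaranteed by the language; use `==` rather than `is` when comparing numbers or strings.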
Guido:
But for *immutable* objects (like numbers, strings and tuples) the implementation is free to use caching. In practice, I believe ints between -5 and 100 are cached, and 1-character strings are often cached (but not always).
Hope this helps! I would think this is in the docs somewhere but probably not in a place where one would ever think to look...
-----------
I am sure that the fact that immutables *may* be cached is in the ref manual, but I have been under the impression that the private, *mutable* specifics for CPython are intentionally omitted so that people will not think of them as either fixed or as part of the language/library.
I have previously suggested that there be a separate doc for CPython implementation details like this that some people want but which are not part of the language or library definition.
Terry J. Reedy
"Terry Reedy"
Guido:
But for *immutable* objects (like numbers, strings and tuples) the implementation is free to use caching. In practice, I believe ints between -5 and 100 are cached, and 1-character strings are often cached (but not always).
Hope this helps! I would think this is in the docs somewhere but probably not in a place where one would ever think to look...
-----------
To be clearer, the above quotes what Guido wrote in the post of his that I am responding to. Only the below is my response.
I am sure that the fact that immutables *may* be cached is in the ref manual, but I have been under the impression that the private, *mutable* specifics for CPython are intentionally omitted so that people will not think of them as either fixed or as part of the language/library.
I have previously suggested that there be a separate doc for CPython implementation details like this that some people want but which are not part of the language or library definition.
Terry J. Reedy
Guido van Rossum wrote:
But for *immutable* objects (like numbers, strings and tuples) the implementation is free to use caching. In practice, I believe ints between -5 and 100 are cached, and 1-character strings are often cached (but not always).
Also, string literals that resemble Python identifiers are often interned, although this is not guaranteed. And this only applies to literals, not strings constructed dynamically by the program (unless you explicitly apply intern() to them).
    Python 2.3.4 (#1, Jun 30 2004, 16:47:37)
    [GCC 3.2 20020903 (Red Hat Linux 8.0 3.2-7)] on linux2
    Type "help", "copyright", "credits" or "license" for more information.
    >>> "foo" is "foo"
    True
    >>> "foo" is "f" + "oo"
    False
    >>> "foo" is intern("f" + "oo")
    True
-- Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg.ewing@canterbury.ac.nz +--------------------------------------+
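The explicit interning described in the message above can be sketched for current Python versions, where the built-in `intern()` used in the thread is available as `sys.intern()`. The helper names `build` and `same_after_intern` are hypothetical and exist only for this illustration.

```python
import sys


def build(parts):
    """Construct a string dynamically at run time from smaller pieces."""
    return "".join(parts)


def same_after_intern(s, t):
    """Return True if s and t map to one object once both are interned.

    For equal strings, sys.intern() returns the same canonical object.
    """
    return sys.intern(s) is sys.intern(t)


if __name__ == "__main__":
    first = build(["f", "oo"])
    second = build(["fo", "o"])
    print(first == second)                   # equal contents
    print(same_after_intern(first, second))  # one shared object after interning
```

Whether `first is second` holds without interning is left to the implementation, which is why the sketch does not test it.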
On 4/26/05, Greg Ewing wrote:
Also, string literals that resemble Python identifiers are often interned, although this is not guaranteed. And this only applies to literals, not strings constructed dynamically by the program (unless you explicitly apply intern() to them).
This simplifies the whole thing. If the issue arises again, my speech will be: "Don't worry about that, Python worries for you". :D And if *someone* in particular stays interested in it (I'm pretty sure the whole class won't), I'll explain it to him better, and with more time. Thank you!
On 4/25/05, Guido van Rossum wrote:
I was in my second class of the Python workshop I'm giving at a university here in Argentina, and I was explaining how to think using name/object and not variable/value.
Using id() to be pedagogic about the objects, the kids saw that id(3) was always the same, but id([]) was not. I explained to them that Python, in some circumstances, caches the object, and I kept them happy enough.
But I really don't know which objects, or in which circumstances.
Aargh! Bad explanation. Or at least you're missing something:
Not really. It's easier for me to show that id(3) is always the same and id([]) is not, and let the kids see that it's not so simple and that you'll have to look deeper if you want to know more. If I had used id(3) and id(500), the difference would have looked more subtle, and I would have had to explain it at greater length. Remember, it was the second day (2 hours per day).
But for *immutable* objects (like numbers, strings and tuples) the implementation is free to use caching. In practice, I believe ints between -5 and 100 are cached, and 1-character strings are often cached (but not always).
These are exactly my doubts ;).
Facundo Batista wrote:
Aargh! Bad explanation. Or at least you're missing something:
Not really. It's easier for me to show that id(3) is always the same and id([]) is not, and let the kids see that it's not so simple and that you'll have to look deeper if you want to know more.
I think Guido was saying that it's important for them to know that mutable objects are never in danger of being shared, so you should at least tell them that much. Otherwise they may end up worrying unnecessarily that two of their lists might get shared somehow behind their back.
-- Greg Ewing
Facundo Batista
Is there a document that details which objects are cached in memory (to not create the same object multiple times, for performance)?
No.
If not, could somebody please point me to where this is implemented for strings?
In PyString_FromStringAndSize and PyString_FromString, it seems to me.
Cheers, mwh
--
I also feel it essential to note, [...], that Description Logics, non-Monotonic Logics, Default Logics and Circumscription Logics can all collectively go suck a cow. Thank you.
-- http://advogato.org/person/Johnath/diary.html?start=4
participants (7)
- Facundo Batista
- Fredrik Lundh
- Greg Ewing
- Guido van Rossum
- Michael Hudson
- Raymond Hettinger
- Terry Reedy