why python cache the string > 256?
Gabriel Genellina
gagsl-py2 at yahoo.com.ar
Tue Oct 27 02:39:26 EDT 2009
On Mon, 26 Oct 2009 23:42:54 -0300, s7v7nislands <s7v7nislands at gmail.com>
wrote:
> On Oct 27, 4:03 am, "Diez B. Roggisch" <de... at nospam.web.de> wrote:
>> s7v7nislands schrieb:
>>
>> > test.py
>> > #!/usr/bin/python
>> > a = []
>> > for i in xrange(1000000):
>> >     a.append('a'*500)
>>
>> > $ python -i test.py #virt mem 514m in top output
>> > >>> del a #virt mem 510m
>>
>> > Why does Python cache these strings?
>> > In the source, I see that when an object's size > SMALL_REQUEST_THRESHOLD,
>> > Python uses malloc,
>> > and calls free() when the string's refcount == 0.
>>
>> > Am I missing something?
>>
>> http://effbot.org/pyfaq/why-doesnt-python-release-the-memory-when-i-d...
>
> Thanks, but it doesn't explain why strings bigger than
> SMALL_REQUEST_THRESHOLD are cached.
> In the source, it seems memory is only allocated from the cache pool when
> the object's size < SMALL_REQUEST_THRESHOLD (256);
> when size > SMALL_REQUEST_THRESHOLD, it uses malloc directly.
> See void *PyObject_Malloc(size_t nbytes) in Objects/obmalloc.c
>
> I know about range and xrange's memory use. The int and str objects have
> their own memory managers on top of Python's object allocator.
> But why doesn't Python free the memory of big strings, whose size >
> SMALL_REQUEST_THRESHOLD?

It does - it isn't Python that keeps those memory blocks, but the C stdlib.
Apparently free() doesn't return the memory blocks to the OS, but keeps
them internally as free blocks.

If you repeat the cycle, you should see that memory usage doesn't
grow beyond the previous value; when those big strings are malloc()ed
again, the C stdlib fulfills the memory requests using the free blocks it
originally got from the OS.
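
A rough way to repeat the cycle and watch the numbers yourself is sketched
below. It is not from the original test.py: it assumes Python 3 (range
instead of xrange) and a Linux-style /proc/self/status, and the helper names
read_vm_kb and run_cycles are made up for illustration. On platforms without
/proc it simply reports None instead of a measurement.

```python
import os


def read_vm_kb():
    """Return (VmSize, VmRSS) in kB from /proc/self/status, or None.

    Assumption: a Linux-style /proc filesystem. Elsewhere this returns None.
    """
    path = "/proc/self/status"
    if not os.path.exists(path):
        return None
    fields = {}
    with open(path) as f:
        for line in f:
            key, _, value = line.partition(":")
            if key in ("VmSize", "VmRSS"):
                # Lines look like "VmSize:   123456 kB"
                fields[key] = int(value.split()[0])
    return fields.get("VmSize"), fields.get("VmRSS")


def run_cycles(cycles=3, count=1000000, size=500):
    """Create and delete a big list of strings several times.

    Returns a list of (label, usage) samples taken after each step.
    """
    samples = []
    for n in range(cycles):
        # 'size' is a variable so that each iteration builds a new string
        # object rather than reusing one precomputed constant.
        a = ['a' * size for _ in range(count)]
        samples.append(("cycle %d: after create" % n, read_vm_kb()))
        del a
        samples.append(("cycle %d: after del" % n, read_vm_kb()))
    return samples


if __name__ == "__main__":
    for label, usage in run_cycles():
        print(label, usage)
```

If the C library behaves as described, the "after create" figures should stay
about the same from one cycle to the next instead of growing each time. The
exact numbers depend on the platform and the C library in use.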

This is what I see on Windows (using the pslist utility from sysinternals):

# memory usage right when Python starts:
D:\USERDATA\Gabriel>pslist -m python

PsList 1.23 - Process Information Lister
Copyright (C) 1999-2002 Mark Russinovich
Sysinternals - www.sysinternals.com

Process memory detail for LEPTON:

Name     Pid      VM      WS   WS Pk    Priv  Faults  NonP  Page  PageFile
python  3672   32916    4732    4732    2828    1212     2    55      2828
# after creating the big list of strings
python  3672  622892  533820  533820  531724  141737     2    55    531724
# after `del a`
python  3672  618712    6656  533828    4212  141747     2    55      4212
# after re-creating the list
python  3672  622960  533844  533844  531732  281037     2    55    531732
# after `del a` again, repeating a few times
python  3672  618712    6708  533844    4244  281037     2    55      4244
python  3672  618712    6672  533848    4220  420317     2    55      4220
python  3672  618712    6696  533848    4252  559608     2    55      4252
python  3672  618712    6680  533848    4228  698891     2    55      4228

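Note the 'VM' and 'Priv' columns (virtual address space and private bytes,
respectively): after the first allocation VM stays within 618712-622960,
while Priv falls back to about 4200 after each `del a`.
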
--
Gabriel Genellina