How many is "too many" with lists?

Tue May 2 14:27:21 EDT 2000

Matthew Hirsch wrote:
> 
> Generating the list is slow.  I'm accessing each element with a for
> loop.  I was just curious.

So you mean not access to the list, but creating the list?

Do you build the list by append()?

--- on the speed issue ---

Note that append() is somewhat optimized, but that doesn't
take you very far. For small lists, the allocation is rounded up to
the next multiple of ten. For larger lists, this is rounded up
to 100.
In other words: If you use appand() and append() over and over,
You must expect a complete copy of your list on every 100 element.
With your 296010 elements, this means that you are doing about
2960 list allocations with a mean length of 148005 elements.
Calculating per element, this is a mean overhead of 1480 per
element, somehow.

If you can estimate the size of your list in advance and do your
own allocation, you are better off. For instance, keep an extra
list length variable instead of using len(), and increase your
list by a constant factor (1.5, maybe) whenever it needs to grow.

--- on the memory issue ---

Well, another issue is total memory use, which may in fact be
much worse than the above, if your list elements do not fit
into memory. You may assume that the machine slows down by a
factor of 1000 or worse if thrashing becomes serious.

Let's see what you have. A list of lists (hint: use tuples, they
are smaller, due to headers and the round-up rule!),
each list contains six integers.

Cost calculation:

1 int = 3 words. But since you are using numbers below 100, you
are lucky: These objects are shared and come at no cost.

A list is a header of 4 words, pointing to a memory chunk.
In your case, due to round-up, the chunk will be 10 words long,
and due to common memory allocaiton schemes, it will probably
be 12 words or more.
So each of your sublists counts as 16 words, plus the one
reference that is sitting in you big list. The rough estimate
of total size is then

>>> 296010 * (16+1) * 4
20128680
>>> 

bytes. If you are on a 16MB machine, this gives already horrible
performance. 32MB may still be problematic if you are using Windows.

As an alternative, use tuples inside your list. Tuples are
contiguous and therefore have less malloc overhead. And they do
not use spare allocation. A tuple is

3 word header plus one word per entry. In your case, this makes
(unfortunate) 9, probably rounded up to 10 or 12 words my malloc.
Let's hope for 10:

>>> 296010 * (10+1) * 4
13024440
>>> 

Much better, right?

Now this is still a waste of memory, unless you really need fast
access to your integers. There are several ways to save this space.
Is it feasible for you to encode your numbers as characters?
It would make very much sense to build a string of digits, instead
of using tuples with integers, and indexing strings is quick.

A string is a (quite) expensive structure with 5 words header. The
rest is the number of bytes you store, plus 1, rounded up by malloc.
In your case, we have

(5 * 4) + 6 + 1 = 27 byte, probably rounded up to 32 which is 8 words.

Result:
>>> 296010 * (8+1) * 4
10656360
>>> 

Ok, that's still not cheapest possible.
But if you can afford a little computation time in favor of
a smaller memory footprint, how about this:
Simply pack your 6-tuples into one integer by multiplication.

If your "digits" are limited to the range(27), and if you don't
need more than 6 digits, you can pack them together using
bit shifts by 5, and 6 digits will fit into an integer.

The cost of an integer is only 3 words, with nearly no overhead,
since ints are allocated in blocks. This gives you an overall
result of

>>> 296010 * (3+1) * 4
4736160
>>> 

Quite compact, isn't it?

Well, and if you are really mad at memory footprint, then
you may consider the array module: You are storing integers
always, so why pay for the universality of lists and tuples?
Array store your ints natively.

>>> a=array.array("I", range(296010))
>>> len(a)
296010
>>> 

This object, despite the little header, has really only
296010 words of content. This is 1184040 byte, about
17 times better than the original approach.

Ok, and as the last proposal, leaving the 5bit-packing in favor
of more ease, how about the Numerical Python package?
After installation, you can do something like

>>> import Numeric
>>> a=Numeric.array(1000*[range(6)], Numeric.Int8)

This gives you a 1000 by 6 array of tiny 8-bit integers,
which should be able to do what you need, and you have everything
in the right shape. I'm not sure about the exact object
layout in NumPy, but my guess is that we now are at

>>> 296010*6
1776060
>>> 

which is still very good, and performance should be very high.
But I think you have to allocate the array in advance. I'm not
so used to NumPy.

Note that the use of "flat" array does not give you Python
objects. The whole array is an object which contains plain
numbers. On every access, a new integer ubject must be created.
Generally, these arrays perform a bit worse than lists or
tuples. But luckily not in your case, since our small integers
don't create new objects all the time, but you always get
the already existing small integer objects from Python.

hope this helps - chris

-- 
Christian Tismer             :^)   <mailto:tismer at appliedbiometrics.com>
Applied Biometrics GmbH      :     Have a break! Take a ride on Python's
Kaunstr. 26                  :    *Starship* http://starship.python.net
14163 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
     where do you want to jump today?   http://www.stackless.com