[Tutor] Problems creating very large arrays (over 10 million indices)

Michael Janssen Janssen@rz.uni-frankfurt.de
Tue Jul 1 12:13:03 2003


On Tue, 1 Jul 2003 DORSEY_EDMUND_K@LILLY.COM wrote:

> Using the following worked and was reasonably fast (about 5 seconds to
> initialize an array with close to 100 million indices):
>
> myArray = [0] * arraySize
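As an aside, for large buffers of small integer samples (such as binary image data), the standard-library array module is a much more compact alternative to a list of Python ints. A minimal sketch, written for modern Python:

```python
from array import array

arraySize = 10000000

# A list stores a pointer per element; array('B') stores one
# unsigned C byte per element, so it is far more compact.
# Repetition works just like [0] * n on lists.
myArray = array('B', [0]) * arraySize

assert len(myArray) == arraySize
assert myArray[0] == 0
```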

Your first testcase:

for i in range(0, arraySize): # arraySize being over 10 million
        myArray.append(0)     # just use zero as a place holder

builds both a "range" list from 0 to arraySize and myArray itself. Besides
the time for the append operations (and the subsequent efforts to acquire
more memory), you pay twice the cost of consuming memory. "xrange" is
better for for-loops over wide ranges: it doesn't build the list but just
iterates through the items (doing some behind-the-scenes magic).
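For comparison, a minimal sketch of the two approaches (note that in modern Python 3, range() is itself lazy like the old xrange, so the loop no longer builds an intermediate list either):

```python
arraySize = 1000000

# Slow: grows the list one append at a time, repeatedly
# reallocating as it outgrows its capacity.
myArray = []
for i in range(arraySize):
    myArray.append(0)

# Fast: allocates and fills the whole list in one step.
myArray2 = [0] * arraySize

assert myArray == myArray2
```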

k = range(10000000) consumes 150 MB (of 512 MB RAM) on my computer (Linux,
i686, glibc?). "del k" frees just 40 MB; exiting the Python interpreter
releases the memory back to its former state (measured with "free -m",
just one test run). That is a lot - but not enough to freeze a machine.
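"free -m" measures the whole process from the outside; from inside Python, sys.getsizeof gives a rough per-object figure. A sketch only - it reports the list structure itself (the pointer array), not the integer objects the list points to:

```python
import sys

k = list(range(1000000))

# getsizeof reports the list's own pointer array: roughly
# 8 bytes per element on a 64-bit build, plus a small header.
size = sys.getsizeof(k)
print(size // len(k), "bytes of list overhead per element")
```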

Michael

>
> Thanks for the help!
>
> ~Edmund
>
> Alan Trautman <ATrautman@perryjudds.com>
> 07/01/2003 09:50 AM
>
>
>         To:     "'DORSEY_EDMUND_K@LILLY.COM'" <DORSEY_EDMUND_K@LILLY.COM>,
> tutor@python.org
>         cc:
>         Subject:        RE: [Tutor] Problems creating very large arrays (over 10 million indices)
>
>
> Edmund,
>
> An array as large as you want will require quite large computing power
> unless the stored data is binary.
>
> Questions:
> What are you storing (generic description) in each array position? Large
> amounts of data will require completely different structures.
>
> Why can't the image be broken up into smaller regions? This is the way
> graphics are normally handled. By breaking images into squares (triangles
> for 3D graphics), the computer only has to deal with one smaller region
> at a time.
>
>
> If you can work on smaller regions, you have a few choices: sort the
> incoming file; pick a small regional array and extract its data from the
> larger file; and/or take each data element as it comes, locate the
> smaller array it belongs to, store it, and move to the next element.
> These are a few methods I can think of; I'm sure the list can come up
> with others.
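The region idea above can be sketched roughly as follows (the names and sizes here are illustrative, not from the thread): split the flat image into fixed-size square tiles and process one small array at a time:

```python
def tile_boxes(width, height, tile):
    """Yield (x0, y0, x1, y1) boxes covering a width x height image."""
    for y0 in range(0, height, tile):
        for x0 in range(0, width, tile):
            yield x0, y0, min(x0 + tile, width), min(y0 + tile, height)

# A 4096x4096 image split into 512x512 regions gives 8 x 8 = 64 tiles,
# so only one 262144-element array needs to be live at a time.
boxes = list(tile_boxes(4096, 4096, 512))
assert len(boxes) == 64
```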
>
>
> HTH
> Alan
>
>
> I need to initialize an array with sizes greater than 10 million indices.
> The reason for this is that I then proceed to fill in the array by
> jumping around.  (The array is used to store binary medical image data.)
> I tried using a for loop like so:
>
>
> for i in range(0, arraySize): #arraySize being over 10 million
>         myArray.append(0)       #just use zero as a place holder
>
> When arraySize is small (under 200,000 or so) it works okay (though very,
> very slowly).  Anything larger than this and it just crashes.  I have 2
> gigs of memory, so I'm not running out of memory.
>
> Two Questions...
>
> 1)  Why does this crash?
> 2) Is there a faster, stable way to do what I want to do
>
> Thank you for all the help
>
> ~Edmund Dorsey
>