On Mar 29, 2005, at 9:11 PM, Travis Oliphant wrote:
There are two distinct issues with regard to large arrays.
1) How do you support > 2 GB memory-mapped arrays on 32-bit systems, and other large-object arrays only a part of which are in memory at any given time (there is an equivalent problem above 8 EB on 64-bit systems; an exabyte is 2^60 bytes, i.e. a gigabyte of gigabytes).
2) Supporting the sequence protocol for in-memory objects on 64-bit systems.
Part 2 can be fixed using the recommendations Martin is making, which will likely be adopted (though it could certainly be done faster). Handling part 1 is more difficult.
One idea is to define some kind of "super object" that mediates between the large file and the in-memory portion. In other words, the ndarray is an in-memory object, while the super object handles interfacing it with a larger structure.
Thoughts?
Maybe I'm missing something, but isn't it possible to mmap part of a large file? In that case one just limits the memory maps to what can be handled on a 32-bit system, leaving it up to the user software to determine which part of the file to mmap. Did you have something more automatic in mind?

As for other large-object arrays, I'm not sure what examples there are other than memory mapping. Do you have any?

Perry
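Perry's suggestion of mapping only part of a file can be illustrated with the modern `numpy.memmap` (the memory-mapping interface at the time of this thread differed, so treat this as a sketch of the idea rather than what was available then): the `offset` and `shape` arguments restrict the mapping to a window that fits comfortably in a 32-bit address space.

```python
import os
import tempfile
import numpy as np

# A sample data file standing in for a "large" file: one million int32s.
path = os.path.join(tempfile.mkdtemp(), "big.dat")
np.arange(1_000_000, dtype=np.int32).tofile(path)

# Map only elements 250_000 .. 250_009 rather than the whole file.
# offset is in bytes; shape bounds the view held in memory.
window = np.memmap(path, dtype=np.int32, mode="r",
                   offset=250_000 * 4, shape=(10,))
print(window[0])  # prints 250000
```

The user code chooses which window to map, which is exactly the manual bookkeeping that a mediating "super object" would be meant to automate.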