20 Jul
2011
3:58 p.m.
On Wed, Jul 20, 2011 at 10:48:23PM +0200, Sven Rahmann wrote:
> I'm thinking of suffix arrays (a text indexing data structure) for large texts, e.g. the human genome and its reverse complement (about 6 billion characters from the alphabet ACGT). The suffix array is a long int array of the same length (8 bytes per number, so it occupies about 48 GB of memory).
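To make the sizes in the quoted message concrete, here is a minimal sketch of a suffix array held in an `array.array`, together with the memory arithmetic. It assumes Python 3 and the `'q'` typecode (8-byte signed integers); the helper names `reverse_complement`, `naive_suffix_array` and `estimate_bytes` are made up for illustration, and the naive sort-based construction is only meant for tiny inputs, not for a genome.

```python
from array import array


def reverse_complement(seq):
    """Return the reverse complement of a DNA string over ACGT."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(comp[c] for c in reversed(seq))


def naive_suffix_array(text):
    """Suffix start positions sorted by the suffix they begin.

    Naive construction (sorts by slicing out every suffix), so it is
    only usable on small texts. Stored as 8-byte signed integers.
    """
    order = sorted(range(len(text)), key=lambda i: text[i:])
    return array("q", order)


def estimate_bytes(n_entries, typecode="q"):
    """Raw payload size of an array with n_entries items of typecode."""
    return n_entries * array(typecode).itemsize


if __name__ == "__main__":
    text = "GATTACA"
    sa = naive_suffix_array(text + reverse_complement(text))
    print(list(sa))
    # Genome plus reverse complement: about 6 billion positions.
    print(estimate_bytes(6_000_000_000) / 10**9, "GB")
```

The estimate counts only the raw entries: 6 billion positions at 8 bytes each come to 48 * 10^9 bytes, which matches the 48 GB figure above.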
I doubt array.array was designed to handle data of that size. Why not use bsddb or something similar?

Oleg.
-- 
Oleg Broytman            http://phdru.name/            phd@phdru.name
Programmers don't die, they just GOSUB without RETURN.