Reading bz2 file into numpy array

Peter Otten __peter__ at web.de
Tue Nov 23 05:18:52 EST 2010


Nobody wrote:

> On Mon, 22 Nov 2010 11:37:22 +0100, Peter Otten wrote:
> 
>>> is there a convenient way to read bz2 files into a numpy array?
>> 
>> Try
> 
>> f = bz2.BZ2File(filename)
>> data = numpy.fromstring(f.read(), numpy.float32)
> 
> That's going to hurt if the file is large.

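Yes, but memory usage will peak at only about 2*sizeof(data): once for the
decompressed string returned by f.read(), and once for the array that
numpy.fromstring() copies it into. Most scripts need more data than just a
single numpy.array anyway.
In short: the OP is unlikely to run into the problem.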
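
For reference, a minimal sketch of the in-memory approach as a function. The
name read_bz2_array and the float32 default are only illustrative. It uses
numpy.frombuffer() plus an explicit copy, because current NumPy releases
deprecate numpy.fromstring() for binary data; the peak is still about twice
the decompressed size.

```python
import bz2

import numpy as np


def read_bz2_array(filename, dtype=np.float32):
    """Decompress the whole file into memory and return it as an array.

    Hypothetical helper, not part of bz2 or numpy.
    """
    with bz2.BZ2File(filename) as f:
        raw = f.read()  # the entire decompressed payload as one bytes object

    # frombuffer() gives a read-only view onto `raw`; copy() makes an
    # independent, writable array. Both exist briefly at the same time,
    # hence the peak of roughly 2*sizeof(data).
    return np.frombuffer(raw, dtype=dtype).copy()
```

The file is assumed to contain nothing but raw float32 values in native byte
order, as in the original question.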

> You might be better off either extracting to a temporary file, or creating
> a pipe with numpy.fromfile() reading the pipe and either a thread or
> subprocess decompressing the data into the pipe.
 
I like to keep it simple, so if available RAM turns out to be the limiting
factor, I think extracting the data into a temporary file is a good backup
plan.
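
A rough sketch of that backup plan, assuming the same raw float32 layout. The
function name is made up for illustration. The data is decompressed in chunks
into a temporary file, so only the final array has to fit into RAM.

```python
import bz2
import os
import shutil
import tempfile

import numpy as np


def read_bz2_array_via_tempfile(filename, dtype=np.float32):
    """Decompress to a temporary file, then load it with numpy.fromfile().

    Hypothetical helper, not part of bz2 or numpy.
    """
    # delete=False so the file survives close() and can be reopened by name,
    # which also works on platforms that refuse to open a file twice.
    tmp = tempfile.NamedTemporaryFile(delete=False)
    try:
        with tmp, bz2.BZ2File(filename) as src:
            # copyfileobj() streams in fixed-size chunks, so the decompressed
            # data is never held in memory as a whole.
            shutil.copyfileobj(src, tmp)
        return np.fromfile(tmp.name, dtype=dtype)
    finally:
        os.remove(tmp.name)  # clean up the temporary file in every case
```

This trades memory for disk space and some extra I/O: the temporary file needs
as much room as the decompressed data.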

Peter


