[SciPy-user] issues while loading scatter data file with load() from pylab...

fred fredmfp at gmail.com
Thu Sep 13 08:27:24 EDT 2007


Gael Varoquaux wrote:
> On Thu, Sep 13, 2007 at 12:40:41PM +0200, fred wrote:
>   
>> First question.
>> Using the load() function from pylab, the array returned is float64.
>> Is it possible to load it directly as float32?
>> I don't need double precision,
>> and I saw no option for this in load().
>>
>> The issue.
>>
>> My scatter data has ~7x1e6 points,
>> stored as x, y, z, v per line.
>>
>> Using a short C program with fscanf, it takes 12 s and ~240 MB to load
>> it as doubles.
>> Fine.
>>
>> Using load() from pylab to load this file takes forever and needs more
>> than 1 GB.
>
> Did you try something less "swiss army knife" than pylab.load? For instance
> scipy.io.read_array or something home-baked? As pylab.load tries to
> accommodate all sorts of weird things, and is very versatile, I bet
> something more targeted would be quicker.
>   
Well, I have modified my code to read the scatter data from a binary file.
No more, no less ;-)
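
For the record, here is a minimal sketch of what reading such a binary file
can look like with NumPy. It assumes the file is a flat dump of float32
values in native byte order, four per point (x, y, z, v); the helper names
write_scatter_binary and read_scatter_binary and the file name scatter.bin
are made up for illustration and are not the code I actually use.

```python
import os
import tempfile

import numpy as np


def write_scatter_binary(path, points):
    """Dump an (N, 4) array of x, y, z, v rows as raw float32 values."""
    np.asarray(points, dtype=np.float32).tofile(path)


def read_scatter_binary(path, ncols=4):
    """Read raw float32 values and reshape them into (N, ncols) rows."""
    data = np.fromfile(path, dtype=np.float32)
    return data.reshape(-1, ncols)


# Small round-trip demo on synthetic data (hypothetical, not the real file).
rng = np.random.default_rng(0)
points = rng.random((1000, 4))

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "scatter.bin")
    write_scatter_binary(path, points)
    loaded = read_scatter_binary(path)
    file_size = os.path.getsize(path)
```

With this layout each point costs 4 values x 4 bytes = 16 bytes, so about
7e6 points should fit in roughly 112 MB, about half of what the same data
needs as doubles.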
> Another solution is to store the data in a format better suited to large
> data. For instance HDF5 with PyTables.
>   
I'll look at it when we have to process data files of several GB ;-)

Thanks.

-- 
http://scipy.org/FredericPetit



