[Numpy-discussion] genloadtxt: second serving

Manuel Metz mmetz at astro.uni-bonn.de
Thu Dec 4 07:22:33 EST 2008


Pierre GM wrote:
> All,
> Here's the second round of genloadtxt. That's a tad cleaner than the 
> previous version, and it tries to take into account the different 
> comments and suggestions that were posted. So, tabs should be supported 
> and explicit whitespace is not collapsed.
> FYI, in the __main__ section, you'll find 2 hotshot tests and a timeit 
> comparison: same input, no missing data, one with genloadtxt, one with 
> np.loadtxt and a last one with matplotlib.mlab.csv2rec.
> 
> As you'll see, genloadtxt is roughly twice as slow as np.loadtxt, but 
> twice as fast as csv2rec. One explanation for the slowness is 
> indeed the use of classes for splitting lines and converting values. 
> Instead of a basic function, we use the __call__ method of the class, 
> which itself calls another function depending on the attribute values. 
> I'd like to reduce this overhead; any suggestion is more than welcome, 
> as usual.
> 
> Anyhow: as we do need speed, I suggest we put genloadtxt somewhere in 
> numpy.ma, with an alias recfromcsv for John, using his defaults. Unless 
> somebody comes up with a brilliant optimization.
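
To make the overhead described above concrete, here is a minimal sketch 
comparing a plain splitting function with a class whose __call__ method 
forwards to another function chosen from its attributes. The names 
split_plain and LineSplitter are hypothetical stand-ins for illustration; 
this is not the genloadtxt code, and the timings will vary by machine.

```python
import timeit


def split_plain(line, delimiter=","):
    """Plain function: split a line on a fixed delimiter."""
    return [field.strip() for field in line.split(delimiter)]


class LineSplitter:
    """Hypothetical splitter class (an assumption, not the genloadtxt one)."""

    def __init__(self, delimiter=None):
        self.delimiter = delimiter
        # Pick the worker once, based on the attribute value.
        if delimiter is None:
            self._handler = self._split_whitespace
        else:
            self._handler = self._split_delimited

    def _split_whitespace(self, line):
        return line.split()

    def _split_delimited(self, line):
        return [field.strip() for field in line.split(self.delimiter)]

    def __call__(self, line):
        # One extra call level compared with the plain function.
        return self._handler(line)


line = "1.0, 2.0, 3.0, 4.0"
splitter = LineSplitter(",")

# Binding the chosen worker to a local name skips the __call__ layer.
direct = splitter._handler

n = 20000
t_plain = timeit.timeit(lambda: split_plain(line), number=n)
t_class = timeit.timeit(lambda: splitter(line), number=n)
t_direct = timeit.timeit(lambda: direct(line), number=n)
print("plain function: %.4f s" % t_plain)
print("class __call__: %.4f s" % t_class)
print("bound handler:  %.4f s" % t_direct)
```

All three variants return the same fields; only the number of call levels 
per line differs. Whether binding the handler directly recovers a useful 
share of the gap in the real code would have to be measured there.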

Will loadtxt in that case remain as is? Or will the _faulttolerantconv 
class be used?

mm


