[Numpy-discussion] genloadtxt: second serving
Manuel Metz
mmetz at astro.uni-bonn.de
Thu Dec 4 07:22:33 EST 2008
Pierre GM wrote:
> All,
> Here's the second round of genloadtxt. It's a slightly cleaner version than
> the previous one, in which I tried to take into account the various
> comments and suggestions that were posted. Tabs should now be supported,
> and explicit whitespace is no longer collapsed.
> FYI, in the __main__ section, you'll find 2 hotshot tests and a timeit
> comparison: same input, no missing data, one with genloadtxt, one with
> np.loadtxt and a last one with matplotlib.mlab.csv2rec.
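
For reference, the difference between collapsing whitespace and honouring an explicit delimiter can be seen with plain Python string splitting. This is only a sketch of the behaviour described above, not genloadtxt's actual splitter, and the sample line is made up:

```python
# A tab-separated line whose second field is empty (made-up sample data).
line = "1.5\t\t3.0\n"

# The default split() collapses runs of whitespace, so the empty field vanishes.
collapsed = line.split()

# Splitting on an explicit tab keeps the empty field in its column.
explicit = line.rstrip("\n").split("\t")

print(collapsed)  # ['1.5', '3.0']
print(explicit)   # ['1.5', '', '3.0']
```

Keeping the empty field matters when columns must stay aligned, for instance when a missing value sits between two valid ones.
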
>
> As you'll see, genloadtxt is roughly half as fast as np.loadtxt, but
> twice as fast as csv2rec. One explanation for the slowness is
> the use of classes for splitting lines and converting values.
> Instead of a basic function, we use the __call__ method of the class,
> which itself calls another function depending on the attribute values.
> I'd like to reduce this overhead; any suggestion is more than welcome,
> as usual.
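
The extra call layer described here can be measured in isolation. The sketch below is a rough illustration under my own assumptions: `LineSplitter` and `split_plain` are hypothetical names, not the classes from the attached file, and the timings will vary by machine and Python version.

```python
import timeit


def split_plain(line, delimiter="\t"):
    """Plain function: split a line on an explicit delimiter."""
    return line.split(delimiter)


class LineSplitter:
    """Hypothetical callable splitter that dispatches on an attribute."""

    def __init__(self, delimiter=None):
        self.delimiter = delimiter
        # Pick the splitting strategy once, at construction time.
        if delimiter is None:
            self._handler = self._split_whitespace
        else:
            self._handler = self._split_delimited

    def _split_whitespace(self, line):
        return line.split()

    def _split_delimited(self, line):
        return line.split(self.delimiter)

    def __call__(self, line):
        # Extra layer: __call__ forwards to the method selected above.
        return self._handler(line)


line = "1.0\t2.0\t3.0\t4.0"
splitter = LineSplitter("\t")
bound = splitter._handler  # bound method, skips the __call__ layer

t_plain = timeit.timeit(lambda: split_plain(line), number=100000)
t_class = timeit.timeit(lambda: splitter(line), number=100000)
t_bound = timeit.timeit(lambda: bound(line), number=100000)

print("plain function: %.4f s" % t_plain)
print("__call__      : %.4f s" % t_class)
print("bound method  : %.4f s" % t_bound)
```

One possible way to trim the overhead while keeping the classes is to fetch the bound method once outside the parsing loop, as `bound` does here, so that each line costs one call instead of two.
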
>
> Anyhow: as we do need speed, I suggest we put genloadtxt somewhere in
> numpy.ma, with an alias recfromcsv for John, using his defaults. Unless
> somebody comes up with a brilliant optimization.
Will loadtxt in that case remain as is? Or will the _faulttolerantconv
class be used?
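
To make the question concrete: by a fault-tolerant converter I mean something along the lines of the sketch below. It is a hypothetical illustration only, not the actual _faulttolerantconv class from the attachment; the class name, the `default` parameter and the sample row are all my own assumptions.

```python
class FaultTolerantConverter:
    """Hypothetical converter that returns a default instead of raising."""

    def __init__(self, func, default=None):
        self.func = func
        self.default = default

    def __call__(self, value):
        try:
            return self.func(value)
        except ValueError:
            # Missing or malformed field: fall back to the default value.
            return self.default


to_float = FaultTolerantConverter(float, default=float("nan"))

# Made-up row with one empty field and one non-numeric field.
row = [to_float(field) for field in "1.0,,N/A,4.5".split(",")]
print(row)  # [1.0, nan, nan, 4.5]
```

A strict converter would stop at the first bad field, so the choice between the two behaviours changes what loadtxt does with incomplete files.
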
mm