[Numpy-discussion] deprecate fromstring() for text reading?

Derek Homeier derek at astro.physik.uni-goettingen.de
Wed Nov 4 15:00:19 EST 2015


On 3 Nov 2015, at 6:03 pm, Chris Barker - NOAA Federal <chris.barker at noaa.gov> wrote:
> 
> I was more aiming to point out a situation where NumPy's text file reader was significantly better than the Pandas version, so we would want to make sure that we properly benchmark any significant changes to NumPy's text reading code. Who knows where else NumPy beats Pandas?
> Indeed. For this example, I think a fixed-width reader really is a different animal, and it's probably a good idea to have a high performance one in Numpy. Among other things, you wouldn't want it to try to auto-determine data types or anything like that.
> 
> I think what's on the table now is to bring in a new delimited reader -- I.e. CSV in its various flavors.
> 
To add my own handful of change, or at least another data point: I had been looking into both
the pandas and the Astropy fast readers as a fast loadtxt/genfromtxt replacement. At the time
I found the Astropy cparser source somewhat easier to dig into, although looking at it now, pandas'
parser.pyx seems clear enough as well.
Some comparison of the two can be found at
http://astropy.readthedocs.org/en/stable/io/ascii/fast_ascii_io.html#speed-gains

Unfortunately the Astropy fast reader currently does not support fixed-width format either, and
adding this functionality would require modifications to the tokenizer C code - not sure how
extensive.
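
As a rough illustration of why fixed-width support touches the tokenizer itself, here is a
minimal pure-Python sketch. It is not the Astropy or pandas code; the function names, the
sample rows and the column widths are all made up for the example. A delimited tokenizer
finds field boundaries by scanning the data for a separator character, whereas a fixed-width
tokenizer takes the boundaries from a given list of column widths and never looks for a
separator at all.

```python
from itertools import accumulate


def tokenize_delimited(line, delimiter=","):
    """Hypothetical helper: field boundaries are found in the data itself."""
    return [field.strip() for field in line.split(delimiter)]


def tokenize_fixed_width(line, widths):
    """Hypothetical helper: field boundaries come from the given column widths."""
    ends = list(accumulate(widths))   # exclusive end offset of each field
    starts = [0] + ends[:-1]          # start offset of each field
    return [line[s:e].strip() for s, e in zip(starts, ends)]


# Assumed sample rows; the fixed-width layout (4, 4, 3) is invented for illustration.
delimited_row = "12, 3.5,abc"
fixed_row = "  12 3.5abc"

print(tokenize_delimited(delimited_row))
print(tokenize_fixed_width(fixed_row, (4, 4, 3)))
```

Note that in the fixed-width row the second and third fields touch without any separator,
which is exactly the case a delimiter-scanning tokenizer cannot handle.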

Cheers,
					Derek



