[Numpy-discussion] genfromtxt universal newline support

Derek Homeier derek at astro.physik.uni-goettingen.de
Mon Jun 30 10:47:56 EDT 2014


On 30 Jun 2014, at 04:39 pm, Nathaniel Smith <njs at pobox.com> wrote:

> On Mon, Jun 30, 2014 at 12:33 PM, Julian Taylor
> <jtaylor.debian at googlemail.com> wrote:
>> genfromtxt and loadtxt need an almost full rewrite to fix the botched
>> python3 conversion of these functions. There are a couple threads
>> about this on this list already.
>> There are numerous PRs fixing stuff in these functions which I
>> currently all -1'd because we need to fix the underlying unicode
>> issues first.
>> I have a PR where I started this for loadtxt but it is incredibly
>> annoying to try to support all the broken use cases the function
>> accidentally supported.
>> 
>> 1.9 beta still uses the broken functions because I had no time to get
>> this done correctly.
>> But we should probably put a big fat future warning into the release
>> notes that genfromtxt and loadtxt may stop working for your binary
>> streams.

What binary streams?

>> That will probably allow us to start fixing these functions.
> 
> +1 to doing the proper fix instead of piling up buggy hacks. Do we
> understand the difference between the current code and the "proper"
> code well enough to detect cases where they differ and issue warnings
> in those cases specifically?

Does it make sense to keep maintaining both functions at all? IIRC the idea that
loadtxt would be the faster of the two was discarded long ago, so there seems to be
very little, if anything, that loadtxt can do that genfromtxt cannot do just as well.
The main compatibility issue is probably the different default behaviour and interface
of the two, but perhaps that would be best solved by replacing loadtxt with another
genfromtxt wrapper?
A real need, which has also been discussed at length, is a truly performant text IO
function (i.e. one using a compiled ASCII number parser, and ideally also a more
memory-efficient one), but unfortunately everyone interested in implementing this
seems to have drifted away (myself included)…
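
To make the wrapper idea concrete, here is a rough sketch of a loadtxt-style front end
that just forwards to genfromtxt. The name loadtxt_via_genfromtxt is made up for
illustration and this is not how NumPy implements either function; it covers only a
handful of loadtxt arguments, and behaviour still differs in places (e.g. genfromtxt
fills empty fields with nan where loadtxt would raise).

```python
import io

import numpy as np


def loadtxt_via_genfromtxt(fname, dtype=float, comments="#", delimiter=None,
                           converters=None, skiprows=0, usecols=None,
                           unpack=False):
    """Hypothetical loadtxt-like wrapper around np.genfromtxt (illustration only)."""
    return np.genfromtxt(
        fname,
        dtype=dtype,
        comments=comments,
        delimiter=delimiter,
        converters=converters,
        skip_header=skiprows,  # loadtxt calls this argument "skiprows"
        usecols=usecols,
        unpack=unpack,
        invalid_raise=True,    # raise on rows with the wrong number of columns
    )


# Small usage example on an in-memory text stream.
sample = io.StringIO("# x y z\n1 2 3\n4 5 6\n")
table = loadtxt_via_genfromtxt(sample)
print(table.shape)
```

The only real translation in this sketch is skiprows to skip_header; everything else
passes straight through, which is roughly why a thin wrapper seems feasible.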

Cheers,
						Derek



