[Numpy-discussion] Fast Reading of ASCII files

Bruce Southey bsouthey at gmail.com
Wed Dec 14 10:11:24 EST 2011


On 12/14/2011 01:03 AM, Chris Barker wrote:
>
>
> On Tue, Dec 13, 2011 at 1:21 PM, Ralf Gommers 
> <ralf.gommers at googlemail.com> wrote:
>
>
>         genfromtxt sure looks close for an API
>
>
>     This I don't agree with. It has a huge amount of keywords that
>     just confuse or intimidate a beginning user. There should be a
>     dead simple interface, even the loadtxt API is on the heavy side.
>
>
> well, yes, though it does do a lot -- do you have a simpler one in mind?
>
> But anyway, the really simple cases are really simple, even with 
> genfromtxt.
>
> I guess it's a matter of debate about what is a better API:
>
> a few functions, each adding a layer of sophistication
>
> or
>
> one function, with layers of sophistication added with an array of 
> keyword arguments.
>
> In either case, though, I wish the multiple layers of functionality 
> were built on the same, well-optimized core code.
>
> -Chris
>
>
>
I am not sure that you can even create a simple API here, as even 
Python's csv module is rather complex, especially given that it only 
reads data as strings. It also 'hides' many arguments in the Dialect 
class, although these are just the collection of 7 'fmtparam' arguments. 
It also provides the Sniffer class, which tries to find the correct 
format that can then be passed to the reader function. You then still 
have to convert the data into the required types, which means another 
set of arguments as well as yet another pass through the data.
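
To make those steps concrete, here is a minimal sketch of the csv 
workflow: sniff the dialect, read the rows as strings, then convert in a 
separate pass. The sample text and the per-column converters are made up 
for illustration.

```python
import csv
import io

# Hypothetical sample data, invented for this illustration.
text = "x;y;label\n1;2.5;a\n3;4.5;b\n"

# Step 1: let Sniffer guess the dialect from a sample of the data.
dialect = csv.Sniffer().sniff(text, delimiters=";,\t")

# Step 2: read the data; every field comes back as a string.
rows = list(csv.reader(io.StringIO(text), dialect))
header, body = rows[0], rows[1:]

# Step 3: a second pass over the data to convert each column,
# which needs its own set of arguments (one converter per column).
converters = (int, float, str)
records = [
    tuple(convert(field) for convert, field in zip(converters, row))
    for row in body
]
```

So even this small case involves three objects (Sniffer, a dialect and 
the reader) plus a hand-written conversion step.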

In comparison, genfromtxt can perform sniffing, and both genfromtxt and 
loadtxt can read and convert the data. They also add some useful 
features like skipping rows (at the start, at the end, and commented 
ones) and columns. However, it might be possible to create a sniffer 
function and a single data reader function, leading to a 'simple' reader 
function, but that probably would not change the API of the underlying 
data reader function.
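
A rough sketch of that split, using only the standard library. This is 
not how genfromtxt or loadtxt are implemented; sniff_format, read_data 
and simple_read are hypothetical names, and the keyword names are only 
loosely modelled on the existing readers.

```python
import csv


def _is_number(field):
    """Return True if the string parses as a float."""
    try:
        float(field)
    except ValueError:
        return False
    return True


def _data_lines(text, comments):
    """Drop blank lines and lines that start with the comment marker."""
    return [
        line for line in text.splitlines()
        if line.strip() and not line.lstrip().startswith(comments)
    ]


def sniff_format(text, comments="#"):
    """Hypothetical sniffer: guess keyword arguments for read_data."""
    lines = _data_lines(text, comments)
    delimiter = csv.Sniffer().sniff("\n".join(lines), delimiters=",;\t").delimiter
    first = next(csv.reader(lines[:1], delimiter=delimiter))
    # Treat the first row as a header if any field is not numeric.
    skip_header = 0 if all(_is_number(f) for f in first) else 1
    return {"delimiter": delimiter, "comments": comments, "skip_header": skip_header}


def read_data(text, delimiter=",", comments="#", skip_header=0,
              usecols=None, converter=float):
    """Hypothetical full reader: it still carries all the keywords."""
    lines = _data_lines(text, comments)
    rows = []
    # skip_header counts rows after comment lines have been removed.
    for fields in csv.reader(lines[skip_header:], delimiter=delimiter):
        if usecols is not None:
            fields = [fields[i] for i in usecols]
        rows.append([converter(f) for f in fields])
    return rows


def simple_read(text):
    """The 'simple' reader: sniff first, then call the full reader."""
    return read_data(text, **sniff_format(text))


# Made-up sample input for illustration.
sample = "# made-up sample\nx,y\n1,2.5\n3,4.5\n"
simple_result = simple_read(sample)
second_column = read_data(sample, skip_header=1, usecols=[1])
```

simple_read takes a single argument, but read_data keeps every keyword, 
which is the point above: the simple wrapper does not shrink the API of 
the underlying reader.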

Bruce

