
Hi--

I've submitted a pull request for a new method for loading data from text files into a record array/masked record array.

https://github.com/numpy/numpy/pull/143

Click on the link for more info, but the general idea is to create a regular expression for what entries should look like and loop over the file, updating the regular expression if it's wrong. Once the types are determined, the file is loaded line by line into a pre-allocated numpy array.

Compared to genfromtxt, this function has several advantages/potential advantages:

* More modular. (genfromtxt is a rather large, nearly 500-line, monolithic function. In my pull request no individual method is longer than around 80 lines, and they're fairly self-contained.)
* Delimiters can be specified via regexes.
* Missing data can be specified via regexes.
* It's a bit simpler and has sensible defaults.
* It actually works on some (unfortunately proprietary) data that genfromtxt doesn't seem robust enough for.
* It supports datetimes.
* It's fairly extensible for the power user.
* It makes two passes through the file, the first to determine types/sizes for strings and the second to read in the data, and it pre-allocates the array for the second pass. So there's no giant memory bloat when reading large text files.
* It's fairly fast, though I think there is plenty of room for optimization.

All that said, it's entirely possible that the innards which determine the types should be ripped out and submitted as a function on their own.

I'd love suggestions for improvements, as well as suggestions for a better name. (Currently it's called loadtable, which I don't really like. It was just a working name.)

-Chris Jordan-Squire
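
To make the two-pass idea concrete, here is a minimal sketch. It is not the code from the pull request: the names CANDIDATES, infer_types and load_table_sketch, the regular expressions, and the default delimiter are all assumptions made for illustration. It also assumes every non-blank row has the same number of fields, and it leaves out missing data and datetimes.

```python
import re
import numpy as np

# Hypothetical candidate types, tried from most to least specific.
# Anything that matches neither pattern falls back to a string column.
CANDIDATES = [
    (re.compile(r"[+-]?\d+"), int, np.int64),
    (re.compile(r"[+-]?(\d+\.?\d*|\.\d+)([eE][+-]?\d+)?"), float, np.float64),
]


def infer_types(lines, delimiter=r"\s*,\s*"):
    """First pass: find the narrowest type per column and count the rows."""
    split = re.compile(delimiter)
    levels = None  # per-column index into CANDIDATES; len(CANDIDATES) means string
    widths = None  # per-column longest field, used to size string columns
    nrows = 0
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        fields = split.split(line)
        if levels is None:
            levels = [0] * len(fields)
            widths = [1] * len(fields)
        for i, field in enumerate(fields):
            widths[i] = max(widths[i], len(field))
            # Widen the guess for this column until the field matches it.
            while (levels[i] < len(CANDIDATES)
                   and not CANDIDATES[levels[i]][0].fullmatch(field)):
                levels[i] += 1
        nrows += 1
    if nrows == 0:
        raise ValueError("no data rows found")
    return levels, widths, nrows


def load_table_sketch(lines, delimiter=r"\s*,\s*"):
    """Second pass: fill a pre-allocated structured array row by row."""
    lines = list(lines)  # with a real file one would rewind it between passes
    levels, widths, nrows = infer_types(lines, delimiter)
    dtype, converters = [], []
    for i, level in enumerate(levels):
        if level < len(CANDIDATES):
            _, convert, np_type = CANDIDATES[level]
            dtype.append((f"f{i}", np_type))
            converters.append(convert)
        else:
            dtype.append((f"f{i}", f"U{widths[i]}"))
            converters.append(str)
    out = np.empty(nrows, dtype=dtype)  # allocated once, never resized
    split = re.compile(delimiter)
    row = 0
    for line in lines:
        line = line.strip()
        if not line:
            continue
        fields = split.split(line)
        out[row] = tuple(conv(f) for conv, f in zip(converters, fields))
        row += 1
    return out


sample = ["1, 2.5, abc", "2, 3, de", "", "-3, 1e2, fghij"]
table = load_table_sketch(sample)
```

In the sample, the first column stays an integer column, the second is widened to float as soon as "2.5" fails the integer pattern, and the third falls back to a string column sized to the longest entry.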