variable number of columns in loadtxt/genfromtxt
Hi, I commonly have to deal with legacy ASCII files that don't have a constant number of columns. The standard is 10 values per row, but sometimes there are fewer. loadtxt doesn't support this, and in genfromtxt, the rows which have fewer than 10 values are excluded from the resulting array. Is there any way around this? Thanks for your insight, Andreas.
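For reference, a small sketch that reproduces the behaviour described in the question. The sample data and the 4-column layout (instead of 10) are made up for illustration. As far as I know, genfromtxt raises a ValueError on inconsistent rows by default and only skips them, with a warning, when invalid_raise=False is passed.

```python
import io
import warnings

import numpy as np

# Hypothetical sample: the second row has only 2 of the expected 4 values.
text = "1 2 3 4\n5 6\n7 8 9 10\n"

# With the default invalid_raise=True, the short row triggers a ValueError.
try:
    np.genfromtxt(io.StringIO(text))
    raised = False
except ValueError:
    raised = True

# With invalid_raise=False, the short row is skipped (a warning is emitted,
# silenced here to keep the output clean).
with warnings.catch_warnings():
    warnings.simplefilter("ignore")
    kept = np.genfromtxt(io.StringIO(text), invalid_raise=False)

print(raised)      # whether the default call raised
print(kept.shape)  # only the full-width rows remain
```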
On Tue, Sep 25, 2012 at 2:31 AM, Andreas Hilboll <lists@hilboll.de> wrote:
I commonly have to deal with legacy ASCII files that don't have a constant number of columns. The standard is 10 values per row, but sometimes there are fewer. loadtxt doesn't support this, and in genfromtxt, the rows which have fewer than 10 values are excluded from the resulting array.
Is there any way around this?
The trick is: what does it mean when there are fewer values in a row? There is no way to universally define that. Anyway, I'd just punt on using a standard ASCII file reader. In the time it took to write this question, you'd be halfway to writing a custom file parser. It's really easy in Python, at least if you don't need absolutely top performance (which loadtxt and genfromtxt don't give you anyway). -Chris
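A minimal sketch of the kind of custom parser suggested here. It assumes that a short row means the trailing values are missing and should be padded with NaN; that assumption, the function name read_ragged, and the sample data are illustrative and not from the thread.

```python
import numpy as np


def read_ragged(lines, ncols=10, fill=np.nan):
    """Parse whitespace-separated numeric rows, right-padding short rows with `fill`."""
    rows = []
    for lineno, line in enumerate(lines, start=1):
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        if len(fields) > ncols:
            raise ValueError(
                f"line {lineno}: got {len(fields)} values, expected at most {ncols}"
            )
        values = [float(f) for f in fields]
        # Assumption: missing values are always the trailing ones.
        values.extend([fill] * (ncols - len(values)))
        rows.append(values)
    if not rows:
        return np.empty((0, ncols))
    return np.array(rows, dtype=float)


# Hypothetical sample with 3 columns instead of 10, to keep it short.
sample = ["1 2 3", "4 5", "", "6 7 8"]
arr = read_ragged(sample, ncols=3)
print(arr)
```

Any iterable of lines works, so an open file object can be passed directly, e.g. read_ragged(open(path)) for some path of your own. If short rows mean something else in a given file format (leading values missing, say), only the padding step needs to change.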
_______________________________________________ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
-- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker@noaa.gov
Actually, that's just what I did before writing this question ;) I was just wondering if there were some solution available which I didn't know about. Cheers, Andreas.
This may or may not be relevant, but pandas does a pretty good job of handling this sort of thing: http://nbviewer.maxdrawdown.com/3785198 Notebook Viewer hasn't quite caught up with the dev version of IPython. I've attached a screenshot too. -paul
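One way pandas can handle ragged rows, as a sketch only; this is not necessarily what the linked notebook does. Passing an explicit list of column names tells the parser how wide the table should be, and rows with fewer fields are filled with NaN. The sample data and the 4-column width are assumptions for illustration.

```python
import io

import numpy as np
import pandas as pd

# Hypothetical sample: the second row has only 2 of the expected 4 values.
text = "1 2 3 4\n5 6\n7 8 9 10\n"
ncols = 4

df = pd.read_csv(
    io.StringIO(text),
    sep=r"\s+",                # split on runs of whitespace
    header=None,               # the file has no header row
    names=list(range(ncols)),  # fixes the width; short rows are padded with NaN
)

values = df.to_numpy(dtype=float)  # plain float array if NumPy is what you need
print(df)
```

As with a hand-written parser, this treats missing values as trailing ones, which may or may not match what a short row means in a given file.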
participants (3)
- Andreas Hilboll
- Chris Barker
- Paul Hobson