[Numpy-discussion] odd ascii format and genfromtxt

Ralf Gommers ralf.gommers at googlemail.com
Fri Feb 26 04:10:08 EST 2010


On Fri, Feb 26, 2010 at 4:29 PM, Warren Weckesser <warren.weckesser at enthought.com> wrote:

> Ralf Gommers wrote:
> > Hi all,
> >
> > I'm trying to read in data from text files with genfromtxt, and have
> > some trouble figuring out the right combination of keywords. The
> > format is:
> >
> > ['0\t\t4.000000000000000e+007,0.000000000000000e+000\n',
> >  '\t9.860280631554179e-001,-1.902586503306264e-002\n',
> >  '\t9.860280631554179e-001,-1.902586503306264e-002']
> >
> > Note that there are two delimiters, tab and comma. Also, the first
> > line has an extra integer plus tab (this is a repeating pattern).
> >
>
> The 'delimiter' keyword does not accept a list of strings.  If it is a
> list, it must be a list of integers that are the field widths.  In your
> case, that won't work.
>
> You could try fromregex:
>
> -----
> In [1]: import numpy as np
>
> In [2]: cat sample.raw
> 0        4.000e+007,0.00000e+000
>    9.8602806e-001,-1.9025e-002
>    9.8602806e-001,-1.9025e-002
> 123        5.0e6,100.0
>    10.1,-2.0e-3
>    10.2,-2.1e-3
>
>
> In [3]: a = np.fromregex('sample.raw', '(.*?)\t+(.*),(.*)',
> np.dtype([('extra', 'S8'), ('x', float), ('y', float)]))
>
> In [4]: a
> Out[4]:
> array([('0', 40000000.0, 0.0), ('', 0.98602805999999998, -0.019025),
>       ('', 0.98602805999999998, -0.019025), ('123', 5000000.0, 100.0),
>       ('', 10.1, -0.002), ('', 10.199999999999999,
> -0.0020999999999999999)],
>      dtype=[('extra', '|S8'), ('x', '<f8'), ('y', '<f8')])
>
>
> Note that the first field of the array is a string, not an integer.  The
> string will be empty in rows that did not have the initial integer.  I
> don't know if that will work for you.
>
That works, thanks. I had hoped that genfromtxt could do it because it can
skip the header and is presumably faster. But I'll take what I can get.
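
For the record, one way genfromtxt might still be made to handle this
layout is to normalise the lines before it sees them. The sketch below is
an assumption on my part, not something tested against the real files: it
relies on genfromtxt accepting an iterable of lines instead of a filename,
and the helper name tabs_to_commas is made up for illustration. Each run of
tabs is collapsed into a single comma, so every row has three
comma-separated fields, and rows without the leading integer come back with
nan in the first column.

```python
import re

import numpy as np


def tabs_to_commas(lines):
    """Yield each line with every run of tabs collapsed into one comma.

    Hypothetical helper: after this, a row is either 'n,x,y' or ',x,y'.
    """
    for line in lines:
        yield re.sub(r"\t+", ",", line.rstrip("\n"))


# The three sample lines from the original question.
sample = [
    "0\t\t4.000000000000000e+007,0.000000000000000e+000\n",
    "\t9.860280631554179e-001,-1.902586503306264e-002\n",
    "\t9.860280631554179e-001,-1.902586503306264e-002",
]

# An empty first field is treated as missing and becomes nan in a float array.
data = np.genfromtxt(tabs_to_commas(sample), delimiter=",")
print(data.shape)
```

With a real file, the open file object could presumably be passed through
the same helper, and skip_header would then apply to the normalised lines.
Whether this ends up faster than fromregex is not something I have measured.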

Cheers,
Ralf