[AstroPy] astropy.io.ascii.FixedWidthNoHeader bug

Paul Kuin npkuin at gmail.com
Sat Jun 2 17:58:44 EDT 2018


I vote bug. If a format has been adopted, then the skipped columns can
easily be checked for being empty. Better to fail than to read in false values.
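
To make the "check the skipped columns" idea concrete, here is a minimal sketch in plain Python. It is not astropy code: the function name check_gaps_empty and the two-column sample lines are made up for illustration, and the column limits are written as inclusive 0-based indices, the way they might be inferred from the first line alone.

```python
def check_gaps_empty(lines, col_starts, col_ends):
    """Raise ValueError if any non-blank character lies outside every column.

    col_starts and col_ends are inclusive 0-based character indices
    (a hypothetical convention chosen for this sketch).
    """
    # Collect every character position that belongs to some column.
    covered = set()
    for start, end in zip(col_starts, col_ends):
        covered.update(range(start, end + 1))

    for lineno, line in enumerate(lines, start=1):
        for pos, char in enumerate(line.rstrip("\n")):
            if pos not in covered and not char.isspace():
                raise ValueError(
                    f"line {lineno}: non-blank character {char!r} at "
                    f"position {pos} lies outside every column"
                )


# Last two columns of the table, with limits taken from the first line only.
lines = [
    " 0.0011645  100.61",
    "-0.0003996  -34.52",
]
col_starts = [1, 12]
col_ends = [9, 17]

try:
    check_gaps_empty(lines, col_starts, col_ends)
except ValueError as exc:
    print("refusing to read:", exc)
```

The first line on its own passes the check. The second line fails because of the "-" at position 0, which is the kind of loud failure I would prefer over a silently dropped sign.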

On Sat, Jun 2, 2018 at 6:25 PM, Derek Homeier <derek at astro.physik.uni-goettingen.de> wrote:

> Hi Rick,
>
> > On 31 May 2018, at 2:28 pm, Frederic V. Hessman <hessman at astro.physik.uni-goettingen.de> wrote:
> >
> > I've got a simple ASCII table:
> >
> > # nix.txt
> >   1     -68 40574.624730 40574.625190 40574.624025 1 0.0000200  0.0011645  100.61
> >   2       0     0.000000 40610.064100 40610.064500 0 0.0001000 -0.0003996  -34.52
> >   3       5 40612.670790 40612.671278 40612.670417 1 0.0001000  0.0008612   74.41
> >
> > that I wanted to read using astropy.io.ascii (Table was giving me more
> > problems...), so I played with various parsers and options that didn't
> > work until it finally appeared to parse successfully:
> >
> > % python
> > Python 3.5.4 (default, Sep 22 2017, 08:33:07)
> > [GCC 4.2.1 Compatible Apple LLVM 8.1.0 (clang-802.0.42)] on darwin
> > Type "help", "copyright", "credits" or "license" for more information.
> > >>> from astropy.io import ascii
> > >>> ascii.read('nix.txt', format='fixed_width_no_header', comment='#', delimiter=' ')
> > <Table length=3>
> >  col1  col2     col3        col4         col5      col6   col7     col8     col9
> > int64 int64   float64     float64      float64    int64 float64  float64  float64
> > ----- ----- ----------- ------------ ------------ ----- ------- --------- -------
> >     1   -68 40574.62473  40574.62519 40574.624025     1   2e-05 0.0011645  100.61
> >     2     0         0.0   40610.0641   40610.0645     0  0.0001 0.0003996  -34.52
> >     3     5 40612.67079 40612.671278 40612.670417     1  0.0001 0.0008612   74.41
> >
> > Note that the minus sign in col8 was zapped but that the minus sign in
> > col9 was not!  I then switched the first two lines:
> >
> > >>> ascii.read('nix.txt', format='fixed_width_no_header', comment='#', delimiter=' ')
> > <Table length=3>
> >  col1  col2   col3      col4         col5      col6   col7     col8      col9
> > int64 int64 float64   float64      float64    int64 float64  float64   float64
> > ----- ----- ------- ------------ ------------ ----- ------- ---------- -------
> >     2     0     0.0   40610.0641   40610.0645     0  0.0001 -0.0003996  -34.52
> >     1     8 4.62473  40574.62519 40574.624025     1   2e-05  0.0011645  100.61
> >     3     5 2.67079 40612.671278 40612.670417     1  0.0001  0.0008612   74.41
> >
> > which gives the correct values, so the behaviour somehow depends on how
> > the fixed-width columns are found. My guess is that, in the first case,
> > there was a " 0.00" in col8 (leading space) and "100" in col9 defining the
> > fixed-width columns, but in the second the columns were already reserved by
> > "-0.00" and "-34". Looks like a bug to me.
> >
> Perhaps not a bug, rather a limitation in functionality.
> As FixedWidthNoHeader cannot obtain the column limits from the header,
> it tries to infer them from the first data line if they are not specified
> by the user.
> But this makes such truncations fairly inevitable if the first line does
> not cover the maximum range of all subsequent lines. A similar problem
> occurs when inserting an item with fewer digits:
>
> # nix.txt
>   1     -68  4574.624730 40574.625190 40574.624025 1 0.0000200  0.0011645  100.61
>   2       0     0.000000 40610.064100 40610.064500 0 0.0001000 -0.0003996  -34.52
>   3       5 40612.670790 40612.671278 40612.670417 1 0.0001000  0.0008612   74.41
>
> >>> astropy.io.ascii.read('nix.txt', format='fixed_width_no_header', delimiter=' ')
> <Table length=3>
>  col1  col2    col3        col4         col5      col6   col7     col8     col9
> int64 int64  float64     float64      float64    int64 float64  float64  float64
> ----- ----- ---------- ------------ ------------ ----- ------- --------- -------
>     1   -68 4574.62473  40574.62519 40574.624025     1   2e-05 0.0011645  100.61
>     2     0        0.0   40610.0641   40610.0645     0  0.0001 0.0003996  -34.52
>     3     5  612.67079 40612.671278 40612.670417     1  0.0001 0.0008612   74.41
>
> Testing all data lines for possible other column limits would clearly be
> untenable from a performance POV.
>
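
A toy model of the behaviour described above, in plain Python rather than astropy's actual reader code: take the column limits from the non-blank runs of the first line, then slice every later line at those same positions. The helper names infer_bounds and slice_line are hypothetical, and only the last three columns of nix.txt are used to keep it short.

```python
def infer_bounds(first_line, delimiter=" "):
    """Return inclusive (start, end) index pairs of the non-delimiter runs."""
    bounds = []
    start = None
    for pos, char in enumerate(first_line):
        if char != delimiter and start is None:
            start = pos                      # a new column begins here
        elif char == delimiter and start is not None:
            bounds.append((start, pos - 1))  # the column ended one char ago
            start = None
    if start is not None:                    # last column runs to end of line
        bounds.append((start, len(first_line) - 1))
    return bounds


def slice_line(line, bounds):
    """Cut a line at fixed positions and strip the padding of each piece."""
    return [line[start:end + 1].strip() for start, end in bounds]


first = "0.0000200  0.0011645  100.61"
second = "0.0001000 -0.0003996  -34.52"

bounds = infer_bounds(first)
print(bounds)                      # [(0, 8), (11, 19), (22, 27)]
print(slice_line(second, bounds))  # ['0.0001000', '0.0003996', '-34.52']
```

The minus sign of -0.0003996 sits at position 10, just outside the limits taken from the first line, so it is dropped. The one in -34.52 survives only because "100.61" happens to be equally wide.
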
> What could perhaps be improved is an option to always assume that the
> entries are correctly right-aligned, thus determining the column ends from
> the first line and setting them to maximum width - basically equivalent to
> the user specifying
>
> ascii.read('nix.txt', format='fixed_width_no_header', delimiter=' ',
> col_ends=[2,10,…])
>
> This might be worth filing an issue on GitHub, and the documentation
> could probably also be a bit clearer.
>
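
The right-aligned assumption can be sketched in a few lines of plain Python. This is not astropy's implementation, and right_aligned_bounds is a hypothetical helper: column ends come from the first line, and each column starts directly after the previous one ends, so every column gets its maximum width. The resulting lists should have the shape that the col_starts and col_ends arguments of fixed_width_no_header expect (end positions inclusive), but treat that as an assumption to check against the documentation.

```python
def right_aligned_bounds(first_line, delimiter=" "):
    """Derive inclusive column starts/ends assuming right-aligned entries."""
    last = len(first_line) - 1
    # A column ends wherever a non-delimiter character is followed by a
    # delimiter or by the end of the line.
    ends = [
        pos
        for pos, char in enumerate(first_line)
        if char != delimiter
        and (pos == last or first_line[pos + 1] == delimiter)
    ]
    # Each column starts right after the previous column's end.
    starts = [0] + [end + 1 for end in ends[:-1]]
    return starts, ends


first = "0.0000200  0.0011645  100.61"
second = "0.0001000 -0.0003996  -34.52"

starts, ends = right_aligned_bounds(first)
print(starts, ends)  # [0, 9, 20] [8, 19, 27]

# Slicing the second line with these limits keeps the minus sign in col8.
fields = [second[s:e + 1].strip() for s, e in zip(starts, ends)]
print(fields)        # ['0.0001000', '-0.0003996', '-34.52']
```

This only helps when the first line is itself right-aligned and no later entry extends past a column end.
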
> But for your specific case here, as you have found below, fixed_width is
> not a very good recipe in the first place, since your table is almost a
> textbook example of a basic ascii[.no_header] format.
>
> > Afterwards, I realized that
> >
> >       ... format='no_header', ...
> >
> > would have been easier and safer.  Thank goodness I finally recognized
> > that all of my minus signs had disappeared!
> >
> Out of curiosity, what was giving you more problems with the Table reader
> here; in particular, did
>
> Table.read('nix.txt', format='ascii.no_header')
>
> not work?
>
> Cheers,
>                                         Derek
>
> _______________________________________________
> AstroPy mailing list
> AstroPy at python.org
> https://mail.python.org/mailman/listinfo/astropy
>



-- 

* * * * * * * * http://www.mssl.ucl.ac.uk/~npmk/ * * * *
N.P.M. Kuin      (n.kuin at ucl.ac.uk)
phone +44-(0)1483 (prefix) -204111 (work)
mobile +44(0)7908715953  skype ID: npkuin
Mullard Space Science Laboratory  – University College London  –
Holmbury St Mary – Dorking – Surrey RH5 6NT–  U.K.