[AstroPy] "ASCII" tables that contain non-ASCII characters

Aldcroft, Thomas aldcroft at head.cfa.harvard.edu
Mon Oct 24 21:53:20 EDT 2016


On Mon, Oct 24, 2016 at 6:11 PM, Derek Homeier <
derek at astro.physik.uni-goettingen.de> wrote:

> On 25 Oct 2016, at 12:01 am, Nathan Goldbaum <nathan12343 at gmail.com>
> wrote:
> >
> > I believe this is issue 2923:
> >
> > https://github.com/astropy/astropy/issues/2923
> >
> > On Mon, Oct 24, 2016 at 4:45 PM, Benjamin Alan Weaver <baweaver at lbl.gov> wrote:
> > Hello y'all,
> >
> > We are trying to read "ASCII" tables containing atomic line data
> > provided by NIST.  When you request the line wavelength data in
> > angstroms, NIST very helpfully labels the columns with the angstrom
> > symbol (Å), which is not strictly part of the ASCII character set.
> >
> > We can read these tables with Table.read() *and* the environment
> > variable LANG=en_US.utf-8 set.  However, if LANG is not set,
> > Table.read() fails to decode these files.
> >
> > As far as I can tell the underlying read() function in astropy.io.ascii
> > does not accept keywords related to the file encoding.
> >
> > So two questions:
> >
> > 1. Is the lack of an encoding keyword a bug that should be reported?
> >
> > 2. Is there a workaround that does not rely on LANG being set?
>
> A workaround that would at least let you avoid manipulating the
> environment outside Python would be
>
> import locale
> locale.setlocale(locale.LC_ALL, str('en_US.utf8'))
>

You can make this a little cleaner using the set_locale context manager in
astropy:

from astropy.table import Table
from astropy.utils.misc import set_locale
with set_locale('en_US.utf8'):
    dat = Table.read(...)
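
For anyone curious what a context manager like this does, here is a minimal
stdlib-only sketch. It is not astropy's implementation, and temporary_locale
is a hypothetical name used only for illustration. The idea is to remember
the current locale, switch to the requested one, and restore the old one on
exit even if the read fails. The demo uses the 'C' locale only because it is
available everywhere; 'en_US.utf8' has to be installed on the system.

```python
import locale
from contextlib import contextmanager


@contextmanager
def temporary_locale(name):
    """Switch LC_ALL to `name` inside the block, then restore the old setting.

    Hypothetical helper for illustration; not part of astropy or the stdlib.
    """
    # Calling setlocale with no second argument only queries the current setting.
    saved = locale.setlocale(locale.LC_ALL)
    try:
        locale.setlocale(locale.LC_ALL, name)
        yield
    finally:
        # Runs on normal exit and on exceptions, so the change never leaks out.
        locale.setlocale(locale.LC_ALL, saved)


if __name__ == "__main__":
    with temporary_locale("C"):
        print(locale.setlocale(locale.LC_ALL))
```

Note that the locale is process-wide state, so this kind of switch is best
kept away from code that reads files in several threads at once.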

As to the original question of whether this should be reported as a bug, it
has already been discussed in:

 https://github.com/astropy/astropy/issues/3826

That discussion ended without any really clear consensus except that using
Python 3 is a good thing if that is an option.  I have never seriously
evaluated how difficult it would be to implement support for unicode inputs
for Python 2.  A basic recipe is shown in the stdlib csv package
documentation, but I don't know how messy a fully working implementation
would get.
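
On Python 3, a rough sketch of reading such a file without depending on LANG
is to decode it explicitly and parse the text yourself. This uses only the
stdlib; the function names, the pipe delimiter, and the two-column sample are
assumptions made up for illustration and are not the actual NIST layout.

```python
import csv
import os
import tempfile


def read_utf8_rows(path, delimiter="|"):
    """Read a delimited text file, decoding it as UTF-8 regardless of LANG."""
    # Passing encoding explicitly means the locale's preferred encoding
    # is never consulted when the file is decoded.
    with open(path, encoding="utf-8", newline="") as handle:
        reader = csv.reader(handle, delimiter=delimiter)
        return [[cell.strip() for cell in row] for row in reader]


def write_sample(path):
    """Write a tiny made-up table whose header contains the angstrom sign."""
    text = "Wavelength (\u00c5)|Intensity\n4861.3|180\n"  # \u00c5 is the A-ring character
    with open(path, "wb") as handle:
        handle.write(text.encode("utf-8"))


if __name__ == "__main__":
    sample = os.path.join(tempfile.mkdtemp(), "lines.txt")
    write_sample(sample)
    rows = read_utf8_rows(sample)
    print(len(rows), "rows read")
```

The same explicit open(..., encoding="utf-8") call could also be used to get
the decoded text first and then hand it to a table reader, rather than
passing a filename and leaving the decoding to the locale.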

Cheers,
Tom A


>
> Cheers,
>                                         Derek
>
> _______________________________________________
> AstroPy mailing list
> AstroPy at scipy.org
> https://mail.scipy.org/mailman/listinfo/astropy
>