On Mon, Apr 24, 2017 at 10:51 AM, Aldcroft, Thomas <aldcroft@head.cfa.harvard.edu> wrote:
>
> On Mon, Apr 24, 2017 at 1:04 PM, Chris Barker <chris.barker@noaa.gov> wrote:
>> - round-tripping of binary data (at least with Python's encoding/decoding) -- ANY string of bytes can be decoded as latin-1 and re-encoded to get the same bytes back. You may get garbage, but you won't get an EncodingError.
>
> +1. The key point is that there is a HUGE amount of legacy science data in the form of FITS (an astronomy-specific binary file format that has been the primary file format for 20+ years) and HDF5, which use a character data type to store data that can be bytes 0-255. Getting a decoding/encoding error when trying to deal with these datasets is a non-starter from my perspective.

That says to me that these are properly represented by `bytes` objects, not `unicode/str` objects that are encoded to and decoded from a hardcoded latin-1 encoding.
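
For concreteness, here is a minimal sketch of the round-trip property being discussed, using only plain Python `bytes` and `str`. The variable names are illustrative, and `raw` is just a stand-in for a legacy character field, not something read from an actual FITS or HDF5 file.

```python
# Every possible byte value, standing in for a legacy character field
# (hypothetical data, not read from a real FITS/HDF5 file).
raw = bytes(range(256))

# latin-1 maps each byte 0-255 to the code point with the same number,
# so decoding never fails and re-encoding restores the original bytes.
text = raw.decode("latin-1")
round_tripped = text.encode("latin-1")

# A stricter codec such as UTF-8 rejects many byte sequences outright.
try:
    raw.decode("utf-8")
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False

print(round_tripped == raw)  # latin-1 round trip is lossless
print(utf8_ok)               # UTF-8 decode of arbitrary bytes fails
```

The round trip only shows that nothing is lost; it does not show that the decoded `str` means anything as text, which is why keeping such data as `bytes` is the option that makes no claim about its encoding.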
--
Robert Kern