<div dir="ltr"><br><div class="gmail_extra"><br><div class="gmail_quote">On Mon, Apr 24, 2017 at 2:47 PM, Robert Kern <span dir="ltr"><<a href="mailto:robert.kern@gmail.com" target="_blank">robert.kern@gmail.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><span class="">On Mon, Apr 24, 2017 at 10:51 AM, Aldcroft, Thomas <<a href="mailto:aldcroft@head.cfa.harvard.edu" target="_blank">aldcroft@head.cfa.harvard.edu</a><wbr>> wrote:<br>><br></span><span class="">> On Mon, Apr 24, 2017 at 1:04 PM, Chris Barker <<a href="mailto:chris.barker@noaa.gov" target="_blank">chris.barker@noaa.gov</a>> wrote:<br><br></span><span class="">>> - round-tripping of binary data (at least with Python's encoding/decoding) -- ANY string of bytes can be decodes as latin-1 and re-encoded to get the same bytes back. You may get garbage, but you won't get an EncodingError.<br>><br>> +1.  The key point is that there is a HUGE amount of legacy science data in the form of FITS (astronomy-specific binary file format that has been the primary file format for 20+ years) and HDF5 which uses a character data type to store data which can be bytes 0-255.  Getting an decoding/encoding error when trying to deal with these datasets is a non-starter from my perspective.<br><br></span>That says to me that these are properly represented by `bytes` objects, not `unicode/str` objects encoding to and decoding from a hardcoded latin-1 encoding.<br></div></blockquote><div><br></div><div>If you could go back 30 years and get every scientist in the world to do the right thing, then sure.  But we are living in a messy world right now with messy legacy datasets that have character type data that are *mostly* ASCII, but not infrequently contain non-ASCII characters.</div><div><br></div><div>So I would beg to actually move forward with a pragmatic solution that addresses very real and consequential problems that we face instead of waiting/praying for a perfect solution.</div><div><br></div><div>- Tom</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><br>--<br>Robert Kern</div>

<br>______________________________<wbr>_________________<br>

NumPy-Discussion mailing list<br>

<a href="mailto:NumPy-Discussion@python.org">NumPy-Discussion@python.org</a><br>

<a href="https://mail.python.org/mailman/listinfo/numpy-discussion" rel="noreferrer" target="_blank">https://mail.python.org/<wbr>mailman/listinfo/numpy-<wbr>discussion</a><br>

<br></blockquote></div><br></div></div>