[Numpy-discussion] using loadtxt to load a text file in to a numpy array
chris.barker at noaa.gov
Thu Jan 23 20:09:28 EST 2014
On Thu, Jan 23, 2014 at 4:02 PM, Oscar Benjamin
<oscar.j.benjamin at gmail.com> wrote:
> On 23 January 2014 21:51, Chris Barker <chris.barker at noaa.gov> wrote:
> > However, I would prefer latin-1 -- that way you might get garbage for
> > non-ascii parts, but it wouldn't raise an exception and it round-trips
> > through encoding/decoding. And you would have a somewhat more useful
> > -- including the latin-language character and symbols like the degree
> > symbol, etc.
> Exceptions and error messages are a good thing! Garbage is not!!! :)
In principle, I agree with you, but sometimes practicality beats purity.
In py2 there is a lot of implicit encoding/decoding going on, using the
system encoding. That is ascii on a lot of systems. The result is that
there is a lot of code out there that folks have ported to use unicode, but
missed a few corners. If that code is only tested with ascii, it all seems
to be working, but then out in the wild someone
puts another character in there and presto -- a crash.
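A rough Python 3 sketch of that failure mode. Python 3 does not do the implicit decode, so an explicit ascii decode stands in for it here; to_text and the sample strings are made up for illustration:

```python
def to_text(raw: bytes) -> str:
    # Hypothetical helper: an explicit stand-in for the implicit
    # ascii decode that py2 performs behind the scenes.
    return raw.decode("ascii")


# The only kind of input the tests ever exercised: pure ascii.
ascii_ok = to_text(b"temperature 20 C")

# Out in the wild, someone adds a degree sign (U+00B0).
wild_input = "temperature 20 \u00b0C".encode("utf-8")
try:
    to_text(wild_input)
    crashed = False
except UnicodeDecodeError:
    crashed = True
```

The first call passes every ascii-only test; the second raises UnicodeDecodeError, which is the crash described above.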
Also, there are places where the inability to encode makes a message
silently disappear -- for instance, if an Exception is raised with a unicode
message, it will get silently dropped when it comes time to display on the
terminal. I spent quite a while banging my head against that one recently
when I tried to update some code to read unicode files. I would have been
MUCH happier with a bit of garbage in the message than having it dropped
(or having it raise an encoding error in the middle of the error...)
I think this is a bad thing.
The advantage of latin-1 is that while you might get something that
doesn't print right, it won't crash, and it won't contaminate the data, so
comparisons, etc., will still work. Kind of like using utf-8 in an old-style
C char array -- you can still pass it around and compare it, even if the
bytes don't mean what you think they do.
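A minimal Python 3 sketch of the round-trip property, assuming nothing beyond the built-in codecs (the variable names are just for illustration):

```python
# Every possible byte value maps to exactly one latin-1 code point,
# so decoding arbitrary bytes as latin-1 never raises.
raw = bytes(range(256))
text = raw.decode("latin-1")
lossless = text.encode("latin-1") == raw

# utf-8 data mis-read as latin-1 prints as garbage, but the original
# bytes are still recoverable and comparisons stay consistent.
utf8_bytes = "20 \u00b0C".encode("utf-8")      # degree sign is b"\xc2\xb0" in utf-8
garbled = utf8_bytes.decode("latin-1")         # "20 \u00c2\u00b0C", i.e. an extra A-circumflex
recovered = garbled.encode("latin-1") == utf8_bytes
still_comparable = garbled == utf8_bytes.decode("latin-1")
```

The garbled string does not print right, but nothing is lost: encoding it back to latin-1 gives the original bytes, so the data can still be passed around and compared.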
Christopher Barker, Ph.D.
Emergency Response Division
NOAA/NOS/OR&R (206) 526-6959 voice
7600 Sand Point Way NE (206) 526-6329 fax
Seattle, WA 98115 (206) 526-6317 main reception
Chris.Barker at noaa.gov