[Numpy-discussion] proposal: smaller representation of string arrays
aldcroft at head.cfa.harvard.edu
Tue Apr 25 22:02:38 EDT 2017
On Tue, Apr 25, 2017 at 7:11 PM, Chris Barker - NOAA Federal <
chris.barker at noaa.gov> wrote:
> > On Apr 25, 2017, at 12:38 PM, Nathaniel Smith <njs at pobox.com> wrote:
> > Eh... First, on Windows and MacOS, filenames are natively Unicode.
> Yeah, though once they are stored in a text file -- who the heck
> knows? That may be simply unsolvable.
> > s. And then from in Python, if you want to actually work with those
> filenames you need to either have a bytestring type or else a Unicode type
> that uses surrogateescape to represent the non-ascii characters.
> > IMO if you have filenames that are arbitrary bytestrings and you need to
> represent this properly, you should just use bytestrings -- really, they're
> perfectly friendly :-).
> I thought the Python file (and Path) APIs all required (Unicode)
> strings? That was the whole complaint!
> And no, bytestrings are not perfectly friendly in py3.
> This got really complicated and sidetracked, but all I'm suggesting is
> that if we have a 1byte per char string type, with a fixed encoding,
> that that encoding be Latin-1, rather than ASCII.
> That's it, really.
> Having a settable encoding would work fine, too.
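To make the Latin-1 versus ASCII point concrete, here is a minimal sketch using only the Python standard library. It illustrates why Latin-1 is a lossless choice for a 1-byte-per-char type: every byte value maps to the code point with the same number, so decoding cannot fail and always round-trips. The variable names are just for illustration.

```python
# Every possible byte value, 0 through 255.
raw = bytes(range(256))

# Latin-1 maps byte N to code point N, so any byte sequence decodes
# and re-encodes to exactly the same bytes.
text = raw.decode("latin-1")
round_tripped = text.encode("latin-1")

# ASCII only covers 0..127, so the same data raises on decode.
try:
    raw.decode("ascii")
    ascii_ok = True
except UnicodeDecodeError:
    ascii_ok = False

print(round_tripped == raw)  # Latin-1 preserved every byte
print(ascii_ok)              # ASCII could not decode the data
```

The round trip says nothing about whether the text is *displayed* correctly if the file was really in some other encoding; it only shows that no data is lost or rejected.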
At a simple level, I just want the things that currently work just fine in
Py2 to start working in Py3. That includes being able to read / manipulate
/ compute and write back to legacy binary FITS and HDF5 files that include
ASCII-ish text data (not strictly ASCII). Memory mapping such files should
be supportable. Swapping type from bytes to a 1-byte char str should be
possible without altering data in memory.
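As a rough sketch of the situation today, assuming a small fixed-width `S` array standing in for a column read from a legacy file (the sample values are made up): decoding to a usable `str` array works with `np.char.decode`, but it produces a new `U` array at 4 bytes per character rather than reinterpreting the existing buffer in place.

```python
import numpy as np

# Hypothetical column of fixed-width byte strings, as might come from a
# legacy FITS or HDF5 file. b"caf\xe9" is "café" in Latin-1, which is
# neither valid ASCII nor valid UTF-8.
raw = np.array([b"abc", b"caf\xe9"], dtype="S4")

# Decoding gives a NumPy unicode array: a copy, 4 bytes per character.
decoded = np.char.decode(raw, "latin-1")

# Encoding back with the same codec recovers the original bytes.
encoded = np.char.encode(decoded, "latin-1")

print(raw.dtype, raw.itemsize)          # bytes dtype, 1 byte per char
print(decoded.dtype, decoded.itemsize)  # unicode dtype, larger items
print((encoded == raw).all())           # round trip is lossless
```

The copy and the larger item size are what rule out memory mapping with this approach; a 1-byte char str dtype would let the same buffer be viewed as text without conversion.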
BTW, I am saying "I want", but this functionality would definitely be
welcome in astropy. I wrote a unicode sandwich workaround for the astropy
Table class (https://github.com/astropy/astropy/pull/5700) which should be
in the next release. It would be way better to have this at a level lower