Re: [Numpy-discussion] Fwd: [numfocus] Grants up to $3k available to NumFOCUS projects, (sponsored & affiliated)
Mon, 27 Mar 2017 08:21:37 -0700, Chris Barker wrote:
TBH, I don't see why 's' should be deprecated --- the operation is well-specified (byte strings + null stripping) and has the same meaning in Python 2 and 3. Of course, a true 1-byte unicode subset string may be a more useful type for some applications, so it could indeed be added.

-- Pauli Virtanen
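For readers who have not run into the "byte strings + null stripping" behaviour, here is a minimal sketch. It assumes NumPy on Python 3 and uses the `S5` dtype spelling (fixed width of 5 bytes); the array contents are made up for illustration.

```python
import numpy as np

# Fixed-width byte strings: every element occupies exactly 5 bytes.
arr = np.array([b"ab", b"abcde"], dtype="S5")

# Short entries are padded with NUL bytes in the underlying buffer ...
raw = arr.tobytes()

# ... and the trailing NULs are stripped again when an element is read back.
first = arr[0]
print(raw)
print(first, isinstance(first, bytes))

# One consequence: trailing NULs that were part of the data do not survive.
lossy = np.array([b"ab\x00"], dtype="S5")
print(lossy[0] == b"ab")
```

The padding and stripping are why the type is awkward for arbitrary binary data, even though the operation itself is well defined.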
On Mon, Mar 27, 2017 at 12:14 PM, Pauli Virtanen <pav@iki.fi> wrote:
exactly -- I don't think there was a consensus on this.
Of course, a true 1-byte unicode subset string may be a more useful type for some applications, so it could indeed be added.
That's the idea -- scientists tend to use a lot of ASCII text (or at least one-byte-per-char text). numpy requires each element to be the same number of bytes, so the unicode dtype is 4 bytes per char -- seemingly very wasteful. But if you use 's' on py3, you get bytestrings back -- not "text" from a py3 perspective. And aside from backwards compatibility, I see no reason for a 's' dtype that returns a bytes object on py3 -- if it's really binary data, you can use the 'b' dtype.

-CHB

--
Christopher Barker, Ph.D.
Oceanographer
Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959 voice
7600 Sand Point Way NE   (206) 526-6329 fax
Seattle, WA 98115        (206) 526-6317 main reception
Chris.Barker@noaa.gov
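To make the size and type trade-off concrete, a rough sketch follows. It assumes NumPy on Python 3; the sample words and variable names are illustrative only, and the dtypes are spelled `U5` and `S5`.

```python
import numpy as np

words = ["alpha", "beta"]

as_unicode = np.array(words, dtype="U5")  # stored as 4 bytes per character
as_bytes = np.array(words, dtype="S5")    # stored as 1 byte per character

# Per-element storage in bytes for a width of 5 characters.
print(as_unicode.itemsize, as_bytes.itemsize)

# On Python 3 the 'S' elements come back as bytes, not str ...
print(isinstance(as_bytes[0], bytes), isinstance(as_unicode[0], str))

# ... so they need an explicit decode before being treated as text.
decoded = as_bytes[0].decode("ascii")
print(decoded == "alpha")

# The 'b' type code is a signed 8-bit integer, i.e. raw byte values.
print(np.dtype("b") == np.dtype(np.int8))
```

With these assumptions the unicode array uses 20 bytes per element against 5 for the byte-string array, which is the fourfold overhead described above.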
participants (2)
- Chris Barker
- Pauli Virtanen