[Numpy-discussion] Bytes vs. Unicode in Python3
Christopher Barker
Chris.Barker at noaa.gov
Fri Nov 27 15:52:43 EST 2009
Anne Archibald wrote:
>>> I don't think it makes sense to handle format strings in Unicode
>>> internally -- they should always be coerced to bytes.
>> This should be fine -- we control what is a valid format string, and
>> thus they can always be ASCII-safe.
>
> I have to disagree. Why should we force the user to use bytes?
One of us misunderstood that -- I THINK the idea was that internally
numpy would use bytes (for easy conversion to/from char*), but input
would get converted, so the user could pass in unicode strings (or
bytes). I guess the question remains as to what you'd get when you
printed a format string.
> Keep in mind that "coercing" strings to bytes
> requires extra information, namely the encoding.
but that is built-in to the unicode object.
I think the idea is that a format string is ALWAYS ASCII -- if there are
any other characters in there, it's an invalid format anyway.
Unless I misunderstand what a format string is. I think it's a string
you use to represent a custom dtype -- is that right?
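
To make the idea concrete, here is a rough sketch of the kind of coercion
I have in mind. The name as_format_bytes is made up for illustration and
is not an existing numpy function; it only assumes that a valid format
string is pure ASCII, so unicode input can be encoded without the caller
supplying an encoding.

```python
def as_format_bytes(fmt):
    """Hypothetical helper: return a format string as ASCII bytes.

    Accepts either str (unicode) or bytes. Anything that is not pure
    ASCII is rejected as an invalid format string.
    """
    if isinstance(fmt, bytes):
        try:
            # Decode only to validate; the original bytes are returned.
            fmt.decode("ascii")
        except UnicodeDecodeError:
            raise ValueError("invalid format string: non-ASCII bytes") from None
        return fmt
    if isinstance(fmt, str):
        try:
            return fmt.encode("ascii")
        except UnicodeEncodeError:
            raise ValueError("invalid format string: non-ASCII characters") from None
    raise TypeError("format string must be str or bytes, not %s"
                    % type(fmt).__name__)


# Both spellings end up as the same internal bytes value.
internal = as_format_bytes("i4,f8")
same = as_format_bytes(b"i4,f8")
```

What to show the user when a format string is printed is a separate
choice; decoding the stored bytes back with .decode("ascii") would be one
option.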
-Chris
--
Christopher Barker, Ph.D.
Oceanographer
Emergency Response Division
NOAA/NOS/OR&R (206) 526-6959 voice
7600 Sand Point Way NE (206) 526-6329 fax
Seattle, WA 98115 (206) 526-6317 main reception
Chris.Barker at noaa.gov