How to waste computer memory?

Terry Reedy tjreedy at udel.edu
Fri Mar 18 14:33:49 EDT 2016


On 3/18/2016 7:58 AM, Steven D'Aprano wrote:
> On Fri, 18 Mar 2016 10:46 pm, Steven D'Aprano wrote:
>
>> I think it is typical of JMF that his idea of a language where Unicode
>> "just works" is one where it *does work at all* (at least not as strings).
>
> Er, does NOT work at all.
>
>> Python 1.5 strings supported Unicode just as well as Go's string class.
>
> Since I'm replying to myself, I guess I can take the opportunity to expand
> on this. Go's concept of strings is, more or less, byte strings:
>
> https://blog.golang.org/strings
>
> They are handled as an array of bytes and indexing produces bytes. That's
> exactly the same functionality as Python strings provided in version 1.5.
> In fairness, Go does provide a second type, "runes", which is equivalent to
> Python 2.7 unicode using a wide build (i.e. equivalent to UTF-32).

To be exact, a rune is equivalent to a codepoint, so '[]rune' (a slice
of runes) is the equivalent of Python 2's 'unicode'.
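
A rough Python 3 illustration of the difference, with 'bytes' standing
in for a Go string and 'str' for '[]rune'. The sample text and the
variable names are my own, chosen only for the example:

```python
# Sample text: 7 code points, two of them non-ASCII.
text = "na\u00efve \u20ac"          # 'naïve €'
data = text.encode("utf-8")          # the byte-string view of the same text

# Indexing the bytes object yields one byte (an int), which here is only
# the first half of the two-byte UTF-8 sequence for 'ï'.
byte_at_2 = data[2]

# Indexing the str yields a whole code point.
char_at_2 = text[2]

# A list of code points, roughly what a []rune conversion holds in Go.
codepoints = [ord(c) for c in text]

print(len(data), len(text))          # byte count vs. code point count
print(hex(byte_at_2), repr(char_at_2))
```

The two lengths differ because 'ï' takes two bytes in UTF-8 and '€'
takes three.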

Go was written by Google for use by Google, so it is not surprising that
its design is influenced by what Google does.  Google mainly gathers,
stores, and distributes 'humongabytes' of data.  Storing text as UTF-32
(4 bytes per codepoint, versus 1 to 4 in UTF-8 and just 1 for ASCII)
would perhaps triple its storage space.  So it stores and indexes text
as the same bytes it will most likely transmit.
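
A quick way to see the size difference in Python 3. This is only a
sketch: the helper name 'encoded_sizes' and both sample strings are
made up for the example, and real ratios depend on the mix of text.

```python
def encoded_sizes(text):
    """Return (utf8_bytes, utf32_bytes) for the given str."""
    # 'utf-32-le' is used so that no 4-byte BOM is added to the count.
    return len(text.encode("utf-8")), len(text.encode("utf-32-le"))


ascii_sample = "Hello, world! " * 1000
mixed_sample = "Gr\u00fc\u00dfe, \u4e16\u754c! "   # 'Grüße, 世界! '

for label, sample in (("ascii", ascii_sample), ("mixed", mixed_sample)):
    utf8_size, utf32_size = encoded_sizes(sample)
    print(label, utf8_size, utf32_size, round(utf32_size / utf8_size, 2))
```

Pure ASCII text is four times larger in UTF-32. Text with a share of
multi-byte characters grows by a smaller factor.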

Python's space-saving FSR (Flexible String Representation, PEP 393, new
in Python 3.3) was developed years after Go was.
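
The effect of the flexible string representation can be observed, at
least on CPython 3.3 or later, by measuring how much a string grows per
added character. The helper 'bytes_per_char' below is my own name, and
the numbers are an implementation detail of CPython, not a language
guarantee:

```python
import sys


def bytes_per_char(ch, n=1000):
    """Estimate per-character storage for a string made only of `ch`.

    Differencing two sizes cancels out the fixed object header.
    """
    small = sys.getsizeof(ch * n)
    large = sys.getsizeof(ch * (2 * n))
    return (large - small) // n


# One sample character from each width class (assumed CPython behavior).
for ch in ("a", "\u0394", "\U0001F600"):
    print("U+%04X" % ord(ch), bytes_per_char(ch))
```

On CPython this reports 1, 2 and 4 bytes per character: the width for
the whole string is set by its widest codepoint.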

-- 
Terry Jan Reedy



