[Python-ideas] duck typing for io write methods

Wolfgang Maier wolfgang.maier at biologie.uni-freiburg.de
Sat Jun 15 18:51:32 CEST 2013


Paul Moore <p.f.moore at ...> writes:

> On 15 June 2013 15:45, Wolfgang Maier
> <wolfgang.maier <at> biologie.uni-freiburg.de> wrote:
>
> > Two questions though: you're saying in 3.3+. Does that mean the behaviour
> > has changed with 3.3 or that you checked it only for that version (I'm
> > currently using 3.2)?
> > Second, is that one byte optimization special for str() from int or is it
> > happening elsewhere too (like in string literals without non-english
> > characters)? Where can I find that documented?
>
> Basically, it's new in Python 3.3. See the What's New document at
> http://docs.python.org/3/whatsnew/3.3.html#pep-393 and PEP 393
> (http://www.python.org/dev/peps/pep-0393/)
>
> What happened is that the internal representation of strings changed so
> that strings are held in 1, 2 or 4-byte form depending on the actual data.
> So all-ASCII data (such as the numbers you are interested in) are held in
> 1-byte form, and encoding to and from bytes can be done by just copying the
> bytes (assuming you're using an ascii-compatible encoding).
>
> The same code works in earlier versions, but it will be slower (how much
> depends on your application) because bytestrings will need to be converted
> to and from wide character strings.
>
> Paul.

That sounds like a really good argument for moving to Python 3.3!
Thanks a lot, Paul and Stephen, for this feedback.
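
For reference, the storage change Paul describes can be observed from Python
itself. A rough sketch, assuming CPython 3.3 or later; bytes_per_char is just
a helper name made up for illustration, and the sizes reported by
sys.getsizeof are an implementation detail rather than a documented guarantee:

```python
import sys

def bytes_per_char(sample):
    """Estimate how many bytes CPython uses per character of `sample`.

    Two freshly built repetitions of the same text share the same fixed
    object overhead, so the size difference between them is only the
    storage for len(sample) extra characters.
    """
    if not sample:
        raise ValueError("sample must be non-empty")
    doubled = sample * 2
    tripled = sample * 3
    return (sys.getsizeof(tripled) - sys.getsizeof(doubled)) // len(sample)

if __name__ == "__main__":
    for label, text in [
        ("ASCII digits", "1234567890"),
        ("euro signs", "\u20ac" * 10),
        ("non-BMP character", "\U0001F600" * 10),
    ]:
        print(label, "->", bytes_per_char(text), "byte(s) per character")
```

On CPython 3.3+ this should report 1 for the digits, 2 for the euro signs and
4 for the character outside the BMP, which matches the 1, 2 or 4-byte forms
from the PEP.
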
So if I understand the PEP correctly, then, theoretically, text-mode file I/O
objects could be implemented to declare that all they'll ever need is 1-byte
strings (if the encoding is ASCII-compatible)? Converting incoming bytes from
a file would then also be reduced to copying, which would eliminate much of
the speed difference between 'r' and 'rb' modes?
Is that done already, or are there problems with such an approach?
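
For anyone who wants to measure the gap between the two modes on their own
setup, here is a rough sketch. The helper names, the temporary file and the
line count are all made up for illustration, and the timings will depend on
the Python version and platform, so no numbers are claimed here:

```python
import os
import tempfile
import timeit

def make_sample(lines=100000):
    """Write ASCII-only integer lines to a temporary file and return its path."""
    fd, path = tempfile.mkstemp(suffix=".txt")
    with os.fdopen(fd, "wb") as f:
        for n in range(lines):
            f.write(str(n).encode("ascii") + b"\n")
    return path

def read_text(path):
    # Text mode: every line is decoded to str. newline="\n" switches off
    # newline translation so the lengths are comparable with binary mode.
    with open(path, "r", encoding="ascii", newline="\n") as f:
        return sum(len(line) for line in f)

def read_binary(path):
    # Binary mode: lines come back as bytes, no decoding step.
    with open(path, "rb") as f:
        return sum(len(line) for line in f)

if __name__ == "__main__":
    path = make_sample()
    try:
        # Both modes should see the same amount of data for ASCII input.
        assert read_text(path) == read_binary(path)
        print("text  :", timeit.timeit(lambda: read_text(path), number=5))
        print("binary:", timeit.timeit(lambda: read_binary(path), number=5))
    finally:
        os.remove(path)
```
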
Wolfgang



