[Python-ideas] Adding 'bytes' as alias for 'latin_1' codec.

Nick Coghlan ncoghlan at gmail.com
Fri May 27 11:27:54 CEST 2011


On Fri, May 27, 2011 at 6:46 PM, Stephen J. Turnbull <stephen at xemacs.org> wrote:
>  > What method is invoked to convert the numbers to text? What encoding
>  > is used to convert those numbers to text? How does this operation
>  > avoid also converting the *bytes* object to text and then reencoding
>  > it?
>
> OTOH, Nick, aren't you making this harder than it needs to be?  After
> all,

To me, the defining feature of str.format() over str.__mod__() is the
ability for types to provide their own __format__ methods, rather than
being limited to a predefined set of types known to the interpreter.
If bytes were to reuse the same name, then I'd want to see similar
flexibility.
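
As a rough sketch of that difference (the Version class and its "short"
format spec are made up for illustration, not anything proposed in this
thread):

```python
class Version:
    """Hypothetical user-defined type, used only for illustration."""

    def __init__(self, major, minor):
        self.major = major
        self.minor = minor

    def __format__(self, spec):
        # str.format() hands the text after ':' in the replacement
        # field to this method; the type decides what it means.
        if spec == "short":
            return str(self.major)
        return "{}.{}".format(self.major, self.minor)


v = Version(3, 2)
full = "Python {}".format(v)
short = "Python {:short}".format(v)

# str.__mod__ only knows its built-in conversions, so %d rejects the type.
try:
    "Python %d" % v
    mod_failed = False
except TypeError:
    mod_failed = True
```

The type itself gives meaning to the "short" spec here, which is the
kind of hook I'd expect a bytes method of the same name to offer.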

Now, a *different* bytes method (bytes.interpolate, perhaps?), limited
to specific types, may make sense, but such an alternative *shouldn't*
be conflated with the text formatting API.

However, proponents of such an addition need to clearly articulate
their use cases and proposed solution in a PEP to make it clear that
they aren't merely trying to perpetuate the bytes/text confusion that
plagues 2.x 8-bit strings.

We can almost certainly do better when it comes to constructing byte
sequences from component parts, but simply saying "oh, just add a
format() method to bytes objects" doesn't cut it, since the associated
magic methods for str.format are all string based, and bytes
interpolation also needs to address encoding issues for anything that
isn't already a byte sequence.
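
A minimal sketch of both points, assuming ASCII as the encoding (the
Raw class and the build_header helper are hypothetical names used only
for illustration):

```python
# __format__ is defined in terms of text: format() raises TypeError
# if the method returns anything other than str.
class Raw:
    """Hypothetical type whose __format__ wrongly returns bytes."""

    def __format__(self, spec):
        return b"raw"


try:
    format(Raw(), "")
    returned_bytes_ok = True
except TypeError:
    returned_bytes_ok = False


def build_header(name, length):
    """Hypothetical helper: assemble a byte sequence from parts.

    Neither the text nor the number is a byte sequence, so an
    encoding (ASCII here, as an assumption) must be chosen explicitly.
    """
    return b"".join([
        name.encode("ascii"),
        b": ",
        str(length).encode("ascii"),
        b"\r\n",
    ])


header = build_header("Content-Length", 42)
```

Every non-bytes component in build_header needs an explicit encoding
step, and that is the decision any bytes interpolation proposal would
have to spell out.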

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia


