[Python-ideas] Adding 'bytes' as alias for 'latin_1' codec.

Nick Coghlan ncoghlan at gmail.com
Sat May 28 12:29:46 CEST 2011


On Sat, May 28, 2011 at 7:43 PM, Eric Smith <eric at trueblade.com> wrote:
> There have been various discussions over the years of how to actually do
> that. I think the most recent one was to add an __bformat__ method.

Python 2.x was different, as the automatic unicode coercion meant
class developers still only needed to provide __str__ (or __unicode__
if they wanted to return non-ASCII data).

__bformat__ (and similar ideas) are somewhat different beasts due to
the encoding issues involved. Those aren't insurmountable, but they're
things that don't come up with pure unicode handling (2.x unicode, 3.x
str) or data that is essentially assumed to be latin-1 encoded in many
cases (2.x str).
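
To make the encoding issue concrete, here is a minimal Python 3 sketch.
The __bformat__ hook is hypothetical: no such protocol exists in Python,
and the signature and the Name class below are assumptions made purely
for illustration. The point is only that a text-formatting hook never
has to pick an encoding, while a bytes-formatting hook always does.

```python
class Name:
    """Hypothetical type that can render itself as text or as bytes."""

    def __init__(self, value):
        self.value = value

    def __format__(self, spec):
        # Text formatting: the result is str, so no encoding is involved.
        return format(self.value, spec)

    def __bformat__(self, spec, encoding="utf-8"):
        # Hypothetical bytes-formatting hook (not a real Python protocol).
        # The result is bytes, so some encoding has to be chosen here.
        return format(self.value, spec).encode(encoding)


name = Name("Zo\u00eb")  # "Zoe" with a diaeresis on the e

text = "{}".format(name)                              # str, no encoding chosen
as_utf8 = name.__bformat__("")                        # two bytes for U+00EB
as_latin1 = name.__bformat__("", encoding="latin-1")  # one byte for U+00EB

print(text, as_utf8, as_latin1)
```

The same object yields different byte sequences depending on the
encoding, which is the decision that pure unicode formatting avoids.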

> I'm not saying any of this is a good idea or desirable. I'm just saying
> it would be easy to do and wouldn't hurt the performance of
> unicode.format().

I'm still not sure about that, since the 2.x str.format() pretty much
ignores the associated encoding problems, and I don't believe
perpetuating that behaviour would be appropriate for 3.x bytes.
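
A short Python 3 sketch of why the encoding cannot simply be ignored
(variable names are illustrative only). latin-1 maps every byte value
to the code point with the same number, so any byte string decodes
without error, but the result is only meaningful if the data really was
latin-1. Python 3 also refuses to mix bytes and str implicitly.

```python
# latin-1 round-trips every possible byte value unchanged.
raw = bytes(range(256))
round_tripped = raw.decode("latin-1").encode("latin-1")

# Python 3 does not coerce between bytes and str: mixing them fails.
try:
    b"value=" + "\u00e9"
    mixing_rejected = False
except TypeError:
    mixing_rejected = True

# UTF-8 data treated as latin-1 decodes "successfully" but is wrong:
# the two UTF-8 bytes of U+00E9 become two separate characters.
payload = "\u00e9".encode("utf-8")  # b'\xc3\xa9'
mojibake = "value={}".format(payload.decode("latin-1"))
correct = "value={}".format(payload.decode("utf-8"))

print(round_tripped == raw, mixing_rejected, mojibake, correct)
```

Treating bytes as latin-1 never raises, which is exactly how a wrong
encoding assumption can go unnoticed.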

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia


