[Python-ideas] Adding 'bytes' as alias for 'latin_1' codec.
Eric Smith
eric at trueblade.com
Sat May 28 11:43:54 CEST 2011
On 5/27/2011 7:51 AM, Nick Coghlan wrote:
> In the specific case of adding bytes.format(), it's the weight of the
> backing machinery that bothers me - the PEP 3101 implementation isn't
> small, and providing a parallel API for bytes without slowing down the
> existing string implementation would be problematic (code re-use would
> likely slow down the common case even further, while avoiding re-use
> would likely end up duplicating a lot of code). However, *if* a solid
> set of use cases for direct bytes interpolation can be identified (and
> that's a big if), then it may be possible to devise a narrower, more
> focused API that doesn't require such a heavy back end to support it.
In Python 2.x, str.format() and unicode.format() share the same
implementation, using the Objects/stringlib mechanism of #defines and
multiple includes. So while the compiled code is included twice, once
per string type, there's only one source file that implements them both.
Because each type gets its own compiled copy, I don't think there's any
concern about performance issues.
And Python 3.x has the exact same implementation, although it's only
included for unicode strings. It would not be difficult to add .format()
for bytes.
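
For illustration only, here is a minimal sketch of the current Python 3
situation and of the latin-1 round trip that the thread subject alludes
to. The helper name format_bytes_via_latin1 is hypothetical, not an
existing or proposed API, and the sketch assumes the arguments are
str or numbers rather than bytes.

```python
# Python 3: str has a .format() method, bytes does not.
assert hasattr(str, "format")
has_bytes_format = hasattr(bytes, "format")


def format_bytes_via_latin1(template: bytes, *args: object) -> bytes:
    """Hypothetical helper: format as text, then convert back to bytes.

    latin-1 maps bytes 0-255 one-to-one onto code points 0-255, so the
    decode/encode round trip leaves the template's own bytes unchanged.
    """
    text = template.decode("latin-1").format(*args)
    # Raises UnicodeEncodeError if an argument formats to a character
    # above U+00FF.
    return text.encode("latin-1")


request_line = format_bytes_via_latin1(b"GET {} HTTP/1.{}\r\n", "/index.html", 1)
```

Note that a bytes argument would be rendered through its repr (for
example b'abc'), which is one reason this workaround is not a real
substitute for a native bytes formatting method.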
There have been various discussions over the years of how to actually do
that. I think the most recent proposal was to add a __bformat__ method.
I'm not saying any of this is a good idea or desirable. I'm just saying
it would be easy to do and wouldn't hurt the performance of
unicode.format().
Eric.