[Python-ideas] Adding 'bytes' as alias for 'latin_1' codec.

Paul Colomiets paul at colomiets.name
Sun May 29 20:55:21 CEST 2011


On Sat, May 28, 2011 at 12:43 PM, Eric Smith <eric at trueblade.com> wrote:
>
> And Python 3.x has the exact same implementation, although it's only
> included for unicode strings. It would not be difficult to add .format()
> for bytes.
>
> There have been various discussions over the years of how to actually do
> that. I think the most recent one was to add an __bformat__ method.


Well, I think that's actually a great idea. A format method on bytes could
produce data that is not ASCII, and could eventually become struct.pack
on steroids. struct.pack has plenty of problems:

* no named fields, which would be useful for describing big structures
* all fields are fixed-length, which is unfortunate given today's trend
toward variable-length integers
* no way to specify separators between fields
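
To make the three points above concrete, here is a minimal sketch of what
they look like with struct.pack today. The encode_varint helper and the
LEB128-style encoding it uses are assumptions chosen only for illustration;
they are not part of struct or of any proposal in this thread.

```python
import struct

# Fields are positional: the format string carries no names, so the
# reader has to count arguments to see which value goes where.
header = struct.pack(">HHL", 1, 2, 3)

# Every struct format code has a fixed size, so a variable-length
# integer has to be encoded by hand (hypothetical helper, LEB128-style).
def encode_varint(value):
    if value < 0:
        raise ValueError("only non-negative integers are supported")
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)         # last byte, high bit clear
            return bytes(out)

# Separators between fields have to be spliced in manually.
record = b",".join([struct.pack(">H", 1), struct.pack(">H", 2)])
```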

I also use the str(intvalue).encode('ascii') idiom a lot. So I'd probably
suggest something like __bformat__, with format codes similar to the ones
struct.pack has, along with str-like ones for integers. A `!len` conversion
for bytes fields might also be useful, to make encoding length-prefixed
strings easier.
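
For comparison, the two idioms mentioned above can be sketched with what is
available now. The function names ascii_int and length_prefixed are
hypothetical, and the 4-byte big-endian length is just one possible prefix
format.

```python
import struct

def ascii_int(value):
    # The str(intvalue).encode('ascii') idiom: an integer as ASCII digits.
    return str(value).encode("ascii")

def length_prefixed(payload):
    # Roughly what a `!len` conversion would spare us from writing:
    # a 4-byte big-endian length followed by the payload itself.
    return struct.pack(">L", len(payload)) + payload

# Mixing both: a text-style length line followed by a binary-style field.
body = b"hello"
message = b"Length: " + ascii_int(len(body)) + b"\r\n" + length_prefixed(body)
```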

To show an example, here is how a two-chunk PNG file could be encoded:

(b"\x89PNG\r\n\x1A\n"
    b"{s1!len:>L}IHDR{s1}{crc1:>L}"
    b"{s2!len:>L}IDAT{s2}{crc2:>L}\0\0\0\0IEND".format(
    s1=section1, crc1=crc(section1),
    s2=section2, crc2=crc(section2)))
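
For reference, a rough equivalent using only struct and zlib as they exist
today might look like the sketch below. The helper names png_chunk and
encode_png are assumptions. Note that in the actual PNG format the CRC
covers the chunk type plus the data, and the IEND chunk carries its own
CRC, so the sketch computes both inside the helper.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"

def png_chunk(chunk_type, data):
    # Chunk layout: 4-byte big-endian length, 4-byte type, data,
    # then a 4-byte CRC computed over type + data.
    crc = zlib.crc32(chunk_type + data) & 0xFFFFFFFF
    return (struct.pack(">L", len(data)) + chunk_type + data
            + struct.pack(">L", crc))

def encode_png(ihdr_data, idat_data):
    # Signature, then IHDR and IDAT, then an empty IEND chunk.
    return (PNG_SIGNATURE
            + png_chunk(b"IHDR", ihdr_data)
            + png_chunk(b"IDAT", idat_data)
            + png_chunk(b"IEND", b""))

# Usage: a 1x1 8-bit grayscale image with a single black pixel.
ihdr = struct.pack(">LLBBBBB", 1, 1, 8, 0, 0, 0, 0)
idat = zlib.compress(b"\x00\x00")  # filter byte + one pixel
png = encode_png(ihdr, idat)
```

The length and CRC bookkeeping in png_chunk is exactly the part that the
`!len` conversion and `>L` format codes would fold into the format string.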

-- 
Paul