[Python-ideas] a new bytestring type?

Andrew Barnert abarnert at yahoo.com
Mon Jan 6 12:34:31 CET 2014


From: Geert Jansen <geertj at gmail.com>

Sent: Monday, January 6, 2014 12:28 AM


> I'm not missing a new type, but I am missing the format method on the
> binary types.


I miss that too, but it's a bit tricky.

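'{}'.format(x) effectively calls str(x) (strictly, format(x, ''), which falls back to str(x) for types that don't define their own __format__).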

b'{}'.format(x) can't call bytes(x). At least not unless you want b'#{}'.format(6) to give you b'#\0\0\0\0\0\0'. Besides, most types don't provide a __bytes__, so even if it weren't for this problem, it wouldn't really be useful for anything except inserting bytes into other bytes. So, what _should_ it call?
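To make the problem concrete, here is a small sketch of what the str side does today and what a naive bytes(x)-based version would produce. The Point class is a made-up example type, not anything from the stdlib.

```python
# str.format ends up with the text form of the value.
text = '#{}'.format(6)

# bytes(6) is not "the digits of 6": it builds six zero bytes.
naive = b'#' + bytes(6)

# A hypothetical user type with no __bytes__ method.
class Point:
    def __init__(self, x, y):
        self.x, self.y = x, y

# Without __bytes__ (and without the buffer protocol), bytes() refuses.
try:
    bytes(Point(1, 2))
    point_ok = True
except TypeError:
    point_ok = False

print(text)      # #6
print(naive)     # b'#\x00\x00\x00\x00\x00\x00'
print(point_ok)  # False
```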

You could add encoding and errors keyword parameters (defaulting to 'ascii' and 'strict'), so b'{}'.format(x, encoding='utf-8') calls str(x).encode('utf-8'), which solves all of those problems… except that now you can't stick bytes objects into bytes formats (str(b'abc') is "b'abc'", so you'd get the repr rather than the bytes themselves), which is even worse.
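A rough sketch of that idea as a free function, since bytes has no format method to patch. The name bformat and its limits (auto-numbered {} fields only, no conversions like !r) are my own assumptions for illustration; it leans on string.Formatter.parse to split the template.

```python
import string

def bformat(template, *args, encoding='ascii', errors='strict'):
    """Hypothetical helper: format each argument as str, then encode it."""
    out = bytearray()
    values = iter(args)
    # latin-1 maps every byte to one code point, so the template round-trips.
    fields = string.Formatter().parse(template.decode('latin-1'))
    for literal, field, spec, _conversion in fields:
        out += literal.encode('latin-1')
        if field is not None:
            # With an empty spec this is str(value) for most types.
            out += format(next(values), spec).encode(encoding, errors)
    return bytes(out)

print(bformat(b'#{}', 6))                            # b'#6'
print(bformat(b'name: {}', 'é', encoding='utf-8'))   # b'name: \xc3\xa9'
# The catch: bytes arguments come out as their repr, not as themselves.
print(bformat(b'{}', b'\xc3\xa9'))                   # b"b'\\xc3\\xa9'"
```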

You could solve that by making objects that support the buffer protocol (like bytes) copy as-is instead of going through str and encode. That would mean you can't use any format flags on a placeholder that receives bytes, but maybe that's a good thing anyway (e.g., do you really want b'{:3}'.format(b'\xc3\xa9') to pad to 3 bytes, which comes out as only 2 characters instead of 3 because é is a 2-byte character in UTF-8?).
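Something like the following, again as a hypothetical free function (bformat is my name for it, and it only handles auto-numbered {} fields). I'm using memoryview() as the test for "supports the buffer protocol", and raising ValueError when such a value is given a format spec; both are just one possible choice.

```python
import string

def bformat(template, *args, encoding='ascii', errors='strict'):
    """Hypothetical helper: buffers are copied as-is, the rest go via str."""
    out = bytearray()
    values = iter(args)
    fields = string.Formatter().parse(template.decode('latin-1'))
    for literal, field, spec, _conversion in fields:
        out += literal.encode('latin-1')
        if field is None:
            continue
        value = next(values)
        try:
            view = memoryview(value)  # succeeds only for buffer objects
        except TypeError:
            # Not a buffer: format as text, then encode.
            out += format(value, spec).encode(encoding, errors)
        else:
            if spec:
                raise ValueError('format flags are not allowed for bytes-like values')
            out += view.tobytes()
    return bytes(out)

msg = bformat(b'Content-Length: {}\r\n\r\n{}', 5, b'hello')
print(msg)  # b'Content-Length: 5\r\n\r\nhello'

# Why flags on bytes are dubious: padding counts bytes, not characters.
padded = b'\xc3\xa9'.ljust(3)
print(len(padded), len(padded.decode('utf-8')))  # 3 2
```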

That would be enough to let you cram pre-encoded/formatted bytes, and things like numbers, into bytes formats made up of ASCII headers, which I think is 90% of what people want here. Does that seem worth pursuing?

