[Python-ideas] Bytes formatting (was Re: Adding 'bytes' as alias for 'latin_1' codec)

Tue May 31 20:08:33 CEST 2011

On 5/31/2011 4:24 AM, Nick Coghlan wrote:
> On Tue, May 31, 2011 at 5:32 PM, Greg Ewing
>> If you're using the special ascii type at all, rather
>> than an ordinary str, it's precisely because you want
>> to mix it with bytes. Making that part hard would
>> defeat the purpose,
>
> Indeed, the specific use case here is working with ASCII snippets
> embedded within ASCII compatible encodings (or otherwise demarcated
> from the 8-bit data).

My proposal for a function that interpolates bytes into bytes covers 
this case. There is no need for a new class at all. I agree that 
experience and experimentation are needed before adding anything to the 
stdlib. But here is a baseline version in Python:

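from itertools import zip_longest
import re

# Matches an empty replacement field, '{}' (braces escaped to be explicit).
field = re.compile(br'\{\}')

def bformat(template, *inserts):
    temlits = field.split(template)  # template literals
    res = bytearray()
    # Interleave literals and inserts; the shorter sequence is padded with b''.
    for t, i in zip_longest(temlits, inserts, fillvalue=b''):
        res.extend(t)
        res.extend(i)
    return res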

print(bformat(b'xxx{}yyy{}zzz', b'help', b'me'))

# bytearray(b'xxxhelpyyymezzz')

This is, of course, not limited to the ASCII subset of bytes.

print(bformat(b'xx\xaa{}yy\xbb{}zzz', b'h\xeeelp', b'm\xeee'))
# bytearray(b'xx\xaah\xeeelpyy\xbbm\xeeezzz')

The next step would be to change the field regex to allow a field spec 
between { and }, and to add capturing parentheses so that re.split keeps 
the field specs in its result. Then use those specs to format the 
inserted bytes or, later, ints.
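
A rough sketch of that next step might look like the following. The spec 
mini-language here (empty, '<N', '>N', 'd') and the helper name apply_spec 
are assumptions made up for illustration, not a proposed design. The only 
point is that a capturing group makes the split return the specs 
interleaved with the literals.

```python
import re

# Hypothetical field syntax: '{' + optional spec + '}'. The capturing group
# makes split() return the specs between the literal pieces.
field = re.compile(br'\{([^{}]*)\}')

def apply_spec(spec, value):
    # Assumed mini-language, for illustration only:
    #   b''           insert the bytes unchanged
    #   b'<N', b'>N'  pad with spaces to width N (left- or right-justified)
    #   b'd'          render an int as ASCII decimal digits
    if spec == b'':
        return value
    if spec == b'd':
        return str(value).encode('ascii')
    if spec[:1] in (b'<', b'>') and spec[1:].isdigit():
        width = int(spec[1:])
        return value.ljust(width) if spec[:1] == b'<' else value.rjust(width)
    raise ValueError('unsupported field spec: %r' % spec)

def bformat(template, *inserts):
    parts = field.split(template)  # [lit0, spec0, lit1, spec1, ..., litN]
    literals, specs = parts[0::2], parts[1::2]
    if len(specs) != len(inserts):
        raise ValueError('expected %d inserts, got %d'
                         % (len(specs), len(inserts)))
    res = bytearray(literals[0])
    for spec, value, lit in zip(specs, inserts, literals[1:]):
        res.extend(apply_spec(spec, value))
        res.extend(lit)
    return res

print(bformat(b'xxx{}yyy{>6}zzz', b'help', b'me'))
print(bformat(b'n={d};', 42))
```

Unlike the baseline, this version raises if the number of inserts does not 
match the number of fields, rather than padding silently. That is a design 
choice of the sketch, not something settled.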

-- 
Terry Jan Reedy