
On Aug 25, 2012, at 09:16 AM, Nick Coghlan wrote:
A couple of people at PyCon Au mentioned running into this kind of issue with Python 3. It relates to the fact that:
1. String formatting is *coercive* by default
2. Absolutely everything, including bytes objects, can be coerced to a string, due to the repr() fallback
So it's relatively easy to miss a decode or encode operation, and end up interpolating an unwanted "b" prefix and some quotes.
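To illustrate the gotcha described above, here is a minimal sketch under Python 3. The variable names are made up for illustration only.

```python
# Python 3: formatting a bytes object silently falls back to its repr().
payload = b"hello"  # e.g. data that was read from a socket or a binary file

# Forgot to decode: the "b" prefix and the quotes end up in the result.
wrong = "Got: {}".format(payload)

# Decoding first gives the intended text.
right = "Got: {}".format(payload.decode("ascii"))

print(wrong)  # Got: b'hello'
print(right)  # Got: hello
```

No exception is raised in the first case, which is why the mistake tends to show up only in the output.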
For existing versions, I think the easiest answer is to craft a regex that matches bytes object reprs and advise people to check that it *doesn’t* match their formatted strings in their unit tests.
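One possible shape for such a check is sketched below. The pattern and the helper name has_bytes_repr are assumptions made for illustration, not an established recipe, and the regex is only a heuristic: it can also flag legitimate text that happens to contain something like b'x'.

```python
import re

# Heuristic pattern for a bytes repr: the letter b (not preceded by a
# word character) followed by a single- or double-quoted literal, with
# backslash escapes allowed inside the quotes.
BYTES_REPR = re.compile(r"""\bb(?:'(?:[^'\\]|\\.)*'|"(?:[^"\\]|\\.)*")""")


def has_bytes_repr(text):
    """Return True if text appears to contain an interpolated bytes repr."""
    return BYTES_REPR.search(text) is not None


# Typical use inside a unit test:
formatted = "Got: {}".format(b"hello".decode("ascii"))
assert not has_bytes_repr(formatted)
```

The double-quoted branch is there because repr() switches to b"..." when the bytes contain a single quote but no double quote.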
For 3.4+ a non-coercive string interpolation format code may be desirable.
Or maybe just one that calls __str__ without a __repr__ fallback?

FWIW, the representation of bytes with the leading b'' does cause problems when trying to write doctests that work in both Python 2 and 3.

http://www.wefearchange.org/2012/01/python-3-porting-fun-redux.html

It might be a bit nicer to be able to write:
print('{:S}'.format(somebytes))
Of course, in the bytes case, its __str__() would have to be rewritten to not call its __repr__() explicitly. It's probably not worth it just to save writing a small helper function, but it would be useful in eliminating a surprising gotcha.

The other option is of course just to make doctests smarter[*].

Cheers,
-Barry

[*] Doctest haters need not respond snarkily. :)