[Python-ideas] format specifier for "not bytes"

Paul Moore p.f.moore at gmail.com
Fri Aug 24 22:03:25 CEST 2012


On 24 August 2012 20:21, Daniel Holth <dholth at gmail.com> wrote:
> I was merely surprised by the implicit bytes to
> "b'string'" conversion, and would like to be able to turn it off.

The conversion is not really "implicit". It's precisely what the %s
(or {!s}) conversion format *explicitly* requests - insert the str()
of the supplied argument at this point in the output string. For a
bytes object, str() returns the "b'string'" form. See library
reference 6.1.3 "Format String Syntax" (I don't know if there's an
equivalent description for % formatting).
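
As a quick illustration (a sketch for Python 3; the sample value is made up), all three spellings below do the same thing: each inserts str() of the bytes object, which is its "b'...'" form.

```python
# Sample value, made up for illustration.
data = b"string"

# Each of these asks for str(data), which for bytes is the "b'...'" form.
percent_s = "%s" % data
bang_s = "{!s}".format(data)
plain = "{}".format(data)

print(percent_s, bang_s, plain)
```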

If you want to force an argument to be a string, you could always do
something like this:

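def must_be_str(s):
    # Pass a str through unchanged; refuse anything else (bytes, int, ...).
    if isinstance(s, str):
        return s
    raise TypeError("expected str, got {}".format(type(s).__name__))

s = "some text"  # example value
x = "The value is {}".format(must_be_str(s))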

There's no "only insert a string here, raise an error for other types"
format specifier, largely because formatting is in principle about
*formatting* - converting other types to strings. In practice, most of
my uses of formatting (and I suspect many other people's) are more
about interpolation - inserting chunks of text into templates. For
that application, a stricter form could be more useful, I guess.

I could see value in a {!S} conversion specifier (in the terminology
of library reference 6.1.3 "Format String Syntax") which overrode
__format__ with a conversion function equivalent to must_be_str above.
But I don't know if it would get much use (anyone careful enough to
use it is probably careful enough about their types not to need it).
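
Nothing like {!S} exists in str.format today, but the idea can be approximated with a string.Formatter subclass. The following is only a sketch under assumptions: the class name StrictFormatter, the "S" conversion character, and the choice of TypeError are all hypothetical, picked for illustration. It relies on string.Formatter.convert_field, which the standard library provides as an override point.

```python
import string


class StrictFormatter(string.Formatter):
    """Formatter with a hypothetical "!S" conversion: str only, no coercion."""

    def convert_field(self, value, conversion):
        if conversion == "S":
            # Refuse anything that is not already a str (bytes, int, ...).
            if not isinstance(value, str):
                raise TypeError(
                    "!S requires str, got {}".format(type(value).__name__)
                )
            return value
        # Defer to the standard handling of !s, !r, !a and no conversion.
        return super().convert_field(value, conversion)


fmt = StrictFormatter()
ok = fmt.format("The value is {!S}", "text")

try:
    fmt.format("The value is {!S}", b"text")
    rejected = False
except TypeError:
    rejected = True
```

This leaves the built-in str.format untouched, so it only helps code that routes its templates through the custom formatter.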

Also, is it *really* what you want? Did your code accidentally pass
bytes to a {!s} formatter, and yet *never* pass a number and get the
right result? Or conversely, would you be willing to audit all your
conversions to be sure that numbers were never passed, and yet *still*
not be willing to ensure you have no bytes/str confusion? (Although as
your use case was encode/decode dances, maybe bytes really are
sufficiently special in your code - but I'd argue that needing to
address this issue implies that you have some fairly subtle bugs in
your encoding process that you should be fixing before worrying about
this).
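
The trade-off can be seen in a small sketch. Both helpers are hypothetical names invented for illustration: only_str accepts str and nothing else, so it refuses numbers too, while not_bytes refuses only bytes-like values and lets everything else go through the usual conversion. The last line shows the alternative of decoding at the boundary, assuming the data is UTF-8.

```python
def only_str(value):
    # Strict: accept str and nothing else, so numbers are refused too.
    if isinstance(value, str):
        return value
    raise TypeError("expected str, got {}".format(type(value).__name__))


def not_bytes(value):
    # Lenient: refuse only bytes-like values, let format() convert the rest.
    if isinstance(value, (bytes, bytearray)):
        raise TypeError("refusing to format {}".format(type(value).__name__))
    return value


# Numbers still work with the lenient check.
count_line = "Count: {}".format(not_bytes(42))

# The strict check refuses the same number.
try:
    "Count: {}".format(only_str(42))
    strict_accepts_numbers = True
except TypeError:
    strict_accepts_numbers = False

# Sample UTF-8 bytes, decoded explicitly before formatting.
payload = b"caf\xc3\xa9"
name_line = "Name: {}".format(not_bytes(payload.decode("utf-8")))
```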

Paul


