On Fri, Aug 24, 2012 at 4:03 PM, Paul Moore <p.f.moore@gmail.com> wrote:
On 24 August 2012 20:21, Daniel Holth <dholth@gmail.com> wrote:
I was merely surprised by the implicit bytes to "b'string'" conversion, and would like to be able to turn it off.
The conversion is not really "implicit". It's precisely what the %s (or {!s}) conversion format *explicitly* requests - insert the str() of the supplied argument at this point in the output string. See library reference 6.1.3 "Format String Syntax" (I don't know if there's an equivalent description for % formatting).
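As a small illustration of the point above, this sketch shows that both formatting styles insert the str() of a bytes argument on Python 3. The variable names are made up for the example, and it assumes the interpreter is started without the -b or -bb flags.

```python
# Both %s and {!s} ask for str() of the argument, so bytes come out
# as their repr-like text form on Python 3.
payload = b"abc"  # hypothetical value, chosen for illustration

percent_style = "%s" % payload
brace_style = "{!s}".format(payload)

print(percent_style)                 # b'abc'
print(brace_style == str(payload))   # True
```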
If you want to force an argument to be a string, you could always do something like this:
    def must_be_str(s):
        if isinstance(s, str):
            return s
        raise ValueError

    x = "The value is {}".format(must_be_str(s))
There's no "only insert a string here, raise an error for other types" format specifier, largely because formatting is in principle about *formatting* - converting other types to strings. In practice, most of my uses of formatting (and I suspect many other people's) are more about interpolation - inserting chunks of text into templates. For that application, a stricter form could be more useful, I guess.
I could see value in a {!S} conversion specifier (in the terminology of library reference 6.1.3 "Format String Syntax") which overrode __format__ with a conversion function equivalent to must_be_str above. But I don't know if it would get much use (anyone careful enough to use it is probably careful enough of their types to not need it).
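No {!S} conversion exists in str.format, but the idea can be approximated today with the standard library's string.Formatter, whose convert_field() hook receives the conversion character. The following is only a sketch under that assumption: the class name StrictFormatter is hypothetical, and raising ValueError simply mirrors must_be_str.

```python
import string


class StrictFormatter(string.Formatter):
    """Hypothetical formatter that adds an 'S' conversion: str only."""

    def convert_field(self, value, conversion):
        if conversion == "S":
            # Same check as must_be_str: pass str through, reject the rest.
            if isinstance(value, str):
                return value
            raise ValueError(
                "expected str, got {}".format(type(value).__name__)
            )
        # Fall back to the standard handling of !s, !r, !a and no conversion.
        return super().convert_field(value, conversion)


fmt = StrictFormatter()
ok = fmt.format("The value is {!S}", "abc")
print(ok)

try:
    fmt.format("The value is {!S}", b"abc")
    rejected = False
except ValueError:
    rejected = True
print(rejected)
```

Because this lives in a Formatter subclass, it only affects calls made through that object; plain "...".format(...) calls are unchanged.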
Also, is it *really* what you want? Did your code accidentally pass bytes to a {!s} formatter, and yet *never* pass a number and get the right result? Or conversely, would you be willing to audit all your conversions to be sure that numbers were never passed, and yet *still* not be willing to ensure you have no bytes/str confusion? (Although as your use case was encode/decode dances, maybe bytes really are sufficiently special in your code - but I'd argue that needing to address this issue implies that you have some fairly subtle bugs in your encoding process that you should be fixing before worrying about this).
Hi Paul! You could probably guess that this is the wheel digital signatures package. All the string formatting arguments (I hope) are now passed to binary() or native() string conversion functions that do less on Python 2.7 than on Python 3.

Yes, I would be willing to audit my code to ensure that numbers were never passed. I am already calling .encode() and .decode() on most objects in this pipeline. In my opinion int-when-usually-str is in most cases as likely to be a bug as getting bytes() when you expect str(). Python even has the -bb argument to help catch this conversion, which is almost never the right thing to do. How often does anyone who is not writing a REPL ever expect "%s" % bytes() to produce "b''"?

In this particular case I could also make my life a lot easier by extending the JSON serializer to accept bytes(), but I suppose I would lose the string formatting operations.
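For reference, the effect of -bb can be observed by running the same one-line snippet in a child interpreter with and without the flag. This is a rough sketch, assuming a Python 3.7+ interpreter that still supports -bb; the snippet and variable names are invented for the example.

```python
import subprocess
import sys

# One-line program that formats bytes with %s.
snippet = "print('%s' % b'abc')"

# Without -bb the conversion silently succeeds.
plain = subprocess.run(
    [sys.executable, "-c", snippet], capture_output=True, text=True
)

# With -bb, str() on a bytes instance is turned into an error.
strict = subprocess.run(
    [sys.executable, "-bb", "-c", snippet], capture_output=True, text=True
)

print(plain.stdout.strip())             # b'abc'
print(strict.returncode != 0)           # True
print("BytesWarning" in strict.stderr)  # True
```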