[Python-ideas] Re: Support more conversions in format string

April 24, 2021

23.04.21 12:22, Stephen J. Turnbull wrote:
...
Serhiy Storchaka writes:
...
Currently format strings (and f-string expressions) support three
conversions: !s -- str, !r -- repr and !a for ascii.
It's not clear to me what these are good for, to be honest.  Why not
just have s, r, and a format codes?  The !conversions don't compose
with format codes:
    >>> f"{10!r:g}"
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    ValueError: Unknown format code 'g' for object of type 'str'
Because the conversion turns the value into a string first, and string
formatting does not support "g". The !s, !r and !a converters are separate
from the format specifier, and they are an old and widely used feature.
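To make the ordering concrete, here is a minimal sketch of what happens to the expression above. It assumes nothing beyond the built-in format(); the variable names are illustrative only.

```python
value = 10

# The !r conversion runs first, so the format spec is applied to the str '10'.
converted = repr(value)

try:
    format(converted, "g")
    error = None
except ValueError as exc:
    # str.__format__ rejects the numeric presentation code 'g'.
    error = str(exc)

# A spec that str understands composes with the conversion without trouble.
padded = f"{value!r:>5}"

# Without the conversion, 'g' reaches int.__format__ and is accepted.
general = f"{value:g}"
```

So the conversion and the format spec do compose, but only with specs that make sense for the converted string.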

I only propose to add more converters, because they are needed for some
compiler optimizations. I was going to add them as a private AST detail in
any case, but if we are going to make this feature public, it is worth
discussing ahead of time to avoid name conflicts in the future. I asked which
letters should be chosen for the converters for int() and index().
...
So I don't think I want to go further.  I have some sympathy for your
proposal, in part because I'd like to see something done about moving
I18N into the format() mechanism.  But I'm going to play devil's
advocate, mostly because I'm getting old enough to not like change so
much. ;-)
I am not sure how this relates to I18N.
...
...
I propose to add support of additional conversions: for int, float
and operator.index. It will help to convert automatically printf-
like format strings to f-string expressions: %d, %i, %u -- use int,
%f -- use float, %o, %x -- use operator.index.
This makes more sense to me than !s, !r, and !a -- you might or might
not want these conversions, I guess.  But it seems like a lot of
complexity to add.  On the other hand, isn't the answer "fix
__format__ in class definitions?"
We need to format a value as an integer or a float independently of its
__format__ implementation, and raise an error if it cannot be converted
to an integer or a float. The purpose of the feature is to bypass __format__
and get the same result as printf-style formatting.
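A rough sketch of the gap being described: printf-style codes coerce the argument, while the corresponding format spec is handed to the object's own __format__. The explicit int() and operator.index() calls below only stand in for the proposed converters, whose spelling had not been decided in this thread.

```python
import operator

x = 3.7

# printf-style %d truncates the float the way int() does.
printf_d = "%d" % x
# Explicit int() stands in for the proposed converter (hypothetical spelling).
emulated_d = f"{int(x):d}"

# The same code passed as a format spec goes to float.__format__, which rejects it.
try:
    format(x, "d")
    float_accepts_d = True
except ValueError:
    float_accepts_d = False

# %x requires a true integer; operator.index() enforces the same rule.
printf_x = "%x" % 255
emulated_x = f"{operator.index(255):x}"

try:
    "%x" % x
    printf_x_accepts_float = True
except TypeError:
    printf_x_accepts_float = False
```

In both cases the explicit call reproduces the printf-style result without consulting the original object's __format__.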
...
But we could change int.__format__ to allow 's' as a format code[1],
automagically calling str(), just as 'efg' are allowed and
automagically call float().
Yes, we could add support for "s" in int.__format__, but it was decided
not to do this, for several reasons. It would confuse the format specifier with
the converter, it would let some errors go uncaught (like passing an integer
when a string is expected), and it would require duplicating the code of
str.__format__ in int.__format__ (and in every other __format__ where you
want to support "s").
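For reference, a small sketch of the current behaviour that this paragraph defends. It uses only built-in formatting; the variable names are illustrative.

```python
n = 42

# int.__format__ accepts float presentation codes and converts for you.
as_float = format(n, ".1f")

# It does not accept 's'; passing an int where a string is expected is an error.
try:
    format(n, "s")
    int_accepts_s = True
except ValueError:
    int_accepts_s = False

# The converter and the specifier are separate steps: with !s the spec is
# handled by str.__format__ (left-aligned by default), without it by
# int.__format__ (right-aligned by default).
via_int = f"{n:5}"
via_str = f"{n!s:5}"
```

The different default alignment shows that the same width spec is being interpreted by two different __format__ implementations.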
...
...
Currently I write f"path = {repr(str(path))}" or f"path =
{str(path)!r}", but want to write f"path = {path!s!r}".
I have some sympathy for this; it's not a big change, and given the
syntax you propose I doubt anyone would be confused about the
semantics, including the order of conversions.  However:
To me, this seems like a clear case where you want to embed the
conversions in the format code mechanism for those specific types:
extend the __format__ method for URL objects to allow {url:h} where
the 'h' format code applies "hex-escape", or you could repurpose the
"u" code from the standard minilanguage to apply url-escape, or (I
don't know if format allows) you could use {url:%}!
Supporting the "h" format code in the __format__ method for URL objects
is a reasonable idea, and it is the purpose of __format__ methods. But
it is not related to converters. If you want to dump the value of some
variable using repr(), do you want to add support for "r" to every
implementation of __format__ in the world (and what if some of them do not
support it, or use it with different semantics)?  "%r" % x just calls
repr(), and we wanted this feature in the new formatting.
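The distinction can be sketched with a toy class. The URL class and its "h" code below are hypothetical, invented for illustration; urllib.parse.quote is used as a stand-in for whatever escaping a real URL type would apply.

```python
from urllib.parse import quote


class URL:
    """Hypothetical URL wrapper, used only for illustration."""

    def __init__(self, text):
        self.text = text

    def __repr__(self):
        return f"URL({self.text!r})"

    def __format__(self, spec):
        if spec == "":
            return self.text
        if spec == "h":
            # Type-specific format code: percent-escape the text.
            return quote(self.text)
        raise ValueError(f"Unknown format code {spec!r} for URL")


url = URL("/a b")

escaped = f"{url:h}"        # handled by URL.__format__
dumped = f"{url!r}"         # repr() is applied; URL.__format__ is never asked
old_style = "%r" % (url,)   # printf-style equivalent

# The class does not need to know about an 'r' code, and here it does not.
try:
    format(url, "r")
    knows_r = True
except ValueError:
    knows_r = False
```

The !r converter works for this class even though its __format__ knows nothing about it, which is the point being made about not pushing "r" into every __format__.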
...
How many types would need an additional format code to handle whatever
use case wants repr(str())?
All types with a custom __str__. If a value is convertible to str, you often
want to see the string representation, because it is shorter and more
human-readable than the result of repr(). But since it can contain any
special characters, you want to see them and the boundaries of that
string, so you use repr() or ascii() on the resulting string.
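As an illustration of the use case, here is a sketch with pathlib. The path value is made up; the chained !s!r and !s!a spellings are only a proposal in this thread, so the example uses the forms that work today.

```python
from pathlib import PurePosixPath

# A made-up path with a non-ASCII character and a trailing newline.
path = PurePosixPath("/tmp/café\n")

# Plain str(): short, but the newline is invisible and the boundary is unclear.
plain = f"path = {path}"

# repr() of the str: quotes mark the boundary and the newline is escaped.
quoted = f"path = {str(path)!r}"
quoted_explicit = f"path = {repr(str(path))}"

# ascii() of the str: additionally escapes non-ASCII characters.
ascii_quoted = f"path = {str(path)!a}"
```

Both working spellings need an explicit str() call inside the replacement field, which is what the proposed chained converters would remove.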
...
Or are you envisioning heavy use of !f!a
etc?  (I can't see how any of the existing conversions could have an
effect on the output of the numerical conversions you propose, though.)
No, I only need !s!r and !s!a. Maybe !f!s will have some use, but since
the repr of a float is the same as its str, !f!a is the same as !f!s.
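A quick check of that last claim, using explicit float() in place of the proposed !f converter (whose spelling is still hypothetical here):

```python
values = [1, 0.1, 1e100, float("inf")]

# For floats, str(), repr() and ascii() all produce the same ASCII text.
rows = [(f"{float(v)!s}", f"{float(v)!r}", f"{float(v)!a}") for v in values]
all_equal = all(s == r == a for s, r, a in rows)
```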