On 8/10/2015 7:23 PM, Victor Stinner wrote:
But in any event, I don't see the distinction between calling str.format(), and calling each object's __format__ method. Both are compliant with the PEP, which doesn't specify exactly how the transformation is done.
When I read the PEP for the first time, I understood that you reimplemented str.format() using the __format__() methods. So I understood that it was a new formatting language and that it would be tricky to reimplement it, for example in a library providing i18n with f-string syntax (I'm not sure that it's feasible, it's just an example). I also expected many subtle differences between .format() and f-strings.
In fact, f-string is quite standard and not new, it's just a compact syntax to call .format() (well, with some minor and acceptable subtle differences). For me, it's a good thing to rely on the existing .format() method because it's well known (no need to learn a new formatting language).
Maybe you should rephrase some parts of your PEP and rewrite some examples to say that it's "just" a compact syntax to call .format().
Okay. I'll look at it.
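For readers following along, here is a minimal sketch of the equivalence being discussed. It assumes an interpreter where f-strings are available (Python 3.6 or later); the variable names are made up for the illustration and do not come from the PEP.

```python
import datetime
import decimal

# Hypothetical sample values, chosen only for this illustration.
when = datetime.datetime(2015, 8, 10, 19, 23)
amount = decimal.Decimal('100')

# The same format specs, spelled three ways.
via_fstring = f'{when:%Y-%m-%d} {amount:>8.2f}'
via_format = '{:%Y-%m-%d} {:>8.2f}'.format(when, amount)
via_builtin = format(when, '%Y-%m-%d') + ' ' + format(amount, '>8.2f')

# One of the minor differences: str.format() takes an unquoted key in
# its index syntax, while an f-string evaluates a real Python expression.
d = {'a': 1}
key_via_format = '{d[a]}'.format(d=d)
key_via_fstring = f"{d['a']}"
```

All three spellings pass the same format spec to the same object, which is why the results agree; only the surrounding syntax differs.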
For me, calling __format__() multiple times versus format() once matters for performance, because I contributed to the implementation of _PyUnicodeWriter. I spent a lot of time keeping performance good when the implementation of Unicode was rewritten for PEP 393. With this PEP, writing an efficient implementation is much harder. The dummy benchmark is to compare Python 2.7 str.format() (bytes!) to Python 3 str.format() (Unicode!). Users want similar performance! If I recall correctly, Python 3 is not bad (faster in some corner cases).
'{} {}'.format(datetime.datetime.now(), decimal.Decimal('100')) calls __format__() twice. It's only special cased to not call __format__ for str, int, float, and complex. I'll grant you that most of the cases it will ever be used for are thus special cased.
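A small sketch of the behaviour described above: str.format() dispatches to __format__() once per replacement field when the argument is not one of the special-cased built-in types. The Tracked class is hypothetical and exists only to record the calls.

```python
class Tracked:
    """Hypothetical helper that records every __format__() call."""

    calls = []

    def __init__(self, name):
        self.name = name

    def __format__(self, spec):
        # Record which object was formatted and with which spec,
        # then delegate to the normal str formatting.
        Tracked.calls.append((self.name, spec))
        return format(self.name, spec)


result = '{} {:>5}'.format(Tracked('a'), Tracked('b'))

# After the call above, Tracked.calls holds one entry per field:
# [('a', ''), ('b', '>5')]
```

The datetime and Decimal arguments in the example above go through the same per-object dispatch, since both types define their own __format__().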
Concatenating temporary strings is less efficient than using _PyUnicodeWriter (single buffer) when you have UCS-1, UCS-2 and UCS-4 strings (1/2/4 bytes per character). It's more efficient to write directly into the final format (UCS-1/2/4), even if you may need to convert the buffer from UCS-1 to UCS-2 (and maybe even one more time to UCS-4).
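The 1/2/4 bytes-per-character storage can be observed from Python, as a rough sketch. This relies on sys.getsizeof() and on CPython's PEP 393 string layout, both of which are implementation details; the helper name bytes_per_char is made up for the illustration, and it says nothing about _PyUnicodeWriter itself.

```python
import sys


def bytes_per_char(ch, n=1000):
    """Estimate CPython's per-character storage for a string of `ch`.

    Subtracting the size of a 1-character string from the size of an
    (n + 1)-character string cancels the fixed object header.
    """
    return (sys.getsizeof(ch * (n + 1)) - sys.getsizeof(ch)) // n


ascii_width = bytes_per_char('a')             # fits the 1-byte form
bmp_width = bytes_per_char('\u20ac')          # needs the 2-byte form
astral_width = bytes_per_char('\U0001F40D')   # needs the 4-byte form

# A single wide character forces the whole result into the wider form,
# which is the conversion a single-buffer writer has to perform.
narrow = 'a' * 1001
widened = 'a' * 1000 + '\u20ac'
growth = sys.getsizeof(widened) - sys.getsizeof(narrow)
```

On CPython the three widths come out as 1, 2 and 4, and growth is roughly the length of the string, because every character in widened is stored in 2 bytes.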
As I said, after it's benchmarked, I'll look at it. It's not a user-visible change. And thanks for your work on _PyUnicodeWriter.

Eric.