Draft2 PEP on string interpolation

With the major design decisions made, behold version 2 of my draft PEP on string interpolation. It's now significantly shorter due to removal of most of the i18n related discussion, pruning, as well as simplification of the prose itself. I don't expect many changes from here on:

https://bitbucket.org/mixmastamyk/docs/src/default/pep/pep-05XX.rst

TL;DR: Here is a summary table and comparisons with my current understanding of the other proposals, please correct if they are now out of date:

String Interpolation PEP Comparison
===================================

================= ================= ================= =================
PEP               PEP 498           PEP 501           Draft PEP
================= ================= ================= =================
Name              Format/f-string   Gen. Purpose Str… Expression-string
Prefix            f''               i''               e''
Syntax            str.format()+     .format+Template+ str.format()+
Returns           String join expr… Object            Object
Immediate Render  Yes               No                Yes
Deferred Render   No                Yes, str, mutable Yes
I18n Support      No                Yes               Input available
Escaping Hook     No                No                Yes, manual
================= ================= ================= =================

The table can be found here and updated via pull-request:

https://bitbucket.org/mixmastamyk/docs/src/default/pep/pep_comparison.rst

-Mike

On Wed, Aug 26, 2015 at 11:40 PM, Mike Miller <python-ideas@mgmiller.net> wrote:
With the major design decisions made, behold version 2 of my draft PEP on
string interpolation.
It's now significantly shorter due to removal of most of the i18n related
discussion, pruning, as well as simplification of the prose itself. I don't expect many changes from here on:
https://bitbucket.org/mixmastamyk/docs/src/default/pep/pep-05XX.rst
TL;DR: Here is a summary table and comparisons with my current
understanding of the other proposals, please correct if they are now out of date:
=================
Is the Draft PEP column of the table supposed to have both "Immediate Render" and "Deferred Render" as "Yes"? I'm hoping that's a typo, otherwise I don't understand what it means at all.
The table can be found here and updated via pull-request:
https://bitbucket.org/mixmastamyk/docs/src/default/pep/pep_comparison.rst
Cody

On 08/27/2015 09:19 AM, Cody Piersall wrote:
There is no need to call str() manually with e'', the .rendered member is returned by default:

    >>> print(estr('Hello {friend}.'))
    Hello John.

Here is the example implementation: https://bitbucket.org/mixmastamyk/docs/src/default/pep/estring_demo.py

-Mike

On 08/27/2015 11:22 AM, Mike Miller wrote:
The problem with this auto-rendering is that the format_spec and conversion character have to make sense to __format__. For example, you couldn't do this, if value were a string:

    to_html(e'<p>{value:raw}</p>')

Imagine that ":raw" is interpreted by to_html() to mean that the string does not get html escaped. With PEP 501, the format_spec and conversion are opaque to the i-string machinery, and are only interpreted by the custom interpolation function (here, to_html()).

Eric.
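The constraint described above can be seen with today's str.format(), which already routes the format spec to the value's own __format__. The following is only a sketch with the existing built-in machinery; e'' strings and to_html() are not implemented here, and the variable names are made up for illustration.

```python
value = "<b>bold</b>"

# The spec "raw" is handed to type(value).__format__. str does not
# recognize it, so rendering fails before any interpolation function
# could look at the spec.
try:
    rendered = "<p>{value:raw}</p>".format(value=value)
    error = None
except ValueError as exc:
    rendered = None
    error = exc

print(rendered)               # None
print(type(error).__name__)   # ValueError
```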

On 08/27/2015 10:27 AM, Eric V. Smith wrote:
The problem with this auto-rendering is that the format_spec and
Hmm, I believe this is a design choice, one that should be made depending on whether this use-case is important and/or common. The estr provides for this situation instead by allowing for additional renderings if/when needed, but doesn't require str() in the common case. This is the first I've seen of directives passed inside the format spec, I'll add it to the comparison table. -Mike

On Aug 27, 2015, at 10:27, Eric V. Smith <eric@trueblade.com> wrote:
With str.format, it's the type of value that decides how to interpret the format spec. If you leave it up to the consumer of the i-string (the interpolation function) instead of the value's type, how do you handle things like numeric formats and datetime formats and so on? Would I need to do something like this:

    to_html(e'<p>{str(e"{value:05}"):raw}</p>')

Or would to_html (and every other custom interpolator) have to take things like :raw05 and parse out a format spec to pass to value.__format__?

Maybe what you really want is !raw rather than :raw. If there is no conversion, __format__ gets called and the result passed around as part of the i-string object; if there is one, the value, conversion, and format spec get passed instead (and then to_html could decide that conversion 'raw' means to call format(value, format_spec) and then not escape the result).

Although that's pretty different from how the standard conversions work (call repr or ascii on the value, then format the resulting string with the format spec). So maybe replacement fields need another subpart separate from both the conversion and the format spec that's only used by custom interpolators?
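As a reminder of the behaviour being referred to, here is a small sketch using only built-in format(): the same call dispatches to each type's __format__, so the spec "language" differs per type. The variable names are chosen for illustration only.

```python
import datetime

# Each value's type interprets the spec through its own __format__.
padded_int = format(42, "05")                                  # int rules
rounded = format(3.14159, ".2f")                               # float rules
stamp = format(datetime.date(2015, 8, 27), "%Y-%m-%d")         # strftime rules

print(padded_int)   # 00042
print(rounded)      # 3.14
print(stamp)        # 2015-08-27
```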

On 08/27/2015 02:30 PM, Andrew Barnert via Python-ideas wrote:
Your interpolator would need to decide. It might never call value.__format__. It might invent some other protocol, like __html_escape__(fmt_spec). Or, it might bake-in knowledge of how to convert whatever types it cares about. Or another good choice would be to use the singledispatch module. In fact, I like singledispatch so much that I'm going to have to use it in an example.
Maybe what you really want is !raw rather than :raw. If there is no conversion, __format__ gets called and the result passed around as part of the i-string object; if there is one, the value, conversion, and format spec get passed instead (and then to_html could decide that conversion 'raw' means to call format(value, format_spec) and then not escape the result).
You could do that in addition to or in place of format_spec. Except currently conversions are only allowed to be a single character, but I don't see any reason not to relax that. The takeaway is that the PEP 501 i-string machinery applies zero significance to format_spec and conversion. It just parses them out of the template string. It's left up to the interpolator to apply some meaning to them.
Although that's pretty different from how the standard conversions work (call repr or ascii on the value, then format the resulting string with the format spec). So maybe replacement fields need another subpart separate from both the conversion and the format spec that's only used by custom interpolators?
There's nothing stopping you from doing this. You could decide that your format_spec, for some interpolator, is composed of "part1^part2", and do something based on part1 and part2.

See: https://bitbucket.org/ericvsmith/istring/src/d92e47c96609eed44ed57b7d3c1932b... for how my i-string str() interpolator applies format_spec and conversion. Another interpolator could do something different (for example, https://bitbucket.org/ericvsmith/istring/src/d92e47c96609eed44ed57b7d3c1932b... for regex escaping).

Eric.
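A rough sketch of how the two ideas in this message could fit together: per-type handling via functools.singledispatch, and a field spec split into "part1^part2". This is not the linked istring code. The names escape_field, render_field and Markup, and the reading of part1 as a directive such as "raw", are assumptions made for the example.

```python
import html
from functools import singledispatch


class Markup(str):
    """Hypothetical marker type for text that is already safe HTML."""


@singledispatch
def escape_field(value, format_spec):
    # Fallback for any type: use the value's own __format__, then escape.
    return html.escape(format(value, format_spec))


@escape_field.register(Markup)
def _(value, format_spec):
    # Already-safe markup is formatted but not escaped again.
    return format(value, format_spec)


def render_field(value, spec):
    """Split an assumed 'directive^format_spec' spec and apply it."""
    # Caveat: '^' is also the centering flag in the standard format
    # mini-language, so this delimiter is ambiguous for specs like '^10'.
    directive, _sep, format_spec = spec.rpartition("^")
    if directive == "raw":
        return format(value, format_spec)   # caller asked for no escaping
    return escape_field(value, format_spec)


print(render_field("<b>", ""))               # &lt;b&gt;
print(render_field("<b>", "raw^"))           # <b>
print(render_field(42, "05"))                # 00042
print(render_field(Markup("<i>x</i>"), ""))  # <i>x</i>
```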

On Aug 27, 2015, at 12:08, Eric V. Smith <eric@trueblade.com> wrote:
But even with singledispatch, you have to write formatters for every type that just call the default format; it means you only have N+M functions to write instead of N*M (where N is the number of interpolators and M the number of types to format), but that's still a lot more than just N functions. Also, of course, the fact that you're doing it differently from the usual "just write a __format__ method" means an extra thing for people to learn, and search for.

And I think that's functionality people will almost always expect. Whether I'm dealing with a logger, an i18n library, or even a SQL DECIMAL field, I'd expect :3.5 to mean the same thing it does in str.format. So making every project write the identical code to make that true just because a small number of them won't care seems like a bad idea.

And similarly, not having a standard way to separate out the interpolator spec (which is obviously unique to every interpolator) and the format spec (which should be the same for almost every interpolator, but is different for each type) seems like it just adds confusion without adding flexibility.

Finally, the fact that, by default, handling format specs isn't done at all means it's very easy to design and implement an interpolator that doesn't do it, use it for a while, and only later realize that you need to be able to do the equivalent of :+05 and haven't left any way to do that and now have to find a clumsy way to tack it on.

On 28 August 2015 at 02:55, <random832@fastmail.us> wrote:
While I appreciate that people are still in a design phase with this, I'd like to point out that in the end, people will need to teach and remember this stuff.

Currently ! introduces a conversion, which is one of r, s, or a (and is rarely used except for the occasional !r). Whereas : introduces a format spec, which is a mini-language for describing how to format the value and is specific to the type of the value.

The "raw" thing above feels like neither of those things. It feels more like a conversion, but if so then conversions are currently single letters, and language-defined. I'd also expect (for that reason) that any conversions would be valid in any type of formatting (str.format, f-strings, e-strings, whatever).

I'm not saying you *have* to follow those rules, just that it feels like we're setting up for a huge teachability nightmare (and a feature that no-one will ever use, because they can't remember how[1]) if we don't at least try to adhere to some level of consistency here.

Paul

[1] I already tend to ignore most of the features of format strings beyond putting in a simple field number or name, because I don't remember the details and would have to look them up. I'm pretty sure I'd never use something like !raw, no matter how it was spelt, for exactly the same reason.
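For reference, a short sketch of the existing rules summarised above, using plain str.format(): the three language-defined conversions, a conversion combined with a format spec, and what happens today with a multi-letter conversion such as !raw. Variable names are illustrative only.

```python
name = "café"

as_str = "{!s}".format(name)      # str(name)
as_repr = "{!r}".format(name)     # repr(name)
as_ascii = "{!a}".format(name)    # ascii(name)

# The conversion runs first; the format spec then applies to the resulting
# string, here right-aligning repr("ab") in a field of width 8.
aligned = "{!r:>8}".format("ab")

# Conversions are a closed, single-character set in str.format today.
try:
    "{!raw}".format(name)
    multi_letter_ok = True
except ValueError:
    multi_letter_ok = False

print(as_str, as_repr, as_ascii)   # café 'café' 'caf\xe9'
print(repr(aligned))               # "    'ab'"
print(multi_letter_ok)             # False
```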

On Fri, Aug 28, 2015, at 03:52, Paul Moore wrote:
Yes, but the format spec mini-language belongs to the type of the value. Depending on what value is, "raw" could _already_ have a meaning.
Well, at the time I posted that I thought we were moving away from things being language-defined, because my mind was still on the "user-defined string prefixes" proposal from a while back. Them currently being single letters isn't really a compelling argument.
And what if the type of value expects to be able to process a format specifier of "raw", rather than it being used by to_html for its own purpose? The advantage of conversion specifiers is that they're currently a closed set.

----

One thing that I don't think *either* version successfully expresses is that while in many cases (including the to_html example) we want a string, that won't always be the case. If we have a syntax for inserting something as, e.g., a SQL parameter, it should be able to accept a double, but I'm not convinced it shouldn't _also_ be able to describe putting the string result of converting the double with ".05d" as a varchar.
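To make the clash concrete, here is a sketch of a type whose own format mini-language already assigns a meaning to "raw". The Money class is invented for this example and does not come from the thread.

```python
class Money:
    """Hypothetical type that already understands a 'raw' format spec."""

    def __init__(self, cents):
        self.cents = cents

    def __format__(self, spec):
        if spec == "raw":
            # For this type, 'raw' means "the underlying integer of cents".
            return str(self.cents)
        # Otherwise format the amount as a float, two decimals by default.
        return format(self.cents / 100, spec or ".2f")


price = Money(1999)
as_raw = "{price:raw}".format(price=price)
as_default = "{price}".format(price=price)

print(as_raw)       # 1999
print(as_default)   # 19.99
```

With such a type, an interpolator that also claims ":raw" for itself has no way to tell which meaning the author intended.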

On Wed, Aug 26, 2015 at 9:40 PM, Mike Miller <python-ideas@mgmiller.net> wrote:
I'm confused by this proposal. There are many paragraphs about motivation, philosophy, other languages, etc., but the proposal itself seems to be poorly specified. E.g. I couldn't figure out what code should be produced by:

    a = e"Sliced {n} onions in {t1-t0:.3f} seconds."

Generalizing from the only example in the specification, this would become:

    a = estr("Sliced {n} onions in {t1-t0:.3f} seconds", n=n, t1-t0=t1-t0)

which is invalid syntax. Similarly, I don't see how e.g. the following could be rendered correctly:

    a = e"Three random numbers: {rand()}, {rand()}, {rand()}."

I also don't understand the claim that no str(estr) is necessary to render the result -- the estr implementation given at https://bitbucket.org/mixmastamyk/docs/src/default/pep/estring_demo.py has a __str__ method that renders the .rendered attribute, but without the str() call the type of 'a' in the above examples would not be str, and various things that operate on strings (e.g. regular expression searches) would not work. A solution might be to make estr a subclass of str, but nothing in the PEP suggests that you have even considered this problem. (The only hint I can find is the comment "more magic-methods to be implemented here, to improve str compatibility" in your demo implementation, but without subclassing str this is not enough.)

--
--Guido van Rossum (python.org/~guido)
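The "invalid syntax" point can be checked without any e-string support, simply by asking the compiler about the generated call. This is a small sketch; the template text is abbreviated and the source string is only compiled, never executed.

```python
# A keyword argument name must be an identifier, so "t1-t0=..." cannot parse.
source = 'estr("Sliced {n} onions", n=n, t1-t0=t1-t0)'

try:
    compile(source, "<example>", "eval")
    is_valid = True
except SyntaxError:
    is_valid = False

print(is_valid)   # False
```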

On Aug 27, 2015, at 11:27, Guido van Rossum <guido@python.org> wrote:
I also don't understand the claim that no str(estr) is necessary to render the result -- the estr implementation given at https://bitbucket.org/mixmastamyk/docs/src/default/pep/estring_demo.py has a __str__ method that renders the .rendered attribute, but without the str() call the type of 'a' in the above examples would not be str, and various things that operate on strings (e.g. regular expression searches) would not work. A solution might be to make estr a subclass of str, but nothing in the PEP suggests that you have even considered this problem. (The only hint I can find is the comment "more magic-methods to be implemented here, to improve str compatibility" in your demo implementation, but without subclassing str this is not enough.)
Even subclassing str doesn't really help, because there's plenty of code (including, I believe, regex searches) that just looks at the raw string storage that gets created at str.__new__ and can never be mutated or replaced later. So, anything that's delayed-rendered is not a str, or at least it's not the right str. (I know someone earlier in the discussion suggested that at the C level you could replace PyUnicode_READY with a function that, if it's an estr, first calls self.__str__ and then initializes the string storage to the result and then does the normal READY stuff, but I don't think that actually works, does it?)
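A sketch of the pitfall described here, assuming a str subclass that stores the template at creation and only renders in __str__. LazyStr is a made-up class for illustration, not the estr from the demo script.

```python
import re


class LazyStr(str):
    """Hypothetical str subclass that defers rendering to __str__."""

    def __new__(cls, template, **values):
        # The underlying string storage is fixed here, to the template.
        self = super().__new__(cls, template)
        self.values = values
        return self

    def __str__(self):
        return str.format(self, **self.values)


lazy = LazyStr("Hello {friend}.", friend="John")

explicit = str(lazy)                    # goes through __str__
found_name = re.search("John", lazy)    # sees the stored template instead
found_field = re.search("friend", lazy)
shouted = lazy.upper()                  # str methods also use the storage

print(explicit)                  # Hello John.
print(found_name)                # None
print(found_field is not None)   # True
print(shouted)                   # HELLO {FRIEND}.
```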

On Thu, Aug 27, 2015 at 12:13 PM, Andrew Barnert <abarnert@yahoo.com> wrote:
I think subclassing would be enough -- the raw string should be the default rendered string (not the template), and any code that wants to do something *different* will have to extract the template and the list of values (and whatever else is extracted) from other attributes.

(Note that it shouldn't be *necessary* to store anything besides the template and the list of values, since the rest of the info can be recovered by parsing the template. But it might be *convenient* to store some other things, like the actual text of the expression that produced each value, and the format spec (or perhaps integers into the template that let you find these) -- the details would be up to the implementer and a matter of QoI.)

--
--Guido van Rossum (python.org/~guido)

On 08/27/2015 11:27 AM, Guido van Rossum wrote:
Yes, the demo does not currently handle arbitrary expressions, only .format() is implemented. The PEP is relying on much of the PEP 498 implementation (which I don't have at hand), so that part is underspecified. I hope to reconcile the details if/when the larger design is chosen. For now, I will add the ability to pass a context dictionary at init as well as keywords to prepare for further implementation.
Yes, no explicit call is necessary. When you do things like print(e'') or e''.upper() or '' + e'', you'll get the rendered string without str(). That could be more clear, yes, and maybe not substantially different than i''.
A solution might be to make estr a subclass of str, but nothing in the PEP suggests that
I originally subclassed it from str, but others on the list advised against it. Which is preferred? Under what name is the string in a string object actually held (if any)? -Mike

I've addressed those questions, updated the demo script, and added your examples in a separate examples section. I'll write more after lunch. There is the question of how much detail to copy from PEP 498, I wonder if it will change any further? -Mike On 08/27/2015 11:27 AM, Guido van Rossum wrote:

On Thu, Aug 27, 2015 at 1:38 PM, Mike Miller <python-ideas@mgmiller.net> wrote:
I've addressed those questions, updated the demo script, and added your examples in a separate examples section. I'll write more after lunch.
Looking through your latest commit ( https://bitbucket.org/mixmastamyk/docs/commits/760274613d8c306cd688385fbbcf2...) I think you've painted yourself into an impossible corner, and haven't thought through the consequences in all cases enough. Apparently my Socratic questions didn't help enough, so I feel compelled to give you the answers. :-)

The interpreter can't and shouldn't be passing in the values of all the variables involved in an expression. It should only be passing in the final evaluated result of each "slot" in the e-string. For my second example (with three rand() calls) it should pass the three different random values returned by the three rand() calls into the estr() constructor, e.g.:

    b = estr("Three random numbers: {rand()}, {rand()}, {rand()}.", rand(), rand(), rand())

Your entire formatting machinery should be rewritten using positional values instead of keyword args.

Regarding subclassing str, there are indeed many problems with that (e.g. what is the type of str(...)+estr(...)), but without it, you will never be able to claim that str() calls are never needed, because quite a few built-in operations and stdlib modules in Python *require* that their arguments are str subclasses (or they treat str subclasses differently than other classes). An important example is the re module.
There is the question of how much detail to copy from PEP 498, I wonder if it will change any further?
Undoubtedly PEP 498 will evolve. You're better off not depending on it directly (despite being an alternative or variant). -- --Guido van Rossum (python.org/~guido)
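A minimal sketch of the shape suggested in this message: a str subclass whose string storage is the rendered text, built from the template plus one positional value per slot, with the template and values kept as attributes. It is not the code from the demo script. It leans on string.Formatter.parse() to split the template, and the attribute names template and values are assumptions.

```python
import re
import string


class estr(str):
    """Sketch: str subclass rendered eagerly from positional slot values."""

    def __new__(cls, template, *values):
        formatter = string.Formatter()
        remaining = iter(values)
        parts = []
        # parse() yields (literal_text, field_name, format_spec, conversion);
        # field_name is None for trailing literal text with no slot.
        for literal, field, spec, conversion in formatter.parse(template):
            parts.append(literal)
            if field is not None:
                value = formatter.convert_field(next(remaining), conversion)
                parts.append(format(value, spec))
        self = super().__new__(cls, "".join(parts))
        self.template = template   # kept for consumers that re-render
        self.values = values
        return self


a = estr("Sliced {n} onions in {t1-t0:.3f} seconds.", 5, 1.23456)
b = estr("Three random numbers: {rand()}, {rand()}, {rand()}.", 1, 2, 3)

print(a)                                     # Sliced 5 onions in 1.235 seconds.
print(b)                                     # Three random numbers: 1, 2, 3.
print(re.search(r"\d+ onions", a).group())   # 5 onions
```

Because the rendered text is the real string storage, isinstance(a, str) holds and the re module works without an explicit str() call, while a.template and a.values stay available for an alternative rendering.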

Hi, I was able to get this done tonight. There are still simplifications to be done, perhaps keeping the string fragments in one piece? Also, a lot of things need to be passed to the constructor to avoid parsing the template twice. Positional arguments are working however, and we're back to inheriting from str. It is now rendering into the "real" string, and everything seems to work without the magic methods. ;)

Thanks,
-Mike

On 08/27/2015 02:02 PM, Guido van Rossum wrote:

Thanks, this looks much better, if we ever want to go in this direction. (Though I think you may want to separate the field names into two parts, the expression text and the format spec.) Can you work with the team at peps@python.org to get a PEP number for this? On Fri, Aug 28, 2015 at 1:34 AM, Mike Miller <python-ideas@mgmiller.net> wrote:
-- --Guido van Rossum (python.org/~guido)

On 08/28/2015 08:11 AM, Guido van Rossum wrote:
It does currently separate the expression texts from the format specs internally, it felt excessive to pass them in separately in the constructor, as I'm sort of complaining about below. But, the end-developer won't see this typically, so it could be done before or after I suppose. Anyone have a preference?
Can you work with the team at peps@python.org <mailto:peps@python.org> to get a PEP number for this?
Yes, nice to have a third alternative to choose from. -Mike

On Wed, Aug 26, 2015 at 11:40 PM, Mike Miller <python-ideas@mgmiller.net> wrote:
With the major design decisions made, behold version 2 of my draft PEP on
string interpolation.
It's now significantly shorter due to removal of most of the i18n related
discussion, pruning, as well as simplification of the prose itself. I don't expect many changes from here on:
https://bitbucket.org/mixmastamyk/docs/src/default/pep/pep-05XX.rst
TL;DR: Here is a summary table and comparisons with my current
understanding of the other proposals, please correct if they are now out of date:
=================
Is the Draft PEP column of the table supposed to have both "Immediate Render" and "Deferred Render" as "Yes"? I'm hoping that's a typo, otherwise I don't understand what it means at all.
The table can be found here and updated via pull-request:
https://bitbucket.org/mixmastamyk/docs/src/default/pep/pep_comparison.rst
Cody

On 08/27/2015 09:19 AM, Cody Piersall wrote:
There is no need to call str() manually with e'', the .rendered member is returned by default: >>> print(estr('Hello {friend}.')) 'Hello John' Here is the example implementation: https://bitbucket.org/mixmastamyk/docs/src/default/pep/estring_demo.py -Mike

On 08/27/2015 11:22 AM, Mike Miller wrote:
The problem with this auto-rendering is that the format_spec and conversion character have to make sense to __format__. For example, you couldn't do this, if value were a string: to_html(e'<p>{value:raw}</p>') Imagine that ":raw" is interpreted by to_html() to mean that the string does not get html escaped. With PEP 501, the format_spec and conversion are opaque to the i-string machinery, and are only interpreted by the custom interpolation function (here, to_html()). Eric.

On 08/27/2015 10:27 AM, Eric V. Smith wrote:
The problem with this auto-rendering is that the format_spec and
Hmm, I believe this is a design choice, one that should be made depending on whether this use-case is important and/or common. The estr provides for this situation instead by allowing for additional renderings if/when needed, but doesn't require str() in the common case. This is the first I've seen of directives passed inside the format spec, I'll add it to the comparison table. -Mike

On Aug 27, 2015, at 10:27, Eric V. Smith <eric@trueblade.com> wrote:
With str.format, it's the type of value that decides how to interpret the format spec. If you leave it up to the consumer of the i-string (the interpolation function) instead of the value's type, how do you handle things like numeric formats and datetime formats and so on? Would I need to do something like this: to_html(e'<p>{str(e"{value:05}"):raw}</p>') Or would to_html (and every other custom interpolator) have to take things like :raw05 and parse out a format spec to pass to value.__format__? Maybe what you really want is !raw rather than :raw. If there is no conversion, __format__ gets called and the result passed around as part of the i-string object; if there is one, the value, conversion, and format spec get passed instead (and then to_html could decide that conversion 'raw' means to call format(value, format_spec) and then not escape the result). Although that's pretty different from how the standard conversions work (call repr or ascii on the value, then format the resulting string with the format spec). So maybe replacement fields need another subpart separate from both the conversion and the format spec that's only used by custom interpolators?

On 08/27/2015 02:30 PM, Andrew Barnert via Python-ideas wrote:
Your interpolator would need to decide. It might never call value.__format__. It might invent some other protocol, like __html_escape__(fmt_spec). Or, it might bake-in knowledge of how to convert whatever types it cares about. Or another good choice would be to use the singledispatch module. In fact, I like singledispatch so much that I'm going to have to use it in an example.
Maybe what you really want is !raw rather than :raw. If there is no conversion, __format__ gets called and the result passed around as part of the i-string object; if there is one, the value, conversion, and format spec get passed instead (and then to_html could decide that conversion 'raw' means to call format(value, format_spec) and then not escape the result).
You could do that in addition or in place of format_spec. Except currently conversions are only allowed to be a single character, but I don't see any reason not to relax that. The take away is that the PEP 501 i-string machinery applies zero significance to format_spec and conversion. It just parses them out of the template string. It's left up to the interpolator to apply some meaning to them.
Although that's pretty different from how the standard conversions work (call repr or ascii on the value, then format the resulting string with the format spec). So maybe replacement fields need another subpart separate from both the conversion and the format spec that's only used by custom interpolators?
There's nothing from stopping you from doing this. You could decide that your format_spec, for some interpolator, is composed of "part1^part2", and do something based on part1 and part2. See: https://bitbucket.org/ericvsmith/istring/src/d92e47c96609eed44ed57b7d3c1932b... for how my i-string str() interpolator applies format_spec and conversion. Another interpolator could do something different (for example, https://bitbucket.org/ericvsmith/istring/src/d92e47c96609eed44ed57b7d3c1932b... for regex escaping). Eric.

On Aug 27, 2015, at 12:08, Eric V. Smith <eric@trueblade.com> wrote:
But even with singledispatch, you have to write formatters for every type that just call the default format; it means you only have N+M functions to write instead of N*M (where N is the number of interpolators and M the number of types to format), but that's still a lot more than just N functions. Also, of course, the fact that you're doing it differently from the usual "just write a __format__ method" means an extra thing for people to learn, and search for. And I think that's functionality people will almost always expect. Whether I'm dealing with a logger, an i18n library, or even a SQL DECIMAL field, I'd expect :3.5 to mean the same thing it does in str.format. So making every project write the identical code to make that true just because a small number of them won't care seems like a bad idea. And similarly, not having a standard way to separate out the interpolator spec (which is obviously unique to every interpolator) and the format spec (which should be the same for almost every interpreter, but is different for each type) seems like it just adds confusion without adding flexibility. Finally, the fact that, by default, handling format specs isn't done at all means it's very easy to design and implement an interpolator that doesn't do it, use it for a while, and only later realize that you need to be able to do the equivalent of :+05 and haven't left any way to do that and now have to find a clumsy way to tack it on.

On 28 August 2015 at 02:55, <random832@fastmail.us> wrote:
While I appreciate that people are still in a design phase with this, I'd like to point out that in the end, people will need to teach and remember this stuff. Currently ! introduces a conversion, which is one of r, s, or a (and is rarely used except for the occasional !r). Whereas : introduces a format spec, which is a mini-language for describing how to format the value and is specific to the type of the value. The "raw" thing above feels like neither of those things. It feels more like a conversion, but if so then conversions are currently single letters, and language-defined. I'd also expect (for that reason) that any conversions would be valid in any type of formatting (str.format, f-strings, e-strings, whatever). I'm not saying you *have* to follow those rules, just that it feels like we're setting up for a huge teachability nightmare (and a feature that no-one will ever use, because they can't remember how[1]) if we don't at least try to adhere to some level of consistency here. Paul [1] I already tend to ignore most of the features of format strings beyond putting in a simple field number or name, because I don't remember the details and would have to look them up. I'm pretty sure I'd never use something like !raw, no matter how it was spelt, for exactly the same reason.

On Fri, Aug 28, 2015, at 03:52, Paul Moore wrote:
Yes, but the format spec mini-language belongs to the type of the value. Depending on what value is, "raw" could _already_ have a meaning.
Well, at the time I posted that I thought we were moving away from things being language-defined, because my mind was still on the "user- defined string prefixes" proposal from a while back. Them currently being single letters isn't really a compelling argument.
And what if the type of value expects to be able to process a format specifier of "raw", rather than it being used by to_html for its own purpose? The advantage of conversion specifiers is that they're currently a closed set. ---- One thing that I don't think *either* version successfully expresses is that while in many cases (including the to_html example) we want a string, that won't always be the case. If we have a syntax for inserting something as, e.g., a SQL parameter, it should be able to accept a double, but I'm not convinced it shouldn't _also_ be able to describe putting the string result of converting the double with ".05d" as a varchar.

On Wed, Aug 26, 2015 at 9:40 PM, Mike Miller <python-ideas@mgmiller.net> wrote:
I'm confused by this proposal. There are many paragraphs about motivation, philosophy, other languages, etc., but the proposal itself seems to be poorly specified. E.g. I couldn't figure out what code should be produced by: a = e"Sliced {n} onions in {t1-t0:.3f} seconds." Generalizing from the only example in the specification, this would become: a = est("Sliced {n} onions in {t1-t0:.3f} seconds", n=n, t1-t0=t1-t0) which is invalid syntax. Similarly, I don't see how e.g. the following could be rendered correctly: a = e"Three random numbers: {rand()}, {rand()}, {rand()}." I also don't understand the claim that no str(estr) is necessary to render the result -- the estr implementation given at https://bitbucket.org/mixmastamyk/docs/src/default/pep/estring_demo.py has a __str__ method that renders the .rendered attribute, but without the str() call the type of 'a' in the above examples would not be str, and various things that operate on strings (e.g. regular expression searches) would not work. A solution might be to make estr a subclass of str, but nothing in the PEP suggests that you have even considered this problem. (The only hint I can find is the comment "more magic-methods to be implemented here, to improve str compatibility" in your demo implementation, but without subclassing str this is not enough.) -- --Guido van Rossum (python.org/~guido)

On Aug 27, 2015, at 11:27, Guido van Rossum <guido@python.org> wrote:
I also don't understand the claim that no str(estr) is necessary to render the result -- the estr implementation given at https://bitbucket.org/mixmastamyk/docs/src/default/pep/estring_demo.py has a __str__ method that renders the .rendered attribute, but without the str() call the type of 'a' in the above examples would not be str, and various things that operate on strings (e.g. regular expression searches) would not work. A solution might be to make estr a subclass of str, but nothing in the PEP suggests that you have even considered this problem. (The only hint I can find is the comment "more magic-methods to be implemented here, to improve str compatibility" in your demo implementation, but without subclassing str this is not enough.)
Even subclassing str doesn't really help, because there's plenty of code (including, I believe, regex searches) that just looks at the raw string storage that gets created at str.__new__ and can never be mutated or replaced later. So, anything that's delayed-rendered is not a str, or at least it's not the right str. (I know someone earlier in the discussion suggested that at the C level you could replace PyUnicode_READY with a function that, if it's an estr, first calls self.__str__ and then initializes the string storage to the result and then does the normal READY stuff, but I don't think that actually works, does it?)

On Thu, Aug 27, 2015 at 12:13 PM, Andrew Barnert <abarnert@yahoo.com> wrote:
I think subclassing would be enough -- the raw string should be the default rendered string (not the template), and any code that wants to do something *different* will have to extract the template and the list of values (and whatever else is extracted) from other attributes. (Note that it shouldn't be *necessary* to store anything besides the template and the list of values, since the rest of the info can be recovered by parsing the template. But might be *convenient* to store some other things, like the actual text of the expression that produced each value, and the format spec (or perhaps integers into the template that let you find these -- the details would be up to the implementer and a matter of QoI). -- --Guido van Rossum (python.org/~guido)

On 08/27/2015 11:27 AM, Guido van Rossum wrote:
Yes, the demo does not currently handle arbitrary expressions, only .format() is implemented. The PEP is relying on much of the PEP 498 implementation (which I don't have at hand), so that part is underspecified. I hope to reconcile the details if/when the larger design is chosen. For now, I will add the ability to pass a context dictionary at init as well as keywords to prepare for further implementation.
Yes, no explicit call is necessary. When you do things like print(e'') or e''.upper() or '' + e'', you'll get the rendered string without str(). That could be more clear, yes, and maybe not substantially different than i''.
A solution might be to make estr a subclass of str, but nothing in the PEP suggests that
I originally subclassed it from str, but others on the list advised against it. Which is preferred? Under what name is the string in a string object actually held (if any)? -Mike

I've addressed those questions, updated the demo script, and added your examples in a separate examples section. I'll write more after lunch. There is the question of how much detail to copy from PEP 498; I wonder if it will change any further? -Mike

On Thu, Aug 27, 2015 at 1:38 PM, Mike Miller <python-ideas@mgmiller.net> wrote:
I've addressed those questions, updated the demo script, and added your examples in a separate examples section. I'll write more after lunch.
Looking through your latest commit ( https://bitbucket.org/mixmastamyk/docs/commits/760274613d8c306cd688385fbbcf2...) I think you've painted yourself into an impossible corner, and haven't thought through the consequences in all cases enough. Apparently my Socratic questions didn't help enough, so I feel compelled to give you the answers. :-)
The interpreter can't and shouldn't be passing in the values of all the variables involved in an expression. It should only be passing in the final evaluated result of each "slot" in the e-string. For my second example (with three rand() calls) it should pass the three different random values returned by the three rand() calls into the estr() constructor, e.g.:
    b = estr("Three random numbers: {rand()}, {rand()}, {rand()}.", rand(), rand(), rand())
Your entire formatting machinery should be rewritten using positional values instead of keyword args.
Regarding subclassing str, there are indeed many problems with that (e.g. what is the type of str(...)+estr(...)), but without it, you will never be able to claim that str() calls are never needed, because quite a few built-in operations and stdlib modules in Python *require* that their arguments are str subclasses (or they treat str subclasses differently from other classes). An important example is the re module.
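A minimal sketch of the positional design described above, assuming a str subclass that renders once in `__new__`. The class name `EStr`, its attribute names, and the deterministic `rand()` stand-in are all hypothetical and are not taken from the demo script or from PEP 498; the interpreter would be the one evaluating each slot and passing the results in order.

```python
import re
from string import Formatter

class EStr(str):
    """Hypothetical e-string: a real str whose storage is the rendered text."""

    def __new__(cls, template, *values):
        formatter = Formatter()
        remaining = iter(values)
        parts, fields = [], []
        for literal, field, spec, conversion in formatter.parse(template):
            parts.append(literal)
            if field is None:  # literal text with no slot after it
                continue
            # Each slot consumes the next already-evaluated positional value.
            value = formatter.convert_field(next(remaining), conversion)
            parts.append(format(value, spec))
            fields.append((field, spec))
        self = super().__new__(cls, "".join(parts))
        self.template = template  # kept for code that wants to re-render
        self.values = values
        self.fields = fields      # (expression text, format spec) per slot
        return self

# Deterministic stand-in for rand(), so each call yields a different value.
_numbers = iter([4, 8, 15])

def rand():
    return next(_numbers)

b = EStr("Three numbers: {rand()}, {rand()}, {rand()}.", rand(), rand(), rand())
first_number = re.search(r"\d+", b).group()  # re accepts it as a str
concatenated = "" + b                        # plain str, no str() call needed
```

Here the rendered text is the string itself, so `re`, `upper()` and concatenation work unchanged, while the template and values stay available as attributes. Concatenation gives back a plain str in this sketch, which is one possible answer to the str(...)+estr(...) question.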
There is the question of how much detail to copy from PEP 498, I wonder if it will change any further?
Undoubtedly PEP 498 will evolve. You're better off not depending on it directly (despite being an alternative or variant). -- --Guido van Rossum (python.org/~guido)

Hi, I was able to get this done tonight. There are still simplifications to be done, perhaps keeping the string fragments in one piece? Also, a lot of things need to be passed to the constructor to avoid parsing the template twice. Positional arguments are working, however, and we're back to inheriting from str. It is now rendering into the "real" string, and everything seems to work without the magic methods. ;) Thanks, -Mike

Thanks, this looks much better, if we ever want to go in this direction. (Though I think you may want to separate the field names into two parts, the expression text and the format spec.) Can you work with the team at peps@python.org to get a PEP number for this?
-- --Guido van Rossum (python.org/~guido)

On 08/28/2015 08:11 AM, Guido van Rossum wrote:
It does currently separate the expression texts from the format specs internally; it felt excessive to pass them in separately in the constructor, as I'm sort of complaining about below. But the end-developer won't typically see this, so it could be done before or after, I suppose. Anyone have a preference?
Can you work with the team at peps@python.org to get a PEP number for this?
Yes, nice to have a third alternative to choose from. -Mike
participants (7)
- Andrew Barnert
- Cody Piersall
- Eric V. Smith
- Guido van Rossum
- Mike Miller
- Paul Moore
- random832@fastmail.us