[Python-ideas] Draft PEP on string interpolation

Wed Aug 26 14:56:51 CEST 2015

On 8/25/2015 10:20 PM, Ron Adam wrote:
> On 08/24/2015 09:42 PM, Eric V. Smith wrote:
>>> On Aug 24, 2015, at 10:23 PM, Ron Adam <ron3200 at gmail.com> wrote:
>>>>
>>>> On 08/24/2015 06:45 PM, Mike Miller wrote:
>>>>>>>> - How problematic will it be that an e-string pins all
>>>>>>>> the interpolated objects in memory for its lifetime?
>>>>>>
>>>>>> It will be an object holding a raw template string, and a
>>>>>> number of variables. In normal usage I don't suspect it to be
>>>>>> a problem.
>>>>
>>>> If an object's __str__ method could have an optional fmt='spec'
>>>> argument, then an estring could just hold strings, and not the
>>>> object references.  That would also prevent surprises if the object
>>>> is mutated between the time its estring is created and when the
>>>> estring is used as a string.  For that matter it prevents an
>>>> estring from printing one way at one time, and another at another
>>>> time.
>>>>
>>>> I don't know if the formatting can be split like this...  Where an
>>>> object is formatted to a string representation, and then that is
>>>> formatted to a field specification.   The latter being things like
>>>> width, fill, right, center, and left.   These are independent of
>>>> the object and belong to the string.  Things like number of
>>>> places and sign, or whether to use leading or trailing zeros, are
>>>> part of the object being converted to a string.
> 
>> It's not possible. For examples, look at all of the number format
>> options. How would you implement hex conversions? Or datetime %A?
> 
> I'm not sure which part you are referring to..  But I think adding an
> optional argument to __str__ methods is probably out.

The part that's not possible is to have the format_spec always be
interpreted on a string object, even if the format_spec refers to a
different type (such as datetime).
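
To make that concrete, here is a small sketch (not taken from the
bitbucket code) of what happens when a type-specific format_spec is
applied after the object has already been turned into a string. The
helper name try_format_as_str is made up for illustration:

```python
import datetime

# The same kind of format_spec is interpreted by the object's own type.
as_int = format(401, '#06x')                        # int understands 'x'
as_day = format(datetime.date(2015, 8, 26), '%A')   # date hands the spec to strftime()

def try_format_as_str(value, spec):
    """Convert to str first, then apply the spec; report the error on failure."""
    try:
        return format(str(value), spec)
    except ValueError as exc:
        return 'ValueError: {}'.format(exc)

# Once the value is a str, type-specific specs are rejected...
hex_on_str = try_format_as_str(401, '#06x')
day_on_str = try_format_as_str(datetime.date(2015, 8, 26), '%A')
# ...and only the generic width/fill/alignment parts still work.
width_on_str = try_format_as_str(401, '>6')
```

Only the last call succeeds, which is why the spec has to reach the
original object rather than its string form.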

> As to splitting the format spec, I think it would be possible, but it
> may not be needed.
> 
> I still think early evaluation is a must here.  The issue I have with
> the late evaluation is shown in your current example of logging.  If the
> time which may be from an actual time() function rather than a fixed
> time is not evaluated until the logged list is printed at the end of the
> run, all the times will be set to when it's printed rather than when the
> logged event happened.

There are two things being evaluated: the expressions (the things inside
the {}'s), and the value of the i-string (or whatever it's called here,
I've lost track). The expressions would be evaluated immediately, when
the i-string is created. This is identical to what would happen if,
instead of being in an i-string, the expressions were written in Python
code. The value of the i-string would be evaluated later, such as when
str() or log() or whatever evaluated the contents of the string.

This is what my example on bitbucket does. See i.__init__ for eval(),
where the expressions are evaluated. Then later, i.join() actually
evaluates the content of the string.
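
As a rough illustration of those two steps (this is not the bitbucket
code; the class name Deferred is hypothetical, and conversions such as
!r are ignored for brevity), the split could be sketched like this:

```python
import string
import sys

class Deferred:
    """Hypothetical i-string stand-in: eager expressions, lazy rendering."""

    def __init__(self, template):
        caller = sys._getframe(1)  # CPython-specific way to reach the caller
        self.parts = []            # (literal, has_field, value, format_spec)
        for literal, expr, spec, _conv in string.Formatter().parse(template):
            if expr is None:
                self.parts.append((literal, False, None, ''))
            else:
                # Step 1: the expression is evaluated right now.
                value = eval(expr, caller.f_globals, caller.f_locals)
                self.parts.append((literal, True, value, spec))

    def __str__(self):
        # Step 2: the text is only built when someone asks for it.
        out = []
        for literal, has_field, value, spec in self.parts:
            out.append(literal)
            if has_field:
                out.append(format(value, spec))
        return ''.join(out)

counter = 1
msg = Deferred('counter is {counter:03d}')
counter = 2          # rebinding the name later does not affect msg
rendered = str(msg)  # formatting happens here, using the value captured above
```

Here rendered still reports the value counter had when msg was
created, even though the string itself is built afterwards.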

Note that evaluating the i-string need not result in a string as the
result. See the regex example. The 'i' class needs better support for
this, but it's doable. Adding that is on my list of things to do, once I
have a better API thought out.
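
A sketch of what a non-string result might look like, assuming the
literals and the already-evaluated values are available as two lists
(the function name join_as_regex and that calling convention are
made up here, not the API of the 'i' class):

```python
import re

def join_as_regex(literals, values):
    """Hypothetical renderer: literals are pattern text, values are escaped."""
    pieces = []
    for literal, value in zip(literals, values):
        pieces.append(literal)
        pieces.append(re.escape(str(value)))
    return re.compile(''.join(pieces))

delimiter = '+'
# Corresponds to the template r'{delimiter}\d+{delimiter}': each literal
# is followed by the value interpolated after it.
regex = join_as_regex(['', r'\d+'], [delimiter, delimiter])
```

The result is a compiled pattern object rather than a str, and the
interpolated '+' is matched literally instead of acting as a
quantifier.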

> Another similar reason is the evaluated expression is sensitive to what
> object is in the name at the time it is evaluated.  If it's evaluated
> later, the object from the name look up may be something entirely
> unexpected because that name may have been reused during each iteration
> of a loop.  So all the logged entries that refer to that name will give
> the last value rather than the value at the time the event was logged.

Sure. Currently:

logging.info('the time is %s', datetime.datetime.now())

This evaluates the current time immediately, but builds up the string later.
That's equivalent to what this would do in my bitbucket log.py example:

msg = i("the time is {datetime.datetime.now()}")
log.log(msg)
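
The existing logging behaviour can be observed directly with the
standard library. In this sketch the handler class Capture and the
logger name are made up for the demonstration; the LogRecord
attributes used (msg, args, getMessage()) are the standard ones:

```python
import datetime
import logging

class Capture(logging.Handler):
    """Keep the raw records so their parts can be inspected afterwards."""
    def __init__(self):
        super().__init__()
        self.records = []

    def emit(self, record):
        self.records.append(record)

logger = logging.getLogger('istring_demo')
logger.setLevel(logging.INFO)
logger.propagate = False
handler = Capture()
logger.addHandler(handler)

# A fixed timestamp stands in for datetime.datetime.now().
stamp = datetime.datetime(2015, 8, 26, 14, 56, 51)
logger.info('the time is %s', stamp)

record = handler.records[0]
template = record.msg         # still the unformatted template
captured = record.args        # the argument, evaluated at call time
text = record.getMessage()    # the string is only built here
```

The record carries the template and the evaluated argument
separately, and the final text is produced only when getMessage()
is called.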

Also, see test_i in simple.py, again on bitbucket. It shows that
changing the values after an i-string is created has no effect on the
contents of the i-string. This would be different if the values were
mutable, of course. I'll add a test for that to show what I mean.
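
Roughly what I have in mind, as a stand-alone sketch rather than the
actual test (the class Snapshot is hypothetical and takes its values
explicitly instead of evaluating expressions):

```python
class Snapshot:
    """Hypothetical holder: keeps references now, formats them later."""
    def __init__(self, template, *values):
        self.template = template
        self.values = values

    def __str__(self):
        return self.template.format(*self.values)

name = 'Eric'
tags = ['a']
msg = Snapshot('{} {}', name, tags)

name = 'Ron'        # rebinding: msg still refers to the old string
tags.append('b')    # mutation: msg sees the change, it holds the same list

rendered = str(msg)
```

Rebinding name has no effect, but the appended item shows up because
the list object itself is shared.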

I think your example below is a functional subset of what I have on
bitbucket. The only real distinction is that I can do substitutions from
a different string, using the expressions that were originally evaluated
when the i-string was constructed. This is needed for the i18n case. I
realize i18n might never use this, but it's a useful thought experiment
in any case.
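
A minimal sketch of that idea, assuming the values are keyed by the
text of the expression that produced them (the function names capture
and render, the explicit namespace argument, and the sample strings
are all made up for illustration):

```python
import string

def capture(template, namespace):
    """Evaluate each expression in template once; map its text to its value."""
    values = {}
    for _literal, expr, _spec, _conv in string.Formatter().parse(template):
        if expr is not None:
            values[expr] = eval(expr, {}, namespace)
    return values

def render(template, values):
    """Fill any template that uses the same expressions, without re-evaluating."""
    out = []
    for literal, expr, spec, _conv in string.Formatter().parse(template):
        out.append(literal)
        if expr is not None:
            out.append(format(values[expr], spec))
    return ''.join(out)

namespace = {'name': 'Eric', 'count': 3}
original = 'Hello {name}, you have {count:d} messages'
translated = 'Tienes {count:d} mensajes, {name}'   # e.g. from a message catalog

values = capture(original, namespace)
namespace['count'] = 99    # later changes are not picked up

english = render(original, values)
spanish = render(translated, values)
```

The translated string can reorder the fields freely, since it only
looks up values that were captured when the original was evaluated.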

Eric.

> Here's a slightly reworked version to compare to.
> 
> Hope this is helpful,
>   Ron
> 
> 
> 
> import sys
> import _string
> 
> def interleave(*iters):
>     result = []
>     for items in zip(*iters):
>         for item in items:
>             result.append(item)
>     return result
> 
> 
> # i-string
> class i:
>     def __init__(self, s):
>         self.s = s
>         locals = sys._getframe(1).f_locals
>         globals = sys._getframe(1).f_globals
>         self.literals = []
>         self.values = []
>         # Evaluate the expressions now, and remember them.
>         # This freezes the value at execution time.
>         for literal, expr, format_spec, conversion in \
>                 _string.formatter_parser(self.s):
>             self.literals.append(literal)
>             if expr:
>                 # eval() takes globals first, then locals.
>                 value = eval(expr, globals, locals)
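>                 # Apply an explicit !r, !s or !a conversion before formatting.
>                 if conversion:
>                     value = {'r': repr, 's': str, 'a': ascii}[conversion](value)
>                 self.values.append(format(value, format_spec))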
>             else:
>                 self.values.append('')
> 
>     def __str__(self):
>         return ''.join(interleave(self.literals, self.values))
> 
> 
> 
> # f-string
> def f(s):
>     return str(i(s))
> 
> 
> # logging
> def log(istring, echo=True):
>     logged = 'log:' + str(istring)
>     # Only print when echo is requested.
>     if echo:
>         print(logged)
>     return logged
> 
> 
> 
> # test
> 
> if __name__ == '__main__':
> 
>     x = i('Version in caps {sys.version.upper()!r}')
>     print(str(x))
> 
> 
>     name = 'Eric'
>     dog = 'Fido'
>     s = f('My name is {name}, my dog is {dog}')
>     print(repr(s))
>     assert repr(s) == "'My name is Eric, my dog is Fido'"
>     assert type(s) == str
> 
> 
>     import datetime
>     def func(value):
>         return i('called func with "{value:10}"')
> 
>     logline = 'as of {now:%Y-%m-%d} the value is {400+1:#06x}'
>     now = datetime.datetime(2015, 8, 10, 12, 13, 15)
>     logged = log(i(logline), echo=True)
>     assert logged == "log:as of 2015-08-10 the value is 0x0191"
> 
>     now = datetime.datetime(2015, 8, 11, 12, 13, 15)
>     logged = log(i(logline), echo=True)
>     assert logged == "log:as of 2015-08-11 the value is 0x0191"
> 
>     logged = log(i('{func(42)}'))
>     assert logged == 'log:called func with "        42"'
> 
> 
>     import re
>     delimiter = '+'
>     trailing_re = re.escape(r'\S+')
>     regex = i(r'{delimiter}\d+{delimiter}{trailing_re}')
>     print(regex)
>     assert str(regex) == r"+\d++\\S\+"