[Python-Dev] proposal: evaluated string

tomer filiba tomerfiliba at gmail.com
Thu Apr 20 17:20:01 CEST 2006


many times, templating a string is a tedious task. using the % operator,
either with tuples or dicts, is difficult to maintain when the number of
templated arguments is large. and string.Template, although easier to read,
is less intuitive and more cumbersome:

import string
t = string.Template("hello $name")
print t.substitute({"name" : "john"})

as you can see, it is redundant, as you must repeat the dict keys in two
places. imagine maintaining such a template with 20 parameters! if you change
one argument's name in the template, you must go and fix all the locations
that use that template. not nice at all.
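
to make the maintenance problem concrete, here is a small sketch with three
parameters. the template text and the values are made up for illustration,
and it is written so it also runs on current python 3:

```python
import string

# hypothetical template: every name appears once in the template text and
# once more as a key in the mapping passed to substitute()
t = string.Template("dear $title $name, your order #$order_id has shipped")
values = {"title": "mr", "name": "john", "order_id": 42}
message = t.substitute(values)
print(message)

# renaming $name to $customer in the template, without touching `values`,
# is only detected at run time, as a KeyError
renamed = string.Template("dear $title $customer, your order #$order_id has shipped")
try:
    renamed.substitute(values)
except KeyError as exc:
    missing = exc.args[0]
    print("missing key: %s" % missing)
```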

i'm suggesting something like boo's string interpolation:
http://boo.codehaus.org/String+Interpolation
but i chose to call it "evaluated string".

like raw strings (r""), which are basically syntactic sugar, evaluated
strings will be marked by an 'e' prefix, for instance e"", which may be
combined with the 'r' or 'u' prefixes that exist today.

the evaluated string will be evaluated based on the current scope (locals
and globals), just like normal expressions. the difference is that the
results of the expressions will be str()ed into the evaluated string
directly. these expressions will be embedded into the string surrounded by
special delimiters, and an unclosed delimiter or a syntax error will be
reported just like "\x??" raises "ValueError: invalid \x escape".

i'm not sure which delimiters to use, but i think plain { } is sufficient
(no need for ${ } like in boo)

some examples:
===============
name = "john"
print e"hello {name}"

a = 3
b = 7
print e"the function is y = {a}x + {b}"
for x in range(10):
    print e"y({x}) = {a*x+b}"

import time, sys
print e"the time is {time.asctime()} and you are running on {sys.platform}"
===============

in order to implement it, i suggest a new type, estr. doing
a = e"hello"
will be equivalent to
a = estr("hello", locals(), globals()),
just like
u"hello"
is equivalent to
unicode("hello")
(if we ignore \u escaping for a moment)

and just like unicode literals introduce the \u escape, estr literals would
introduce \{ and \} to escape delimiters.
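
a rough sketch of how a parser could honor these escapes. the function name
split_template is hypothetical, and treating an escaped brace inside an
expression as a literal brace is my assumption, since the rule above only
talks about escaping delimiters:

```python
def split_template(raw):
    # split raw into (is_expression, text) pairs; a backslash followed by
    # a brace yields a literal brace instead of opening/closing an expression
    elements = []
    static = []   # characters of the static part being collected
    expr = []     # characters of the expression being collected
    nesting = 0
    i = 0
    while i < len(raw):
        ch = raw[i]
        if ch == "\\" and raw[i + 1 : i + 2] in ("{", "}"):
            # escaped delimiter: keep the brace, drop the backslash
            (expr if nesting else static).append(raw[i + 1])
            i += 2
            continue
        if ch == "{":
            if nesting == 0:
                if static:
                    elements.append((False, "".join(static)))
                    static = []
            else:
                expr.append(ch)   # nested brace, e.g. a dict literal
            nesting += 1
        elif ch == "}":
            if nesting == 0:
                raise ValueError("too many '}' (at index %d)" % i)
            nesting -= 1
            if nesting == 0:
                elements.append((True, "".join(expr)))
                expr = []
            else:
                expr.append(ch)
        else:
            (expr if nesting else static).append(ch)
        i += 1
    if nesting > 0:
        raise ValueError("missing '}' before end")
    if static:
        elements.append((False, "".join(static)))
    return elements


parts = split_template("a \\{b\\} {name}")
print(parts)
```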

the estr object will be evaluated with the given locals() and globals() only
at __repr__ or __str__, which means you can work with it like a normal
string:

a = e"hello {name} "
b = e"how nice to meet you at this lovely day of {time.localtime().tm_year}"
c = a + b
c is just the concatenation of the two strings, and it will be evaluated as
a whole when you str()/repr() it. of course the internal representation of
the object shouldn't be a string, but rather a sequence of static (non
evaluated) and dynamic (need evaluation) parts, i.e.:
["hello", "name", "how nice to meet you at this lovely day of",
"time.localtime().tm_year"],
so evaluating the string will be fast (just calling eval() on the relevant
parts)
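
a minimal sketch of that representation, assuming the parts were already
split by hand. compiling each dynamic part once up front is my addition (it
is not part of the proposal above), and the render helper and the explicit
namespace dict are hypothetical names used only for this illustration:

```python
import time

# parts as (is_expression, text) pairs, split by hand for this sketch
parts = [
    (False, "hello "),
    (True, "name"),
    (False, ", how nice to meet you at this lovely day of "),
    (True, "time.localtime().tm_year"),
]

# compile each dynamic part once, in "eval" mode; static parts stay strings
compiled = [
    (dynamic, compile(text, "<estr>", "eval") if dynamic else text)
    for dynamic, text in parts
]


def render(compiled_parts, namespace):
    # evaluation happens here, not when the parts were created
    out = []
    for dynamic, item in compiled_parts:
        out.append(str(eval(item, namespace)) if dynamic else item)
    return "".join(out)


first = render(compiled, {"name": "john", "time": time})
second = render(compiled, {"name": "mary", "time": time})
print(first)
print(second)
```

the same parts give different results depending on the namespace at the
time of rendering, which is the point of delaying evaluation until str().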

also, estr objects will not support getitem/slicing/startswith, as it's not
clear what the indexes are... you'd have to first evaluate it and then work
with the string:
str(e"hello")[2:]

estr will have a counterpart type called eunicode. some rules:
estr + str => estr
estr + estr => estr
estr + unicode => eunicode
estr + eunicode => eunicode
eunicode + eunicode => eunicode

there are no backwards compatibility issues, as e"" is invalid syntax today.
as for clarity, i'm sure editors like emacs can be configured to highlight
the parts enclosed by {} like normal expressions.

i know it may cause the perl syndrome, where all the code of the program is
pushed into strings, but templating/string interpolation is really a
fundamental requirement of scripting languages, and the perl syndrome can be
prevented with two precautions:
* compile the code in "eval" mode instead of "exec". this would prevent
abominations like
e"{import time\ndef f(a):\n\tprint 'blah'}"
* do not allow the % operator to work on estr's, to avoid awful things like
e"how are %s {%s}" % ("you", "name")
one templating mechanism at a time, please :)
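
the first precaution can be checked with the builtin compile(). this is
only a sketch (check_expression is a hypothetical helper name): "eval" mode
accepts a single expression and raises SyntaxError for statements. it rules
out statements only; expressions with side effects still pass.

```python
def check_expression(source):
    # returns a code object for an expression, raises SyntaxError otherwise
    return compile(source, "<estr>", "eval")


# a plain expression compiles and evaluates as usual
code = check_expression("a * x + b")
value = eval(code, {"a": 3, "x": 2, "b": 7})
print(value)

# statements are refused at compile time
rejected = []
for bad in ("import time", "x = 1", "def f(a):\n\treturn a"):
    try:
        check_expression(bad)
    except SyntaxError:
        rejected.append(bad)
print(len(rejected))
```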

perhaps there are other restrictions to impose, but i couldn't think of any
at the moment.

here's a proposed implementation:

class estr(object): # can't derive from basestring!
    def __init__(self, raw, locals, globals):
        self.elements = self._parse(raw)
        self.locals = locals
        self.globals = globals

    def _parse(self, raw):
        i = 0
        last_index = 0
        nesting = 0
        elements = []
        while i < len(raw):
            if raw[i] == "{":
                if nesting == 0:
                    elements.append((False, raw[last_index : i]))
                    last_index = i + 1
                nesting += 1
            elif raw[i] == "}":
                nesting -= 1
                if nesting == 0:
                    elements.append((True, raw[last_index : i]))
                    last_index = i + 1
            if nesting < 0:
                raise ValueError("too many '}' (at index %d)" % (i,))
            i += 1
        if nesting > 0:
            raise ValueError("missing '}' before end")
        if last_index < i:
            elements.append((False, raw[last_index : i]))
        return elements

    def __add__(self, obj):
        if isinstance(obj, estr):
            elements = self.elements + obj.elements
        else:
            elements = self.elements + [(False, obj)]
        # the new  object inherits the current one's namespace (?)
        newobj = estr("", self.locals, self.globals)
        newobj.elements = elements
        return newobj

    def __mul__(self, count):
        newobj = estr("", self.locals, self.globals)
        newobj.elements = self.elements * count
        return newobj

    def __repr__(self):
        return repr(self.__str__())

    def __str__(self):
        result = []
        for dynamic, elem in self.elements:
            if dynamic:
                # eval() takes globals first, then locals
                result.append(str(eval(elem, self.globals, self.locals)))
            else:
                result.append(str(elem))
        return "".join(result)

myname = "tinkie winkie"
yourname = "la la"
print estr("{myname}", locals(), globals())
print estr("hello {myname}", locals(), globals())
print estr("hello {yourname}, my name is {myname}", locals(), globals())

a = 3
b = 7
print estr("the function is y = {a}x + {b}", locals(), globals())
for x in range(10):
    print estr("y({x}) = {a*x+b}", locals(), globals())

a = estr("hello {myname}", locals(), globals())
b = estr("my name is {myname} ", locals(), globals())
c = a + ", " + (b * 2)
print c
======

the difference between this prototype and the real thing is that when
__str__ is called, it will figure out the locals() and globals() by itself,
using the stack frame or whatever.
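
one possible way to do the stack frame lookup, as a sketch only.
FrameTemplate is a hypothetical stand-in for estr with its parts already
split, and sys._getframe is a cpython implementation detail, so this is an
assumption about how it could work rather than a settled design:

```python
import sys


class FrameTemplate(object):
    # hypothetical stand-in for estr; parts are (is_expression, text) pairs
    def __init__(self, parts):
        self.parts = parts

    def __str__(self):
        # frame 0 is __str__ itself, frame 1 is the python code that
        # triggered str(); its namespaces are used for evaluation
        caller = sys._getframe(1)
        out = []
        for dynamic, text in self.parts:
            if dynamic:
                out.append(str(eval(text, caller.f_globals, caller.f_locals)))
            else:
                out.append(text)
        return "".join(out)


greeting = FrameTemplate([(False, "hello "), (True, "name")])


def greet_john():
    name = "john"   # picked up through the caller's frame
    return str(greeting)


def greet_mary():
    name = "mary"
    return str(greeting)


print(greet_john())
print(greet_mary())
```

note that with this approach the result depends on who calls str(), not on
where the string was created.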



-tomer