[Python-Dev] Yet another string formatting proposal
Oren Tirosh
oren-py-d@hishome.net
Thu, 21 Nov 2002 20:58:42 +0200
"\(a) + \(b) = \(a+b)\n"
The expressions embedded in the string are parsed at compile time and
any syntax errors in them are detected during compilation.
The use of the backslash as introducer makes it unnecessary to add a new
magic character ("$"), a new escaping convention for when that character
must appear literally in the string ("$$"), and a new string prefix
(PEP 215) or method (PEP 292) to tell the system to perform additional
processing on the string.
One advantage of using an operator, method or function over in-line
formatting is that it enables the use of a template. A new string method
can provide run-time evaluation of the same format:
"\(a) + \(b) = \(a+b)\n"
r"\(a) + \(b) = \(a+b)\n".cook()
A raw string is used to defer the evaluation of all backslash escape
sequences to some later time. The cook method evaluates backslash
escapes in the string, including any embedded expressions. This runtime
version may be used for internationalization, for example.
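
To make the run-time behaviour concrete, here is a rough sketch of what
cook() might do, written as a plain function in today's Python. The name
cook_sketch, the explicit namespace argument and the small escape table
are assumptions made for illustration only; they are not part of the
proposal. Parentheses are counted naively, so an expression containing a
parenthesis inside a string literal would confuse it.

```python
# Hypothetical stand-in for the proposed str.cook() method.
_SIMPLE_ESCAPES = {"n": "\n", "t": "\t", "\\": "\\", '"': '"', "'": "'"}


def cook_sketch(template, namespace):
    # Expand backslash escapes and embedded expressions in a raw template.
    out = []
    i = 0
    while i < len(template):
        ch = template[i]
        if ch != "\\" or i + 1 == len(template):
            out.append(ch)  # ordinary character, or a trailing backslash
            i += 1
            continue
        nxt = template[i + 1]
        if nxt == "(":
            # Find the matching close parenthesis by counting depth.
            depth = 1
            j = i + 2
            while j < len(template) and depth:
                if template[j] == "(":
                    depth += 1
                elif template[j] == ")":
                    depth -= 1
                j += 1
            if depth:
                raise ValueError("unbalanced \\( in template")
            expr = template[i + 2 : j - 1]
            # eval() is as dangerous here as anywhere else.
            out.append(str(eval(expr, dict(namespace))))
            i = j
        elif nxt in _SIMPLE_ESCAPES:
            out.append(_SIMPLE_ESCAPES[nxt])
            i += 2
        else:
            out.append(ch + nxt)  # unknown escape: keep it as written
            i += 2
    return "".join(out)


values = {"a": 2, "b": 3}
line = cook_sketch(r"\(a) + \(b) = \(a+b)\n", values)
```

Because the template is ordinary data until it is cooked, a translated
template fetched from a message catalog could reorder the fields, which
is what makes the run-time form interesting for internationalization.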
By default, the cook method uses the global and local namespace of the
calling scope, just like the built-in function eval(). Dictionary and/or
named arguments may be used to override the namespace in which embedded
expressions are evaluated:
s = formatstring.cook(a=5, b=6)
s = formatstring.cook(sys._getframe().f_locals)
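
The namespace rules above could be sketched roughly as follows. This is
an illustration only: cook_with_namespace is a hypothetical free function
rather than the proposed method, the regular expression handles only
expressions without nested parentheses, and it assumes that a supplied
dictionary or named arguments replace the calling scope entirely rather
than being layered on top of it.

```python
import re
import sys

# Matches a backslash, "(", an expression without parentheses, then ")".
_EMBEDDED = re.compile(r"\\\(([^()]*)\)")


def cook_with_namespace(template, mapping=None, **names):
    if mapping is None and not names:
        # Default: globals and locals of the calling scope, like eval().
        caller = sys._getframe(1)
        namespace = dict(caller.f_globals)
        namespace.update(caller.f_locals)
    else:
        # Assumption: explicit arguments replace the caller's namespace.
        namespace = dict(mapping or {})
        namespace.update(names)
    return _EMBEDDED.sub(lambda m: str(eval(m.group(1), namespace)), template)


def demo():
    a, b = 1, 2
    # No arguments: a and b are found in demo()'s local variables.
    return cook_with_namespace(r"\(a) + \(b) = \(a+b)")
```

Calling cook_with_namespace(r"\(a) + \(b) = \(a+b)", a=5, b=6) then
evaluates the same template against the named arguments instead.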
Security issues:
Compile-time expression embedding should not have any special security
concerns since there is no parsing of data from untrusted sources (if
your SOURCE CODE is not trusted I can't help you there).
In order to provide protection against evaluation of arbitrary code when
an attacker has access to the format strings, the cook() method could be
limited to variable names only. A separate cook_eval() method would
support full expressions. The 'eval' in the method name should remind
the programmer that it is potentially as dangerous as eval().
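
A minimal sketch of that split, assuming both are written as free
functions taking an explicit namespace (the names cook_names_only and
cook_eval_sketch, and the choice to raise ValueError, are mine and not
part of the proposal). The pattern only recognizes embedded text without
nested parentheses.

```python
import re

# Matches a backslash, "(", text without parentheses, then ")".
_EMBEDDED = re.compile(r"\\\(([^()]*)\)")


def cook_names_only(template, namespace):
    # Substitute plain variable names; refuse anything else.
    def replace(match):
        name = match.group(1)
        if not name.isidentifier():
            raise ValueError("only variable names are allowed: %r" % name)
        return str(namespace[name])  # a lookup, never an evaluation

    return _EMBEDDED.sub(replace, template)


def cook_eval_sketch(template, namespace):
    # Full expressions: potentially as dangerous as eval() itself.
    def replace(match):
        return str(eval(match.group(1), dict(namespace)))

    return _EMBEDDED.sub(replace, template)


ns = {"user": "Oren", "a": 2, "b": 3}
greeting = cook_names_only(r"Hello \(user)", ns)
total = cook_eval_sketch(r"\(a+b)", ns)
```

With this split, a template such as r"\(a+b)" coming from an untrusted
source is rejected by the names-only version instead of being evaluated.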
Drawbacks:
Must use the full format "I like \(traffic) lights". There is no option
for the shorter version "I like \traffic lights" because a backslash
followed by a letter is already taken by existing escapes ("\t" is a
tab, for example). May be considered an advantage:
"There should be one-- and preferably only one --obvious way to do it."
Not as familiar as $ for programmers from other languages. May also be
considered an advantage :-)
Oren