PEP 215 redux: toward a simplified consensus?
I'm still woefully behind on my email since returning from vacation, but I thought I'd rehash a bit on PEP 215, string interpolation, given some recent hacking and thinking about stuff we talked about at IPC10.

Background: PEP 215 has some interesting ideas, but IMHO is more than I'm comfortable with. At IPC10, Guido described his rules for string interpolation as they would be if his time machine were more powerful. These follow some discussions we've had during various Zope sprints about making the rules simpler for non-programmers to understand. I've also been struggling with how error prone %(var)s substitutions can be in the thru-the-web Mailman strings where this is supported. Here's what I've come up with.

Guido's rules for $-substitutions are really simple:

1. $$ substitutes to just a single $

2. $identifier followed by non-identifier characters gets interpolated with the value of the 'identifier' key in the substitution dictionary.

3. For handling cases where the identifier is followed by identifier characters that aren't part of the key, ${identifier} is equivalent to $identifier.

And that's it. For the sake of discussion, forget about where the dictionary for string interpolation comes from.

I've hacked together 4 functions which I'm experimentally using to provide these rules in thru-the-web string editing, and also for sanity checking the strings as they're submitted. I think there's a fairly straightforward conversion between traditional %-strings and these newfangled $-strings, and so two of the functions do the conversions back and forth.

The second two functions attempt to return a list of all the substitution variables found in either a %-string or a $-string. I match this against the list of known legal substitution variables, and bark loudly if there's some mismatch.

The one interesting thing about %-to-$ conversion is that the regexp I use leaves the trailing `s' in %(var)s as optional, so I can auto-correct for those that are missing. I think this was an idea that Paul Dubois came up with during the lunch discussion. Seems to work well, and I can do a %-to-$-to-% roundtrip; if the strings at the ends are the same then there weren't any missing `s's, otherwise the conversion auto-corrected and I can issue a warning.

This is all really proto-stuff, but I've done some limited testing and it seems to work pretty well. So without changing the language we can play with $-strings using Guido's rules to see if we like them or not, by simply converting them to traditional %-strings manually, and then doing the mod-operator substitutions.

Hopefully I've extracted the right bits of code from my modules for you to get the idea. There may be bugs <wink>.

-Barry

-------------------- snip snip --------------------
import re
from string import digits

try:
    # Python 2.2
    from string import ascii_letters
except ImportError:
    # Older Pythons
    _lower = 'abcdefghijklmnopqrstuvwxyz'
    ascii_letters = _lower + _lower.upper()

# Search for %(identifier)s strings, except that the trailing s is optional,
# since that's a common mistake
cre = re.compile(r'%\(([_a-z]\w*?)\)s?', re.IGNORECASE)
# Search for $$, $identifier, or ${identifier}
dre = re.compile(r'(\${2})|\$([_a-z]\w*)|\${([_a-z]\w*)}', re.IGNORECASE)

IDENTCHARS = ascii_letters + digits + '_'
EMPTYSTRING = ''

# Utilities to convert from simplified $identifier substitutions to/from
# standard Python %(identifier)s substitutions.  The "Guido rules" for the
# former are:
#    $$ -> $
#    $identifier -> %(identifier)s
#    ${identifier} -> %(identifier)s

def to_dollar(s):
    """Convert from %-strings to $-strings."""
    s = s.replace('$', '$$')
    parts = cre.split(s)
    for i in range(1, len(parts), 2):
        if parts[i+1] and parts[i+1][0] in IDENTCHARS:
            parts[i] = '${' + parts[i] + '}'
        else:
            parts[i] = '$' + parts[i]
    return EMPTYSTRING.join(parts)

def to_percent(s):
    """Convert from $-strings to %-strings."""
    s = s.replace('%', '%%')
    parts = dre.split(s)
    for i in range(1, len(parts), 4):
        if parts[i] is not None:
            parts[i] = '$'
        elif parts[i+1] is not None:
            parts[i+1] = '%(' + parts[i+1] + ')s'
        else:
            parts[i+2] = '%(' + parts[i+2] + ')s'
    return EMPTYSTRING.join(filter(None, parts))

def dollar_identifiers(s):
    """Return the set (dictionary) of identifiers found in a $-string."""
    d = {}
    for name in filter(None, [b or c or None for a, b, c in dre.findall(s)]):
        d[name] = 1
    return d

def percent_identifiers(s):
    """Return the set (dictionary) of identifiers found in a %-string."""
    d = {}
    for name in cre.findall(s):
        d[name] = 1
    return d
-------------------- snip snip --------------------

Python 2.2 (#1, Dec 24 2001, 15:39:01)
[GCC egcs-2.91.66 19990314/Linux (egcs-1.1.2 release)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import dollar
>>> dollar.to_dollar('%(one)s %(two)three %(four)seven')
'$one ${two}three ${four}even'
>>> dollar.to_percent(dollar.to_dollar('%(one)s %(two)three %(four)seven'))
'%(one)s %(two)sthree %(four)seven'
>>> dollar.percent_identifiers('%(one)s %(two)three %(four)seven')
{'four': 1, 'two': 1, 'one': 1}
>>> dollar.dollar_identifiers(dollar.to_dollar('%(one)s %(two)three %(four)seven'))
{'four': 1, 'two': 1, 'one': 1}
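For readers who want to try the three rules without the %-string detour, here is a minimal sketch in present-day Python. It is not taken from the posted module: the names `interpolate` and `_DOLLAR` are made up for illustration, and the choice to leave a lone `$` untouched and to raise `KeyError` for an unknown identifier is an assumption, since the rules above do not say what should happen in those cases.

```python
import re

# Hypothetical helper, not part of the posted module: applies the three
# rules directly instead of converting to a %-string first.
#   group 1: the second '$' of '$$'
#   group 2: a bare identifier ($identifier)
#   group 3: a braced identifier (${identifier})
_DOLLAR = re.compile(r'\$(?:(\$)|([_a-z]\w*)|\{([_a-z]\w*)\})', re.IGNORECASE)

def interpolate(template, mapping):
    """Substitute $-placeholders in template from mapping."""
    def replace(match):
        escaped, bare, braced = match.groups()
        if escaped is not None:
            return '$'                      # rule 1: $$ -> $
        return str(mapping[bare or braced])  # rules 2 and 3
    return _DOLLAR.sub(replace, template)

print(interpolate('$who owes $$5 for ${item}s', {'who': 'Joe', 'item': 'book'}))
# prints: Joe owes $5 for books
```

The braces in `${item}s` are what keep the trailing `s` out of the key, which is the case rule 3 exists for.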
"Barry A. Warsaw" wrote:
Background: PEP 215 has some interesting ideas, but IMHO is more than I'm comfortable with. At IPC10, Guido described his rules for string interpolation as they would be if his time machine were more powerful. These follow some discussions we've had during various Zope sprints about making the rules simpler for non-programmers to understand. I've also been struggling with how error prone %(var)s substitutions can be in the thru-the-web Mailman strings where this is supported. Here's what I've come up with.
Guido's rules for $-substitutions are really simple:
1. $$ substitutes to just a single $
2. $identifier followed by non-identifier characters gets interpolated with the value of the 'identifier' key in the substitution dictionary.
3. For handling cases where the identifier is followed by identifier characters that aren't part of the key, ${identifier} is equivalent to $identifier.
And that's it. For the sake of discussion, forget about where the dictionary for string interpolation comes from.
Wouldn't it be a lot simpler and more in line with what we already have, if we'd use '%' as the escape character?

1. %% becomes %
2. %ident maps to %(ident)s as we have it now
3. %{ident} maps to %(ident)s
4. %(ident)s continues to have the same semantics as before

--
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Company & Consulting: http://www.egenix.com/
Python Software: http://www.egenix.com/files/python/
[Barry]
Guido's rules for $-substitutions are really simple:
1. $$ substitutes to just a single $
2. $identifier followed by non-identifier characters gets interpolated with the value of the 'identifier' key in the substitution dictionary.
3. For handling cases where the identifier is followed by identifier characters that aren't part of the key, ${identifier} is equivalent to $identifier.
And that's it. For the sake of discussion, forget about where the dictionary for string interpolation comes from.
[MAL]
Wouldn't it be a lot simpler and more in line with what we already have, if we'd use '%' as the escape character?
1. %% becomes %
2. %ident maps to %(ident)s as we have it now
3. %{ident} maps to %(ident)s
4. %(ident)s continues to have the same semantics as before
That's not simpler, it's more complicated. Any tool dealing with these will have to understand all the rules. The point of switching to $ is twofold: (1) it avoids confusion with the old %-based syntax (which can continue to exist for different purposes), (2) it is familiar to people who have seen substitution in other languages. $ is nearly universal (Perl, Tcl, Ruby, shell, etc.) --Guido van Rossum (home page: http://www.python.org/~guido/)
"MAL" == M <mal@lemburg.com> writes:
    MAL> 1. %% becomes %
    MAL> 2. %ident maps to %(ident)s as we have it now
    MAL> 3. %{ident} maps to %(ident)s
    MAL> 4. %(ident)s continues to have the same semantics as
    MAL> before

What happens to %dogfood or %sickpuppy? If you're trying to maintain backwards compatibility with existing syntax, you can't use %ident strings.

-Barry
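The objection can be checked directly against the existing %-operator. In the sketch below, the variable names `dog` and `puppy` and the argument values are only illustrative. With today's rules, `%d` and `%s` are already complete conversion specifiers, so the letters after them are literal text and a bare `%ident` form would change the meaning of strings like these.

```python
# '%d' is a full conversion on its own; 'ogfood' is just trailing text.
dog = '%dogfood' % 5
print(dog)      # prints: 5ogfood

# Likewise '%s' consumes the argument and 'ickpuppy' is literal.
puppy = '%sickpuppy' % 'very '
print(puppy)    # prints: very ickpuppy
```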
"Barry A. Warsaw" wrote:
"MAL" == M <mal@lemburg.com> writes:
MAL> 1. %% becomes %
MAL> 2. %ident maps to %(ident)s as we have it now
MAL> 3. %{ident} maps to %(ident)s
MAL> 4. %(ident)s continues to have the same semantics as before
What happens to %dogfood or %sickpuppy? If you're trying to maintain backwards compatibility with existing syntax, you can't use %ident strings.
That's what I was trying to achieve. The only gripe I sometimes have with '%(ident)s' is that users forget the 's' behind '%(ident)'; I'd be ok with dropping 2. and only adding 3.

Whatever you do, just please don't mix the old and new semantics...

    'Joe has $ %(a)5.2f in his pocket.' % locals()

is perfectly valid now and should continue to be valid.

--
Marc-Andre Lemburg
"MAL" == M <mal@lemburg.com> writes:
    MAL> Whatever you do, just please don't mix the old and new
    MAL> semantics...

    MAL> 'Joe has $ %(a)5.2f in his pocket.' % locals()

    MAL> is perfectly valid now and should continue to be valid.

I agree completely; it ought to be one or the other. In the code I emailed, you actually had to do a conversion step from $-strings to %-strings to use the built-in string-mod operator.

In practice, if $-strings were to be added to the language, I suspect some new prefix would have to designate a new type of string object, e.g. $'' strings. Or perhaps a different binary operator could be used.

-Barry
"Barry A. Warsaw" wrote:
"MAL" == M <mal@lemburg.com> writes:
MAL> Whatever you do, just please don't mix the old and new semantics...
MAL> 'Joe has $ %(a)5.2f in his pocket.' % locals()
MAL> is perfectly valid now and should continue to be valid.
I agree completely; it ought to be one or the other. In the code I emailed, you actually had to do a conversion step from $-strings to %-strings to use the built-in string-mod operator. In practice, if $-strings were to be added to the language, I suspect some new prefix would have to designate a new type of string object, e.g. $'' strings. Or perhaps a different binary operator could be used.
Good. Since the strings themselves don't really change and to avoid confusing string modifiers...

    ur$'my $format \$tring'

I'd suggest to use a new operator, e.g.

    'Joe has $$ $a in his pocket.' $ locals()

--
Marc-Andre Lemburg
"Barry A. Warsaw" wrote:
"MAL" == M <mal@lemburg.com> writes:
MAL> 'Joe has $$ $a in his pocket.' $ locals()
I'd prefer to hijack an existing operator -- one that's unsupported by the string object. Perhaps / or - or & or |
'/' looks nice and has this "interpret under" sort of meaning:

    'Joe has $$ $a in his pocket.' / locals()

If you are more into algebra, then '*' would probably also appeal to the eye:

    'Joe has $$ $a in his pocket.' * locals()

--
Marc-Andre Lemburg
M.-A. Lemburg writes:
'/' looks nice and has this "interpret under" sort of meaning:
'Joe has $$ $a in his pocket.' / locals()
I'd read that more as "mapped over" rather than "interpret under". ;)
If you are more into algebra, then '*' would probably also appeal to the eye:
'Joe has $$ $a in his pocket.' * locals()
But * is already meaningful for strings, so not a good choice. -Fred -- Fred L. Drake, Jr. <fdrake at acm.org> PythonLabs at Zope Corporation
"MAL" == M <mal@lemburg.com> writes:
    MAL> '/' looks nice and has this "interpret under" sort of
    MAL> meaning:

    MAL> 'Joe has $$ $a in his pocket.' / locals()

I agree, I like that one.

    MAL> If you are more into algebra, then '*' would probably also
    MAL> appeal to the eye:

    MAL> 'Joe has $$ $a in his pocket.' * locals()

I avoid it because then you'd have to add another type test to operator-*.

Ping, if you're around and care to comment, perhaps we can try to update PEP 215 and maybe add a reference implementation?

-Barry
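The '/' spelling can be tried today without touching the language, by putting the operator on a str subclass. This is only a sketch under stated assumptions: `DollarString` and `_DOLLAR` are hypothetical names, the regexp is a fresh one written for this example, and the method is `__truediv__` because that is how Python 3 spells '/' (the Python of this thread would have used `__div__`).

```python
import re

# Matches '$$', '$identifier' or '${identifier}' (illustrative pattern).
_DOLLAR = re.compile(r'\$(?:(\$)|([_a-z]\w*)|\{([_a-z]\w*)\})', re.IGNORECASE)

class DollarString(str):
    """Hypothetical str subclass; plain str does not support '/'."""

    def __truediv__(self, mapping):
        def replace(match):
            escaped, bare, braced = match.groups()
            if escaped is not None:
                return '$'
            return str(mapping[bare or braced])
        return _DOLLAR.sub(replace, self)

a = 5
result = DollarString('Joe has $$ $a in his pocket.') / {'a': a}
print(result)   # prints: Joe has $ 5 in his pocket.
```

Because the operator lives on a separate type, plain strings keep their existing '%' behaviour and the old and new semantics are never mixed in one expression.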
"Barry A. Warsaw" wrote:
"MAL" == M <mal@lemburg.com> writes:
MAL> '/' looks nice and has this "interpret under" sort of meaning:
MAL> 'Joe has $$ $a in his pocket.' / locals()
I agree, I like that one.
Fine with me.

--
Marc-Andre Lemburg
"Barry A. Warsaw" wrote:
"MAL" == M <mal@lemburg.com> writes:
MAL> 'Joe has $$ $a in his pocket.' $ locals()
I'd prefer to hijack an existing operator -- one that's unsupported by the string object. Perhaps / or - or & or |
Yuck!

String interpolation should be a *compile time* action, not an operator. One of the goals, in my mind, is to allow people to string interpolate without knowing what the locals() function does. After all, that function is otherwise useless for most Python programmers (and should probably be moved to an introspection module).

Your strategy requires the naive user to learn a) the $ syntax, b) the magic operator syntax and c) the meaning of the locals() function. Plus you've thrown away the idea that interpolation works as it does in the shell or in Perl/Awk/Ruby etc.

At that point, in my mind, we're back where we started and should just use %. We'll have reinvented it with a few small tweaks.

Plus, operator-based evaluation has some security implications that compile time evaluation does not. In particular, if the left-hand thing can be any string then you have the potential of accidentally allowing the user to supply a string that allows him/her to introspect your local variables. That can't happen if the interpolation is done at compile time.

Paul Prescod
paul wrote:
Your strategy requires the naive user to learn a) the $ syntax, b) the magic operator syntax and c) the meaning of the locals() function. Plus you've thrown away the idea that interpolation works as it does in the shell or in Perl/Awk/Ruby etc.
At that point, in my mind, we're back where we started and should just use %.
    # interpolate!
    s = I('Joe has $ ', a, ' in his pocket.')

or perhaps

    # print-like interpolation
    s = P('Joe has $', a, 'in his pocket.')

works pretty well too. in all versions of python, with all existing syntax-aware tools. and if written in C, it's probably as fast as any other solution...

(implementing I/P is left as an exercise etc etc)

</F>
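One possible take on the exercise, in pure Python rather than C. The post does not define I or P, so the behaviour here is a guess: I() concatenates its arguments after str() conversion, and P() joins them with single spaces the way the print statement separates its operands.

```python
def I(*parts):
    """Concatenate the pieces, converting each one with str()."""
    return ''.join(map(str, parts))

def P(*parts):
    """Join the pieces with single spaces, print-style."""
    return ' '.join(map(str, parts))

a = 5
print(I('Joe has $ ', a, ' in his pocket.'))   # prints: Joe has $ 5 in his pocket.
print(P('Joe has $', a, 'in his pocket.'))     # prints: Joe has $ 5 in his pocket.
```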
Paul Prescod writes:
String interpolation should be a *compile time* action, not an operator. One of the goals, in my mind, is to allow people to string
This doesn't work as soon as the string is not a constant. Many of the discussions at PythonLabs did not involve text included as part of an application's source, and the conversion operation would not be driven by application code but by library/service code. Even if it were a constant, needing to add in message catalog support changes things as well. So auto-magical interpolation doesn't seem like a good idea.
interpolate without knowing what the locals() function does. After all, that function is otherwise useless for most Python programmers (and should probably be moved to an introspection module).
You'd still only need to use locals() if that's your source of variables.
Your strategy requires the naive user to learn a) the $ syntax, b) the magic operator syntax and c) the meaning of the locals() function. Plus you've thrown away the idea that interpolation works as it does in the shell or in Perl/Awk/Ruby etc.
a) The $ syntax is easier than the % syntax, and already more familiar to most new users. b) What's a magic operator? string % mapping is already pretty magical as far as the modulus operation is concerned. c) And you still don't have to use locals() if you don't want to. And the string syntax matches a common subset of what's used elsewhere. We just have the added control over the source of substitution values (a good thing).
At that point, in my mind, we're back where we started and should just use %. We'll have reinvented it with a few small tweaks.
And we've made it a lot easier for strings that are not part of Python source code, and for people who produce that data but never know Python.
Plus, operator-based evaluation has some security implications that compile time evaluation does not. In particular, if the left-hand thing can be any string then you have the potential of accidentally allowing the user to supply a string that allows him/her to introspect your local variables. That can't happen if the interpolation is done at compile time.
I'm not sure I understand this. -Fred -- Fred L. Drake, Jr. <fdrake at acm.org> PythonLabs at Zope Corporation
On Mon, Feb 25, 2002 at 12:31:55PM -0800, Paul Prescod wrote:
Yuck!
String interpolation should be a *compile time* action, not an operator. One of the goals, in my mind, is to allow people to string interpolate without knowing what the locals() function does. After all, that function is otherwise useless for most Python programmers (and should probably be moved to an introspection module).
Your strategy requires the naive user to learn a) the $ syntax, b) the magic operator syntax and c) the meaning of the locals() function. Plus you've thrown away the idea that interpolation works as it does in the shell or in Perl/Awk/Ruby etc.
At that point, in my mind, we're back where we started and should just use %. We'll have reinvented it with a few small tweaks.
Plus, operator-based evaluation has some security implications that compile time evaluation does not. In particular, if the left-hand thing can be any string then you have the potential of accidentally allowing the user to supply a string that allows him/her to introspect your local variables. That can't happen if the interpolation is done at compile time.
But how do you internationalize your program once you use $-subs? The great strength of %-formats, and the *printf functions that inspired them, is that the interpretation of the format takes place at runtime. (printf has added positional specifiers, spelled like "%1$s", to permit reordering of items in the format, while Python has added key-specifiers, spelled like "%(id)s", but they're about equally powerful)

With %-subs, we can write

    def gettext(s):
        """ Return the localized version of s from the message catalog """
        return s

    def print_chance(who, chance):
        print gettext("%(who)s has a %(percent).2f%% chance of surviving") % {
            'who': who, 'percent': chance * 100}

    print_chance("Jeff", 1./3)

I'm not interested in any proposal that turns code that's easy to internationalize (just add calls to gettext(), commonly spelled _(), around each string that needs translating, then fix up the places where the programmer was too clever) into code that's impossible to internationalize by design.

Jeff
On Mon, Feb 25, 2002 at 03:55:15PM -0500, Fred L. Drake, Jr. wrote:
And we've made it a lot easier for strings that are not part of Python source code, and for people who produce that data but never know Python.
But for applications where people don't edit Python, this could just be a library module and doesn't need a new operator in the Python code. I agree with Paul; there's no actual gain in clarity from the new syntax.
Plus, operator-based evaluation has some security implications that compile time evaluation does not. In particular, if the left-hand thing
I'm not sure I understand this.
Presumably Paul is thinking of something like:

    mlist = load_list('listname')    # Lists have .title, .password, ...
    form_value = cgi.form['text']    # User puts $password into text
    print text \ vars(mlist)

--amk (www.amk.ca)
The most merciful thing in the world, I think, is the inability of the human mind to correlate all its contents.
    -- H.P. Lovecraft, "The Call of Cthulhu"
String interpolation should be a *compile time* action, not an operator. One of the goals, in my mind, is to allow people to string interpolate without knowing what the locals() function does. After all, that function is otherwise useless for most Python programmers (and should probably be moved to an introspection module).
Your strategy requires the naive user to learn a) the $ syntax, b) the magic operator syntax and c) the meaning of the locals() function. Plus you've thrown away the idea that interpolation works as it does in the shell or in Perl/Awk/Ruby etc.
At that point, in my mind, we're back where we started and should just use %. We'll have reinvented it with a few small tweaks.
Plus, operator-based evaluation has some security implications that compile time evaluation does not. In particular, if the left-hand thing can be any string then you have the potential of accidentally allowing the user to supply a string that allows him/her to introspect your local variables. That can't happen if the interpolation is done at compile time.
All right, but there *also* needs to be a way to invoke interpolation explicitly -- just like eval(). This has applicability e.g. in i18n. --Guido van Rossum (home page: http://www.python.org/~guido/)
Fred L. Drake, Jr. wrote:
This doesn't work as soon as the string is not a constant. Many of the discussions at PythonLabs did not involve text included as part of an application's source, and the conversion operation would not be driven by application code but by library/service code.
Write a function or use %. This is not a good reason to add a string interpolation operator to the language. Note that this does not mean I'm against PEP 215. PEP 215 proposes to solve a different problem and should not be hijacked, IMHO. Neil
    BAW> ... I suspect some new prefix would have to designate a new type of
    BAW> string object, e.g. $'' strings. Or perhaps a different binary
    BAW> operator could be used.

I'm still not at all fond of the $-string idea, but in the interests of completeness, perhaps using '$' as a binary operator (by analogy with '%' as a binary operator having nothing to do with modulo when the left arg is a string) would be appropriate.

Skip
On Mon, Feb 25, 2002 at 03:55:15PM -0500, Fred L. Drake, Jr. wrote:
Paul Prescod writes:
Plus, operator-based evaluation has some security implications that compile time evaluation does not. In particular, if the left-hand thing can be any string then you have the potential of accidentally allowing the user to supply a string that allows him/her to introspect your local variables. That can't happen if the interpolation is done at compile time.
I'm not sure I understand this.
Imagine that you have:

    def print_crypted_passwd(name, plaintext, salt="Xx"):
        crypted = crypt.crypt(plaintext, salt)
        print _("""%(name)s, your crypted password is %(crypted)s.""") % locals()

and that some crafty devil translates this as

    msgstr "%(name)s, your plaintext password is %(plaintext). HA HA HA"

i.e., the translator (or other person who can influence the format string) can access other information in the dict you pass in, even if you didn't intend it.

Personally, I tend to view this as showing that using % locals() is unsanitary. But that means that the problem is in using the locals() dictionary, a problem made worse by making the use of locals() implicit.

(And under $-substitution, if locals() is implicit, how do I substitute with a dictionary other than locals()?

    def print_crypted_passwd(accountinfo):
        print "%(name)s, your crypted password is %(crypted)s." \
            % accountinfo.__dict__

vs

    def print_crypted_passwd(accountinfo):
        def really_subst(name, crypted):
            return $"$name, your crypted password is $crypted"
        print really_subst(accountinfo.name, accountinfo.crypted)

or

    def print_crypted_passwd(accountinfo):
        name = accountinfo.name
        crypted = accountinfo.crypted
        print $"$name, your crypted password is $crypted"

???)

Jeff
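The leak described here is easy to reproduce. The following is a sketch in present-day Python under several assumptions: `fake_crypt` is a stand-in built on hashlib (used only so the example runs anywhere, and not a real password hash), `leaky` and `careful` are hypothetical names, and the hostile template is written with the trailing `s` so that it is a valid format string.

```python
import hashlib

def fake_crypt(plaintext, salt):
    # Stand-in for crypt.crypt(); illustration only, not a real password hash.
    return hashlib.sha256((salt + plaintext).encode()).hexdigest()[:13]

HONEST = "%(name)s, your crypted password is %(crypted)s."
HOSTILE = "%(name)s, your plaintext password is %(plaintext)s. HA HA HA"

def leaky(template, name, plaintext, salt="Xx"):
    crypted = fake_crypt(plaintext, salt)
    # locals() hands the template every local, including plaintext.
    return template % locals()

def careful(template, name, plaintext, salt="Xx"):
    crypted = fake_crypt(plaintext, salt)
    # Only the names the template is supposed to see are exposed.
    return template % {'name': name, 'crypted': crypted}

print(leaky(HOSTILE, 'jeff', 'hunter2'))
# prints: jeff, your plaintext password is hunter2. HA HA HA

try:
    careful(HOSTILE, 'jeff', 'hunter2')
except KeyError as exc:
    print('rejected, unknown key:', exc)
```

With an explicit dictionary the hostile template fails with a KeyError instead of printing the secret, which fits the argument that the problem lies in passing locals() rather than in runtime interpolation as such.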
On Mon, Feb 25, 2002 at 01:18:57PM -0800, Neil Schemenauer wrote:
Jeff Epler wrote:
But how do you internationalize your program once you use $-subs?
So don't use them. What's the problem?
The problem is when I have to internationalize a program some schmuck wrote using $-subs throughout. Jeff
Jeff Epler wrote:
...
Imagine that you have:

    def print_crypted_passwd(name, plaintext, salt="Xx"):
        crypted = crypt.crypt(plaintext, salt)
        print _("""%(name)s, your crypted password is %(crypted)s.""") % locals()

and that some crafty devil translates this as

    msgstr "%(name)s, your plaintext password is %(plaintext). HA HA HA"
i.e., the translator (or other person who can influence the format string) can access other information in the dict you pass in, even if you didn't intend it.
Right. I don't claim that this is a killer problem. I'm actually much more concerned about the usability aspects. But if we can improve security at the same time, then let's.
Personally, I tend to view this as showing that using % locals() is unsanitary. But that means that the problem is in using the locals() dictionary, a problem made worse by making the use of locals() implicit.
If it is done at compile time then the crafty devil couldn't get in the alternate string! On the other hand, if you're doing runtime translation stuff then of course you need to use a runtime function, like "%" or maybe a new "interpol". I am not against the existence of such a thing. I'm against it being the default way to do interpolation. It's like "eval": a compile-time tool that sophisticated users have access to at runtime.
(And under $-substitution, if locals() is implicit, how do I substitute with a dictionary other than locals()?
Well I don't think you should have to, because you could use the "interpol" function (maybe from the "interpol" module). But anyhow, your question has a factual answer and you already gave it!
    def print_crypted_passwd(accountinfo):
        def really_subst(name, crypted):
            return $"$name, your crypted password is $crypted"
        print really_subst(accountinfo.name, accountinfo.crypted)

or

    def print_crypted_passwd(accountinfo):
        name = accountinfo.name
        crypted = accountinfo.crypted
        print $"$name, your crypted password is $crypted"
This last one looks very clear and simple to me! What's the problem with it? Still, I don't argue against the need for something at runtime -- as a power tool. Either we could just keep "%" or make a function. Okay, so my proposal for $ doesn't do everything that % does. It was never spec'd to do everything "%" does. For instance it doesn't do float formatting tricks. Paul Prescod
"SM" == Skip Montanaro <skip@pobox.com> writes:
    SM> I'm still not at all fond of the $-string idea, but in the
    SM> interests of completeness, perhaps using '$' as a binary
    SM> operator (by analogy with '%' as a binary operator having
    SM> nothing to do with modulo when the left arg is a string) would
    SM> be appropriate.

I can't say whether it's a good thing to add this to the language or not. I tend to think that %(var)s is just fine from a Python programmer's point of view, and in the interest of TOOWTDI, we don't need anything else.
Jeff Epler wrote:
On Mon, Feb 25, 2002 at 01:18:57PM -0800, Neil Schemenauer wrote:
Jeff Epler wrote:
But how do you internationalize your program once you use $-subs?
So don't use them. What's the problem?
The problem is when I have to internationalize a program some schmuck wrote using $-subs throughout.
I think you go through and remove the "$" signs (probably at the same time you are removing "_") and use a runtime function to do the translation (probably the same function doing the interpolation). Then you take on the responsibility yourself for making sure that the original string is a constant (not a user-supplied variable) and that the replacement strings come from somewhere secure.

So:

    a = $"Hello there $name"

becomes:

    a = _("Hello there $name")

I think Barry's gettext already does that or something, doesn't it?

Paul Prescod
"JE" == Jeff Epler <jepler@unpythonic.dhs.org> writes:
    JE> Imagine that you have:

    >> def print_crypted_passwd(name, plaintext, salt="Xx"):
    >>     crypted = crypt.crypt(plaintext, salt)
    >>     print _("""%(name)s, your crypted password is %(crypted)s.""") % locals()

    JE> and that some crafty devil translates this as

    JE>     msgstr "%(name)s, your plaintext password is %(plaintext). HA HA HA"

    JE> i.e., the translator (or other person who can influence the
    JE> format string) can access other information in the dict you
    JE> pass in, even if you didn't intend it.

That's a very interesting vulnerability you bring up!

In my own implementation, _() uses sys._getframe(1) to gather up the caller's locals and globals into the interpolation dictionary, i.e. you don't need to specify it explicitly. Damn convenient, but also vulnerable to this exploit. In that case, I'd be very careful to make sure that print_crypted_passwd() was written such that the plaintext wasn't available via a variable in the caller's frame.

    JE> Personally, I tend to view this as showing that using %
    JE> locals() is unsanitary.

Nope, but you have to watch out not to mix cooked and raw food on the same plate (to stretch an unsavory analogy).

    JE> But that means that the problem is in using the locals()
    JE> dictionary, a problem made worse by making the use of locals()
    JE> implicit.

    JE> (And under $-substitution, if locals() is implicit, how do I
    JE> substitute with a dictionary other than locals()?

    def print_crypted_passwd(name, crypted):
        print $"$name, your crypted password is $crypted"

    print_crypted_passwd(yername, crypt.crypt(plaintext, salt))

-Barry
"JE" == Jeff Epler <jepler@unpythonic.dhs.org> writes:
    JE> I'm not interested in any proposal that turns code that's easy
    JE> to internationalize (just add calls to gettext(), commonly
    JE> spelled _(), around each string that needs translating, then
    JE> fix up the places where the programmer was too clever) into
    JE> code that's impossible to internationalize by design.

I'm with you there, Jeff.

-Barry
"PP" == Paul Prescod <paul@prescod.net> writes:
    PP> Okay, so my proposal for $ doesn't do everything that %
    PP> does. It was never spec'd to do everything "%" does. For
    PP> instance it doesn't do float formatting tricks.

Does anybody ever even use something other than `s' for %() strings?

    >>> '%(float)f' % {'float': 3.9}
    '3.900000'

I never have.

-Barry
barry wrote:
From a /non-programmer's/ point of view, %(var)s is way too error prone, and $-strings are an attempt at implementing a simple to explain, hard to get wrong, rule for thru-the-web supplied template strings.
how about making that "s" optional?

1. %% substitutes to just a single %

2. %(identifier) followed by non-identifier characters gets interpolated with the value of the 'identifier' key in the substitution dictionary.

3. For handling cases where the identifier is followed by identifier characters that aren't part of the key, %(identifier)s is equivalent to %(identifier)

</F>
Andrew Kuchling writes:
But for applications where people don't edit Python, this could just be a library module and doesn't need a new operator in the Python code. I agree with Paul; there's no actual gain in clarity from the new syntax.
I'm happy with that as well.
Presumably Paul is thinking of something like:

    mlist = load_list('listname')    # Lists have .title, .password, ...
    form_value = cgi.form['text']    # User puts $password into text
    print text \ vars(mlist)

Yes, but I'm not convinced this has any more security implications than using a library function to perform the transformation. -Fred -- Fred L. Drake, Jr. <fdrake at acm.org> PythonLabs at Zope Corporation
"PP" == Paul Prescod <paul@prescod.net> writes:
PP> "Barry A. Warsaw" wrote: >> >> ... Does anybody ever even use something other than `s' for >> %() strings? >>> '%(float)f' % {'float': 3.9} '3.900000' PP> Presumably numerical analysts do....and David Ascher once told PP> me he uses %d as a sanity type-check. I don't bother. %d I sometimes use, but I don't think I've ever (purposely) used %(var)d. -Barry
BAW> There's been no usability testing yet to know whether $-strings
BAW> actually will be easier to use <wink>, but I've got plenty of
BAW> anecdotal evidence that %-strings suck badly for usability by
BAW> non-Python programmers.

I presume your anecdotal evidence comes from Mailman. If you have a pair of functions that implement the %-to-$-to-% transformation and can catch the missing 's' problem automatically (is that the biggest problem non-programmers have?), then why not just use this in Mailman and be done with the problem? In fact, why not just document Mailman so that "%(var)" is the correct form and silently add the "missing" 's' in your transformation step?

That %-strings suck for Mailman administrators does not mean they necessarily suck for programmers. The two populations obviously overlap somewhat, but not tremendously. I have never had a problem with %-strings, certainly not with omitting the trailing 's'. Past experience with printf() doesn't obviously pollute the sample population too much either, since the %(var)s type of format is not supported by printf().

BAW> Still, if $-strings are better for non-programmers, maybe they're
BAW> better for programmers too. There's certainly evidence that
BAW> translators get them wrong too.

What do you mean by "translators"?

Skip
There are two entirely different potential uses for interpolation. One is for the Python programmer; call this literal interpolation. It's cute to be able to write

    a = 12
    b = 15
    c = a*b
    print $"A rectangle of $a x $b has an area of $c."

This is arguably better than

    print "A rectangle of", a, "x", b, "has an area of", c, "."

(and to get rid of the space between the value of c and the '.' a totally different paradigm would have to be used).

A totally *different* use of interpolation is for templates, where both the template (any data containing the appropriate $ syntax) and the set of variables to be substituted (any mapping) should be under full control of the program. This is what Mailman needs.

Literal interpolation has no security issues, if done properly. In the latter use, the security issues can be taken care of by carefully deciding what data is available in the set of variables to be interpolated. The interpolation syntax I've proposed is intentionally very simple, so that this is relatively easy. I recall seeing slides at the conference of a templating system (maybe Twisted's?) that allowed expressions like $foo.bar[key] which would be much harder to secure.

I18n of templates is easy -- just look up the template string in the translation database. I18n of apps using literal interpolation is more of a can of worms, and I have no clear solution. I agree that a solution is needed -- otherwise literal interpolation would be *worse* than what we have now!

--Guido van Rossum (home page: http://www.python.org/~guido/)
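A small sketch of the template use case with a deliberately restricted mapping. It relies on `string.Template` from the standard library of later Python releases, which follows the three $-rules discussed in this thread; the `MailingList` class, its attributes, and the `render` helper are made up for illustration and are not Mailman's actual code.

```python
from string import Template


class MailingList:
    """Stand-in application object; the attribute names are hypothetical."""

    def __init__(self):
        self.title = "python-dev"
        self.password = "s3cret"


def render(template_text, obj, allowed):
    # Expose only the attributes the program has decided are safe,
    # instead of handing the template author vars(obj).
    values = {name: getattr(obj, name) for name in allowed}
    return Template(template_text).substitute(values)


mlist = MailingList()
print(render("Welcome to ${title}! It costs $$0.", mlist, allowed=("title",)))

try:
    render("The password is $password", mlist, allowed=("title",))
except KeyError as exc:
    # The name is simply not in the mapping, so nothing leaks.
    print("rejected placeholder:", exc)
```

Because the syntax allows only plain identifiers, controlling the keys of the mapping is enough to control what a template author can reach.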
Does anybody ever even use something other than `s' for %() strings?
>>> '%(float)f' % {'float': 3.9}
'3.900000'
I never use this in combination with named variables, but I often write timing programs that format times using "%6.3f" to get millisecond precision. --Guido van Rossum (home page: http://www.python.org/~guido/)
"SM" == Skip Montanaro <skip@pobox.com> writes:
BAW> There's been no usability testing yet to know whether
BAW> $-strings actually will be easier to use <wink>, but I've got
BAW> plenty of anecdotal evidence that %-strings suck badly for
BAW> usability by non-Python programmers.

SM> I presume your anecdotal evidence comes from Mailman.

Correct.

SM> If you have a pair of functions that implement the %-to-$-to-%
SM> transformation and can catch the missing 's' problem
SM> automatically (is that the biggest problem non-programmers
SM> have?),

The biggest, yes, but not necessarily the only one.

SM> then why not just use this in Mailman and be done with the
SM> problem?

That's what I plan on doing for MM2.1, except I won't force it down people's throats yet. It'll be optional (but it'll be an either-or option). I won't use it in Python code yet though (too disruptive), just the thru-the-web template defining text-boxes.

SM> In fact, why not just document Mailman so that "%(var)" is the
SM> correct form and silently add the "missing" 's' in your
SM> transformation step?

SM> That %-strings suck for Mailman administrators does not mean
SM> they necessarily suck for programmers.

True, but who knows? I wouldn't necessarily classify python-dev as a representative sample of users.

SM> The two populations obviously overlap somewhat, but not
SM> tremendously. I have never had a problem with %-strings,
SM> certainly not with omitting the trailing 's'. Past experience
SM> with printf() doesn't obviously pollute the sample population
SM> too much either, since the %(var)s type of format is not
SM> supported by printf().

BAW> Still, if $-strings are better for non-programmers, maybe
BAW> they're better for programmers too. There's certainly
BAW> evidence that translators get them wrong too.

SM> What do you mean by "translators"?

Someone who is fluent in a natural language other than English, and translates a catalog of English source strings to a target non-English natural language. E.g.
"No such list: %(listname)s" -> "Non esiste la lista: %(listname)s" -Barry
Barry A. Warsaw writes:
I can't say whether it's a good thing to add this to the language or not. I tend to think that %(var)s is just fine from a Python programmer's point of view, and in the interest of TOOWTDI, we don't
We're definitely seeing a lot of reasonable concern over adding another formatting operator, and my own interest in the proposal has nothing to do with having an operator to do this. I probably shouldn't have said anything about the topic (I don't recall even noting a preference, myself, just that I'd read one alternative differently than Marc-Andre and that another already had a meaning).
From a /non-programmer's/ point of view, %(var)s is way too error prone, and $-strings are an attempt at implementing a simple to explain, hard to get wrong, rule for thru-the-web supplied template
How the string was obtained is irrelevant, only that it is not part of the source code and the author may not be a programmer. -Fred -- Fred L. Drake, Jr. <fdrake at acm.org> PythonLabs at Zope Corporation
"Fred" == Fred L Drake, Jr <fdrake@acm.org> writes:
>> From a /non-programmer's/ point of view, %(var)s is way too >> error prone, and $-strings are an attempt at implementing a >> simple to explain, hard to get wrong, rule for thru-the-web >> supplied template Fred> How the string was obtained is irrelevant, only that it is Fred> not part of the source code and the author may not be a Fred> programmer. Correct. -Barry
barry@zope.com (Barry A. Warsaw) writes:
JE> i.e., the translator (or other person who can influence the JE> format string) can access other information in the dict you JE> pass in, even if you didn't intend it.
That's a very interesting vulnerability you bring up!
That's not a vulnerability. It assumes that the translator is an attacker, or that the attacker can change the catalogs. If he is or can, you could not trust them, anyway, as they could cause arbitrary other failures, as well. Regards, Martin
Paul Prescod <paul@prescod.net> writes:
I think you go through and remove the "$" signs (probably at the same time you are removing "_") and use a runtime function to do the translation (probably the same function doing the interpolation).
I could not accept any solution that cannot offer anything but this. This kind of interpolation is plain broken. Regards, Martin
On Mon, Feb 25, 2002 at 11:27:49PM +0100, Martin v. Loewis wrote:
Paul Prescod <paul@prescod.net> writes:
I think you go through and remove the "$" signs (probably at the same time you are removing "_") and use a runtime function to do the translation (probably the same function doing the interpolation).
I could not accept any solution that cannot offer anything but this. This kind of interpolation is plain broken.
Exactly. Why spend all this time and effort complicating the Python parser and compiler, only to find that all real-world programs just instead implement the feature inside a function call? Jeff
On Mon, Feb 25, 2002 at 11:25:48PM +0100, Martin v. Loewis wrote:
That's not a vulnerability. It assumes that the translator is an attacker, or that the attacker can change the catalogs. If he is or can, you could not trust them, anyway, as they could cause arbitrary other failures, as well.
It means that you must audit not only your source code, but also your message catalogs, to determine whether information that is supposed to remain internal to a program is not formatted into a string. Of course, it is fairly easy to do this audit by showing that the translated string doesn't contain substitution on any identifiers that the original string did not. I don't think it's impossible that someone supplying catalogs could be an "attacker", even if a plausible scenario doesn't come directly to mind. Jeff
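The audit described here could be sketched roughly as follows for %(name)-style strings. The helper names and the regular expression are assumptions for illustration, not part of gettext or Mailman.

```python
import re

# Hypothetical pattern for the %(name) part of a named placeholder.
_NAMED = re.compile(r"%\(([^)]+)\)")


def placeholders(text):
    """Return the set of names used as %(name) placeholders in text."""
    # Drop escaped percent signs first so "%%(x)" is not counted.
    return set(_NAMED.findall(text.replace("%%", "")))


def extra_placeholders(original, translated):
    """Names substituted by the translation but not by the original."""
    return placeholders(translated) - placeholders(original)


good = extra_placeholders("No such list: %(listname)s",
                          "Non esiste la lista: %(listname)s")
bad = extra_placeholders("No such list: %(listname)s",
                         "Lista %(listname)s, password %(password)s")
print(good, bad)
```

An empty result means the translated string asks for nothing the original did not already expose.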
Jeff Epler wrote:
...
Exactly. Why spend all this time and effort complicating the Python parser and compiler, only to find that all real-world programs just instead implement the feature inside a function call?
Nobody said to reimplement it. I've said on several occasions that there should be a runtime version. Paul Prescod
"JE" == Jeff Epler <jepler@unpythonic.dhs.org> writes:
JE> On Mon, Feb 25, 2002 at 11:25:48PM +0100, Martin v. Loewis JE> wrote: >> That's not a vulnerability. It assumes that the translator is >> an attacker, or that the attacker can change the catalogs. If >> he is or can, you could not trust them, anyway, as they could >> cause arbitrary other failures, as well. JE> It means that you must audit not only your source code, but JE> also your message catalogs, to determine whether information JE> that is supposed to remain internal to a program is not JE> formatted into a string. Of course, it is fairly easy to do JE> this audit by showing that the translated string doesn't JE> contain substitution on any identifiers that the original JE> string did not.
"Martin v. Loewis" wrote:
Paul Prescod <paul@prescod.net> writes:
I think you go through and remove the "$" signs (probably at the same time you are removing "_") and use a runtime function to do the translation (probably the same function doing the interpolation).
I could not accept any solution that cannot offer anything but this. This kind of interpolation is plain broken.
How so? I need more info to go on. Paul Prescod
"Fred L. Drake, Jr." wrote:
...
Yes, but I'm not convinced this has any more security implications than using a library function to perform the transformation.
The point is that the simplest mechanism, that we teach to newbies, has non-obvious security "concerns". If we have literal interpolation, then a library function would be used by people who WANT to do it at runtime because they have a REASON for doing it at runtime and thus have a pretty clear concept of the distinction between runtime and compile time. But as I've said, the major reason for this is not security. I don't know that a Python program has been hacked through "%" so it doesn't make sense to lose sleep over it. The major reason for doing it at compile time (for me) is that you can have a nice syntax that doesn't involve modulus-ing (or dividing) an otherwise useless vars() or locals() dictionary. Paul Prescod
SM> What do you mean by "translators"?

BAW> Someone who is fluent in a natural language other than English, and
BAW> translates a catalog of English source strings to a target
BAW> non-English natural language. E.g.
BAW> "No such list: %(listname)s" -> "Non esiste la lista: %(listname)s"

So translators aren't programmers either. Just tell them anything between %(...) and the first alphabetic character after that is off-limits. Again, it doesn't look to me like a programmer problem.

Just to play the devil's advocate (and ignoring the bit about $-strings not being i18n-friendly), I suspect non-programming translators would have just as much trouble with something like

    $"Please confirm your choice of color ($color)..."

"$color" will look like a word to be translated. You would have to tell them "don't translate anything immediately following a dollar sign up to, but not including the next character that can't be part of a Python identifier." Seems either a bit error-prone or confusing to me if I pretend I'm not a programmer.

Skip
Guido van Rossum wrote:
There are two entirely different potential uses for interpolation. One is for the Python programmer; call this literal interpolation.
True!
... A totally *different* use of interpolation is for templates, where both the template (any data containing the appropriate $ syntax) and the set of variables to be substituted (any mapping) should be under full control of the program. This is what Mailman needs.
True! But we've already got a solution for this. Is there something wrong with it? I guess I don't know what problem we're trying to solve. My only interest in interpolation was to make the common, simple case easier.
Literal interpolation has no security issues, if done properly. In the latter use, the security issues can be taken care of by carefully deciding what data is available in the set of variables to be interpolated. The interpolation syntax I've proposed is intentionally very simple, so that this is relatively easy. I recall seeing slides at the conference of a templating system (maybe Twisted's?) that allowed expressions like $foo.bar[key] which would be much harder to secure.
I'm not attached enough to fight for these but I'll re-emphasize your implicit point that these are entirely secure if used in literal interpolation.
I18n of templates is easy -- just look up the template string in the translation database.
I18n of apps using literal interpolation is more of a can of worms, and I have no clear solution. I agree that a solution is needed -- otherwise literal interpolation would be *worse* than what we have now!
You translate them from compile time interpolation to runtime by removing a $ and replacing it by a function call.

    a = $"My name is $name"

becomes:

    a = interp(_("My name is $name"))

But of course it is trivial to make the last line of '_' return interp(rc) so that the client doesn't have to do it. Paul Prescod
"SM" == Skip Montanaro <skip@pobox.com> writes:
SM> So translators aren't programmers either.

Well, they may not be /Python/ programmers. ;)

SM> Just tell them anything between %(...) and the first
SM> alphabetic character after that is off-limits. Again, it
SM> doesn't look to me like a programmer problem.

SM> Just to play the devil's advocate (and ignoring the bit about
SM> $-strings not being i18n-friendly), I suspect non-programming
SM> translators would have just as much trouble with something
SM> like

SM> $"Please confirm your choice of color ($color)..."

SM> "$color" will look like a word to be translated. You would
SM> have to tell them "don't translate anything immediately
SM> following a dollar sign up to, but not including the next
SM> character that can't be part of a Python identifier." Seems
SM> either a bit error-prone or confusing to me if I pretend I'm
SM> not a programmer.

To be clear, I think the ideal interface would be a graphical one, with drag-n-drop icons for the textual placeholders. This would allow them to re-arrange the order of the placeholder, and it would be obvious what is variable in your templates, but it wouldn't allow them to change, remove, or add placeholders. Then it wouldn't matter what syntax you actually used. I'm holding my breath... ready... go! <wink>

-Barry
"Paul" == Paul Prescod <paul@prescod.net> writes:
Paul> "Martin v. Loewis" wrote: >> I could not accept any solution that cannot offer anything but this. >> This kind of interpolation is plain broken. Paul> How so? I need more info to go on. I have no direct experience with text translation, but in this internet day and age, it seems to me that a change to the language shouldn't make internationalization more difficult than it already is. (I doubt anyone will claim that it's truly easy, even with gettext.) Guido mentioned a number of other languages that already use $-interpolation, Perl, the shells, awk and Ruby I think. Of those, all but Ruby were around before the explosion of the internet in general and the web and Unicode in particular, so internationalization wasn't a prime consideration when those languages' $-interpolation facilities were implemented. Skip
Sorry about the first, fumbled reply... BAW> To be clear, I think the ideal interface would be a graphical one, BAW> with drag-n-drop icons for the textual placeholders. This would BAW> allow them to re-arrange the order of the placeholder, and it would BAW> be obvious what is variable in your templates, but it wouldn't BAW> allow them to change, remove, or add placeholders. This places the onus back on the application programmer, not the language designer. Skip
Skip Montanaro wrote:
"Paul" == Paul Prescod <paul@prescod.net> writes:
Paul> "Martin v. Loewis" wrote: >> I could not accept any solution that cannot offer anything but this. >> This kind of interpolation is plain broken.
Paul> How so? I need more info to go on.
I have no direct experience with text translation, but in this internet day and age, it seems to me that a change to the language shouldn't make internationalization more difficult than it already is.
I've proposed that whereas today you add a "_( )" in the future you would add "_( )" and remove "$" if it happens to occur at the start of the string. If the string didn't start with a "$" you might also have to scan to see if it contains one. In that case you double it up. This doesn't make internationalization more difficult. As proof I present mailman, which *already* does the interpolation I ask for as a feature of its implementation of "_()". All I'm asking is that mailman's interpolation feature ALSO be available under a simplified syntax at compile time. Paul Prescod
Jeff Epler <jepler@unpythonic.dhs.org> writes:
It means that you must audit not only your source code, but also your message catalogs, to determine whether information that is supposed to remain internal to a program is not formatted into a string. Of course, it is fairly easy to do this audit by showing that the translated string doesn't contain substitution on any identifiers that the original string did not.
That specific test could be done automatically. In fact, GNU msgfmt already performs the test for c-format strings; msgfmt.py should probably learn about the common notations for string interpolation. Regards, Martin
Paul Prescod <paul@prescod.net> writes:
I think you go through and remove the "$" signs (probably at the same time you are removing "_") and use a runtime function to do the translation (probably the same function doing the interpolation).
I could not accept any solution that cannot offer anything but this. This kind of interpolation is plain broken.
How so? I need more info to go on.
In the applications that I have in mind, interpolated strings are typically presented to the user, so there must be a way to localize them. An extension to the language that does not support localization is useless if I have to find some other means for l10n. If there will be a standard library function that does the interpolation anyway, I'd prefer not to have a language extension that achieves the same thing but is more limited. If anything, the language extension should be more powerful, not more limited, in applicability. Regards, Martin
Skip Montanaro <skip@pobox.com> writes:
Just to play the devil's advocate (and ignoring the bit about $-strings not being i18n-friendly), I suspect non-programming translators would have just as much trouble with something like
$"Please confirm your choice of color ($color)..."
"$color" will look like a word to be translated. You would have to tell them "don't translate anything immediately following a dollar sign up to, but not including the next character that can't be part of a Python identifier." Seems either a bit error-prone or confusing to me if I pretend I'm not a programmer.
Indeed. Therefore, the only true solution is to have an automatic check that verifies that the translated string has the same inserts as the original. Such a check could instruct users to follow any interpolation scheme; even if translators don't know the programming language of the application, they still are typically capable of understanding the error messages from msgfmt. Regards, Martin
"Martin v. Loewis" wrote:
...
In the applications that I have in mind, interpolated strings are typically presented to the user, so there must be a way to localize them. An extension to the language that does not support localization is useless if I have to find some other means for l10n.
You will use another invocation syntax, but probably the same string interpolation syntax.
If there will be a standard library function that does the interpolation anyway, I'd prefer not to have a language extension that achieves the same thing but is more limited. If anything, the language extension should be more powerful, not more limited, in applicability.
The language extension should be syntactically simpler because it is what is used for simpler cases. Simpler constructs are also less likely to open up security issues. Paul Prescod
"PP" == Paul Prescod <paul@prescod.net> writes:
PP> This doesn't make internationalization more difficult. As PP> proof I present mailman, which *already* does the PP> interpolation I ask for as a feature of its implementation of PP> "_()". All I'm asking is that mailman's interpolation feature PP> ALSO be available under a simplified syntax at compile time. Except that remember the interpolation step must happen /after/ the translation step, otherwise it's worse than useless. -Barry
[meant to send this before] Guido van Rossum wrote:
...
All right, but there *also* needs to be a way to invoke interpolation explicitly -- just like eval(). This has applicability e.g. in i18n.
Agree 100%. The last time we discussed this I proposed there should be a function to do this. But the naive integrated syntax could be compile time. My complaints with the current interpolation are:
1. they require too many magical incantations to invoke (especially % vars())
2. they require too much thinking about types and conversions in the syntax
3. special behaviour with dictionaries and tuples and singleton tuples etc.
4. operator abuse

I would only be in favour of a replacement if for *simple cases* it cleared up all of these issues so that it is roughly as easy as in Perl/Ruby/sh/tcl:

    a = $"My name is $name"

If there is any more syntax than that then personally I think that the cost/benefit ratio falls down. So I don't see this as a big win:

    a = "My name is $name" \ locals()

It solves two of my four problems. Maybe other people have different goals than I do and that's why they see the above as a "win". Paul Prescod
"Barry A. Warsaw" wrote:
"PP" == Paul Prescod <paul@prescod.net> writes:
PP> This doesn't make internationalization more difficult. As PP> proof I present mailman, which *already* does the PP> interpolation I ask for as a feature of its implementation of PP> "_()". All I'm asking is that mailman's interpolation feature PP> ALSO be available under a simplified syntax at compile time.
Except that remember the interpolation step must happen /after/ the translation step, otherwise it's worse than useless.
Right, that's why *for localized software* you should do it at runtime. And insofar as the process of localization *already* consists of touching every string, it takes no extra effort to change a compile-time interpolation to a runtime one while you are at it. But the newbie to Python should not be saddled with a syntax optimized towards advanced users, and even as a person often hacking single-language software I shouldn't be saddled with dynamic interpolation until I need it either! "Saddled with" means "required to use a verbose, non-intuitive syntax with a bunch of special cases for a simple and common operation." Paul Prescod
Paul Prescod writes:
The major reason for doing it at compile time (for me) is that you can have a nice syntax that doesn't involve modulus-ing (or dividing) an otherwise useless vars() or locals() dictionary.
Which has everything to do with your usage. I almost never use % with locals() or vars(), so I don't share that motivation. I'm much more likely to build a dict specifically for the purpose, which includes computed values, or have something already created which includes this usage as part of the larger picture. -Fred -- Fred L. Drake, Jr. <fdrake at acm.org> PythonLabs at Zope Corporation
I'm not sure I like the idea of using $ as the character for prefixing interpolated strings. Somehow

    a = $"My name is $name"

looks confusing. I think it has something to do with the fact that $ is appearing both inside and outside the quotes, making my visual parser worry that the quotes are misplaced.

Also, it uses up one of the three precious not-yet-used characters, and I think we should keep those for some future time when we really need them. We don't need one for this -- there are plenty of operators available that haven't yet been used on strings.

I suggest '^', since it does a nice job of suggesting "inject stuff into this string". We can have both a prefix form for compile-time interpolation:

    a = ^ "My name is $name"

and an infix form for run-time interpolation:

    a = "My name is $name" ^ dict

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,          | A citizen of NewZealandCorp, a       |
Christchurch, New Zealand          | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz         +--------------------------------------+
"GE" == Greg Ewing <greg@cosc.canterbury.ac.nz> writes:
GE> Also, it uses up one of the three precious not-yet-used GE> characters, and I think we should keep those for some GE> future time when we really need them. We don't need one GE> for this -- there are plenty of operators available GE> that haven't yet been used on strings. GE> I suggest '^', since it does a nice job of suggesting GE> "inject stuff into this string". We can have both a GE> prefix form for compile-time interpolation: GE> a = ^ "My name is $name" GE> and an infix form for run-time interpolation: GE> a = "My name is $name" ^ dict I think I suggested using ~ for this at IPC10: a = ~'my name is $name' for the compile-time interpolation. I don't think it matters much which operator is chosen (let Guido decide). -Barry
[Barry A. Warsaw]
Does anybody ever even use something other than `s' for %() strings?
>>> '%(float)f' % {'float': 3.9}
'3.900000'
I never have.
Then again, you've never used a floating-point number either <wink>. I've certainly used %(x)f/g/e with float formats. Not quite speaking of which, if Python grows a new $ operator, let's get the precedence right. This kind of thing is too common today:
>>> amount = 3.50
>>> n = 3
>>> print "Total: $%.2f." % amount*n
Total: $3.50.Total: $3.50.Total: $3.50.
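For reference, a short sketch of why that happens and the usual fix. `%` and `*` have the same precedence and group left to right, so the string is formatted first and the result is then repeated `n` times; parenthesizing the arithmetic gives the intended total. The variable names `wrong` and `right` are just for illustration.

```python
amount = 3.50
n = 3

# Parsed as ("Total: $%.2f." % amount) * n: format, then repeat the string.
wrong = "Total: $%.2f." % amount * n

# Parenthesize the arithmetic so the product is what gets formatted.
right = "Total: $%.2f." % (amount * n)

print(wrong)
print(right)
```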
[Guido]
I never use this in combination with named variables, but I often write timing programs that format times using "%6.3f" to get millisecond precision.
Note that you also use %(name)s with width, precision and justification modifiers. For example, this line is yours: s = "%(name)-20.20s %(sts)-10s %(uptime)6s %(idletime)6s" % locals()
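As a small illustration of those modifiers with named placeholders (the dictionary contents are invented for the example): `-20.20s` left-justifies in a 20-character field and truncates anything longer than 20 characters, while `-10s` only pads.

```python
# Hypothetical row of data; only the format string style comes from the thread.
row = {"name": "a-very-long-hostname.example.org", "sts": "up"}

# The "|" separators just make the field boundaries visible.
line = "%(name)-20.20s|%(sts)-10s|" % row
print(line)
```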
Paul Prescod wrote:
"Barry A. Warsaw" wrote:
...
Does anybody ever even use something other than `s' for %() strings?
>>> '%(float)f' % {'float': 3.9}
'3.900000'
Presumably numerical analysts do....and David Ascher once told me he uses %d as a sanity type-check. I don't bother.
Paul's starting to turn into my brother -- quoting things I said twenty years ago and that I have no way of disproving. As Bill said, "I don't recall". These days, I rarely think in FP, even if I use FP, so I typically use %s. Back then I probably cared about mantissa and her friends. --da
"Fred L. Drake, Jr." wrote:
Paul Prescod writes:
The major reason for doing it at compile time (for me) is that you can have a nice syntax that doesn't involve modulus-ing (or dividing) an otherwise useless vars() or locals() dictionary.
Which has everything to do with your usage. I almost never use % with locals() or vars(), so I don't share that motivation.
Even so you have to modulus a tuple or a variable. That doesn't make any more sense for a newbie and is just as inconvenient for the script kiddie (which is often me!), compared to languages like Perl, Ruby, Tcl, sh etc. Python's interpolation syntax is: more verbose, more complicated, less secure and also more powerful. I have no problem with keeping the power but I'd like something less verbose and less complicated alongside it.
I'm much more likely to build a dict specifically for the purpose, which includes computed values, or have something already created which includes this usage as part of the larger picture.
I don't believe that this feature should be taken away from you. But I don't see how it relates to the PEP because what you want to do is already doable. PEP 215 is about making things *easier for simple cases*. If you have new, high-end needs for runtime string interpolation then PEP 215 probably won't address them. Paul Prescod
[Greg Ewing]
I suggest '^', since it does a nice job of suggesting "inject stuff into this string". We can have both a prefix form for compile-time interpolation:
a = ^ "My name is $name"
and an infix form for run-time interpolation:
a = "My name is $name" ^ dict
[Barry]
I think I suggested using ~ for this at IPC10:
a = ~'my name is $name'
for the compile-time interpolation. I don't think it matters much which operator is chosen (let Guido decide).
-1 on all line-noise string modifiers. (I just looked at Barry's example and part of my reptilian hindbrain thought it was a regex match. Don't do that to Perl and awk refugees, please!) All existing string modifiers are letters; how about "i" for "interpolation": a = i"my name is $name" Assuming of course that we really do need yet another flavour of strings... Greg -- Greg Ward - programmer-at-large gward@python.net http://starship.python.net/~gward/ Time flies like an arrow; fruit flies like a banana.
participants (15)
- Andrew Kuchling
- barry@zope.com
- David Ascher
- Fred L. Drake, Jr.
- Fredrik Lundh
- Greg Ewing
- Greg Ward
- Guido van Rossum
- Jeff Epler
- M.-A. Lemburg
- martin@v.loewis.de
- Neil Schemenauer
- Paul Prescod
- Skip Montanaro
- Tim Peters