Mailman 3 *Simpler* string substitutions - Python-Dev

Simpler string substitutions

Paul Prescod

20 Jun 2002

6:29 p.m.

We will never come to a solution unless we agree on what, if any, the problem is. Here is my sense of the "interpolation" problem (based entirely on the code I see):

* 95% of all scripts (or modules) need to do string interpolation

* 5% of all scripts want to be explicit about the types

* 10% of all scripts want to submit a dictionary rather than the current namespace

* 5% of all scripts want to do printf-style formatting tricks

Which means that if we do the math in a simplistic way, 20% of modules/scripts need these complicated features but the other 75% pay for features that they are not using.

They pay through having to use "% locals()" (which uses two advanced features of Python, operator overloading and the local namespace). They pay through counting the lengths of their %-tuples (in my case, usually miscounting). They pay through adding (or forgetting to add) the format specifier after "%(...)". They pay through having harder to read strings where they have to go back and forth to figure out what various positional variables mean. They pay through having to remember the special case for singletons -- except for singleton tuples!

Of course the syntax is flexible: you get to choose HOW you pay (shifting from positional to name) and thus reduce some costs while you incur others, but you can't choose simply NOT to pay, as you can in every other scripting language I know.

And remember that Python is a language that *encourages* readability. But this kind of code is common:

* exception.append('\n<br>%s%s =\n%s' % (indent, name, value))

whereas it could be just:

* exception.append('\n<br>${indent}${name} =\n${value}')

Which is shorter, uses fewer concepts, and keeps variables close to where they are used. We could argue that the programmer here made the wrong choice (versus using % locals()) but the point is that Python itself favoured the wrong choice by making the wrong choice shorter and simpler. Usually Python favours the right choice.

The tax is small but it is collected on almost every script, almost every beginner and almost every programmer almost every day. So it adds up.

If we put this new feature in a module (whether "text", "re" or "string"), then we are just devising another way to make people pay. At that point it becomes a negative feature, because it will clutter up the standard library without getting use. As long as you are agreeing to pay some tax, "%" is a smaller tax (at least at first) because it does not require you to interrupt your workflow to insert an import statement.

In my mind, this feature is only worth adding if we agree that it is now the standard string interpolation feature and "%" becomes a quaint historical feature -- a bad experiment in operator overloading gone wrong. "%" could be renamed "text.printf" and would actually become more familiar to its core constituency and less of a syntactic aberration. "interp" could be a built-in and thus similarly simple syntactically.

But I am against adding "$" if half of Python programmers are going to use that and half are going to use %. $ needs to be a replacement. There should be one obvious way to solve simple problems like this, not two. I am also against adding it as a useless function buried in a module that nobody will bother to import.

Paul Prescod
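[Editorial sketch, not part of the original message.] The costs listed above (miscounted tuples, a forgotten format specifier after "%(...)", and the singleton special case) can be reproduced in a few lines. This is a rough illustration written for Python 3; the variable names and the try_format helper are made up for the example.

```python
# Python 3 sketch of the %-formatting pitfalls listed above.
# All names here (indent, name, value, try_format) are illustrative only.

indent, name, value = "  ", "x", 42

# Positional form: the tuple length must match the number of specifiers.
line = "%s%s = %s" % (indent, name, value)


def try_format(template, args):
    """Return the formatted string, or the TypeError text if formatting fails."""
    try:
        return template % args
    except TypeError as exc:
        return "TypeError: %s" % exc


# Miscounted tuple: three specifiers, only two values.
miscounted = try_format("%s%s = %s", (indent, name))

# Forgotten conversion after %(count): the following " i" is parsed as the
# conversion (space flag + integer), so the "s" of "is" is left as literal text.
forgotten = "%(count) is the total" % {"count": 3}

# The singleton special case: a bare value works, but a value that is itself
# a tuple has to be wrapped in a one-element tuple.
point = (1, 2)
bare_ok = "%s" % 7
tuple_fails = try_format("%s", point)
tuple_ok = "%s" % (point,)
```

Note that the forgotten-specifier case does not raise here; it silently produces the wrong text, which is arguably the more expensive kind of mistake.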


Gustavo Niemeyer

20 Jun

8 p.m.

...

Here is my sense of the "interpolation" problem (based entirely on the code I see):

* 95% of all scripts (or modules) need to do string interpolation

* 5% of all scripts want to be explicit about the types

* 10% of all scripts want to submit a dictionary rather than the current namespace

* 5% of all scripts want to do printf-style formatting tricks

Which means that if we do the math in a simplistic way, 20% modules/scripts need these complicated features but the other 75% pay [...]

I'm curious.. where did you get this from? Have you counted? I think 99% of the statistics are forged to enforce an opinion. :-) [...]

...

Of course the syntax is flexible: you get to choose HOW you pay (shifting from positional to name) and thus reduce some costs while you incur others, but you can't choose simply NOT to pay, as you can in every other scripting language I know.

And remember that Python is a language that *encourages* readability. But this kind of code is common:

* exception.append('\n<br>%s%s =\n%s' % (indent, name, value))

whereas it could be just:

* exception.append('\n<br>${indent}${name} =\n${value}')

That's the usual Perl way of string interpolation. I used Perl in some large projects before becoming a Python adept, and I must confess I don't miss this feature. Maybe it's my C background, but I don't like to mix code and strings. Think about these real examples, taken from *one* single module (BaseHTTPServer):

    "%s %s %s\r\n" % (self.protocol_version, str(code), message)

    "%s - - [%s] %s\n" % (self.address_string(),
                          self.log_date_time_string(),
                          format%args)

    "%s, %02d %3s %4d %02d:%02d:%02d GMT" % (self.weekdayname[wd],
                                             day, self.monthname[month],
                                             year, hh, mm, ss)

    "Serving HTTP on", sa[0], "port", sa[1], "..."

    "Bad HTTP/0.9 request type (%s)" % `command`

    "Unsupported method (%s)" % `self.command`

    "Bad request syntax (%s)" % `requestline`

    "Bad request version (%s)" % `version`

...

Which is shorter, uses fewer concepts, and keeps variables close to where they are used. We could argue that the programmer here made the [...]

Please, show me that with one of the examples above.

...

The tax is small but it is collected on almost every script, almost every beginner and almost every programmer almost every day. So it adds up.

That seems like an excessive generalization of a personal opinion.

-- 
Gustavo Niemeyer
[ 2AAC 7928 0FBF 0299 5EB5 60E2 2253 B29A 6664 3A0C ]

Paul Prescod

8:26 p.m.

Gustavo Niemeyer wrote:

...

...

I'm curious.. where did you get this from? Have you counted?

No.

...

I think 99% of the statistics are forged to enforce an opinion. :-)

I said it was based only on my experience!

...

... Think about these real examples, taken from *one* single module (BaseHTTPServer):

"%s %s %s\r\n" % (self.protocol_version, str(code), message)

Let's presume a "sub" method with the features of Ping's string interpolation PEP. This would look like:

    "${self.protocol_version}, $code, $message\r\n".sub()

Shorter and simpler.

...

"%s - - [%s] %s\n" % (self.address_string(), self.log_date_time_string(), format%args)

    "${self.address_string()} - - [${self.log_date_time_string()}] ${format.sub(args)}".sub()

But I would probably clarify that:

    addr = self.address_string()
    time = self.log_date_time_string()
    command = format.sub(args)
    "$addr - - [$time] $command\n".sub()

...

"%s, %02d %3s %4d %02d:%02d:%02d GMT" % (self.weekdayname[wd], day, self.monthname[month], year, hh, mm, ss)

This one is part of the small percent that uses formatting codes. It wouldn't be rocket science to integrate formatting codes with the "$" notation $02d{day} but it would also be fine if this involved a call to textutils.printf()

...

"Serving HTTP on", sa[0], "port", sa[1], "..."

This doesn't use "%" to start with, but it is still clearer (IMO) in the new notation: "Serving HTTP on ${sa[0]} port ${sa[1]} ..."

...

"Bad HTTP/0.9 request type (%s)" % `command`

    "Bad HTTP/0.9 request type ${`command`}"

etc.

Paul Prescod

Gustavo Niemeyer

9:20 p.m.

...

Let's presume a "sub" method with the features of Ping's string interpolation PEP. This would look like:

That's not the PEP being discussed, and if it was, it can't replace the % mapping. Read the Security Issues.

-- 
Gustavo Niemeyer
[ 2AAC 7928 0FBF 0299 5EB5 60E2 2253 B29A 6664 3A0C ]
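[Editorial sketch, not part of the original thread.] The expression-style substitution presumed in the preceding messages, and the security concern raised in reply, can both be seen in a small stand-in. This is written for Python 3 and is only an approximation: the function name sub, its explicit namespace argument, and the "$$" escape are assumptions made for the example, not the API of either PEP.

```python
import re

# Matches "$$" (escaped dollar), "$name", or "${expression}".
_PATTERN = re.compile(r"\$\$|\$([A-Za-z_]\w*)|\$\{([^}]*)\}")


def sub(template, namespace):
    """Interpolate $name and ${expr} from the given namespace (hypothetical helper).

    WARNING: ${...} is passed to eval(), so a template that comes from an
    untrusted source can run arbitrary code. That is the security issue.
    """
    def replace(match):
        if match.group(0) == "$$":
            return "$"
        expr = match.group(1) or match.group(2)
        return str(eval(expr, {}, namespace))

    return _PATTERN.sub(replace, template)


namespace = {"sa": ("localhost", 8000), "code": 404}
banner = sub("Serving HTTP on ${sa[0]} port ${sa[1]} ...", namespace)

# Nothing stops the template itself from calling functions:
computed = sub("${len('abc')}", {})
```

With a literal template written by the programmer this is harmless; the risk appears once the format string is provided at run time, for example from a translation catalog or a config file.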

Barry Scott

9:21 p.m.

If I'm going to move from %(name)fmt to ${name} I need a place for the fmt format.

Given how error prone it is to write %(name) where %(name)s was meant, how about adding the format inside the {}, for example:

    ${name:format}

You can then have

    $name ${name} ${name:s}

$name and ${name} work as you have already decided. ${name:format} allows the format to control the substitution.

Barry
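[Editorial sketch, not part of the original message.] One possible reading of the ${name:format} idea, written for Python 3: the text after the colon is treated as a printf-style conversion and defaults to "s". The function name sub and the mapping argument are assumptions for the example.

```python
import re

# "${name}" or "${name:format}" in groups 1 and 2, plain "$name" in group 3.
_FIELD = re.compile(r"\$\{(\w+)(?::([^}]*))?\}|\$(\w+)")


def sub(template, mapping):
    """Substitute $name, ${name} and ${name:format} from mapping (hypothetical)."""
    def replace(match):
        name = match.group(1) or match.group(3)
        fmt = match.group(2) or "s"          # no format given means %s
        return ("%" + fmt) % (mapping[name],)

    return _FIELD.sub(replace, template)


stamp = sub("${day:02d} $month ${year:4d}",
            {"day": 6, "month": "Nov", "year": 1994})
```

Because the format sits inside the braces, the unbraced $name form never needs to guess where a format code ends and a variable name begins.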

Guido van Rossum

9:21 p.m.

[Paul]

...

We will never come to a solution unless we agree on what, if any, the problem is.

[...eloquent argument, ending in...]

...

But I am against adding "$" if half of Python programmers are going to use that and half are going to use %. $ needs to be a replacement. There should be one obvious way to solve simple problems like this, not two. I am also against adding it as a useless function buried in a module that nobody will bother to import.

Well argued. Alex said roughly the same thing: let's not add $ while keeping %.

Adding a function for $-interpolation to a module would certainly keep some projects (like web templating) from reinventing the wheel -- but /F has shown that this particular wheel isn't hard to recreate.

I would certainly recommend any project that offers substitution in templates that are edited by non-programmers to use the $-based syntax from Barry's PEP rather than Python's %(name)s syntax. (In particular I hope Python's i18n projects will use $ interpolation.)

Oren made a good point that Paul emphasized: the most common use case needs interpolation from the current namespace in a string literal, and expressions would be handy. Oren also made the point that the necessary parsing could (should?) be done at compile time.

We currently already have many ways to do this:

- In some cases print is appropriate:

    def f(x, y):
        print "The sum of", x, "and", y, "is", x+y

- You can use string concatenation:

    def f(x, y):
        return "The sum of " + str(x) + " and " + str(y) + " is " + str(x+y)

- You can use % interpolation (with two variants: positional and by-name). A problem is that you have to specify an explicit tuple or dict of values.

    def f(x, y):
        return "The sum of %s and %s is %s" % (x, y, x+y)

  Note that the print version is the shortest, and IMO also the easiest to read. (Though some people might disagree and prefer the % version because it separates the template from the data; it's not much longer.)

- You could have an interpolation helper function:

    def i(*a):
        return "".join(map(str, a))

  so you could write this:

    def f(x, y):
        return i("The sum of ", x, " and ", y, " is ", x+y)

  This comes closer in length to the print version.

IMO the attraction of the $ version is that it reduces the amount of punctuation so that it becomes even shorter and clearer. While I said "shorter" several times above when comparing styles, I really meant that as a shorthand for "shorter and clearer".
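[Editorial sketch, not part of the original message.] The approaches that return a string can be put side by side. This version is written for Python 3, so the print variant, which writes to standard output rather than returning a value, is left out; the f_* function names are made up for the comparison.

```python
def f_concat(x, y):
    # String concatenation: every value needs an explicit str().
    return "The sum of " + str(x) + " and " + str(y) + " is " + str(x + y)


def f_percent(x, y):
    # Positional % interpolation: template first, explicit tuple of values after.
    return "The sum of %s and %s is %s" % (x, y, x + y)


def i(*a):
    # The helper from the message: stringify and join the pieces.
    return "".join(map(str, a))


def f_helper(x, y):
    return i("The sum of ", x, " and ", y, " is ", x + y)


results = {f(2, 3) for f in (f_concat, f_percent, f_helper)}
```

All three produce the same text; what differs is how much punctuation surrounds each interpolated value.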
Even the print example suffers from the fact that every interpolated value is separated from the surrounding template by a comma and a string quote on both sides -- that's a lot of visual clutter (not to mention stuff to type).

Maybe in Python 3.0 we will be able to write:

    def f(x, y):
        return "The sum of $x and $y is $(x+y)"

To me, it's a toss-up whether this looks better or worse than the ABC version:

    def f(x, y):
        return "The sum of `x` and `y` is `x+y`"

but I do know that backticks have a poor reputation for being hard to find on the keyboard (newbies don't even know they have it), hard to distinguish in some fonts, and publishers often turn 'foo' into `foo', making it hard to publish accurate documentation. I think on some European keyboards ` is a dead key, making it even harder to type. Additionally, it's a symmetric operator, which makes it harder to parse complex examples.

Now, how to get there (or somewhere similar) in Python 2.3?

PEP 215 solves it by using (yet) another string prefix character. It uses $, which to me looks a bit ugly; in this thread, someone proposed using e, so you can do:

    def f(x, y):
        return e"The sum of $x and $y is $(x+y)"

That looks OK to me, especially if it can be combined with u and r to create unicode and raw strings.

There are other possibilities:

    def f(x, y):
        return "The sum of \$x and \$y is \$(x+y)"

Alas, it's not 100% backwards compatible, and the \$ looks pretty bad. Another one:

    def f(x, y):
        return "The sum of \(x) and \(y) is \(x+y)"

Still not 100% compatible, looks perhaps a bit better, but notice how now every interpolation needs three punctuation characters: almost as many as the print example.

Assuming that interpolating simple variables is relatively common, I still like plain $ with something to tag the string as an interpolation best.
PEP 292 is an attempt to do this *without* involving the parser:

    def f(x, y):
        return "The sum of $x and $y is $(x+y)".sub()

Downsides are that it invites using non-literals as formats, with all the security aspects, and that its parsing happens at run-time (no big deal IMO).

Now back to $ vs. %. I think I can defend having both in the language, but only if % is reduced to the positional version (classic printf). This would be used mostly to format numerical data with fixed column width. There would be very little overlap in use cases: % always requires you to specify explicit values, while $ is always followed by a variable name.

(Yet another variant is from Tcl, which uses $variable but also [expression]. In Python 3.0 this would become:

    def f(x, y):
        return "The sum of $x and $y is [x+y]"

But now you have three characters that need quoting, and we might as well use \$ to quote a literal $ instead of $$.)

All options are still open.

--Guido van Rossum (home page: http://www.python.org/~guido/)

Guido van Rossum

9:35 p.m.

...

...
"%s, %02d %3s %4d %02d:%02d:%02d GMT" % (self.weekdayname[wd], day, self.monthname[month], year, hh, mm, ss)

This one is part of the small percent that uses formatting codes. It wouldn't be rocket science to integrate formatting codes with the "$" notation $02d{day} but it would also be fine if this involved a call to textutils.printf()

But if you support $02d{day} you should also support $d{day}, but that already means something different (the variable 'd' followed by '{day}'). --Guido van Rossum (home page: http://www.python.org/~guido/)
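[Editorial sketch, not part of the original message.] The ambiguity can be seen with the standard library's string.Template, used here only as a convenient stand-in that understands $name and ${name}; it postdates this thread and is not the implementation being discussed.

```python
from string import Template

# "$d{day}" is read as the variable d followed by the literal text "{day}",
# so there is no room left to treat "d" as a format code for "day".
ambiguous = Template("$d{day}").substitute(d="Thu", day=6)
```

Under that reading the value of day is never used, which is why a prefix format code such as $02d{day} cannot be generalised to $d{day}.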

Guido van Rossum

9:37 p.m.

...

If I'm going to move from %(name)fmt to ${name} I need a place for the fmt format. Given the error prone nature of %(name) should have been %(name)s

How about adding the format inside the {} for example:

${name:format}

You can then have

$name ${name} ${name:s}

$name and ${name} work as you have already decided. ${name:format} allows the format to control the substitution.

Not bad. --Guido van Rossum (home page: http://www.python.org/~guido/)

Paul Prescod

10:08 p.m.

Gustavo Niemeyer wrote:

...

...
Let's presume a "sub" method with the features of Ping's string interpolation PEP. This would look like:

That's not the PEP being discussed, and if it was, it can't replace the % mapping. Read the Security Issues.

That's true. I didn't mean to endorse any particular solution but rather to clarify the problem. I believe that only one of your examples required a feature (runtime provision of the format string) that was not in Ping's PEP. If another PEP is a better solution to the problem than the current one then fine. My point is that there *is* a problem! Paul Prescod

Skip Montanaro

10:17 p.m.

    Guido> Alex said roughly the same thing: let's not add $ while keeping
    Guido> %.

Then let's not add $ at all. ;-)

Seriously, I'm not keen on having to modify all my %-formatted strings for something I perceive as a negligible improvement. I've seen nothing to suggest that any $-format proposals I've read were knock-my-socks-off better than the current %-format implementation.

Skip

Gustavo Niemeyer

10:32 p.m.

...

That's true. I didn't mean to endorse any particular solution but rather to clarify the problem. I believe that only one of your examples required a feature (runtime provision of the format string) that was not in Ping's PEP. If another PEP is a better solution to the problem than the current one then fine. My point is that there *is* a problem!

Agreed. I feel relieved to know that the problem is in a PEP, and that there's a lot of smart people discussing its implementation. Don't worry, it won't get into Python before there's a minimum consensus on the solution. Of course, issuing your opinion is important to define the minimum consensus. ;-)

-- 
Gustavo Niemeyer
[ 2AAC 7928 0FBF 0299 5EB5 60E2 2253 B29A 6664 3A0C ]

Donald Beaudry

10:50 p.m.

"Barry Scott" wrote,

...

How about adding the format inside the {} for example:

${name:format}

Considering that the $ is supposed to be familiar to folks who use other tools, the colon used this way might undo much of that good will. On the other hand, %{name:format} might be just the right thing.

-- 
Donald Beaudry                      Ab Initio Software Corp.
                                    201 Spring Street
donb@abinitio.com                   Lexington, MA 02421
                  ...So much code...

Aahz

11 p.m.

On Thu, Jun 20, 2002, Gustavo Niemeyer wrote:

...

"Serving HTTP on", sa[0], "port", sa[1], "..."

This is where current string handling comes up short. What's the correct way to internationalize this string? What if the person handling I18N isn't a Python programmer?

I'm sort of caught in the middle here. I can see that in some ways what we currently have isn't ideal, but we've already got problems with strings violating the Only One Way stricture (largely due to immutability vs. "+" combined with .join() vs. % -- fortunately, the use cases for .join() and % are different, so people mostly use them appropriately).

It seems to me that fixing the problems with % formatting for newbie Python programmers just isn't worth the pain. It also seems to me that getting better/simpler interpolation support for I18N and similar templating situations is also a requirement.

I vote for two things:

* String template class for the text module/package that does more-or-less what PEP 292 suggests. I think standardizing string templating would be a Good Thing. I recommend that only one interpolation form be supported; if we're following PEP 292, it should be ${var}. This makes it visually easy for translators to find the variables.

* No changes to current string interpolation features unless it's made compatible with % formatting. I don't think I can support dropping % formatting even in Python 3.0; it's not just source code that will have string formats, but also config files and databases.

-- 
Aahz (aahz@pythoncraft.com)           <*>         http://www.pythoncraft.com/

Project Vote Smart: http://www.vote-smart.org/

pinard＠iro.umontreal.ca

11:40 p.m.

[Guido van Rossum]

...

[...] All options are still open.

Thanks, Guido, for the synthesis of a summary of various avenues. These two points are worth underlining:

1) let's not add $ while keeping %. [...] having both in the language, but only if % is reduced to the positional version

2) the necessary parsing could (should?) be done at compile time.

Here are other comments, some of which are related to internationalisation.

...

    return "The sum of " + str(x) + " and " + str(y) + " is " + str(x+y)
    return i("The sum of ", x, " and ", y, " is ", x+y)
    print "The sum of", x, "and", y, "is", x+y

...

Note that the print version is the shortest, and IMO also the easiest to read.

These are good for quick programs, and `print' is good for debugging. But they are less appropriate whenever internationalisation is in the picture, because it is more handy and precise for translators to handle wider context at once, than individual sentence fragments.

...

[...] % interpolation (with two variants: positional and by-name).

The advantage of by-name interpolation for internationalisation is the flexibility it gives for translators to reorganise the inserts.

...

    return "The sum of `x` and `y` is `x+y`"
    return "The sum of $x and $y is $(x+y)"
    return "The sum of $x and $y is [x+y]"

Those three above might be a little too magical for Python. Python ought not to have interpolation on all double-quoted strings like shells or Perl (and it should probably avoid deciding interpolability on whether the delimiter is a single or double quote, even if shells or Perl do this).

...

    return "The sum of \(x) and \(y) is \(x+y)"
    return "The sum of \$x and \$y is \$(x+y)"
    return e"The sum of $x and $y is $(x+y)"

...

[...] I still like plain $ with something to tag the string as an interpolation best.

Those three are interesting, because they build on the escape syntax, or prefix letters, which Python already has. All these notations would naturally accept `ur' prefix letters. The shortest notation in the above is the third, using the `e' prefix, because this is the one requiring the least number of supplementary characters per interpolation. This is really a big advantage. (A detail about the letter `e': is it the best letter to use?)

I also like the hidden suggestion that round parentheses are more readable than braces, something that was already granted in Python through the current %-by-name syntax. In fact, `${name}' would be more acceptable if Python also got at the same time `$(name)' as equivalent, and _also_ `%{name}format' as equivalent for `%(name)format'. The simplest is surely to avoid braces completely, not introducing them.

As long as Python does not fully get rid of `%', I wonder if the last two examples above could not be rewritten:

    return "The sum of \%x and \%y is \%(x+y)"
    return e"The sum of %x and %y is %(x+y)"

That would avoid introducing `$' while we already have `%'. On the other hand, it might be confusing to overload `%' too much, if one wants to mix everything like in:

    return e"The sum of %x and %y is %%d" % (x+y)

This is debatable, and delicate. Users already have to deal with how to quote `\' and `%'. Having to deal with `$' as well, in all combinations and exceptional cases, makes a lot of things to consider. Most of us easily write shell scripts, yet we have difficulty properly writing or deciphering a shell line using many quoting devices at once. Python is progressively climbing the same road. It should stay simpler, all considered.

But I think the main problem in all these suggestions is how they interact with internationalisation. Surely:

    return _(e"The sum of %x and %y is %(x+y)")

cannot be right. Interpolation has to be delayed to after translation, not before, because you agree that translators just cannot produce a translation for all possible inserts. I do not know what the solution is, and what kind of elegant magic may be invented to yield programmers all the flexibility they still need in that area. It is worth a good thought, and we should not rush into a decision before this aspect has been carefully analysed. If other PEPs are necessary for addressing interactions between interpolation and translation, these PEPs should be fully resolved before or concurrently with the PEP on interpolation, and not pictured as independent issues.

...

[...] There would be very little overlap in use cases: % always requires you to specify explicit values, while $ is always followed by a variable name.

Yes, the suggestion of using `$(name:format)', whenever needed, is a good one that should be retained, maybe as `%(name:format)', or maybe with `$'. It means that the overlap would not be so little, after all. -- François Pinard http://www.iro.umontreal.ca/~pinard

Raymond Hettinger

21 Jun

6:18 a.m.

+1 for $(name) instead of ${name}
   because it is closer to existing formatting spec
   because my tastes like it better

-1 for $(x+y)
   because z=x+y; '$z'.sub() works fine
   because general expressions are harder to pick-out

+1 for $(name:fmt)
   because the style is powerful and elegant

+1 for \$ instead of $$
   because \ is already an escape character
   because $$ is more likely to occur in actual string samples

+1 for 'istring'.sub() instead of e'istring'
   because sub allows a particular mapping to be specified

+1 for not being a separate module
   so the feature gets used

+1 for leaving %()s alone
   because formats may have been stored external to programs

+1 for not using back-quotes
   because they are hard to read in languages with accents
   because the open and close back-quotes are not distinct

'regnitteh dnomyar'[::-1]

Alex Martelli

7:38 a.m.

On Thursday 20 June 2002 11:21 pm, Guido van Rossum wrote: ...

...

Now back to $ vs. %. I think I can defend having both in the language, but only if % is reduced to the positional version (classic printf). This would be used mostly to format numerical data with fixed column width. There would be very little overlap in use cases:

I think you're right: in a "greenfield" language design (a hypothetical one starting from scratch with no constraints of backwards compatibility) you can indeed defend using both % and $ for these two tasks, net of the issues of what feature set to give $ formatting -- implicit vs nonimplicit access to variables, including the very delicate case of access to free variables (HOW to give access to free variables if the formatstring isn't a literal?); ability to use expressions and not just identifiers; ability to pass a mapping; what format control should be allowed in $ formatting -- and what syntax to use to give access to those features.

If %(name)s is to be deprecated moving towards Python-3000 (surely it can't be _removed_ before then), $-formatting needs a very rich feature set; otherwise it can't _replace_ %-formatting. It seems to me that (assuming $ formatting IS destined to get into Python) $ formatting should then be introduced with all or most of the formatting power it eventually needs, so that those who want to make their programs Py3K-ready can use $ formatting to replace all their uses of %(name)s formatting.

The "transition" period will thus inevitably offer different ways to perform the same tasks -- we can never get out of this bind, any time we move to deprecate an "old way" to perform a task, since the old way and the new way MUST both work together for a good while to allow migration.

This substantial cost is of course worth paying only if the new way is a huge win over the old one -- not just "somewhat" better, but ENORMOUSLY better. But that's OK, and exactly the kind of delicate trade-off which you DO have such a good track record at getting right in the past:-).

...

All options are still open.

Thanks for clarifying this. To me personally it seems that the gain of introducing $ formatting, if gain it be, is small enough not to be worth the transition cost, but that's just opinion, hard to back up with any substance. So I offer a real-life anecdote instead.

A colleague at Strakt (a wizard at various communication and storage programming issues) had no previous exposure to Python at all, his recent background being mostly with Plan-9, Inferno, and Limbo (previously, other Bell Labs technologies, centered on Unix and C). He picked up Python on the job over the last few months -- basically from Python's own docs, our existing code base, and discussions with colleagues, me included -- and didn't take long to become productive with it.

He still has some issues. Some are very understandable considering his background -- e.g., he's still not fully _comfortable_ with dynamic typing (I predict he'll grow to like it, but Rome wasn't built in one day). Overall, what I would call a pretty good scenario and an implicit tribute to Python's simplicity / ease / power. He may pine for Limbo, but in fact produces a lot of excellent Python code day in day out.

But his biggest remaining "general peeve" struck me hard the other day, exactly because that's not something he "heard", but an observation he came up with all by himself, by reasonably unbiased examination of "Python as she's spoken". "I wouldn't mind Python so much" (I'm paraphrasing, but that IS the kind of grudging-compliment understatement he did use:-) "except that there's always so MANY deuced ways to do everything -- can't they just pick one and STICK with it?!".

In the widespread subtext of most Python discourse this might sound like irony, but in his case, it was just an issue of fact (compared, remember, with SMALL languages such as Limbo -- bloated ones such as, e.g., C++, are totally *outside* his purview and experience) -- a bewildering array of possible variations.
Surely inevitable when viewed diachronically (==as an evolution over time), but his view, like that of anybody who comes to Python anew today, is synchronic (==a snapshot at one moment). I don't think there's anything we can do to AVOID this phenomenon, of course, but right now I'm probably over-sensitized to the "transition costs" of introducing "yet one more way to do it" by this recent episode. So, it appears to me that REDUCING the occurrence of such perceptions is important. Alex

M.-A. Lemburg

7:57 a.m.

Alex Martelli wrote:

...

If %(name)s is to be deprecated moving towards Python-3000 (surely it can't be _removed_ before then), $-formatting needs a very rich feature set; otherwise it can't _replace_ %-formatting. It seems to me that (assuming $ formatting IS destined to get into Python) $ formatting should then be introduced with all or most of the formatting power it eventually needs, so that those who want to make their programs Py3K-ready can use $ formatting to replace all their uses of %(name)s formatting.

I haven't jumped into this discussion since I thought that you were only discussing some new feature which I don't have a need for.

Now if you want to deprecate %(name)s formatting, the situation is different: my tie would start jumping up and down, doing funny noises :-)

So just this comment from me: please don't deprecate %(name)s formatting. For the rest: I don't really care.

Thanks,
-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Company & Consulting:                           http://www.egenix.com/
Python Software:                   http://www.egenix.com/files/python/
Meet us at EuroPython 2002:                 http://www.europython.org/

Christian Tismer

11:09 a.m.

M.-A. Lemburg wrote:

...

Alex Martelli wrote:

...
If %(name)s is to be deprecated moving towards Python-3000 (surely it can't be _removed_ before then), $-formatting needs a very rich feature set; otherwise it can't _replace_ %-formatting. It seems to me that (assuming $ formatting IS destined to get into Python) $ formatting should then be introduced with all or most of the formatting power it eventually needs, so that those who want to make their programs Py3K-ready can use $ formatting to replace all their uses of %(name)s formatting.

I haven't jumped into this discussion since I thought that you were only discussing some new feature which I don't have a need for.

Now if you want to deprecate %(name)s formatting, the situation is different: my tie would start jumping up and down, doing funny noises :-)

So just this comment from me: please don't deprecate %(name)s formatting. For the rest: I don't really care.

Yes, please don't!

Besides the proposals so far, I'd like to add one, which I really like a bit, since I used it for years in an institute with a row of macro languages:

How about

    name = "Guido" ; land = "The Netherlands"

    "His name is <<name>> and he comes from <<land>>.".sub(locals())

I always found this notation very sharp and readable, maybe this is just me. I like to have a notation that is easily parsed, has unique start and stop strings, no punctuation/whitespace rules at all. Any kind of extra stuff like format specifiers, default values or expressions (if you really must) can be added with ease.

If people like to use different delimiters, why not:

    "His name is <$name$> and he comes from <$land$>.".sub(locals(),
        delimiters=("<$","$>") )

-- 
Christian Tismer             :^)   mailto:tismer@tismer.com
Mission Impossible 5oftware  :     Have a break! Take a ride on Python's
Johannes-Niemeyer-Weg 9a     :    *Starship* http://starship.python.net/
14109 Berlin                 :     PGP key -> http://wwwkeys.pgp.net/
work +49 30 89 09 53 34  home +49 30 802 86 56  pager +49 173 24 18 776
PGP 0x57F3BF04       9064 F4E1 D754 C2FF  1619 305B C09C 5A3B 57F3 BF04
     whom do you want to sponsor today?   http://www.stackless.com/
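[Editorial sketch, not part of the original message.] A free-function approximation of the delimiter-based proposal, written for Python 3. Strings have no such .sub() method, so the function name sub, its signature, and the delimiters keyword are assumptions that mirror the example as closely as a plain function can.

```python
import re


def sub(template, mapping, delimiters=("<<", ">>")):
    """Replace <<name>> (or names between custom delimiters) from mapping."""
    start, stop = (re.escape(d) for d in delimiters)
    pattern = re.compile(start + r"(\w+)" + stop)
    return pattern.sub(lambda m: str(mapping[m.group(1)]), template)


values = {"name": "Guido", "land": "The Netherlands"}

default_style = sub("His name is <<name>> and he comes from <<land>>.", values)
custom_style = sub("His name is <$name$> and he comes from <$land$>.",
                   values, delimiters=("<$", "$>"))
```

Having distinct start and stop strings keeps the parser down to a single regular expression, with no whitespace or punctuation rules needed to find the end of a name.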

Skip Montanaro

4:42 p.m.

    >> Now back to $ vs. %. I think I can defend having both in the
    >> language, but only if % is reduced to the positional version (classic
    >> printf). This would be used mostly to format numerical data with
    >> fixed column width. There would be very little overlap in use cases:

Overlap or not, you wind up with two things that look very much alike doing nearly identical things.

-1...

Skip

barry＠zope.com

23 Jun

1:12 a.m.

"BS" == Barry Scott writes:

    BS> If I'm going to move from %(name)fmt to ${name} I need a place
    BS> for the fmt format.

One of the reasons why I added "simpler" to the PEP is because I didn't want to support formatting characters in the specification. While admittedly handy for some applications, I submit that most string interpolation simply uses %s or %(name)s and there should be a simpler, less error prone way of writing that.

-Barry

barry＠zope.com

1:20 a.m.

    GvR> Oren made a good point that Paul emphasized: the most common
    GvR> use case needs interpolation from the current namespace in a
    GvR> string literal, and expressions would be handy. Oren also
    GvR> made the point that the necessary parsing could (should?) be
    GvR> done at compile time.

I'll point out that in my experience, while expressions are (very) occasionally handy, you wouldn't necessarily need /arbitrary/ expressions. Something as simple as allowing dotted names only would solve probably 90% of uses, e.g.

    person = getPerson()
    print '${person.name} was born in ${person.country}'

Not that this can't execute arbitrary code of course, so the security implications of that would need to be examined.

-Barry
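[Editorial sketch, not part of the original message.] Restricting placeholders to dotted names could look roughly like this in Python 3. The function name sub, the Person class, and the property are all invented for the example; the property is there to show why attribute access alone can still run code.

```python
import re
from functools import reduce

# Only ${identifier} or ${identifier.attr.attr...}; no calls, no operators.
_DOTTED = re.compile(r"\$\{([A-Za-z_]\w*(?:\.[A-Za-z_]\w*)*)\}")


def sub(template, namespace):
    """Substitute ${dotted.name} placeholders from namespace (hypothetical)."""
    def replace(match):
        head, *attrs = match.group(1).split(".")
        # Walk the attribute chain with getattr instead of using eval().
        return str(reduce(getattr, attrs, namespace[head]))

    return _DOTTED.sub(replace, template)


class Person:
    def __init__(self, name, country):
        self.name = name
        self._country = country

    @property
    def country(self):
        # A property runs code on attribute access, so even dotted names
        # are not guaranteed to be side-effect free.
        return self._country.title()


person = Person("Guido", "the netherlands")
message = sub("${person.name} was born in ${person.country}",
              {"person": person})
```

Anything that is not a plain dotted name, such as a call, is simply left in the text unsubstituted by this sketch.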


20 comments

13 participants

participants (13)

Aahz
Alex Martelli
Barry Scott
barry＠zope.com
Christian Tismer
Donald Beaudry
Guido van Rossum
Gustavo Niemeyer
M.-A. Lemburg
Paul Prescod
pinard＠iro.umontreal.ca
Raymond Hettinger
Skip Montanaro
