String formatting and namedtuple

[sorry for starting a new thread, but I just subscribed and can't figure out how to respond to an earlier message] Raymond wrote:
[not picking on Raymond here at all, his message was just convenient] There are a number of comments in this thread that lead me to think not everyone is aware that .format is fully supported in 2.6. I just want to make sure everyone knows that's the case. If you want to support 2.6+ and 3.0+, you can certainly use .format. Eric.
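For readers who want to see what that common subset looks like, here is a minimal sketch. It sticks to explicitly numbered and named fields, which both 2.6 and 3.0 accept (auto-numbered `{}` fields only arrived later, in 2.7 and 3.1). The values and variable names are made up for illustration.

```python
# Explicit indices and keyword names are accepted on Python 2.6+ and 3.0+.
positional = "{0} is {1} years old".format("Alice", 30)
keyword = "{name} is {age} years old".format(name="Alice", age=30)

# A format spec follows the colon, much like the flags after % in the old style.
padded = "{0:>8.2f}".format(3.14159)

print(positional)
print(keyword)
print(padded)
```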

Eric Smith wrote at Thu, 12 Feb 2009 22:49:13 -0500:
I think the main problem is the huge amount of existing code that uses `%` for formatting. As long as there is no easy way to migrate that code to `.format`, moves to deprecate `%`-formatting are bound to cause friction. Counting lines containing `%` in my code base gives 14534 -- a few of these are numeric (but I'd be surprised if the numeric ones are more than a couple of hundred). -- Christian Tanzer http://www.c-tanzer.at/

On Fri, Feb 13, 2009 at 12:58 AM, Christian Tanzer <tanzer@swing.co.at> wrote:
Yes, that was our concern too when we decided to keep % without deprecation in 3.0. My guess is that *most* of these use string literals, and we *can* write a 2to3 fixer for those. It is the cases where the format is being passed in as an argument or precomputed somehow where 2to3 falls down. It would be useful to have an idea how frequently that happens. -- --Guido van Rossum (home page: http://www.python.org/~guido/)

Guido van Rossum wrote at Fri, 13 Feb 2009 09:56:55 -0800:
A fair amount of my use cases involve a non-literal format string (i.e., passed in as argument, defined as module or class variable, or even doc-strings used as format strings). I'd guess that most non-literal format strings are used together with dictionaries. Unfortunately, it's hard to grep for this :-(, so I can't give you hard numbers. Another, probably fairly common, use case involving non-literal strings is %-formatting in I18N strings -- though it might be possible to fix these automatically. -- Christian Tanzer http://www.c-tanzer.at/

On Mon, Feb 16, 2009 at 12:48 AM, Christian Tanzer <tanzer@swing.co.at> wrote:
It would be pretty simple to rig up 2to3 to report any string literals containing e.g. '%(...)s' that are not immediately followed by a % operator.
Plus, i18n is one of the motivators for .format() -- less chance of forgetting to type the trailing 's' in '%(foobar)s' and the ability to rearrange positional arguments a la "foo {1} bar {0} baz". -- --Guido van Rossum (home page: http://www.python.org/~guido/)
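The rearrangement point can be illustrated with a small sketch. The translation table, the language codes, and the `render` helper below are hypothetical and exist only to show that a translator can reorder the arguments without touching the calling code.

```python
# Hypothetical message catalogue; "xx" stands in for a language whose
# word order puts the recipient first.
translations = {
    "en": "{0} sent a file to {1}",
    "xx": "{1} received a file from {0}",
}


def render(lang, sender, recipient):
    # The call site always passes the arguments in the same order;
    # only the translated string decides where each one appears.
    return translations[lang].format(sender, recipient)


print(render("en", "Ann", "Bob"))
print(render("xx", "Ann", "Bob"))
```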

Guido van Rossum schrieb:
The major problems I see are

1) __mod__ application with a single right-hand operand (a tuple operand makes it string formatting with 100% certainty, at least without other types overloading %)
2) format strings coming from external sources

The first can't be helped easily. For the second, a helper function that converts %s format strings to {0} format strings could be imagined. A call of the form

    fmtstr % (a, b)

would then be converted to

    _mod2format(fmtstr).format(a, b)

To fix 1), _mod2format could even return a wrapper that executes .__mod__ on .format() if fmtstr is not a string. <half-wink>
I expect future generations of translators will be thankful. Current generation may be angry when they have to revise their .po files :) Georg -- Thus spake the Lord: Thou shalt indent with four spaces. No more, no less. Four shall be the number of spaces thou shalt indent, and the number of thy indenting shall be four. Eight shalt thou not indent, nor either indent thou two, excepting that thou then proceed to four. Tabs are right out.
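The helper imagined in the message above might look roughly like the following. This is only a sketch under stated assumptions, not an existing function: the name `mod2format` is hypothetical, only `%s`, `%r`, `%d`, `%(name)s`-style keys and `%%` are handled, and flags, widths and precisions are left untouched.

```python
import re

# Matches "%%", "%(name)X" or "%X" for the conversion types s, r and d.
# Anything else (flags, widths, precisions) is not handled in this sketch.
_SPEC = re.compile(r"%(?:(%)|(?:\((\w+)\))?([srd]))")


def mod2format(fmtstr):
    """Convert a simple %-style format string to str.format style."""
    counter = [0]  # mutable cell so the nested function can update it

    def convert(match):
        percent, name, conv = match.groups()
        if percent:
            return "%"  # "%%" is a literal percent sign
        if name is None:
            # Positional specifier: number the fields in order of appearance.
            name = str(counter[0])
            counter[0] += 1
        # %r maps to the !r conversion; %d is treated like %s here, which
        # differs from the old behaviour for non-integer numbers.
        suffix = "!r" if conv == "r" else ""
        return "{" + name + suffix + "}"

    # Escape literal braces first, since they are special to str.format.
    escaped = fmtstr.replace("{", "{{").replace("}", "}}")
    return _SPEC.sub(convert, escaped)


print(mod2format("%s + %s = %(total)d"))
print(mod2format("%s and %r: 100%%").format("a", "b"))
```

Note that this only addresses the second problem. A call site of the form `fmtstr % x` still has to decide whether `x` is one argument or a tuple of several.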

Georg Brandl wrote at Mon, 16 Feb 2009 22:16:19 +0100:
Please note that the right-hand operand can be a dictionary (more specifically, any object implementing `__getitem__()`) For objects implementing `__getitem__`, the stuff inside the parentheses of `%(...)s` can get pretty fancy. -- Christian Tanzer http://www.c-tanzer.at/
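To make the "pretty fancy" point concrete, here is a small sketch. The `Evaluator` class is hypothetical; it treats each key as a Python expression, and it uses `eval` purely for illustration (that would be unsafe with untrusted format strings).

```python
class Evaluator(object):
    """Hypothetical mapping-like object whose keys are small expressions."""

    def __init__(self, namespace):
        self.namespace = namespace

    def __getitem__(self, key):
        # The text between "%(" and the matching ")" arrives here as a string.
        # eval is for illustration only; never use it on untrusted templates.
        return eval(key, {}, self.namespace)


values = Evaluator({"items": [3, 1, 2], "name": "widget"})
line = "%(name.upper())s has %(len(items))d items, max %(max(items))s" % values
print(line)
```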

On Tue, Feb 17, 2009 at 2:47 AM, Christian Tanzer <tanzer@swing.co.at> wrote:
Indeed it can. I had some functionality for templating C programs that relied on just this. The files would be mostly C code, but then %-formatting was used to specify certain chunks to be generated by Python code. The custom class I wrote implemented a __getitem__ method that broke down the given key into arguments to an indicated function, eval-ed Python code, and generated the desired C code. This was super-useful for things like spitting out static const arrays and specifying the array sizes in the header file without requiring duplicate human effort.

An example would be the following snippet from a C-template, num_to_words.ctemplate:

    %(array; char * ones; "zero one two three four five six seven eight nine".split())s

Taking this file, reading it in, and doing %-interpolation with my custom class would result in the following output:

num_to_words.h:

    extern char * ones[10];

num_to_words.c:

    #ifndef NUM_TO_WORDS_C
    #define NUM_TO_WORDS_C
    #include "num_to_words.h"
    char * ones[10] = { "zero", "one", "two", "three", "four", "five", "six", "seven", "eight", "nine" };
    #endif

So there are some pretty crazy things possible with %-formatting and custom __getitem__ classes. As long as format can do similar things, though, I don't think there is a problem.

Brandon
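The original class is not shown in the thread, so the following is only a guess at how such a thing could work. The `CTemplate` class, the `gen_` method prefix, and the "generator; declaration; expression" key layout are all assumptions made for this sketch, and it produces only the array definition rather than separate header and source files.

```python
class CTemplate(object):
    """Hypothetical stand-in for the custom class described above."""

    def __getitem__(self, key):
        # Assumed key layout: "<generator>; <arg>; <arg>...".
        parts = [part.strip() for part in key.split(";")]
        generator = getattr(self, "gen_" + parts[0])
        return generator(*parts[1:])

    def gen_array(self, declaration, expression):
        # eval is for illustration only; it is unsafe on untrusted templates.
        values = eval(expression)
        body = ", ".join('"%s"' % value for value in values)
        # The array size is computed once, from the evaluated expression.
        return "%s[%d] = { %s };" % (declaration, len(values), body)


template = '%(array; char * ones; "zero one two".split())s'
output = template % CTemplate()
print(output)
```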

You can do similar things with .format(), but inside {} the : and ! characters always end the key. On Tue, Feb 17, 2009 at 12:22 AM, Brandon Mintern <bmintern@gmail.com> wrote:
-- --Guido van Rossum (home page: http://www.python.org/~guido/)
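A short sketch of both halves of that remark, assuming a hypothetical `Shout` class: index syntax in a replacement field reaches a custom `__getitem__`, while in a plain field name the `:` and `!` characters separate the name from the format spec and the conversion.

```python
class Shout(object):
    """Hypothetical mapping that computes its value from the key text."""

    def __getitem__(self, key):
        return key.upper()


# The text between [ and ] is passed to __getitem__ as a string.
greeting = "{0[hello world]}".format(Shout())

# In a plain field name the colon ends the name: "name" is the key,
# ">6" is the format spec.
padded = "{name:>6}".format(name="x")

# Likewise "!" ends the name and introduces a conversion (here repr).
quoted = "{0!r}".format("a")

print(greeting)
print(padded)
print(quoted)
```

Keys in the style of the C-template example, which rely on punctuation such as `:` inside the key, would therefore need a different separator or a preprocessing step.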

On Tue, 17 Feb 2009 10:32:17 -0800, Guido van Rossum wrote:
You can do similar things with .format(), but inside {} the : and ! characters always end the key.
Why not keep something like str.old_format(formatcode, tuple_or_object) in py3k for backward compatibility, then remove it completely in Python 4.0? Add a note that .old_format is obsolete and will be removed, and that code should use the newer formatters. That way the 2to3 tool would become simpler (just convert most things to old_format), while the most common use cases (e.g. 'old%sliteral' % item) can be converted automatically to 'new{}literal'.format(item) or $-substitution.

participants (6): Brandon Mintern, Christian Tanzer, Eric Smith, Georg Brandl, Guido van Rossum, Lie Ryan