Fwd: deep question re dict as formatting input
One of the students on an introductory Python 3 class asks a very good question about string formatting. This could be because the course materials are misleading, so I would like to understand.

It would appear from tests that "{0[X]}".format(...) first tries to convert the string "X" to an integer. If it succeeds then __getitem__() is called with the integer as an argument, otherwise it is called with the string itself as an argument. Is this correct?

The documentation at http://docs.python.org/library/string.html#formatspec is silent on whether strings were ever intended to be used as subscripts. Does this seem sensible? Was it considered during design? Should I alter the materials so that only integer subscripts are used?

regards
Steve

Begin forwarded message:
From: kirby urner
Date: February 22, 2011 2:31:08 PM PST
To: Steve Holden
Subject: deep question re dict as formatting input

>>> d
{'Steve': 'Holden', 'Tim': 'Peters', 'Guido': 'van Rossum', '1': 'string', 1: 'integer'}
>>> "{0[Guido]} is cool".format(d)
'van Rossum is cool'
>>> "{0[1]} is cool".format(d)
'integer is cool'
>>> "{0['1']} is cool".format(d)
Traceback (most recent call last):
  File "", line 1, in <module>
    "{0['1']} is cool".format(d)
KeyError: "'1'"

Student question:
Good morning!
Question on .format(), interactive session follows:
--> d = {"Steve": "Holden",
...      "Guido": "van Rossum",
...      "Tim": "Peters",
...      "1": "string",
...      1: "integer"}
--> d
{'Steve': 'Holden', 'Tim': 'Peters', '1': 'string', 1: 'integer', 'Guido': 'van Rossum'}
--> d[1]
'integer'
--> d['1']
'string'
--> "{dct[1]}".format(dct=d)
'integer'
--> "{dct[Guido]}".format(dct=d)
'van Rossum'
--> "{dct['1']}".format(dct=d)
Traceback (most recent call last):
  File "<console>", line 1, in <module>
KeyError: "'1'"

Question: If {dct[Guido]} treats Guido as str, why doesn't {dct[1]} treat 1 as str? Feels like an automatic conversion from str to int. Furthermore, how does one access the key '1' in a format statement?
~Ethan~
Quoting PEP 3101:

An example of the 'getitem' syntax:

"My name is {0[name]}".format(dict(name='Fred'))

It should be noted that the use of 'getitem' within a format string is much more limited than its conventional usage. In the above example, the string 'name' really is the literal string 'name', not a variable named 'name'. The rules for parsing an item key are very simple. If it starts with a digit, then it is treated as a number, otherwise it is used as a string.

On 2/22/2011 6:01 PM, Steve Holden wrote:
One of the students on an introductory Python 3 class asks a very good question about string formatting. This could be because the course materials are misleading, so I would like to understand. It would appear from tests that "{0[X]}".format(...) first tries to convert the string "X" to an integer. If it succeeds then __getitem__() is called with the integer as an argument, otherwise it is called with the string itself as an argument. Is this correct?
On Feb 22, 2011, at 3:08 PM, Eric Smith wrote:
Quoting PEP 3101:
An example of the 'getitem' syntax:
"My name is {0[name]}".format(dict(name='Fred'))
It should be noted that the use of 'getitem' within a format string is much more limited than its conventional usage. In the above example, the string 'name' really is the literal string 'name', not a variable named 'name'. The rules for parsing an item key are very simple. If it starts with a digit, then it is treated as a number, otherwise it is used as a string.
That's not strictly true:
>>> d = {"Steve":"Holden", "Guido":"van Rossum", 21.2:"float"}
>>> d[21.1]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
KeyError: 21.1
>>> d[21.2]
'float'
>>> "{0[21.2]}".format(d)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
KeyError: '21.2'

But I take your point, and should have thought to read the PEP. Thanks!

Kirby: Please apologize to Ethan. I can't remember being aware of the PEP 3101 specification quoted by Eric above. We will probably need to modify the course materials to take this wrinkle into account (at least by demonstrating that we are aware of it).

regards
Steve
On 2/22/2011 6:28 PM, Steve Holden wrote:
That's not strictly true:
>>> d = {"Steve":"Holden", "Guido":"van Rossum", 21.2:"float"}
>>> d[21.1]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
KeyError: 21.1
>>> d[21.2]
'float'
>>> "{0[21.2]}".format(d)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
KeyError: '21.2'

You are correct, I didn't exactly implement the PEP on this point, probably as a shortcut. I think there's an issue somewhere that discusses this, but I can't find it. The CPython implementation is really using "If every character is a digit, then it is treated as an integer, otherwise it is used as a string".

See find_name_split in Objects/stringlib/string_format.h, in particular the call to get_integer() and the interpretation of the result.

Eric.
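The rule Eric describes can be observed from Python itself, without reading the C source. The sketch below is illustrative only: KeyEcho is a hypothetical helper (not part of CPython or of this thread) whose __getitem__ simply reports the type and value of whatever key str.format() hands it.

```python
class KeyEcho:
    """Hypothetical helper: reports the key that str.format() passes in."""

    def __getitem__(self, key):
        return "{0}:{1}".format(type(key).__name__, key)


echo = KeyEcho()

all_digits = "{0[1]}".format(echo)     # every character is a digit -> int
name_key = "{0[Guido]}".format(echo)   # not all digits -> str
dotted = "{0[21.2]}".format(echo)      # the period keeps it a str, not a float
quoted = "{0['1']}".format(echo)       # the quotes become part of the str key

print(all_digits, name_key, dotted, quoted)
```

The last case matches the KeyError: "'1'" in the student's session: the quote characters are not syntax inside a replacement field, so they end up in the key.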
On 02/22/2011 07:32 PM, Eric Smith wrote:
You are correct, I didn't exactly implement the PEP on this point, probably as a shortcut. I think there's an issue somewhere that discusses this, but I can't find it. The CPython implementation is really using "If every character is a digit, then it is treated as an integer, otherwise it is used as a string".
See find_name_split in Objects/stringlib/string_format.h, in particular the call to get_integer() and the interpretation of the result.
Just for the archives, I'll mention why it works this way. It's trying to support indexing by integers, as well as dictionary access using arbitrary keys. Both of course use the same syntax.

In this case it must convert the index values into ints:

>>> a = ['usr', 'var']
>>> '{0[0]} {0[1]}'.format(a)
'usr var'

And in this case it doesn't:

>>> a = {'one':'usr', 'two':'var'}
>>> '{0[one]} {0[two]}'.format(a)
'usr var'
Eric.
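A related consequence, offered as a sketch rather than something stated in this message: since only all-digit keys are converted, an index such as -1 reaches a list as the string '-1', which a list rejects. The variable names below are illustrative.

```python
paths = ['usr', 'var']

# '0' is all digits, so the list receives the int 0.
first = '{0[0]}'.format(paths)

# '-' is not a digit, so the list receives the str '-1' and refuses it.
try:
    last = '{0[-1]}'.format(paths)
except TypeError as exc:
    last = None
    error_text = str(exc)

print(first, last)
```

If the last element is needed, it can be looked up in ordinary code (paths[-1]) and passed to format() as its own argument.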
On Tue, 22 Feb 2011 19:32:56 -0500, Eric Smith wrote:
You are correct, I didn't exactly implement the PEP on this point, probably as a shortcut. I think there's an issue somewhere that discusses this, but I can't find it. The CPython implementation is really using "If every character is a digit, then it is treated as an integer, otherwise it is used as a string".

Perhaps you are thinking of http://bugs.python.org/issue7951. Not directly on point, but related.

--
R. David Murray
www.bitdance.com
On 02/23/2011 09:42 AM, R. David Murray wrote:
On Tue, 22 Feb 2011 19:32:56 -0500, Eric Smith wrote:
You are correct, I didn't exactly implement the PEP on this point, probably as a shortcut. I think there's an issue somewhere that discusses this, but I can't find it. The CPython implementation is really using "If every character is a digit, then it is treated as an integer, otherwise it is used as a string".

Perhaps you are thinking of http://bugs.python.org/issue7951. Not directly on point, but related.

Yes, that's the one I was thinking of. Thanks, David. I'm not sure I agree with all of the points raised there, but it does give some additional background on the issue.
Eric Smith wrote:
You are correct, I didn't exactly implement the PEP on this point, probably as a shortcut. I think there's an issue somewhere that discusses this, but I can't find it. The CPython implementation is really using "If every character is a digit, then it is treated as an integer, otherwise it is used as a string".
Given the representation issues with floating point, I think the current behavior is desirable. Also, leaving digits with periods as strings would, I think, be more useful (Dewey Decimal, anyone?).

~Ethan~
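Ethan's two points can be illustrated with a short sketch. The catalogue dictionary and its contents are made up for illustration; only the behaviour of str.format() and of float arithmetic is being shown.

```python
# Hypothetical catalogue keyed by Dewey-style strings (illustrative data only).
catalogue = {"004.67": "shelf A"}

# A key containing a period is passed through as a str, so this lookup works.
label = "{0[004.67]}".format(catalogue)

# Float keys would be fragile: two "equal" values can differ in their last bits.
floats_agree = (0.1 + 0.2 == 0.3)

print(label, floats_agree)
```

Had "004.67" been converted to the float 4.67, the leading zeros would be lost and the lookup above would fail.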
On Tue, Feb 22, 2011 at 6:01 PM, Steve Holden wrote:
... It would appear from tests that "{0[X]}".format(...) first tries to convert the string "X" to an integer. If it succeeds then __getitem__() is called with the integer as an argument, otherwise it is called with the string itself as an argument. Is this correct?

This is addressed in PEP 3101:

"""
The rules for parsing an item key are very simple. If it starts with a digit, then it is treated as a number, otherwise it is used as a string.
"""

http://www.python.org/dev/peps/pep-3101/

If the current implementation does something more involved, I would say it is a bug.
On Tue, Feb 22, 2011 at 6:01 PM, Steve Holden wrote:
... It would appear from tests that "{0[X]}".format(...) first tries to convert the string "X" to an integer. If it succeeds then __getitem__() is called with the integer as an argument, otherwise it is called with the string itself as an argument. Is this correct?

This is addressed in PEP 3101:

"""
The rules for parsing an item key are very simple. If it starts with a digit, then it is treated as a number, otherwise it is used as a string.
"""

http://www.python.org/dev/peps/pep-3101/

To the other question:

Furthermore, how does one access the key '1' in a format statement? ~Ethan~

I think the parsing rule already helps to understand the problem with a key like '1'. The PEP also explicitly states that:

"""
Because keys are not quote-delimited, it is not possible to specify arbitrary dictionary keys (e.g., the strings "10" or ":-]") from within a format string.
"""

--
Senthil
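Although such keys cannot be spelled inside the replacement field, the value can still be formatted by doing the lookup in ordinary Python code and passing the result in. A minimal sketch, reusing the dictionary from the student's session (the variable names by_int, by_str and both are illustrative):

```python
d = {"Steve": "Holden", "Guido": "van Rossum", "Tim": "Peters",
     "1": "string", 1: "integer"}

# Inside a format string, [1] always means the int key.
by_int = "{dct[1]}".format(dct=d)

# Look the str key up in normal code and pass the value itself.
by_str = "{0}".format(d["1"])

# The two approaches can be mixed in one call.
both = "{dct[1]} / {one}".format(dct=d, one=d["1"])

print(by_int, by_str, both)
```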
On 2/22/2011 6:32 PM, Senthil Kumaran wrote:
I think the parsing rule already helps to understand the problem with a key like '1'. The PEP also explicitly states that:

"""
Because keys are not quote-delimited, it is not possible to specify arbitrary dictionary keys (e.g., the strings "10" or ":-]") from within a format string.
"""

Is this all specified in the lib docs? If not, it should be.

--
Terry Jan Reedy
On Wed, Feb 23, 2011 at 9:32 AM, Senthil Kumaran wrote:
"""
Because keys are not quote-delimited, it is not possible to specify arbitrary dictionary keys (e.g., the strings "10" or ":-]") from within a format string.
"""

I was curious as to whether or not nested substitution could be used to avoid that limitation, but it would seem not:

>>> "{0[{1}]}".format(d, 21.2)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
KeyError: '{1}'

Turns out that was also a deliberate design choice:

"""
These 'internal' replacement fields can only occur in the format specifier part of the replacement field. Internal replacement fields cannot themselves have format specifiers. This implies also that replacement fields cannot be nested to arbitrary levels.
"""

Ah, how (much more) confused would we be if we didn't have the PEPs and mailing list archives to remind ourselves of what we were thinking years ago...

Cheers,
Nick.

--
Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
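For contrast, here is a small sketch of where nesting is accepted and where it is not. The dictionary is the one from earlier in the thread; the names padded and nested are illustrative, and the except clause is deliberately broad because the exact exception type is an assumption about the running Python version (the session above shows KeyError).

```python
d = {"Steve": "Holden", "Guido": "van Rossum", 21.2: "float"}

# Nesting is allowed in the format specifier: {1} supplies the width here.
padded = "{0:>{1}}".format("abc", 6)

# Nesting inside the item key is not expanded, so the lookup cannot succeed.
try:
    nested = "{0[{1}]}".format(d, 21.2)
except (KeyError, ValueError):
    nested = None

print(repr(padded), nested)
```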
On Feb 23, 2011, at 5:42 AM, Nick Coghlan wrote:
Ah, how (much more) confused would we be if we didn't have the PEPs and mailing list archives to remind ourselves of what we were thinking years ago...
True. And how much more useful it would be if it were incorporated into the documentation after implementation. Too much of the format() stuff is demonstrated rather than explained.

regards
Steve
On Feb 23, 2011, at 5:42 AM, Nick Coghlan wrote:
Ah, how (much more) confused would we be if we didn't have the PEPs and mailing list archives to remind ourselves of what we were thinking years ago...
True. And how much more useful it would be if it were incorporated into the documentation after implementation. Too much of the format() stuff is demonstrated rather than explained.
I think the documentation team has been pretty good about responding to format() issues that have been brought up in the bug tracker.

Eric.
participants (8)
- Alexander Belopolsky
- Eric Smith
- Ethan Furman
- Nick Coghlan
- R. David Murray
- Senthil Kumaran
- Steve Holden
- Terry Reedy