The interpreter accepts f(**{'5':'foo'}); is this intentional?
I can't find documentation about whether there are constraints imposed on the keys in the map passed to a function via **, as in f(**d). According to http://docs.python.org/reference/expressions.html#id9 , d must be a mapping.
test_extcall.py implies that the keys of this map must be strings in the following test:
>>> f(**{1:2})
Traceback (most recent call last):
  ...
TypeError: f() keywords must be strings
But must the keys be valid Python identifiers? In particular, the following is allowed by the Python 2.5.2 and the Jython 2.2.1 interpreters:
>>> f(**{'1':2})
{'1': 2}
Is this behavior required somewhere by the Python language spec, or is it an error that just doesn't happen to be checked, or is it intentionally undefined whether this is allowed?
Michael
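For readers who want to reproduce the two cases from the question, here is a minimal sketch. It assumes a current CPython 3 interpreter rather than the Python 2.5.2 and Jython 2.2.1 interpreters named in the message, and the helper try_call is a hypothetical name used only for illustration.

```python
def f(**kwargs):
    """Return the keyword arguments unchanged so they can be inspected."""
    return kwargs


def try_call(mapping):
    """Call f(**mapping); return the result, or the exception type name on TypeError."""
    try:
        return f(**mapping)
    except TypeError as exc:
        return type(exc).__name__


# A string key that is not a valid identifier is accepted by CPython.
digit_key_result = try_call({'1': 2})

# A non-string key is rejected with a TypeError.
int_key_result = try_call({1: 2})

print(digit_key_result)
print(int_key_result)
```

Other implementations may behave differently, which is exactly what the question is asking about.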
I would favor this not being constrained. I don't want every use of ** to cause a pattern match to verify each key. I would even be fine without the check for being strings. Define what it should be, but let the implementation be lax. It is no different from any other place where you need to know it's not a promise, just an artifact, and that you shouldn't rely on what the implementation currently does or does not enforce.
On Thu, Feb 5, 2009 at 3:03 AM, Michael Haggerty <mhagger@alum.mit.edu> wrote:
But must the keys be valid python identifiers?
-- Read my blog! I depend on your acceptance of my opinion! I am interesting! http://techblog.ironfroggy.com/ Follow me if you're into that sort of thing: http://www.twitter.com/ironfroggy
Calvin Spealman wrote:
I would favor this not being constrained. I don't want every use of ** to cause a pattern match to verify each key. I would even be fine without the check for being strings. Define what it should be, but let the implementation be lax. It is no different from any other place where you need to know it's not a promise, just an artifact, and that you shouldn't rely on what the implementation currently does or does not enforce.
I agree. There was a similar issue in http://bugs.python.org/issue2598, and we decided not to do anything about it. Eric.
Michael Haggerty wrote:
Is this behavior required somewhere by the Python language spec, or is it an error that just doesn't happen to be checked, or is it intentionally undefined whether this is allowed?
Generally speaking, Python namespace dictionaries (be it globals(), locals(), the __dict__ attribute of an instance or a set of keyword arguments) aren't required to enforce the use of legal identifiers (in many cases, the CPython variants don't even enforce the use of strings).
Enforcing legal identifiers is usually the compiler's job and if you're using dict syntax to access the contents of a namespace, the compiler doesn't care.
That laxness is a CPython implementation detail though - other implementations are quite free to be stricter with their namespaces (e.g. I believe Jython namespaces use explicitly string-keyed dictionaries, so Jython would reject the example below).
Cheers, Nick.
P.S. An example of messing about with a class's dictionary in CPython:
Python 2.5.2 (r252:60911, Jul 31 2008, 17:28:52)
[GCC 4.2.3 (Ubuntu 4.2.3-2ubuntu7)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> class C: pass
...
>>> C.__dict__[5] = "Not an identifier!"
>>> C.5 # obviously not allowed
  File "<stdin>", line 1
    C.5 # obviously not allowed
      ^
SyntaxError: invalid syntax
>>> C.__dict__['5'] = "Still not an identifier!"
>>> C.5 # still not allowed
  File "<stdin>", line 1
    C.5 # still not allowed
      ^
SyntaxError: invalid syntax
>>> C.__dict__[5]
'Not an identifier!'
>>> getattr(C, 5)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: getattr(): attribute name must be string
>>> getattr(C, '5')
'Still not an identifier!'
Cheers, Nick.
-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
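A similar experiment can be sketched for Python 3. There a class's __dict__ is exposed as a read-only proxy, so this version goes through setattr() and an instance __dict__ instead of writing to C.__dict__ directly. It only illustrates CPython behavior, which the message above describes as an implementation detail, and the names C and obj are arbitrary.

```python
class C:
    pass


obj = C()

# An instance __dict__ is an ordinary dict, so CPython lets a non-string
# key in, even though no attribute syntax can ever reach it.
obj.__dict__[5] = "Not an identifier!"

# setattr() requires a string name, but not a valid identifier.
setattr(obj, '5', "Still not an identifier!")

int_key_value = obj.__dict__[5]
str_key_value = getattr(obj, '5')

# getattr() with a non-string name is rejected outright.
try:
    getattr(obj, 5)
    getattr_error = None
except TypeError as exc:
    getattr_error = type(exc).__name__

print(int_key_value, str_key_value, getattr_error)
```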
I'd prefer a compromise -- the keys should be strings, but should not be required to be valid identifiers. All Python implementations should support this. Rationale: a type check is cheap, and using strings exclusively makes the use of a faster dict implementation possible. A check for a conforming identifier is relatively expensive and serves no purpose except being pedantic.
--Guido
On Thu, Feb 5, 2009 at 12:19 PM, Nick Coghlan <ncoghlan@gmail.com> wrote:
Generally speaking, Python namespace dictionaries (be it globals(), locals(), the __dict__ attribute of an instance or a set of keyword arguments) aren't required to enforce the use of legal identifiers (in many cases, the CPython variants don't even enforce the use of strings).
--Guido van Rossum (home page: http://www.python.org/~guido/)
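The compromise can be read as two separate checks. The sketch below is only an illustration in Python 3; the helper names keys_are_strings and keys_are_identifiers are hypothetical and are not interpreter APIs. The first is the cheap type check the proposal keeps, the second is the stricter identifier check it leaves out.

```python
import keyword


def keys_are_strings(mapping):
    """Cheap check: every key is a str instance."""
    return all(isinstance(key, str) for key in mapping)


def keys_are_identifiers(mapping):
    """Stricter check: every key is a str that could be spelled as a keyword name."""
    return all(
        isinstance(key, str) and key.isidentifier() and not keyword.iskeyword(key)
        for key in mapping
    )


samples = [{'name': 1}, {'5': 'foo'}, {5: 'foo'}]
results = [(keys_are_strings(d), keys_are_identifiers(d)) for d in samples]
print(results)
```

Under the proposal, only the first column would decide whether f(**d) is accepted, so {'5': 'foo'} passes and {5: 'foo'} does not.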
Guido van Rossum wrote:
I'd prefer a compromise -- the keys should be strings, but should not be required to be valid identifiers. All Python implementations should support this. Rationale: a type check is cheap, and using strings exclusively makes the use of a faster dict implementation possible. A check for a conforming identifier is relatively expensive and serves no purpose except being pedantic.
As pointed out in my other posting, an additional type check isn't required in most cases. A C function that checks ma_lookup does the same trick.
int PyDict_StringKeysOnly(PyObject *dict)
{
    if (((PyDictObject*)dict)->ma_lookup == lookdict_string)
        return 1;
    else {
        /* check all keys for PyStringObject */
    }
}
The performance penalty is slim to none for the common case.
How are we going to handle str subclasses and unicode? Should we allow all subclasses of basestring? Or just str and unicode? Or str only?
Christian
How are we going to handle str subclasses and unicode?
"Are we going to"? You mean, the current code is not good enough? Why not?
Should we allow all subclasses of basestring? Or just str and unicode? Or str only?
In 2.x, str only; in 3.x, unicode only.
Regards, Martin
Nick Coghlan wrote:
Generally speaking, Python namespace dictionaries (be it globals(), locals(), the __dict__ attribute of an instance or a set of keyword arguments) aren't required to enforce the use of legal identifiers (in many cases, the CPython variants don't even enforce the use of strings).
Side note: CPython's dict code has a special case for str objects (PyStringObject in 2.x, PyUnicodeObject in 3.x). The internal lookup method is optimized for str objects. Python uses dict objects for all its namespaces, like classes, modules and most objects, so dicts with str keys are pretty common. The first time a non-str object is inserted or looked up, the dict switches to a more general lookup method. lookdict() is still fast, but not as fast as lookdict_string(). It doesn't make a huge difference, but you should still keep the fact in mind.
We could abuse the state of the ma_lookup function pointer to check the dict for str-only keys. But it would break for unicode keys, thus making from __future__ import unicode_literals useless.
Christian
Christian Heimes wrote:
Nick Coghlan wrote:
Generally speaking, Python namespace dictionaries (be it globals(), locals(), the __dict__ attribute of an instance or a set of keyword arguments) aren't required to enforce the use of legal identifiers (in many cases, the CPython variants don't even enforce the use of strings).
Side note:
CPython's dict code has a special case for str objects (PyStringObject in 2.x, PyUnicodeObject in 3.x). The internal lookup method is optimized for str objects. Python uses dict objects for all its namespaces, like classes, modules and most objects, so dicts with str keys are pretty common.
The first time a non-str object is inserted or looked up, the dict switches to a more general lookup method.
This makes adding a string-only dict pretty trivial, if desired.
lookdict() is still fast, but not as fast as lookdict_string(). It doesn't make a huge difference, but you should still keep the fact in mind.
We could abuse the state of the ma_lookup function pointer to check the dict for str-only keys. But it would break for unicode keys, thus making from __future__ import unicode_literals useless.
Assuming that 3.x dicts are optimized for the 3.x string type, this is not a problem for 3.x ;-). tjr
Terry Reedy wrote:
The first time a non-str object is inserted or looked up, the dict switches to a more general lookup method.
This makes adding a string-only dict pretty trivial, if desired.
It's not as trivial as it seems. The switch over to the general lookup method happens when a non-str object is *looked up*.
d = {'a': None} # uses lookdict_string()
d.get(1, None) # d now uses lookdict()
"d->ma_lookup == lookdict_string" is a sufficient condition, not a condicio sine qua non (a necessary condition).
Assuming that 3.x dicts are optimized for the 3.x string type, this is not a problem for 3.x ;-).
Well, it's always optimized for the str type - which happens to be PyUnicodeObject in 3.x. Christian
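To make the point about lookups concrete, here is a toy Python model of the switch described in this exchange. It is not CPython's implementation: the class ToyDict, its lookup attribute and the method has_only_str_keys are invented for illustration, with lookup standing in for the ma_lookup pointer.

```python
class ToyDict:
    """Toy model of a dict that starts on a string-only lookup path."""

    def __init__(self):
        self._data = {}
        self.lookup = "lookdict_string"  # stands in for ma_lookup

    def _note_key(self, key):
        # Any non-str key, whether stored or merely looked up,
        # switches the model to the general path for good.
        if not isinstance(key, str):
            self.lookup = "lookdict"

    def __setitem__(self, key, value):
        self._note_key(key)
        self._data[key] = value

    def get(self, key, default=None):
        self._note_key(key)
        return self._data.get(key, default)

    def has_only_str_keys(self):
        # Fast path: the flag alone proves the answer.
        if self.lookup == "lookdict_string":
            return True
        # Otherwise the flag proves nothing, so scan the keys.
        return all(isinstance(key, str) for key in self._data)


d = ToyDict()
d['a'] = None
state_before = d.lookup
d.get(1, None)  # a lookup, not an insert
state_after = d.lookup
still_only_str = d.has_only_str_keys()
print(state_before, state_after, still_only_str)
```

After the failed lookup the model reports the general path even though every stored key is still a string, which is why the flag can only serve as a shortcut and a full key scan is still needed as the fallback.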
participants (9)
- "Martin v. Löwis"
- Calvin Spealman
- Christian Heimes
- Eric Smith
- Guido van Rossum
- Michael Haggerty
- Nick Coghlan
- Scott David Daniels
- Terry Reedy