Mailman 3 The interpreter accepts f(**{'5':'foo'}); is this intentional? - Python-Dev

The interpreter accepts f(**{'5':'foo'}); is this intentional?

Michael Haggerty

5 Feb 2009

3:03 a.m.

I can't find documentation about whether there are constraints imposed on the keys in the map passed to a function via **, as in f(**d).

According to http://docs.python.org/reference/expressions.html#id9, d must be a mapping.

test_extcall.py implies that the keys of this map must be strings in the following test:

    >>> f(**{1:2})
    Traceback (most recent call last):
      ...
    TypeError: f() keywords must be strings

But must the keys be valid Python identifiers?

In particular, the following is allowed by the Python 2.5.2 and the Jython 2.2.1 interpreters:

    >>> f(**{'1':2})
    {'1': 2}

Is this behavior required somewhere by the Python language spec, or is it an error that just doesn't happen to be checked, or is it intentionally undefined whether this is allowed?

Michael
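For readers on a current CPython 3 interpreter, the two calls in the question can be reproduced with a short script. This is only a sketch: the function f below is a stand-in (the definition used in test_extcall.py is not shown in the message), and the wording of the error message differs between versions, so the code checks only the exception type.

```python
def f(**kwargs):
    """Stand-in for the f in the question: return the keyword dict."""
    return kwargs

# A string key that is not a valid identifier passes through **.
result = f(**{'1': 2})
print(result)

# A non-string key is rejected when the call is made.
try:
    f(**{1: 2})
except TypeError as exc:
    rejected = True
    print("TypeError:", exc)
else:
    rejected = False
```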


Calvin Spealman

5 Feb

10:31 a.m.


I would favor this not being constrained. I don't want every use of ** to cause a pattern match to verify each key. I would even be fine without the check for being strings.

Define what it should be, but let the implementation be lax. It is no different from any other place where you need to know it's not a promise, just an artifact, and you shouldn't rely on what the implementation currently does or does not enforce.

Eric Smith

10:39 a.m.


Calvin Spealman wrote:

> I would favor this not being constrained. I don't want every use of ** to cause a pattern match to verify each key. I would even be fine without the check for being strings. Define what it should be, but let the implementation be lax. It is no different from any other place where you need to know it's not a promise, just an artifact, and shouldn't rely on what the implementation currently does or does not force.

I agree. There was a similar issue in http://bugs.python.org/issue2598, and we decided not to do anything about it.

Eric.


Nick Coghlan

3:19 p.m.


Michael Haggerty wrote:

> Is this behavior required somewhere by the Python language spec, or is it an error that just doesn't happen to be checked, or is it intentionally undefined whether this is allowed?

Generally speaking, Python namespace dictionaries (be it globals(), locals(), the __dict__ attribute of an instance or a set of keyword arguments) aren't required to enforce the use of legal identifiers (in many cases, the CPython variants don't even enforce the use of strings).

Enforcing legal identifiers is usually the compiler's job and if you're using dict syntax to access the contents of a namespace, the compiler doesn't care.

That laxness is a CPython implementation detail though - other implementations are quite free to be stricter with their namespaces (e.g. I believe Jython namespaces use explicitly string-keyed dictionaries, so Jython would reject the example below).

Cheers,
Nick.

P.S. An example of messing about with a class's dictionary in CPython:

    Python 2.5.2 (r252:60911, Jul 31 2008, 17:28:52)
    [GCC 4.2.3 (Ubuntu 4.2.3-2ubuntu7)] on linux2
    Type "help", "copyright", "credits" or "license" for more information.
    >>> class C: pass
    ...
    >>> C.__dict__[5] = "Not an identifier!"
    >>> C.5 # obviously not allowed
      File "<stdin>", line 1
        C.5 # obviously not allowed
          ^
    SyntaxError: invalid syntax
    >>> C.__dict__['5'] = "Still not an identifier!"
    >>> C.5 # still not allowed
      File "<stdin>", line 1
        C.5 # still not allowed
          ^
    SyntaxError: invalid syntax
    >>> C.__dict__[5]
    'Not an identifier!'
    >>> getattr(C, 5)
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    TypeError: getattr(): attribute name must be string
    >>> getattr(C, '5')
    'Still not an identifier!'

Cheers,
Nick.

--
Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
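The session above is from Python 2.5 with an old-style class. On Python 3 a class's __dict__ is a read-only mapping proxy, so the direct item assignment no longer works as written. A rough equivalent, offered only as a sketch, goes through setattr() and getattr() and uses compile() to show that the attribute syntax is rejected by the compiler rather than by the namespace.

```python
class C:
    pass

# The namespace accepts a string name that is not a legal identifier.
setattr(C, '5', "Still not an identifier!")
value = getattr(C, '5')
print(value)
print('5' in vars(C))

# The dotted form never reaches the namespace: the compiler rejects it.
try:
    compile("C.5", "<example>", "eval")
except SyntaxError:
    dotted_form_compiles = False
else:
    dotted_form_compiles = True

# getattr() itself still insists on a string name.
try:
    getattr(C, 5)
except TypeError:
    int_name_accepted = False
else:
    int_name_accepted = True
```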

Guido van Rossum

3:32 p.m.


I'd prefer a compromise -- the keys should be strings, but should not be required to be valid identifiers. All Python implementations should support this.

Rationale: a type check is cheap, and using strings exclusively makes the use of a faster dict implementation possible. A check for a conforming identifier is relatively expensive and serves no purpose except being pedantic.

--Guido
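The two kinds of check contrasted in this message can be sketched in pure Python 3. The helper names keys_are_strings and keys_are_identifiers are hypothetical, and the sketch says nothing about how any interpreter implements the check internally; it only shows that the type check looks at each key's type, while the identifier check has to inspect every character of every key.

```python
def keys_are_strings(kwargs):
    """Cheap check: only the type of each key is examined."""
    return all(isinstance(key, str) for key in kwargs)


def keys_are_identifiers(kwargs):
    """Stricter check: each key's contents must form an identifier.

    Note that str.isidentifier() also accepts reserved words such as
    'class', so this is still not a full "valid parameter name" test.
    """
    return all(isinstance(key, str) and key.isidentifier() for key in kwargs)


print(keys_are_strings({'5': 'foo'}))      # strings, so the cheap check passes
print(keys_are_identifiers({'5': 'foo'}))  # '5' is not an identifier
```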

Christian Heimes

4:04 p.m.


Guido van Rossum wrote:

> I'd prefer a compromise -- the keys should be strings, but should not be required to be valid identifiers. All Python implementations should support this. Rationale: a type check is cheap, and using strings exclusively makes the use of a faster dict implementation possible. A check for a conforming identifier is relatively expensive and serves no purpose except being pedantic.

As pointed out in my other posting, an additional type check isn't required in most cases. A C function that checks ma_lookup does the same trick.

    int PyDict_StringKeysOnly(PyObject *dict)
    {
        if (((PyDictObject*)dict)->ma_lookup == lookdict_string)
            return 1;
        else {
            /* check all keys for PyStringObject */
        }
    }

The performance penalty is slime to nothing for the common case.

How are we going to handle str subclasses and unicode? Should we allow all subclasses of basestring? Or just str and unicode? Or str only?

Christian

"Martin v. Löwis"

4:10 p.m.


> How are we going to handle str subclasses and unicode?

"Are we going to"? You mean, the current code is not good enough? Why not?

> Should we allow all subclasses of basestring? Or just str and unicode? Or str only?

In 2.x, str only; in 3.x, unicode only.

Regards,
Martin
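The choice between "str only" and "str plus subclasses" comes down to an exact-type test versus isinstance(). The following Python 3 sketch illustrates the difference; the Tagged class and both helper names are hypothetical, and the code makes no claim about which policy a given interpreter applies to ** keys.

```python
class Tagged(str):
    """Hypothetical str subclass, used only for illustration."""


def exact_str_keys(mapping):
    """Accept keys whose type is exactly str."""
    return all(type(key) is str for key in mapping)


def str_or_subclass_keys(mapping):
    """Accept str keys and keys of any str subclass."""
    return all(isinstance(key, str) for key in mapping)


d = {Tagged('name'): 1}
print(exact_str_keys(d))        # the subclass fails the exact-type test
print(str_or_subclass_keys(d))  # but passes the isinstance test
```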

Scott David Daniels

7:16 p.m.


Christian Heimes wrote:

> ... The performance penalty is slime to nothing for the common case....

Sorry, I love this typo.

-Scott

Christian Heimes

3:36 p.m.

Nick Coghlan wrote:

> Generally speaking, Python namespace dictionaries (be it globals(), locals(), the __dict__ attribute of an instance or a set of keyword arguments) aren't required to enforce the use of legal identifiers (in many cases, the CPython variants don't even enforce the use of strings).

Side note: CPython's dict code has a special case for str objects (PyStringObject in 2.x, PyUnicodeObject in 3.x). The internal lookup method is optimized for str objects. Python uses dict objects for all its namespaces like classes, modules and most objects, so dicts with str keys are pretty common.

The first time a non-str object is inserted or looked up, the dict switches to a more general lookup method. lookdict() is still fast but not as fast as lookdict_string(). It doesn't make a huge difference but you should still keep the fact in mind.

We could abuse the state of the ma_lookup function pointer to check the dict for str-only keys. But it would break for unicode keys, thus making from __future__ import unicode_literals useless.

Christian

Terry Reedy

4:13 p.m.

Christian Heimes wrote:

> Nick Coghlan wrote:
>
> > Generally speaking, Python namespace dictionaries (be it globals(), locals(), the __dict__ attribute of an instance or a set of keyword arguments) aren't required to enforce the use of legal identifiers (in many cases, the CPython variants don't even enforce the use of strings).
>
> Side note:
>
> CPython's dict code has a special case for str objects (PyStringObject in 2.x, PyUnicodeObject in 3.x). The internal lookup method is optimized for str objects. Python uses dict objects for all its namespaces like classes, modules and most objects, so dicts with str keys are pretty common.
>
> The first time a non-str object is inserted or looked up, the dict switches to a more general lookup method.

This makes adding a string-only dict pretty trivial, if desired.

> lookdict() is still fast but not as fast as lookdict_string(). It doesn't make a huge difference but you should still keep the fact in mind.
>
> We could abuse the state of the ma_lookup function pointer to check the dict for str-only keys. But it would break for unicode keys, thus making from __future__ import unicode_literals useless.

Assuming that 3.x dicts are optimized for the 3.x string type, this is not a problem for 3.x ;-).

tjr

Christian Heimes

4:55 p.m.

Terry Reedy wrote:

> > The first time a non-str object is inserted or looked up, the dict switches to a more general lookup method.
>
> This makes adding a string-only dict pretty trivial, if desired.

It's not as trivial as it seems. The switch over to the general lookup method happens when a non-str object is *looked up*.

    d = {'a': None}  # uses lookdict_string()
    d.get(1, None)   # d now uses lookdict()

"d->ma_lookup == lookdict_string" is a sufficient condition, not a condicio sine qua non (that is, not a necessary one).

> Assuming that 3.x dicts are optimized for the 3.x string type, this is not a problem for 3.x ;-).

Well, it's always optimized for the str type - which happens to be PyUnicodeObject in 3.x.

Christian
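The "sufficient but not necessary" point can be illustrated with a toy model in Python 3. ToyDict is a hypothetical class, not CPython's implementation: its string_only_lookup flag merely stands in for the ma_lookup == lookdict_string state described in the thread, and it flips on any lookup or insertion with a non-str key.

```python
class ToyDict:
    """Toy model of a dict with a string-only fast path (illustration only)."""

    def __init__(self, items=()):
        self._data = {}
        # Stands in for "ma_lookup == lookdict_string".
        self.string_only_lookup = True
        for key, value in items:
            self[key] = value

    def _note_key(self, key):
        # Any non-str key, even one that is only looked up, flips the flag.
        if type(key) is not str:
            self.string_only_lookup = False

    def __setitem__(self, key, value):
        self._note_key(key)
        self._data[key] = value

    def get(self, key, default=None):
        self._note_key(key)
        return self._data.get(key, default)

    def all_keys_are_str(self):
        if self.string_only_lookup:
            return True  # flag still set: sufficient to conclude str-only
        # Flag cleared: the keys may still all be str, so scan them.
        return all(type(key) is str for key in self._data)


d = ToyDict([('a', None)])
print(d.string_only_lookup)   # True
d.get(1, None)                # a failed lookup with a non-str key
print(d.string_only_lookup)   # False, although every stored key is a str
print(d.all_keys_are_str())   # True, found by the fallback scan
```

In this model the flag gives a quick positive answer in the common case, and the full scan is needed only after the flag has been cleared.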
