
Summary: Further information is provided, which suggests that it may be best to amend Python so that "frozenset({1, 2, 3})" is the literal for eval("frozenset({1, 2, 3})"). Steve D'Aprano correctly notes that the bytecode generated by the expression x in {1, 2 ,3} is apparently not optimal. He then argues that introducing a frozenset literal would allow x in f{1, 2, 3} # New syntax, giving a frozenset literal would allow better bytecode to be generated. However, the following has the same semantics as "x in {1, 2, 3}" and perhaps gives optimal bytecode.
import dis dis.dis("x in (1, 2, 3)") 1 0 LOAD_NAME 0 (x) 2 LOAD_CONST 3 ((1, 2, 3)) 4 COMPARE_OP 6 (in) 6 RETURN_VALUE
For comparison, here's the bytecode Steve correctly notes is apparently not optimal.
dis.dis("x in {1, 2, 3}") 1 0 LOAD_NAME 0 (x) 2 LOAD_CONST 3 (frozenset({1, 2, 3})) 4 COMPARE_OP 6 (in) 6 RETURN_VALUE
Steve states that "x in {1, 2, 3}" when executed calls "frozenset({1, 2, 3})", and in particular looks up "frozenset" in builtins and literals. I can see why he says that, but I've done an experiment that suggests otherwise.
bc = compile("x in {1, 2, 3}", "filename", "eval") eval(bc, dict(x=1)) True eval(bc, dict(x=1, frozenset="dne")) True
I suspect that if you look up in the C-source for Python, you'll find that dis.dis ends up using frozenset({1, 2, 3}) as the literal for representing the result of evaluating frozenset({1, 2, 3}). The following is evidence for this hypothesis:
frozenset({1, 2, 3}) frozenset({1, 2, 3}) set({1, 2, 3}) {1, 2, 3} set([1, 2, 3]) {1, 2, 3} set([]) set()
To conclude, I find it plausible that: 1. The bytecode generated by "x in {1, 2, 3}" is already optimal. 2. Python already uses "frozenset({1, 2, 3})" as the literal representation of a frozenset. Steve in his original post mentioned the issue https://bugs.python.org/issue46393, authored by Terry Reedy. Steve rightly comments on that issue that "may have been shadowed, or builtins monkey-patched, so we cannot know what frozenset({1, 2, 3}) will return until runtime." Steve's quite right about this shadowing problem. In light of my plausible conclusions I suggest his goal of a frozenset literal might be better achieved by making 'frozenset' a keyword, much as None and True and False are already keywords.
True = False File "<stdin>", line 1 SyntaxError: can't assign to keyword
Once this is done we can then use frozenset({1, 2, 3}) as the literal for a frozenset, not only in dis.dis and repr and elsewhere, but also in source code. As a rough suggestion, something like from __future__ import literal_constructors_as_keywords would prevent monkey-patching of set, frozenset, int and so forth (just as True cannot be monkeypatched). I thank Steve for bringing this interesting question to our attention, for his earlier work on the issue, and for sharing his current thoughts on this matter. It's also worth looking at the message for Gregory Smith that Steve referenced in his original post. https://mail.python.org/pipermail/python-ideas/2018-July/051902.html Gregory wrote: frozenset is not the only base type that lacks a literals leading to loading values into these types involving creation of an intermediate throwaway object: bytearray. bytearray(b'short lived bytes object') I hope this helps. -- Jonathan