
This is similar to another proposal: http://mail.python.org/pipermail/python-3000/2008-January/011798.html

Anyway, I was using ast.literal_eval and attempted to use frozenset({...}) as a key in a dictionary, which failed, because frozenset isn't a literal (though putting frozenset in the environment would be a security risk). I am currently working around this with tuples, but I'd like a literal for representing frozensets as well. I also use frozensets elsewhere in the code in ways similar to Raymond's original suggestion. Perhaps something like f{...} for declaring frozenset literals (and perhaps frozenset comprehensions)?

Hua Lu, 02.02.2013 07:24:
This has nothing to do with being a literal or not. The way you created your frozenset doesn't impact its behaviour. Could you give an example of what's not working for you? Frozensets as dict keys work just fine for me:

Python 3.2.3 (default, Oct 19 2012, 19:53:16)
[GCC 4.7.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.

Stefan
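The interpreter session from this message did not survive in the archive. As a minimal sketch of the point being made (the names `members` and `teams` are made up for illustration), a frozenset built by a normal constructor call is hashable and can be used as a dictionary key:

```python
# A frozenset is hashable, so it can serve as a dict key regardless of
# how it was constructed.
members = frozenset({"alice", "bob"})
teams = {members: "team-a"}

# Equal frozensets hash equal, so element order at construction is irrelevant.
found = teams[frozenset(["bob", "alice"])]
print(found)
```

This works in ordinary code; the difficulty raised in the thread only appears when the text has to pass through ast.literal_eval.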

Sorry, I'm quite new to mailing lists. I want to be able to use frozensets in ast.literal_eval portions (especially as semantically correct keys for certain domains where tuples have too much structure). Writing an ast filter just to accommodate a builtin type feels way too hacky to me, and there are legitimate reasons for having frozenset literals exist, some of which are discussed in the link to that 2008 discussion. Also, there are serious hacks for eval, and after seeing some of those... ay caramba. Thanks for being patient with my newbishness.

On 03/02/13 00:06, Stefan Behnel wrote:
Stefan, you did ask Hua Lu to show an example of what isn't working. It's just a demonstration of what doesn't work -- you can't create a frozenset using ast.literal_eval.

I think Hua Lu's original post made it quite clear. He wishes there to be a frozenset literal, because currently there is no way to have ast.literal_eval evaluate something containing a frozenset. Because there is no frozenset literal.

I think Raymond Hettinger's proposal back in 2008: http://mail.python.org/pipermail/python-3000/2008-January/011798.html and the following thread is worth reading. Guido even pronounced his agreement: http://mail.python.org/pipermail/python-3000/2008-January/011814.html but then changed his mind (as did Raymond). So the status quo remains.

Unfortunately the proposal to use f{ ... } for frozen sets cannot work within the constraints of Python's lexer: http://mail.python.org/pipermail/python-3000/2008-January/011838.html

Unfortunately we're running out of useful, easy to enter symbols for literals. Until such time (Python4000 perhaps, or more likely Python5000) as we can use a rich set of Unicode literals, I don't think there is any clear way to have a frozenset literal.

-- Steven
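The failure being described can be reproduced with a short sketch. The input strings below are invented for illustration; the behaviour shown is that ast.literal_eval rejects a call to frozenset, while the tuple workaround mentioned in the original post parses:

```python
import ast

# A frozenset can only be spelled as a call, and literal_eval refuses calls
# to frozenset, so this input is rejected with ValueError.
source = "{frozenset({1, 2}): 'edge'}"
try:
    ast.literal_eval(source)
    rejected = False
except ValueError as exc:
    rejected = True
    print("rejected:", exc)

# The tuple workaround parses, at the cost of imposing an order on the key.
workaround = ast.literal_eval("{(1, 2): 'edge'}")
print(workaround)
```

With tuples, (1, 2) and (2, 1) are different keys, which is the "too much structure" complaint made earlier in the thread.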

Perhaps, but we'd have to be careful with how we introduce those symbols: http://nedbatchelder.com/blog/201206/eval_really_is_dangerous.html You could possibly not allow attribute access but permit those symbols... is there an exploit possible with that much filtering? On Sat, Feb 2, 2013 at 12:30 PM, MRAB <python@mrabarnett.plus.com> wrote:

On Sat, Feb 2, 2013 at 11:07 AM, MRAB <python@mrabarnett.plus.com> wrote:
You're proposing to do this just for literal_eval(), right? But how would you implement it? It seems it would require lots of special cases. Where would you stop? dict(key1=..., key2=...)? list({...})? -- --Guido van Rossum (python.org/~guido)

On 03/02/13 05:30, MRAB wrote:
I think that having literal_eval support non-literals is a bad, bad idea. Let's just not go there. It will surely end in tears. However, I think that having something in between the strictness of literal_eval and the dangerous "anything goes" power of eval is a good idea. For a long time now I've toyed with an engine for building expression evaluators. Something that understands operators, function calls, etc, and you can tell it what names to accept. My main motivation is for evaluating mathematical expressions like: 5x^3 - 2x + log(1/y) + n!/√π I have some old, broken Pascal code that half does what I want, and I keep intending to revisit it some day in Python. In my copious spare time. In any case, I think something like that belongs in a third-party module, at least at first. -- Steven

On Sun, Feb 3, 2013 at 1:18 PM, Steven D'Aprano <steve@pearwood.info> wrote:
This sounds like a good idea, especially if there can be some way to enforce that these names may ONLY be called - you can't piggyback on log to get other functionality with log.__globals__ etc. That would cover frozenset quite happily. ChrisA
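A rough sketch of the kind of in-between evaluator discussed here, assuming the approach of walking the parsed AST and accepting only a whitelist of node types. The function name `limited_eval` and its `functions` and `variables` parameters are hypothetical, not an existing API, and the sketch is not hardened (for example, it does nothing about very large exponents):

```python
import ast
import math
import operator

# Operators the sketch is willing to evaluate.
_BIN_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.Pow: operator.pow,
}
_UNARY_OPS = {ast.UAdd: operator.pos, ast.USub: operator.neg}


def limited_eval(source, functions=None, variables=None):
    """Evaluate an arithmetic expression using only whitelisted names.

    Hypothetical helper: names in `functions` may only be called, names in
    `variables` may only be read, and attribute access is never allowed.
    """
    functions = functions or {}
    variables = variables or {}

    def convert(node):
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _BIN_OPS:
            return _BIN_OPS[type(node.op)](convert(node.left), convert(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY_OPS:
            return _UNARY_OPS[type(node.op)](convert(node.operand))
        if isinstance(node, ast.Name) and node.id in variables:
            return variables[node.id]
        if (isinstance(node, ast.Call) and isinstance(node.func, ast.Name)
                and node.func.id in functions and not node.keywords):
            return functions[node.func.id](*[convert(arg) for arg in node.args])
        # Everything else (attributes, subscripts, lambdas, ...) is refused.
        raise ValueError("disallowed expression: " + ast.dump(node))

    return convert(ast.parse(source, mode="eval").body)


value = limited_eval("5*x**3 - 2*x + log(1/y)", {"log": math.log}, {"x": 2, "y": 1})
print(value)
```

Because a whitelisted function name is only accepted in call position, an input such as log.__globals__ is an Attribute node and falls through to the ValueError branch.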

On Sun, Feb 3, 2013 at 1:09 PM, MRAB <python@mrabarnett.plus.com> wrote:
This is why we need a PEP or a PyPI module. It's certainly not clear to me that special casing in ast.literal_eval (or a new "ast.limited_eval") is a superior solution to s{} and fs{} syntax for creating the empty set and frozen sets. (And, as Raymond notes, there are other compile-time benefits in terms of constant-caching when it comes to dedicated syntax) On the other hand, a "limited_eval" style solution might be easier to extend to other builtins like range, reversed and enum, as well as to container comprehensions and generator expressions. It also has the virtue of being possible to write as a PyPI module, and made available for *current* Python versions, rather than only being available in Python 3.4+. It's certainly a space worth exploring, even though the best way to improve the status quo isn't immediately obvious. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

On Sun, 3 Feb 2013 14:50:45 +1000 Nick Coghlan <ncoghlan@gmail.com> wrote:
Well, it's superior because it doesn't need new syntax.
Which is a rather dubious benefit when it comes to a little-used datatype. If you want to avoid the cost of instantiating a frozenset in every loop iteration, just hoist it manually out of the loop.
Agreed. Regards Antoine.
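The manual hoisting mentioned above might look like the following sketch; `VOWELS` and `count_vowels` are illustrative names only:

```python
# Built once at import time instead of on every call or loop iteration.
VOWELS = frozenset("aeiou")


def count_vowels(text):
    """Count characters of `text` that are members of the hoisted frozenset."""
    return sum(1 for ch in text if ch in VOWELS)


print(count_vowels("frozenset"))
```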

On 3 Feb 2013 00:21, "Steven D'Aprano" <steve@pearwood.info> wrote:
To clarify Guido's comment in that post, I'm fairly sure it *can* be made to work, it just won't be the same way that string prefixes work (because the contents of dict and set displays are not opaque to the compiler the way string contents are). The hypothetical "What if we want to allow expr{} as a general construct?" objection needs to be balanced against the immediate value of a more expressive language subset for use in ast.literal_eval(). Cheers, Nick.

On Sun, 3 Feb 2013 00:40:47 +1000 Nick Coghlan <ncoghlan@gmail.com> wrote:
I've just tried: ast.literal_eval() works just fine on a normal set:
>>> ast.literal_eval("{1,2,3}")
{1, 2, 3}
So I'd like to know why people find it important to build a frozenset rather than a normal set. The situation of wanting to use sets as hash keys is very rare in my experience. Regards Antoine.
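To make the difference concrete, here is a small sketch (variable names are illustrative) showing that the set returned by ast.literal_eval cannot itself be used as a key, and that converting after parsing is one possible workaround when the set is at the top level:

```python
import ast

parsed = ast.literal_eval("{1, 2, 3}")  # an ordinary, mutable set

# A mutable set is unhashable, so it cannot be used as a dict key directly.
try:
    lookup = {parsed: "value"}
    unhashable = False
except TypeError as exc:
    unhashable = True
    print("cannot use as key:", exc)

# Converting after parsing works, but only because this set is not nested
# inside another container in the parsed text.
lookup = {frozenset(parsed): "value"}
print(lookup)
```

The post-hoc conversion does not help when the set appears as a key or set member inside the text being parsed, which is the case the original poster ran into.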

On Feb 2, 2013, at 6:20 AM, Steven D'Aprano <steve@pearwood.info> wrote:
Wow, that was a nice recap of the historical discussions. Thank you :-)

FWIW, if a workable frozenset literal were proposed, I would have no objections. Their original use case was to be usable as elements of other sets (for sets of sets) or as dictionary keys (to implement graph structures). Later, it became clear that it would be useful for code-optimization to treat frozensets as constants that are built when a function is defined, rather than later when it is invoked.

In the future, frozen sets may obtain other desirable properties. For example, I'm still evaluating whether to store the key/hash entries in a dense array and store only indices in a collision table. For regular sets, the dense array needs to over-allocate in order to grow efficiently. In contrast, a frozenset could free the excess memory and become very compact (about a third of the current size). If that ever comes to fruition, then a frozenset literal would be nice because of its compactness as compared to regular sets.

Raymond

On Sat, Feb 2, 2013, at 9:20, Steven D'Aprano wrote:
I was going to post about not being sure what the objection is (if it's multiple tokens, let it be multiple tokens - the contents are multiple tokens anyway - and saying it would block a future syntax extension doesn't seem like a reasonable objection to a proposed syntax extension), but I had a new idea so I'll post that instead:

{ as frozenset { ... } }

The sequence "{ as" can't occur (to my knowledge) anywhere now. So, the thing after it is a keyword in that context (and only that context, otherwise "frozenset" remains an identifier naming an ordinary builtin) and specifies what kind of literal the following sequence is.

You could also extend it to alternate forms for some other builtin types - for example { as bytes [1, 2, 3, 4, 5] } instead of b"\x01\x02\x03\x04\x05". Or... { as set { } }

On 13 February 2013 21:02, <random832@fastmail.us> wrote:
I'm really not sure I like this idea, but surely:

LITERAL as KEYWORD
{ ... } as frozenset
[1, 2, 3, 4, 5] as bytes
{} as set

would work better. However, I'm not happy with the idea that an identifier can be a keyword in another context.

On Sat, Feb 2, 2013 at 11:38 AM, Hua Lu <gotoalanlu@gmail.com> wrote:
It looks like we can special-case usage of set literal as a key for a dict or a member for another set and create a frozenset constant instead. So that adict[{1, 2, 3}] should be interpreted as adict[frozenset([1, 2, 3])] and { {'b', 'a', 'r'}, 'foo' } as { frozenset('bar'), 'foo' } This will provide a minimal change to the interpreter while making it possible to use any literal-parsing with frozensets while keeping method calls out of literal_eval. -- Kind regards, Yuriy.

This could interfere with the behavior of types overriding the index operator. Also, it's a special case.
Why shouldn't we change the interpreter when it could make the language better? Honestly, I can't think of an element which will require special syntax like symbol{...}. Furthermore, it could be really confusable with current syntax. Example: for fn { bar(foo) for foo in baz }: What if bar was a higher order function and that syntax meant something like "make a generator applying each thing in the set to an original fn before introducing fn in the scope" when you really meant `for fn in {...}`? Python 3 has already made some calls more explicitly a function (e.g. print) so having that as syntactic sugar for that would also be weird. There doesn't seem to be any _use_ for syntax like symbol{...}, so I feel for example s{...}, f{...}, and d{...} would be a natural extension to Python insofar as b'...' and u'...' are in the language, _even though_ sets and maps feel more general mathematically speaking (so the need for special use syntax _may_ be inappropriate (but people write math in languages ;)).
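The objection about types that override the index operator can be illustrated with a small sketch. The class `Tagged` is hypothetical; it stands in for any type whose __getitem__ looks at the type of the key it receives:

```python
class Tagged:
    """Hypothetical container whose index operator inspects the key's type."""

    def __getitem__(self, key):
        # Report what kind of object was passed between the brackets.
        return type(key).__name__


t = Tagged()
kind = t[{1, 2, 3}]
print(kind)  # prints "set": today the key arrives as a mutable set
```

Under the special-casing proposal the same expression would hand the object a frozenset instead, so code like this would observe a change in behaviour.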

Hey all, I have a simple hack around this problem for the time being. It involves adding a parameter to ast.literal_eval. See this diff:

$ diff ast.py.bak ast.py
38c38
< def literal_eval(node_or_string):
---

return set_t(map(_convert, node.elts))

Use is as follows:

Regards, Alan
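Most of the diff and the usage example were lost in the archive, so the following is only a sketch of the general idea and not the author's patch. It assumes a standalone wrapper, here called `literal_eval_with` (a hypothetical name), that takes a `set_t` parameter as in the surviving fragment and delegates everything that is not a container to the real ast.literal_eval:

```python
import ast


def literal_eval_with(source, set_t=set):
    """Like ast.literal_eval, but build set displays with `set_t`.

    Hypothetical helper written outside the ast module; it does not
    modify ast.py.
    """
    def convert(node):
        if isinstance(node, ast.Tuple):
            return tuple(map(convert, node.elts))
        if isinstance(node, ast.List):
            return list(map(convert, node.elts))
        if isinstance(node, ast.Set):
            # The one line that survives from the original diff.
            return set_t(map(convert, node.elts))
        if isinstance(node, ast.Dict):
            return dict(zip(map(convert, node.keys), map(convert, node.values)))
        # Numbers, strings, None, etc. are left to the stock implementation.
        return ast.literal_eval(node)

    return convert(ast.parse(source, mode="eval").body)


graph = literal_eval_with("{{1, 2}: 'edge'}", set_t=frozenset)
print(graph)
```

With set_t=frozenset, a set display used as a dict key or as a member of another set is built as a frozenset, so the nested case parses without adding any new syntax or allowing calls.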

FWIW, I could personally tolerate the introduction of s{} and fs{} literals. We'd just declare the "s" prefix optional for non-empty sets to match the current rules. Encouraging the use of ast.literal_eval() over the security nightmare that is eval() would be more than enough justification for me. (As a syntax change, the idea would still need a PEP, though) Cheers, Nick.

On 3 Feb 2013 00:39, "Antoine Pitrou" <solipsis@pitrou.net> wrote:
The difference is that decimal and datetime already have safe "from string" conversion operations. The empty set and frozen sets do not. That said, a decimal literal proposal would be a reasonable follow-up to the incorporation of cdecimal. Datetime is a more difficult prospect, since there are good reasons strptime is as flexible as it is. Regardless, I'm not saying a PEP to support all the builtin container types in ast.literal_eval would necessarily be accepted. I'm merely saying it is *worth writing*. Cheers, Nick.
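As a brief illustration of the "from string" conversions referred to here (the input values are made up), both types can be built from text without going through eval():

```python
from datetime import datetime
from decimal import Decimal

# Both constructors parse a string directly; no expression is evaluated.
price = Decimal("19.99")
when = datetime.strptime("2013-02-03", "%Y-%m-%d")

print(price, when.date())
```

There is no comparable constructor that turns the text of a set display into a frozenset, which is the gap being pointed out.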

participants (15)
- Antoine Pitrou
- Chris Angelico
- Greg Ewing
- Guido van Rossum
- Hua Lu
- Joshua Landau
- Markus Unterwaditzer
- MRAB
- Nick Coghlan
- random832@fastmail.us
- Raymond Hettinger
- Serhiy Storchaka
- Stefan Behnel
- Steven D'Aprano
- Yuriy Taraday