[Python-ideas] Re: Revisiting a frozenset display literal

Jan. 21, 2022

On Fri, 21 Jan 2022 at 12:15, Chris Angelico <rosuav@gmail.com> wrote:
...
On Fri, 21 Jan 2022 at 22:52, Oscar Benjamin <oscar.j.benjamin@gmail.com> wrote:
...
I really don't understand (having read everything above) why anyone prefers {1,2,3}.frozen() over f{1,2,3}. Yes, some people coming from some other languages might get confused (e.g. in Mathematica this is function call syntax) but that's true of anything: you have to learn Python syntax to use Python. The fact that {1,2,3} is a set and f{1,2,3} is a frozenset is not difficult to explain or to understand, especially in a language that already uses single-letter prefixes for other things.
The .frozen() method is a strangely indirect way to achieve a minor optimisation. Outside of attempting to achieve that optimisation it's basically useless because any time you would have written obj.frozen() you could have simply written frozenset(obj) so it does nothing to improve code that uses frozensets.
If set.frozen() is optimized, then str.upper() can be optimized the
same way, which means there's a lot of places where constant folding
can be used. We commonly write code like "7*24*60*60" to mean the
number of seconds in a week, confident that it'll be exactly as fast
as writing "604800", and there's no particular reason that method
calls can't get the same optimization, other than that it hasn't been
done yet.
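A quick way to check the folding claim, as a sketch: the snippet below compiles the expression and looks at the constants stored in the code object. This relies on CPython's compile-time optimiser, which is an implementation detail, so other implementations may behave differently.

```python
# Compile the arithmetic expression and inspect the resulting code object.
code = compile("7*24*60*60", "<example>", "eval")

# On CPython the product is computed at compile time, so the folded
# value is stored among the code object's constants.
folded = 604800 in code.co_consts
value = eval(code)
print(folded, value)
```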
While dedicated syntax might be as good, it also wouldn't help with
string methods (or int methods - I don't see it a lot currently, but
maybe (1234).to_bytes() could become more popular), and it would also
be completely backward incompatible - you can't feature-test for
syntax without a lot of hassle with imports and alternates. In
contrast, code that wants to use set.frozen() can at least test for
that with a simple try/except in the same module.
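As a sketch of that kind of feature test: set.frozen() is only the method proposed in this thread and does not exist in any released Python, so today the fallback branch runs. The names freeze and weekend are just chosen for the example.

```python
try:
    # Hypothetical: set.frozen is the method proposed in this thread.
    freeze = set.frozen
except AttributeError:
    # Current Pythons: fall back to the frozenset constructor.
    freeze = frozenset

weekend = freeze({"sat", "sun"})
print(type(weekend).__name__)
```

New syntax cannot be probed this way, because a module containing f{1,2,3} would fail to compile before any try/except in it could run.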
The proposal for .frozen() is not about optimising method calls on
literals in general: the proposal is to add a method that is basically
redundant but purely so that calls to the method can be optimised
away.
...
Not one of the proposed syntaxes has seen any sort of strong support.
This isn't the first time people have proposed a syntactic form for
frozensets, and it never achieves sufficient consensus to move
forward.
...
With f{...} you have a nice syntax that clearly creates a frozenset directly and that can be used for repr. This is some actual code that I recently wrote using frozensets to represent monomials in a sparse representation of a multivariate polynomial:
"Clearly" is subjective. Any syntax could be used for repr, including
{1,2,3}.frozen(), so f{1,2,3} doesn't have any particular edge there.
Personally, I think that string literals are not the same thing as
tuple/list/dict/set displays, and letter prefixes are not as useful on
the latter.
poly = {frozenset([(1,2), (3,4)]): 2, frozenset([(0,1)]): 3}
poly
  {frozenset({(1, 2), (3, 4)}): 2, frozenset({(0, 1)}): 3}
With the f{...} proposal you have actual syntax for this:
poly = {f{(1,2), (3,4)}: 2, f{(0,1)}: 3}
poly
  {f{(1, 2), (3, 4)}: 2, f{(0, 1)}: 3}
With .frozen() it's
poly = {{(1,2), (3,4)}.frozen(): 2, {(0,1)}.frozen(): 3}
poly
  ??? does the repr change?
Yes, it most certainly would change the repr. I don't see why that's an issue.
I'm not saying it's an issue. That was a genuine question. So I guess
you'd expect this:
poly = {{(1,2), (3,4)}.frozen(): 2, {(0,1)}.frozen(): 3}
poly
  {{(1, 2), (3, 4)}.frozen(): 2, {(0, 1)}.frozen(): 3}
This btw is the real point of my post, which you seem to have missed (I
probably should have kept it more direct):
That difference in code/repr may or may not seem like an improvement to different people but that should be the real point of discussion if talking about a frozenset literal. The performance impact of frozenset literals is not going to be noticeable in any real application.
If we take performance out of the equation would anyone actually
propose to add a .frozen() method so that obj.frozen() could be used
instead of frozenset(obj)?

If so then what is the argument for having a redundant way of doing this?
...
I don't understand polynomials as frozensets. What's the point of
representing them that way? Particularly if you're converting to and
from dicts all the time, why not represent them as dicts? Or as some
custom mapping type, if you need it to be hashable?
Hashability is the point. The polynomial is a dict mapping monomials
to coefficients and the monomials are frozensets of factors so that
they are hashable with unordered equality. Another option would just
be a sorted tuple of tuples and then instead of frozenset(d.items())
you'd have tuple(sorted(d.items())) but that's slower in my timings
(for all input sizes). Any custom type with pure Python
__eq__/__hash__ methods would slow everything down because, apart from
a couple of functions like the one I showed, all these objects are used
for is as dict keys.
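
To make the trade-off concrete, here is a minimal sketch of the two key representations. Reading each (1, 2) pair as (variable index, exponent) is an assumption made for the example, and monomial is a made-up name.

```python
# Hypothetical monomial stored as {variable index: exponent}.
monomial = {1: 2, 3: 4}

key_fs = frozenset(monomial.items())       # hashable, order-independent
key_tup = tuple(sorted(monomial.items()))  # hashable, but needs sorting first

# Polynomial as a dict mapping monomials to coefficients.
poly = {key_fs: 2, frozenset({(0, 1)}): 3}

# The lookup succeeds whatever order the factors are listed in.
coeff = poly[frozenset([(3, 4), (1, 2)])]
print(coeff, key_tup)
```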

--
Oscar