An unambiguous way of initializing an empty set and dictionary
Currently:

l = [] # new empty list
t = () # new empty tuple
s = set() # new empty set (no clean and consistent way of initializing regarding the others) <<<
d = {} # new empty dictionary

Possible solution:

s = {} # new empty set
d = {:} # new empty dictionary (the ":" is a reference to key-value pairs)

Current workaround at least for consistency:

l = list() # new empty list
t = tuple() # new empty tuple
s = set() # new empty set
d = dict() # new empty dictionary

However, it doesn't feel right to not be able to initialize an empty set as cleanly and consistently as lists, tuples and dictionaries in both forms.
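For reference, a minimal sketch of what the spellings above evaluate to in current Python 3. The helper name `parses` is made up for this illustration; `{:}` is the proposed spelling and is not valid syntax today.

```python
# What each "empty" spelling builds in current Python 3.
empties = {"[]": [], "()": (), "{}": {}, "set()": set()}
for spelling, value in empties.items():
    print(f"{spelling:6} -> {type(value).__name__}")


def parses(source):
    """Hypothetical helper: True if `source` compiles as an expression."""
    try:
        compile(source, "<sketch>", "eval")
    except SyntaxError:
        return False
    return True


print(parses("{}"))   # existing empty-dict literal
print(parses("{:}"))  # proposed empty-dict spelling, rejected by today's parser
```

The only asymmetry is the one the post points at: three of the four types have a literal form, and `{}` already belongs to `dict`.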
On Mon, 14 Mar 2022 at 23:35, <joao.p.f.batista.97@gmail.com> wrote:
Currently:

l = [] # new empty list
t = () # new empty tuple
s = set() # new empty set (no clean and consistent way of initializing regarding the others) <<<
d = {} # new empty dictionary

Possible solution:

s = {} # new empty set
d = {:} # new empty dictionary (the ":" is a reference to key-value pairs)
Nope, that would break tons of existing code. Not gonna happen.

ChrisA
On Mon, Mar 14, 2022 at 9:49 AM Chris Angelico <rosuav@gmail.com> wrote:
Nope, that would break tons of existing code. Not gonna happen.
Of course not. (And I mean it.) But what about keeping what exists and adding {,} for an empty set? (It is not that unlike the one-element tuple, which already exists.)
ChrisA _______________________________________________ Python-ideas mailing list -- python-ideas@python.org To unsubscribe send an email to python-ideas-leave@python.org https://mail.python.org/mailman3/lists/python-ideas.python.org/ Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/PCPZPH... Code of Conduct: http://python.org/psf/codeofconduct/
On Tue, 15 Mar 2022 at 00:07, Joao S. O. Bueno <jsbueno@python.org.br> wrote:
Of course not. (And I mean it.) But what about keeping what exists and adding {,} for an empty set? (It is not that unlike the one-element tuple, which already exists.)
That's more plausible. However, the one-element tuple is actually written like this:

t = x,

The parentheses are a common form of clarity (and included in the repr), but aren't actually the part of the syntax that makes it a tuple - the comma is. So there's no real parallel, and it's basically "what can we write that wouldn't be ambiguous?", which is a weak justification.

Unfortunately, Python simply doesn't have enough symbols available. Using precisely one opener/closer for each type is highly limiting, since the only characters available are those on a US-English keyboard and in the ASCII set. It would be nice if, for instance, ∅ could mean "new empty set", but then we'd need a way to type it, and it'd end up coming right back around to "just type set(), it's easier".

I wonder what it would be like to have a fork of Python that introduces some non-ASCII non-US-English syntax, purely to give people a chance to play around with it. Someone might actually set up an editor feature so that "set()" transforms into "∅", not just visually but in the file, and since it's restricted to an input feature in the editor, it avoids the usual problems of "what if you shadow the name set". Who knows? Maybe it would catch on, maybe it wouldn't.

It's not all that difficult to hack on Python and add this feature. I did it a while back, but since I didn't use sets enough to bother figuring out an input method, didn't end up using it. If you want to write it as a pure source-code transformation, {*()} is a syntax-only way to generate an empty set, so it'll guarantee that you don't run into name shadowing issues; but it would be better to make it actual syntax (and thus avoid unpacking a tuple into your set for no reason), and also change the repr accordingly.

ChrisA
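To make the point about the comma concrete, here is a small sketch (variable names are illustrative, not from the thread). It shows that the comma, not the parentheses, builds the tuple, and that the {*()} spelling yields an empty set without using the name `set`.

```python
x = 42

t = x,          # the comma alone makes this a one-element tuple
grouped = (x)   # parentheses without a comma are only grouping

print(type(t).__name__, t)      # tuple (42,)
print(type(grouped).__name__)   # int

empty = {*()}   # set display that unpacks an empty tuple
print(type(empty).__name__, len(empty))   # set 0
```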
Do you know what else would work for being able to enter empty sets? A prefix to {}, like "s":

a = s{}

and

b = f{}

for an empty frozenset.

(/me ducks, and hides in a place Chris won't find me)
Unfortunately, Python simply doesn't have enough symbols available. Using precisely one opener/closer for each type is highly limiting, since the only characters available are those on a US-English keyboard and in the ASCII set. It would be nice if, for instance, ∅ could mean "new empty set", but then we'd need a way to type it, and it'd end up coming right back around to "just type set(), it's easier".
∅ also means "diameter" in the optics and mechanical engineering communities (at least in ASME Y14.5).
14.03.22 15:07, Joao S. O. Bueno wrote:
- but what about keeping what exists and adding {,} for an empty set? (it is not that unlike the one-element tuple, which already exists)
If you want to create an empty set without using any identifier, use {*()}. The advantage is that it works in old Python versions.
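A short sketch of why avoiding the identifier can matter. The function name and the deliberate shadowing are invented for the illustration.

```python
def make_empty_sets():
    set = list         # deliberately shadow the builtin, as careless code might
    by_name = set()    # now calls list(), so this is an empty list
    by_syntax = {*()}  # pure syntax, unaffected by the shadowing
    return by_name, by_syntax


by_name, by_syntax = make_empty_sets()
print(type(by_name).__name__, type(by_syntax).__name__)   # list set
```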
On Mon, Mar 14, 2022 at 8:33 AM <joao.p.f.batista.97@gmail.com> wrote:
Possible solution:

s = {} # new empty set
d = {:} # new empty dictionary (the ":" is a reference to key-value pairs)
I have suggested over the years, as have probably dozens of other people (maybe thousands), that that would be a great spelling if Python were a brand new language. But in reality, Python is 30+ years old, and billions of lines of code use `{}` as an empty dict. Historically, Python had dictionaries before it had sets, so there was more than a decade in there where sets were not a thing you could spell at all. A breaking change isn't something that's going to happen here.

--
Keeping medicines from the bloodstreams of the sick; food from the bellies of the hungry; books from the hands of the uneducated; technology from the underdeveloped; and putting advocates of freedom in prisons. Intellectual property is to the 21st century what the slave trade was to the 16th.
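As an illustration of the kind of existing code at stake, here is a hedged sketch (the word-count example is invented). It runs a common dict idiom that starts from `{}` and then shows that the same item assignment fails on a set, which is what `{}` would become under the proposal.

```python
# A very common pattern that relies on {} being an empty dict.
counts = {}
for word in "spam eggs spam".split():
    counts[word] = counts.get(word, 0) + 1
print(counts)   # {'spam': 2, 'eggs': 1}

# If {} produced a set instead, the assignment above would raise.
would_be = set()
error = None
try:
    would_be["spam"] = 1
except TypeError as exc:
    error = exc
print(type(error).__name__)   # TypeError
```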
The “correct” (according to Bourbaki) mathematical notation for an empty set is “∅” (aka Unicode U+2205, or HTML &empty;).

Some time ago, for a project which had a lot of empty sets, I tried to use this symbol as a shorthand for set(). But:

>>> ⦰ = set()
  File "<stdin>", line 1
    ⦰ = set()
    ^
SyntaxError: invalid character '⦰' (U+29B0)
>>> ø = set()

In other words, “⦰” is illegal as an identifier in Python (same for ⌀, aka U+2300 DIAMETER SIGN), but “ø” (aka U+00F8 LATIN SMALL LETTER O WITH STROKE) is legal!

So I used “ø” instead of “⦰”, but I eventually dropped the whole idea because, IIRC, some tools weren’t too happy with it.

Still, I guess it would be neither too hard nor too disruptive to accept “⦰” as well as some other mathematical characters as identifiers in Python.

Since one of the application domains where Python shines nowadays is mathematics (numerical, but also symbolic), I think it’s a shame that we prevent the use of the proper Unicode characters to designate some universal mathematical objects.

More info here: https://en.wikipedia.org/wiki/Null_sign

S.
-- Stefane Fermigier - http://fermigier.com/ - http://twitter.com/sfermigier - http://linkedin.com/in/sfermigier Founder & CEO, Abilian - Enterprise Social Software - http://www.abilian.com/ Co-Founder & Co-Chairman, National Council for Free & Open Source Software (CNLL) - http://cnll.fr/ Co-Founder & Chairman, Association Professionnelle Européenne du Logiciel Libre (APELL) - https://www.apell.info/ Co-Founder & Spokesperson, European Cloud Industrial Alliance (EUCLIDIA) - https://www.euclidia.eu/ Founder, PyParis & PyData Paris - http://pyparis.org/ & http://pydata.fr/
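The identifier behaviour described in this message can be checked without triggering a SyntaxError, using `str.isidentifier()`. This is a small sketch; the dictionary of candidate characters is assembled here for illustration, with the character names taken from the Unicode names mentioned in the thread.

```python
# Characters discussed above, written as escapes so the source stays ASCII.
candidates = {
    "\u2205": "EMPTY SET",
    "\u29b0": "REVERSED EMPTY SET",
    "\u2300": "DIAMETER SIGN",
    "\u00f8": "LATIN SMALL LETTER O WITH STROKE",
}

for char, name in candidates.items():
    # str.isidentifier() applies the same rules the tokenizer uses for names.
    print(f"U+{ord(char):04X} {name}: identifier={char.isidentifier()}")
```

Only the Latin letter is accepted; the three symbols are rejected as names.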
Hmm, I think the idea of the mathematical symbol is interesting, but I think users are more interested in constructing a new, eventually-not-empty set, than referencing the empty set.

Semantically, I don't know if ∅() is satisfying.
I just do this myself in my text editor (vim):

[image: sets-py.png]

But this is just cosmetic because I like to look at it this way. The actual file on disk contains `set()`, `<=`, `in`, `not in` and wouldn't be a problem for anyone without the same fonts installed, or require anyone to know odd key combos.
On Fri, 18 Mar 2022 at 00:31, David Mertz, Ph.D. <david.mertz@gmail.com> wrote:
I just do this myself in my text editor (vim):
But this is just cosmetic because I like to look at it this way. The actual file on disk contains `set()`, `<=`, `in`, `not in` and wouldn't be a problem for anyone without the same fonts installed, or require anyone to know odd key combos.
This is potentially confusing if you ever shadow the name 'set' (since the word doesn't appear in the visual form, but does appear in the source), but otherwise, it's a good solution.

Here's a crazy thought: What if we start by standardizing on {*()} as an idiom, teach the optimizer to skip the tuple altogether, and then encourage editors to (a) show it as an empty set glyph, and (b) provide a convenient way to enter it? On disk, it would still be a syntactic form that's compatible all the way back to Python 3.5 (I had to check actually - the additional unpacking generalizations have been around longer than I thought), but in the display, it would look quite elegant. Then, if it catches on, support for the actual ∅ symbol (btw, "⦰" is actually "reversed empty set", but they're in the same category so same diff) could be added as an alias for {*()}, at which point editors could have an option to represent it either way.

For the record, here's the timings for the different forms:

rosuav@sikorsky:~$ python3 -m timeit -s 'from opcode import opmap as o; f1 = lambda: set(); f2 = lambda: {*()}; f3 = type(f2)(f2.__code__.replace(co_code=bytes([o["BUILD_SET"], 0, o["RETURN_VALUE"], 0])), f2.__globals__)' 'f1()'
5000000 loops, best of 5: 78.6 nsec per loop
rosuav@sikorsky:~$ python3 -m timeit -s 'from opcode import opmap as o; f1 = lambda: set(); f2 = lambda: {*()}; f3 = type(f2)(f2.__code__.replace(co_code=bytes([o["BUILD_SET"], 0, o["RETURN_VALUE"], 0])), f2.__globals__)' 'f2()'
5000000 loops, best of 5: 98.9 nsec per loop
rosuav@sikorsky:~$ python3 -m timeit -s 'from opcode import opmap as o; f1 = lambda: set(); f2 = lambda: {*()}; f3 = type(f2)(f2.__code__.replace(co_code=bytes([o["BUILD_SET"], 0, o["RETURN_VALUE"], 0])), f2.__globals__)' 'f3()'
5000000 loops, best of 5: 62.8 nsec per loop

Sorry for the messy setup but I wanted something that would work on multiple Python versions, so hardcoding a bytes literal wouldn't work. I tried it on Python 3.8, 3.9, 3.10, and 3.11, and in each case, the relative speeds (f2 slowest, f3 fastest) were maintained.

ChrisA
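For anyone who wants to repeat the comparison from a script rather than the shell, here is a rough sketch using the `timeit` and `dis` modules. The function names and loop count are arbitrary choices, and the hand-built bytecode variant (f3) is deliberately left out because patching `co_code` depends on the interpreter version. Absolute numbers will differ by machine and Python release.

```python
import dis
import timeit


def via_name():
    return set()    # looks up the name "set" and calls it


def via_unpack():
    return {*()}    # set display that unpacks an empty tuple


LOOPS = 100_000     # arbitrary; raise it for steadier numbers

for fn in (via_name, via_unpack):
    seconds = timeit.timeit(fn, number=LOOPS)
    print(f"{fn.__name__}: {seconds / LOOPS * 1e9:.1f} nsec per call")

# The disassembly shows the extra work done for the unpacking form.
dis.dis(via_unpack)
```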
On Thu, 17 Mar 2022 at 23:19, Stéfane Fermigier <sf@fermigier.com> wrote:
Still, I guess it would be neither too hard nor too disruptive to accept “⦰” as well as some other mathematical characters as identifiers in Python.
>>> unicodedata.category("⦰")
'Sm'

https://www.fileformat.info/info/unicode/category/Sm/list.htm

This is the "Symbol, math" category. Python's support for characters in identifiers is, apart from some compatibility rules to ensure that treatment of ASCII hasn't changed since Py2, based on these categories, and this one is primarily composed of what we would call symbols, not letters (if you prefer, they're more like "punctuation" than "words").

https://docs.python.org/3/reference/lexical_analysis.html#identifiers

Supporting these in identifiers is fundamentally incompatible with supporting them as literals, with the exception of keywords, which always represent specific values (for instance, True does not mean "construct a new boolean object with the value True", it means "use the existing instance of True"). Since an empty set needs to be constructed every time, using it as an identifier seems backwards; it would be more useful to define it as a literal instead. There are problems with creating non-ASCII literal forms, but I believe fewer than with allowing symbols as identifiers.

ChrisA
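A brief sketch of both halves of this argument. The name `EMPTY` is a stand-in invented here for an identifier such as "ø" bound once to `set()`; it shows why a name that refers to one shared object differs from syntax that constructs a fresh set each time.

```python
import unicodedata

# Category check from the message above: a math symbol versus a letter.
print(unicodedata.category("\u29b0"))   # Sm (Symbol, math)
print(unicodedata.category("\u00f8"))   # Ll (Letter, lowercase)

# An identifier names ONE object, so every use shares the same set.
EMPTY = set()   # hypothetical stand-in for "ø = set()"
a = EMPTY
b = EMPTY
a.add(1)
print(b)        # {1}: b was never touched directly

# A constructing expression builds a new object on every evaluation.
c = {*()}
d = {*()}
c.add(1)
print(d)        # set()
```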
participants (8)
- Brian McCall
- Chris Angelico
- David Mertz, Ph.D.
- Joao S. O. Bueno
- joao.p.f.batista.97@gmail.com
- Michael Smith
- Serhiy Storchaka
- Stéfane Fermigier