Greek alphabet in built-in `string` module

I was wondering if it might be reasonable to consider adding a set of Greek letters to the constants in the `string` module. In my mind, the format would follow the existing convention: constants for both `greek_lowercase` and `greek_uppercase`, as well as a combined constant `greek_letters`.

Judging by how these constants are defined in the module, this seems like it might be an almost trivially easy addition (with minimal future maintenance required), and could provide a neat little piece of functionality. Since Python 3 already uses Unicode strings, I don't believe this would cause any backward compatibility issues.

The only drawback I can see might be polluting the namespace if it is not a commonly used feature. That said, the Greek alphabet is used so commonly in math/science that I'd hazard a guess it might be valuable for some, especially as Unicode becomes increasingly prevalent. Obviously developers could build their own similar constant just by creating a string with those characters, but this would save a step, whether that's adding a dependency or copying/pasting a handful of characters. I also anticipate that the argument may come up that this opens a can of worms, and then why not include even more symbols. I think that is a valid concern, though I think the Greek alphabet is somewhat unique in its prevalence (due to its use in math/science), on top of the fact that it is both limited and permanent (in a way that things like logograms and emojis may not be).

I'm certainly not an expert on the Python source code though, so please correct me if there's an obvious reason not to add this or if this has been debated before :)
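For illustration, here is a minimal sketch of what such constants might look like if defined by hand. The names `greek_lowercase`, `greek_uppercase`, and `greek_letters` are taken from the proposal above and do not exist in the `string` module. Restricting the set to the 24 modern letters plus final sigma is an assumption made for this sketch; other choices are equally defensible.

```python
# Hypothetical constants: none of these names exist in the `string` module.
# Assumption: "Greek alphabet" means the modern letters in the
# Greek and Coptic block, with final sigma counted as a lowercase letter.

# U+0391..U+03A9 runs from capital alpha to capital omega.
# U+03A2 is unassigned (there is no capital final sigma), so it is skipped.
greek_uppercase = "".join(
    chr(cp) for cp in range(0x0391, 0x03AA) if cp != 0x03A2
)

# U+03B1..U+03C9 runs from small alpha to small omega,
# including final sigma at U+03C2.
greek_lowercase = "".join(chr(cp) for cp in range(0x03B1, 0x03CA))

# Same ordering as string.ascii_letters: lowercase first, then uppercase.
greek_letters = greek_lowercase + greek_uppercase

print(greek_lowercase)
print(greek_uppercase)
```

With this choice the lowercase string is one character longer than the uppercase one, because final sigma has no capital form of its own. That small asymmetry is one example of the decisions any such constant would have to make.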

Would it include digamma, san, koppa, or sampi... Or strictly Koine letters? To a lesser extent than Greek, Hebrew letters are also used commonly in math. What about the double-struck capitals like ℤ, ℚ, and ℕ? It kinda feels like a very simple third-party module could give many such names for "characters used in such-and-such domain."

On Fri, Apr 9, 2021, 9:34 AM <mitchell.negus.57@gmail.com> wrote:
_______________________________________________
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-leave@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/STABL2...
Code of Conduct: http://python.org/psf/codeofconduct/

This sounds more like a Unicode thing than a generic string thing. And, in Unicode, Greek characters are included in multiple groupings. Searching for "Theta" to see what we get:

Greek and Coptic:
U+0398 GREEK CAPITAL LETTER THETA
U+03B8 GREEK SMALL LETTER THETA
U+03D1 GREEK THETA SYMBOL
U+03F4 GREEK CAPITAL THETA SYMBOL

Phonetic Extensions Supplement:
U+1DBF MODIFIER LETTER SMALL THETA

Mathematical Alphanumeric Symbols:
U+1D6AF MATHEMATICAL BOLD CAPITAL THETA
U+1D6B9 MATHEMATICAL BOLD CAPITAL THETA SYMBOL
U+1D6C9 MATHEMATICAL BOLD SMALL THETA
(... 17 more Thetas in this group! ...)

If you were to pick a definitive set of Greek characters for your use case, would it be in the Mathematical Alphanumeric Symbols category? Would others' expected use of Greek characters match yours, or would it need to be inclusive of all Greek characters across groupings?

I'm beginning to sense a metal container containing wriggly things...

Paul

On Fri, 2021-04-09 at 12:59 +0000, mitchell.negus.57@gmail.com wrote:

On Sat, Apr 10, 2021 at 12:15 AM Paul Bryan <pbryan@anode.ca> wrote:
> This sounds more like a Unicode thing than a generic string thing. And, in Unicode, Greek characters are included in multiple groupings. Searching for "Theta" to see what we get:
> Greek and Coptic: U+0398 GREEK CAPITAL LETTER THETA U+03B8 GREEK SMALL LETTER THETA U+03D1 GREEK THETA SYMBOL U+03F4 GREEK CAPITAL THETA SYMBOL
> Phonetic Extensions Supplement: U+1DBF MODIFIER LETTER SMALL THETA
> Mathematical Alphanumeric Symbols: U+1D6AF MATHEMATICAL BOLD CAPITAL THETA U+1D6B9 MATHEMATICAL BOLD CAPITAL THETA SYMBOL U+1D6C9 MATHEMATICAL BOLD SMALL THETA (... 17 more Thetas in this group! ...)
> If you were to pick a definitive set of Greek characters for your use case, would it be in the Mathematical Alphanumeric Symbols category? Would others' expected use of Greek characters match yours, or would it need to be inclusive of all Greek characters across groupings?
> I'm beginning to sense a metal container containing wriggly things...

But I think you've also nailed the correct solution. Python comes with [1] a unicodedata module, which would be the best way to define these sorts of sets. It's a tad messy to try to gather the correct elements though, so maybe the best way to do this would be a unicodedata.search() function that returns a string of all characters with a particular string in their names, or something like that.

ChrisA

[1] Technically, CPython and many other implementations come with it, but there are some (e.g. uPy) that don't.
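As a rough sketch of the idea above, a name-based search can already be written on top of the existing `unicodedata.name()` function. The `search()` helper here is hypothetical: `unicodedata` has no such function, and the signature and return type are assumptions based only on the description in this message.

```python
import sys
import unicodedata


def search(fragment):
    """Return a string of every character whose Unicode name contains `fragment`.

    Hypothetical helper: `unicodedata` itself has no `search()` function.
    """
    fragment = fragment.upper()
    return "".join(
        chr(cp)
        for cp in range(sys.maxunicode + 1)
        # name() raises ValueError for unnamed code points unless a
        # default is supplied, so pass "" to skip them quietly.
        if fragment in unicodedata.name(chr(cp), "")
    )


thetas = search("theta")
for ch in thetas:
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}")
```

This scans every code point on each call, which is acceptable for a one-off lookup but slow for repeated use. The exact characters returned depend on the Unicode version bundled with the interpreter (see `unicodedata.unidata_version`).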

I agree. It would be great to get something more than what the simplistic `unicodedata.category(...)` returns; for example, what Unicode group a character falls in.

On Sat, 2021-04-10 at 00:29 +1000, Chris Angelico wrote:
participants (4)
- Chris Angelico
- David Mertz
- mitchell.negus.57@gmail.com
- Paul Bryan