
What do you think about adding Unicode aliases for some mathematical names in the math module? ;-)
    math.π = math.pi
    math.τ = math.tau
    math.Γ = math.gamma
    math.ℯ = math.e
Unfortunately we can't use ∞, ∑ and √ as identifiers. :-(

On Thu, Jun 01, 2017 at 09:47:57AM +0300, Serhiy Storchaka <storchaka@gmail.com> wrote:
-1. "There should be one-- and preferably only one --obvious way to do it." And -1 for non-ascii in stdlib. Oleg. -- Oleg Broytman http://phdru.name/ phd@phdru.name Programmers don't die, they just GOSUB without RETURN.

On 05/31/2017 11:47 PM, Serhiy Storchaka wrote:
What do you think about adding Unicode aliases for some mathematical names in the math module? ;-)
I personally don't like there being multiple symbols with the same meaning and I never find myself confused by the longer names versus the shorter symbols I would use when writing them. Cheers, Thomas

Hi all, Two remarks: 1. Note that ℯ also doesn't really work. While you can assign to this identifier, it actually gets normalized into a plain "e". 2. Unicode has a Σ : GREEK CAPITAL LETTER SIGMA and a ∑ : N-ARY SUMMATION The first is a valid Python identifier, the second not. Unfortunately, the second has the desired semantics... Stephan 2017-06-01 9:14 GMT+02:00 Ivan Levkivskyi <levkivskyi@gmail.com>:
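Both remarks can be checked from the interpreter. The following is a minimal sketch, assuming a Python 3 build with a reasonably recent Unicode database; the variable names `script_e` and `ns` are only illustrative.

```python
import unicodedata

script_e = "\u212f"  # ℯ SCRIPT SMALL E

# NFKC normalisation folds the script letter into a plain ASCII "e".
print(unicodedata.normalize("NFKC", script_e))

# Assigning to ℯ in source code therefore binds the name "e".
ns = {}
exec(script_e + " = 2.718", ns)
print("e" in ns, script_e in ns)

# GREEK CAPITAL LETTER SIGMA is a letter; N-ARY SUMMATION is a math symbol.
print("\u03a3".isidentifier())  # Σ
print("\u2211".isidentifier())  # ∑
```

The identifier is normalised by the compiler before it is stored, which is why the namespace ends up with the key "e" rather than "ℯ".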

Or perhaps create a small module:
============unimath.py==============
import math
__all__ = ["π", "τ", "Γ"]
π = math.pi
τ = math.tau
Γ = math.gamma
====================================
Then do:
from unimath import *
Put it on the Python Package Index. If it gets wildly popular the case for putting it in `math` will be greatly strengthened. Stephan 2017-06-01 9:17 GMT+02:00 Brice PARENT <contact@brice.xyz>:


How do you write π (pi) with a keyboard on Windows, Linux or macOS?
On macOS, ⌥ P (Option-P) works. On all platforms: 1. Make sure you are using Vim. 2. In insert mode: Ctrl-K *p You can also define abbrev's which will allow you to type pi\ and it gets replaced by π. See: https://gist.github.com/stephanh42/fc466e62bfb022a890ff2c4643eaf3a5 I presume Emacs can do something similar. Or you get this keyboard: https://imgur.com/gallery/tCNvP ;-) Stephan 2017-06-01 11:49 GMT+02:00 Victor Stinner <victor.stinner@gmail.com>:

This shouldn't be a problem for Greek users. ;-)
Well, they still need to switch between keymaps, since presumably they used the Latin keymap to enter `math.` before they can enter π. That is actually another general solution: just install the Greek keymap in addition to your native keymap. The OS typically provides keyboard shortcuts to switch between keymaps. Υεσ Ι καν υσε Γρεεκ καρακτερσ! OK, that works. Stephan 2017-06-01 12:14 GMT+02:00 Serhiy Storchaka <storchaka@gmail.com>:

On Thu, Jun 01, 2017 at 11:49:43AM +0200, Victor Stinner wrote:
How do you write π (pi) with a keyboard on Windows, Linux or macOS?
Use the compose key 🙌 for linux: https://help.ubuntu.com/community/ComposeKey for windows: https://github.com/SamHocevar/wincompose for macosx: http://lol.zoy.org/blog/2012/06/17/compose-key-on-os-x Then I wrote my own ~/.XCompose file with: <Multi_key> <asterisk> <p> : "π" U03C0 # GREEK SMALL LETTER PI so it's like the vim digraphs. Cheers, -- zmo

On Jun 01 2017, Victor Stinner <victor.stinner-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
Under Linux, you'd use the Compose facility. Take a look at e.g. /usr/share/X11/locale/en_US.UTF-8/Compose for all the nice things it lets you enter:
$ egrep '[πτΓ]' /usr/share/X11/locale/en_US.UTF-8/Compose
<dead_greek> <G> : "Γ" U0393 # GREEK CAPITAL LETTER GAMMA
<dead_greek> <p> : "π" U03C0 # GREEK SMALL LETTER PI
<dead_greek> <t> : "τ" U03C4 # GREEK SMALL LETTER TAU
Best, -Nikolaus -- GPG Fingerprint: ED31 791B 2C5C 1613 AF38 8B8A D113 FCAC 3C4E 599F »Time flies like an arrow, fruit flies like a Banana.«

On Fri, Jun 2, 2017 at 7:52 AM, Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
I don't have a strong opinion about it being in the stdlib, but I'd also point out that a strong advantage to having these defined in a module at all is that third-party interpreters (e.g. IPython, bpython, some IDEs) that support tab-completion make these easy to type as well, and I find them to be very readable for math-heavy code.

Hi Masayuki, I admit that my understanding of this issue is very limited. Nevertheless, I would like to point out that the encoding assumed for a Python3 source file never depends on the locale. My understanding is that in the default encoding for Python source files (utf-8), East Asian Ambiguous characters must be assumed narrow. Now there are also legacy encodings where they are fullwidth. But it is always determined by the encoding, which in turn is specified or implied in the source file. So I don't actually see an issue here. Am I missing something? Stephan Op 1 jun. 2017 16:08 schreef "Masayuki YAMAMOTO" <ma3yuki.8mamo10@gmail.com
:
The width of Greek letters is East Asian Ambiguous. Using ambiguous-width characters may break source code layout in some locales. Masayuki 2017-06-01 15:47 GMT+09:00 Serhiy Storchaka <storchaka@gmail.com>:
_______________________________________________ Python-ideas mailing list Python-ideas@python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/

Hi Stephan, Nevertheless, I would like to point out that the encoding assumed for a
Python3 source file never depends on the locale.
Yes, as you pointed out. I'd like to correct what I said.
The mapping for ambiguous width is intended for East Asian legacy encodings and non-East-Asian legacy encodings, but is not recommended for UTF-8 and other Unicode encodings. Ambiguous-width characters are displayed narrow by default, and that is not related to the encoding. [*] Let me see... Several programs have a setting that changes ambiguous width to halfwidth or fullwidth regardless of encoding (e.g. gnome-terminal, vim). And some fonts used in East Asia draw Greek letters and other signs as fullwidth glyphs, so they break layout under halfwidth settings. It is possible to avoid these fonts and use a multi-language font, yet signs that are only used in East Asia have no halfwidth glyph whatever the ambiguous-width setting is. Therefore, when working in an East Asian language, it is difficult to set Greek letters to display as halfwidth. Regards, Masayuki [*] http://unicode.org/reports/tr11/#Recommendations
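The width classes from UAX #11 mentioned here can be inspected with the standard `unicodedata` module. This is only a small sketch; the sample characters and the `widths` dictionary are illustrative choices, not something from the thread.

```python
import unicodedata

# East Asian Width property for a few sample characters:
# "A" = Ambiguous, "Na" = Narrow, "W" = Wide.
samples = "πτΓaあ"
widths = {ch: unicodedata.east_asian_width(ch) for ch in samples}

for ch, width in widths.items():
    print(f"U+{ord(ch):04X} {ch} {width}")
```

The property only says that the rendered width depends on context; whether a terminal or editor actually draws π in one or two cells is decided by its settings and fonts, as described above.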

I'm slightly confused as to what you mean, but here goes: So you're saying that:
- Glyphs like pi have an ambiguous width.
- Most text editors/terminals let you choose between halfwidth (roughly normal monospace width?) and fullwidth (double the size).
- However, many East Asian fonts do NOT have halfwidth support.
Is this correct? -- Ryan (ライアン) Yoko Shimomura > ryo (supercell/EGOIST) > Hiroyuki Sawano >> everyone else http://refi64.com On Jun 1, 2017 2:27 PM, "Masayuki YAMAMOTO" <ma3yuki.8mamo10@gmail.com> wrote:

Yes, that's correct. Here is a link to the vim help for the ambiguous width setting. http://vimdoc.sourceforge.net/htmldoc/options.html#'ambiwidth' Masayuki 2017-06-02 5:05 GMT+09:00 Ryan Gonzalez <rymg19@gmail.com>:

On Thu, Jun 1, 2017, at 10:08, Masayuki YAMAMOTO wrote:
I don't think PEP 8 approves anyway of doing the kind of column alignment that this (or for that matter proportional fonts) would break. One example is specifically called out as a "pet peeve". Of course, it also doesn't exactly approve of non-ASCII identifiers (PEP 3131 specifically forbids them in the standard library).

On Fri, Jun 2, 2017 at 2:40 AM, Random832 <random832@fastmail.com> wrote:
PEP 8 has nothing against non-ASCII identifiers where they make sense. The Py3 grammar was changed to be full Unicode specifically to permit that sort of thing. Personally, I would continue to use math.pi because it's easier to type *on my keyboard* than something involving letters I have to compose, but it may well be different for someone who already shifts keyboard from Latin to Greek regularly. Regardless, the stdlib does, as you say, avoid non-ASCII. ChrisA

On Thu, Jun 1, 2017 at 9:47 AM, Serhiy Storchaka <storchaka@gmail.com> wrote:
If this were to happen, I would only add π and forget about the others. It is the only one that is nearly 100 percent unambiguous. And seeing that in code or listed in math or builtins might have a nice wow factor to some. If π were in builtins, it might actually be useful as being more readable and faster to type than math.pi or np.pi. As math.π, I'm not sure it's worth it, although less harmful than math.tau. In IPython/Jupyter, you can type \pi + tab, and you'll get π. This even works on command line! -- Koos (mobile)

On Thu, Jun 1, 2017 at 8:47 AM, Serhiy Storchaka <storchaka@gmail.com> wrote:
* it duplicates functionality
* I have no idea how to write those chars on Linux; if I did, I'm not sure it'd be the same on OSX and Windows (probably not)
* duplicated aliases might make sense if they add readability; in this case they don't unless (maybe) you have a mathematical background. I can infer what "math.gamma" stands for, but not being a mathematician, math.Γ makes absolutely zero sense to me.
* if you really want to do that you can simply do "from math import gamma as Γ", but it's something I wouldn't like if I were to read your code
* I generally dislike any non-ASCII API; the fact that Python 3 allows you to do that should not be an incentive to promote such a habit in the stdlib or anywhere else except in end-user code, and even there I still wouldn't like it except in comments or docstrings
-1 -- Giampaolo - http://grodola.blogspot.com
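For readers who want to see what the import-alias workaround looks like in end-user code, here is a minimal sketch. It assumes Python 3.6 or later (for `math.tau`); the `radius` and `area` names are made up for the example.

```python
import math
from math import pi as π, tau as τ, gamma as Γ

radius = 2.0
area = π * radius ** 2   # same value as math.pi * radius ** 2

print(area)
print(τ == 2 * π)        # tau is defined as 2 * pi
print(Γ(5))              # Gamma(5) = 4! = 24.0
```

Nothing in the standard library changes here: the Greek names exist only in the importing module's namespace.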

On 2 June 2017 at 12:17, Giampaolo Rodola' <g.rodola@gmail.com> wrote:
> On Thu, Jun 1, 2017 at 8:47 AM, Serhiy Storchaka <storchaka@gmail.com> wrote:
>> What do you think about adding Unicode aliases for some mathematical names in the math module? ;-)
>> math.π = math.pi
>> math.τ = math.tau
>> math.Γ = math.gamma
>> math.ℯ = math.e
>> Unfortunately we can't use ∞, ∑ and √ as identifiers. :-(
> [...]
> * duplicated aliases might make sense if they add readability; in this case they don't unless (maybe) you have a mathematical background. I can infer what "math.gamma" stands for but not being a mathematician math.Γ makes absolutely zero sense to me.
There is a significant number of scientific Python programmers (21% according to PyCharm 2016), so it is not that rare to meet someone who knows what the Gamma function is. And for many of them π is much more readable than np.pi. There is also another problem: confusion between the Gamma function and the Euler–Mascheroni constant. The first is Γ, the second is γ (perfectly opposite to PEP 8 capitalization rules :-), while both are frequently written as just gamma (in particular, math.gamma follows the PEP 8 rules but is counter-intuitive for most scientists). All that said, I agree that these problems are easily solved by a custom import from. Still, there is something in (or related to?) this proposal that I think is worth considering: can we also allow identifiers like ∫ or √? This would make many expressions more similar to the usual TeX, plus it would be useful for projects like SymPy. -- Ivan

I would love to show how easy it is to write from math import pi as π, gamma as Γ but I had to cheat by copying from the OP since I don't know how to type these (and even if you were to tell me how I'd forget tomorrow). So, I am still in favor of the rule "only ASCII in the stdlib". On Fri, Jun 2, 2017 at 3:48 PM, Ivan Levkivskyi <levkivskyi@gmail.com> wrote:
-- --Guido van Rossum (python.org/~guido)

On 3 June 2017 at 00:55, Guido van Rossum <guido@python.org> wrote:
[...] So, I am still in favor of the rule "only ASCII in the stdlib".
But what about the other question? Currently, integral, sum, infinity, square root etc. Unicode symbols are all prohibited in identifiers. Is it possible to allow them? (Btw IPython just supports normal TeX notations like \pi, \lambda etc, so it is very easy to remember) -- Ivan

On 3 June 2017 at 01:29, Guido van Rossum <guido@python.org> wrote:
Are those characters not considered Unicode letters? Maybe we could add their category to the allowed set?
Yes, they are not considered letters, they are in category Sm. Unfortunately, +, -, |, and other symbols that clearly should not be in identifiers are also in this category, so we cannot add the whole category. It is possible to include particular ranges, but there should be a discussion about what exactly can/should be included. -- Ivan

On 6/2/2017 7:56 PM, Ivan Levkivskyi wrote:
I presume that is Symbol - math.
Having to test ranges will slow down identifier recognition.
but there should be a discussion about what exactly can/should be included.
I believe the current python definition of 'identifier' is taken from the Unicode Standard for default identifiers. Any change would have to be propagated to regex engines, IDEs, and anything else that parses python. I suggest that you ask Martin Loewis for his opinion on changing the identifier definition. -- Terry Jan Reedy

On Fri, Jun 02, 2017 at 04:29:16PM -0700, Guido van Rossum wrote:
Are those characters not considered Unicode letters? Maybe we could add their category to the allowed set?
They're not letters:
py> {unicodedata.category(c) for c in '∑√∫∞'}
{'Sm'}
That's Symbol, Math. One problem is that the 'Sm' category includes a whole lot of mathematical symbols that we probably don't want in identifiers: ∴ ∣ ≈ ≒ ≝ ≫ ≮ ⊞ (plus MANY more variations on = < and > operators) including some "Confusables":
∁ ∊ ∨ ∗ ∑ etc
C ε v * Σ
http://www.unicode.org/reports/tr39/
Of course a language can define identifiers however it likes, but I think it is relevant that the Unicode Consortium's default algorithm for determining an identifier excludes Sm. http://www.unicode.org/reports/tr31/ I also disagree with Ivan that these symbols would be particularly useful in general, even for maths-heavy code, although I wouldn't say no to special casing ∞ (infinity) and maybe √ as a unary square root operator. -- Steve
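To make the category argument concrete, the sketch below lists a few characters from 'Sm' next to each other, including ASCII operators that share the category. The particular selection of characters is my own, chosen for illustration.

```python
import unicodedata

# Proposed symbols plus some ASCII operators that share the same category.
chars = "∑√∫∞+|~"
report = [(ch, unicodedata.category(ch), ch.isidentifier()) for ch in chars]

for ch, category, ok in report:
    print(f"U+{ord(ch):04X} {ch} {category} identifier={ok} {unicodedata.name(ch)}")
```

Since `+` and `|` sit in the same general category as ∑ and √, admitting the whole category is not an option; any change would have to name individual code points or ranges.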

OK, I think this discussion is pretty much dead then. We definitely shouldn't allow math operators in identifiers, otherwise in Python 4 or 5 we couldn't introduce them as operators. On Fri, Jun 2, 2017 at 5:10 PM, Steven D'Aprano <steve@pearwood.info> wrote:
-- --Guido van Rossum (python.org/~guido)

On Sat, Jun 3, 2017 at 3:42 PM, Pavol Lisy <pavol.lisy@gmail.com> wrote:
No, because operators need to be defined before you get to individual objects, and they need precedence and associativity. So it'd have to be defined at the compiler level. Also, having arbitrary operators gets extremely confusing. It's not easy to reason about code when you don't know what's even an operator. Not a stupid question, but one for which the answer is "definitely not like that". ChrisA

For reference, haskell is perhaps the closest language to providing arbitrary infix operators, and it requires that they be surrounded by backticks. That is A `op` B is equivalent to op(A, B) That doesn't work for python (backtick is taken) and I don't think anything similar is a good idea. On Sat, Jun 3, 2017 at 1:56 AM Chris Angelico <rosuav@gmail.com> wrote:

Hi Joshua,
This can of course be faked in Python. https://gist.github.com/stephanh42/a4d6d66b10cfecf935c9531150afb247 Now you can do:
========
@BinopCallable
def add(x, y):
    return x + y
print(3 @add@ 5)
===========
Stephan 2017-06-03 7:59 GMT+02:00 Joshua Morton <joshua.morton13@gmail.com>:
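The gist itself is not reproduced in this thread, so the following is only a guess at how such a decorator could work, not Stephan's actual code. It relies on `3 @add@ 5` parsing as `(3 @ add) @ 5`; the helper class `_LeftBound` is a hypothetical name.

```python
class _LeftBound:
    """Intermediate object produced by `left @ op`; waits for the right operand."""

    def __init__(self, func, left):
        self.func = func
        self.left = left

    def __matmul__(self, right):
        # (left @ op) @ right  ->  func(left, right)
        return self.func(self.left, right)


class BinopCallable:
    """Wrap a two-argument function so it can be written as `a @op@ b`."""

    def __init__(self, func):
        self.func = func

    def __call__(self, *args, **kwargs):
        # Still usable as an ordinary function.
        return self.func(*args, **kwargs)

    def __rmatmul__(self, left):
        # Called for `left @ op` when `left` does not implement @ itself.
        return _LeftBound(self.func, left)


@BinopCallable
def add(x, y):
    return x + y


print(3 @add@ 5)
print(add(3, 5))
```

This trick breaks down when the left operand already defines `@` (NumPy arrays, for example), which is one reason it is a curiosity rather than a general mechanism.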

Julia lets you define new infix operators directly, including using mathematical symbols as operators. Not that I think that is a good idea, but you can do it. On Jun 3, 2017 2:00 AM, "Joshua Morton" <joshua.morton13@gmail.com> wrote:

On 6/3/17, Chris Angelico <rosuav@gmail.com> wrote:
Thanks for clarifying this point. Sorry for another stupid question: coding import machinery couldn't be used too, right? (I mean something like hylang.org ) Because ast could not understand these operators (and precedence and associativity)? BTW there could be also question about "multipliability". I mean something like a↑↑↑n ( see https://en.wikipedia.org/wiki/Knuth%27s_up-arrow_notation )
Also, having arbitrary operators gets extremely confusing. It's not easy to reason about code when you don't know what's even an operator.
I was thinking about it, but python is like this! You couldn't be really sure what is operator + doing! :) And it could be much easier to learn what some operator means in some library than for example understand async paradigm. (at least for some people)
Not a stupid question, but one for which the answer is "definitely not like that".
Thanks again! :) Although I am not sure it is definitely impossible I see that it is pretty pretty difficult.

On 3 June 2017 at 17:22, Pavol Lisy <pavol.lisy@gmail.com> wrote:
Source translation frontends *can* define new in-fix operators, but they need to translate them into explicit method and/or function calls before they reach the AST. So a frontend that added "A @ B" support to Python 2.7 (for example), would need to translate it into something like "numpy.dot(A, B)" or "matmul(A, B)" at the Python AST level. It would then be up to that function to emulate Python 3's __matmul__ special method support (or not, if the frontend was happy to only support a particular type, such as NumPy arrays) Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
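The kind of rewrite described here can be illustrated with the `ast` module. Note the assumption: this sketch runs on Python 3, where the parser already understands `@`, so it only demonstrates the "operator becomes a function call" step; a real frontend for Python 2.7 would have to do the rewrite on source text or tokens before parsing. The `matmul` helper is a hypothetical stand-in, not a real library function.

```python
import ast


class MatMulToCall(ast.NodeTransformer):
    """Rewrite `A @ B` into the explicit call `matmul(A, B)`."""

    def visit_BinOp(self, node):
        self.generic_visit(node)  # rewrite nested operands first
        if isinstance(node.op, ast.MatMult):
            call = ast.Call(
                func=ast.Name(id="matmul", ctx=ast.Load()),
                args=[node.left, node.right],
                keywords=[],
            )
            return ast.copy_location(call, node)
        return node


def matmul(a, b):
    # Hypothetical stand-in: dot product of two equal-length sequences.
    return sum(x * y for x, y in zip(a, b))


tree = ast.parse("result = [1, 2, 3] @ [4, 5, 6]")
tree = ast.fix_missing_locations(MatMulToCall().visit(tree))

ns = {"matmul": matmul}
exec(compile(tree, "<translated>", "exec"), ns)
print(ns["result"])
```

Plain lists do not support `@`, so the untransformed source would raise a TypeError; after the rewrite the code never uses the operator at all.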

On 3 June 2017 at 15:55, Chris Angelico <rosuav@gmail.com> wrote:
A useful background read on this question specifically in the context of Python is PEP 465 (which added A@B for matrix multiplication), and in particular its discussion of the rejected alternatives: https://www.python.org/dev/peps/pep-0465/#rejected-alternatives-to-adding-a-... For most purposes, the existing set of operators is sufficient, since we can alias them for unusual purposes (e.g. "/" for path joining in pathlib) when we don't need access to the more conventional meaning (division in that case, since "dividing one path segment by another" is nonsensical) and context makes it possible for the reader to understand what is going on ("filepath = segment1 / segment2 / segment3" looks a lot like writing out a filesystem path as a string and the name of the assignment target makes it clear this is a filesystem path operation, not a division operation). Matrix multiplication turned out to be a genuine exception, since all the other binary operators had well defined meanings as elementwise-operators, so borrowing one of them for matrix multiplication meant losing access to the corresponding elementwise operation, and there typically *weren't* enough hints from the context to let you know whether "*" was by element or the dot product. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
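The pathlib case is easy to try out. A short sketch, using `PurePosixPath` so that the result does not depend on the operating system; the path segments are arbitrary examples.

```python
from pathlib import PurePosixPath

# "/" is reused for joining path segments; no division is involved.
filepath = PurePosixPath("/usr") / "lib" / "python3"

print(filepath)
print(filepath.parts)
```

The reuse works because nobody needs to divide paths, so no conventional meaning is lost. That is the property matrix multiplication did not have with `*`.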

03.06.17 03:31, Guido van Rossum пише:
Sorry. I proposed this idea as a joke. math.π is useless, but mostly harmless. But I don't want to change Python grammar. The rule for Python identifiers is already not easy, there is no simple regular expression for them, and I'm sure most tools processing Python sources (even the tokenize module and IDLE) do not handle all Python identifiers correctly. For example they don't recognize the symbol ℘ (U+2118, SCRIPT CAPITAL P) as a valid identifier.

On Sat, Jun 03, 2017 at 03:51:50PM +0300, Serhiy Storchaka wrote:
They shouldn't, because it isn't a valid identifier: it's a Maths Symbol, not a letter, same as ∑ √ ∫ ∞ etc. https://en.wikipedia.org/wiki/Weierstrass_p
py> unicodedata.category('℘')
'Sm'
But Python 3.5 does treat it as an identifier!
py> ℘ = 1  # should be a SyntaxError ?
py> ℘
1
There's a bug here, somewhere, I'm just not sure where... The PEP for non-ASCII identifiers is quite old now (it was written for Unicode 4!) but it excludes category 'Sm' in its identifier algorithm: https://www.python.org/dev/peps/pep-3131/#id16 -- Steve

On Sun, Jun 04, 2017 at 02:36:50AM +1000, Steven D'Aprano wrote:
That appears to be the only Symbol Math character which is accepted as an identifier in Python 3.5:
py> import unicodedata
py> all_unicode = map(chr, range(0x110000))
py> symbols = [c for c in all_unicode if unicodedata.category(c) == 'Sm']
py> len(symbols)
948
py> ns = {}
py> for c in symbols:
...     try:
...         exec(c + " = 1", ns)
...     except SyntaxError:
...         pass
...     else:
...         print(c, unicodedata.name(c))
...
℘ SCRIPT CAPITAL P
py>
-- Steve

On Sun, Jun 4, 2017 at 2:48 AM, Steven D'Aprano <steve@pearwood.info> wrote:
Curious. And not specific to 3.5 - the exact same thing happens in 3.7. Here's the full category breakdown:
    import collections
    import unicodedata
    cats = collections.defaultdict(int)
    ns = {}
    for c in map(chr, range(1, 0x110000)):
        try:
            exec(c + " = 1", ns)
        except SyntaxError:
            pass
        except UnicodeEncodeError:
            # lone surrogates cannot be encoded; anything else is unexpected
            if unicodedata.category(c) != "Cs": raise
        else:
            cats[unicodedata.category(c)] += 1
defaultdict(<class 'int'>, {'Po': 1, 'Lu': 1702, 'Pc': 1, 'Ll': 2063, 'Lo': 112703, 'Lt': 31, 'Lm': 245, 'Nl': 236, 'Mn': 2, 'Sm': 1, 'So': 1})
For reference, as well as the 948 Sm, there are 1690 Mn and 5777 So, but only these characters are valid from them:
\u1885 Mn MONGOLIAN LETTER ALI GALI BALUDA
\u1886 Mn MONGOLIAN LETTER ALI GALI THREE BALUDA
℘ Sm SCRIPT CAPITAL P
℮ So ESTIMATED SYMBOL
2118 SCRIPT CAPITAL P and 212E ESTIMATED SYMBOL are listed in PropList.txt as Other_ID_Start, so they make sense. But that doesn't explain the two characters from category Mn. It also doesn't explain why U+309B and U+309C are *not* valid, despite being declared Other_ID_Start. Maybe it's a bug? Maybe 309B and 309C somehow got switched into 1885 and 1886?? ChrisA

On 03/06/17 20:41, Chris Angelico wrote:
\u1885 and \u1886 are categorised as letters (category Lo) by my Python 3.5. (Which makes sense, right?) If your system puts them in category Mn, that's bound to be a bug somewhere. As for \u309B and \u309C - it turns out this is a question of normalisation. PEP 3131 requires NFKC normalisation:
This is.... interesting. Thomas
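The quoted PEP text is missing from this archive, but the normalisation effect itself can be observed directly. A small sketch, assuming the compatibility decompositions in current Unicode data; the `results` dictionary is just a convenience for the example.

```python
import unicodedata

results = {}
for cp in (0x309B, 0x309C):
    ch = chr(cp)
    normalized = unicodedata.normalize("NFKC", ch)
    results[cp] = (normalized, ch.isidentifier())
    # Show the code points that the single character turns into.
    print(f"U+{cp:04X}", [f"U+{ord(c):04X}" for c in normalized], ch.isidentifier())
```

Both characters normalise to a SPACE followed by a combining mark, and a name that starts with a space cannot be an identifier.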

On 03/06/17 21:02, Thomas Jollans wrote:
Actually it turns out that these characters were changed to category Mn in Unicode 9.0, but remain in (X)ID_Start for compatibility. All is right with the world. (All of this just goes to show how much subtlety there is in the science that goes into making Unicode) See: http://www.unicode.org/reports/tr44/tr44-18.html#Unicode_9.0.0
-- Thomas Jollans m ☎ +31 6 42630259 e ✉ tjol@tjol.eu

On Sun, Jun 4, 2017 at 5:02 AM, Thomas Jollans <tjol@tjol.eu> wrote:
rosuav@sikorsky:~$ python3.7 -c "import unicodedata; print(unicodedata.unidata_version, unicodedata.category('\u1885'))"
9.0.0 Mn
rosuav@sikorsky:~$ python3.6 -c "import unicodedata; print(unicodedata.unidata_version, unicodedata.category('\u1885'))"
8.0.0 Lo
rosuav@sikorsky:~$ python3.5 -c "import unicodedata; print(unicodedata.unidata_version, unicodedata.category('\u1885'))"
8.0.0 Lo
rosuav@sikorsky:~$ python3.4 -c "import unicodedata; print(unicodedata.unidata_version, unicodedata.category('\u1885'))"
6.3.0 Lo
Is it possible that there's a discrepancy between the Unicode version used by the unicodedata module and the one used by the parser? ChrisA

On 03/06/17 18:48, Steven D'Aprano wrote:
This is actually not a bug in Python, but a quirk in Unicode. I've had a closer look at PEP 3131 [1], which specifies that Python identifiers follow the Unicode classes XID_Start and XID_Continue. ℘ is listed in the standard [2][3] as XID_Start, so Python correctly accepts it as an identifier.
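A quick way to see the difference between "general category" and "allowed to start an identifier" is `str.isidentifier()`, which follows the same XID_Start/XID_Continue rules as the compiler. A minimal sketch; ∞ is included only as a contrasting character that is not in XID_Start.

```python
import unicodedata

checks = {}
for ch in "℘℮∞":
    checks[ch] = ch.isidentifier()
    print(f"U+{ord(ch):04X}", unicodedata.name(ch), unicodedata.category(ch), checks[ch])
```

℘ and ℮ are symbols by category, yet they are accepted because the standard lists them explicitly as identifier start characters.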
℘ and ℮ are actually explicitly mentioned in the Unicode annex [3]:
I have no idea why U+309B and U+309C are not accepted as identifiers by Python 3.5. This could be a question of Python following an old version of the Unicode standard, or it *could* be a bug. Thomas [1] https://www.python.org/dev/peps/pep-3131/#specification-of-language-changes [2] http://www.unicode.org/Public/4.1.0/ucd/DerivedCoreProperties.txt [3] http://www.unicode.org/reports/tr31/

On Fri, Jun 2, 2017 at 7:29 PM, Guido van Rossum <guido@python.org> wrote: Are those characters not considered Unicode letters? Maybe we could add
their category to the allowed set?
Speaking of which, it would be convenient to be able to build strings with non-ascii characters using their Unicode codepoint name: greek_pi = "\u:greek_small_letter_pi" Or something like that. -- Juancarlo *Añez*
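For what it's worth, Python string literals already support something close to this through the `\N{...}` escape, and `unicodedata.lookup` does the same at run time. A short sketch; the variable names are illustrative.

```python
import unicodedata

# Named escape, resolved when the literal is compiled.
greek_pi = "\N{GREEK SMALL LETTER PI}"

# Run-time lookup by the same Unicode character name.
same_pi = unicodedata.lookup("GREEK SMALL LETTER PI")

print(greek_pi, same_pi, greek_pi == same_pi)
print(unicodedata.name("Γ"))
```

The escape uses the official character name with spaces rather than the underscore spelling suggested above.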

(Btw IPython just supports normal TeX notations like \pi, \lambda etc, so it is very easy to remember)
IPython dev here. I am the one who implemented (most of) that. We do support it, but it's not easy to remember unless you know how to write LaTeX and know said character. Question: how would you type the following: In [3]: ℸ = 1 Hint: it's easy, it's \CHARACTERNAME<tab>, but if you don't know how ℸ is named[1], you are screwed[3]. It's cute, it's compact, it's extremely useful for some internal code, but _exporting_ this as an interface is IMHO an extremely bad idea that hinders readability[2] and usability of the code. On Fri, Jun 2, 2017 at 4:29 PM, Guido van Rossum <guido@python.org> wrote:
Are those characters not considered Unicode letters? Maybe we could add their category to the allowed set?
+1 on allowing more math symbols and being more flexible on allowed identifiers though. In particular the ones mentioned above are part of mathematical operators[4]. It also would be great for some of these to be parsed as infix operators, but that's another topic :-) -- M [1] \daleth [2] and that's assuming your font supports said character. [3] Tab completion on full unicode character name does work as well so \GREEK SMALL LETTER GAMMA<tab> will give you γ. And \γ<tab> will expand to \gamma, so you can figure it out, but users still struggle for unknown symbols [4] http://www.fileformat.info/info/unicode/block/mathematical_operators/images.... On Fri, Jun 2, 2017 at 4:02 PM, Ivan Levkivskyi <levkivskyi@gmail.com> wrote:

On Sat, Jun 03, 2017 at 01:02:12AM +0200, Ivan Levkivskyi wrote:
In the last few months, I've been making a lot of use of the TI Nspire CAS calculator, and I think that there is very little benefit to allowing symbols like ∑ √ ∫ (sum, radical/root, integral) unless you have a proper 2-dimensional template system. There's not much, if any, benefit to writing:
∫(expression, lower_limit, upper_limit, name)
In fact, that's probably *harder* to read than
integrate(expression, lower_limit, upper_limit, name)
because the important thing, the fact that this is an integral, is barely visible. It's only a single character. That's not how mathematicians write it! If we had a 2D template system, like the Nspire, we could write what mathematicians do: (best viewed with a non-proportional font)
  b
 ⌠
 ⎮  3      2    1
 ⎮ x  + 2 x  − ─── dx
 ⎮              x
 ⌡
  a
I say "best", but of course even with a monospaced font, it still looks pretty awful. You really need a proper GUI interface and support for resizing characters. I'm not suggesting this be part of Python the language! But it might be a nice application written for users of Python, perhaps part of Sage or IPython/Jupyter or a GUI interface to SymPy. You don't need ∫ to be legal in identifiers for that. -- Steve

Steven D'Aprano wrote:
There's not much, if any, benefit to writing:
∫(expression, lower_limit, upper_limit, name)
More generally, there's a kind of culture clash between mathematical notation and programming notation. Mathematical notation tends to almost exclusively use single-character names, relying on different fonts and alphabets, and superscripts and subscripts, to get a large enough set of identifiers. Whereas in programming we use a much smaller alphabet and longer names. Having terse symbols for just a few things, and having to spell everything else out longhand, doesn't really help. -- Greg

On Fri, 2 Jun 2017 at 15:56 Guido van Rossum <guido@python.org> wrote:
I would love to show how easy it is to write from math import pi as π, gamma as Γ but I had to cheat by copying from the OP since I don't know how to type these (and even if you were to tell me how I'd forget tomorrow). So, I am still in favor of the rule "only ASCII in the stdlib".
Since this regularly comes up, why don't we add a note to the math module that you can do the above import(s) to bind various mathematical constants to their traditional symbol counterparts? The note can even start off with something like "While Python's standard library only uses ASCII characters to maximize ease of use and contribution, individuals are allowed to use various Unicode characters for variable names." This would also help with making sure people don't come back later and say, "why don't you just add the constants to the module?" -Brett
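The import-time aliasing discussed here already works in any Python 3 module without touching the stdlib. A minimal sketch follows; the `circle_area` helper is made up for illustration, and `tau` assumes Python 3.6 or later:

```python
from math import pi as π, gamma as Γ, tau as τ


def circle_area(radius):
    """Area of a circle, written with the traditional symbol."""
    return π * radius ** 2


print(circle_area(2.0))
print(Γ(5))          # Γ(n) equals (n - 1)! for positive integers
print(τ == 2 * π)    # tau is defined as twice pi
```

The aliases are local to the importing module, so readers of `math` itself still see only ASCII names.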

On Sat, 03 Jun 2017 17:45:43 +0000, Brett Cannon wrote:
[...]
Because in order to add that note to the math module, you have to violate the "only ASCII in the stdlib" rule. ;-) People who would benefit from seeing ϖ (or π) in their code will arrange to type it in proportion to that benefit (and probably already have). I know how to type those characters in my environments, but it might not be that easy if I had to do so on a random computer with a random keyboard running a random OS. Ob XKCD: https://xkcd.com/1806/ (my apologies if someone else already brought this up; I haven't been following along that closely). And I want to say something about this argument being like the one about being able to represent people's names correctly, but while the ratio between the circumference of a circle to its diameter has a name, it isn't a person. Dan

On 4 June 2017 at 05:02, Dan Sommers <dan@tombstonezero.net> wrote:
The ASCII-only restriction in the standard library is merely "all public APIs will use ASCII-only identifiers", rather than "We don't allow the use of Unicode anywhere" (Several parts of the documentation would be rather unreadable if they were restricted to ASCII characters). However, clarifying that made me realise we've never actually written that down anywhere - it's just been an assumed holdover from the fact that Python 2.7 is still being supported, and doesn't allow for Unicode identifiers in the first place. https://github.com/python/peps/pull/285 is a PR to explicitly document the standard library API restriction in the "Names to Avoid" part of PEP 8. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

On 5 June 2017 at 07:00, Guido van Rossum <gvanrossum@gmail.com> wrote:
AFAIK it was in whatever PEP introduced Unicode identifiers.
Ah, indeed it is: https://www.python.org/dev/peps/pep-3131/#policy-specification Interestingly, that's stricter than my draft PR for PEP 8, and I'm not entirely sure we follow the "string literals and comments must be in ASCII" part in its entirety:
============
All identifiers in the Python standard library MUST use ASCII-only identifiers, and SHOULD use English words wherever feasible (in many cases, abbreviations and technical terms are used which aren't English). In addition, string literals and comments must also be in ASCII. The only exceptions are (a) test cases testing the non-ASCII features, and (b) names of authors. Authors whose names are not based on the Latin alphabet MUST provide a Latin transliteration of their names.
============
That said, all the potential counter-examples that come to mind are in the documentation, but *not* in the corresponding docstrings (e.g. the Euro symbol used in the docs for chr() and ord()). Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
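One way the "ASCII-only identifiers" part of that policy could be checked mechanically is sketched below. This is only an illustration, not a tool the stdlib actually uses; the `non_ascii_names` helper is hypothetical, and `str.isascii` assumes Python 3.7 or later:

```python
import io
import tokenize


def non_ascii_names(source):
    """Return (line_number, name) pairs for NAME tokens containing non-ASCII characters."""
    found = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type == tokenize.NAME and not tok.string.isascii():
            found.append((tok.start[0], tok.string))
    return found


sample = "from math import pi as π\narea = π * 2 ** 2\n"
print(non_ascii_names(sample))
```

Comments and string literals are not inspected here; covering them would mean also looking at `COMMENT` and `STRING` tokens.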

I think the strictness comes from the observation that the stdlib is read and edited using *lots* of different tools and not every tool is with the program. That argument may be weaker now than when that PEP was written, but I still get emails and see websites with mojibake. (Most recently, the US-PyCon badges had spaces for all non-ASCII letters.) The argument is also weaker for comments than it is for identifiers, since stdlib identifiers will be used through *even more* tools (anyone who uses a name imported from stdlib). Docstrings are perhaps halfway in between. On Sun, Jun 4, 2017 at 8:33 PM, Nick Coghlan <ncoghlan@gmail.com> wrote:
-- --Guido van Rossum (python.org/~guido)

On 6/1/17, Serhiy Storchaka <storchaka@gmail.com> wrote:
My humble opinion: I would rather see something like: from some_wide_used_scientific_library.math_constants import * gain good acceptance among scientific users before thinking about adding it to the stdlib. PS. Maybe this could be interesting for some vim users -> http://gnosis.cx/bin/.vim/after/syntax/python.vim (I am not the author of this file). If you apply it in vim, then vim shows lines (other than the one the cursor is on) "translated". That means, for example, that math.pi is displayed as π. So you still edit "math.pi", but when you move the cursor off that line you see the formula simplified/prettified (or more complicated, because you need to train your brain a little to accept the new view). I am pretty skeptical about how popular this conceal technique could be in the vim Pythonista community! (Why skeptical? For example, I am testing whether this technique improves the readability of a line similar to self.a = self.b + self.c, which vim shows as ᐠa = ᐠb + ᐠc, but **I am not sure if it is really useful**. I should probably add that I edit code much more in another editor than in vim. ᐠ is U+1420 CANADIAN SYLLABICS FINAL GRAVE.)

This is a horrible thing that nobody else should do :-), but I *am* the author of the file linked by Pavol. It's based on someone else's version (credited in the file), but I fine-tuned it for what I want. I'm also not the author of the conceal plugin for vim, which is pretty much exactly for just this. The screenshot attached is a little bit of a vim session editing a Python file. The key thing is that I *type* only regular ASCII characters; I just tell vim to show me something different on lines I am not currently editing. On Fri, Jun 2, 2017 at 11:26 PM, Pavol Lisy <pavol.lisy@gmail.com> wrote:
-- Keeping medicines from the bloodstreams of the sick; food from the bellies of the hungry; books from the hands of the uneducated; technology from the underdeveloped; and putting advocates of freedom in prisons. Intellectual property is to the 21st century what the slave trade was to the 16th.

I use the conceal feature for this purpose, but I use https://github.com/ehamberg/vim-cute-python instead, which is a more toned-down version of the same idea. I personally find x ≤ y, a ∉ B more readable than x <= y, a not in B etc. Especially the conceal lambda => λ is useful for de-cluttering the code. Stephan 2017-06-03 8:50 GMT+02:00 David Mertz <mertz@gnosis.cx>:

Agree with this, but note that a similar proposal has once been made to scipy. It was rejected for now since they still also target Python 2. So once scipy drops Python 2 support (presumably around 2020 at the latest), this could be re-proposed there. Stephan 2017-06-03 8:26 GMT+02:00 Pavol Lisy <pavol.lisy@gmail.com>:

Hi, What you are think about adding Unicode aliases for some mathematic names in the math module? ;-) math.π = math.pi math.τ = math.tau math.Γ = math.gamma math.ℯ = math.e Unfortunately we can't use ∞, ∑ and √ as identifiers. :-(
It may be the role of editors to do it if one really wants it. I personally use the `pretty-mode` Emacs mode; it replaces pi, sum, sqrt, None (∅), tau, gamma, but not math.e obviously. It replaces some other strings like lambda (λ x: ...), and is customizable (I added ∈ ∉ for "in" and "not in"). It only "messes" with the visual number of characters in a line, but flake8 sees the correct Python through flycheck, so it reports errors properly, and you're still notified when you're more than 80 columns long, even if visually there may be a longer valid line. Oh, and it also messes with copying and pasting to share: I found myself yesterday pasting a "prettified" line of code on IRC and it (legitimately) surprised someone. -- Julien Palard https://mdk.fr

How do you write π (pi) with a keyboard on Windows, Linux or macOS?
On macOS, ⌥ P (Option-P) works. On all platforms: 1. Make sure you are using Vim. 2. In insert mode: Ctrl-K *p You can also define abbrev's which will allow you to type pi\ and it gets replaced by π. See: https://gist.github.com/stephanh42/fc466e62bfb022a890ff2c4643eaf3a5 I presume Emacs can do something similar. Or you get this keyboard: https://imgur.com/gallery/tCNvP ;-) Stephan 2017-06-01 11:49 GMT+02:00 Victor Stinner <victor.stinner@gmail.com>:

This shouldn't be a problem for Greek users. ;-)
Well, they still need to switch between keymaps, since presumably they used the Latin keymap to enter `math.` before they can enter π. That is actually another general solution: just install the Greek keymap in addition to your native keymap. The OS typically provides keyboard shortcuts to switch between keymaps. Υεσ Ι καν υσε Γρεεκ καρακτερσ! OK, that works. Stephan 2017-06-01 12:14 GMT+02:00 Serhiy Storchaka <storchaka@gmail.com>:

On Thu, Jun 01, 2017 at 11:49:43AM +0200, Victor Stinner wrote:
How do you write π (pi) with a keyboard on Windows, Linux or macOS?
Use the compose key 🙌
for linux: https://help.ubuntu.com/community/ComposeKey
for windows: https://github.com/SamHocevar/wincompose
for macosx: http://lol.zoy.org/blog/2012/06/17/compose-key-on-os-x
Then I wrote my own ~/.XCompose file with:
<Multi_key> <asterisk> <p> : "π" U03C0 # GREEK SMALL LETTER PI
so it's like the vim digraphs. Cheers, -- zmo

On Jun 01 2017, Victor Stinner <victor.stinner-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
Under Linux, you'd use the Compose facility. Take a look at e.g. /usr/share/X11/locale/en_US.UTF-8/Compose for all the nice things it lets you enter:
$ egrep '[πτΓ]' /usr/share/X11/locale/en_US.UTF-8/Compose
<dead_greek> <G> : "Γ" U0393 # GREEK CAPITAL LETTER GAMMA
<dead_greek> <p> : "π" U03C0 # GREEK SMALL LETTER PI
<dead_greek> <t> : "τ" U03C4 # GREEK SMALL LETTER TAU
Best, -Nikolaus -- GPG Fingerprint: ED31 791B 2C5C 1613 AF38 8B8A D113 FCAC 3C4E 599F »Time flies like an arrow, fruit flies like a Banana.«

On Fri, Jun 2, 2017 at 7:52 AM, Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
I don't have a strong opinion about it being in the stdlib, but I'd also point out that a strong advantage to having these defined in a module at all is that third-party interpreters (e.g. IPython, bpython, some IDEs) that support tab-completion make these easy to type as well, and I find them to be very readable for math-heavy code.

Hi Masayuki, I admit that my understanding of this issue is very limited. Nevertheless, I would like to point out that the encoding assumed for a Python3 source file never depends on the locale. My understanding is that in the default encoding for Python source files (utf-8), East Asian Ambiguous characters must be assumed narrow. Now there are also legacy encodings where they are fullwidth. But it is always determined by the encoding, which in turn is specified or implied in the source file. So I don't actually see an issue here. Am I missing something? Stephan On 1 Jun 2017 at 16:08, "Masayuki YAMAMOTO" <ma3yuki.8mamo10@gmail.com> wrote:
The width of Greek letters is East Asian Ambiguous. Using ambiguous-width characters could cause source code layout to break in some locales. Masayuki 2017-06-01 15:47 GMT+09:00 Serhiy Storchaka <storchaka@gmail.com>:
_______________________________________________ Python-ideas mailing list Python-ideas@python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/

Hi Stephan,
Nevertheless, I would like to point out that the encoding assumed for a Python3 source file never depends on the locale.
Yes, as you pointed out. I'd like to correct what I said.
The ambiguous-width mapping is defined with East Asian legacy encodings and non-East Asian legacy encodings in mind, and is not recommended for UTF-8 and other Unicode encodings. By default, ambiguous-width characters are displayed as narrow, regardless of encoding. [*] Let me see... Several programs have a setting that switches ambiguous-width characters between halfwidth and fullwidth regardless of encoding (e.g. gnome-terminal, vim). Also, some fonts used in East Asia draw Greek letters and other symbols as fullwidth glyphs, which breaks the layout under halfwidth settings. It is possible to avoid these fonts and use a multi-language font instead, but symbols that are used only in East Asia have no halfwidth glyph regardless of the ambiguous-width setting. Therefore, when working in an East Asian language, it is difficult to get Greek letters displayed as halfwidth. Regards, Masayuki [*] http://unicode.org/reports/tr11/#Recommendations
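The East Asian Width property being discussed can be inspected from Python with `unicodedata.east_asian_width`. A small sketch, where the `is_ambiguous_width` helper is made up for illustration:

```python
import unicodedata


def is_ambiguous_width(ch):
    """True if the character's East Asian Width property is 'A' (Ambiguous)."""
    return unicodedata.east_asian_width(ch) == "A"


for ch in "pπτΓ":
    # Prints the character, its width class, and its Unicode name.
    print(ch, unicodedata.east_asian_width(ch), unicodedata.name(ch))
```

The Greek letters from the proposal report the Ambiguous class, while an ASCII letter such as `p` reports `Na` (Narrow). How an "A" character is actually drawn still depends on the terminal, editor and font settings described above.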

I'm slightly confused as to what you mean, but here goes: So you're saying that:
- Glyphs like pi have an ambiguous width.
- Most text editors/terminals let you choose between halfwidth (roughly normal monospace width?) and fullwidth (double the size).
- However, many East Asian fonts do NOT have halfwidth support.
Is this correct? -- Ryan (ライアン) Yoko Shimomura > ryo (supercell/EGOIST) > Hiroyuki Sawano >> everyone else http://refi64.com

Yes, it's correct. I'd show you a link to vim help for ambiguous width setting. http://vimdoc.sourceforge.net/htmldoc/options.html#'ambiwidth' Masayuki 2017-06-02 5:05 GMT+09:00 Ryan Gonzalez <rymg19@gmail.com>:

On Thu, Jun 1, 2017, at 10:08, Masayuki YAMAMOTO wrote:
I don't think PEP 8 approves anyway of doing the kind of column alignment that this (or for that matter proportional fonts) would break. One example is specifically called out as a "pet peeve". Of course, it also doesn't exactly approve of non-ASCII identifiers (PEP 3131 specifically forbids them in the standard library).

On Fri, Jun 2, 2017 at 2:40 AM, Random832 <random832@fastmail.com> wrote:
PEP 8 has nothing against non-ASCII identifiers where they make sense. The Py3 grammar was changed to be full Unicode specifically to permit that sort of thing. Personally, I would continue to use math.pi because it's easier to type *on my keyboard* than something involving letters I have to compose, but it may well be different for someone who already shifts keyboard from Latin to Greek regularly. Regardless, the stdlib does, as you say, avoid non-ASCII. ChrisA

On Thu, Jun 1, 2017 at 9:47 AM, Serhiy Storchaka <storchaka@gmail.com> wrote:
If this were to happen, I would only add π and forget about the others. It is the only one that is nearly 100 percent unambiguous. And seeing that in code or listed in math or builtins might have a nice wow factor to some. If π were in builtins, it might actually be useful as being more readable and faster to type than math.pi or np.pi. As math.π, I'm not sure it's worth it, although less harmful than math.tau. In IPython/Jupyter, you can type \pi + tab, and you'll get π. This even works on command line! -- Koos (mobile)

On Thu, Jun 1, 2017 at 8:47 AM, Serhiy Storchaka <storchaka@gmail.com> wrote:
* it duplicates functionality
* I have no idea how to write those chars on Linux; if I did, I'm not sure it'd be the same on OSX and Windows (probably not)
* duplicated aliases might make sense if they add readability; in this case they don't unless (maybe) you have a mathematical background. I can infer what "math.gamma" stands for but not being a mathematician math.Γ makes absolutely zero sense to me.
* if you really want to do that you can simply do "from math import gamma as Γ" but it's something I wouldn't like if I were to read your code
* I generally dislike any non-ASCII API; the fact that Python 3 allows you to do that should not be an incentive to promote such habit in the stdlib or anywhere else except in the end-user code, and it's something I still wouldn't like it except if in comments or docstrings
-1 -- Giampaolo - http://grodola.blogspot.com

On 2 June 2017 at 12:17, Giampaolo Rodola' <g.rodola@gmail.com> wrote:
[...] * duplicated aliases might make sense if they add readability; in this case they don't unless (maybe) you have a mathematical background. I can infer what "math.gamma" stands for but not being a mathematician math.Γ makes absolutely zero sense to me.
There is a significant number of scientific Python programmers (21% according to PyCharm 2016), so it is not that rare to meet someone who knows what the Gamma function is. And for many of them π is much more readable than np.pi. Also, there is another problem: confusion between the Gamma function and the Euler–Mascheroni constant. The first one is Γ, the second one is γ (perfectly opposite to PEP 8 capitalization rules :-), while both of them are frequently denoted as just gamma (in particular, math.gamma follows the PEP 8 rules, but is counter-intuitive for most scientists). All that said, I agree that these problems are easily solved by a custom import from. Still, there is something in (or related to?) this proposal that I think is worth considering: can we also allow identifiers like ∫ or √? This would make many expressions more similar to usual TeX, plus it would be useful for projects like SymPy. -- Ivan

I would love to show how easy it is to write from math import pi as π, gamma as Γ but I had to cheat by copying from the OP since I don't know how to type these (and even if you were to tell me how I'd forget tomorrow). So, I am still in favor of the rule "only ASCII in the stdlib". On Fri, Jun 2, 2017 at 3:48 PM, Ivan Levkivskyi <levkivskyi@gmail.com> wrote:
-- --Guido van Rossum (python.org/~guido)

On 3 June 2017 at 00:55, Guido van Rossum <guido@python.org> wrote:
[...] So, I am still in favor of the rule "only ASCII in the stdlib".
But what about the other question? Currently, integral, sum, infinity, square root etc. Unicode symbols are all prohibited in identifiers. Is it possible to allow them? (Btw IPython just supports normal TeX notations like \pi, \lambda etc, so it is very easy to remember) -- Ivan

On 3 June 2017 at 01:29, Guido van Rossum <guido@python.org> wrote:
Are those characters not considered Unicode letters? Maybe we could add their category to the allowed set?
Yes, they are not considered letters; they are in category Sm. Unfortunately, +, -, |, and other symbols that clearly should not be in identifiers are also in this category, so we cannot add the whole category. It is possible to include particular ranges, but there should be a discussion about what exactly can/should be included. -- Ivan

On 6/2/2017 7:56 PM, Ivan Levkivskyi wrote:
I presume that is Symbol - math.
Having to test ranges will slow down identifier recognition.
but there should be a discussion about what exactly can/should be included.
I believe the current python definition of 'identifier' is taken from the Unicode Standard for default identifiers. Any change would have to be propagated to regex engines, IDEs, and anything else that parses python. I suggest that you ask Martin Loewis for his opinion on changing the identifier definition. -- Terry Jan Reedy

On Fri, Jun 02, 2017 at 04:29:16PM -0700, Guido van Rossum wrote:
Are those characters not considered Unicode letters? Maybe we could add their category to the allowed set?
They're not letters:
py> {unicodedata.category(c) for c in '∑√∫∞'}
{'Sm'}
That's Symbol, Math. One problem is that the 'Sm' category includes a whole lot of mathematical symbols that we probably don't want in identifiers:
∴ ∣ ≈ ≒ ≝ ≫ ≮ ⊞
(plus MANY more variations on = < and > operators) including some "Confusables":
∁ ∊ ∨ ∗ ∑ etc
C ε v * Σ
http://www.unicode.org/reports/tr39/
Of course a language can define identifiers however it likes, but I think it is relevant that the Unicode Consortium's default algorithm for determining an identifier excludes Sm. http://www.unicode.org/reports/tr31/ I also disagree with Ivan that these symbols would be particularly useful in general, even for maths-heavy code, although I wouldn't say no to special casing ∞ (infinity) and maybe √ as a unary square root operator. -- Steve
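The letter-versus-symbol split described above can be checked directly with `str.isidentifier`. A short sketch; the `report` dictionary is just an illustrative name:

```python
import unicodedata

# Map each character to (general category, whether Python accepts it as an identifier).
report = {ch: (unicodedata.category(ch), ch.isidentifier()) for ch in "ΣπΓ∑√∫∞"}

for ch, (category, ok) in report.items():
    print(ch, category, "identifier" if ok else "not an identifier")
```

This makes the Σ (GREEK CAPITAL LETTER SIGMA) versus ∑ (N-ARY SUMMATION) confusable pair concrete: the two look alike, but only the first is a letter and therefore a valid name.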

OK, I think this discussion is pretty much dead then. We definitely shouldn't allow math operators in identifiers, otherwise in Python 4 or 5 we couldn't introduce them as operators. On Fri, Jun 2, 2017 at 5:10 PM, Steven D'Aprano <steve@pearwood.info> wrote:
-- --Guido van Rossum (python.org/~guido)

On Sat, Jun 3, 2017 at 3:42 PM, Pavol Lisy <pavol.lisy@gmail.com> wrote:
No, because operators need to be defined before you get to individual objects, and they need precedence and associativity. So it'd have to be defined at the compiler level. Also, having arbitrary operators gets extremely confusing. It's not easy to reason about code when you don't know what's even an operator. Not a stupid question, but one for which the answer is "definitely not like that". ChrisA

For reference, Haskell is perhaps the closest language to providing arbitrary infix operators: symbolic operators can be defined directly, and a named function can be used infix only if it is surrounded by backticks. That is, A `op` B is equivalent to op(A, B). That doesn't work for Python (backtick is taken) and I don't think anything similar is a good idea. On Sat, Jun 3, 2017 at 1:56 AM Chris Angelico <rosuav@gmail.com> wrote:

Hi Joshua,
This can of course be faked in Python. https://gist.github.com/stephanh42/a4d6d66b10cfecf935c9531150afb247 Now you can do:
========
@BinopCallable
def add(x, y):
    return x + y
print(3 @add@ 5)
===========
Stephan 2017-06-03 7:59 GMT+02:00 Joshua Morton <joshua.morton13@gmail.com>:
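The linked gist is not reproduced in this thread, so the following is only a guess at how such a decorator could work, not Stephan's actual code. The `_LeftBound` helper class is hypothetical. The trick relies on `3 @add@ 5` parsing as `(3 @ add) @ 5` and on `int` not implementing `@` itself:

```python
class _LeftBound:
    """Intermediate object produced by `left @ f`; waits for the right operand."""

    def __init__(self, func, left):
        self.func = func
        self.left = left

    def __matmul__(self, right):
        return self.func(self.left, right)


class BinopCallable:
    """Wrap a two-argument function so that `x @f@ y` calls f(x, y)."""

    def __init__(self, func):
        self.func = func

    def __call__(self, *args, **kwargs):
        return self.func(*args, **kwargs)

    def __rmatmul__(self, left):
        # Called for `left @ self` when `left` does not handle @ itself.
        return _LeftBound(self.func, left)


@BinopCallable
def add(x, y):
    return x + y


print(3 @add@ 5)
```

A left operand that defines its own `__matmul__` (a NumPy array, for instance) would get the first chance to handle `@`, so this sketch is not a general solution.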

Julia lets you define new infix operators directly, including using mathematical symbols as operators. Not that I think that is a good idea, but you can do it.

On 6/3/17, Chris Angelico <rosuav@gmail.com> wrote:
Thanks for clarifying this point. Sorry for another stupid question: coding import machinery couldn't be used either, right? (I mean something like hylang.org.) Because the AST could not represent these operators (and their precedence and associativity)? BTW there could also be a question about "multipliability". I mean something like a↑↑↑n (see https://en.wikipedia.org/wiki/Knuth%27s_up-arrow_notation)
Also, having arbitrary operators gets extremely confusing. It's not easy to reason about code when you don't know what's even an operator.
I was thinking about it, but Python is already like this! You can't really be sure what the + operator is doing! :) And it could be much easier to learn what some operator means in some library than, for example, to understand the async paradigm (at least for some people).
Not a stupid question, but one for which the answer is "definitely not like that".
Thanks again! :) Although I am not sure it is definitely impossible, I see that it is pretty difficult.

On 3 June 2017 at 17:22, Pavol Lisy <pavol.lisy@gmail.com> wrote:
Source translation frontends *can* define new in-fix operators, but they need to translate them into explicit method and/or function calls before they reach the AST. So a frontend that added "A @ B" support to Python 2.7 (for example), would need to translate it into something like "numpy.dot(A, B)" or "matmul(A, B)" at the Python AST level. It would then be up to that function to emulate Python 3's __matmul__ special method support (or not, if the frontend was happy to only support a particular type, such as NumPy arrays) Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
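The rewriting step described above, turning an operator expression into an explicit function call, can be sketched with the `ast` module. This is an analogy rather than a real frontend: Python 3 already parses `@`, so the rewrite here happens on the AST instead of on source text, and the `matmul` function is a hypothetical stand-in that computes a plain dot product:

```python
import ast


class MatMulToCall(ast.NodeTransformer):
    """Rewrite every `left @ right` into the call `matmul(left, right)`."""

    def visit_BinOp(self, node):
        self.generic_visit(node)  # rewrite nested expressions first
        if isinstance(node.op, ast.MatMult):
            call = ast.Call(
                func=ast.Name(id="matmul", ctx=ast.Load()),
                args=[node.left, node.right],
                keywords=[],
            )
            return ast.copy_location(call, node)
        return node


def matmul(a, b):
    """Stand-in implementation: dot product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))


tree = ast.parse("result = [1, 2, 3] @ [4, 5, 6]")
tree = ast.fix_missing_locations(MatMulToCall().visit(tree))
namespace = {"matmul": matmul}
exec(compile(tree, "<translated>", "exec"), namespace)
print(namespace["result"])
```

Lists do not support `@`, so the original line would raise a TypeError; after the rewrite it runs because only the function call remains.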

On 3 June 2017 at 15:55, Chris Angelico <rosuav@gmail.com> wrote:
A useful background read on this question specifically in the context of Python is PEP 465 (which added A@B for matrix multiplication), and in particular its discussion of the rejected alternatives: https://www.python.org/dev/peps/pep-0465/#rejected-alternatives-to-adding-a-... For most purposes, the existing set of operators is sufficient, since we can alias them for unusual purposes (e.g. "/" for path joining in pathlib) when we don't need access to the more conventional meaning (division in that case, since "dividing one path segment by another" is nonsensical) and context makes it possible for the reader to understand what is going on ("filepath = segment1 / segment2 / segment3" looks a lot like writing out a filesystem path as a string and the name of the assignment target makes it clear this is a filesystem path operation, not a division operation). Matrix multiplication turned out to be a genuine exception, since all the other binary operators had well defined meanings as elementwise-operators, so borrowing one of them for matrix multiplication meant losing access to the corresponding elementwise operation, and there typically *weren't* enough hints from the context to let you know whether "*" was by element or the dot product. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
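The pathlib aliasing mentioned above, and the mechanism behind it, can be illustrated briefly. The `Segment` class is a hypothetical toy written for this sketch; only `PurePosixPath` is real stdlib API:

```python
from pathlib import PurePosixPath

# pathlib reuses "/" for joining because division of paths has no meaning.
filepath = PurePosixPath("/usr") / "lib" / "python3"
print(filepath)


class Segment:
    """Toy class that borrows "/" for joining via __truediv__."""

    def __init__(self, text):
        self.text = text

    def __truediv__(self, other):
        other_text = other.text if isinstance(other, Segment) else str(other)
        return Segment(self.text.rstrip("/") + "/" + other_text)


joined = Segment("home") / "user" / Segment("notes.txt")
print(joined.text)
```

The same approach fails for matrices precisely because `*` on arrays already has a useful elementwise meaning that callers still need.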

03.06.17 03:31, Guido van Rossum пише:
Sorry. I proposed this idea as a joke. math.π is useless, but mostly harmless. But I don't want to change Python grammar. The rule for Python identifiers already is not easy, there is no simple regular expression for them, and I'm sure most tools processing Python sources (even the tokenize module and IDLE) do not handle all Python identifiers correctly. For example they don't recognize the symbol ℘ (U+2118, SCRIPT CAPITAL P) as a valid identifier.
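The gap between "looks like a word" and "is a Python identifier" can be shown with a naive regular expression. The pattern below is an assumption about what a simple tool might use, not the pattern of any particular tool:

```python
import re

# A tempting but incomplete approximation: one non-digit word character,
# followed by any number of word characters.
NAIVE_IDENTIFIER = re.compile(r"[^\W\d]\w*")


def regex_says_identifier(name):
    return NAIVE_IDENTIFIER.fullmatch(name) is not None


for name in ["pi", "π", "℘", "∑"]:
    # Columns: name, what Python itself says, what the naive regex says.
    print(name, name.isidentifier(), regex_says_identifier(name))
```

The two checks agree on `pi`, `π` and `∑`, but disagree on ℘: Python accepts it as a name while `\w` does not match it.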

On Sat, Jun 03, 2017 at 03:51:50PM +0300, Serhiy Storchaka wrote:
They shouldn't, because it isn't a valid identifier: it's a Maths Symbol, not a letter, same as ∑ √ ∫ ∞ etc. https://en.wikipedia.org/wiki/Weierstrass_p
py> unicodedata.category('℘')
'Sm'
But Python 3.5 does treat it as an identifier!
py> ℘ = 1 # should be a SyntaxError ?
py> ℘
1
There's a bug here, somewhere, I'm just not sure where... The PEP for non-ASCII identifiers is quite old now (it was written for Unicode 4!) but it excludes category 'Sm' in its identifier algorithm: https://www.python.org/dev/peps/pep-3131/#id16 -- Steve

On Sun, Jun 04, 2017 at 02:36:50AM +1000, Steven D'Aprano wrote:
That appears to be the only Symbol Math character which is accepted as an identifier in Python 3.5:

py> import unicodedata
py> all_unicode = map(chr, range(0x110000))
py> symbols = [c for c in all_unicode if unicodedata.category(c) == 'Sm']
py> len(symbols)
948
py> ns = {}
py> for c in symbols:
...     try:
...         exec(c + " = 1", ns)
...     except SyntaxError:
...         pass
...     else:
...         print(c, unicodedata.name(c))
...
℘ SCRIPT CAPITAL P
py>

-- Steve

On Sun, Jun 4, 2017 at 2:48 AM, Steven D'Aprano <steve@pearwood.info> wrote:
Curious. And not specific to 3.5 - the exact same thing happens in 3.7. Here's the full category breakdown:

import collections
import unicodedata

cats = collections.defaultdict(int)
ns = {}
for c in map(chr, range(1, 0x110000)):
    try:
        exec(c + " = 1", ns)
    except SyntaxError:
        pass
    except UnicodeEncodeError:
        # Lone surrogates cannot be encoded; anything else is unexpected.
        if unicodedata.category(c) != "Cs":
            raise
    else:
        cats[unicodedata.category(c)] += 1

defaultdict(<class 'int'>, {'Po': 1, 'Lu': 1702, 'Pc': 1, 'Ll': 2063, 'Lo': 112703, 'Lt': 31, 'Lm': 245, 'Nl': 236, 'Mn': 2, 'Sm': 1, 'So': 1})

For reference, as well as the 948 Sm, there are 1690 Mn and 5777 So, but only these characters are valid from them:

\u1885 Mn MONGOLIAN LETTER ALI GALI BALUDA
\u1886 Mn MONGOLIAN LETTER ALI GALI THREE BALUDA
℘ Sm SCRIPT CAPITAL P
℮ So ESTIMATED SYMBOL

2118 SCRIPT CAPITAL P and 212E ESTIMATED SYMBOL are listed in PropList.txt as Other_ID_Start, so they make sense. But that doesn't explain the two characters from category Mn. It also doesn't explain why U+309B and U+309C are *not* valid, despite being declared Other_ID_Start. Maybe it's a bug? Maybe 309B and 309C somehow got switched into 1885 and 1886??

ChrisA
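For readers who want to reproduce this breakdown without exec, here is a minimal sketch. It assumes that CPython's str.isidentifier() applies the same XID_Start rule to a one-character string as the parser does to a one-character name. The helper name identifier_start_categories is made up for illustration, and the counts will differ with the Unicode version your interpreter ships.

```python
import collections
import sys
import unicodedata


def identifier_start_categories():
    """Count, per Unicode general category, the code points that can
    stand alone as a Python identifier (hypothetical helper)."""
    counts = collections.Counter()
    for codepoint in range(sys.maxunicode + 1):
        char = chr(codepoint)
        # str.isidentifier() never raises, so lone surrogates (category Cs)
        # need no special handling here, unlike the exec-based loop.
        if char.isidentifier():
            counts[unicodedata.category(char)] += 1
    return counts


if __name__ == "__main__":
    print("Unicode data version:", unicodedata.unidata_version)
    for category, count in sorted(identifier_start_categories().items()):
        print(category, count)
```

Checking single characters this way only tells you what may *start* an identifier; characters that are valid only in later positions (XID_Continue) are not counted.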

On 03/06/17 20:41, Chris Angelico wrote:
\u1885 and \u1886 are categorised as letters (category Lo) by my Python 3.5. (Which makes sense, right?) If your system puts them in category Mn, that's bound to be a bug somewhere. As for \u309B and \u309C - it turns out this is a question of normalisation. PEP 3131 requires NFKC normalisation:
This is.... interesting. Thomas
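The normalisation effect can be seen directly with the unicodedata module. The following is a small sketch, not part of the original message: it normalises a few of the characters discussed in this thread to NFKC and then shows that assigning to ℯ (U+212F) actually binds the plain name "e". The variable names samples and namespace are illustrative only.

```python
import unicodedata

# Characters discussed in this thread; the list itself is illustrative.
samples = ["\u212f", "\u309b", "\u309c", "\u2118"]

for char in samples:
    normalized = unicodedata.normalize("NFKC", char)
    print(
        "U+%04X" % ord(char),
        unicodedata.name(char),
        "->",
        ["U+%04X" % ord(c) for c in normalized],
        "isidentifier:", normalized.isidentifier(),
    )

# The parser normalises identifiers to NFKC, so this binds the name "e".
namespace = {}
exec("\u212f = 1", namespace)
print("e" in namespace, "\u212f" in namespace)
```

U+309B and U+309C each normalise to a space followed by a combining mark, which cannot form an identifier, while ℘ (U+2118) is unchanged by NFKC.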

On 03/06/17 21:02, Thomas Jollans wrote:
Actually it turns out that these characters were changed to category Mn in Unicode 9.0, but remain in (X)ID_Start for compatibility. All is right with the world. (All of this just goes to show how much subtlety there is in the science that goes into making Unicode) See: http://www.unicode.org/reports/tr44/tr44-18.html#Unicode_9.0.0
-- Thomas Jollans m ☎ +31 6 42630259 e ✉ tjol@tjol.eu

On Sun, Jun 4, 2017 at 5:02 AM, Thomas Jollans <tjol@tjol.eu> wrote:
rosuav@sikorsky:~$ python3.7 -c "import unicodedata; print(unicodedata.unidata_version, unicodedata.category('\u1885'))"
9.0.0 Mn
rosuav@sikorsky:~$ python3.6 -c "import unicodedata; print(unicodedata.unidata_version, unicodedata.category('\u1885'))"
8.0.0 Lo
rosuav@sikorsky:~$ python3.5 -c "import unicodedata; print(unicodedata.unidata_version, unicodedata.category('\u1885'))"
8.0.0 Lo
rosuav@sikorsky:~$ python3.4 -c "import unicodedata; print(unicodedata.unidata_version, unicodedata.category('\u1885'))"
6.3.0 Lo

Is it possible that there's a discrepancy between the Unicode version used by the unicodedata module and the one used by the parser?

ChrisA
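One way to look at that question from inside a single interpreter is sketched below. It prints the Unicode data version next to the general category and the identifier check for U+1885. It assumes str.isidentifier() reflects the same tables the parser uses, which this thread does not confirm.

```python
import sys
import unicodedata

char = "\u1885"  # MONGOLIAN LETTER ALI GALI BALUDA

print("Python:", sys.version_info[:2])
print("unicodedata version:", unicodedata.unidata_version)
print("general category:", unicodedata.category(char))
# A category of Mn combined with True here means the character is kept as
# an identifier start by a compatibility property, not by its category.
print("accepted as identifier:", char.isidentifier())
```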

On 03/06/17 18:48, Steven D'Aprano wrote:
This is actually not a bug in Python, but a quirk in Unicode. I've had a closer look at PEP 3131 [1], which specifies that Python identifiers follow the Unicode classes XID_Start and XID_Continue. ℘ is listed in the standard [2][3] as XID_Start, so Python correctly accepts it as an identifier.
℘ and ℮ are actually explicitly mentioned in the Unicode annex [3]:
I have no idea why U+309B and U+309C are not accepted as identifiers by Python 3.5. This could be a question of Python following an old version of the Unicode standard, or it *could* be a bug. Thomas [1] https://www.python.org/dev/peps/pep-3131/#specification-of-language-changes [2] http://www.unicode.org/Public/4.1.0/ucd/DerivedCoreProperties.txt [3] http://www.unicode.org/reports/tr31/

On Fri, Jun 2, 2017 at 7:29 PM, Guido van Rossum <guido@python.org> wrote:
Are those characters not considered Unicode letters? Maybe we could add their category to the allowed set?
Speaking of which, it would be convenient to be able to build strings with non-ascii characters using their Unicode codepoint name: greek_pi = "\u:greek_small_letter_pi" Or something like that. -- Juancarlo *Añez*
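For comparison, Python 3 string literals already accept the \N{...} escape with the official Unicode character name, and unicodedata.lookup() does the same at run time. A short sketch (the variable names are only for illustration):

```python
import unicodedata

# Named escape, resolved when the literal is compiled.
greek_pi = "\N{GREEK SMALL LETTER PI}"
print(greek_pi)

# Run-time lookup by name, and the reverse mapping.
gamma = unicodedata.lookup("GREEK CAPITAL LETTER GAMMA")
print(gamma, unicodedata.name(gamma))
```

The names use spaces rather than underscores, but otherwise this covers the use case described above.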

(Btw IPython just supports normal TeX notations like \pi, \lambda etc, so it is very easy to remember)
IPython dev here. I am the one who implemented (most of) that. We do support it, but it's not easy to remember unless you know how to write LaTeX and know said character. Question: how would you type the following?

In [3]: ℸ = 1

Hint: it's easy, it's \CHARACTERNAME<tab>, but if you don't know how ℸ is named[1], you are screwed[3]. It's cute, it's compact, it's extremely useful for some internal code, but _exporting_ this as an interface is IMHO an extremely bad idea that hinders readability[2] and usability of the code. On Fri, Jun 2, 2017 at 4:29 PM, Guido van Rossum <guido@python.org> wrote:
Are those characters not considered Unicode letters? Maybe we could add their category to the allowed set?
+1 on allowing more math symbols and being more flexible on allowed identifiers, though. In particular, the ones mentioned above are part of mathematical operators[4]. It would also be great for some of these to be parsed as infix operators, but that's another topic :-)

-- M

[1] \daleth
[2] And that's assuming your font supports said character.
[3] Tab completion on the full Unicode character name works as well, so \GREEK SMALL LETTER GAMMA<tab> will give you γ. And \γ<tab> will expand to \gamma, so you can figure it out, but users still struggle with unknown symbols.
[4] http://www.fileformat.info/info/unicode/block/mathematical_operators/images....

On Fri, Jun 2, 2017 at 4:02 PM, Ivan Levkivskyi <levkivskyi@gmail.com> wrote:

On Sat, Jun 03, 2017 at 01:02:12AM +0200, Ivan Levkivskyi wrote:
In the last few months, I've been making a lot of use of the TI Nspire CAS calculator, and I think that there is very little benefit to allowing symbols like ∑ √ ∫ (sum, radical/root, integral) unless you have a proper 2-dimensional template system. There's not much, if any, benefit to writing:

    ∫(expression, lower_limit, upper_limit, name)

In fact, that's probably *harder* to read than

    integrate(expression, lower_limit, upper_limit, name)

because the important thing, the fact that this is an integral, is barely visible. It's only a single character. That's not how mathematicians write it! If we had a 2D template system, like the Nspire, we could write what mathematicians do: (best viewed with a non-proportional font)

      b
     ⌠
     ⎮  3      2    1
     ⎮ x  + 2 x  − ─── dx
     ⎮              x
     ⌡
      a

I say "best", but of course even with a monospaced font, it still looks pretty awful. You really need a proper GUI interface and support for resizing characters. I'm not suggesting this be part of Python the language! But it might be a nice application written for users of Python, perhaps part of Sage or IPython/Jupyter or a GUI interface to Sympy. You don't need ∫ to be legal in identifiers for that.

-- Steve

Steven D'Aprano wrote:
There's not much, if any, benefit to writing:
∫(expression, lower_limit, upper_limit, name)
More generally, there's a kind of culture clash between mathematical notation and programming notation. Mathematical notation tends to almost exclusively use single-character names, relying on different fonts and alphabets, and superscripts and subscripts, to get a large enough set of identifiers. Whereas in programming we use a much smaller alphabet and longer names. Having terse symbols for just a few things, and having to spell everything else out longhand, doesn't really help. -- Greg

On Fri, 2 Jun 2017 at 15:56 Guido van Rossum <guido@python.org> wrote:
> I would love to show how easy it is to write
>
>     from math import pi as π, gamma as Γ
>
> but I had to cheat by copying from the OP since I don't know how to type
> these (and even if you were to tell me how I'd forget tomorrow). So, I am
> still in favor of the rule "only ASCII in the stdlib".

Since this regularly comes up, why don't we add a note to the math module that you can do the above import(s) to bind various mathematical constants to their traditional symbol counterparts? The note can even start off with something like "While Python's standard library only uses ASCII characters to maximize ease of use and contribution, individuals are allowed to use various Unicode characters for variable names." This would also help with making sure people don't come back later and say, "why don't you just add the constants to the module?"

-Brett

> On Fri, Jun 2, 2017 at 3:48 PM, Ivan Levkivskyi <levkivskyi@gmail.com> wrote:
>> On 2 June 2017 at 12:17, Giampaolo Rodola' <g.rodola@gmail.com> wrote:
>>> On Thu, Jun 1, 2017 at 8:47 AM, Serhiy Storchaka <storchaka@gmail.com> wrote:
>>>> What you are think about adding Unicode aliases for some mathematic
>>>> names in the math module? ;-)
>>>>
>>>> math.π = math.pi
>>>> math.τ = math.tau
>>>> math.Γ = math.gamma
>>>> math.ℯ = math.e
>>>>
>>>> Unfortunately we can't use ∞, ∑ and √ as identifiers. :-(
>>>>
>>>> [...]
>>> * duplicated aliases might make sense if they add readability; in this
>>> case they don't unless (maybe) you have a mathematical background. I can
>>> infer what "math.gamma" stands for but not being a mathematician math.Γ
>>> makes absolutely zero sense to me.
>>
>> There is a significant number of scientific Python programmers (21%
>> according to PyCharm 2016), so it is not that rare to meet someone who
>> knows what is Gamma function.
>> And for many of them π is much more readable than np.pi. Also there is
>> another problem, confusion between Gamma function and Euler–Mascheroni
>> constant, the first one is Γ,
>> the second one is γ (perfectly opposite to PEP 8 capitalization rules
>> :-), while both of them are frequently denoted as just gamma (in particular
>> math.gamma follows the PEP8 rules,
>> but is counter-intuitive for most scientist).
>>
>> All that said, I agree that these problems are easily solved by a custom
>> import from. Still there is something in (or related to?) this proposal
>> I think is worth considering: Can we also allow identifiers like ∫ or √.
>> This will make many expressions more similar to usual TeX,
>> plus it will be useful for projects like SymPy.
>>
>> --
>> Ivan
>
> --
> --Guido van Rossum (python.org/~guido)
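As a concrete illustration of the import shown at the top of this message, here is a minimal sketch of what such a documentation note might demonstrate. It assumes Python 3.6 or later (for math.tau). The variable names circumference and factorial_of_4 are made up for the example.

```python
from math import gamma as Γ, pi as π, tau as τ

radius = 1.5
circumference = τ * radius   # same value as 2 * π * radius
factorial_of_4 = Γ(5)        # Γ(n) equals (n - 1)! for positive integers

print(circumference, factorial_of_4)
```

The aliases live only in the importing module, so nothing in the math module's public API changes.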

On Sat, 03 Jun 2017 17:45:43 +0000, Brett Cannon wrote:
[...]
Because in order to add that note to the math module, you have to violate the "only ASCII in the stdlib" rule. ;-) People who would benefit from seeing ϖ (or π) in their code will arrange to type it in proportion to that benefit (and probably already have). I know how to type those characters in my environments, but it might not be that easy if I had to do so on a random computer with a random keyboard running a random OS. Ob XKCD: https://xkcd.com/1806/ (my apologies if someone else already brought this up; I haven't been following along that closely). And I want to say something about this argument being like the one about being able to represent people's names correctly, but while the ratio between the circumference of a circle to its diameter has a name, it isn't a person. Dan

On 4 June 2017 at 05:02, Dan Sommers <dan@tombstonezero.net> wrote:
The ASCII-only restriction in the standard library is merely "all public APIs will use ASCII-only identifiers", rather than "We don't allow the use of Unicode anywhere" (Several parts of the documentation would be rather unreadable if they were restricted to ASCII characters). However, clarifying that made me realise we've never actually written that down anywhere - it's just been an assumed holdover from the fact that Python 2.7 is still being supported, and doesn't allow for Unicode identifiers in the first place. https://github.com/python/peps/pull/285 is a PR to explicitly document the standard library API restriction in the "Names to Avoid" part of PEP 8. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

On 5 June 2017 at 07:00, Guido van Rossum <gvanrossum@gmail.com> wrote:
AFAIK it was in whatever PEP introduced Unicode identifiers.
Ah, indeed it is: https://www.python.org/dev/peps/pep-3131/#policy-specification

Interestingly, that's stricter than my draft PR for PEP 8, and I'm not entirely sure we follow the "string literals and comments must be in ASCII" part in its entirety:

============
All identifiers in the Python standard library MUST use ASCII-only identifiers, and SHOULD use English words wherever feasible (in many cases, abbreviations and technical terms are used which aren't English). In addition, string literals and comments must also be in ASCII. The only exceptions are (a) test cases testing the non-ASCII features, and (b) names of authors. Authors whose names are not based on the Latin alphabet MUST provide a Latin transliteration of their names.
============

That said, all the potential counter-examples that come to mind are in the documentation, but *not* in the corresponding docstrings (e.g. the Euro symbol used in the docs for chr() and ord()).

Cheers, Nick.
-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

I think the strictness comes from the observation that the stdlib is read and edited using *lots* of different tools and not every tool is with the program. That argument may be weaker now than when that PEP was written, but I still get emails and see websites with mojibake. (Most recently, the US-PyCon badges had spaces for all non-ASCII letters.) The argument is also weaker for comments than it is for identifiers, since stdlib identifiers will be used through *even more* tools (anyone who uses a name imported from stdlib). Docstrings are perhaps halfway in between. On Sun, Jun 4, 2017 at 8:33 PM, Nick Coghlan <ncoghlan@gmail.com> wrote:
-- --Guido van Rossum (python.org/~guido)

On 6/1/17, Serhiy Storchaka <storchaka@gmail.com> wrote:
My humble opinion: I would rather see something like:

    from some_wide_used_scientific_library.math_constants import *

gain good acceptance among scientific users before thinking about adding it to the stdlib.

PS. Maybe this could be interesting for some vim users -> http://gnosis.cx/bin/.vim/after/syntax/python.vim (I am not the author of this file). If you apply it in vim, then vim shows the lines the cursor is not on "translated". This means, for example, that math.pi is displayed as π. So you still edit "math.pi", but once you move the cursor off that line you see the formula simplified/prettified (or more complicated, because you need to train your brain a little to accept the new view).

I am pretty skeptical about how popular this conceal technique could be in the vim Pythonista community! (Why skeptical? For example, I am testing whether this technique improves the readability of a line similar to

    self.a = self.b + self.c

which I see in vim as

    ᐠa = ᐠb + ᐠc

but **I am not sure if it is really useful**. I should probably add that I edit code much more in other editors than in vim. ᐠ is U+1420 CANADIAN SYLLABICS FINAL GRAVE.)

This is a horrible thing that nobody else should do :-), but I *am* the author of the file linked by Pavol. It's based on someone else's version (credited in the file), but I fine tuned it for what I want. I'm also not the author of the conceal plugin for vim, which is pretty much exactly for just this. The screenshot attached is a little bit of a vim session editing a Python file. The key thing is that I *type* only regular ASCII characters; I just tell vim to show me something different on lines I am not currently editing. On Fri, Jun 2, 2017 at 11:26 PM, Pavol Lisy <pavol.lisy@gmail.com> wrote:
-- Keeping medicines from the bloodstreams of the sick; food from the bellies of the hungry; books from the hands of the uneducated; technology from the underdeveloped; and putting advocates of freedom in prisons. Intellectual property is to the 21st century what the slave trade was to the 16th.

I use the conceal feature for this purpose, but I use https://github.com/ehamberg/vim-cute-python instead, which is a more toned-down version of the same idea. I personally find x ≤ y, a ∉ B more readable than x <= y, a not in B etc. Especially the conceal lambda => λ is useful for de-cluttering the code. Stephan 2017-06-03 8:50 GMT+02:00 David Mertz <mertz@gnosis.cx>:

Agree with this, but note that a similar proposal has once been made to scipy. It was rejected for now since they still also target Python 2. So once scipy drops Python 2 support (presumably around 2020 at the latest), this could be re-proposed there. Stephan 2017-06-03 8:26 GMT+02:00 Pavol Lisy <pavol.lisy@gmail.com>:

Hi,

> What you are think about adding Unicode aliases for some mathematic names in the math module? ;-) math.π = math.pi math.τ = math.tau math.Γ = math.gamma math.ℯ = math.e Unfortunately we can't use ∞, ∑ and √ as identifiers. :-(

It may be the role of editors to do it if one really wants it. I personally use the `pretty-mode` emacs mode; it replaces pi, sum, sqrt, None (∅), tau, gamma, but not math.e obviously. It replaces some other strings like lambda (λ x: ...), and is customizable (I added ∈ ∉ for "in" and "not in"). It only "messes" with the visual number of characters in a line, but flake8 sees the correct Python through flycheck so it reports errors properly, and you're still notified when you're more than 80 columns long, even if visually there may be a longer valid line. Oh, and it also messes with copying and pasting to share: I found myself yesterday pasting a "prettyfied" line of code on IRC and it (legitimately) surprised someone.

-- Julien Palard https://mdk.fr
participants (35)
- Brett Cannon
- Brice PARENT
- Chris Angelico
- Dan Sommers
- David Mertz
- Erik Bray
- Giampaolo Rodola'
- Greg Ewing
- Guido van Rossum
- Guido van Rossum
- Guyzmo
- Ivan Levkivskyi
- Joshua Morton
- Juancarlo Añez
- Julien Palard
- Koos Zevenhoven
- Lucas Wiman
- Masayuki YAMAMOTO
- Matthias Bussonnier
- MRAB
- Nick Coghlan
- Nikolaus Rath
- Oleg Broytman
- Pavol Lisy
- Random832
- Ryan Gonzalez
- Serhiy Storchaka
- Stephan Houben
- Steven D'Aprano
- Sven R. Kunze
- Terry Reedy
- Thomas Jollans
- Thomas Nyberg
- Todd
- Victor Stinner