[Python-ideas] .pyu nicode syntax symbols (was Re: Empty set, Empty dict)

Guido van Rossum guido at python.org
Mon Jun 23 02:30:59 CEST 2014


Hm. What's wrong with rejecting bad ideas?
On Jun 22, 2014 1:19 PM, "Terry Reedy" <tjreedy at udel.edu> wrote:

> Problem: For years, various people have suggested that they would like to
> use syntactically significant unicode symbols in Python code. A prime
> example is using U+2205, EMPTY SET, ∅, instead of 'set()'. On the other
> hand, the conservative, overwhelmed core development group is not much
> interested and would rather do other things.
>
> Solution: Act instead of ask.
>
> One or more of the people who really want this could get themselves
> together and produce a working system. (If multiple people, ask for a new
> sig and mailing list).
>
> 1. Ask core development to reserve '.pyu' for python with unicode
> symbols. (If refused, choose something else.)
>
> 2. Write pyu.py. It should first translate x.pyu to the equivalent x.py.
> If x.py exists, check the date (as with .py and .pyc). Optionally, but
> probably by default, run x.py.
>
> Translation requires two operations: masking comments and string literals
> from translation and translating the remainder. I personally would start by
> doing the two operations separately, with separately testable functions.
>
> def codeblocks(unisymcode):
>   '''Yield code_or_not, code_chunk pairs for code with unicode symbols.
>
>   Chunks are comments or string literals (code_or_not == False),
>   and code that might have unicode symbols that need translation
>   (code_or_not == True).
>   '''
>   <Simplified parser, possibly derived from tokenize.tokenize(),
>   which already knows how to recognize comments and strings.>
>
> unisym = <dict mapping unicode ordinals to ascii replacements>
>
> def unisym2ascii(unisymcode):
>   blocklist = []
>   for code, block in codeblocks(unisymcode):
>     if code:
>       block = block.translate(unisym)
>     blocklist.append(block)
>   return ''.join(blocklist)
>
> 3. Upload pyu.py to PyPI, *along with instructions on the various ways to
> enter unicode symbols on various systems*. Announce and promote.
>
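The two-operation translation in step 2 can be sketched as follows. This is only an illustration under stated assumptions, not the proposed pyu.py: it swaps the tokenize-derived parser for a deliberately simplified regular-expression scanner, and both the `_MASKED` pattern and the contents of `unisym` are hypothetical choices made for the example. It does not look inside f-string replacement fields.

```python
import re

# Assumption: a simplified scanner for the parts that must NOT be translated
# (comments and string literals). Triple-quoted forms are listed before the
# single-quoted ones so they are tried first.
_MASKED = re.compile(
    "|".join([
        r"#[^\n]*",                 # comment, up to end of line
        r"'''(?:\\.|[^\\])*?'''",   # triple single-quoted string
        r'"""(?:\\.|[^\\])*?"""',   # triple double-quoted string
        r"'(?:\\.|[^'\\\n])*'",     # single-quoted string
        r'"(?:\\.|[^"\\\n])*"',     # double-quoted string
    ]),
    re.DOTALL,
)

# Assumption: an illustrative symbol table, not an agreed-upon one.
unisym = str.maketrans({
    "∅": "set()",
    "λ": "lambda",
    "…": "...",
    "≤": "<=",
    "≥": ">=",
    "≠": "!=",
})


def codeblocks(unisymcode):
    """Yield (code_or_not, chunk) pairs covering the whole source text."""
    pos = 0
    for match in _MASKED.finditer(unisymcode):
        if match.start() > pos:
            yield True, unisymcode[pos:match.start()]   # translatable code
        yield False, match.group()                      # comment or string
        pos = match.end()
    if pos < len(unisymcode):
        yield True, unisymcode[pos:]


def unisym2ascii(unisymcode):
    """Translate unicode symbols in code, leaving comments and strings alone."""
    blocklist = []
    for code, block in codeblocks(unisymcode):
        if code:
            block = block.translate(unisym)
        blocklist.append(block)
    return "".join(blocklist)
```

With this table, `unisym2ascii("f = λ x: x ≤ 3  # ≤ stays\n")` rewrites the code part to `lambda x: x <= 3` and leaves the comment untouched. Keeping the two functions separate means each can be tested separately, as suggested above.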
>
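The date check in step 2 could look roughly like this. It is a minimal sketch that assumes a plain comparison of file modification times is good enough; the function name `needs_translation` is hypothetical.

```python
import os


def needs_translation(pyu_path, py_path):
    """Return True if py_path is missing or older than pyu_path."""
    if not os.path.exists(py_path):
        return True  # no translated file yet
    # Re-translate only when the .pyu source was modified after the .py file.
    return os.path.getmtime(pyu_path) > os.path.getmtime(py_path)
```

A driver would call this before translating x.pyu and skip the work when it returns False.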
> On 6/22/2014 10:41 AM, Philipp A. wrote:
>
>> if people are too lazy to find an input method that works for them (Alt
>> Gr, compose key, copy&paste), they should just continue to type ASCII,
>> and leave the more elegant unicode variants for others.
>>
>
> Being snarky can be fun, but if I wrote and distributed pyu.py, I would
> want as many users as possible.
>
>> ∅ and λ seem like good ideas to me as un-redefinable empty
>> set literal and shorter/more elegant lambda. And “…” for “Ellipsis”.
>>
>> there’s also ∀, ¬, ×, ∧,∨, ∩, ∪, ∈, ∉, ≠, ≡, ≤, and ≥, but i think those
>> are a bit much:
>>
>
> I think the unisym dict should be inclusive and let people choose to use
> the symbols they want. I suspect I would use ≤ and ≥ sooner than λ. A
> mathematician who used most of those symbols, for a math audience, could
> still use the ascii translation for other audiences.
>
> On 6/22/2014 11:01 AM, MRAB wrote:
> > λ is a valid identifier in Python 3 because it's a letter.
>
> Overall, I see this as less of a problem than the possibility of rebinding
> builtin names. The program could have a 'translate_lambda' (default True)
> parameter. But I would be willing to say that if you use unicode symbols,
> then you cannot also use λ as an identifier. (If one did, the resulting .py
> would stop with SyntaxError where 'lambda' replaced identifier λ.)
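
The λ clash can be checked directly. In this small sketch the helper `parses` and the sample sources are made up for illustration, and a plain `str.replace` stands in for the translator.

```python
import ast


def parses(source):
    """Return True if source is syntactically valid Python."""
    try:
        ast.parse(source)
    except SyntaxError:
        return False
    return True


# λ is a letter, so Python 3 accepts it as an identifier; ∅ is a math symbol.
lam_is_identifier = "λ".isidentifier()
empty_set_is_identifier = "∅".isidentifier()

# λ used as the lambda keyword translates cleanly.
as_keyword = "f = λ x: x + 1\n".replace("λ", "lambda")

# λ used as a variable name becomes an assignment to a keyword.
as_identifier = "λ = 3\n".replace("λ", "lambda")

print(lam_is_identifier, empty_set_is_identifier)
print(parses(as_keyword), parses(as_identifier))
```

The second translated source fails to parse because `lambda = 3` assigns to a keyword, which is the SyntaxError described above.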
>
> --
> Terry Jan Reedy
>
>
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/