Hm. What's wrong with rejecting bad ideas?

On Jun 22, 2014 1:19 PM, "Terry Reedy" <tjreedy@udel.edu> wrote:
Problem: For years, various people have suggested that they would like to use syntactically significant unicode symbols in Python code. A prime example is using U+2205, EMPTY SET, ∅, instead of 'set()'. On the other hand, the conservative, overwhelmed core development group is not much interested and would rather do other things.

Solution: Act instead of ask.

One or more of the people who really want this could get themselves together and produce a working system. (If there are multiple people, ask for a new SIG and mailing list.)

1. Ask core development to reserve '.pyu' for Python with unicode symbols. (If refused, choose something else.)

2. Write pyu.py. It should first translate x.pyu to the equivalent x.py. If x.py already exists, check the date (as with .py and .pyc). Optionally, but probably by default, run x.py.

Translation requires two operations: masking comments and string literals from translation and translating the remainder. I personally would start by doing the two operations separately, with separately testable functions.

def codechunk(unisymcode):
  '''Yield code_or_not, code_chunk pairs for code with unicode symbols.

  Chunks are comments or string literals (code_or_not == False),
  and code that might have unicode symbols that need translation
  (code_or_not == True).
  '''
  <Simplified parser, possibly derived from tokenize.tokenize(),
  which already knows how to recognize comments and strings.>

unisym = <dict mapping unicode ordinals to ascii replacements>

def unisym2ascii(unisymcode):
  blocklist = []
  for code, block in codechunk(unisymcode):
    if code:
      block = block.translate(unisym)
    blocklist.append(block)
  return ''.join(blocklist)
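
One possible way to fill in the two placeholders above is sketched below. It is only an illustration under stated assumptions: instead of a parser derived from tokenize, it uses a small regular expression (the hypothetical name _MASKED) to find comments and string literals, and the UNISYM table holds a handful of example mappings that I chose for the demonstration. The regex ignores corner cases such as nested quotes inside f-string expressions.

```python
import re

# Assumption: a regex stands in for the tokenize-based parser suggested above.
# It matches comments and plain or triple-quoted string literals.
_MASKED = re.compile(
    r'#[^\n]*'                   # comment up to end of line
    r'|"""(?:\\.|[^\\])*?"""'    # triple double-quoted string
    r"|'''(?:\\.|[^\\])*?'''"    # triple single-quoted string
    r'|"(?:\\.|[^"\\\n])*"'      # double-quoted string
    r"|'(?:\\.|[^'\\\n])*'",     # single-quoted string
    re.DOTALL,
)

# Example table only: unicode ordinals mapped to ASCII replacements.
UNISYM = {
    ord('∅'): 'set()',
    ord('λ'): 'lambda',
    ord('…'): 'Ellipsis',
    ord('≤'): '<=',
    ord('≥'): '>=',
    ord('≠'): '!=',
}


def codechunk(unisymcode):
    """Yield (code_or_not, chunk) pairs covering the whole source text."""
    pos = 0
    for match in _MASKED.finditer(unisymcode):
        if match.start() > pos:
            # Text between masked chunks is code that may need translation.
            yield True, unisymcode[pos:match.start()]
        # Comments and string literals are passed through untouched.
        yield False, match.group()
        pos = match.end()
    if pos < len(unisymcode):
        yield True, unisymcode[pos:]


def unisym2ascii(unisymcode):
    """Translate unicode symbols in code, leaving comments and strings alone."""
    blocklist = []
    for code, block in codechunk(unisymcode):
        if code:
            block = block.translate(UNISYM)
        blocklist.append(block)
    return ''.join(blocklist)
```

With this sketch, unisym2ascii("s = ∅  # ∅ stays here\n") translates the first ∅ to set() and leaves the one inside the comment unchanged.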

3. Upload pyu.py to PyPI, *along with instructions on the various ways to enter unicode symbols on various systems*. Announce and promote.
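
The date check mentioned in step 2 could be sketched roughly as follows. The helper name needs_translation is hypothetical, and comparing modification times alone is an assumption made for brevity; a real tool might also want to record the source size or a hash.

```python
import os


def needs_translation(pyu_path, py_path):
    """Return True if py_path is missing or older than pyu_path."""
    if not os.path.exists(py_path):
        # No translated file yet, so x.pyu must be translated.
        return True
    # Retranslate only when the .pyu source was modified after the .py output.
    return os.path.getmtime(py_path) < os.path.getmtime(pyu_path)
```

pyu.py would call this before translating, and skip straight to running x.py when it returns False.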


On 6/22/2014 10:41 AM, Philipp A. wrote:
> if people are too lazy to find a input method that works for them (Alt
> Gr, compose key, copy&paste), they should just continue to type ASCII,
> and leave the more elegant unicode variants for others.

Being snarky can be fun, but if I wrote and distributed pyu.py, I would want as many users as possible.

> ∅ and λ seem like good ideas to me as un-redefinable empty
> set literal and shorter/more elegant lambda. And “…” for “Ellipsis”.
>
> there’s also ∀, ¬, ×, ∧,∨, ∩, ∪, ∈, ∉, ≠, ≡, ≤, and ≥, but i think those
> are a bit much:

I think the unisym dict should be inclusive and let people choose to use the symbols they want. I suspect I would use ≤ and ≥ sooner than λ. A mathematician who used most of those symbols for a math audience could still use the ASCII translation for other audiences.

On 6/22/2014 11:01 AM, MRAB wrote:
> λ is a valid identifier in Python 3 because it's a letter.

Overall, I see this as less of a problem than the possibility of rebinding builtin names. The program could have a 'translate_lambda' (default True) parameter. But I would be willing to say that if you use unicode symbols, then you cannot also use λ as an identifier. (If one did, the resulting .py would stop with SyntaxError where 'lambda' replaced the identifier λ.)
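
To make the λ trade-off concrete, here is a small sketch. Only the translate_lambda parameter comes from the suggestion above; the names BASE_TABLE, make_table and compiles are made up for the illustration. For brevity it translates the whole source text and skips the masking of comments and strings.

```python
# Hypothetical base table; a real one would hold many more symbols.
BASE_TABLE = {ord('≤'): '<=', ord('≥'): '>='}


def make_table(translate_lambda=True):
    """Build a str.translate table, optionally mapping λ to 'lambda'."""
    table = dict(BASE_TABLE)
    if translate_lambda:
        table[ord('λ')] = 'lambda'
    return table


def compiles(source):
    """Return True if source is syntactically valid Python."""
    try:
        compile(source, '<pyu>', 'exec')
    except SyntaxError:
        return False
    return True


# λ used as an ordinary identifier, which Python 3 allows.
as_identifier = 'λ = 3\nresult = λ ≤ 4\n'
# λ used as shorthand for the lambda keyword.
as_keyword = 'double = λ x: x * 2\n'

# With translation on, the identifier use turns into 'lambda = 3' and fails.
identifier_ok = compiles(as_identifier.translate(make_table(True)))
# With translation off, the identifier use survives and the shorthand does not.
identifier_ok_untranslated = compiles(as_identifier.translate(make_table(False)))
keyword_ok = compiles(as_keyword.translate(make_table(True)))
```

So a given .pyu file can use λ one way or the other, not both, which is the restriction described above.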

--
Terry Jan Reedy


_______________________________________________
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/