[Python-ideas] Non-ASCII in Python syntax? [was: Null coalescing operator]

Paul Moore p.f.moore at gmail.com
Sun Oct 30 08:22:10 EDT 2016


On 30 October 2016 at 07:00, Stephen J. Turnbull
<turnbull.stephen.fw at u.tsukuba.ac.jp> wrote:
>> as I imagine Unicode characters would be for me. I really hope it
>> isn't...
>
> I think your imagination is running away with you.  While I understand
> how costly it is for those over the age of 12 to develop new habits
> (I'm 58, and painfully aware of how frequently I balk at learning
> anything new no matter how productivity-enhancing it is likely to be,
> and how much more slowly it becomes part of my repertoire), the number
> of new things you would need to learn would be few, and frequently
> enough used, at least in Python.  It's hard enough to get Guido (and
> the other Masters of Pythonic Language Design) to sign on to new ASCII
> syntax; even if in principle non-ASCII were to be admitted, I suspect
> the barrier there would be even higher.
>
> Most of Unicode is irrelevant to everybody.  Mathematicians use only a
> small fraction of the math notation available to them -- it's just
> that it's a different small fraction for each field.  The East Asians
> need a big chunk (I would guess that educated Chinese and Japanese
> encounter about 10,000 characters in "daily life" over a lifetime,
> while those encountered at least once a week number about 3000), but
> those that need to be memorized are a small minority (less than 5%) of
> the already defined Unicode repertoire.
>
> For Western programmers, the mechanics are almost certainly there.
> Every personal computer should have at least one font containing all
> characters defined in the Basic Multilingual Plane, and most will have
> chunks of the astral planes (emoji, rare math symbols, country flags,
> ...).  Even the Happy Hacker keyboard has enough mode keys (shift,
> control, ...) to allow defining "3-finger salutes" for commonly-used
> characters not on the keycaps -- in daily life if you don't need an
> input method now, you won't need one if Python decides to use WHITE
> SQUARE to represent an operation you frequently use -- just an extra
> "control key combo" like the editing control keys (eg, for copy, cut,
> paste, undo) that aren't marked on any keyboard I have.

My point wasn't so much about dealing with the character set of
Unicode, as it was about physical entry of non-native text. For
example, on my (UK) keyboard, all of the printed keycaps are basically
used. And yet, I can't even enter accented letters from Latin-1 with a
standard keypress, much less extended Unicode. Of course it's possible
to get those characters (either by specialised mappings in an editor,
or by using an application like Character Map) but there's nothing
guaranteed to work across all applications. That's a hardware and OS
limitation - the hardware only has so many keys to use, and the OS
(Windows, in my case) doesn't support global key mapping (at least not
to my knowledge, in a user-friendly manner - I'm excluding writing my
own keyboard driver :-)). My interest in East Asian experience is at
least in part because the "normal" character sets, as I understand it,
are big enough that it's impractical for a keyboard to include a
plausible basic range of characters, so I'm curious as to what the
physical process is for typing from a vocabulary of thousands of
characters on a sanely-sized keyboard.

In mentioning emoji, my main point was that "average computer users"
are more and more likely to want to use emoji in general applications
(emails, web applications, even documents) - and if a sufficiently
general solution for that problem is found, it may provide a solution
for the general character-entry case. (Also, I couldn't resist the
irony of using a :-) smiley while referring to emoji...) But it may be
that app-specific solutions (e.g., the smiley menu in Skype) are
sufficient for that use case. Or the typical emoji user is likely to
be using a tablet/phone rather than a keyboard, and mobile OSes have
included an emoji menu in their on-screen keyboards.

Coming back to a more mundane example, if I need to type a character
like é in an email, I currently need to reach for Character Map and
cut and paste it. The same is true if I have to type it into the
console. That's a sufficiently annoying stumbling block that I'm
inclined to avoid it - using clumsy workarounds like referring to "the
OP" rather than using their name. I'd be fairly concerned about
introducing non-ASCII syntax into Python while such stumbling blocks
remain - the amount of code typed outside of an editor (interactive
prompt, emails, web applications like Jupyter) means that editor-based
workarounds like custom mappings are only a partial solution.
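
To make the distinction concrete: inside a string literal, Python already
accepts ASCII-only spellings of non-ASCII characters, so something like é
can be produced at any prompt without a special key mapping. The sketch
below is only an illustration (the variable names are made up for the
example). No comparable escape exists for operators or other syntax, so
non-ASCII syntax would depend entirely on whatever input method happens
to be available.

```python
import unicodedata

# Three ASCII-only ways to spell the same character in a string literal.
by_name = "\N{LATIN SMALL LETTER E WITH ACUTE}"
by_code = "\u00e9"
by_chr = chr(0xE9)

# All three produce the identical one-character string.
assert by_name == by_code == by_chr

# Characters can also be looked up by their Unicode name at run time,
# here using the WHITE SQUARE mentioned earlier in the thread.
white_square = unicodedata.lookup("WHITE SQUARE")

# Print only ASCII descriptions, so this runs on any console encoding.
print(unicodedata.name(by_code))   # LATIN SMALL LETTER E WITH ACUTE
print(hex(ord(white_square)))      # 0x25a1
```

These escapes only work inside string (and, for the numeric forms, bytes)
literals; they are not recognised anywhere else in Python source.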

But maybe you are right, and it's just my age showing. The fate of APL
probably isn't that relevant these days :-) (or ☺ if you prefer...)

Paul

