
Paul Moore writes:
My point wasn't so much about dealing with the character set of Unicode, as it was about physical entry of non-native text. For example, on my (UK) keyboard, all of the printed keycaps are basically used.
How do you type the pound sign and the Euro sign? Are they on the UK keyboard? Or are you not in the UK and don't need them?
And yet, I can't even enter accented letters from latin-1 with a standard keypress, much less extended Unicode.
I'm pretty sure you can, but since I've been Windows-free for 20 years (except for a short period when I was treasurer for an NGO, and only used it to access the accounting system), I can't tell you what it is. On the Mac, you press alt/option plus a graphic key. Most result in what somebody decided are common non-ASCII characters (German sharp S, Greek lowercase mu, Greek upper- and lowercase sigma), but several are dead keys, producing accented characters when combined with a base character: tilde, accents acute and grave, and so on. Surely Windows has a similar system (I don't mean Alt+digits). (But maybe not, I didn't notice one in my brief Googling.)
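In Unicode terms, what a dead key does looks a lot like combining a base letter with a combining accent and then normalizing. A minimal Python sketch of that idea, using the standard unicodedata module; the name dead_key and the small accent table are invented for illustration, and this is not how any OS keyboard driver is actually implemented:

```python
import unicodedata

# Combining marks corresponding to a few common dead keys
# (illustrative subset, not a complete keyboard layout).
COMBINING = {
    "acute": "\u0301",  # COMBINING ACUTE ACCENT
    "grave": "\u0300",  # COMBINING GRAVE ACCENT
    "tilde": "\u0303",  # COMBINING TILDE
}

def dead_key(accent: str, base: str) -> str:
    """Hypothetical helper: attach an accent to a base letter.

    NFC normalization yields the precomposed character when Unicode
    defines one; otherwise the base and combining mark stay separate.
    """
    return unicodedata.normalize("NFC", base + COMBINING[accent])
```

For example, dead_key("acute", "e") gives the single Latin-1 character U+00E9.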
My interest in East Asian experience is at least in part because the "normal" character sets, as I understand it, are big enough that it's impractical for a keyboard to include a plausible basic range of characters, so I'm curious as to what the physical process is for typing from a vocabulary of thousands of characters on a sanely-sized keyboard.
You're right about the size. Korean is special, because the 11,000-odd Hangul are phonetic and generated algorithmically from a set of about 70 phonetic partial glyphs, divided into three groups. The same keys do multiple duty when typed in phonetic order. Other systems use the shift key. For the 100,000 Han ideographs[1], there are a wide variety of methods for entry by key sequence, ranging from code point entry to context-dependent phonetic entry of entire sentences as they would be spoken. Then, of course, there's voice recognition, and handwriting recognition (both static from the image, and dynamic, taking account of the order of pen strokes). The more advanced input methods not only take account of grammar, but also learn the users' habits, remember recent conversions, and predict coming keystrokes based on current context, offering several conversions based on plausible continuations.
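To make the "generated algorithmically" remark about Hangul concrete, here is a rough Python sketch of the arithmetic Unicode uses to lay out the precomposed syllables. The constants are the ones from the Unicode Standard's Hangul composition; the function name compose_hangul and the index-based interface are assumptions made for this example, not part of any real input method:

```python
# Precomposed Hangul syllables start at U+AC00 and are ordered by
# (leading consonant, vowel, optional trailing consonant).
S_BASE = 0xAC00
L_COUNT = 19   # leading consonants
V_COUNT = 21   # vowels
T_COUNT = 28   # trailing consonants, where index 0 means "none"

# 19 * 21 * 28 = 11172 syllables, the "11,000-odd" figure.
TOTAL_SYLLABLES = L_COUNT * V_COUNT * T_COUNT

def compose_hangul(lead: int, vowel: int, trail: int = 0) -> str:
    """Hypothetical helper: build one syllable from three jamo indices."""
    if not (0 <= lead < L_COUNT and 0 <= vowel < V_COUNT and 0 <= trail < T_COUNT):
        raise ValueError("jamo index out of range")
    return chr(S_BASE + (lead * V_COUNT + vowel) * T_COUNT + trail)
```

With these indices, compose_hangul(18, 0, 4) is the syllable U+D55C (the "han" of "Hangul"). A real Korean input method maps keystrokes to jamo and handles syllable boundaries, which this sketch does not attempt.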
In mentioning emoji, my main point was that "average computer users" are more and more likely to want to use emoji in general applications (emails, web applications, even documents) - and if a sufficiently general solution for that problem is found, it may provide a solution for the general character-entry case.
Not for the Asian languages. For them, "character entry" in the sense of character-by-character has long since been obsoleted by predictive sentence-level phonetic methods. But emoji are a perfect example for the present purpose, since they don't have standard pronunciations (although probably many will get them based on the Unicode standard names). On systems with high-enough resolution displays, a palette showing the glyphs is the obvious solution. But that's not pleasant if you type quickly and need those characters frequently. I don't think there's an alternative for emoji though, except for personalized shortcut maps. Math symbols are similar, I think.
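A personalized shortcut map keyed to the Unicode standard names could look something like the following Python sketch. The shortcut strings and the function name expand are made up for illustration; only unicodedata.lookup is a real standard-library call:

```python
import unicodedata

# Hypothetical personal shortcuts, each mapped to a Unicode character name.
SHORTCUTS = {
    ":grin:": "GRINNING FACE",
    ":sq:": "WHITE SQUARE",
    ":inf:": "INFINITY",
}

def expand(text: str) -> str:
    """Replace each shortcut in text with the character it names."""
    for key, name in SHORTCUTS.items():
        text = text.replace(key, unicodedata.lookup(name))
    return text
```

So expand("a :sq: b") would return the string with U+25A1 in place of the shortcut. This only substitutes text after the fact; hooking it into actual keyboard input is a separate, platform-specific problem.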
Coming back to a more mundane example, if I need to type a character like é in an email, I currently need to reach for Character Map and cut and paste it. The same is true if I have to type it into the console.
You probably have Control, Windows, Menu, Alt, and maybe a "function" key. If you're lucky, one labelled AltGr for "Alternate Graphic" is the obvious suspect. Some combination of the above probably allows entry of accented Latin-1 characters, miscellaneous Latin-1 (e.g., sharp S), and a few oddballs (Greek letters, ligatures like oe, the lemniscate usually read infinity).
That's a sufficiently annoying stumbling block
It very well could be, although my Windows Google-fu isn't great. But this is what I found. For WHITE SQUARE, the Mac doesn't have a keyboard equivalent, but there's a standard way to set up a set of shortcut keys[2]:

http://stackoverflow.com/questions/3685146/how-do-you-do-the-therefore-%E2%8...

And I think you can also use the "Input Preferences" screen in System Preferences to set up a few of them.

For Windows, it seems that Alt+decimal character codes, or hex Unicode followed by Alt+x, are the built-in ways to enter characters not on your keyboard. It's also possible to set up "Math Autocorrect" to automatically convert key sequences according to

https://blogs.msdn.microsoft.com/murrays/2011/08/29/sans-serif-mathematical-...

but that's hardly obvious (although maybe it is if you're Dutch?)

I have to wonder why so many people stick with a system that obviously hates users. :-(

Footnotes:

[1] I'm counting several thousand Taiwanese standard glyphs whose pronunciation and meaning are no longer known (they're culled from old manuscripts), as well as each of the 2 or 3 variants of several thousand characters given simplified glyphs by the Japanese and PRC standards bodies, because all have separate Unicode codepoints assigned.

[2] Note: I had to Google this because I use Japanese input methods: when I want a square I type the Japanese word for "square" and then press "next conversion" until the square I want shows up. This also works for most Greek letters and math symbols. This doesn't bother me, because it's normal for typing Japanese (and I do mix Japanese and English enough that I know that it doesn't bug me when I need such a character in an otherwise all-English text). I suspect it would be inadequate for someone who doesn't also type a language requiring a complex input method.