Unicode [was Re: Cult-like behaviour]

Chris Angelico rosuav at gmail.com
Mon Jul 16 12:22:59 EDT 2018

On Tue, Jul 17, 2018 at 2:05 AM, Mark Lawrence <breamoreboy at gmail.com> wrote:
> On 16/07/18 15:17, Dan Sommers wrote:
>> On Mon, 16 Jul 2018 10:39:49 +0000, Steven D'Aprano wrote:
>>> ... people who think that if ISO-8859-7 was good enough for Jesus ...
>> It may have been good enough for his disciples, but Jesus spoke Aramaic.
>> Also, ISO-8859-7 doesn't cover ancient polytonic Greek; it only covers
>> modern monotonic Greek.
>> See also the Unicode Greek FAQ (https://www.unicode.org/faq/greek.html).
> Out of curiosity where does my mum's Welsh come into the equation as I
> believe that it is not recognised by the EU as a language?

What characters does it use? Mostly Latin letters? If so, it's easy -
most Western European languages are covered by the basic Latin
letters (the ASCII ones), plus the combining diacritical marks (U+0300
and following), plus a small handful of language-specific characters
(e.g. U+0130/U+0131 for the Turkish dotted and dotless I). Many
letter-plus-diacritic pairs also have precomposed forms, which NFC
normalization will give you, and a few languages use ligatures,
but by and large, that's all you need for most languages written in
the Latin script.
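
To make that concrete, here's a quick sketch using the stdlib
unicodedata module. The choice of "w" with a circumflex as the test
character is my own assumption about what Welsh text would need; the
rest just shows a base letter plus a combining mark collapsing into a
single precomposed code point under NFC, and names the two Turkish
letters mentioned above.

```python
import unicodedata

# A basic Latin letter followed by a combining mark:
# 'w' + U+0302 COMBINING CIRCUMFLEX ACCENT (two code points).
decomposed = "w\u0302"

# NFC normalization replaces the pair with the precomposed form,
# where one exists.
composed = unicodedata.normalize("NFC", decomposed)

print(len(decomposed), len(composed))
print(f"U+{ord(composed):04X}", unicodedata.name(composed))

# NFD goes the other way, splitting the precomposed letter back up.
assert unicodedata.normalize("NFD", composed) == decomposed

# The Turkish-specific letters are separate code points, not
# base letter + combining mark.
for ch in "\u0130\u0131":
    print(f"U+{ord(ch):04X}", unicodedata.name(ch))
```

Both spellings render the same, which is why you want to normalize
before comparing strings.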

