[Python-3000] Unicode IDs -- why NFC? Why allow ligatures?
amcnabb at mcnabbs.org
Fri Jun 8 19:00:49 CEST 2007
On Thu, Jun 07, 2007 at 06:50:57PM -0400, Jim Jewett wrote:
> On 6/7/07, Andrew McNabb <amcnabb at mcnabbs.org> wrote:
> > On Wed, Jun 06, 2007 at 07:06:05PM -0400, Jim Jewett wrote:
> > > (There were mixed opinions on Technical symbols, and no one has spoken
> > > up yet about the half-dozen Croatian digraphs corresponding to Serbian
> > > Cyrillic.)
> If the digraphs were converted to compatibility characters, would
> that be good, bad, or no big deal?
> I'm not entirely certain which letters Stephen was talking about, but
> believe they are the (upper, lower, and titlecase) digraphs for Ǉ, Ǌ,
> Ǳ, Ǆ (DZ caron)
> Would it be acceptable if (only in identifier names, not normal text)
> python treated those the same as the two-character sequences LJ, NJ,
> DZ, and DŽ?
I speak Serbian as a second language (and lived in Serbia for a few
years), and my opinion is that a Serbian/Croatian speaker would expect
the digraphs to be treated the same as the two-character sequences.
The issue doesn't seem to come up too often, but people using
typewriters have been typing the digraphs as separate characters for
years. The place I noticed the issue most often was on vertical
signs, such as the lettering on a storefront.
A sign saying "bookstore" would look like this:
The following would be incorrect:
But even many native speakers make this mistake.
Other than that, ǌ is practically indistinguishable from nj, and the
other Croatian digraphs have the same behavior.
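
To make the proposal concrete, here is a minimal sketch of what "treating
the digraphs the same as the two-character sequences" could look like. It
assumes NFKC normalization is the mechanism, which is one option under
discussion in this thread and not a settled decision. The helper name
fold_identifier is hypothetical, and the word "knjižara" (bookstore) is my
own example.

```python
import unicodedata

# Single-code-point digraphs mentioned in the thread, paired with the
# two-character spellings they would be expected to match.
DIGRAPHS = {
    "\u01C7": "LJ",       # Ǉ
    "\u01CA": "NJ",       # Ǌ
    "\u01F1": "DZ",       # Ǳ
    "\u01C4": "D\u017D",  # Ǆ -> D followed by Ž
    "\u01CC": "nj",       # ǌ
}


def fold_identifier(name):
    """Hypothetical helper: apply NFKC so that compatibility digraphs
    are replaced by ordinary two-letter sequences."""
    return unicodedata.normalize("NFKC", name)


for digraph, expected in DIGRAPHS.items():
    assert fold_identifier(digraph) == expected

# "knjižara" written with the single-character digraph and with n + j
# folds to the same string.
with_digraph = "k\u01CCi\u017Eara"
with_letters = "knji\u017Eara"
assert with_digraph != with_letters
assert fold_identifier(with_digraph) == fold_identifier(with_letters)
```

Note that plain NFC leaves the digraphs untouched, because their
decompositions are compatibility mappings and not canonical ones, so the
choice of normalization form matters here.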
I hope this helps in the discussion.
PGP Fingerprint: 8A17 B57C 6879 1863 DE55 8012 AB4D 6098 8826 6868