[Python-3000] PEP: Supporting Non-ASCII Identifiers
"Martin v. Löwis"
martin at v.loewis.de
Sun Jun 3 20:43:03 CEST 2007
Rauli Ruohonen wrote:
> This is only almost true. Consider these two hypothetical files
> written by naive newbies:
>
> data.py:
>
> favorite_colors = {'Martin Löwis': 'blue'}
>
> code.py:
>
> import data
>
> print data.favorite_colors['Martin Löwis']
That is an unrealistic example. It's more likely that the
second access reads

    user = find_current_user()
    print data.favorite_colors[user]

To deal with that safely, I would recommend writing

    favorite_colors = nfc_dict({'Martin Löwis': 'blue'})
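
For illustration, here is a minimal sketch of what such an nfc_dict
could look like. It is not an existing class; the name comes from the
line above, and the helper _nfc and the choice of overridden methods are
assumptions. It is written in Python 3 syntax and relies only on
unicodedata.normalize from the standard library. Keys are normalized to
NFC both when they are stored and when they are looked up, so a
decomposed spelling of a name finds the entry stored under the
precomposed spelling.

```python
import unicodedata


def _nfc(key):
    # Hypothetical helper: normalize text keys, leave other key types alone.
    return unicodedata.normalize("NFC", key) if isinstance(key, str) else key


class nfc_dict(dict):
    """Sketch of a dict that NFC-normalizes string keys on store and lookup.

    Only the most common access paths are covered; methods such as
    __delitem__, pop and setdefault would need the same treatment.
    """

    def __init__(self, *args, **kwargs):
        super().__init__()
        self.update(*args, **kwargs)

    def update(self, *args, **kwargs):
        # Route every item through __setitem__ so keys get normalized.
        for key, value in dict(*args, **kwargs).items():
            self[key] = value

    def __setitem__(self, key, value):
        super().__setitem__(_nfc(key), value)

    def __getitem__(self, key):
        return super().__getitem__(_nfc(key))

    def __contains__(self, key):
        return super().__contains__(_nfc(key))

    def get(self, key, default=None):
        return super().get(_nfc(key), default)


# Stored with a precomposed "ö" (U+00F6).
favorite_colors = nfc_dict({"Martin L\u00f6wis": "blue"})

# Looked up with "o" followed by U+0308 COMBINING DIAERESIS.
user = "Martin Lo\u0308wis"
print(favorite_colors[user])
```

A plain dict would raise KeyError on the last line, because the two
spellings of the name are different code point sequences.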
> The most important thing about normalization is that it should be
> consistent for internal strings. Similarly when reading in a text
> file, you really should normalize it first, if you're going to
> handle it as *text*, not binary.
>
> The most common normalization is NFC, because it works best
> everywhere and causes the least amount of surprise. E.g.
> "Löwis"[2] results in "w", not in u'\u0308' (COMBINING DIAERESIS),
> which most naive users won't expect.
Sure. If you think it is worth the effort, write a PEP.
PEP 3131 is only about identifiers.
Regards,
Martin