[Python-Dev] Divorcing str and unicode (no more implicit conversions).

Josiah Carlson jcarlson at uci.edu
Tue Oct 25 23:40:06 CEST 2005

"Martin v. Löwis" <martin at v.loewis.de> wrote:
> Josiah Carlson wrote:
> > It seems that removing this restriction may cause serious issues, at
> > least in the case when using cyrillic characters in names.  See recent
> > security issues in regards to web addresses in web browsers for the
> > confusion (and/or name errors) that could result in their use.
> That impression is deceiving. We are talking about source code here;
> people type in identifiers explicitly rather than receiving them
> through linking, and they scope identifiers (by module or object).
> If somebody manages to get look-alike identifiers into your Python
> libraries, you have bigger problems than these look-alikes: anybody
> capable of doing so could just as well replace the real thing in
> the first place.
> As always in computer security: define your threat model before
> reasoning about the risks.

I should have been more explicit.  I did not mean to imply that I was
concerned about the security implications of inserting arbitrary
identifiers in Python; I mentioned the web browser case only as an
example of how such characters have caused confusion before.  I am
concerned about the confusion involved in using:
    Greek Capital: Alpha, Beta, Epsilon, Zeta, Eta, Iota, Kappa, Mu, Nu,
        Omicron, Rho, and Tau.
    Cyrillic Capital: Dze, Je, A, Ve, Ie, Em, En, O, Er, Es, Te, Ha, ...

Users could then say, "Name error?  But I typed in window.draw(PEN) as
I was told to, and it didn't work!"
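
A minimal sketch of that confusion, assuming a Python 3 interpreter
(where str is Unicode).  The names latin_name, greek_name, describe and
namespace are hypothetical, chosen only for illustration; the dict
stands in for a module's namespace.

```python
import unicodedata

latin_name = "PEN"                 # Latin capital P, E, N
greek_name = "\u03a1\u0395\u039d"  # Greek capital Rho, Epsilon, Nu

def describe(name):
    """Return the Unicode character name of each character in `name`."""
    return [unicodedata.name(ch) for ch in name]

# Hypothetical namespace standing in for a module's globals.
namespace = {latin_name: "the real pen"}

print(latin_name == greek_name)   # the two strings compare unequal
print(greek_name in namespace)    # so the look-alike is not found
print(describe(latin_name))
print(describe(greek_name))
```

In many fonts the two names render identically, yet a lookup with the
Greek spelling fails; only the per-character names reveal the
difference.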

Identically drawn glyphs are a problem, and pretending that they aren't
doesn't make it so.  Right now, all possible name glyphs are
visually distinct, which would not be the case if any Unicode character
could be used as a name (except for numerals).  Speaking of which, would
we then be offering support for Arabic/Indic numeric literals, and/or
supporting them in int()/float()?  Ideally I would like to say yes, but I
could see the confusion if such were allowed.
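
For reference, a small sketch of how this question plays out, assuming
a CPython 3 interpreter (which postdates this thread).  There, int()
and float() accept Unicode decimal digits in strings, while such digits
are rejected as source-code literals.  The helper literal_compiles is
hypothetical.

```python
arabic_indic = "\u0661\u0662\u0663"  # ARABIC-INDIC DIGIT ONE, TWO, THREE
devanagari = "\u0967\u0968\u0969"    # DEVANAGARI DIGIT ONE, TWO, THREE

# Conversion from strings accepts any Unicode decimal digits.
print(int(arabic_indic))
print(int(devanagari))
print(float("\u0661.\u0665"))        # ASCII '.' between Arabic-Indic 1 and 5

def literal_compiles(source):
    """Return True if `source` compiles as a Python expression."""
    try:
        compile(source, "<example>", "eval")
        return True
    except SyntaxError:
        return False

# As source-code literals, only ASCII digits are accepted.
print(literal_compiles("123"))
print(literal_compiles(arabic_indic))
```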

 - Josiah
