[Python-ideas] [Python-Dev] Unicode minus sign in numeric conversions

Nick Coghlan ncoghlan at gmail.com
Thu Jun 13 13:18:49 CEST 2013


On 13 June 2013 13:15, Steven D'Aprano <steve at pearwood.info> wrote:
> There are practical exploits where the bad guy can exploit the visual
> similarity of certain digits to other digits, but they don't have anything
> to do with int(). The Unicode consortium has done the right thing by
> mentioning this, but we can get a rough idea of the practical risk involved:
> there are about ten pages of discussion of various URL spoofing attacks, and
> six lines on numeric spoofs.

Just as significantly, we push validation of untrusted data out to the
boundaries of applications for a reason: the interpreter has no way of
knowing whether data is trusted or untrusted. Establishing
trustworthiness is a complex topic, and there are limits to the
assistance the language can offer. We can remove
obviously-unsafe-in-retrospect features (like the old Python 2.x
input-with-implicit-eval), but glyph confusion in Unicode is well
outside what we decided to worry about when designing Python 3's
Unicode features. It's similar to the way we don't second-guess the
user if they decide to set "shell=True" on a subprocess call.
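
To make the "validate at the boundary" point concrete, here is a
minimal sketch of what an application might do when it only wants to
accept plain ASCII integers from an untrusted source. The helper name
parse_untrusted_int is made up for illustration, and the policy it
enforces (ASCII digits with an optional sign, nothing else) is just one
assumption an application could choose; it is not something int()
does for you.

```python
import re

# Hypothetical application-level policy: an optional ASCII sign
# followed by one or more ASCII digits, and nothing else.
_ASCII_INT = re.compile(r"[+-]?[0-9]+")


def parse_untrusted_int(text):
    """Parse text as an integer, rejecting anything but plain ASCII.

    int() itself also accepts non-ASCII decimal digits, so a stricter
    check has to happen here, at the edge of the application.
    """
    if not _ASCII_INT.fullmatch(text):
        raise ValueError(f"not a plain ASCII integer: {text!r}")
    return int(text)
```

The interpreter stays permissive, and the code that knows the data is
untrusted applies whatever restriction it needs.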

You can contrast that with the packaging metadata 2.0 design though,
where not only are we continuing to disallow arbitrary Unicode in
package names (with the vulnerability to glyph-confusion based
exploits being one of the major considerations), but we are even
allowing index servers to enforce the TR36 glyph confusability
restrictions that apply to ASCII characters (specifically "lI1" and
"0O").

I think this whole thread has gone pretty far afield, though. The
original question was whether 'int("-1") == int("\N{MINUS SIGN}1")'
should hold, and I agree with Lukasz and MAL that it should. I'm only
+0 on the other characters MAL mentioned, though (FULLWIDTH PLUS SIGN,
SUPERSCRIPT PLUS SIGN, SUPERSCRIPT MINUS).
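
For anyone who wants that behaviour without waiting on a change to
int(), a rough sketch of a wrapper follows. The name parse_int and the
translation table are illustrative assumptions, not an existing API;
the table covers MINUS SIGN plus the three other characters MAL
mentioned, each mapped to its ASCII equivalent before the string is
handed to int().

```python
# Hypothetical helper, not part of the standard library.
# Maps the sign characters discussed in this thread to ASCII signs.
_SIGN_MAP = str.maketrans({
    "\N{MINUS SIGN}": "-",
    "\N{FULLWIDTH PLUS SIGN}": "+",
    "\N{SUPERSCRIPT PLUS SIGN}": "+",
    "\N{SUPERSCRIPT MINUS}": "-",
})


def parse_int(text, base=10):
    """Convert text to int after normalising Unicode sign characters."""
    return int(text.translate(_SIGN_MAP), base)


# With the helper, both spellings of "minus one" give the same value.
assert parse_int("\N{MINUS SIGN}1") == int("-1")
```

Doing the mapping in a wrapper keeps the decision about which sign
characters to accept in the application rather than in the builtin.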

Cheers,
Nick.

--
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

