[Python-Dev] Re: pre-PEP: Unicode Security Considerations for Python

3 Nov 2021

On Wed, Nov 03, 2021 at 11:21:53AM +1100, Chris Angelico wrote:
...
> TBH, I'm not entirely sure how valid it is to talk about *security*
> considerations when we're dealing with Python source code and variable
> confusions, but that's a term that is well understood.

It's not like Unicode is the only way to write obfuscated code,
malicious or otherwise.
...
> But to the extent that it is a security concern, it's not one that
> linters can really cope with. I'm not sure how a linter would stop
> someone from publishing code on PyPI that causes confusion by its
> character encoding, for instance.

Do we require that PyPI prevent people from publishing code that causes
confusion by being poorly written, or by its obfuscated and confusing
identifiers?

The linter is to *flag the issue* during, say, code review or before 
running the code, like other code quality issues.

If you're just running random code you downloaded from the internet 
using pip, then Unicode confusables are the least of your worries.

I'm not really sure why people get so uptight about Unicode confusables, 
while being blasé about the opportunities to smuggle malicious code into 
pure ASCII code.

https://en.wikipedia.org/wiki/Underhanded_C_Contest

Is it unfamiliarity? Worse? "Real programmers write identifiers in
English." And the ironic thing is, while it is very difficult indeed for
automated checkers to detect underhanded code in ASCII, it is far
easier for editors, linters and other tools to spot the sort of Unicode
confusables we're talking about here. But we spend all our energy
worrying about the minor issue, and almost none on the broader problem
of malicious code in general.
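
As a rough illustration of how little a tool needs in order to flag such
identifiers, here is a minimal sketch. It is not taken from any existing
linter: the names script_of and non_ascii_identifiers are made up for this
example, and guessing the script from the first word of the Unicode
character name is only an approximation of real script detection.

```python
import ast
import unicodedata


def script_of(ch):
    """Guess a character's script from its Unicode name (an approximation)."""
    if ch.isascii():
        # ASCII letters are Latin; digits and underscore carry no script.
        return "LATIN" if ch.isalpha() else "COMMON"
    name = unicodedata.name(ch, "")
    # e.g. "CYRILLIC CAPITAL LETTER A" -> "CYRILLIC"
    return name.split(" ")[0] if name else "UNKNOWN"


def non_ascii_identifiers(source):
    """Return sorted (identifier, scripts) pairs for each non-ASCII name."""
    found = {}
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Name) and not node.id.isascii():
            scripts = {script_of(c) for c in node.id} - {"COMMON"}
            found[node.id] = sorted(scripts)
    return sorted(found.items())


if __name__ == "__main__":
    # Latin A, Cyrillic A (U+0410) and Greek Alpha (U+0391), written as escapes.
    sample = "A = 1\n\u0410 = 2\n\u0391 = 3\nprint(A, \u0410, \u0391)\n"
    for name, scripts in non_ascii_identifiers(sample):
        print(ascii(name), scripts)
```

Run on three look-alike names, this reports the Cyrillic and Greek ones and
leaves the plain ASCII one alone. A real checker would also want to look at
function, attribute and argument names, and to decide what to do with the
report; this only shows that the detection step itself is simple.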

I'm pretty sure I could upload a library to PyPI that included

    os.system('rm -rf .')

and nobody would blink an eye, but if I write:

    A = 1
    А = 2
    Α = 3
    print(A, А, Α)

everyone goes insane. Let's keep the threat in perspective. Writing an 
informational PEP for the education of people is a great idea. Rushing 
into making wholesale changes to the interpreter, not so much.

-- 
Steve

Steven D'Aprano