
I don't think the SQL query is a good use case; many non-trivial queries (e.g. those that conditionally include clauses) require generating SQL statements dynamically.

On Mon, 2021-08-09 at 22:23 +0000, gbleaney@gmail.com wrote:
The proposal is based on the assumption that a literal type is a good proxy for “a value that has been validated as safe by the caller”. That assumption seems tenuous. There are many ways the caller can perform such validation. For example, it could use a regex filter, compare against a table of known good values, scan for dangerous character sequences, or pass it through an escape transform. None of these techniques will produce a literal-typed value.
You're 100% right! There are lots of ways to make sure a given string is safe to use in a given command. The problem is, many of the ways that you suggested (regex filtering, scanning for dangerous characters, escape transforms) can have subtle implementation flaws that allow an attacker to bypass them. Additionally, users can entirely forget to make those checks. As a security engineer, I have to assume that someone somewhere along the line is going to mess up their ad-hoc regex check (or forget to write it) and let a special character slip through. `Literal[str]` gives me another way though. It lets me create *opinionated* APIs that say the user *must* supply a literal string, which the type checker can then tell me with 100% certainty was not dynamically created with user-controlled input.
Let's make things concrete and talk about SQL injection. The canonical way to prevent it is to use parameterized queries:
```
def get_data(value: str):
    SQL = "SELECT * FROM table WHERE col = %s"
    return conn.query(SQL, value)
```
The problem is, nothing stops a developer from inserting a dynamically created SQL string into that first parameter and creating a SQL injection vulnerability despite the availability of parameterization:
```
def get_data(value: str):
    SQL = f"SELECT * FROM table WHERE col = '{value}'"
    return conn.query(SQL)
```
If I change the interface of `query` to require that the first argument is a literal, I can prevent this SQL injection issue from happening.
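To make that interface change concrete, here is a minimal sketch. It assumes the `typing.LiteralString` spelling that later shipped in Python 3.11 (PEP 675) rather than the `Literal[str]` spelling used in this thread. The `query` wrapper, the `items` table, and the in-memory SQLite connection are hypothetical stand-ins for illustration only; note that `sqlite3` uses `?` placeholders where the snippets above use `%s`.

```python
from __future__ import annotations  # annotations are not evaluated at runtime

import sqlite3
from typing import TYPE_CHECKING, Any

if TYPE_CHECKING:
    # Only the type checker needs this name (in `typing` since Python 3.11).
    from typing import LiteralString


def query(conn: sqlite3.Connection, sql: LiteralString, *params: Any) -> list[tuple]:
    """Hypothetical wrapper: the command must be a literal, the data goes in params."""
    return conn.execute(sql, params).fetchall()


# Assumed demo database, purely for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (col TEXT)")
conn.executemany("INSERT INTO items VALUES (?)", [("a",), ("b",)])


def get_data(value: str) -> list[tuple]:
    # Accepted by the type checker: the SQL text is a literal and the
    # user-controlled value travels separately as a parameter.
    return query(conn, "SELECT col FROM items WHERE col = ?", value)


def get_data_unsafe(value: str) -> list[tuple]:
    # The f-string has type `str`, not `LiteralString`, so a type checker
    # that supports LiteralString should reject the call below.
    sql = f"SELECT col FROM items WHERE col = '{value}'"
    return query(conn, sql)
```

Nothing is enforced at runtime: `get_data_unsafe` still executes and is still injectable if someone runs it. The protection comes entirely from the type checker flagging the call before the code ships.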
Python is a runtime-type-safe language, so it already prevents format string attacks from accessing stack or heap locations outside of the target object. Are you primarily concerned about cases where Python code invokes code written in a different language that potentially has vulnerabilities because of a lack of runtime type safety?
My concerns have nothing to do with memory or type safety, and everything to do with preventing the confusion of data and commands. As you've alluded to, many APIs take a string and run it as code (`pickle`, `eval`, etc. run Python code, SQL APIs run SQL code, `os.system` runs shell commands, `python-ldap`'s APIs let you run LDAP queries, etc.). Sometimes the commands they run need data (i.e. a value to insert into the table). If the commands and data are part of the same string, injection vulnerabilities can occur. Using `Literal[str]`, API designers can enforce the separation of commands and data by requiring that the commands be literals within the Python program, rather than coming from some external source of user-controlled data.
It sounds like you are looking for some form of taint analysis, but this doesn’t strike me as a good solution to that problem.
Full taint analysis is definitely useful, and I actually spend the majority of my time working on Pysa, which is a taint analysis tool built on top of Pyre. To me, the reasons to want this in a type checker are:
1) It's way faster to run and give feedback to developers. Pysa will take an hour+ to report an issue to a developer on a massive codebase, whereas Pyre can do it in a second.
2) Taint analysis requires that you're able to track sources of user-controlled data into the dangerous function. There is always a risk of false negatives there, whereas I can't imagine a case of a false negative coming from a type check for `Literal` (outside of explicit lint suppression).