[Python-ideas] pytaint: taint tracking in python

Tue Oct 15 18:57:22 CEST 2013

On Tue, Oct 15, 2013 at 2:58 AM, Felix Gröbert <felix at groebert.org> wrote:

> 1. Please correct me if I misunderstand the Python project, but if the
> idea is deemed 'good' by this list, a PEP can follow and the feature can be
> included in Python 3? It is not necessary to have a Python 3 implementation
> beforehand?
> The existing Python 2.7.5 pytaint implementation is intended to be run by
> users who need tainting in Python 2 but can also serve as a reference /
> benchmark / proof-of-concept implementation for this discussion.
>

FWIW, having reviewed parts of this code as Marcin implemented it, I'll
state up front that porting this to Python 3 will mostly be a matter of
mechanical work. Python 3's bytes (PyBytes) and str (PyUnicode) objects
are not _that_ different in implementation from Python 2's str (PyString)
and unicode (PyUnicode) objects for the purposes of adding and tracking
taint.

Besides, the code could use more eyeballs as would happen in any porting
process. :)

> 2. I haven't had the time to publish benchmarks yet but I plan to. Also,
> of course, the cpython tests pass and we added additional taint tracking
> tests. We also ran the internal tests of our python codebase with the
> pytaint interpreter. This produced a negligible number of failures,
> mostly because some C extensions had not been recompiled to work with
> the redefined string objects.
>
> Regarding taint tracking as a feature for python:
>
> First of all, taint tracking is a general language feature and can be
> considered for additional applications besides security. When it comes to
> the security community, taint tracking is certainly controversial.
> Nevertheless, my pytaint announcement received 50 retweets and 30 favs from
> a part of the security community, if that counts for something ;)
>
> As Andrew and Bruce mention, there are other solutions to XSS and SQLi:
> template systems and parameterized queries. Another library solution exists
> to shell injection: pipes.quote. However, all these solutions require the
> developer to pick the correct library and method. We have empirical
> indicators that this works, but maybe only in 70% of cases. The rest of the
> developers are introducing new vulnerabilities. Thus, an additional
> language-based feature can help to mitigate the remaining 30% of cases. A
> web app framework (or a python-developing company) can maintain and ship a
> pytaint configuration which will throw a TaintError exception in those 30%
> of cases and prevent the vulnerability from being exploited.
>
> This argument follows along the principle of defense-in-depth: why just
> have one security feature (e.g. pipes.quote) if we can offer several
> security features to the developer? This has previously worked well for
> system security: ASLR, DEP, etc.
>
> Regarding the relation to typing:
>
> We are using Merits on purpose to be able to distinguish between different
> forms of string cleaning. Today, most HTML template systems don't even make
> a distinction between different escaping contexts. However, with a pytaint
> Merit configuration for raw HTML, URLs, HTML attribute contents, CSS
> attributes and JS strings, you would be able to make sure that your string
> is cleaned for the specific context you're using it in. This can be
> implemented for each template system individually but it would be easier to
> just write a pytaint config.
>

Indeed. I like the taint merits system. It is much more powerful than what
Perl 5 ever had with a single taint bit.
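
To make the difference from a single taint bit concrete, here is a rough
pure-Python sketch of the merit idea. It is not pytaint's API: the names
TaintedStr, clean_html, require and render_html_text are made up for
illustration, and the sketch does not propagate taint through string
operations (pytaint does that inside the interpreter's string objects).

```python
import html


class TaintError(Exception):
    """Raised when a tainted string reaches a sink without the needed merit."""


class TaintedStr(str):
    """Hypothetical stand-in: a str that records the merits it has earned."""

    def __new__(cls, value, merits=frozenset()):
        obj = super().__new__(cls, value)
        obj.merits = frozenset(merits)
        return obj


def clean_html(value):
    """Cleaner for HTML text context: escapes and grants the 'html' merit.

    Only the new merit is kept, since escaping may invalidate earlier ones.
    """
    return TaintedStr(html.escape(value, quote=True), merits={"html"})


def require(value, merit):
    """Sink-side check: plain str is trusted, TaintedStr must hold the merit."""
    if isinstance(value, TaintedStr) and merit not in value.merits:
        raise TaintError("missing merit %r" % merit)


def render_html_text(value):
    """Example sink that only accepts strings cleaned for HTML text."""
    require(value, "html")
    return "<p>" + value + "</p>"
```

With one taint bit, clean_html(user_input) would simply be "clean". With
merits, the same string is accepted by render_html_text but still rejected
by a sink that calls require(value, "url").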

The ability to configure taint properties "offline" via JSON files is also
neat. You can effectively create taint merit and sink metadata for existing
Python libraries without needing to modify them (similar to how Cython lets
you specify types via an external file for it to apply its magic better to
other libraries without needing to modify them).
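
As a rough illustration of what "offline" configuration buys you, the
sketch below wraps a sink in a third-party module from a JSON description,
without touching the module's source. The JSON schema, legacy_lib,
run_query, Tainted and install_sinks are all invented for this example;
pytaint's real configuration format may look quite different.

```python
import functools
import importlib
import json
import sys
import types


class TaintError(Exception):
    pass


class Tainted(str):
    """Hypothetical tainted string carrying a set of merits."""

    def __new__(cls, value, merits=frozenset()):
        obj = super().__new__(cls, value)
        obj.merits = frozenset(merits)
        return obj


# Stand-in for an existing library that we cannot or do not want to modify.
legacy_lib = types.ModuleType("legacy_lib")
legacy_lib.run_query = lambda sql: "executed: " + sql
sys.modules["legacy_lib"] = legacy_lib

# Hypothetical config: which functions are sinks and which merit they need.
CONFIG = json.loads("""
{
  "sinks": [
    {"module": "legacy_lib", "function": "run_query", "required_merit": "sql"}
  ]
}
""")


def guard(func, merit):
    """Wrap func so tainted arguments lacking the merit raise TaintError."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        for arg in list(args) + list(kwargs.values()):
            if isinstance(arg, Tainted) and merit not in arg.merits:
                raise TaintError("%s needs merit %r" % (func.__name__, merit))
        return func(*args, **kwargs)
    return wrapper


def install_sinks(config):
    """Replace each configured function with its guarded version."""
    for entry in config["sinks"]:
        module = importlib.import_module(entry["module"])
        original = getattr(module, entry["function"])
        setattr(module, entry["function"],
                guard(original, entry["required_merit"]))


install_sinks(CONFIG)
```

After install_sinks runs, legacy_lib.run_query still works for plain
strings and for Tainted values carrying the "sql" merit, and raises
TaintError for anything else.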

-gps

> If you don't clean strings based on browser context, you will run into
> problems: a string is cleaned with HTML-entity encoding but used in an
> <iframe src> attribute. An attacker could trigger an XSS by supplying
> javascript:alert(document.cookie).
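
The point about context can be checked with the standard library alone.
The sketch below shows that HTML-entity encoding leaves the payload
untouched, and pairs it with a simple URL-context cleaner. clean_url and
its http/https allowlist are assumptions made for this example (it also
rejects relative URLs, which is deliberately conservative); they are not
part of pytaint.

```python
import html
from urllib.parse import urlsplit

payload = "javascript:alert(document.cookie)"

# The payload contains none of & < > " ' so entity encoding changes nothing.
escaped = html.escape(payload, quote=True)


def clean_url(value):
    """Hypothetical cleaner for URL attributes such as <iframe src>.

    Accepts only absolute http(s) URLs, then entity-encodes the result so
    it is also safe to place inside a quoted HTML attribute.
    """
    value = value.strip()
    scheme = urlsplit(value).scheme.lower()
    if scheme not in {"http", "https"}:
        raise ValueError("URL scheme not allowed: %r" % scheme)
    return html.escape(value, quote=True)
```

Here escaped == payload, so a string that is "clean" for HTML text is still
dangerous in a src attribute, while clean_url(payload) raises ValueError.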