Safe eval of insecure strings containing Python data structures?

Tue Oct 14 10:56:00 EDT 2008

On Oct 13, 6:12 pm, George Sakkis <george.sak... at gmail.com> wrote:
> On Oct 13, 8:36 am, lkcl <luke.leigh... at googlemail.com> wrote:
>
>
>
> > On Oct 9, 4:32 am, "James Mills" <prolo... at shortcircuit.net.au> wrote:
>
> > > On Thu, Oct 9, 2008 at 2:26 PM, Warren DeLano <war... at delsci.com> wrote:
> > > > JSON rocks!  Thanks everyone.
>
> > > Yes it does :)
>
> > > > Ben wrote:
>
> > > >>More generally, you should never execute (via eval, exec, or whatever)
> > > >>*any* instruction from an untrusted path; especially not arbitrary
> > > >>data from an input stream.
>
> >  rubbish.  this is why a project i was involved with, to do execution
> > of code from a database instead of a filesystem had to be abandoned,
> > back in 2001.
>
> >  there are perfectly good systems for associating security context
> > with "arbitrary data" (as the security models of SE/Linux, based on
> > Flask, and the security model of windows nt, based on VAX/VMS
> > security, show).
>
> >  there was a flawed design decision in python 2.2 or python 2.3 which
> > resulted in an "escape route" - i believe it centered around either
> > __class__ or __new__ - in the c code, which the developers had not
> > considered, and would not correct.
>
> >  this decision resulted in the abandonment of the rexec.py module in
> > python: you can see for yourself because it raises a runtime exception
> > when you try to use it, issuing a warning.
>
> >  it's _perfectly_ possible to define security contexts and boundaries,
> > and to allow access to functions and modules on a per-security-context
> > basis.
>
> > *as defined by the application developer* [not by the developers of
> > python itself]
>
> > if an individual developer wants to allow "arbitrary code execution
> > from any data stream", it most certainly is _not_ anyone's place to
> > dictate to them that they "cannot do this".
>
> That's why eval and exec still exist (and will probably be around for
> a long time, if not forever). If you define your own external to
> python security contexts, what did the deprecated rexec buy you that
> eval/exec don't ?

 * being able to store python modules in a mysql database!
 * being able to add context to selecting which python module
   and which python function should be retrieved from the db (*1)
 * being able to "vet" function names, allowing only those which
   are supported routines (out of the database) and banning
   all of the "standard" modules.
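to make the list above a bit more concrete, here is a minimal sketch in present-day python. it is not the original rexec-based code: sqlite3 stands in for mysql, and the table layout, the ALLOWED_FUNCTIONS whitelist and the helper names are all made up for illustration. note that it only vets which entry point gets called; it does nothing to restrict what the loaded code itself can do.

```python
import sqlite3
import types

# hypothetical whitelist of "supported routines"
ALLOWED_FUNCTIONS = {"loginssh", "installOS"}


def load_module_from_db(conn, module_name):
    """Fetch module source from a (hypothetical) scripts table and execute it."""
    row = conn.execute(
        "SELECT content FROM scripts WHERE module_name = ?", (module_name,)
    ).fetchone()
    if row is None:
        raise LookupError("no such module in db: %s" % module_name)
    module = types.ModuleType(module_name)
    # the source runs with full privileges here; this is not a sandbox
    exec(compile(row[0], "<db:%s>" % module_name, "exec"), module.__dict__)
    return module


def call_vetted(module, func_name, *args):
    """Refuse to call anything that is not on the whitelist."""
    if func_name not in ALLOWED_FUNCTIONS:
        raise PermissionError("function not permitted: %s" % func_name)
    return getattr(module, func_name)(*args)


# in-memory database standing in for the real one
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE scripts (id INTEGER PRIMARY KEY, module_name TEXT, content TEXT)"
)
conn.execute(
    "INSERT INTO scripts (module_name, content) VALUES (?, ?)",
    ("redhat5", "def loginssh(ip):\n    return 'ssh to ' + ip\n"),
)

mod = load_module_from_db(conn, "redhat5")
result = call_vetted(mod, "loginssh", "10.0.0.1")
```
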

(*1) the context in which rexec.py was being used was for a data
centre "scanner" tool.  a really damn good one, too :)  some five
years later, we got things like nessus and the other scanner tools
being able to do "ping escalation", automated installs, ssh login
checks etc. etc. but this tool was written in early 2001 (!)

what we had was a 3-way-join on database tables:
* asset, comprising an id, name, OS name and IP address
* scripts, comprising an id, script content and the "module" name
* the scripts-to-os-mapper table, comprising an id, "module" name and
OS name

the 3-way join was between asset.os-name and scripts-to-os-mapper.os-name,
and between scripts-to-os-mapper.module-name and scripts.module-name.

the implications were that we could write per-OS modules (each with
identical function names, function parameters and purpose, of course).

then, if the customer decided that they wanted NT 4.0 instead of
Redhat 5, we simply changed the OS type in the assets table, called up
the "installOS" script, and it would be up to the 3-way-join to select
the appropriate script for the job.  no other work on our part was
needed (yes we had an automated way to network-install NT 4 and
Windows 2000).
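a rough sketch of that 3-way join, and of what happens when the OS type of an asset is changed. the column names (with underscores instead of hyphens), the sample rows and the script_for_asset helper are assumptions for illustration, and sqlite3 is used in place of the original database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE asset (id INTEGER PRIMARY KEY, name TEXT, os_name TEXT, ip TEXT);
CREATE TABLE scripts (id INTEGER PRIMARY KEY, content TEXT, module_name TEXT);
CREATE TABLE scripts_to_os_mapper (id INTEGER PRIMARY KEY, module_name TEXT, os_name TEXT);

INSERT INTO asset (name, os_name, ip) VALUES ('web1', 'Redhat 5', '10.0.0.1');
INSERT INTO scripts (content, module_name) VALUES ('# redhat code', 'redhat_mod');
INSERT INTO scripts (content, module_name) VALUES ('# nt4 code', 'nt4_mod');
INSERT INTO scripts_to_os_mapper (module_name, os_name) VALUES ('redhat_mod', 'Redhat 5');
INSERT INTO scripts_to_os_mapper (module_name, os_name) VALUES ('nt4_mod', 'NT 4.0');
""")


def script_for_asset(conn, asset_name):
    """Pick the script content matching the asset's current OS."""
    row = conn.execute(
        """
        SELECT s.content
        FROM asset a
        JOIN scripts_to_os_mapper m ON m.os_name = a.os_name
        JOIN scripts s ON s.module_name = m.module_name
        WHERE a.name = ?
        """,
        (asset_name,),
    ).fetchone()
    return row[0] if row else None


before = script_for_asset(conn, "web1")
# the customer changes their mind: only the asset row is touched
conn.execute("UPDATE asset SET os_name = 'NT 4.0' WHERE name = 'web1'")
after = script_for_asset(conn, "web1")
```

the point of the design is that nothing outside the asset table changes when the OS changes: the join does the selection.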

the example i remember best was "loginssh" - using the standard Popen
python library - which of course was slightly different on a per-OS
basis, because for NT there were CRLF issues to deal with, and also we
had installed a commercial version of sshd which behaved differently.

regarding exec / eval: yes, i _have_ used that in a similar sort of
way, in another project. catching NameError exceptions when executing a
piece of code, i would then retrieve the value for the variable which
came up from the exception by a SQL database call (which, perhaps
unsurprisingly, had been put there from a web interface).

by substituting the retrieved value into a dictionary to be used as
"locals" in the exec / eval call, i was able to repeatedly perform
this trick until the exec / eval succeeded, or the patience of the
user ran out.
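roughly, the retry loop looked like the sketch below. this is a reconstruction in current python, not the original code: eval_with_lookup, the max_tries limit and the dictionary standing in for the SQL call are all hypothetical. the empty "__builtins__" is only there so that every name has to come from the lookup; it is not a security boundary.

```python
import re


def eval_with_lookup(expr, lookup, max_tries=50):
    """Evaluate expr, fetching each undefined name via lookup() and retrying."""
    local_vars = {}
    for _ in range(max_tries):
        try:
            return eval(expr, {"__builtins__": {}}, local_vars)
        except NameError as exc:
            # pull the missing variable name out of the exception message
            match = re.search(r"name '(\w+)' is not defined", str(exc))
            if match is None:
                raise
            name = match.group(1)
            # in the original this was a SQL database call
            local_vars[name] = lookup(name)
    raise RuntimeError("gave up after %d tries" % max_tries)


# a plain dict stands in for the database table of values
values = {"width": 3, "height": 4, "limit": 10}
answer = eval_with_lookup("width * height > limit", values.__getitem__)
```

each missing name costs one more evaluation from the start, which is where the quadratic behaviour comes from.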

it was awfully inefficient - O(N^2) - but, given that the code being
executed wasn't particularly large (200 lines, max?) it wasn't that
important.

however, these were _purely_ mathematical evaluations - returning
numbers or booleans.  there wasn't anything radically complex - not
even _function_ calls.

so, the trick of doing overloading of "import" and "from x import y"
wasn't needed.
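for code that does need imports, the overloading trick could look something like the following in current python. this is a sketch under assumptions, not what rexec did internally: ALLOWED_MODULES, guarded_import and run_with_guarded_import are made-up names, and replacing __import__ on its own does not make exec safe against hostile code.

```python
import builtins

# hypothetical whitelist of importable modules
ALLOWED_MODULES = {"math"}


def guarded_import(name, globals=None, locals=None, fromlist=(), level=0):
    """Stand-in for __import__ that refuses anything not on the whitelist."""
    if name.split(".")[0] not in ALLOWED_MODULES:
        raise ImportError("import of %r is not permitted" % name)
    return builtins.__import__(name, globals, locals, fromlist, level)


def run_with_guarded_import(source):
    """exec source with a copy of the builtins whose __import__ is replaced."""
    restricted_builtins = dict(vars(builtins))
    restricted_builtins["__import__"] = guarded_import
    namespace = {"__builtins__": restricted_builtins}
    exec(source, namespace)
    return namespace


# both "import x" and "from x import y" go through guarded_import
ns = run_with_guarded_import("from math import sqrt\nroot = sqrt(16)\n")

try:
    run_with_guarded_import("import os\n")
    blocked = False
except ImportError:
    blocked = True
```
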

> In any case, rexec is a single pure python module;
> nothing stops you from copying it over to your project, hacking it and
> keep using it at your own risk.

 i know.  it just annoyed and disappointed me that the issue wasn't
resolved at the right level.