easy eval() fix?

Bengt Richter bokr at oz.net
Fri Oct 17 05:17:24 EDT 2003


On Wed, 15 Oct 2003 17:53:15 -0700, Geoff Gerrietts <geoff at gerrietts.net> wrote:

>Quoting John Roth (newsgroups at jhrothjr.com):
>> 
>> I don't know of a module that does this, but I'm not altogether
>> certain it wouldn't be possible to put something together that would
>> suit what you need in around the same time it took to write the
>> message.
>
>You might be surprised how quickly I type. ;)
>
>> What are the primitive types you need to convert from repr() string
>> format back to their object format?
>
>Literal statements.
>
>A list of integers:
>  [1, 2, 3, 4] 
>A list of strings:
>  ['1', '2', '3', '4']
>A string/string dict:
>  {'a': 'b', 'c': 'd'}
>
>Imagine the variations; they are plentiful.
>
Maybe look at what the compiler module produces? E.g.,

 >>> import compiler
 >>> compiler.transformer.parse('[1, 2, 3, 4]')
 Module(None, Stmt([Discard(List([Const(1), Const(2), Const(3), Const(4)]))]))
 >>> compiler.transformer.parse("['1', '2', '3', '4']")
 Module(None, Stmt([Discard(List([Const('1'), Const('2'), Const('3'), Const('4')]))]))
 >>> compiler.transformer.parse("{'a': 'b', 'c': 'd'}")
 Module(None, Stmt([Discard(Dict([(Const('a'), Const('b')), (Const('c'), Const('d'))]))]))

You could take the repr of that and then eval the string in an environment with your own
definitions of all those capitalized names, as in the example at the end.
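For anyone reading this on Python 3, where the compiler package no longer exists, roughly the same idea can be sketched with the ast module. This is only an assumption-laden port, not the code from this post: ast.dump() stands in for the repr of the compiler tree, Python 3.8 or later is assumed, and the helper names make_env and to_tuples are made up for illustration.

```python
import ast

def make_env():
    """Map every AST node class name to a function that returns a plain tuple."""
    env = {"__builtins__": {}}  # no builtins reachable from the eval'd text
    for name in dir(ast):
        obj = getattr(ast, name)
        if isinstance(obj, type) and issubclass(obj, ast.AST):
            # _n=name binds the current name; keyword fields become (field, value) pairs
            env[name] = lambda *args, _n=name, **kw: (_n,) + args + tuple(kw.items())
    return env

def to_tuples(src):
    """Parse src, dump the tree as text, and eval that text with our own node names."""
    dumped = ast.dump(ast.parse(src, mode="eval"))
    return eval(dumped, make_env())

tree = to_tuples("[1, 2, 3, 4]")
print(tree[0])  # name of the outermost node
```

Nothing from the original source string is executed here; only the dumped node names are called, and they are all our own functions.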

>On the other hand, anything that actually performs "operations" is not
>permissible.
>
>On the other hand, an error case:
>  [10 ** (10 *** 10)]
>
 >>> print '%s'% compiler.transformer.parse('[10 ** (10 *** 10)]')
 Traceback (most recent call last):
   File "<stdin>", line 1, in ?
   File "D:\python23\lib\compiler\transformer.py", line 50, in parse
     return Transformer().parsesuite(buf)
   File "D:\python23\lib\compiler\transformer.py", line 120, in parsesuite
     return self.transform(parser.suite(text))
   File "<string>", line 1
     [10 ** (10 *** 10)]
                 ^
 SyntaxError: invalid syntax

No problem catching syntax errors ;-)
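As a small sketch of that screening step in Python 3 (an assumption; ast.parse plays the role that compiler.transformer.parse plays above, and parses_ok is a made-up name), the syntax check can be wrapped like this:

```python
import ast

def parses_ok(src):
    """Return True if src is syntactically valid Python, False otherwise."""
    try:
        ast.parse(src)  # builds the tree only; nothing is evaluated
    except SyntaxError:
        return False
    return True

print(parses_ok("[10 ** (10 ** 10)]"))   # valid syntax, and never computed
print(parses_ok("[10 ** (10 *** 10)]"))  # '***' is not an operator
```

Note that the first expression passes this check even though evaluating it would be expensive, so a syntax check alone is not a screener.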

>This should not, for instance, choke the box for a day evaluating the
>expression; it should (probably) throw an exception but any scenario
>that does not allow the code to chew CPU time is a win over eval().
>
Time and memory resource quotas are an OS job, I think.
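One rough way to lean on the OS for the time part, sketched in Python 3 (an assumption, as is the name run_limited), is to do the work in a separate process and let the parent kill it after a wall-clock timeout. This is not a true per-process CPU or memory quota; those would come from OS facilities such as ulimit.

```python
import subprocess
import sys

def run_limited(code, seconds=5.0):
    """Run code in a separate interpreter; return its stdout, or None on timeout."""
    try:
        done = subprocess.run(
            [sys.executable, "-c", code],
            capture_output=True, text=True, timeout=seconds,
        )
    except subprocess.TimeoutExpired:
        return None  # the child was killed after `seconds` of wall-clock time
    return done.stdout

print(run_limited("print([1, 2, 3, 4])"))
print(run_limited("while True: pass", seconds=1.0))
```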

>Also, eval and exec do all their work inside a namespace where names
>get resolved to bound objects etcetera. That's not desirable. Nor is
>it desirable to permit an object to be called.
>
>What I'm interested in -- what eval seems most used for, certainly in
>this project -- is a general-purpose tool for transforming a string
>containing a literal statement into the Python data structure.
>
>I toyed with using the parser module to do this. I still may try to do
>that, but I don't know enough about ASN parse trees to understand why
>so many apparently unrelated symbols show up in the parse tree, and so
>I'm reluctant to start down this road without an ample budget of time
>to come to an understanding of such things.
I think taking the repr of the tree (which is what you get when you print the top node;
I presume the nodes repr themselves recursively down their subtrees) and evaluating it
as in the example below might work as a screener.
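On Python versions that ship the ast module (an assumption; it is not available in 2.3), ast.literal_eval covers the narrow case of turning a literal's text back into data. It accepts only literals and containers of literals, and rejects names, calls and operators such as **. The wrapper name safe_literal is made up for illustration.

```python
import ast

def safe_literal(text):
    """Return the Python value for a literal string, or raise ValueError."""
    try:
        return ast.literal_eval(text)
    except (ValueError, SyntaxError) as e:
        raise ValueError("not a plain literal: %r (%s)" % (text, e))

print(safe_literal("[1, 2, 3, 4]"))
print(safe_literal("{'a': 'b', 'c': 'd'}"))
for bad in ("[10 ** (10 ** 10)]", "foo()", "[10 ** (10 *** 10)]"):
    try:
        safe_literal(bad)
    except ValueError:
        print("rejected:", bad)
```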

>
>I don't have that ample budget of time in my project schedule, so I
>thought I would check to see if there was a quick fix available.
>
Here is a first go at implementing the idea mentioned above.
Not tested very much. I don't know what you need to exclude from compilation.

You will note a start at customizing -- i.e., I allowed calls to names
in the ok_to_call list, in case you need that. Plus it's a start on ideas how
to detect and allow particular things, or maybe disallow some things.

For some reason I disallowed keyword parameters in calls, and attribute access in general,
but you will need to think about each AST name and decide whether it plays a role
in code you want to accept for compilation.
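One way to make that per-node decision explicit is an allowlist rather than a list of "dangerous" names, so anything not considered is rejected by default. The sketch below is a different technique from the eval-the-repr trick in the script: it walks a Python 3 ast tree directly (Python 3.8 or later assumed). The names ALLOWED, OK_TO_CALL and check_expr are made up for illustration, and the choice of accepted nodes is only an example.

```python
import ast

# Node types accepted by this sketch; everything else is rejected.
ALLOWED = (
    ast.Expression, ast.Constant, ast.List, ast.Tuple, ast.Dict,
    ast.Name, ast.Load, ast.Call,
    ast.BinOp, ast.Add, ast.Sub, ast.Mult, ast.Pow,
    ast.UnaryOp, ast.UAdd, ast.USub,
)
OK_TO_CALL = {"bool", "int", "foo"}  # same idea as ok_to_call in the script

def check_expr(src):
    """Return True if src parses as an expression built only from ALLOWED nodes."""
    try:
        tree = ast.parse(src, mode="eval")
    except SyntaxError:
        return False
    for node in ast.walk(tree):
        if not isinstance(node, ALLOWED):
            return False
        if isinstance(node, ast.Call):
            # only plain-name calls to approved names, and no keyword arguments
            if not (isinstance(node.func, ast.Name) and node.func.id in OK_TO_CALL):
                return False
            if node.keywords:
                return False
    return True

for src in ("[1, 2, 3, 4]", "foo()", "bar(foo())", "a.b", "a b"):
    print(repr(src), check_expr(src))
```

Like the script, this only screens the source; it says nothing about how costly an accepted expression would be to evaluate.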

The tuple return for ok names is sure suggestive of lisp/scheme ;-)
It doesn't seem like a big jump to do a translation of simple Python to simple
scheme, but that just fell out ;-)

====< cksrc.py >===================================================
# cksrc.py -- check Python source for syntax errors and "dangerous" stuff
# V .01a bokr 2003-10-17

# NOTE: USE AT YOUR OWN RISK -- NO WARRANTY! MAKE YOUR OWN EDITS!!
# This is an experimental hack to demonstrate a nascent idea, no more.

# Following symbols from python 2.3 were/are retrieved by 
#   [x for x in dir(compiler.symbols.ast) if x[:2].istitle()] 
# and edited to comment out what to allow, the rest being "dangerous"
#
# Not everything has even been tested, never mind thoroughly
#
import sets
dangerous = sets.Set([
#  'Add',       'And',
   'AssAttr',   'AssList',   'AssName',  'AssTuple',
   'Assert',    'Assign', 'AugAssign', 'Backquote',
#  'Bitand',     'Bitor', 'Bitxor', 'Break',
   'CallFunc',     'Class',
#  'Compare',     'Const', 'Continue',  'Dict',   'Discard',
#  'Div',  'Ellipsis',
   'EmptyNode', # ??
   'Exec',
#  'Expression',  'FloorDiv',
   'For',      'From',  'Function',
   'Getattr',  'Global','If',    'Import',    'Invert',
   'Keyword',  # ??
   'Lambda',
#  'LeftShift',      'List',  'ListComp', 'ListCompFor', 'ListCompIf',
   'ListType',       'Mod',
#  'Module',       'Mul',      'Name',      'Node',
#  'Not',        'Or',
   'Pass',
#  'Power',
   'Print',   'Printnl',
   'Raise',    'Return',
#  'RightShift',     'Slice',  'Sliceobj',      'Stmt',
#  'Sub', 'Subscript',
   'TryExcept', 'TryFinally',
#  'Tuple', 'TupleType',
#  'UnaryAdd',  'UnarySub',
   'While',     'Yield',
])

## define a set of ok names to call
ok_to_call = sets.Set('bool int foo'.split()) # etc

import compiler
checkMethods = {}

class Error(Exception): pass


# build an environment dict with functions defined for all the AST names
# above, returning an innocuous tuple for accepted names, and throwing an
# exception for the names left un-commented in the "dangerous" set.
class NameChecker(object):
    def __init__(self, name, dangerous=True):
        self.name=name; self.dangerous = dangerous
    def __call__(self, *args):
        result = (self.name,)+args
        # allow call to specific function
        if self.name=='CallFunc' and args[0][0]=='Name' and args[0][1] in ok_to_call:
            return result
        if self.dangerous: raise Error, '%r not allowed!'%(result,)
        return result
        
for name in [x for x in dir(compiler.symbols.ast) if x[:2].istitle()]:
    checkMethods[name] = NameChecker(name, name in dangerous)

def cksrc(src, verbose=False):
    """
    Check Python source for syntax errors or banned usage.
    Return True if "ok" (USE THIS AT YOUR OWN RISK!), False otherwise.
    If verbose, print source, compiler.transformer.parse AST representation, and
    result of recompiling and evaluating the text of that AST in an
    environment where the node names are functions returning tuples for
    "safe" nodes, and throwing an Error exception if a "dangerous" node's
    name is called.
    """
    env = checkMethods.copy()   #XXX maybe can eliminate copy
    if verbose: print '%r =>'%src
    try: ast_repr = repr(compiler.transformer.parse(src))
    except Exception, e:
        if verbose: print '%s: %s'%(e.__class__.__name__, e)
        return False #not ok
    else:
        if verbose: print '%r =>'%ast_repr
        try:
            v = eval(ast_repr, env)
            if verbose: print v
            return True # ok
        except Exception,e:
            if verbose: print '%s: %s'%(e.__class__, e)
            return False # not ok
            
if __name__ == '__main__':
    import sys
    usage = """
    Usage: cksrc.py [-v | -v-] [-h] [- | -f file | -i]* | -- expression
        (quote expression elements as necessary to prevent shell interpretation)
        -v for verbose output (default for interactive)
        -v- to turn off verbose
        -h to print this usage text
        - to read source from stdin (prompted if tty)
        -f file to read from file
        -i to enter expressions interactively
        -- expression to take rest of command line as source"""
        
    args = sys.argv[1:]
    verbose = False; vopt=''
    if not args: raise SystemExit, usage
    while args:
        src = ''
        opt = args.pop(0)
        if opt=='-v': vopt=opt; verbose=True; continue
        elif opt=='-v-': vopt=opt; verbose=False; continue
        elif opt=='-h': print usage; continue
        elif opt=='-':
            if sys.stdin.isatty():  # call it; the bare method object is always true
                print 'Enter Python source and end with ^Z'
                verbose = True and vopt!='-v-'
            src = sys.stdin.read()
            print 'cksrc returned ok ==%s' % cksrc(src, verbose)
        elif opt=='-f':
            if not args: raise SystemExit, usage
            f = file(args.pop(0))
            src = f.read()
            f.close()
            print 'cksrc returned ok ==%s' % cksrc(src, verbose)
        elif opt=='-i':
            src='anything'; verbose = True and vopt!='-v-'
            print 'Enter expression (or just press Enter to quit):'
            while src:
                src = raw_input('Expr> ').rstrip()
                if src: print 'cksrc returned ok ==%s' % cksrc(src, verbose)
        elif opt == '--':
            verbose = True and vopt!='-v-'
            src = ' '.join(args); args=[]
            print 'cksrc returned ok ==%s' % cksrc(src, verbose)
===================================================================

A few examples:

[ 1:45] C:\pywk\rexec>cksrc.py -i
Enter expression (or just press Enter to quit):
Expr> [1, 2, 3, 4]
'[1, 2, 3, 4]' =>
'Module(None, Stmt([Discard(List([Const(1), Const(2), Const(3), Const(4)]))]))' =>
('Module', None, ('Stmt', [('Discard', ('List', [('Const', 1), ('Const', 2), ('Const', 3), ('Const', 4)]))]))
cksrc returned ok ==True
Expr> ['1', '2', '3', '4']
"['1', '2', '3', '4']" =>
"Module(None, Stmt([Discard(List([Const('1'), Const('2'), Const('3'), Const('4')]))]))" =>
('Module', None, ('Stmt', [('Discard', ('List', [('Const', '1'), ('Const', '2'), ('Const', '3'), ('Const', '4')]))]))
cksrc returned ok ==True
Expr> {'a':'b', 'c':'d'}
"{'a':'b', 'c':'d'}" =>
"Module(None, Stmt([Discard(Dict([(Const('a'), Const('b')), (Const('c'), Const('d'))]))]))" =>
('Module', None, ('Stmt', [('Discard', ('Dict', [(('Const', 'a'), ('Const', 'b')), (('Const', 'c'), ('Const', 'd'))]))]))
cksrc returned ok ==True
Expr> a,b
'a,b' =>
"Module(None, Stmt([Discard(Tuple([Name('a'), Name('b')]))]))" =>
('Module', None, ('Stmt', [('Discard', ('Tuple', [('Name', 'a'), ('Name', 'b')]))]))
cksrc returned ok ==True
Expr> a b
'a b' =>
SyntaxError: unexpected EOF while parsing (line 1)
cksrc returned ok ==False
Expr> 1*3+3**4
'1*3+3**4' =>
'Module(None, Stmt([Discard(Add((Mul((Const(1), Const(3))), Power((Const(3), Const(4))))))]))' =>
('Module', None, ('Stmt', [('Discard', ('Add', (('Mul', (('Const', 1), ('Const', 3))), ('Power', (('Const', 3), ('Const', 4))))))]))
cksrc returned ok ==True
Expr> 1*2+3**4
'1*2+3**4' =>
'Module(None, Stmt([Discard(Add((Mul((Const(1), Const(2))), Power((Const(3), Const(4))))))]))' =>
('Module', None, ('Stmt', [('Discard', ('Add', (('Mul', (('Const', 1), ('Const', 2))), ('Power', (('Const', 3), ('Const', 4))))))]))
cksrc returned ok ==True
Expr>

Checking on being able to call foo:

Enter expression (or just press Enter to quit):
Expr> foo()
'foo()' =>
"Module(None, Stmt([Discard(CallFunc(Name('foo'), [], None, None))]))" =>
('Module', None, ('Stmt', [('Discard', ('CallFunc', ('Name', 'foo'), [], None, None))]))
cksrc returned ok ==True
Expr> bar(foo())
'bar(foo())' =>
"Module(None, Stmt([Discard(CallFunc(Name('bar'), [CallFunc(Name('foo'), [], None, None)], None, None))]))" =>
__main__.Error: ('CallFunc', ('Name', 'bar'), [('CallFunc', ('Name', 'foo'), [], None, None)], None, None) not allowed!
cksrc returned ok ==False

Note that the order of evaluation let it accept the foo() call, which sets up the arg value for the
bar(foo()) call, but not the bar call itself. The other way around, it stops on bar right away:

Expr> foo(bar())
'foo(bar())' =>
"Module(None, Stmt([Discard(CallFunc(Name('foo'), [CallFunc(Name('bar'), [], None, None)], None, None))]))" =>
__main__.Error: ('CallFunc', ('Name', 'bar'), [], None, None) not allowed!
cksrc returned ok ==False
Expr>
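The ordering in both transcripts follows from Python evaluating a call's arguments before making the call itself, so the innermost node function always runs first. A tiny stand-alone sketch (Python 3 syntax assumed; Outer, Inner and node are made-up names, not AST node names) shows this:

```python
calls = []

def node(name):
    """Build a stand-in node function that records when it is called."""
    def build(*args):
        calls.append(name)
        return (name,) + args
    return build

env = {"Outer": node("Outer"), "Inner": node("Inner"), "__builtins__": {}}
result = eval("Outer(Inner())", env)
print(calls)  # ['Inner', 'Outer']
```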

HTH. Fun stuff, anyway ;-)

Regards,
Bengt Richter



