Rolling a Container Into a String

Paul McGuire ptmcg at austin.rr._bogus_.com
Fri Jun 25 01:44:34 EDT 2004


"Kamilche" <klachemin at home.com> wrote in message
news:889cbba0.0406241742.51a2980b at posting.google.com...
> I want to convert a dict into string form, then back again. After
> discovering that eval is insecure, I wrote some code to roll a Python
> object, dict, tuple, or list into a string. I've posted it below. Does
> anyone know an easier way to accomplish this? Essentially, I want to
> avoid doing an 'eval' on a string to get it back into dict form... but
> still allow nested structures. (My current code doesn't handle nested
> structures.)
>
Your unroll is just repr().
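
As a quick illustration (a minimal sketch; `sample` is a made-up structure, not taken from the code you posted), repr() already produces the string form, nesting included:

```python
# 'sample' is a hypothetical nested structure used only for illustration
sample = {'a': [1, 2, (3, 4)]}

# the "unroll" step: container to string, nested items handled by repr() itself
text = repr(sample)
print(text)
```

The hard part is therefore only the trip back from string to container.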

Here is a pyparsing routine to "roll up" your data structures, a mere 60
lines or so.  It's fairly tolerant of some odd cases, and fully handles
nested data.  (Extension to include remaining items, such as boolean data
and scientific notation, is left as an exercise for the reader.)  I hope
this is fairly easy to follow - I've dropped in a few comments.

For those of you who've been living in a cave, you can download pyparsing at
http://pyparsing.sourceforge.net.

-- Paul


from pyparsing import Word, Suppress, Forward, quotedString, nums, \
                Combine, Optional, delimitedList, Group

# create a dictionary of ugly data, complete with nested lists, tuples
# and dictionaries, even imaginary numbers!
d1 = {}
d1['a'] = [1,2,3,[4,5,6]]
d1['b'] = (7,8,(9,10),'a',"",'')
d1['c'] = { 'aa' : 1, 'bb' : "lskdj'slkdjf", 'cc':1.232, 'dd':(('z',),) }
d1[('d','e')] = 5+10j

print(repr(d1))

testdata = repr(d1)
"""
looks like this:
{'a': [1, 2, 3, [4, 5, 6]], ('d', 'e'): (5+10j), 'c': {'aa': 1, 'cc': 1.232,
'dd': (('z',),), 'bb': "lskdj'slkdjf"}, 'b': (7, 8, (9, 10), 'a', '', '')}
"""

# define low-level data elements
intNum = Word( nums+"+-", nums )
realNum = Combine(intNum + "." + Optional(Word(nums)))
number = realNum | intNum
imagNum = Combine( "(" + number + "+" + number + "j" + ")" )

item = Forward() # set up for recursive grammar definition
tupleDef = Suppress("(") + ( delimitedList( item ) ^
                             ( item + Suppress(",") ) ) + Suppress(")")
listDef = Suppress("[") + delimitedList( item ) + Suppress("]")
keyDef = tupleDef | quotedString | imagNum | number
keyVal = Group( keyDef + Suppress(":") + item )
dictDef = Suppress("{") + delimitedList( keyVal ) + Suppress("}")

item << ( quotedString | number | imagNum |
           tupleDef | listDef | dictDef )

# define low-level conversion routines
intNum.setParseAction( lambda s,loc,toks: int(toks[0]) )
realNum.setParseAction( lambda s,loc,toks: float(toks[0]) )
# no built-in to convert imaginaries?
imagNum.setParseAction( lambda s,loc,toks: eval(toks[0]) )

# strip leading and trailing character from parsed quoted string
quotedString.setParseAction( lambda s,loc,toks: toks[0][1:-1] )

# define list-to-list/tuple/dict routines
evalTuple = lambda s,loc,toks: [ tuple(toks) ]
evalList  = lambda s,loc,toks: [ toks.asList() ]
evalDict  = lambda s,loc,toks: [ dict([tuple(kv) for kv in toks]) ]

tupleDef.setParseAction( evalTuple )
listDef.setParseAction( evalList )
dictDef.setParseAction( evalDict )

# first element of returned tokens list is the reconstructed list/tuple/dict
results = item.parseString( testdata )[0]
print(results)

if repr(results) == repr(d1):
    print("Eureka!")
else:
    print("Compare results for mismatch")
    print(repr(results))
    print(repr(d1))
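
For comparison, here is a minimal sketch of a standard-library route. It is not the pyparsing approach above but a different technique: later Python versions ship ast.literal_eval, which accepts only literal syntax (strings, numbers, tuples, lists, dicts and the like) and so avoids the open-ended eval. The sketch assumes a current Python 3 and rebuilds the same test dictionary; the name `restored` is just illustrative.

```python
import ast

# same test data as in the pyparsing example
d1 = {}
d1['a'] = [1, 2, 3, [4, 5, 6]]
d1['b'] = (7, 8, (9, 10), 'a', "", '')
d1['c'] = {'aa': 1, 'bb': "lskdj'slkdjf", 'cc': 1.232, 'dd': (('z',),)}
d1[('d', 'e')] = 5+10j

text = repr(d1)

# literal_eval parses the string but only builds literal values;
# function calls, names and attribute access raise ValueError
restored = ast.literal_eval(text)
print(restored == d1)

# a non-literal expression is rejected instead of being executed
try:
    ast.literal_eval("__import__('os')")
    rejected = False
except ValueError:
    rejected = True
print(rejected)
```

The trade-off is that literal_eval is fixed to Python's own literal syntax, whereas a grammar like the one above can be tightened or extended to whatever format you want to accept.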




