pickle/marshal internal format 'life expectancy'/backward compatibility

Adam DePrince adam at cognitcorp.com
Sun Feb 6 02:20:34 EST 2005


On Sat, 2005-02-05 at 17:04, Tim Peters wrote:
> [Philippe C. Martin]
> > I am looking into using the pickle format to store object/complex data
> > structures into a smart card as it would make the design of the embedded
> > application very simple.
> >
> > Yet the card might have to stay in the pocket of the customer for a few
> > years, during which the back office application responsible for
> > retrieving the information from the card might evolve as well as the
> > python release it relies upon.
> >
> > Is there a commitment for python releases to be able to interpret
> > 'older' pickle/marshal internal formats ?
> 
> Pickle and marshal have nothing in common, and there's no
> cross-release guarantee about marshal behavior.  Pickles are safe in
> this respect; indeed, each Python released to date has been able to
> load pickles created by all earlier Python releases.  Of course a
> pickle produced by a later Python may not be loadable by an earlier
> Python, although even in that direction you can get cross-release
> portability by forcing the newer Python to restrict itself to produce
> obsolete pickle formats.  Reading Lib/pickletools.py in a current
> release will explain all this.


How complicated is your data structure?  Might you just store:

repr( <mydatastructure> )

and eval it later?  Trust is an issue; you are vulnerable to malicious
code, but no more so than with pickle or marshal.
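
A minimal sketch of the repr round trip. Note that it swaps `eval` for `ast.literal_eval`, which is a substitution on my part and not what is suggested above: `literal_eval` only accepts Python literals (strings, numbers, tuples, lists, dicts, sets, booleans, None), so it will not run arbitrary code, but the stored structure has to be built from those types alone. The sample `mydata` is made up.

```python
import ast

mydata = {"what": "my data structure", "counts": [1, 2, 3], "ok": True}

# Store the structure as plain text.
stored = repr(mydata)

# Parse it back; only literals are accepted, function calls are rejected.
restored = ast.literal_eval(stored)

def is_rejected(text):
    """Return True if literal_eval refuses to evaluate the given text."""
    try:
        ast.literal_eval(text)
    except (ValueError, SyntaxError):
        return True
    return False
```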

One quick and dirty way to be a little safer is to "sign" the data you
store.

# To save your data
import sha
import cPickle
mysecret = "abc"
mydata = {"what":"my data structure"}
# Binary mode: the digest is raw bytes, not text
f = open( "/tmp/myfile.txt", "wb" )
mydata = cPickle.dumps( mydata, protocol=0 )
# I'm assuming this is a flash device ... let's be safe and not assume
# that write is buffered, so everything goes out in a single write.
f.write( sha.new( mysecret + mydata ).digest() + mydata )
f.close()



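# To load your data
import sha
import cPickle
mysecret = "abc"
# Open read-only; "w+" would truncate the file before we read it
f = open( "/tmp/myfile.txt", "rb" )
hash = f.read( sha.digest_size )
mydata = f.read()
f.close()
# Check the signature before unpickling anything
if sha.new( mysecret + mydata ).digest() != hash:
    raise ValueError( "Somebody is trying to hack you!" )
mydata = cPickle.loads( mydata )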

Of course, the security of this scheme is dependent on a lot, including
the strength of sha, your ability to keep your secret key secret, the
correctness of what I'm saying, etc etc etc.
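
For anyone reading this on a Python 3 interpreter, where the `sha` and `cPickle` modules no longer exist, the same idea can be sketched with `hashlib`, `hmac` and `pickle`. This is only a sketch under assumptions: the `seal` and `unseal` names are hypothetical, the secret is a placeholder, and it uses HMAC-SHA256 rather than a bare hash of secret + data, since a plain hash over a secret prefix is open to length-extension attacks.

```python
import hashlib
import hmac
import pickle

SECRET = b"abc"                      # placeholder secret, keep the real one off the card
DIGEST = hashlib.sha256              # assumed choice of hash
DIGEST_SIZE = DIGEST().digest_size   # 32 bytes for SHA-256

def seal(obj, secret=SECRET):
    """Return tag + pickle, where tag is an HMAC over the pickle bytes."""
    payload = pickle.dumps(obj, protocol=0)
    tag = hmac.new(secret, payload, DIGEST).digest()
    return tag + payload

def unseal(blob, secret=SECRET):
    """Verify the tag before unpickling; raise ValueError on mismatch."""
    tag, payload = blob[:DIGEST_SIZE], blob[DIGEST_SIZE:]
    expected = hmac.new(secret, payload, DIGEST).digest()
    # compare_digest avoids leaking how many leading bytes matched
    if not hmac.compare_digest(tag, expected):
        raise ValueError("signature mismatch; refusing to unpickle")
    return pickle.loads(payload)

blob = seal({"what": "my data structure"})
roundtrip = unseal(blob)
```

The important part is the order in `unseal`: the tag is checked first, and `pickle.loads` only ever sees bytes that were produced by someone holding the secret.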




Adam DePrince 




