pickle juice and YAML (Re: [Tutor] dumping .pck files)

Neal McBurnett neal at bcn.boulder.co.us
Wed Jan 5 03:31:24 CET 2005


Thanks to Patric Michael and Alan Gauld for helpful responses.
The mailman "config_list" command was perfect for my near-term needs.

A bit more digging led me to juicier info on pickles, for my longer
term nourishment and development.

Lots of documentation is in /usr/lib/python2.3/pickletools.py
What I was asking for is provided by the "dis()" routine.  E.g.

 import pickletools
 pickletools.dis(open("config.pck", "rb"))

which requires no class information at all and produces a low-level
byte-by-byte disassembly like this:
    0: }    EMPTY_DICT
    1: q    BINPUT     1
    3: (    MARK
    4: U        SHORT_BINSTRING 'send_welcome_msg'
   22: q        BINPUT     2
 ....
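
Here is a small sketch of the same idea that runs without any Mailman
files.  The one-entry dict is a made-up stand-in for the real
config.pck, and the code assumes a current Python 3, where dis() accepts
a bytes object and an "out" stream.  The opcodes it prints may differ
somewhat from the listing above.

```python
import io
import pickle
import pickletools

# Hypothetical stand-in for the real Mailman configuration data.
config = {"send_welcome_msg": 1}

# Protocol 1 is the old "binary" format discussed in this message.
data = pickle.dumps(config, protocol=1)

# dis() writes its listing to a stream and returns None,
# so capture the text in a StringIO instead of printing the result.
out = io.StringIO()
pickletools.dis(data, out=out)
listing = out.getvalue()
print(listing)
```

No class definitions are needed, because dis() only decodes the opcodes
and never builds the objects they describe.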

The page at http://python.active-venture.com/lib/node63.html clarifies
that in fact the default pickle format is ASCII.  But mine was
protocol version 1 ("binary") and thus unpleasant to look at in an
editor.  The ASCII version is safe to view in an editor but probably
still awfully cryptic.
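
To make the difference between the two formats concrete, here is a
rough sketch that pickles the same made-up dict with protocol 0 and
protocol 1 and checks which bytes are printable.  It assumes Python 3,
where the protocol has to be passed explicitly because the default is
no longer the ASCII one; is_plain_ascii() is a helper invented for this
example.

```python
import pickle

# Hypothetical stand-in for the real configuration data.
config = {"send_welcome_msg": 1}

text_pickle = pickle.dumps(config, protocol=0)    # ASCII format
binary_pickle = pickle.dumps(config, protocol=1)  # old "binary" format


def is_plain_ascii(data):
    """Return True if every byte is a newline or printable ASCII."""
    return all(byte == 10 or 32 <= byte < 127 for byte in data)


print(text_pickle)
print(binary_pickle)
print(is_plain_ascii(text_pickle), is_plain_ascii(binary_pickle))

# Both formats carry the same data.
same_data = pickle.loads(text_pickle) == pickle.loads(binary_pickle)
print(same_data)
```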

I also learned:

 One of the design goals of the pickle protocol is to make pickles
 "context-free": as long as you have installed the modules containing
 the classes referenced by a pickle, you can unpickle it, without
 needing to import any of those classes ahead of time.

and I succeeded in getting a good dump by running my original
two-liner (the pickle.load() call quoted below) in the directory that
contained the file MailList.py, or by just doing
"from Mailman import MailList" first.

Details are in PEP 307: http://www.python.org/peps/pep-0307.html
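
When the class source is not available at all, one possible approach is
to have the unpickler substitute empty placeholder classes.  The sketch
below is an illustration, not something Mailman or the pickle module
provides: StubUnpickler and the "fakemailman" module are names invented
here, and the code relies on the Python 3 pickle.Unpickler.find_class()
hook.

```python
import io
import pickle
import sys
import types


class StubUnpickler(pickle.Unpickler):
    """Unpickler that builds placeholder classes for unimportable ones."""

    def find_class(self, module, name):
        try:
            return super().find_class(module, name)
        except (ImportError, AttributeError):
            # Empty stand-in class that keeps the original names.
            return type(name, (object,), {"__module__": module})


# Simulate a pickle written by an application we do not have installed:
# register a throwaway module, pickle an instance, then remove the module.
fake = types.ModuleType("fakemailman")
MailList = type("MailList", (object,), {"__module__": "fakemailman"})
fake.MailList = MailList
sys.modules["fakemailman"] = fake

original = MailList()
original.send_welcome_msg = 1
data = pickle.dumps(original, protocol=1)

del sys.modules["fakemailman"]  # the class can no longer be imported

# A plain pickle.loads(data) would now fail with an ImportError.
restored = StubUnpickler(io.BytesIO(data)).load()
print(type(restored).__name__, vars(restored))
```

The restored object is only a bag of attributes with none of the
original methods, which is enough for inspecting the stored values.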



Finally, another serialization format that is eminently readable is
YAML ("YAML Ain't Markup Language").  Besides Python, it has
implementations for Ruby and other languages, and it can be used as an
alternative to XML.

 http://www.yaml.org/
 http://www.pyyaml.org
 http://www.yaml.org/spec/
 yaml and xml
  http://www-106.ibm.com/developerworks/library/x-matters23.html
 http://yaml.kwiki.org/?YamlInFiveMinutes

I found it a bit difficult to uncover the latest on Python
and YAML.  These notes should help.

To get your feet wet, download from

 http://www.pyyaml.org/cgi-bin/trac.cgi/wiki/ArchiveTarballs

The most recent seems to be Mike Orr's Work in Progress:
 http://python.yaml.org/dist/PyYaml_0.32_MONEW.tar.gz

Unpack it, run "demo.py", and read the code to figure out what is
going on :-) Don't miss the experimental "ypath" demo which is, I
guess, sort of like XPath and allows you to select data items out of
hierarchical data structures.

There is some documentation at
 http://www.pyyaml.org/cgi-bin/trac.cgi/wiki/UserDocumentation

But as described at 
 http://www.pyyaml.org/cgi-bin/trac.cgi/ticket/11

there are currently security issues with the PyYaml implementation when
importing arbitrary data structures, along with other limitations.
But YAML is an interesting development.

Another implementation, for Ruby, Perl, Python, PHP and OCaml, is
syck: http://whytheluckystiff.net/syck/

but I haven't seen any demos of that for Python, just for Ruby.

Cheers,

Neal McBurnett                 http://bcn.boulder.co.us/~neal/
Signed and/or sealed mail encouraged.  GPG/PGP Keyid: 2C9EBA60

On Sun, Jan 02, 2005 at 09:57:08PM -0700, Neal McBurnett wrote:
> I want to explore some mailman config files, e.g.
> config.pck.  Some quick searches informed me that these
> contain pickled configuration info.
> 
> My first naive attempt to dump what was in them failed:
> 
> >>> import pickle
> >>> pickle.load(open("config.pck"))
> traceback....
> ImportError: No module named Mailman.Bouncer
> 
> It seems that to do a good job of dumping the data, I need to tell it
> what this class looks like.
> 
> Are there alternatives?  Does the pickle format really not provide a
> way to inspect the data without the class definitions?  E.g. if I
> didn't have the source code or didn't want to dig around in it?
> 
> Is there a simple way to dump everything it does understand, and
> just print stubs for the parts it doesn't?
> 
> Or is there some nice software to do this all for me?
> 
> Or perhaps does mailman itself have handy command-line tools to aid in
> comparing configurations across mailing lists?
> 
> Are there other serialization options that would make this easier?
> 
> Many thanks,
> 
> Neal McBurnett                 http://bcn.boulder.co.us/~neal/
> Signed and/or sealed mail encouraged.  GPG/PGP Keyid: 2C9EBA60

