Do you guys have any thoughts on the merits of adding dumpXML and loadXML methods to the pickle module?

The only disadvantage that comes to mind is that the file sizes are larger (though they may compress more efficiently).

The advantages center around portability and the use of existing tools:

-- The pickles would be validatable against a DTD or schema
-- Pickles would be more human readable than the current format
-- XSLT could make translations to HTML, JavaPickle formats, more compact formats, etc.
-- XPath could be used as a recursive search tool
-- Pickles would be editable and viewable with XML editors
-- No need for stack machine instructions to be included
-- Python object trees could potentially be loaded in other languages
-- The DTD can be used by non-Python sources to create data that is directly loadable into Python objects
-- Pickle security can be improved by using tight DTDs instead of copyreg.

I would appreciate your thoughts.

Raymond Hettinger

P.S. Here's an example of what it would look like:

    class Circle:
        def __init__(self, rad):
            self.rad = rad

    class Square:
        def __init__(self, side):
            self.side = side
        def __getinitargs__(self):
            return (self.side,)

    class Triangle:
        def __init__(self, side1, side2, side3):
            self.sides = map(math.toRadians, (side1, side2, side3))
        def __getstate__(self):
            return self.sides
        def __setstate__(self, state):
            self.sides = state

    d = {"one":"uno", "two":"dos"}
    obj = [d, 42, u"abc", [1.0,2+5j], Circle(5), Square(4),
           Triangle(3,4,5), d, None, True, False, Circle, len]
    pickle.dumpsXML(obj)

    <objectlist>
      <list>
        <dict id="0">
          <item> <str>one</str> <str>uno</str> </item>
          <item> <str>two</str> <str>dos</str> </item>
        </dict>
        <int>42</int>
        <unicode>abc</unicode>
        <list>
          <float>1.0</float>
          <complex>2+5j</complex>
        </list>
        <instance module="__main__" name="Circle">
          <dict>
            <item> <str>rad</str> <int>5</int> </item>
          </dict>
        </instance>
        <instance module="__main__" name="Square">
          <tuple> <int>4</int> </tuple>
        </instance>
        <instance module="__main__" name="Triangle">
          <list>
            <float>0.052358333333333333</float>
            <float>0.069811111111111115</float>
            <float>0.087263888888888891</float>
          </list>
        </instance>
        <memo idref="0"/>
        <none/>
        <true/>
        <false/>
        <global module="__main__" name="Circle"/>
        <global module="__builtin__" name="len"/>
      </list>
    </objectlist>

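For readers who want to experiment with the idea, here is a minimal sketch of how such a dump could be produced for a small subset of types. It is not the proposed pickle API: `to_element` and `dumps_xml` are hypothetical names, it targets Python 3 with the standard `xml.etree.ElementTree` module, it skips instances and globals entirely, and unlike the sample above it gives every container an `id` rather than only the shared ones.

```python
import xml.etree.ElementTree as ET


def to_element(obj, memo):
    """Convert a small subset of Python values to an XML element (sketch)."""
    # None and the two bools are checked first; bool is a subclass of int,
    # so it must be handled before the generic scalar branch.
    if obj is None:
        return ET.Element("none")
    if obj is True:
        return ET.Element("true")
    if obj is False:
        return ET.Element("false")
    if isinstance(obj, (int, float, complex, str)):
        elem = ET.Element(type(obj).__name__)
        elem.text = obj if isinstance(obj, str) else repr(obj)
        return elem
    if isinstance(obj, (list, tuple, dict)):
        # A container seen before is emitted as a back-reference.
        if id(obj) in memo:
            return ET.Element("memo", idref=str(memo[id(obj)]))
        memo[id(obj)] = len(memo)
        elem = ET.Element(type(obj).__name__, id=str(memo[id(obj)]))
        if isinstance(obj, dict):
            for key, value in obj.items():
                item = ET.SubElement(elem, "item")
                item.append(to_element(key, memo))
                item.append(to_element(value, memo))
        else:
            for value in obj:
                elem.append(to_element(value, memo))
        return elem
    raise TypeError("unsupported type in this sketch: %r" % type(obj))


def dumps_xml(obj):
    """Hypothetical stand-in for the proposed dumpsXML."""
    root = ET.Element("objectlist")
    root.append(to_element(obj, {}))
    return ET.tostring(root, encoding="unicode")


d = {"one": "uno", "two": "dos"}
xml_text = dumps_xml([d, 42, [1.0, 2 + 5j], d, None, True, False])

# ElementTree's limited XPath support is enough for a recursive search.
root = ET.fromstring(xml_text)
str_values = [e.text for e in root.findall(".//str")]
```

The last two lines illustrate the "recursive search" point from the list of advantages: once the data is XML, a generic path query finds every string in the tree without any pickle-specific code.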
> Do you guys have any thoughts on the merits of adding dumpXML and loadXML methods to the pickle module?
That doesn't belong in the pickle module. An XML format to store Python-specific data structures doesn't make sense. Storing data in XML makes total sense, but should probably be guided by some XML standard and not by the set of data types that happen to be available in Python.

Put it in the xml module. Note that xmlrpclib.py already has a way to do this, for the data types supported by XMLRPC.

--Guido van Rossum (home page: http://www.python.org/~guido/)
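As a rough illustration of the xmlrpclib point, the sketch below round-trips a few values through the XML-RPC marshaller. It assumes Python 3, where xmlrpclib is available as `xmlrpc.client`; the sample data and variable names are made up for the example.

```python
import xmlrpc.client

# dumps() expects a tuple of parameters; without a method name it
# emits just the <params> chunk.
data = ({"one": "uno", "two": "dos"}, 42, "abc", [1.0, True, False])
xml_text = xmlrpc.client.dumps(data)

# loads() returns the unmarshalled parameters and the method name
# (None here, because none was given to dumps()).
params, method_name = xmlrpc.client.loads(xml_text)

# Types outside the XML-RPC set, such as complex, are rejected.
try:
    xmlrpc.client.dumps((2 + 5j,))
    complex_supported = True
except TypeError:
    complex_supported = False
```

This covers only the XML-RPC data types, so class instances, shared references, and globals from the proposal have no counterpart here.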
Guido van Rossum <guido@python.org> writes:
> That doesn't belong in the pickle module.
Also, it doesn't belong in the core (right now). PyXML has the xml.marshal package, which has a "generic" XML marshaller, and one that generates WDDX. There are a few users of WDDX, but nobody has ever asked for marshalling of arbitrary Python objects.

Contributions to this package are welcome (sf.net/projects/pyxml); once such a module has existed for a couple of PyXML releases, we can tell whether there is enough demand for it to be in the standard library (which I doubt).

Regards,
Martin
Raymond Hettinger wrote:
> Do you guys have any thoughts on the merits of adding dumpXML and loadXML methods to the pickle module?
> The only disadvantage that comes to mind is that the file sizes are larger (though they may compress more efficiently).
FYI, there is such a module in PyXML.

--
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
_______________________________________________________________________
eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,...
Python Consulting: http://www.egenix.com/
Python Software: http://www.egenix.com/files/python/
participants (4)
- Guido van Rossum
- M.-A. Lemburg
- martin@v.loewis.de
- Raymond Hettinger