Fwd: [XML-SIG] xmlpickle.py ?!

Jim Fulton jim@digicool.com
Mon, 07 Aug 2000 14:39:24 -0400


<note>I don't normally have time to follow the xml sig.
  Someone kindly forwarded Marc-Andre's note to me.
  I haven't seen the rest of this thread.
</note>

"M.-A. Lemburg" <mal@lemburg.com> wrote:
> 
> I'm currently looking into writing an xmlpickle.py module
> with the intent to be able to pickle (and unpickle) arbitrary
> Python objects in a way that makes the objects editable through
> an XML editor or convertible to some other format using the
> existing XML tools.

I wonder whether a tool that generated XML for arbitrary Python
objects would really be that useful for transfer to
other applications. I suspect not.
 
> After looking at the archives of this SIG, I found that the
> idea was already tossed around a few times, but I couldn't
> find any downloadable outcome.

Zope has a facility that I've been meaning to make more 
generally available but haven't had time to. :/
In my case, I wanted to be able to convert to/from binary
pickles and xml, so I had an intern write something that 
works from pickles, rather than from objects. It can be used
to look at existing pickles and, in conjunction with pickle
or cPickle, to convert objects to and from XML.

If you're interested, let me know and I'll provide more details.
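
To illustrate what "working from pickles, rather than from objects"
can look like, here is a minimal sketch in present-day Python. It is
not the Zope facility: it only walks the opcodes of a pickle stream
with the standard pickletools module and writes one flat element per
opcode, without ever unpickling the object. The function name
pickle_ops_to_xml and the <op> element are assumptions made for this
example.

```python
import pickle
import pickletools
from xml.sax.saxutils import escape, quoteattr


def pickle_ops_to_xml(data: bytes) -> str:
    """Render each opcode of a pickle stream as one XML element.

    Hypothetical helper: the stream is only scanned, never loaded,
    so no classes need to be importable.
    """
    lines = ["<pickle>"]
    for opcode, arg, _pos in pickletools.genops(data):
        name = quoteattr(opcode.name)
        if arg is None:
            lines.append(f"  <op name={name}/>")
        else:
            # repr() keeps control characters out of the XML text;
            # escape() takes care of &, < and >.
            lines.append(f"  <op name={name}>{escape(repr(arg))}</op>")
    lines.append("</pickle>")
    return "\n".join(lines)


xml_text = pickle_ops_to_xml(pickle.dumps({"title": ""}, protocol=0))
print(xml_text)
```

A real converter would rebuild the nesting (dictionary, list, item)
from the opcode sequence instead of listing the opcodes one by one.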

> I've looked at pickle.py a bit and realized that the extensible
> nature of the pickle mechanism would probably cause trouble
> because the DTD would have to be generated as well (not a good
> idea).

Why would a DTD have to be generated?

> There are two alternatives to this though:
> 
> 1. add an element which handles all non-core Python object
>    types (the ones registered through copy_reg)
> 
> 2. use an abstract DTD altogether
> 
> Example for 1:
> 
> <PythonPickle version="1.0">
> <Dictionary>
>         <String name="aString">abcdef</String>
>         <List name="aList">
>                 <Integer>10</Integer>
>                 <String>abc</String>
>         </List>
>         <Instance name="aInstance" module="test" classname="test">
>                 <String name="instvar">value</String>
>         </Instance>
>         <Object name="myObject" constructor="mx.DateTime.DateTime">
>                 <Tuple>
>                         <Integer>2000</Integer>
>                         <Integer>8</Integer>
>                         <Integer>6</Integer>
>                 </Tuple>
>         </Object>
> </Dictionary>
> </PythonPickle>

This is the route I took. Here's an example that's
probably a lot bigger than you want....

    <pickle>
      <dictionary id="3046.4">
        <item>
            <key> <string id="3046.5" encoding="repr">title</string> </key>
            <value> <string encoding="repr"></string> </value>
        </item>
        <item>
            <key> <string id="3046.6" encoding="repr">raw</string> </key>
            <value> <string id="3046.7" encoding="cdata"><![CDATA[

<dtml-var standard_html_header>\n
<h2><dtml-var title_or_id> <dtml-var document_title></h2>\n
<dtml-var "\'\\n\\n\'">\n
<p>\n
This is the <dtml-var document_id> Document \n
in the <dtml-var title_and_id> Folder.\n
</p>\n
<dtml-var standard_html_footer>

]]></string> </value>
        </item>
        <item>
            <key> <string id="3046.8" encoding="repr">__ac_local_roles__</string> </key>
            <value>
              <dictionary id="3046.9">
                <item>
                    <key> <string id="3046.10" encoding="repr">jim</string> </key>
                    <value>
                      <list id="3046.11">
                        <string id="3046.12" encoding="repr">Owner</string>
                      </list>
                    </value>
                </item>
              </dictionary>
            </value>
        </item>
        <item>
            <key> <string id="3046.13" encoding="repr">globals</string> </key>
            <value>
              <dictionary id="3046.14"/>
            </value>
        </item>
        <item>
            <key> <string id="3046.15" encoding="repr">__name__</string> </key>
            <value> <string id="3046.16" encoding="repr">m2</string> </value>
        </item>
        <item>
            <key> <string id="3046.17" encoding="repr">_vars</string> </key>
            <value>
              <dictionary id="3046.18"/>
            </value>
        </item>
      </dictionary>
    </pickle>


Note that this is pretty much a straight translation of
the Python pickle "schema". :)  Note the id attributes
and reference tags, which allow cyclical data structures.
(I recently discovered that there is a problem with my id
values. Does anyone know what it is? ;)
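
How id attributes plus reference tags make cyclical structures
representable can be sketched as follows. This is an illustration
only, not the Zope code: the <reference id="..."/> form, the helper
names build and dumps, and the "o1", "o2" id values are assumptions
made for this example, and only strings, lists and dictionaries are
handled.

```python
import xml.etree.ElementTree as ET


def build(obj, parent, seen):
    """Append an element for obj under parent.

    seen maps id(obj) to the XML id already given to that object.
    """
    if isinstance(obj, str):
        ET.SubElement(parent, "string").text = obj
        return
    if not isinstance(obj, (list, dict)):
        raise TypeError(f"unsupported type: {type(obj).__name__}")
    if id(obj) in seen:
        # Already written once: point back at it instead of recursing.
        ET.SubElement(parent, "reference", id=seen[id(obj)])
        return
    # A letter prefix keeps the value a legal XML Name, should a DTD
    # ever declare the attribute as type ID.
    xml_id = seen[id(obj)] = f"o{len(seen) + 1}"
    if isinstance(obj, list):
        node = ET.SubElement(parent, "list", id=xml_id)
        for element in obj:
            build(element, node, seen)
    else:
        node = ET.SubElement(parent, "dictionary", id=xml_id)
        for key, value in obj.items():
            item = ET.SubElement(node, "item")
            build(key, ET.SubElement(item, "key"), seen)
            build(value, ET.SubElement(item, "value"), seen)


def dumps(obj):
    root = ET.Element("pickle")
    build(obj, root, {})
    return ET.tostring(root, encoding="unicode")


roles = ["Owner"]
doc = {"jim": roles}
doc["me"] = doc            # the dictionary contains itself
doc["also_jim"] = roles    # the same list is shared twice
xml_text = dumps(doc)
print(xml_text)
```

The dictionary and the list are each written once with an id; the
second occurrence of either becomes a reference element, so the walk
terminates even though doc contains itself.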

One other note. I found the XML spec to be a little
ambiguous (or maybe I'm just too dense) wrt binary data 
and newlines, so I decided to punt and escape newlines and
binary data.  I encode strings as "repr", which is a 
repr-like encoding that escapes things in a way that is
just a tad more terse than repr, and I switch to base64 when
the escaping penalty exceeds 40%.  Since a lot of our pickles
have marked up text, I automatically use CDATA sections when
I can and where it would help. See the example above.
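
The choice between the "repr" and base64 encodings can be sketched
like this. The exact escaping rules and the way the penalty is
measured are not spelled out above, so both are assumptions here:
escape_repr backslash-escapes backslashes, newlines and non-printable
bytes, and the penalty is taken as the relative growth of the escaped
text over the raw data. The CDATA heuristic is left out, and the
"repr" result would still need the usual XML escaping of & and <.

```python
import base64


def escape_repr(data: bytes) -> str:
    """Backslash-escape backslashes, newlines and non-printable bytes.

    Hypothetical stand-in for the "repr" encoding.
    """
    out = []
    for byte in data:
        if byte == 0x5C:            # backslash
            out.append("\\\\")
        elif byte == 0x0A:          # newline
            out.append("\\n")
        elif 32 <= byte < 127:      # printable ASCII passes through
            out.append(chr(byte))
        else:
            out.append(f"\\x{byte:02x}")
    return "".join(out)


def choose_encoding(data: bytes, threshold: float = 0.40):
    """Return (encoding_name, text) for one string value."""
    escaped = escape_repr(data)
    # Assumed definition of the penalty: relative size increase.
    penalty = (len(escaped) - len(data)) / len(data) if data else 0.0
    if penalty > threshold:
        return "base64", base64.b64encode(data).decode("ascii")
    return "repr", escaped


print(choose_encoding(b"title"))
print(choose_encoding(b"a\nb"))
print(choose_encoding(bytes(range(8))))
```

Base64 itself costs roughly a third in size, so a threshold a little
above that only switches over when escaping is clearly the more
expensive option.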

I really need to write down a DTD for this......

Jim

--
Jim Fulton           mailto:jim@digicool.com   Python Powered!        
Technical Director   (888) 344-4332            http://www.python.org  
Digital Creations    http://www.digicool.com   http://www.zope.org    

Under US Code Title 47, Sec.227(b)(1)(C), Sec.227(a)(2)(B) This email
address may not be added to any commercial mail list without my
permission.  Violation of my privacy with advertising or SPAM will
result in a suit for a MINIMUM of $500 damages/incident, $1500 for
repeats.