[pypy-svn] r15534 - in pypy/dist/pypy: interpreter module/marshal module/marshal/test objspace/std

tismer at codespeak.net tismer at codespeak.net
Wed Aug 3 06:48:44 CEST 2005


Author: tismer
Date: Wed Aug  3 06:48:41 2005
New Revision: 15534

Added:
   pypy/dist/pypy/module/marshal/
   pypy/dist/pypy/module/marshal/__init__.py   (contents, props changed)
   pypy/dist/pypy/module/marshal/interp_marshal.py   (contents, props changed)
   pypy/dist/pypy/module/marshal/stackless_snippets.py   (contents, props changed)
   pypy/dist/pypy/module/marshal/test/
   pypy/dist/pypy/module/marshal/test/make_test_marshal.py   (contents, props changed)
   pypy/dist/pypy/module/marshal/test/test_marshal.py   (contents, props changed)
   pypy/dist/pypy/module/marshal/test/test_marshalimpl.py   (contents, props changed)
   pypy/dist/pypy/objspace/std/marshal_impl.py   (contents, props changed)
Modified:
   pypy/dist/pypy/interpreter/baseobjspace.py
   pypy/dist/pypy/objspace/std/model.py
   pypy/dist/pypy/objspace/std/objspace.py
Log:
Hey, marshal is here!

I added an interp-level marshal module. It is already able
to do the basic marshalling and loading of .pyc files.

This implementation is fairly fast. There is still some performance
to be gained when working with files: buffering has not been
implemented yet.
Hint: PyPy marshal does not use file descriptors, but file-like
objects. For optimization, look for a seek method and redirect
to the string marshaller. Make sure to seek back to the true end
of the marshalled data. This is cheap, but makes PyPy marshal much
more versatile than CPython's.
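
A minimal sketch of the hint above, not part of this commit. It assumes
the file-like object offers tell() in addition to seek(), and it uses a
hypothetical toy unmarshaller (read_int_object) that only understands
the 'i' typecode followed by four little-endian bytes. The names
load_seekable and read_int_object are made up for illustration.

```python
import io
import struct


class StringReader(object):
    """Reads from an in-memory byte string and remembers its position."""

    def __init__(self, data):
        self.bufstr = data
        self.bufpos = 0

    def read(self, n):
        newpos = self.bufpos + n
        if newpos > len(self.bufstr):
            raise EOFError('EOF read where object expected')
        chunk = self.bufstr[self.bufpos:newpos]
        self.bufpos = newpos
        return chunk


def read_int_object(reader):
    """Hypothetical toy unmarshaller: accepts only 'i' + 4 bytes."""
    if reader.read(1) != b'i':
        raise ValueError('invalid typecode in unmarshal')
    return struct.unpack('<i', reader.read(4))[0]


def load_seekable(f, unmarshal=read_int_object):
    """Unmarshal one object from f by going through a string reader."""
    # Assumption: f has both seek() and tell(); other objects would
    # have to fall back to the plain file reader.
    if not (hasattr(f, 'seek') and hasattr(f, 'tell')):
        raise TypeError('this sketch only covers seekable objects')
    start = f.tell()
    reader = StringReader(f.read())  # one bulk read instead of many
    try:
        return unmarshal(reader)
    finally:
        # Seek back to the true end of the marshalled data, so that
        # whatever follows it in the file can still be read.
        f.seek(start + reader.bufpos)
```

The bulk read pulls in everything up to the end of the file, so the
final seek is what keeps the file position honest for the next reader.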

This implementation tries to be as compatible with the builtin marshal
as possible. For the time being, there is one problem: the recursion
depth is not big enough to emulate CPython's nesting depth of 5000.
I tried a couple of approaches to do it the stackless way. The snippets
can be found in stackless_snippets, but nothing was pleasant enough yet
to make me incorporate it into the so far quite nice module. After all,
every simpler effort brings me back to the way Stackless 3.1 implements
things. I will eventually implement an extension this way.
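
To illustrate what "the stackless way" means here, the following is a
generic sketch of turning a recursive traversal into an iterative one
with an explicit stack. It is not the Emitter/Collector code from
stackless_snippets; the function name tokens_iterative and the token
format are assumptions made for this example only.

```python
def tokens_iterative(obj):
    """Flatten nested lists/tuples into tokens without recursion."""
    out = []
    # Each stack entry is an iterator over one container's items and
    # plays the role that a call frame plays in the recursive version.
    stack = [iter([obj])]
    while stack:
        try:
            item = next(stack[-1])
        except StopIteration:
            stack.pop()
            if stack:
                # A real container was finished (not the root wrapper).
                out.append(']')
            continue
        if isinstance(item, (list, tuple)):
            out.append('[')
            stack.append(iter(item))
        else:
            out.append(repr(item))
    return out
```

Because the pending work lives in a heap-allocated list, the nesting
depth is limited by memory only and not by the interpreter's recursion
limit.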

Although I was not pleased with my temporary results, I have to say
that RPython's object model is much nicer than that of the C language!
The single inheritance is very, very powerful. I will use it to implement
custom frame-like structures. I also agree with Armin's comment that
such transformations should not be done by hand, but that we should
find algorithms which generate this automatically. Besides that, I think
it makes sense to explore these things, and marshal is a perfect example
which is not too complicated, but reveals a number of non-trivial cases.

Nevertheless, for complete CPython compatibility, it is necessary
to limit object nesting to 5000, or CPython will crash. I do intend
to back-translate the stackless version to CPython, in order to
remove this arbitrary limitation. I'm also expecting new random
arguments which will prevent this from getting into the core,
as usual :-)
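
The nesting limit mentioned above can be pictured as a plain depth
counter around the recursive writer. This is a simplified sketch under
the assumption that only lists, tuples and atoms are written; the class
DepthGuardedWriter is hypothetical, and only the constant and the error
message are taken from interp_marshal.py.

```python
MAX_MARSHAL_DEPTH = 5000  # same value as in interp_marshal.py


class DepthGuardedWriter(object):
    """Recursive writer that refuses to nest deeper than a limit."""

    def __init__(self, limit=MAX_MARSHAL_DEPTH):
        self.limit = limit
        self.depth = 0
        self.out = []

    def put(self, obj):
        self.depth += 1
        try:
            # Nest only while the counter stays below the limit.
            if self.depth >= self.limit:
                raise ValueError('object too deeply nested to marshal')
            if isinstance(obj, (list, tuple)):
                self.out.append('[')
                for item in obj:
                    self.put(item)
                self.out.append(']')
            else:
                self.out.append(repr(obj))
        finally:
            # Always undo the increment, even when an error escapes.
            self.depth -= 1
```

With a small limit the behaviour is easy to try out, for example
DepthGuardedWriter(limit=3).put([[1]]) raises ValueError.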


Modified: pypy/dist/pypy/interpreter/baseobjspace.py
==============================================================================
--- pypy/dist/pypy/interpreter/baseobjspace.py	(original)
+++ pypy/dist/pypy/interpreter/baseobjspace.py	Wed Aug  3 06:48:41 2005
@@ -160,7 +160,7 @@
         except AttributeError:
             pass
 
-        l = ['sys', '__builtin__', 'exceptions', 'unicodedata', '_codecs',
+        l = ['sys', '__builtin__', 'exceptions', 'unicodedata', '_codecs', 'marshal',
              '_sre']
 
         if self.options.nofaking:
@@ -198,7 +198,7 @@
         self.setitem(self.builtin.w_dict, self.wrap('__builtins__'), w_builtin) 
 
         for modname, mixedname in self.get_builtinmodule_list():
-            if modname not in ('sys', '__builtin__', 'exceptions', 'marshal'):
+            if modname not in ('sys', '__builtin__', 'exceptions'):##!!, 'marshal'):
                 self.setbuiltinmodule(modname, mixedname)
         
         # initialize with "bootstrap types" from objspace  (e.g. w_None)

Added: pypy/dist/pypy/module/marshal/__init__.py
==============================================================================
--- (empty file)
+++ pypy/dist/pypy/module/marshal/__init__.py	Wed Aug  3 06:48:41 2005
@@ -0,0 +1,18 @@
+# Package initialisation
+from pypy.interpreter.mixedmodule import MixedModule
+
+class Module(MixedModule):
+    """
+    This module implements marshal at interpreter level.
+    """
+
+    appleveldefs = {
+    }
+    
+    interpleveldefs = {
+        'dump'    : 'interp_marshal.dump',
+        'dumps'   : 'interp_marshal.dumps',
+        'load'    : 'interp_marshal.load',
+        'loads'   : 'interp_marshal.loads',
+        'version' : 'space.wrap(interp_marshal.Py_MARSHAL_VERSION)',
+    }

Added: pypy/dist/pypy/module/marshal/interp_marshal.py
==============================================================================
--- (empty file)
+++ pypy/dist/pypy/module/marshal/interp_marshal.py	Wed Aug  3 06:48:41 2005
@@ -0,0 +1,447 @@
+from pypy.interpreter.baseobjspace import ObjSpace
+from pypy.interpreter.error import OperationError
+from pypy.rpython.rarithmetic import intmask
+import sys
+
+# Py_MARSHAL_VERSION = 2
+# this is from Python 2.5
+# already implemented, but for compatibility,
+# we default to version 1. Version 2 can be
+# tested, anyway, by using the optional parameter.
+# XXX auto-configure this by inspecting the
+# Python version we emulate. How to do this?
+Py_MARSHAL_VERSION = 1
+
+def dump(space, w_data, w_f, w_version=Py_MARSHAL_VERSION):
+    writer = FileWriter(space, w_f)
+    # note: bound methods are currently not supported,
+    # so we have to pass the instance in, instead.
+    ##m = Marshaller(space, writer.write, space.int_w(w_version))
+    m = Marshaller(space, writer, space.int_w(w_version))
+    m.put_w_obj(w_data)
+
+def dumps(space, w_data, w_version=Py_MARSHAL_VERSION):
+    # using a list's append directly does not work,
+    # it leads to someobjectness.
+    writer = StringWriter()
+    m = Marshaller(space, writer, space.int_w(w_version))
+    m.put_w_obj(w_data)
+    return space.wrap(writer.get_value())
+
+def load(space, w_f):
+    reader = FileReader(space, w_f)
+    u = Unmarshaller(space, reader)
+    return u.get_w_obj(False)
+
+def loads(space, w_str):
+    reader = StringReader(space, w_str)
+    u = Unmarshaller(space, reader)
+    return u.get_w_obj(False)
+
+
+class _BaseWriter(object):
+    pass
+
+
+class _BaseReader(object):
+    def raise_eof(self):
+        space = self.space
+        raise OperationError(space.w_EOFError, space.wrap(
+            'EOF read where object expected'))
+
+
+class FileWriter(_BaseWriter):
+    def __init__(self, space, w_f):
+        self.space = space
+        try:
+            self.func = space.getattr(w_f, space.wrap('write'))
+            # XXX how to check if it is callable?
+        except OperationError:
+            raise OperationError(space.w_TypeError, space.wrap(
+            'marshal.dump() 2nd arg must be file-like object'))
+
+    def raise_eof(self):
+        space = self.space
+        raise OperationError(space.w_EOFError, space.wrap(
+            'EOF read where object expected'))
+
+    def write(self, data):
+        space = self.space
+        space.call_function(self.func, space.wrap(data))
+
+
+class FileReader(_BaseReader):
+    def __init__(self, space, w_f):
+        self.space = space
+        try:
+            self.func = space.getattr(w_f, space.wrap('read'))
+            # XXX how to check if it is callable?
+        except OperationError:
+            raise OperationError(space.w_TypeError, space.wrap(
+            'marshal.load() arg must be file-like object'))
+
+    def read(self, n):
+        space = self.space
+        w_ret = space.call_function(self.func, space.wrap(n))
+        ret = space.str_w(w_ret)
+        if len(ret) != n:
+            self.raise_eof()
+        return ret
+
+
+class StringWriter(_BaseWriter):
+    # actually we are writing to a stringlist
+    def __init__(self):
+        self.buflis = []
+
+    def write(self, data):
+        self.buflis.append(data)
+
+    def get_value(self):
+        return ''.join(self.buflis)
+
+
+class StringReader(_BaseReader):
+    def __init__(self, space, w_str):
+        self.space = space
+        try:
+            self.bufstr = space.str_w(w_str)
+        except OperationError:
+            raise OperationError(space.w_TypeError, space.wrap(
+                'marshal.loads() arg must be string'))
+        self.bufpos = 0
+        self.limit = len(self.bufstr)
+
+    def read(self, n):
+        pos = self.bufpos
+        newpos = pos + n
+        if newpos > self.limit:
+            self.raise_eof()
+        self.bufpos = newpos
+        return self.bufstr[pos : newpos]
+
+
+MAX_MARSHAL_DEPTH = 5000
+
+# the above is unfortunately necessary because CPython
+# relies on it without run-time checking.
+# PyPy is currently in much bigger trouble, because the
+# multimethod dispatches cause deeper stack nesting.
+
+# we try to do a true stack limit estimate, assuming that
+# one applevel call costs at most APPLEVEL_STACK_COST
+# nested calls.
+
+nesting_limit = sys.getrecursionlimit()
+APPLEVEL_STACK_COST = 25    # XXX check if this is true
+
+CPYTHON_COMPATIBLE = True
+
+TEST_CONST = 10
+
+class _Base(object):
+    def raise_exc(self, msg):
+        space = self.space
+        raise OperationError(space.w_ValueError, space.wrap(msg))
+
+DONT_USE_MM_HACK = False
+
+class Marshaller(_Base):
+    # _annspecialcase_ = "specialize:ctr_location" # polymorphic
+    # does not work with subclassing
+    
+    def __init__(self, space, writer, version):
+        self.space = space
+        ## self.put = putfunc
+        self.writer = writer
+        self.version = version
+        # account for the applevel that we will call by one more.
+        self.nesting = ((space.getexecutioncontext().framestack.depth() + 1)
+                        * APPLEVEL_STACK_COST + TEST_CONST)
+        self.cpy_nesting = 0    # contribution to compatibility
+        self.stringtable = {}
+        self.stackless = False
+        self._stack = None
+        self._iddict = {}
+
+    ## currently we cannot use a put that is a bound method
+    ## from outside. Same holds for get.
+    def put(self, s):
+        self.writer.write(s)
+
+    def atom(self, typecode):
+        assert type(typecode) is str and len(typecode) == 1
+        self.put(typecode)
+
+    def atom_int(self, typecode, x):
+        a = chr(x & 0xff)
+        x >>= 8
+        b = chr(x & 0xff)
+        x >>= 8
+        c = chr(x & 0xff)
+        x >>= 8
+        d = chr(x & 0xff)
+        self.put(typecode + a + b + c + d)
+
+    def atom_int64(self, typecode, x):
+        self.atom_int(typecode, x)
+        self.put_int(x>>32)
+
+    def atom_str(self, typecode, x):
+        self.atom_int(typecode, len(x))
+        self.put(x)
+
+    def atom_strlist(self, typecode, tc2, x):
+        self.atom_int(typecode, len(x))
+        for item in x:
+            if type(item) is not str:
+                self.raise_exc('object with wrong type in strlist')
+            self.atom_str(tc2, item)
+
+    def start(self, typecode):
+        assert type(typecode) is str and len(typecode) == 1
+        self.put(typecode)
+
+    def put_short(self, x):
+        a = chr(x & 0xff)
+        x >>= 8
+        b = chr(x & 0xff)
+        self.put(a + b)
+
+    def put_int(self, x):
+        a = chr(x & 0xff)
+        x >>= 8
+        b = chr(x & 0xff)
+        x >>= 8
+        c = chr(x & 0xff)
+        x >>= 8
+        d = chr(x & 0xff)
+        self.put(a + b + c + d)
+
+    def put_pascal(self, x):
+        lng = len(x)
+        if lng > 255:
+            self.raise_exc('not a pascal string')
+        self.put(chr(lng))
+        self.put(x)
+
+    # HACK!ing a bit to lose some recursion depth and gain some speed.
+    # XXX it would be nicer to have a clean interface for this.
+    # remove this hack when we have optimization
+    # YYY we can drop the chain of mm dispatchers and save code if a method
+    # does not use NotImplemented at all.
+    def _get_mm_marshal(self, w_obj):
+        mm = getattr(w_obj, '__mm_marshal_w')
+        mm_func = mm.im_func
+        name = mm_func.func_code.co_names[0]
+        assert name.startswith('marshal_w_')
+        return mm_func.func_globals[name]
+
+    def put_w_obj(self, w_obj):
+        self.nesting += 2
+        do_nested = self.nesting < nesting_limit
+        if CPYTHON_COMPATIBLE:
+            self.cpy_nesting += 1
+            do_nested = do_nested and self.cpy_nesting < MAX_MARSHAL_DEPTH
+        if do_nested:
+            if DONT_USE_MM_HACK:
+                self.nesting += 2
+                self.space.marshal_w(w_obj, self)
+                self.nesting -= 2
+            else:
+                self._get_mm_marshal(w_obj)(self.space, w_obj, self)
+        else:
+            self._run_stackless(w_obj)
+        self.nesting -= 2
+        if CPYTHON_COMPATIBLE:
+            self.cpy_nesting -= 1
+
+    # this function is inlined below
+    def put_list_w(self, list_w, lng):
+        self.nesting += 1
+        self.put_int(lng)
+        idx = 0
+        while idx < lng:
+            self.put_w_obj(list_w[idx])
+            idx += 1
+        self.nesting -= 1
+
+    def put_list_w(self, list_w, lng):
+        if DONT_USE_MM_HACK:
+            # inlining makes no sense without the hack
+            self.nesting += 1
+            self.put_int(lng)
+            idx = 0
+            while idx < lng:
+                self.put_w_obj(list_w[idx])
+                idx += 1
+            self.nesting -= 1
+            return
+
+        # inlined version, two stack levels, only!
+        self.nesting += 2
+        self.put_int(lng)
+        idx = 0
+        space = self.space
+        do_nested = self.nesting < nesting_limit
+        if CPYTHON_COMPATIBLE:
+            self.cpy_nesting += 1
+            do_nested = do_nested and self.cpy_nesting < MAX_MARSHAL_DEPTH
+        if do_nested:
+            while idx < lng:
+                w_obj = list_w[idx]
+                self._get_mm_marshal(w_obj)(space, w_obj, self)
+                idx += 1
+        else:
+            while idx < lng:
+                w_obj = list_w[idx]
+                self._run_stackless(w_obj)
+                idx += 1
+        self.nesting -= 2
+        if CPYTHON_COMPATIBLE:
+            self.cpy_nesting -= 1
+
+    def _run_stackless(self, w_obj):
+        self.raise_exc('object too deeply nested to marshal')
+
+
+def invalid_typecode(space, u, tc):
+    u.raise_exc('invalid typecode in unmarshal: %r' % tc)
+
+def register(codes, func):
+    """NOT_RPYTHON"""
+    for code in codes:
+        Unmarshaller._dispatch[ord(code)] = func
+
+
+class Unmarshaller(_Base):
+    _dispatch = [invalid_typecode] * 256
+
+    def __init__(self, space, reader):
+        self.space = space
+        ## self.get = getfunc
+        self.reader = reader
+        # account for the applevel that we will call by one more.
+        self.nesting = ((space.getexecutioncontext().framestack.depth() + 1)
+                        * APPLEVEL_STACK_COST)
+        self.stringtable_w = []
+
+    def get(self, n):
+        assert n >= 0
+        return self.reader.read(n)
+
+    def atom_str(self, typecode):
+        self.start(typecode)
+        lng = self.get_lng()
+        return self.get(lng)
+
+    def atom_strlist(self, typecode, tc2):
+        self.start(typecode)
+        lng = self.get_lng()
+        res = [None] * lng
+        idx = 0
+        while idx < lng:
+            res[idx] = self.atom_str(tc2)
+            idx += 1
+        return res
+
+    def start(self, typecode):
+        tc = self.get(1)
+        if tc != typecode:
+            self.raise_exc('invalid marshal data')
+        self.typecode = tc
+
+    def get_short(self):
+        s = self.get(2)
+        a = ord(s[0])
+        b = ord(s[1])
+        x = a | (b << 8)
+        if x & 0x8000:
+            x = x - 0x10000
+        return x
+
+    def get_int(self):
+        s = self.get(4)
+        a = ord(s[0])
+        b = ord(s[1])
+        c = ord(s[2])
+        d = ord(s[3])
+        x = a | (b<<8) | (c<<16) | (d<<24)
+        return intmask(x)
+
+    def get_lng(self):
+        s = self.get(4)
+        a = ord(s[0])
+        b = ord(s[1])
+        c = ord(s[2])
+        d = ord(s[3])
+        if d & 0x80:
+            self.raise_exc('bad marshal data')
+        x = a | (b<<8) | (c<<16) | (d<<24)
+        return x
+
+    def get_pascal(self):
+        lng = ord(self.get(1))
+        return self.get(lng)
+
+    def get_str(self):
+        lng = self.get_lng()
+        return self.get(lng)
+
+    # this function is inlined below
+    def get_list_w(self):
+        self.nesting += 1
+        lng = self.get_lng()
+        res_w = [None] * lng
+        idx = 0
+        while idx < lng:
+            res_w[idx] = self.get_w_obj(False)
+            idx += 1
+        self.nesting -= 1
+        return res_w
+
+    def get_w_obj(self, allow_null):
+        self.nesting += 2
+        if self.nesting < nesting_limit:
+            tc = self.get(1)
+            w_ret = self._dispatch[ord(tc)](self.space, self, tc)
+            if w_ret is None and not allow_null:
+                space = self.space
+                raise OperationError(space.w_TypeError, space.wrap(
+                    'NULL object in marshal data'))
+        else:
+            w_ret = self._run_stackless()
+        self.nesting -= 2
+        return w_ret
+
+    # inlined version to save a nesting level
+    def get_list_w(self):
+        self.nesting += 2
+        lng = self.get_lng()
+        res_w = [None] * lng
+        idx = 0
+        space = self.space
+        w_ret = space.w_None # something not None
+        if self.nesting < nesting_limit:
+            while idx < lng:
+                tc = self.get(1)
+                w_ret = self._dispatch[ord(tc)](space, self, tc)
+                if w_ret is None:
+                    break
+                res_w[idx] = w_ret
+                idx += 1
+        else:
+            while idx < lng:
+                w_ret = self._run_stackless()
+                if w_ret is None:
+                    break
+                res_w[idx] = w_ret
+                idx += 1
+        if w_ret is None:
+            raise OperationError(space.w_TypeError, space.wrap(
+                'NULL object in marshal data'))
+        self.nesting -= 2
+        return res_w
+
+    def _run_stackless(self):
+        self.raise_exc('object too deeply nested to unmarshal')

Added: pypy/dist/pypy/module/marshal/stackless_snippets.py
==============================================================================
--- (empty file)
+++ pypy/dist/pypy/module/marshal/stackless_snippets.py	Wed Aug  3 06:48:41 2005
@@ -0,0 +1,180 @@
+"""
+This file contains various snippets of my attempts
+to create a stackless marshal version. Finally,
+I postponed this effort, because the recursive
+solution gets quite far, already, and I wanted to
+deliver a clean solution, after all. Explicit
+stackless-ness is not very nice, after all.
+"""
+
+"""
+
+Stackless extension:
+--------------------
+I used marshal as an example of making recursive
+algorithms iterative. At some point in the future,
+we will try to automate such transformations. For the
+time being, the approach used here is quite nice
+and shows how much superior RPython is to C.
+Especially the simple inheritance makes writing
+the necessary callbacks a pleasure.
+I believe that the recursive version is always more
+efficient (to be tested). The strategy used here is
+to use recursion up to a small stack depth and to switch
+to iteration at some point.
+
+"""
+
+class TupleEmitter(Emitter):
+    def init(self):
+        self.limit = len(self.w_obj.wrappeditems)
+        self.items_w = self.w_obj.wrappeditems
+        self.idx = 0
+    def emit(self):
+        idx = self.idx
+        if idx < self.limit:
+            self.idx = idx + 1
+            return self.items_w[idx]
+
+
+class TupleCollector(Collector):
+    def init(self):
+        pass
+    def collect(self, w_obj):
+        idx = self.idx
+        if idx < self.limit:
+            self.idx = idx + 1
+            self.items_w[idx] = w_obj
+            return True
+        return False
+    def fini(self):
+        return W_TupleObject(self.space, self.items_w)
+
+
+class xxx(object):
+    def _run_stackless(self):
+        self.stackless = True
+        tc = self.get(1)
+        w_obj = unmarshal_dispatch[ord(tc)](self.space, self, tc)
+        while 1:
+            collector = self._stack
+            if not collector:
+                break
+            w_obj = emitter.emit()
+            if w_obj:
+                self.space.marshal_w(w_obj, self)
+            else:
+                emitter._teardown()
+        self.stackless = False
+
+    def deferred_call(self, collector):
+        collector._setup()
+
+# stackless helper class
+
+class Collector(_Base):
+    def __init__(self, typecode, unmarshaller):
+        self.space = unmarshaller.space
+        self.typecode = typecode
+
+    def _setup(self):
+        unmarshaller = self.unmarshaller
+        self.f_back = unmarshaller._stack
+        unmarshaller._stack = self
+        self.init()
+
+    def collect(self, w_obj):
+        return False # to be overridden
+
+    def _teardown(self):
+        self.unmarshaller._stack = self.f_back
+        return self.fini()
+
+
+class ListCollector(Collector):
+    def __init__(self, space, typecode, count, finalizer):
+        Collector.__init__(self, space, typecode)
+        self.limit = count
+        self.finalizer = finalizer
+        self.items_w = [space.w_None] * count
+        self.idx = 0
+
+    def accumulate(self, w_data):
+        idx = self.idx
+        limit = self.limit
+        assert idx < limit
+        self.items_w[idx] = w_data
+        idx += 1
+        self.idx = idx
+        return idx < limit
+
+class DictCollector(Collector):
+    def __init__(self, space, typecode, finalizer):
+        Collector.__init__(self, space, typecode)
+        self.finalizer = finalizer
+        self.items_w = []
+        self.first = False
+        self.w_hold = None
+
+    def accumulate(self, w_data):
+        first = not self.first
+        if w_data is None:
+            if not first:
+                self.raise_exc('bad marshal data')
+            return False
+        if first:
+            self.w_hold = w_data
+        else:
+            self.items_w.append( (self.w_hold, w_data) )
+        self.first = first
+        return True
+
+class yyy(object):
+    def _run_stackless(self, w_obj):
+        self.stackless = True
+        self.space.marshal_w(w_obj, self)
+        while 1:
+            emitter = self._stack
+            if not emitter:
+                break
+            w_obj = emitter.emit()
+            if w_obj:
+                self.space.marshal_w(w_obj, self)
+            else:
+                emitter._teardown()
+        self.stackless = False
+
+    def deferred_call(self, emitter):
+        emitter._setup()
+
+"""
+Protocol:
+Functions which write objects check the marshaller's stackless flag.
+If set, they call the deferred_call() method with an instance of
+an Emitter subclass.
+"""
+
+class Emitter(_Base):
+    def __init__(self, w_obj, marshaller):
+        self.space = marshaller.space
+        self.marshaller = marshaller
+        self.w_obj = w_obj
+
+    def _setup(self):
+        # from now on, we must keep track of recursive objects
+        marshaller = self.marshaller
+        iddict = marshaller._iddict
+        objid = id(self.w_obj)
+        if objid in iddict:
+            self.raise_exc('recursive objects cannot be marshalled')
+        self.f_back = marshaller._stack
+        marshaller._stack = self
+        iddict[objid] = 1
+
+    def _teardown(self):
+        del self.marshaller._iddict[id(self.w_obj)]
+        self.marshaller._stack = self.f_back
+
+    def emit(self):
+        return None # to be overridden
+

Added: pypy/dist/pypy/module/marshal/test/make_test_marshal.py
==============================================================================
--- (empty file)
+++ pypy/dist/pypy/module/marshal/test/make_test_marshal.py	Wed Aug  3 06:48:41 2005
@@ -0,0 +1,76 @@
+
+TESTCASES = """\
+    None
+    False
+    True
+    StopIteration
+    Ellipsis
+    42
+    sys.maxint
+    -1.25
+    -1.25 #2
+    2+5j
+    2+5j #2
+    42L
+    -1234567890123456789012345678901234567890L
+    hello   # not interned
+    "hello"
+    ()
+    (1, 2)
+    []
+    [3, 4]
+    {}
+    {5: 6, 7: 8}
+    func.func_code
+    scopefunc.func_code
+    u'hello'
+    set()
+    set([1, 2])
+    frozenset()
+    frozenset([3, 4])
+""".strip().split('\n')
+
+def readable(s):
+    for c, repl in (
+        ("'", '_quote_'), ('"', '_Quote_'), (':', '_colon_'), ('.', '_dot_'),
+        ('[', '_list_'), ('{', '_dict_'), ('-', '_minus_'), ('+', '_plus_'),
+        (',', '_comma_'), ('(', '_open_'), (')', '_close_') ):
+        s = s.replace(c, repl)
+    lis = list(s)
+    for i, c in enumerate(lis):
+        if c.isalnum() or c == '_':
+            continue
+        lis[i] = '_'
+    return ''.join(lis)
+
+print """class AppTestMarshal:
+"""
+for line in TESTCASES:
+    line = line.strip()
+    name = readable(line)
+    version = ''
+    extra = ''
+    if line.endswith('#2'):
+        version = ', 2'
+        extra = '; assert len(s) in (9, 17)'
+    src = '''\
+    def test_%(name)s(self):
+        import sys
+        hello = "he"
+        hello += "llo"
+        def func(x):
+            return lambda y: x+y
+        scopefunc = func(42)
+        import marshal, StringIO
+        case = %(line)s
+        print "case:", case
+        s = marshal.dumps(case%(version)s)%(extra)s
+        x = marshal.loads(s)
+        assert x == case
+        f = StringIO.StringIO()
+        marshal.dump(case, f)
+        f.seek(0)
+        x = marshal.load(f)
+        assert x == case
+''' % {'name': name, 'line': line, 'version' : version, 'extra': extra}
+    print src

Added: pypy/dist/pypy/module/marshal/test/test_marshal.py
==============================================================================
--- (empty file)
+++ pypy/dist/pypy/module/marshal/test/test_marshal.py	Wed Aug  3 06:48:41 2005
@@ -0,0 +1,534 @@
+class AppTestMarshal:
+
+    def test_None(self):
+        import sys
+        hello = "he"
+        hello += "llo"
+        def func(x):
+            return lambda y: x+y
+        scopefunc = func(42)
+        import marshal, StringIO
+        case = None
+        print "case:", case
+        s = marshal.dumps(case)
+        x = marshal.loads(s)
+        assert x == case
+        f = StringIO.StringIO()
+        marshal.dump(case, f)
+        f.seek(0)
+        x = marshal.load(f)
+        assert x == case
+
+    def test_False(self):
+        import sys
+        hello = "he"
+        hello += "llo"
+        def func(x):
+            return lambda y: x+y
+        scopefunc = func(42)
+        import marshal, StringIO
+        case = False
+        print "case:", case
+        s = marshal.dumps(case)
+        x = marshal.loads(s)
+        assert x == case
+        f = StringIO.StringIO()
+        marshal.dump(case, f)
+        f.seek(0)
+        x = marshal.load(f)
+        assert x == case
+
+    def test_True(self):
+        import sys
+        hello = "he"
+        hello += "llo"
+        def func(x):
+            return lambda y: x+y
+        scopefunc = func(42)
+        import marshal, StringIO
+        case = True
+        print "case:", case
+        s = marshal.dumps(case)
+        x = marshal.loads(s)
+        assert x == case
+        f = StringIO.StringIO()
+        marshal.dump(case, f)
+        f.seek(0)
+        x = marshal.load(f)
+        assert x == case
+
+    def test_StopIteration(self):
+        import sys
+        hello = "he"
+        hello += "llo"
+        def func(x):
+            return lambda y: x+y
+        scopefunc = func(42)
+        import marshal, StringIO
+        case = StopIteration
+        print "case:", case
+        s = marshal.dumps(case)
+        x = marshal.loads(s)
+        assert x == case
+        f = StringIO.StringIO()
+        marshal.dump(case, f)
+        f.seek(0)
+        x = marshal.load(f)
+        assert x == case
+
+    def test_Ellipsis(self):
+        import sys
+        hello = "he"
+        hello += "llo"
+        def func(x):
+            return lambda y: x+y
+        scopefunc = func(42)
+        import marshal, StringIO
+        case = Ellipsis
+        print "case:", case
+        s = marshal.dumps(case)
+        x = marshal.loads(s)
+        assert x == case
+        f = StringIO.StringIO()
+        marshal.dump(case, f)
+        f.seek(0)
+        x = marshal.load(f)
+        assert x == case
+
+    def test_42(self):
+        import sys
+        hello = "he"
+        hello += "llo"
+        def func(x):
+            return lambda y: x+y
+        scopefunc = func(42)
+        import marshal, StringIO
+        case = 42
+        print "case:", case
+        s = marshal.dumps(case)
+        x = marshal.loads(s)
+        assert x == case
+        f = StringIO.StringIO()
+        marshal.dump(case, f)
+        f.seek(0)
+        x = marshal.load(f)
+        assert x == case
+
+    def test_sys_dot_maxint(self):
+        import sys
+        hello = "he"
+        hello += "llo"
+        def func(x):
+            return lambda y: x+y
+        scopefunc = func(42)
+        import marshal, StringIO
+        case = sys.maxint
+        print "case:", case
+        s = marshal.dumps(case)
+        x = marshal.loads(s)
+        assert x == case
+        f = StringIO.StringIO()
+        marshal.dump(case, f)
+        f.seek(0)
+        x = marshal.load(f)
+        assert x == case
+
+    def test__minus_1_dot_25(self):
+        import sys
+        hello = "he"
+        hello += "llo"
+        def func(x):
+            return lambda y: x+y
+        scopefunc = func(42)
+        import marshal, StringIO
+        case = -1.25
+        print "case:", case
+        s = marshal.dumps(case)
+        x = marshal.loads(s)
+        assert x == case
+        f = StringIO.StringIO()
+        marshal.dump(case, f)
+        f.seek(0)
+        x = marshal.load(f)
+        assert x == case
+
+    def test__minus_1_dot_25__2(self):
+        import sys
+        hello = "he"
+        hello += "llo"
+        def func(x):
+            return lambda y: x+y
+        scopefunc = func(42)
+        import marshal, StringIO
+        case = -1.25 #2
+        print "case:", case
+        s = marshal.dumps(case, 2); assert len(s) in (9, 17)
+        x = marshal.loads(s)
+        assert x == case
+        f = StringIO.StringIO()
+        marshal.dump(case, f)
+        f.seek(0)
+        x = marshal.load(f)
+        assert x == case
+
+    def test_2_plus_5j(self):
+        import sys
+        hello = "he"
+        hello += "llo"
+        def func(x):
+            return lambda y: x+y
+        scopefunc = func(42)
+        import marshal, StringIO
+        case = 2+5j
+        print "case:", case
+        s = marshal.dumps(case)
+        x = marshal.loads(s)
+        assert x == case
+        f = StringIO.StringIO()
+        marshal.dump(case, f)
+        f.seek(0)
+        x = marshal.load(f)
+        assert x == case
+
+    def test_2_plus_5j__2(self):
+        import sys
+        hello = "he"
+        hello += "llo"
+        def func(x):
+            return lambda y: x+y
+        scopefunc = func(42)
+        import marshal, StringIO
+        case = 2+5j #2
+        print "case:", case
+        s = marshal.dumps(case, 2); assert len(s) in (9, 17)
+        x = marshal.loads(s)
+        assert x == case
+        f = StringIO.StringIO()
+        marshal.dump(case, f)
+        f.seek(0)
+        x = marshal.load(f)
+        assert x == case
+
+    def test_42L(self):
+        import sys
+        hello = "he"
+        hello += "llo"
+        def func(x):
+            return lambda y: x+y
+        scopefunc = func(42)
+        import marshal, StringIO
+        case = 42L
+        print "case:", case
+        s = marshal.dumps(case)
+        x = marshal.loads(s)
+        assert x == case
+        f = StringIO.StringIO()
+        marshal.dump(case, f)
+        f.seek(0)
+        x = marshal.load(f)
+        assert x == case
+
+    def test__minus_1234567890123456789012345678901234567890L(self):
+        import sys
+        hello = "he"
+        hello += "llo"
+        def func(x):
+            return lambda y: x+y
+        scopefunc = func(42)
+        import marshal, StringIO
+        case = -1234567890123456789012345678901234567890L
+        print "case:", case
+        s = marshal.dumps(case)
+        x = marshal.loads(s)
+        assert x == case
+        f = StringIO.StringIO()
+        marshal.dump(case, f)
+        f.seek(0)
+        x = marshal.load(f)
+        assert x == case
+
+    def test_hello_____not_interned(self):
+        import sys
+        hello = "he"
+        hello += "llo"
+        def func(x):
+            return lambda y: x+y
+        scopefunc = func(42)
+        import marshal, StringIO
+        case = hello   # not interned
+        print "case:", case
+        s = marshal.dumps(case)
+        x = marshal.loads(s)
+        assert x == case
+        f = StringIO.StringIO()
+        marshal.dump(case, f)
+        f.seek(0)
+        x = marshal.load(f)
+        assert x == case
+
+    def test__Quote_hello_Quote_(self):
+        import sys
+        hello = "he"
+        hello += "llo"
+        def func(x):
+            return lambda y: x+y
+        scopefunc = func(42)
+        import marshal, StringIO
+        case = "hello"
+        print "case:", case
+        s = marshal.dumps(case)
+        x = marshal.loads(s)
+        assert x == case
+        f = StringIO.StringIO()
+        marshal.dump(case, f)
+        f.seek(0)
+        x = marshal.load(f)
+        assert x == case
+
+    def test__open__close_(self):
+        import sys
+        hello = "he"
+        hello += "llo"
+        def func(x):
+            return lambda y: x+y
+        scopefunc = func(42)
+        import marshal, StringIO
+        case = ()
+        print "case:", case
+        s = marshal.dumps(case)
+        x = marshal.loads(s)
+        assert x == case
+        f = StringIO.StringIO()
+        marshal.dump(case, f)
+        f.seek(0)
+        x = marshal.load(f)
+        assert x == case
+
+    def test__open_1_comma__2_close_(self):
+        import sys
+        hello = "he"
+        hello += "llo"
+        def func(x):
+            return lambda y: x+y
+        scopefunc = func(42)
+        import marshal, StringIO
+        case = (1, 2)
+        print "case:", case
+        s = marshal.dumps(case)
+        x = marshal.loads(s)
+        assert x == case
+        f = StringIO.StringIO()
+        marshal.dump(case, f)
+        f.seek(0)
+        x = marshal.load(f)
+        assert x == case
+
+    def test__list__(self):
+        import sys
+        hello = "he"
+        hello += "llo"
+        def func(x):
+            return lambda y: x+y
+        scopefunc = func(42)
+        import marshal, StringIO
+        case = []
+        print "case:", case
+        s = marshal.dumps(case)
+        x = marshal.loads(s)
+        assert x == case
+        f = StringIO.StringIO()
+        marshal.dump(case, f)
+        f.seek(0)
+        x = marshal.load(f)
+        assert x == case
+
+    def test__list_3_comma__4_(self):
+        import sys
+        hello = "he"
+        hello += "llo"
+        def func(x):
+            return lambda y: x+y
+        scopefunc = func(42)
+        import marshal, StringIO
+        case = [3, 4]
+        print "case:", case
+        s = marshal.dumps(case)
+        x = marshal.loads(s)
+        assert x == case
+        f = StringIO.StringIO()
+        marshal.dump(case, f)
+        f.seek(0)
+        x = marshal.load(f)
+        assert x == case
+
+    def test__dict__(self):
+        import sys
+        hello = "he"
+        hello += "llo"
+        def func(x):
+            return lambda y: x+y
+        scopefunc = func(42)
+        import marshal, StringIO
+        case = {}
+        print "case:", case
+        s = marshal.dumps(case)
+        x = marshal.loads(s)
+        assert x == case
+        f = StringIO.StringIO()
+        marshal.dump(case, f)
+        f.seek(0)
+        x = marshal.load(f)
+        assert x == case
+
+    def test__dict_5_colon__6_comma__7_colon__8_(self):
+        import sys
+        hello = "he"
+        hello += "llo"
+        def func(x):
+            return lambda y: x+y
+        scopefunc = func(42)
+        import marshal, StringIO
+        case = {5: 6, 7: 8}
+        print "case:", case
+        s = marshal.dumps(case)
+        x = marshal.loads(s)
+        assert x == case
+        f = StringIO.StringIO()
+        marshal.dump(case, f)
+        f.seek(0)
+        x = marshal.load(f)
+        assert x == case
+
+    def test_func_dot_func_code(self):
+        import sys
+        hello = "he"
+        hello += "llo"
+        def func(x):
+            return lambda y: x+y
+        scopefunc = func(42)
+        import marshal, StringIO
+        case = func.func_code
+        print "case:", case
+        s = marshal.dumps(case)
+        x = marshal.loads(s)
+        assert x == case
+        f = StringIO.StringIO()
+        marshal.dump(case, f)
+        f.seek(0)
+        x = marshal.load(f)
+        assert x == case
+
+    def test_scopefunc_dot_func_code(self):
+        import sys
+        hello = "he"
+        hello += "llo"
+        def func(x):
+            return lambda y: x+y
+        scopefunc = func(42)
+        import marshal, StringIO
+        case = scopefunc.func_code
+        print "case:", case
+        s = marshal.dumps(case)
+        x = marshal.loads(s)
+        assert x == case
+        f = StringIO.StringIO()
+        marshal.dump(case, f)
+        f.seek(0)
+        x = marshal.load(f)
+        assert x == case
+
+    def test_u_quote_hello_quote_(self):
+        import sys
+        hello = "he"
+        hello += "llo"
+        def func(x):
+            return lambda y: x+y
+        scopefunc = func(42)
+        import marshal, StringIO
+        case = u'hello'
+        print "case:", case
+        s = marshal.dumps(case)
+        x = marshal.loads(s)
+        assert x == case
+        f = StringIO.StringIO()
+        marshal.dump(case, f)
+        f.seek(0)
+        x = marshal.load(f)
+        assert x == case
+
+    def test_set_open__close_(self):
+        import sys
+        hello = "he"
+        hello += "llo"
+        def func(x):
+            return lambda y: x+y
+        scopefunc = func(42)
+        import marshal, StringIO
+        case = set()
+        print "case:", case
+        s = marshal.dumps(case)
+        x = marshal.loads(s)
+        assert x == case
+        f = StringIO.StringIO()
+        marshal.dump(case, f)
+        f.seek(0)
+        x = marshal.load(f)
+        assert x == case
+
+    def test_set_open__list_1_comma__2__close_(self):
+        import sys
+        hello = "he"
+        hello += "llo"
+        def func(x):
+            return lambda y: x+y
+        scopefunc = func(42)
+        import marshal, StringIO
+        case = set([1, 2])
+        print "case:", case
+        s = marshal.dumps(case)
+        x = marshal.loads(s)
+        assert x == case
+        f = StringIO.StringIO()
+        marshal.dump(case, f)
+        f.seek(0)
+        x = marshal.load(f)
+        assert x == case
+
+    def test_frozenset_open__close_(self):
+        import sys
+        hello = "he"
+        hello += "llo"
+        def func(x):
+            return lambda y: x+y
+        scopefunc = func(42)
+        import marshal, StringIO
+        case = frozenset()
+        print "case:", case
+        s = marshal.dumps(case)
+        x = marshal.loads(s)
+        assert x == case
+        f = StringIO.StringIO()
+        marshal.dump(case, f)
+        f.seek(0)
+        x = marshal.load(f)
+        assert x == case
+
+    def test_frozenset_open__list_3_comma__4__close_(self):
+        import sys
+        hello = "he"
+        hello += "llo"
+        def func(x):
+            return lambda y: x+y
+        scopefunc = func(42)
+        import marshal, StringIO
+        case = frozenset([3, 4])
+        print "case:", case
+        s = marshal.dumps(case)
+        x = marshal.loads(s)
+        assert x == case
+        f = StringIO.StringIO()
+        marshal.dump(case, f)
+        f.seek(0)
+        x = marshal.load(f)
+        assert x == case
+

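The `__2` test variants above assert `len(s) in (9, 17)` when dumping with marshal version 2. A minimal sketch of where those two sizes come from, assuming the version-2 layout is one typecode byte followed by little-endian IEEE doubles (which is what the `struct.pack('<d', ...)` calls in marshal_impl.py suggest). It is written for a current CPython rather than the Python 2 of this commit, and `binary_float` and `binary_complex` are hypothetical helper names, not part of the module.

```python
import marshal
import struct

TYPE_BINARY_FLOAT = b'g'
TYPE_BINARY_COMPLEX = b'y'

def binary_float(x):
    # one typecode byte + one 8-byte little-endian double = 9 bytes
    return TYPE_BINARY_FLOAT + struct.pack('<d', x)

def binary_complex(z):
    # one typecode byte + real part + imaginary part = 17 bytes
    return (TYPE_BINARY_COMPLEX
            + struct.pack('<d', z.real)
            + struct.pack('<d', z.imag))

if __name__ == '__main__':
    for case, encode in ((-1.25, binary_float), (2+5j, binary_complex)):
        s = encode(case)
        # the hand-built string is readable by the builtin marshal
        print(case, len(s), marshal.loads(s) == case)
```

The text formats (`'f'` and `'x'`) store a length-prefixed repr instead, so their size depends on the value; only the binary formats have the fixed sizes the tests check for.
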
Added: pypy/dist/pypy/module/marshal/test/test_marshalimpl.py
==============================================================================
--- (empty file)
+++ pypy/dist/pypy/module/marshal/test/test_marshalimpl.py	Wed Aug  3 06:48:41 2005
@@ -0,0 +1,29 @@
+from pypy.module.marshal import interp_marshal
+from pypy.interpreter.error import OperationError
+import sys
+
+class TestInternalStuff:
+    def test_nesting(self):
+        space = self.space
+        app_cost = interp_marshal.APPLEVEL_STACK_COST
+        curdepth = space.getexecutioncontext().framestack.depth()
+        for do_hack in (False, True):
+            interp_marshal.DONT_USE_MM_HACK = not do_hack
+            if not do_hack:
+                interp_cost = 5
+            else:
+                interp_cost = 2
+            stacklimit = interp_marshal.nesting_limit - (curdepth + 1) * app_cost - interp_marshal.TEST_CONST
+            w_tup = space.newtuple([])
+            tupdepth = 1
+            for i in range(0, stacklimit - interp_cost-1, interp_cost):
+                w_tup = space.newtuple([w_tup])
+                tupdepth += 1
+            w_good = w_tup
+            s = interp_marshal.dumps(space, w_good, space.wrap(1))
+            interp_marshal.loads(space, s)
+            w_bad = space.newtuple([w_tup])
+            raises(OperationError, interp_marshal.dumps, space, w_bad, space.wrap(1))
+            print 'max sys depth = %d, mm_hack = %r, marshal limit = %d' % (
+                sys.getrecursionlimit(), do_hack, tupdepth)
+

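test_nesting builds the deepest tuple that still marshals, then checks that wrapping it once more raises. The same pattern in plain Python, as a sketch only: `dumps_limited` and `NESTING_LIMIT` are hypothetical stand-ins for the interp-level marshaller and `interp_marshal.nesting_limit`, and the per-level stack cost is simplified to 1.

```python
NESTING_LIMIT = 50   # hypothetical value, not the one used by interp_marshal

def nested_tuple(levels):
    """Return () wrapped in `levels` additional tuples."""
    tup = ()
    for _ in range(levels):
        tup = (tup,)
    return tup

def dumps_limited(obj, depth=0):
    """Toy serializer for nested tuples that refuses to recurse too deeply."""
    if depth >= NESTING_LIMIT:
        raise ValueError("object too deeply nested to marshal")
    if not isinstance(obj, tuple):
        raise ValueError("unmarshallable object")
    return '(' + ''.join(dumps_limited(item, depth + 1) for item in obj) + ')'

if __name__ == '__main__':
    good = nested_tuple(NESTING_LIMIT - 1)   # innermost () sits at the last allowed depth
    bad = (good,)                            # one level too many
    dumps_limited(good)
    try:
        dumps_limited(bad)
    except ValueError as e:
        print('rejected:', e)
```
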
Added: pypy/dist/pypy/objspace/std/marshal_impl.py
==============================================================================
--- (empty file)
+++ pypy/dist/pypy/objspace/std/marshal_impl.py	Wed Aug  3 06:48:41 2005
@@ -0,0 +1,481 @@
+# implementation of marshalling by multimethods
+
+"""
+The idea is to have an effective but flexible
+way to implement marshalling for the native types.
+
+The marshal_w operation is called with an object,
+a callback and a state variable.
+"""
+
+from pypy.interpreter.error import OperationError
+from pypy.objspace.std.register_all import register_all
+from pypy.rpython.rarithmetic import LONG_BIT
+from pypy.objspace.std.floatobject import repr__Float as repr_float
+from pypy.objspace.std.longobject import SHIFT as long_bits
+from pypy.objspace.std.objspace import StdObjSpace
+from pypy.interpreter.special import Ellipsis
+from pypy.interpreter.pycode import PyCode
+from pypy.interpreter import gateway
+
+from pypy.objspace.std.boolobject    import W_BoolObject
+from pypy.objspace.std.intobject     import W_IntObject
+from pypy.objspace.std.floatobject   import W_FloatObject
+from pypy.objspace.std.tupleobject   import W_TupleObject
+from pypy.objspace.std.listobject    import W_ListObject
+from pypy.objspace.std.dictobject    import W_DictObject
+from pypy.objspace.std.stringobject  import W_StringObject
+from pypy.objspace.std.typeobject    import W_TypeObject
+from pypy.objspace.std.longobject    import W_LongObject
+from pypy.objspace.std.noneobject    import W_NoneObject
+from pypy.objspace.std.unicodeobject import W_UnicodeObject
+
+import longobject
+from pypy.objspace.std.strutil import string_to_float
+
+from pypy.module.marshal.interp_marshal import register
+
+TYPE_NULL      = '0'
+TYPE_NONE      = 'N'
+TYPE_FALSE     = 'F'
+TYPE_TRUE      = 'T'
+TYPE_STOPITER  = 'S'
+TYPE_ELLIPSIS  = '.'
+TYPE_INT       = 'i'
+TYPE_INT64     = 'I'
+TYPE_FLOAT     = 'f'
+TYPE_BINARY_FLOAT = 'g'
+TYPE_COMPLEX   = 'x'
+TYPE_BINARY_COMPLEX = 'y'
+TYPE_LONG      = 'l'
+TYPE_STRING    = 's'
+TYPE_INTERNED  = 't'
+TYPE_STRINGREF = 'R'
+TYPE_TUPLE     = '('
+TYPE_LIST      = '['
+TYPE_DICT      = '{'
+TYPE_CODE      = 'c'
+TYPE_UNICODE   = 'u'
+TYPE_UNKNOWN   = '?'
+TYPE_SET       = '<'
+TYPE_FROZENSET = '>'
+
+"""
+simple approach:
+a call to marshal_w has the following semantics:
+marshal_w receives a marshaller object which contains
+state and several methods.
+
+
+atomic types including typecode:
+
+atom(tc)                    puts single typecode
+atom_int(tc, int)           puts code and int
+atom_int64(tc, int64)       puts code and int64
+atom_str(tc, str)           puts code, len and string
+atom_strlist(tc, tc2, strlist)  puts code, len and list of strings coded tc2
+
+building blocks for compound types:
+
+start(typecode)             sets the type character
+put(s)                      puts a string with fixed length
+put_short(int)              puts a short integer
+put_int(int)                puts an integer
+put_pascal(s)               puts a short string
+put_w_obj(w_obj)            puts a wrapped object
+put_list_w(list_w, lng)     puts a list of lng wrapped objects
+
+"""
+
+handled_by_any = []
+
+def raise_exception(space, msg):
+    raise OperationError(space.w_ValueError, space.wrap(msg))
+
+def marshal_w__None(space, w_none, m):
+    m.atom(TYPE_NONE)
+
+def unmarshal_None(space, u, tc):
+    return space.w_None
+register(TYPE_NONE, unmarshal_None)
+
+def marshal_w__Bool(space, w_bool, m):
+    if w_bool.boolval:
+        m.atom(TYPE_TRUE)
+    else:
+        m.atom(TYPE_FALSE)
+
+def unmarshal_Bool(space, u, tc):
+    if tc == TYPE_TRUE:
+        return space.w_True
+    else:
+        return space.w_False
+register(TYPE_TRUE + TYPE_FALSE, unmarshal_Bool)
+
+def marshal_w__Type(space, w_type, m):
+    if not space.is_w(w_type, space.w_StopIteration):
+        raise_exception(space, "unmarshallable object")
+    m.atom(TYPE_STOPITER)
+
+def unmarshal_Type(space, u, tc):
+    return space.w_StopIteration
+register(TYPE_STOPITER, unmarshal_Type)
+
+# not directly supported:
+def marshal_w_Ellipsis(space, w_ellipsis, m):
+    m.atom(TYPE_ELLIPSIS)
+
+StdObjSpace.MM.marshal_w.register(marshal_w_Ellipsis, Ellipsis)
+
+def unmarshal_Ellipsis(space, u, tc):
+    return space.w_Ellipsis
+register(TYPE_ELLIPSIS, unmarshal_Ellipsis)
+
+def marshal_w__Int(space, w_int, m):
+    if LONG_BIT == 32:
+        m.atom_int(TYPE_INT, w_int.intval)
+    else:
+        y = w_int.intval >> 31
+        if y and y != -1:
+            m.atom_int64(TYPE_INT64, w_int.intval)
+        else:
+            m.atom_int(TYPE_INT, w_int.intval)
+
+def unmarshal_Int(space, u, tc):
+    return W_IntObject(space, u.get_int())
+register(TYPE_INT, unmarshal_Int)
+
+def unmarshal_Int64(space, u, tc):
+    if LONG_BIT >= 64:
+        lo = u.get_int() & 0xffffffff
+        hi = u.get_int()
+        return W_IntObject(space, (hi << 32) | lo)
+    else:
+        # fall back to a long
+        # XXX at some point, we need to extend longobject
+        # by _PyLong_FromByteArray and _PyLong_AsByteArray.
+        # I will do that when implementing cPickle.
+        # for now, this rare case is solved the simple way.
+        lshift = longobject.lshift__Long_Long
+        longor = longobject.or__Long_Long
+        lo1 = space.newlong(u.get_short())
+        lo2 = space.newlong(u.get_short())
+        res = space.newlong(u.get_int())
+        nbits = space.newlong(16)
+        res = lshift(space, res, nbits)
+        res = longor(space, res, lo2)
+        res = lshift(space, res, nbits)
+        res = longor(space, res, lo1)
+        return res
+register(TYPE_INT64, unmarshal_Int64)
+
+# support for marshal version 2:
+# we call back into the struct module.
+# XXX struct should become interp-level.
+# XXX we also should have an rtyper operation
+# that allows typecasting between double and char{8}
+
+app = gateway.applevel(r'''
+    def float_to_str(fl):
+        import struct
+        return struct.pack('<d', fl)
+
+    def str_to_float(s):
+        import struct
+        return struct.unpack('<d', s)[0]
+''')
+
+float_to_str = app.interphook('float_to_str')
+str_to_float = app.interphook('str_to_float')
+
+def marshal_w__Float(space, w_float, m):
+    if m.version > 1:
+        m.start(TYPE_BINARY_FLOAT)
+        m.put(space.str_w(float_to_str(space, w_float)))
+    else:
+        m.start(TYPE_FLOAT)
+        m.put_pascal(space.str_w(repr_float(space, w_float)))
+
+def unmarshal_Float(space, u, tc):
+    if tc == TYPE_BINARY_FLOAT:
+        w_ret = str_to_float(space, space.wrap(u.get(8)))
+        fl = space.float_w(w_ret)
+    else:
+        fl = string_to_float(u.get_pascal())
+    return W_FloatObject(space, fl)
+register(TYPE_FLOAT + TYPE_BINARY_FLOAT, unmarshal_Float)
+
+# this is not a native type, yet, so we have to
+# dispatch on it in ANY
+
+def marshal_w_Complex(space, w_complex, m):
+    w_real = space.getattr(w_complex, space.wrap('real'))
+    w_imag = space.getattr(w_complex, space.wrap('imag'))
+    if m.version > 1:
+        m.start(TYPE_BINARY_COMPLEX)
+        m.put(space.str_w(float_to_str(space, w_real)))
+        m.put(space.str_w(float_to_str(space, w_imag)))
+    else:
+        m.start(TYPE_COMPLEX)
+        m.put_pascal(space.str_w(repr_float(space, w_real)))
+        m.put_pascal(space.str_w(repr_float(space, w_imag)))
+
+handled_by_any.append( ('complex', marshal_w_Complex) )
+
+def unmarshal_Complex(space, u, tc):
+    if tc == TYPE_BINARY_COMPLEX:
+        w_real = str_to_float(space, space.wrap(u.get(8)))
+        w_imag = str_to_float(space, space.wrap(u.get(8)))
+    else:
+        w_real = W_FloatObject(space, string_to_float(u.get_pascal()))
+        w_imag = W_FloatObject(space, string_to_float(u.get_pascal()))
+    w_t = space.builtin.get('complex')
+    return space.call_function(w_t, w_real, w_imag)
+register(TYPE_COMPLEX + TYPE_BINARY_COMPLEX, unmarshal_Complex)
+
+def marshal_w__Long(space, w_long, m):
+    assert long_bits == 15, """if long_bits is not 15,
+    we need to write much more general code for marshal
+    that breaks things into pieces, or invent a new
+    typecode and have our own magic number for pickling"""
+
+    m.start(TYPE_LONG)
+    lng = len(w_long.digits)
+    if w_long.sign < 0:
+        m.put_int(-lng)
+    else:
+        m.put_int(lng)
+    for digit in w_long.digits:
+        m.put_short(digit)
+
+def unmarshal_Long(space, u, tc):
+    lng = u.get_int()
+    if lng < 0:
+        sign = -1
+        lng = -lng
+    elif lng > 0:
+        sign = 1
+    else:
+        sign = 0
+    digits = [0] * lng
+    i = 0
+    while i < lng:
+        digit = u.get_short()
+        if digit < 0:
+            raise_exception(space, 'bad marshal data')
+        digits[i] = digit
+        i += 1
+    return W_LongObject(space, digits, sign)
+register(TYPE_LONG, unmarshal_Long)
+
+# XXX currently, intern() is at applevel,
+# and there is no interface to get at the
+# internal table.
+# Move intern to interplevel and add a flag
+# to strings.
+def PySTRING_CHECK_INTERNED(w_str):
+    return False
+
+def marshal_w__String(space, w_str, m):
+    # using the fastest possible access method here
+    # that does not touch the internal representation,
+    # which might change (array of bytes?)
+    s = w_str.unwrap()
+    if m.version >= 1 and PySTRING_CHECK_INTERNED(w_str):
+        # we use a native rtyper stringdict for speed
+        idx = m.stringtable.get(s, -1)
+        if idx >= 0:
+            m.atom_int(TYPE_STRINGREF, idx)
+        else:
+            idx = len(m.stringtable)
+            m.stringtable[s] = idx
+            m.atom_str(TYPE_INTERNED, s)
+    else:
+        m.atom_str(TYPE_STRING, s)
+
+def unmarshal_String(space, u, tc):
+    return W_StringObject(space, u.get_str())
+register(TYPE_STRING, unmarshal_String)
+
+def unmarshal_interned(space, u, tc):
+    w_ret = W_StringObject(space, u.get_str())
+    u.stringtable_w.append(w_ret)
+    w_intern = space.builtin.get('intern')
+    space.call_function(w_intern, w_ret)
+    return w_ret
+register(TYPE_INTERNED, unmarshal_interned)
+
+def unmarshal_stringref(space, u, tc):
+    idx = u.get_int()
+    try:
+        return u.stringtable_w[idx]
+    except IndexError:
+        raise_exception(space, 'bad marshal data')
+register(TYPE_STRINGREF, unmarshal_stringref)
+
+def marshal_w__Tuple(space, w_tuple, m):
+    m.start(TYPE_TUPLE)
+    m.put_list_w(w_tuple.wrappeditems, len(w_tuple.wrappeditems))
+
+def unmarshal_Tuple(space, u, tc):
+    items_w = u.get_list_w()
+    return W_TupleObject(space, items_w)
+register(TYPE_TUPLE, unmarshal_Tuple)
+
+def marshal_w__List(space, w_list, m):
+    m.start(TYPE_LIST)
+    n = w_list.ob_size
+    m.put_list_w(w_list.ob_item, n)
+
+def unmarshal_List(space, u, tc):
+    items_w = u.get_list_w()
+    return W_ListObject(space, items_w)
+
+def finish_List(space, items_w, typecode):
+    return W_ListObject(space, items_w)
+register(TYPE_LIST, unmarshal_List)
+
+def marshal_w__Dict(space, w_dict, m):
+    m.start(TYPE_DICT)
+    for entry in w_dict.data:
+        if entry.w_value is not None:
+            m.put_w_obj(entry.w_key)
+            m.put_w_obj(entry.w_value)
+    m.atom(TYPE_NULL)
+
+def unmarshal_Dict(space, u, tc):
+    items_w = []
+    while 1:
+        w_key = u.get_w_obj(True)
+        if w_key is None:
+            break
+        w_value = u.get_w_obj(False)
+        items_w.append( (w_key, w_value) )
+    return W_DictObject(space, items_w)
+register(TYPE_DICT, unmarshal_Dict)
+
+def unmarshal_NULL(space, u, tc):
+    return None
+register(TYPE_NULL, unmarshal_NULL)
+
+# this one is registered by hand:
+def marshal_w_pycode(space, w_pycode, m):
+    m.start(TYPE_CODE)
+    # see pypy.interpreter.pycode for the layout
+    x = space.interpclass_w(w_pycode)
+    assert isinstance(x, PyCode)
+    m.put_int(x.co_argcount)
+    m.put_int(x.co_nlocals)
+    m.put_int(x.co_stacksize)
+    m.put_int(x.co_flags)
+    m.atom_str(TYPE_STRING, x.co_code)
+    m.start(TYPE_TUPLE)
+    m.put_list_w(x.co_consts_w, len(x.co_consts_w))
+    m.atom_strlist(TYPE_TUPLE, TYPE_STRING, x.co_names)
+    m.atom_strlist(TYPE_TUPLE, TYPE_STRING, x.co_varnames)
+    m.atom_strlist(TYPE_TUPLE, TYPE_STRING, x.co_freevars)
+    m.atom_strlist(TYPE_TUPLE, TYPE_STRING, x.co_cellvars)
+    m.atom_str(TYPE_STRING, x.co_filename)
+    m.atom_str(TYPE_STRING, x.co_name)
+    m.put_int(x.co_firstlineno)
+    m.atom_str(TYPE_STRING, x.co_lnotab)
+
+StdObjSpace.MM.marshal_w.register(marshal_w_pycode, PyCode)
+
+def unmarshal_pycode(space, u, tc):
+    code = PyCode(space)
+    code.co_argcount    = u.get_int()
+    code.co_nlocals     = u.get_int()
+    code.co_stacksize   = u.get_int()
+    code.co_flags       = u.get_int()
+    code.co_code        = u.atom_str(TYPE_STRING)
+    u.start(TYPE_TUPLE)
+    code.co_consts_w    = u.get_list_w()
+    code.co_names       = u.atom_strlist(TYPE_TUPLE, TYPE_STRING)
+    code.co_varnames    = u.atom_strlist(TYPE_TUPLE, TYPE_STRING)
+    code.co_freevars    = u.atom_strlist(TYPE_TUPLE, TYPE_STRING)
+    code.co_cellvars    = u.atom_strlist(TYPE_TUPLE, TYPE_STRING)
+    code.co_filename    = u.atom_str(TYPE_STRING)
+    code.co_name        = u.atom_str(TYPE_STRING)
+    code.co_firstlineno = u.get_int()
+    code.co_lnotab      = u.atom_str(TYPE_STRING)
+    return space.wrap(code)
+register(TYPE_CODE, unmarshal_pycode)
+
+app = gateway.applevel(r'''
+    def PyUnicode_EncodeUTF8(data):
+        import _codecs
+        return _codecs.utf_8_encode(data)[0]
+
+    def PyUnicode_DecodeUTF8(data):
+        import _codecs
+        return _codecs.utf_8_decode(data)[0]
+''')
+
+PyUnicode_EncodeUTF8 = app.interphook('PyUnicode_EncodeUTF8')
+PyUnicode_DecodeUTF8 = app.interphook('PyUnicode_DecodeUTF8')
+
+def marshal_w__Unicode(space, w_unicode, m):
+    s = space.str_w(PyUnicode_EncodeUTF8(space, w_unicode))
+    m.atom_str(TYPE_UNICODE, s)
+
+def unmarshal_Unicode(space, u, tc):
+    return PyUnicode_DecodeUTF8(space, space.wrap(u.get_str()))
+register(TYPE_UNICODE, unmarshal_Unicode)
+
+app = gateway.applevel(r'''
+    def set_to_list(theset):
+        return [item for item in theset]
+
+    def list_to_set(datalist, frozen=False):
+        if frozen:
+            return frozenset(datalist)
+        return set(datalist)
+''')
+
+set_to_list = app.interphook('set_to_list')
+list_to_set = app.interphook('list_to_set')
+
+# not directly supported:
+def marshal_w_set(space, w_set, m):
+    w_lis = set_to_list(space, w_set)
+    # cannot access this list directly, because its
+    # type is not exactly known through applevel.
+    # otherwise, I would access ob_item and ob_size directly.
+    lis_w = space.unpackiterable(w_lis)
+    m.start(TYPE_SET)
+    m.put_list_w(lis_w, len(lis_w))
+
+handled_by_any.append( ('set', marshal_w_set) )
+
+# not directly supported:
+def marshal_w_frozenset(space, w_frozenset, m):
+    w_lis = set_to_list(space, w_frozenset)
+    lis_w = space.unpackiterable(w_lis)
+    m.start(TYPE_FROZENSET)
+    m.put_list_w(lis_w, len(lis_w))
+
+handled_by_any.append( ('frozenset', marshal_w_frozenset) )
+
+def unmarshal_set_frozenset(space, u, tc):
+    items_w = u.get_list_w()
+    if tc == TYPE_SET:
+        w_frozen = space.w_False
+    else:
+        w_frozen = space.w_True
+    w_lis = W_ListObject(space, items_w)
+    return list_to_set(space, w_lis, w_frozen)
+register(TYPE_SET + TYPE_FROZENSET, unmarshal_set_frozenset)
+
+# dispatching for all not directly dispatched types
+def marshal_w__ANY(space, w_obj, m):
+    w_type = space.type(w_obj)
+    for name, func in handled_by_any:
+        w_t = space.builtin.get(name)
+        if space.is_true(space.issubtype(w_type, w_t)):
+            func(space, w_obj, m)
+            break
+    else:
+        raise_exception(space, "unmarshallable object")
+
+register_all(vars())

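Two of the integer layouts in marshal_impl.py are easy to get wrong, so here is a small sketch of both, written for a current CPython and using only `struct`. It assumes the `TYPE_LONG` payload is a signed 32-bit digit count followed by 16-bit fields holding 15-bit digits, least significant first (matching the `long_bits == 15` assertion), and that a `TYPE_INT64` payload is the low 32 bits followed by the high 32 bits. The function names are hypothetical.

```python
import marshal
import struct

SHIFT = 15                  # bits per digit, as asserted in marshal_w__Long
MASK = (1 << SHIFT) - 1

def encode_long(n):
    """'l', signed digit count (negative for n < 0), then the 15-bit digits."""
    digits = []
    v = abs(n)
    while v:
        digits.append(v & MASK)
        v >>= SHIFT
    count = -len(digits) if n < 0 else len(digits)
    return (b'l' + struct.pack('<i', count)
            + b''.join(struct.pack('<H', d) for d in digits))

def decode_long(data):
    """Inverse of encode_long."""
    assert data[:1] == b'l'
    (count,) = struct.unpack_from('<i', data, 1)
    sign = -1 if count < 0 else 1
    value = 0
    for i in range(abs(count)):
        (digit,) = struct.unpack_from('<H', data, 5 + 2 * i)
        value |= digit << (SHIFT * i)
    return sign * value

def decode_int64(payload):
    """Combine the unsigned low half and the signed high half with a bitwise or."""
    lo, hi = struct.unpack('<Ii', payload)
    return (hi << 32) | lo

if __name__ == '__main__':
    n = -1234567890123456789012345678901234567890
    s = encode_long(n)
    print(decode_long(s) == n, marshal.loads(s) == n)
    print(decode_int64(struct.pack('<q', -2**40)) == -2**40)
```

The low half has to be masked to a full 32 bits and merged with `|`; a logical `or` would simply return the shifted high half whenever it is non-zero.
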
Modified: pypy/dist/pypy/objspace/std/model.py
==============================================================================
--- pypy/dist/pypy/objspace/std/model.py	(original)
+++ pypy/dist/pypy/objspace/std/model.py	Wed Aug  3 06:48:41 2005
@@ -5,7 +5,8 @@
 
 from pypy.objspace.std.multimethod import MultiMethodTable, FailedToImplement
 from pypy.interpreter.baseobjspace import W_Root, ObjSpace
-
+import pypy.interpreter.pycode
+import pypy.interpreter.special
 
 class StdTypeModel:
 
@@ -52,6 +53,7 @@
         from pypy.objspace.std import dictproxyobject
         from pypy.objspace.std import fake
         import pypy.objspace.std.default # register a few catch-all multimethods
+        import pypy.objspace.std.marshal_impl # install marshal multimethods
 
         # the set of implementation types
         self.typeorder = {
@@ -71,6 +73,8 @@
             iterobject.W_SeqIterObject: [],
             unicodeobject.W_UnicodeObject: [],
             dictproxyobject.W_DictProxyObject: [],
+            pypy.interpreter.pycode.PyCode: [],
+            pypy.interpreter.special.Ellipsis: [],
             }
         for type in self.typeorder:
             self.typeorder[type].append((type, None))

Modified: pypy/dist/pypy/objspace/std/objspace.py
==============================================================================
--- pypy/dist/pypy/objspace/std/objspace.py	(original)
+++ pypy/dist/pypy/objspace/std/objspace.py	Wed Aug  3 06:48:41 2005
@@ -441,7 +441,7 @@
         str_w   = MultiMethod('str_w', 1, [])     # returns an unwrapped string
         float_w = MultiMethod('float_w', 1, [])   # returns an unwrapped float
         uint_w  = MultiMethod('uint_w', 1, [])    # returns an unwrapped unsigned int (r_uint)
-        #marshal_w = MultiMethod('marshal_w', 1, [], extra_args=['foo'])
+        marshal_w = MultiMethod('marshal_w', 1, [], extra_args=['marshaller'])
 
         # add all regular multimethods here
         for _name, _symbol, _arity, _specialnames in ObjSpace.MethodTable:


