[pypy-commit] pypy win32-cleanup2: merge default into branch
mattip
noreply at buildbot.pypy.org
Tue Apr 24 23:08:55 CEST 2012
Author: Matti Picus <matti.picus at gmail.com>
Branch: win32-cleanup2
Changeset: r54736:6c8bf5a7e2e1
Date: 2012-04-25 00:08 +0300
http://bitbucket.org/pypy/pypy/changeset/6c8bf5a7e2e1/
Log: merge default into branch
diff --git a/pypy/doc/cppyy.rst b/pypy/doc/cppyy.rst
--- a/pypy/doc/cppyy.rst
+++ b/pypy/doc/cppyy.rst
@@ -111,6 +111,121 @@
That's all there is to it!
+Advanced example
+================
+The following snippet of C++ is very contrived, to allow showing that such
+pathological code can be handled and to show how certain features play out in
+practice::
+
+ $ cat MyAdvanced.h
+ #include <string>
+
+ class Base1 {
+ public:
+ Base1(int i) : m_i(i) {}
+ virtual ~Base1() {}
+ int m_i;
+ };
+
+ class Base2 {
+ public:
+ Base2(double d) : m_d(d) {}
+ virtual ~Base2() {}
+ double m_d;
+ };
+
+ class C;
+
+ class Derived : public virtual Base1, public virtual Base2 {
+ public:
+ Derived(const std::string& name, int i, double d) : Base1(i), Base2(d), m_name(name) {}
+ virtual C* gimeC() { return (C*)0; }
+ std::string m_name;
+ };
+
+ Base1* BaseFactory(const std::string& name, int i, double d) {
+ return new Derived(name, i, d);
+ }
+
+This code is still only in a header file, with all functions inline, for
+convenience of the example.
+If the implementations live in a separate source file or shared library, the
+only change needed is to link those in when building the reflection library.
+
+If you run ``genreflex`` as in the basic example above, you will
+find that not all classes of interest are reflected, nor is the
+global factory function.
+In particular, ``std::string`` will be missing, since it is not defined in
+this header file, but in a header file that is included.
+In practical terms, general classes such as ``std::string`` should live in a
+core reflection set, but for the moment assume we want to have it in the
+reflection library that we are building for this example.
+
+The ``genreflex`` script can be steered using a so-called `selection file`_,
+which is a simple XML file specifying, either explicitly or by using a
+pattern, which classes, variables, namespaces, etc. to select from the given
+header file.
+With the aid of a selection file, a large project can be easily managed:
+simply ``#include`` all relevant headers into a single header file that is
+handed to ``genreflex``.
+Then, apply a selection file to pick up all the relevant classes.
+For our purposes, the following rather straightforward selection will do
+(the name ``lcgdict`` for the root is historical, but required)::
+
+ $ cat MyAdvanced.xml
+ <lcgdict>
+ <class pattern="Base?" />
+ <class name="Derived" />
+ <class name="std::string" />
+ <function name="BaseFactory" />
+ </lcgdict>
+
+.. _`selection file`: http://root.cern.ch/drupal/content/generating-reflex-dictionaries
+
+Now the reflection info can be generated and compiled::
+
+ $ genreflex MyAdvanced.h --selection=MyAdvanced.xml
+ $ g++ -fPIC -rdynamic -O2 -shared -I$ROOTSYS/include MyAdvanced_rflx.cpp -o libAdvExDict.so
+
+and subsequently be used from PyPy::
+
+ >>>> import cppyy
+ >>>> cppyy.load_reflection_info("libAdvExDict.so")
+ <CPPLibrary object at 0x00007fdb48fc8120>
+ >>>> d = cppyy.gbl.BaseFactory("name", 42, 3.14)
+ >>>> type(d)
+ <class '__main__.Derived'>
+ >>>> d.m_i
+ 42
+ >>>> d.m_d
+ 3.14
+ >>>> d.m_name == "name"
+ True
+ >>>>
+
+Again, that's all there is to it!
+
+A couple of things to note, though.
+If you look back at the C++ definition of the ``BaseFactory`` function,
+you will see that it declares the return type to be a ``Base1*``, yet the
+bindings return an object of the actual type ``Derived``.
+This choice is made for a couple of reasons.
+First, it makes method dispatching easier: if bound objects are always their
+most derived type, then it is easy to calculate any offsets, if necessary.
+Second, it makes memory management easier: the combination of the type and
+the memory address uniquely identifies an object.
+That way, it can be recycled and object identity can be maintained if it is
+entered as a function argument into C++ and comes back to PyPy as a return
+value.
+Last, but not least, casting is decidedly unpythonistic.
+By always providing the most derived type known, casting becomes unnecessary.
+For example, the data member of ``Base2`` is simply directly available.
+Note also that the ``gimeC`` method of ``Derived``, whose return type ``C``
+is not reflected, does not preclude the use of ``Derived`` itself.
+It is only the ``gimeC`` method that is unusable as long as class ``C`` is
+unknown to the system.
+
+
Features
========
@@ -160,6 +275,8 @@
* **doc strings**: The doc string of a method or function contains the C++
arguments and return types of all overloads of that name, as applicable.
+* **enums**: Are translated as ints with no further checking.
+
* **functions**: Work as expected and live in their appropriate namespace
(which can be the global one, ``cppyy.gbl``).
diff --git a/pypy/doc/windows.rst b/pypy/doc/windows.rst
--- a/pypy/doc/windows.rst
+++ b/pypy/doc/windows.rst
@@ -24,7 +24,8 @@
translation. Failing that, they will pick the most recent Visual Studio
compiler they can find. In addition, the target architecture
(32 bits, 64 bits) is automatically selected. A 32 bit build can only be built
-using a 32 bit Python and vice versa.
+using a 32 bit Python and vice versa. By default pypy is built using the
+Multi-threaded DLL (/MD) runtime environment.
**Note:** PyPy is currently not supported for 64 bit Windows, and translation
will fail in this case.
@@ -102,10 +103,12 @@
Download the source code of expat on sourceforge:
http://sourceforge.net/projects/expat/ and extract it in the base
-directory. Then open the project file ``expat.dsw`` with Visual
+directory. Version 2.1.0 is known to pass tests. Then open the project
+file ``expat.dsw`` with Visual
Studio; follow the instruction for converting the project files,
-switch to the "Release" configuration, and build the solution (the
-``expat`` project is actually enough for pypy).
+switch to the "Release" configuration, reconfigure the runtime for
+Multi-threaded DLL (/MD) and build the solution (the ``expat`` project
+is actually enough for pypy).
Then, copy the file ``win32\bin\release\libexpat.dll`` somewhere in
your PATH.
diff --git a/pypy/jit/metainterp/heapcache.py b/pypy/jit/metainterp/heapcache.py
--- a/pypy/jit/metainterp/heapcache.py
+++ b/pypy/jit/metainterp/heapcache.py
@@ -20,6 +20,7 @@
self.dependencies = {}
# contains frame boxes that are not virtualizables
self.nonstandard_virtualizables = {}
+
# heap cache
# maps descrs to {from_box, to_box} dicts
self.heap_cache = {}
@@ -29,6 +30,26 @@
# cache the length of arrays
self.length_cache = {}
+ # replace_box is called surprisingly often, therefore it's not efficient
+ # to go over all the dicts and fix them.
+ # instead, these two dicts are kept, and a replace_box adds an entry to
+ # each of them.
+ # every time one of the dicts heap_cache, heap_array_cache, length_cache
+ # is accessed, suitable indirections need to be performed
+
+ # this looks all very subtle, but in practice the patterns of
+ # replacements should not be that complex. Usually a box is replaced by
+ # a const, once. Also, if something goes wrong, the effect is that less
+ # caching than possible is done, which is not a huge problem.
+ self.input_indirections = {}
+ self.output_indirections = {}
+
+ def _input_indirection(self, box):
+ return self.input_indirections.get(box, box)
+
+ def _output_indirection(self, box):
+ return self.output_indirections.get(box, box)
+
def invalidate_caches(self, opnum, descr, argboxes):
self.mark_escaped(opnum, argboxes)
self.clear_caches(opnum, descr, argboxes)
@@ -132,14 +153,16 @@
self.arraylen_now_known(box, lengthbox)
def getfield(self, box, descr):
+ box = self._input_indirection(box)
d = self.heap_cache.get(descr, None)
if d:
tobox = d.get(box, None)
- if tobox:
- return tobox
+ return self._output_indirection(tobox)
return None
def getfield_now_known(self, box, descr, fieldbox):
+ box = self._input_indirection(box)
+ fieldbox = self._input_indirection(fieldbox)
self.heap_cache.setdefault(descr, {})[box] = fieldbox
def setfield(self, box, descr, fieldbox):
@@ -148,6 +171,8 @@
self.heap_cache[descr] = new_d
def _do_write_with_aliasing(self, d, box, fieldbox):
+ box = self._input_indirection(box)
+ fieldbox = self._input_indirection(fieldbox)
# slightly subtle logic here
# a write to an arbitrary box, all other boxes can alias this one
if not d or box not in self.new_boxes:
@@ -166,6 +191,7 @@
return new_d
def getarrayitem(self, box, descr, indexbox):
+ box = self._input_indirection(box)
if not isinstance(indexbox, ConstInt):
return
index = indexbox.getint()
@@ -173,9 +199,11 @@
if cache:
indexcache = cache.get(index, None)
if indexcache is not None:
- return indexcache.get(box, None)
+ return self._output_indirection(indexcache.get(box, None))
def getarrayitem_now_known(self, box, descr, indexbox, valuebox):
+ box = self._input_indirection(box)
+ valuebox = self._input_indirection(valuebox)
if not isinstance(indexbox, ConstInt):
return
index = indexbox.getint()
@@ -198,25 +226,13 @@
cache[index] = self._do_write_with_aliasing(indexcache, box, valuebox)
def arraylen(self, box):
- return self.length_cache.get(box, None)
+ box = self._input_indirection(box)
+ return self._output_indirection(self.length_cache.get(box, None))
def arraylen_now_known(self, box, lengthbox):
- self.length_cache[box] = lengthbox
-
- def _replace_box(self, d, oldbox, newbox):
- new_d = {}
- for frombox, tobox in d.iteritems():
- if frombox is oldbox:
- frombox = newbox
- if tobox is oldbox:
- tobox = newbox
- new_d[frombox] = tobox
- return new_d
+ box = self._input_indirection(box)
+ self.length_cache[box] = self._input_indirection(lengthbox)
def replace_box(self, oldbox, newbox):
- for descr, d in self.heap_cache.iteritems():
- self.heap_cache[descr] = self._replace_box(d, oldbox, newbox)
- for descr, d in self.heap_array_cache.iteritems():
- for index, cache in d.iteritems():
- d[index] = self._replace_box(cache, oldbox, newbox)
- self.length_cache = self._replace_box(self.length_cache, oldbox, newbox)
+ self.input_indirections[self._output_indirection(newbox)] = self._input_indirection(oldbox)
+ self.output_indirections[self._input_indirection(oldbox)] = self._output_indirection(newbox)
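
The indirection scheme introduced in heapcache.py above can be tried out in isolation. What follows is a minimal sketch, not PyPy's actual HeapCache: the class name MiniHeapCache is hypothetical, boxes and descrs are plain strings, and setfield skips the aliasing logic of the real _do_write_with_aliasing. It only illustrates how replace_box can record two dictionary entries instead of rewriting every cache.

```python
class MiniHeapCache(object):
    """Toy model of lazy box replacement (hypothetical, not PyPy's HeapCache)."""

    def __init__(self):
        # maps descr -> {box: fieldbox}; keys and values are never rewritten
        self.heap_cache = {}
        # maps a replacement box back to the box that is stored in the caches
        self.input_indirections = {}
        # maps a stored box forward to its current replacement
        self.output_indirections = {}

    def _input_indirection(self, box):
        return self.input_indirections.get(box, box)

    def _output_indirection(self, box):
        return self.output_indirections.get(box, box)

    def setfield(self, box, descr, fieldbox):
        # simplified: the real code also handles aliasing between boxes
        box = self._input_indirection(box)
        fieldbox = self._input_indirection(fieldbox)
        self.heap_cache.setdefault(descr, {})[box] = fieldbox

    def getfield(self, box, descr):
        box = self._input_indirection(box)
        d = self.heap_cache.get(descr, None)
        if d:
            return self._output_indirection(d.get(box, None))
        return None

    def replace_box(self, oldbox, newbox):
        # constant-time: two dict writes, no walk over the caches
        stored = self._input_indirection(oldbox)
        current = self._output_indirection(newbox)
        self.input_indirections[current] = stored
        self.output_indirections[stored] = current


h = MiniHeapCache()
h.setfield("box1", "descr1", "box2")
h.setfield("box1", "descr2", "box3")
h.replace_box("box1", "box4")   # box1 is now known as box4
h.replace_box("box4", "box5")   # ... and then as box5
h.replace_box("box3", "box6")   # a stored value is replaced as well
```

After these calls, a lookup through "box5" (or the older name "box4") still reaches the entries stored under "box1", and the value stored as "box3" is reported as "box6". The cost is one extra dictionary lookup on every access in exchange for a replace_box that no longer scans the caches.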
diff --git a/pypy/jit/metainterp/optimizeopt/test/test_optimizebasic.py b/pypy/jit/metainterp/optimizeopt/test/test_optimizebasic.py
--- a/pypy/jit/metainterp/optimizeopt/test/test_optimizebasic.py
+++ b/pypy/jit/metainterp/optimizeopt/test/test_optimizebasic.py
@@ -7,7 +7,7 @@
import pypy.jit.metainterp.optimizeopt.optimizer as optimizeopt
import pypy.jit.metainterp.optimizeopt.virtualize as virtualize
from pypy.jit.metainterp.optimize import InvalidLoop
-from pypy.jit.metainterp.history import AbstractDescr, ConstInt, BoxInt
+from pypy.jit.metainterp.history import AbstractDescr, ConstInt, BoxInt, get_const_ptr_for_string
from pypy.jit.metainterp import executor, compile, resume, history
from pypy.jit.metainterp.resoperation import rop, opname, ResOperation
from pypy.rlib.rarithmetic import LONG_BIT
@@ -5067,6 +5067,25 @@
"""
self.optimize_strunicode_loop(ops, expected)
+ def test_call_pure_vstring_const(self):
+ ops = """
+ []
+ p0 = newstr(3)
+ strsetitem(p0, 0, 97)
+ strsetitem(p0, 1, 98)
+ strsetitem(p0, 2, 99)
+ i0 = call_pure(123, p0, descr=nonwritedescr)
+ finish(i0)
+ """
+ expected = """
+ []
+ finish(5)
+ """
+ call_pure_results = {
+ (ConstInt(123), get_const_ptr_for_string("abc"),): ConstInt(5),
+ }
+ self.optimize_loop(ops, expected, call_pure_results)
+
class TestLLtype(BaseTestOptimizeBasic, LLtypeMixin):
pass
diff --git a/pypy/jit/metainterp/test/test_heapcache.py b/pypy/jit/metainterp/test/test_heapcache.py
--- a/pypy/jit/metainterp/test/test_heapcache.py
+++ b/pypy/jit/metainterp/test/test_heapcache.py
@@ -2,12 +2,14 @@
from pypy.jit.metainterp.resoperation import rop
from pypy.jit.metainterp.history import ConstInt
-box1 = object()
-box2 = object()
-box3 = object()
-box4 = object()
+box1 = "box1"
+box2 = "box2"
+box3 = "box3"
+box4 = "box4"
+box5 = "box5"
lengthbox1 = object()
lengthbox2 = object()
+lengthbox3 = object()
descr1 = object()
descr2 = object()
descr3 = object()
@@ -276,11 +278,43 @@
h.setfield(box1, descr2, box3)
h.setfield(box2, descr3, box3)
h.replace_box(box1, box4)
- assert h.getfield(box1, descr1) is None
- assert h.getfield(box1, descr2) is None
assert h.getfield(box4, descr1) is box2
assert h.getfield(box4, descr2) is box3
assert h.getfield(box2, descr3) is box3
+ h.setfield(box4, descr1, box3)
+ assert h.getfield(box4, descr1) is box3
+
+ h = HeapCache()
+ h.setfield(box1, descr1, box2)
+ h.setfield(box1, descr2, box3)
+ h.setfield(box2, descr3, box3)
+ h.replace_box(box3, box4)
+ assert h.getfield(box1, descr1) is box2
+ assert h.getfield(box1, descr2) is box4
+ assert h.getfield(box2, descr3) is box4
+
+ def test_replace_box_twice(self):
+ h = HeapCache()
+ h.setfield(box1, descr1, box2)
+ h.setfield(box1, descr2, box3)
+ h.setfield(box2, descr3, box3)
+ h.replace_box(box1, box4)
+ h.replace_box(box4, box5)
+ assert h.getfield(box5, descr1) is box2
+ assert h.getfield(box5, descr2) is box3
+ assert h.getfield(box2, descr3) is box3
+ h.setfield(box5, descr1, box3)
+ assert h.getfield(box4, descr1) is box3
+
+ h = HeapCache()
+ h.setfield(box1, descr1, box2)
+ h.setfield(box1, descr2, box3)
+ h.setfield(box2, descr3, box3)
+ h.replace_box(box3, box4)
+ h.replace_box(box4, box5)
+ assert h.getfield(box1, descr1) is box2
+ assert h.getfield(box1, descr2) is box5
+ assert h.getfield(box2, descr3) is box5
def test_replace_box_array(self):
h = HeapCache()
@@ -291,9 +325,6 @@
h.setarrayitem(box3, descr2, index2, box1)
h.setarrayitem(box2, descr3, index2, box3)
h.replace_box(box1, box4)
- assert h.getarrayitem(box1, descr1, index1) is None
- assert h.getarrayitem(box1, descr2, index1) is None
- assert h.arraylen(box1) is None
assert h.arraylen(box4) is lengthbox1
assert h.getarrayitem(box4, descr1, index1) is box2
assert h.getarrayitem(box4, descr2, index1) is box3
@@ -304,6 +335,27 @@
h.replace_box(lengthbox1, lengthbox2)
assert h.arraylen(box4) is lengthbox2
+ def test_replace_box_array_twice(self):
+ h = HeapCache()
+ h.setarrayitem(box1, descr1, index1, box2)
+ h.setarrayitem(box1, descr2, index1, box3)
+ h.arraylen_now_known(box1, lengthbox1)
+ h.setarrayitem(box2, descr1, index2, box1)
+ h.setarrayitem(box3, descr2, index2, box1)
+ h.setarrayitem(box2, descr3, index2, box3)
+ h.replace_box(box1, box4)
+ h.replace_box(box4, box5)
+ assert h.arraylen(box4) is lengthbox1
+ assert h.getarrayitem(box5, descr1, index1) is box2
+ assert h.getarrayitem(box5, descr2, index1) is box3
+ assert h.getarrayitem(box2, descr1, index2) is box5
+ assert h.getarrayitem(box3, descr2, index2) is box5
+ assert h.getarrayitem(box2, descr3, index2) is box3
+
+ h.replace_box(lengthbox1, lengthbox2)
+ h.replace_box(lengthbox2, lengthbox3)
+ assert h.arraylen(box4) is lengthbox3
+
def test_ll_arraycopy(self):
h = HeapCache()
h.new_array(box1, lengthbox1)
diff --git a/pypy/module/cpyext/include/object.h b/pypy/module/cpyext/include/object.h
--- a/pypy/module/cpyext/include/object.h
+++ b/pypy/module/cpyext/include/object.h
@@ -38,10 +38,19 @@
PyObject_VAR_HEAD
} PyVarObject;
+#ifndef PYPY_DEBUG_REFCOUNT
#define Py_INCREF(ob) (Py_IncRef((PyObject *)ob))
#define Py_DECREF(ob) (Py_DecRef((PyObject *)ob))
#define Py_XINCREF(ob) (Py_IncRef((PyObject *)ob))
#define Py_XDECREF(ob) (Py_DecRef((PyObject *)ob))
+#else
+#define Py_INCREF(ob) (((PyObject *)ob)->ob_refcnt++)
+#define Py_DECREF(ob) ((((PyObject *)ob)->ob_refcnt > 1) ? \
+ ((PyObject *)ob)->ob_refcnt-- : (Py_DecRef((PyObject *)ob)))
+
+#define Py_XINCREF(op) do { if ((op) == NULL) ; else Py_INCREF(op); } while (0)
+#define Py_XDECREF(op) do { if ((op) == NULL) ; else Py_DECREF(op); } while (0)
+#endif
#define Py_CLEAR(op) \
do { \
diff --git a/pypy/rlib/runicode.py b/pypy/rlib/runicode.py
--- a/pypy/rlib/runicode.py
+++ b/pypy/rlib/runicode.py
@@ -1234,7 +1234,7 @@
pos += 1
continue
- if 0xD800 <= oc < 0xDC00 and pos + 1 < size:
+ if MAXUNICODE < 65536 and 0xD800 <= oc < 0xDC00 and pos + 1 < size:
# Map UTF-16 surrogate pairs to Unicode \UXXXXXXXX escapes
pos += 1
oc2 = ord(s[pos])
@@ -1350,6 +1350,20 @@
pos = 0
while pos < size:
oc = ord(s[pos])
+
+ if MAXUNICODE < 65536 and 0xD800 <= oc < 0xDC00 and pos + 1 < size:
+ # Map UTF-16 surrogate pairs to Unicode \UXXXXXXXX escapes
+ pos += 1
+ oc2 = ord(s[pos])
+
+ if 0xDC00 <= oc2 <= 0xDFFF:
+ ucs = (((oc & 0x03FF) << 10) | (oc2 & 0x03FF)) + 0x00010000
+ raw_unicode_escape_helper(result, ucs)
+ pos += 1
+ continue
+ # Fall through: isolated surrogates are copied as-is
+ pos -= 1
+
if oc < 0x100:
result.append(chr(oc))
else:
diff --git a/pypy/rlib/test/test_runicode.py b/pypy/rlib/test/test_runicode.py
--- a/pypy/rlib/test/test_runicode.py
+++ b/pypy/rlib/test/test_runicode.py
@@ -728,3 +728,18 @@
res = interpret(f, [0x10140])
assert res == 0x10140
+
+ def test_encode_surrogate_pair(self):
+ u = runicode.UNICHR(0xD800) + runicode.UNICHR(0xDC00)
+ if runicode.MAXUNICODE < 65536:
+ # Narrow unicode build, consider utf16 surrogate pairs
+ assert runicode.unicode_encode_unicode_escape(
+ u, len(u), True) == r'\U00010000'
+ assert runicode.unicode_encode_raw_unicode_escape(
+ u, len(u), True) == r'\U00010000'
+ else:
+ # Wide unicode build, don't merge utf16 surrogate pairs
+ assert runicode.unicode_encode_unicode_escape(
+ u, len(u), True) == r'\ud800\udc00'
+ assert runicode.unicode_encode_raw_unicode_escape(
+ u, len(u), True) == r'\ud800\udc00'
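
The surrogate-pair arithmetic used in the runicode.py hunks above can be checked on its own. The following is a minimal sketch under stated assumptions: combine_surrogates is a hypothetical helper, not part of runicode, and it works on a list of integer UTF-16 code units rather than on an RPython unicode string. It applies the same formula as the diff and leaves isolated surrogates untouched.

```python
def combine_surrogates(units):
    """Merge UTF-16 surrogate pairs in a list of code units into code points.

    Hypothetical helper for illustration; isolated surrogates are copied as-is.
    """
    result = []
    pos = 0
    size = len(units)
    while pos < size:
        oc = units[pos]
        # a high surrogate followed by at least one more code unit
        if 0xD800 <= oc < 0xDC00 and pos + 1 < size:
            oc2 = units[pos + 1]
            if 0xDC00 <= oc2 <= 0xDFFF:
                # low 10 bits of each half, offset by 0x10000
                ucs = (((oc & 0x03FF) << 10) | (oc2 & 0x03FF)) + 0x00010000
                result.append(ucs)
                pos += 2
                continue
        result.append(oc)
        pos += 1
    return result


def escape_code_point(ucs):
    """Format a code point above 0xFFFF as a \\UXXXXXXXX escape."""
    return "\\U%08x" % ucs
```

In the diff this merging is only performed when MAXUNICODE < 65536, that is, on narrow unicode builds; on wide builds the two surrogates are escaped separately, as the new test asserts.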