Bringing Cython and PyPy closer together
Hi,

I'm breaking out of the thread where this topic was started ("offtopic, ontopic, ...") because it is getting too long and unfocussed to follow.

The current state of the discussion seems to be that PyPy provides ways to talk to C code, but nothing as complete as CPython's C-API in the sense that it allows efficient two-way communication between C code and Python objects. Thus, we need to either improve this or look for alternatives.

In order to get us more focussed on what can be done and what the implications are, so that we may eventually be able to decide what should be done, I started a Wiki page for a PyPy backend CEP (Cython Enhancement Proposal):

http://wiki.cython.org/enhancements/pypy

Please add to it as you see fit. In general, it is ok for a CEP to present diverging arguments, but please try to give them a structure in order to keep the overall document focussed.

Stefan
Hey,

http://wiki.cython.org/enhancements/pypy

Nice overview of the discussion so far, as far as I could follow it. I hope others contribute!

Regards,
Martijn
Stefan Behnel, 15.02.2012 12:32:
The current state of the discussion seems to be that PyPy provides ways to talk to C code, but nothing as complete as CPython's C-API in the sense that it allows efficient two-way communication between C code and Python objects. Thus, we need to either improve this or look for alternatives.
In order to get us more focussed on what can be done and what the implications are, so that we may eventually be able to decide what should be done, I started a Wiki page for a PyPy backend CEP (Cython Enhancement Proposal).
The discussion so far makes me rather certain that the most promising short-term solution is to make Cython generate C code that PyPy's cpyext can handle. This should get us a rather broad set of running code somewhat quickly, while requiring the least amount of design-from-scratch work in a direction where we cannot yet see whether it will really make existing code work.

On top of the basic cpyext interface, it should then be easy to implement obvious optimisations like native C level calls to Cython wrapped functions from PyPy (and potentially also the other direction) and otherwise avoid boxing/unboxing where unnecessary, e.g. for builtins. After all, it all boils down to native code at some point and I'm sure there are various ways to exploit that.

Also, going this route will help both projects to get to know each other better. I think that's a required basis if we really aim for designing a more high-level interface at some point.

The first steps I see are:

- get Cython's test suite to run on PyPy
- analyse the failing tests and decide how to fix them
- adapt the Cython generated C code accordingly, special-casing for PyPy where required

Here is a "getting started" guide that tells you how testing works in Cython:

http://wiki.cython.org/HackerGuide

Once we have the test suite runnable, we can set up a PyPy instance on our CI server to get feedback on any advances.

https://sage.math.washington.edu:8091/hudson/

So, any volunteers or otherwise interested parties to help in getting this to work? Anyone in for financial support?

Stefan
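As a rough illustration of the third step, such special-casing typically ends up as preprocessor guards in the generated C code. The helper macro below is made up for illustration, and PYPY_VERSION is assumed to be the macro that PyPy's cpyext headers define::

    /* Sketch only: guard CPython-specific shortcuts and fall back to the
     * portable C-API under cpyext.  __Pyx_GetItemFast is a hypothetical
     * helper name, not part of Cython. */
    #ifdef PYPY_VERSION
      /* cpyext: play it safe and go through the API function */
      #define __Pyx_GetItemFast(tuple, i)  PyTuple_GetItem(tuple, i)
    #else
      /* CPython: the macro reads the item straight out of the struct */
      #define __Pyx_GetItemFast(tuple, i)  PyTuple_GET_ITEM(tuple, i)
    #endif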
Hi, 2012/2/18 Stefan Behnel <stefan_ml@behnel.de>
Stefan Behnel, 15.02.2012 12:32:
The current state of the discussion seems to be that PyPy provides ways to talk to C code, but nothing as complete as CPython's C-API in the sense that it allows efficient two-way communication between C code and Python objects. Thus, we need to either improve this or look for alternatives.
In order to get us more focussed on what can be done and what the implications are, so that we may eventually be able to decide what should be done, I started a Wiki page for a PyPy backend CEP (Cython Enhancement Proposal).
The discussion so far makes me rather certain that the most promising short-term solution is to make Cython generate C code that PyPy's cpyext can handle. This should get us a rather broad set of running code somewhat quickly, while requiring the least design-from-scratch type of work in a direction that does not yet allow us to see if it will really make existing code work or not.
On top of the basic cpyext interface, it should then be easy to implement obvious optimisations like native C level calls to Cython wrapped functions from PyPy (and potentially also the other direction) and otherwise avoid boxing/unboxing where unnecessary, e.g. for builtins. After all, it all boils down to native code at some point and I'm sure there are various ways to exploit that.
Also, going this route will help both projects to get to know each other better. I think that's a required basis if we really aim for designing a more high-level interface at some point.
The first steps I see are:
- get Cython's test suite to run on PyPy - analyse the failing tests and decide how to fix them - adapt the Cython generated C code accordingly, special casing for PyPy where required
Here is a "getting started" guide that tells you how testing works in Cython:
http://wiki.cython.org/HackerGuide
Once we have the test suite runnable, we can set up a PyPy instance on our CI server to get feed-back on any advances.
https://sage.math.washington.edu:8091/hudson/
So, any volunteers or otherwise interested parties to help in getting this to work? Anyone in for financial support?
Actually I spent several evenings on this. I made some modifications to pypy, cython and lxml, and now I can compile and install cython, lxml, and they seem to work! For example::

    html = etree.Element("html")
    body = etree.SubElement(html, "body")
    body.text = "TEXT"
    br = etree.SubElement(body, "br")
    br.tail = "TAIL"
    html.xpath("//text()")

Here are the changes I made, some parts are really hacks and should be polished:

lxml: http://paste.pocoo.org/show/552903/
cython: http://paste.pocoo.org/show/552904/
pypy changes are already submitted.

As expected, the example above is much slower on pypy, about 15x slower than with cpython2.6. And I still get crashes when running the lxml test suite. But the situation looks much better than before, support of all lxml features seems possible.

Cheers,

--
Amaury Forgeot d'Arc
Amaury Forgeot d'Arc, 18.02.2012 10:08:
2012/2/18 Stefan Behnel
Stefan Behnel, 15.02.2012 12:32:
So, any volunteers or otherwise interested parties to help in getting this to work? Anyone in for financial support?
Actually I spent several evenings on this. I made some modifications to pypy, cython and lxml, and now I can compile and install cython, lxml, and they seem to work!
For example:: html = etree.Element("html") body = etree.SubElement(html, "body") body.text = "TEXT" br = etree.SubElement(body, "br") br.tail = "TAIL" html.xpath("//text()")
Here are the changes I made, some parts are really hacks and should be polished: lxml: http://paste.pocoo.org/show/552903/ cython: http://paste.pocoo.org/show/552904/ pypy changes are already submitted.
Cool. Most of the changes look reasonable at first glance. I'll see what I can apply on my side. We may get at least some of this into Cython 0.16 (which is close to release).
As expected, the example above is much slower on pypy, about 15x slower than with cpython2.6.
Given that XML processing is currently slower in PyPy than in CPython, I don't think that's all that bad. Users can still switch their imports to ElementTree if they only want to push XML out and I imagine that lxml would still be at least as fast as ElementTree under PyPy for the way in.
And I still get crashes when running the lxml test suite.
I can imagine. ;)
But the situation looks much better than before, support of all lxml features seems possible.
I think that's what matters to most users who want to do XML processing in PyPy. Stefan
On Sat, Feb 18, 2012 at 11:27 AM, Stefan Behnel <stefan_ml@behnel.de> wrote:
Amaury Forgeot d'Arc, 18.02.2012 10:08:
2012/2/18 Stefan Behnel
Stefan Behnel, 15.02.2012 12:32:
So, any volunteers or otherwise interested parties to help in getting this to work? Anyone in for financial support?
Actually I spent several evenings on this. I made some modifications to pypy, cython and lxml, and now I can compile and install cython, lxml, and they seem to work!
For example:: html = etree.Element("html") body = etree.SubElement(html, "body") body.text = "TEXT" br = etree.SubElement(body, "br") br.tail = "TAIL" html.xpath("//text()")
Here are the changes I made, some parts are really hacks and should be polished: lxml: http://paste.pocoo.org/show/552903/ cython: http://paste.pocoo.org/show/552904/ pypy changes are already submitted.
Cool. Most of the changes look reasonable at first glance. I'll see what I can apply on my side. We may get at least some of this into Cython 0.16 (which is close to release).
As expected, the example above is much slower on pypy, about 15x slower than with cpython2.6.
Given that XML processing is currently slower in PyPy than in CPython, I don't think that's all that bad. Users can still switch their imports to ElementTree if they only want to push XML out and I imagine that lxml would still be at least as fast as ElementTree under PyPy for the way in.
Are you sure actually?
Maciej Fijalkowski, 18.02.2012 10:35:
On Sat, Feb 18, 2012 at 11:27 AM, Stefan Behnel wrote:
Given that XML processing is currently slower in PyPy than in CPython, I don't think that's all that bad. Users can still switch their imports to ElementTree if they only want to push XML out and I imagine that lxml would still be at least as fast as ElementTree under PyPy for the way in.
Are you sure actually?
I'm sure it's currently much slower, see here: http://blog.behnel.de/index.php?p=210 I'm not sure the quickly patched lxml is as fast in PyPy as it is in CPython, but there is certainly room for improvements, as I mentioned before. A substantial part of it runs in properly hand tuned C, after all, and thus doesn't need to go through cpyext or otherwise talk to PyPy. Stefan
On Sat, Feb 18, 2012 at 11:48 AM, Stefan Behnel <stefan_ml@behnel.de> wrote:
Maciej Fijalkowski, 18.02.2012 10:35:
On Sat, Feb 18, 2012 at 11:27 AM, Stefan Behnel wrote:
Given that XML processing is currently slower in PyPy than in CPython, I don't think that's all that bad. Users can still switch their imports to ElementTree if they only want to push XML out and I imagine that lxml would still be at least as fast as ElementTree under PyPy for the way in.
Are you sure actually?
I'm sure it's currently much slower, see here:
http://blog.behnel.de/index.php?p=210
I'm not sure the quickly patched lxml is as fast in PyPy as it is in CPython, but there is certainly room for improvements, as I mentioned before. A substantial part of it runs in properly hand tuned C, after all, and thus doesn't need to go through cpyext or otherwise talk to PyPy.
Stefan
Can you please send me or post somewhere numbers? I'm fairly bad at trying to deduce them from the graph (although that doesn't change that the graph is very likely more readable).

I'm not sure there are easy ways to optimize. Sure cpyext is slower than ctypes, but we cannot achieve much more than that. Certainly we cannot do unboxing (boxes might be only produced to make cpyext happy for example). Unless I'm misunderstanding your intentions, can you elaborate?

I somehow doubt it's possible to make this run fast using cpyext (although there are definitely some ways). Maybe speeding up ElementTree would be the way if all we want to get is a fast XML processor? I doubt this is the case though.

I'm waiting for other insights, I'm a bit clueless.

Cheers,
fijal
Maciej Fijalkowski, 18.02.2012 10:56:
On Sat, Feb 18, 2012 at 11:48 AM, Stefan Behnel wrote:
Maciej Fijalkowski, 18.02.2012 10:35:
On Sat, Feb 18, 2012 at 11:27 AM, Stefan Behnel wrote:
Given that XML processing is currently slower in PyPy than in CPython, I don't think that's all that bad. Users can still switch their imports to ElementTree if they only want to push XML out and I imagine that lxml would still be at least as fast as ElementTree under PyPy for the way in.
Are you sure actually?
I'm sure it's currently much slower, see here:
http://blog.behnel.de/index.php?p=210
I'm not sure the quickly patched lxml is as fast in PyPy as it is in CPython, but there is certainly room for improvements, as I mentioned before. A substantial part of it runs in properly hand tuned C, after all, and thus doesn't need to go through cpyext or otherwise talk to PyPy.
Can you please send me or post somewhere numbers? I'm fairly bad at trying to deduce them from the graph (although that doesn't change that the graph is very likely more readable).
You can get the code and the input files from here: http://lxml.de/etbench.tar.bz2 Note that this only compares the parser performance, but given how much faster CPython is here (we are talking seconds!), it'll be hard enough for PyPy to catch up with anything after such a head start.
I'm not sure there are easy ways to optimize. Sure cpyext is slower than ctypes, but we cannot achieve much more than that. Certainly we cannot do unboxing (boxes might be only produced to make cpyext happy for example). Unless I'm misunderstanding your intentions, can you elaborate?
If you are referring to my comments on a faster interconnection towards Cython, I think it should be quite easy for PyPy to bypass Cython's Python function wrapper implementation (of type "CyFunction") to call directly into the underlying C function, with unboxed parameters. Cython could provide some sort of C signature introspection feature that PyPy could analyse and optimise for.

But that's only one direction. Calling from Cython code back into PyPy compiled code efficiently is (from what I've heard so far) likely going to be trickier, because Cython would have to know how to do that at C compilation time at the latest, while PyPy decides about these things at runtime.

Still, there could be a way for Cython to tell PyPy about the signature it wants to use for a specific unboxed call, and PyPy could generate an optimised wrapper for that and eventually a specialised function implementation for the statically known set of argument types in a specific call. A simple varargs approach may work here, imagine something like this::

    error = PyPy_CallFunctionWithSignature(
        func_obj_ptr, "(int, object, list, option=bint) -> int",
        i, object_ptr, list_obj_ptr, 0, int_result*)

And given that the constant signature string would be interned by the C compiler, a simple pointer comparison should suffice for branching inside of PyPy. That would in many cases drop the need to convert all parameters to objects and to pack them into an argument tuple. Even keyword arguments could often be folded into positional varargs, as I indicated above. I can well imagine that PyPy could render such a calling convention quite efficient.
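To illustrate the pointer-comparison idea, here is a minimal sketch of what a signature-keyed dispatch table inside PyPy might look like; the names and the cache structure are hypothetical, and the whole calling convention above is a proposal rather than an existing API::

    /* Sketch only: branch on the address of the interned signature string
     * instead of comparing its contents. */
    typedef struct {
        const char *signature;     /* compared by pointer, not by strcmp() */
        void *specialised_entry;   /* unboxed entry point for this signature */
    } SigCacheEntry;

    static void *lookup_specialisation(SigCacheEntry *cache, size_t n,
                                       const char *signature)
    {
        size_t i;
        for (i = 0; i < n; i++) {
            if (cache[i].signature == signature)   /* pointer comparison */
                return cache[i].specialised_entry;
        }
        return NULL;   /* fall back to the generic boxed call */
    }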
I somehow doubt it's possible to make this run fast using cpyext (although there are definitely some ways). Maybe speeding up ElementTree would be the way if all we want to get is a fast XML processor? I doubt this is the case though.
cElementTree, being younger, may contain a couple of optimisations that ElementTree lacks, but I doubt that there is still so much to get out of it. ET has already received quite a bit of tuning in the past (although not for PyPy). The main problem seems to be that the SAX driven callbacks from the expat parser inject too much overhead. And any decently sized XML document will trigger a lot of those. Stefan
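For reference, the callback pattern in question looks roughly like this on the C side; a minimal, self-contained sketch of the standard expat API, not lxml's or cElementTree's actual code::

    #include <expat.h>

    /* expat calls back into these handlers for every start tag, end tag
     * and text chunk, so a large document triggers a huge number of them. */
    static void XMLCALL on_start(void *userdata, const XML_Char *name,
                                 const XML_Char **attrs)
    {
        (void)userdata; (void)name; (void)attrs;  /* a real binding builds a node here */
    }

    static void XMLCALL on_end(void *userdata, const XML_Char *name)
    {
        (void)userdata; (void)name;
    }

    static void setup_handlers(XML_Parser parser)
    {
        XML_SetElementHandler(parser, on_start, on_end);
    }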
Maciej Fijalkowski, 18.02.2012 10:56:
On Sat, Feb 18, 2012 at 11:48 AM, Stefan Behnel wrote:
Maciej Fijalkowski, 18.02.2012 10:35:
On Sat, Feb 18, 2012 at 11:27 AM, Stefan Behnel wrote:
Given that XML processing is currently slower in PyPy than in CPython, I don't think that's all that bad. Users can still switch their imports to ElementTree if they only want to push XML out and I imagine that lxml would still be at least as fast as ElementTree under PyPy for the way in.
Are you sure actually?
I'm sure it's currently much slower, see here:
Can you please send me or post somewhere numbers? I'm fairly bad at trying to deduce them from the graph (although that doesn't change that the graph is very likely more readable).
I just noticed that I still had them lying around, so I'm pasting them here as a table. Columns: 274KB hamlet.xml, 3.4MB ot.xml, 25MB Slavic text, 4.5MB structure, 6.2MB structure PP.

                        hamlet.xml   ot.xml   Slavic   structure   structure PP
    MiniDOM (PyPy)      0.091        0.369    2.441    1.363       3.152
    MiniDOM             0.152        0.672    6.193    5.935       8.705
    lxml.etree          0.014        0.041    0.454    0.131       0.199
    ElementTree (PyPy)  0.045        0.293    2.282    1.005       2.247
    ElementTree         0.104        0.385    3.274    3.374       4.178
    cElementTree        0.022        0.056    0.459    0.192       0.478

This was using PyPy 1.7, times are in seconds. As you can see, CPython is faster by a factor of 5-10 for the larger files.

Stefan
Hi there, On Sat, Feb 18, 2012 at 10:56 AM, Maciej Fijalkowski <fijall@gmail.com> wrote:
I somehow doubt it's possible to make this run fast using cpyext (although there are definitely some ways). Maybe speeding up ElementTree would be the way if all we want to get is a fast XML processor? I doubt this is the case though.
lxml does a heck of a lot more than ElementTree. If all you need is a fast parser and serializer, I figure you could wrap cElementTree using rpython or something like that. But lxml does a lot of other things.

For some insight into why people would want lxml, there's an interesting discussion on google app engine's bug tracker about it:

http://code.google.com/p/googleappengine/issues/detail?id=18

This type of discussion is instructive, as PyPy's barriers to porting C extensions, while not a matter of "just wait for google" as in the app engine case, are still somewhat similar. In the end google finally ended up supporting numpy, lxml and a few other modules.

Regards,
Martijn
Martijn Faassen, 18.02.2012 14:11:
For some insight into why people would want lxml, there's an interesting discussion on google app engine's bug tracker about it.
http://code.google.com/p/googleappengine/issues/detail?id=18
This type of discussion is instructive, as PyPy's barriers to porting C extensions, while not a matter of "just wait for google" as in the app engine case, are still somewhat similar. In the end google finally ended up supporting numpy, lxml and a few other modules.
Whoopsa! I totally wasn't aware that they actually did something about that seriously long-standing feature request. That's cool. Stefan
Amaury Forgeot d'Arc, 18.02.2012 10:08:
I made some modifications to pypy, cython and lxml, and now I can compile and install cython, lxml, and they seem to work!
For example:: html = etree.Element("html") body = etree.SubElement(html, "body") body.text = "TEXT" br = etree.SubElement(body, "br") br.tail = "TAIL" html.xpath("//text()")
Here are the changes I made, some parts are really hacks and should be polished: lxml: http://paste.pocoo.org/show/552903/
The weakref changes are really unfortunate as they appear in one of the most performance critical spots of lxml's API: on-the-fly proxy creation. I can understand why the original code won't work as is, but could you elaborate on why the weak references are needed? Maybe there is a faster way of doing this? Stefan
2012/2/18 Stefan Behnel <stefan_ml@behnel.de>
The weakref changes are really unfortunate as they appear in one of the most performance critical spots of lxml's API: on-the-fly proxy creation.
I can understand why the original code won't work as is, but could you elaborate on why the weak references are needed? Maybe there is a faster way of doing this?
PyObject->ob_refcnt only counts the number of PyObject references to the object, not any references held by other parts of the pypy interpreter. For example, PyTuple_GetItem() often returns something with refcnt=1; two calls to "PyObject *x = PyTuple_GetItem(tuple, 0); Py_DECREF(x);" will return different values for the x pointer.

But this model has issues with borrowed references. For example, this code is valid CPython, but will crash with cpyext::

    PyObject *exc = PyErr_NewException("error", PyExc_StandardError, NULL);
    PyDict_SetItemString(module_dict, "error", exc);
    Py_DECREF(exc);
    // exc is now a borrowed reference, but the following line crashes pypy:
    PyObject *another = PyErr_NewException("AnotherError", exc, NULL);
    PyDict_SetItemString(module_dict, "AnotherError", another);
    Py_DECREF(another);

In CPython, the code can continue using the created object: we don't own the reference, exc is now a borrowed reference, valid as long as the containing dict is valid. The refcount is 1 when the object is created, incremented when PyDict_SetItem stores it, and 1 again after the DECREF.

PyPy does it differently: a dictionary does not store PyObject* pointers, but "pypy objects" with no reference counting, and whose address can change with a gc collection. PyDict_SetItemString will not change exc->refcnt, which will remain 1, then Py_DECREF will free the memory pointed to by exc.

There are mechanisms to keep the reference a bit longer, for example PyTuple_GET_ITEM will return a "temporary" reference that will be released when the tuple loses its last cpyext reference.

Another way to say this is that with cpyext, a borrowed reference has to borrow from some other reference that you own. It can be a container, or in some cases the current "context", i.e. something that has the duration of the current C call. Otherwise, weak references must be used instead.

--
Amaury Forgeot d'Arc
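A minimal sketch of the "borrow from a reference you own" pattern described above, based on the same example; the idea is simply to keep your own reference alive for as long as the C code still uses the object (error checking omitted)::

    PyObject *exc = PyErr_NewException("error", PyExc_StandardError, NULL);
    PyDict_SetItemString(module_dict, "error", exc);
    /* do NOT drop our reference yet -- exc is still used below */
    PyObject *another = PyErr_NewException("AnotherError", exc, NULL);
    PyDict_SetItemString(module_dict, "AnotherError", another);
    Py_DECREF(another);
    Py_DECREF(exc);   /* release our reference only after the last use */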
Hey, On Sat, Feb 18, 2012 at 11:20 AM, Stefan Behnel <stefan_ml@behnel.de> wrote:
Amaury Forgeot d'Arc, 18.02.2012 10:08:
I made some modifications to pypy, cython and lxml, and now I can compile and install cython, lxml, and they seem to work!
For example:: html = etree.Element("html") body = etree.SubElement(html, "body") body.text = "TEXT" br = etree.SubElement(body, "br") br.tail = "TAIL" html.xpath("//text()")
Here are the changes I made, some parts are really hacks and should be polished: lxml: http://paste.pocoo.org/show/552903/
The weakref changes are really unfortunate as they appear in one of the most performance critical spots of lxml's API: on-the-fly proxy creation.
In fact I remember using weak references in early versions of lxml, and getting rid of them at the time sped things up a lot. Regards, Martijn
Stefan Behnel, 18.02.2012 11:20:
Amaury Forgeot d'Arc, 18.02.2012 10:08:
I made some modifications to pypy, cython and lxml, and now I can compile and install cython, lxml, and they seem to work!
For example:: html = etree.Element("html") body = etree.SubElement(html, "body") body.text = "TEXT" br = etree.SubElement(body, "br") br.tail = "TAIL" html.xpath("//text()")
Here are the changes I made, some parts are really hacks and should be polished: lxml: http://paste.pocoo.org/show/552903/
The weakref changes are really unfortunate as they appear in one of the most performance critical spots of lxml's API: on-the-fly proxy creation.
To give an idea of how much overhead there is, here's a micro-benchmark. First, parsing::

    $ python2.7 -m timeit -s 'import lxml.etree as et' \
          'et.parse("input3.xml")'
    10 loops, best of 3: 136 msec per loop

    $ pypy -m timeit -s 'import lxml.etree as et' \
          'et.parse("input3.xml")'
    10 loops, best of 3: 127 msec per loop

I have no idea why pypy is faster here - there really isn't any interaction with the core during XML parsing, certainly nothing that would account for some 7% of the runtime. Maybe some kind of building, benchmarking or whatever fault on my side. Anyway, parsing is clearly in the same ballpark for both.

However, when it comes to element proxy instantiation (collecting all elements in the XML tree here as a worst-case example), there is a clear disadvantage for PyPy::

    $ python2.7 -m timeit -s 'import lxml.etree as et; \
          el=et.parse("input3.xml").getroot()' 'list(el.iter())'
    10 loops, best of 3: 84 msec per loop

    $ pypy -m timeit -s 'import lxml.etree as et; \
          el=et.parse("input3.xml").getroot()' 'list(el.iter())'
    10 loops, best of 3: 1.29 sec per loop

That's about the same factor of 15 that you got. This may or may not matter to applications, though, because there are many tools in lxml that allow users to be very selective about which proxies they want to see instantiated, and to otherwise let a lot of functionality execute in C. So applications may get away with a performance hit below that factor in practice.

What certainly matters for applications is to get the feature set of lxml within PyPy.

Stefan
Amaury Forgeot d'Arc, 18.02.2012 10:08:
I made some modifications to pypy, cython and lxml, and now I can compile and install cython, lxml, and they seem to work!
For example:: html = etree.Element("html") body = etree.SubElement(html, "body") body.text = "TEXT" br = etree.SubElement(body, "br") br.tail = "TAIL" html.xpath("//text()")
Here are the changes I made, some parts are really hacks and should be polished: lxml: http://paste.pocoo.org/show/552903/
I attached a reworked but untested patch for lxml. Could you try it? I couldn't find the PyWeakref_LockObject() function anywhere. That's a PyPy-only thing, right? I aliased it to (NULL) when compiling for CPython. I'll go through the Cython changes next, so I haven't got a working Cython version yet. Stefan
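For readers following along, the "aliased it to (NULL)" part presumably amounts to a compatibility define along these lines (a sketch only; the guard macro and its placement are assumptions, not the actual patch)::

    /* Sketch: make the PyPy-only call compile on CPython as well. */
    #ifndef PYPY_VERSION
    #define PyWeakref_LockObject(ref)  (NULL)
    #endif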
2012/2/18 Stefan Behnel <stefan_ml@behnel.de>
I couldn't find the PyWeakref_LockObject() function anywhere. That's a PyPy-only thing, right? I aliased it to (NULL) when compiling for CPython.
Yes, this function is PyPy-only, to fix a flaw of PyWeakref_GetObject: it returns a borrowed reference, which is very dangerous because the object can disappear anytime: with a garbage collection, or another thread... Fortunately the GIL is here to protect you, but the only sane thing to do is to quickly INCREF the returned reference. PyWeakref_LockObject directly returns a new reference.

In PyPy, the behavior of PyWeakref_GetObject is even worse: to avoid returning a refcount of zero, the returned reference is attached to the weakref, effectively turning it into a strong reference!

I now realize that this was written before we implemented the "temporary container" for borrowed references: PyWeakref_GetObject() could return a reference valid for the duration of the current C call.

--
Amaury Forgeot d'Arc
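A minimal sketch of the two idioms being compared: the CPython-portable "INCREF the borrowed reference right away" pattern versus the PyPy-only helper; error handling is simplified and the surrounding context (a live weakref object) is assumed::

    /* CPython-portable idiom: take a strong reference immediately. */
    PyObject *obj = PyWeakref_GetObject(weakref);   /* borrowed, may be Py_None */
    Py_XINCREF(obj);                                /* now we own it */
    if (obj != NULL && obj != Py_None) {
        /* ... safe to use obj here ... */
    }
    Py_XDECREF(obj);

    /* PyPy-only helper discussed above: returns a new reference directly. */
    /* PyObject *obj2 = PyWeakref_LockObject(weakref); ... Py_XDECREF(obj2); */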
Amaury Forgeot d'Arc, 18.02.2012 15:18:
2012/2/18 Stefan Behnel
I couldn't find the PyWeakref_LockObject() function anywhere. That's a PyPy-only thing, right? I aliased it to (NULL) when compiling for CPython.
Yes, this function is PyPy-only, to fix a flaw of PyWeakref_GetObject: it returns a borrowed reference, which is very dangerous because the object can disappear anytime: with a garbage collection, or another thread... Fortunately the GIL is here to protect you, but the only sane thing to do is to quickly INCREF the returned reference. PyWeakref_LockObject directly returns a new reference.
In PyPy, the behavior of PyWeakref_GetObject is even worse: to avoid returning a refcount of zero, the returned reference is attached to the weakref, effectively turning it into a strong reference! I now realize that this was written before we implemented the "temporary container" for borrowed references: PyWeakref_GetObject() could return a reference valid for the duration of the current C call.
Do you mean that you could actually fix this in PyPy so that lxml won't have to use that function? Or would it still have to use it, because the references are stored away and thus become long-living? (i.e. longer than the C call) Stefan
Amaury Forgeot d'Arc, 18.02.2012 10:08:
2012/2/18 Stefan Behnel I made some modifications to pypy, cython and lxml, and now I can compile and install cython, lxml, and they seem to work!
Here are the changes I made, some parts are really hacks and should be polished: lxml: http://paste.pocoo.org/show/552903/ cython: http://paste.pocoo.org/show/552904/
The exception handling code that you deleted in __Pyx_GetException(), the code that accesses exc_type and friends, is actually needed for correct semantics of Cython code and Python code. Basically, it implements the part of the except clause that moves the hot exception into sys.exc_info().

This equally applies to __Pyx_ExceptionSave() and __Pyx_ExceptionReset(), which form something like an exception backup frame around code sections that may raise exceptions themselves but must otherwise not touch the current exception. Specifically, as part of the finally clause.

In order to fix this, is there a way to store away and restore the current sys.exc_info() in PyPy?

Stefan
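For context, "exc_type and friends" refers to the per-thread-state fields that back sys.exc_info() in CPython; the save half of such a helper boils down to roughly the following (a sketch assuming CPython 2.x's PyThreadState layout, not Cython's actual utility code)::

    #include <Python.h>

    /* Stash the exception currently visible via sys.exc_info(). */
    static void save_exc_info(PyObject **type, PyObject **value, PyObject **tb)
    {
        PyThreadState *tstate = PyThreadState_GET();
        *type  = tstate->exc_type;
        *value = tstate->exc_value;
        *tb    = tstate->exc_traceback;
        Py_XINCREF(*type);
        Py_XINCREF(*value);
        Py_XINCREF(*tb);
    }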
2012/2/18 Stefan Behnel <stefan_ml@behnel.de>
The exception handling code that you deleted in __Pyx_GetException(), that which accesses exc_type and friends, is actually needed for correct semantics of Cython code and Python code. Basically, it implements the part of the except clause that moves the hot exception into sys.exc_info().
This equally applies to __Pyx_ExceptionSave() and __Pyx_ExceptionReset(), which form something like an exception backup frame around code sections that may raise exceptions themselves but must otherwise not touch the current exception. Specifically, as part of the finally clause.
In order to fix this, is there a way to store away and restore the current sys.exc_info() in PyPy?
I certainly was a bit fast to remove code there, and these exc_value and curexc_value fields have always been a delicate part of the CPython interpreter.

One thing I don't understand, for example, is why Cython needs to deal with sys.exc_info, when no other extension uses it for exception management. The only way to know for sure is to have unit tests with different use cases.

--
Amaury Forgeot d'Arc
Amaury Forgeot d'Arc, 18.02.2012 14:52:
2012/2/18 Stefan Behnel
The exception handling code that you deleted in __Pyx_GetException(), that which accesses exc_type and friends, is actually needed for correct semantics of Cython code and Python code. Basically, it implements the part of the except clause that moves the hot exception into sys.exc_info().
This equally applies to __Pyx_ExceptionSave() and __Pyx_ExceptionReset(), which form something like an exception backup frame around code sections that may raise exceptions themselves but must otherwise not touch the current exception. Specifically, as part of the finally clause.
In order to fix this, is there a way to store away and restore the current sys.exc_info() in PyPy?
I certainly was a bit fast to remove code there, and these exc_value and curexc_value have always been a delicate part of the CPython interpreter.
One thing I don't understand for example, is why Cython needs to deal with sys.exc_info, when no other extension uses it for exception management.
Here's an example.

Python code::

    def print_excinfo():
        print(sys.exc_info())

Cython code::

    from stuff import print_excinfo

    try:
        raise TypeError
    except TypeError:
        print_excinfo()

With the code removed, Cython will not store the TypeError in sys.exc_info(), so the Python code cannot see it. This may seem like an unimportant use case (who uses sys.exc_info() anyway, right?), but this becomes very visible when the code that uses sys.exc_info() is not user code but CPython itself, e.g. when raising another exception or when inspecting frames. Things grow really bad here, especially in Python 3.
The only way to know for sure is to have unit tests with different use cases.
Cython has loads of those in its test suite, as you can imagine. These things are so tricky to get right that it's futile to even try without growing a substantial test base of weird corner cases. Stefan
2012/2/18 Stefan Behnel <stefan_ml@behnel.de>
Here's an example.
Python code:
def print_excinfo(): print(sys.exc_info())
Cython code:
from stuff import print_excinfo
try: raise TypeError except TypeError: print_excinfo()
With the code removed, Cython will not store the TypeError in sys.exc_info(), so the Python code cannot see it. This may seem like an unimportant use case (who uses sys.exc_info() anyway, right?), but this becomes very visible when the code that uses sys.exc_info() is not user code but CPython itself, e.g. when raising another exception or when inspecting frames. Things grow really bad here, especially in Python 3.
I think I understand now, thanks for your example.

Things are a bit simpler in PyPy because these exceptions are stored in the frame that is currently handling them. At least it's better than CPython, which stores them in one place and has to somehow save the state of the previous frames.

Did you consider adding such a function to CPython? "PyErr_SetCurrentFrameExceptionInfo"?

For the record, pypy could implement it as::

    space.getexecutioncontext().gettopframe_nohidden().last_exception = operationerr

i.e. the thing returned by sys.exc_info().

--
Amaury Forgeot d'Arc
Amaury Forgeot d'Arc, 18.02.2012 15:41:
2012/2/18 Stefan Behnel
Here's an example.
Python code:
def print_excinfo(): print(sys.exc_info())
Cython code:
from stuff import print_excinfo
try: raise TypeError except TypeError: print_excinfo()
With the code removed, Cython will not store the TypeError in sys.exc_info(), so the Python code cannot see it. This may seem like an unimportant use case (who uses sys.exc_info() anyway, right?), but this becomes very visible when the code that uses sys.exc_info() is not user code but CPython itself, e.g. when raising another exception or when inspecting frames. Things grow really bad here, especially in Python 3.
I think I understand now, thanks for your example. Things are a bit simpler in PyPy because these exceptions are stored in the frame that is currently handling it. At least better than CPython which stores it in one place, and has to somehow save the state of the previous frames. Did you consider adding such a function to CPython? "PyErr_SetCurrentFrameExceptionInfo"?
We need read/write access and also to swap the exception with another one in some places (lacking a dedicated frame for generators, for example), so that makes it two functions at least. CPython and its hand-written extensions won't have much use for this, so the only reason to add this would be to make PyPy (and maybe others) happier when running Cython extensions. I'll ask, although I wouldn't mind using a dedicated PyPy API for this.
For the record, pypy could implement it as: space.getexecutioncontext().gettopframe_nohidden().last_exception = operationerr i.e. the thing returned by sys.exc_info().
I imagine that even if there is a way to do this from C code in PyPy, it would be too inefficient for something as common as exception handling. Stefan
Amaury Forgeot d'Arc, 18.02.2012 15:41:
2012/2/18 Stefan Behnel
Here's an example.
Python code:
def print_excinfo(): print(sys.exc_info())
Cython code:
from stuff import print_excinfo
try: raise TypeError except TypeError: print_excinfo()
With the code removed, Cython will not store the TypeError in sys.exc_info(), so the Python code cannot see it. This may seem like an unimportant use case (who uses sys.exc_info() anyway, right?), but this becomes very visible when the code that uses sys.exc_info() is not user code but CPython itself, e.g. when raising another exception or when inspecting frames. Things grow really bad here, especially in Python 3.
I think I understand now, thanks for your example. Things are a bit simpler in PyPy because these exceptions are stored in the frame that is currently handling it. At least better than CPython which stores it in one place, and has to somehow save the state of the previous frames. Did you consider adding such a function to CPython? "PyErr_SetCurrentFrameExceptionInfo"?
For the record, pypy could implement it as: space.getexecutioncontext().gettopframe_nohidden().last_exception = operationerr i.e. the thing returned by sys.exc_info().
I've dropped a patch for CPython in the corresponding tracker ticket:

http://bugs.python.org/issue14098

The (trivial) implementation of the two functions is at the end of this file:

http://bugs.python.org/file24613/exc_info_capi.patch

Could you add them to PyPy? Thanks!

Stefan
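For reference, the two accessors in that patch amount to a get/set pair for the sys.exc_info() triple, roughly along these lines (signatures reproduced from memory; treat the exact names as an assumption and check the patch itself)::

    /* Read the exception currently reported by sys.exc_info();
       each output pointer receives a new reference (or NULL). */
    void PyErr_GetExcInfo(PyObject **ptype, PyObject **pvalue, PyObject **ptraceback);

    /* Replace the sys.exc_info() state; steals references to the arguments. */
    void PyErr_SetExcInfo(PyObject *type, PyObject *value, PyObject *traceback);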
Hi list, just to keep you informed: the ZODB3 can be successfully installed in a PyPy-1.8 virtual environment, but the tests fail. Below is the whole bash session I tried on Ubuntu LTS. For the PyPy developers the interesting part might be the last section, "Running the ZODB3 tests". Thanks for the great job you are doing. Aroldo. Installing ZODB3 in a PyPy virtualenv ===================================== Ubuntu Lucid Lynx (LTS) 2012-02-29 Verifying the virtualenv ------------------------ :: (tmp-env-pypy)aroldo@aroldo-laptop:~/tmp/python/tmp-env-pypy$ env|grep ENV VIRTUAL_ENV=/home/aroldo/tmp/python/tmp-env-pypy PIP_ENVIRONMENT=/home/aroldo/tmp/python/tmp-env-pypy (tmp-env-pypy)aroldo@aroldo-laptop:~/tmp/python/tmp-env-pypy$ python --version Python 2.7.2 (0e28b379d8b3, Feb 09 2012, 19:41:19) [PyPy 1.8.0 with GCC 4.4.3] (tmp-env-pypy)aroldo@aroldo-laptop:~/tmp/python/tmp-env-pypy$ which pip /home/aroldo/tmp/python/tmp-env-pypy/bin/pip Installing the zodb3 (using pip) --------------------------------- :: (tmp-env-pypy)aroldo@aroldo-laptop:~/tmp/python/tmp-env-pypy$ pip install zodb3 Downloading/unpacking zodb3 Downloading ZODB3-3.10.5.tar.gz (706Kb): 706Kb downloaded Running setup.py egg_info for package zodb3 Downloading/unpacking transaction>=1.1.0 (from zodb3) Downloading transaction-1.2.0.tar.gz (42Kb): 42Kb downloaded Running setup.py egg_info for package transaction Downloading/unpacking zc.lockfile (from zodb3) Downloading zc.lockfile-1.0.0.tar.gz Running setup.py egg_info for package zc.lockfile Downloading/unpacking ZConfig (from zodb3) Downloading ZConfig-2.9.2.tar.gz (261Kb): 261Kb downloaded Running setup.py egg_info for package ZConfig Downloading/unpacking zdaemon (from zodb3) Downloading zdaemon-2.0.4.tar.gz (42Kb): 42Kb downloaded Running setup.py egg_info for package zdaemon Downloading/unpacking zope.event (from zodb3) Downloading zope.event-3.5.1.tar.gz Running setup.py egg_info for package zope.event Downloading/unpacking zope.interface (from zodb3) Downloading zope.interface-3.8.0.tar.gz (111Kb): 111Kb downloaded Running setup.py egg_info for package zope.interface Requirement already satisfied (use --upgrade to upgrade): setuptools in ./site- packages/setuptools-0.6c11-py2.7.egg (from zc.lockfile->zodb3) Installing collected packages: zodb3, transaction, zc.lockfile, ZConfig, zdaemo n, zope.event, zope.interface Running setup.py install for zodb3 building 'BTrees._OOBTree' extension cc -fPIC -Wimplicit -Isrc -I/home/aroldo/tmp/python/tmp-env-pypy/include -c src/BTrees/_OOBTree.c -o build/temp.linux-i686-2.7/src/BTrees/_OOBTree.o cc -shared build/temp.linux-i686-2.7/src/BTrees/_OOBTree.o -o build/lib.lin ux-i686-2.7/BTrees/_OOBTree.pypy-18.so building 'BTrees._IOBTree' extension cc -fPIC -Wimplicit -DEXCLUDE_INTSET_SUPPORT -Isrc -I/home/aroldo/tmp/pytho n/tmp-env-pypy/include -c src/BTrees/_IOBTree.c -o build/temp.linux-i686-2.7/sr c/BTrees/_IOBTree.o cc -shared build/temp.linux-i686-2.7/src/BTrees/_IOBTree.o -o build/lib.lin ux-i686-2.7/BTrees/_IOBTree.pypy-18.so building 'BTrees._OIBTree' extension cc -fPIC -Wimplicit -Isrc -I/home/aroldo/tmp/python/tmp-env-pypy/include -c src/BTrees/_OIBTree.c -o build/temp.linux-i686-2.7/src/BTrees/_OIBTree.o cc -shared build/temp.linux-i686-2.7/src/BTrees/_OIBTree.o -o build/lib.lin ux-i686-2.7/BTrees/_OIBTree.pypy-18.so building 'BTrees._IIBTree' extension cc -fPIC -Wimplicit -DEXCLUDE_INTSET_SUPPORT -Isrc -I/home/aroldo/tmp/pytho n/tmp-env-pypy/include -c src/BTrees/_IIBTree.c -o build/temp.linux-i686-2.7/sr c/BTrees/_IIBTree.o 
cc -shared build/temp.linux-i686-2.7/src/BTrees/_IIBTree.o -o build/lib.lin ux-i686-2.7/BTrees/_IIBTree.pypy-18.so building 'BTrees._IFBTree' extension cc -fPIC -Wimplicit -DEXCLUDE_INTSET_SUPPORT -Isrc -I/home/aroldo/tmp/pytho n/tmp-env-pypy/include -c src/BTrees/_IFBTree.c -o build/temp.linux-i686-2.7/sr c/BTrees/_IFBTree.o cc -shared build/temp.linux-i686-2.7/src/BTrees/_IFBTree.o -o build/lib.lin ux-i686-2.7/BTrees/_IFBTree.pypy-18.so building 'BTrees._fsBTree' extension cc -fPIC -Wimplicit -DEXCLUDE_INTSET_SUPPORT -Isrc -I/home/aroldo/tmp/pytho n/tmp-env-pypy/include -c src/BTrees/_fsBTree.c -o build/temp.linux-i686-2.7/sr c/BTrees/_fsBTree.o cc -shared build/temp.linux-i686-2.7/src/BTrees/_fsBTree.o -o build/lib.lin ux-i686-2.7/BTrees/_fsBTree.pypy-18.so building 'BTrees._LOBTree' extension cc -fPIC -Wimplicit -DEXCLUDE_INTSET_SUPPORT -Isrc -I/home/aroldo/tmp/pytho n/tmp-env-pypy/include -c src/BTrees/_LOBTree.c -o build/temp.linux-i686-2.7/sr c/BTrees/_LOBTree.o cc -shared build/temp.linux-i686-2.7/src/BTrees/_LOBTree.o -o build/lib.lin ux-i686-2.7/BTrees/_LOBTree.pypy-18.so building 'BTrees._OLBTree' extension cc -fPIC -Wimplicit -Isrc -I/home/aroldo/tmp/python/tmp-env-pypy/include -c src/BTrees/_OLBTree.c -o build/temp.linux-i686-2.7/src/BTrees/_OLBTree.o cc -shared build/temp.linux-i686-2.7/src/BTrees/_OLBTree.o -o build/lib.lin ux-i686-2.7/BTrees/_OLBTree.pypy-18.so building 'BTrees._LLBTree' extension cc -fPIC -Wimplicit -DEXCLUDE_INTSET_SUPPORT -Isrc -I/home/aroldo/tmp/pytho n/tmp-env-pypy/include -c src/BTrees/_LLBTree.c -o build/temp.linux-i686-2.7/sr c/BTrees/_LLBTree.o cc -shared build/temp.linux-i686-2.7/src/BTrees/_LLBTree.o -o build/lib.lin ux-i686-2.7/BTrees/_LLBTree.pypy-18.so building 'BTrees._LFBTree' extension cc -fPIC -Wimplicit -DEXCLUDE_INTSET_SUPPORT -Isrc -I/home/aroldo/tmp/pytho n/tmp-env-pypy/include -c src/BTrees/_LFBTree.c -o build/temp.linux-i686-2.7/sr c/BTrees/_LFBTree.o cc -shared build/temp.linux-i686-2.7/src/BTrees/_LFBTree.o -o build/lib.lin ux-i686-2.7/BTrees/_LFBTree.pypy-18.so building 'persistent.cPersistence' extension cc -fPIC -Wimplicit -Isrc -I/home/aroldo/tmp/python/tmp-env-pypy/include -c src/persistent/cPersistence.c -o build/temp.linux-i686-2.7/src/persistent/cPer sistence.o src/persistent/cPersistence.c: In function ‘Per_set_oid’: src/persistent/cPersistence.c:998: warning: passing argument 3 of ‘PyObject _Cmp’ from incompatible pointer type /home/aroldo/tmp/python/tmp-env-pypy/include/pypy_decl.h:263: note: expecte d ‘long int *’ but argument is of type ‘int *’ src/persistent/cPersistence.c: In function ‘Per_set_jar’: src/persistent/cPersistence.c:1034: warning: passing argument 3 of ‘PyObjec t_Cmp’ from incompatible pointer type /home/aroldo/tmp/python/tmp-env-pypy/include/pypy_decl.h:263: note: expecte d ‘long int *’ but argument is of type ‘int *’ cc -fPIC -Wimplicit -Isrc -I/home/aroldo/tmp/python/tmp-env-pypy/include -c src/persistent/ring.c -o build/temp.linux-i686-2.7/src/persistent/ring.o cc -shared build/temp.linux-i686-2.7/src/persistent/cPersistence.o build/te mp.linux-i686-2.7/src/persistent/ring.o -o build/lib.linux-i686-2.7/persistent/ cPersistence.pypy-18.so building 'persistent.cPickleCache' extension cc -fPIC -Wimplicit -Isrc -I/home/aroldo/tmp/python/tmp-env-pypy/include -c src/persistent/cPickleCache.c -o build/temp.linux-i686-2.7/src/persistent/cPic kleCache.o src/persistent/cPickleCache.c: In function ‘cc_oid_unreferenced’: src/persistent/cPickleCache.c:655: warning: implicit declaration of functio n 
‘_Py_ForgetReference’ cc -fPIC -Wimplicit -Isrc -I/home/aroldo/tmp/python/tmp-env-pypy/include -c src/persistent/ring.c -o build/temp.linux-i686-2.7/src/persistent/ring.o cc -shared build/temp.linux-i686-2.7/src/persistent/cPickleCache.o build/te mp.linux-i686-2.7/src/persistent/ring.o -o build/lib.linux-i686-2.7/persistent/ cPickleCache.pypy-18.so building 'persistent.TimeStamp' extension cc -fPIC -Wimplicit -Isrc -I/home/aroldo/tmp/python/tmp-env-pypy/include -c src/persistent/TimeStamp.c -o build/temp.linux-i686-2.7/src/persistent/TimeSta mp.o cc -shared build/temp.linux-i686-2.7/src/persistent/TimeStamp.o -o build/li b.linux-i686-2.7/persistent/TimeStamp.pypy-18.so Installing fsdump script to /home/aroldo/tmp/python/tmp-env-pypy/bin Installing fstail script to /home/aroldo/tmp/python/tmp-env-pypy/bin Installing zeopack script to /home/aroldo/tmp/python/tmp-env-pypy/bin Installing runzeo script to /home/aroldo/tmp/python/tmp-env-pypy/bin Installing fsrefs script to /home/aroldo/tmp/python/tmp-env-pypy/bin Installing zeoctl script to /home/aroldo/tmp/python/tmp-env-pypy/bin Installing repozo script to /home/aroldo/tmp/python/tmp-env-pypy/bin Installing fsoids script to /home/aroldo/tmp/python/tmp-env-pypy/bin Installing zeopasswd script to /home/aroldo/tmp/python/tmp-env-pypy/bin Running setup.py install for transaction Running setup.py install for zc.lockfile Skipping installation of /home/aroldo/tmp/python/tmp-env-pypy/site-packages /zc/__init__.py (namespace package) Installing /home/aroldo/tmp/python/tmp-env-pypy/site-packages/zc.lockfile-1 .0.0-py2.7-nspkg.pth Running setup.py install for ZConfig changing mode of build/scripts-2.7/zconfig from 644 to 755 changing mode of build/scripts-2.7/zconfig_schema2html from 644 to 755 changing mode of /home/aroldo/tmp/python/tmp-env-pypy/bin/zconfig_schema2ht ml to 755 changing mode of /home/aroldo/tmp/python/tmp-env-pypy/bin/zconfig to 755 Running setup.py install for zdaemon Installing zdaemon script to /home/aroldo/tmp/python/tmp-env-pypy/bin Running setup.py install for zope.event Skipping installation of /home/aroldo/tmp/python/tmp-env-pypy/site-packages /zope/__init__.py (namespace package) Installing /home/aroldo/tmp/python/tmp-env-pypy/site-packages/zope.event-3. 5.1-py2.7-nspkg.pth Running setup.py install for zope.interface Skipping installation of /home/aroldo/tmp/python/tmp-env-pypy/site-packages /zope/__init__.py (namespace package) Installing /home/aroldo/tmp/python/tmp-env-pypy/site-packages/zope.interfac e-3.8.0-py2.7-nspkg.pth Successfully installed zodb3 transaction zc.lockfile ZConfig zdaemon zope.event zope.interface Cleaning up... 
Running the ZODB3 tests
-------------------------------

::

    (tmp-env-pypy)aroldo@aroldo-laptop:~/tmp/python/tmp-env-pypy$ python --version
    Python 2.7.2 (0e28b379d8b3, Feb 09 2012, 19:41:19)
    [PyPy 1.8.0 with GCC 4.4.3]
    (tmp-env-pypy)aroldo@aroldo-laptop:~/tmp/python/tmp-env-pypy$ python site-packages/ZODB/tests/testZODB.py
    Traceback (most recent call last):
      File "app_main.py", line 51, in run_toplevel
      File "site-packages/ZODB/tests/testZODB.py", line 15, in <module>
        from persistent import Persistent
      File "/home/aroldo/tmp/python/tmp-env-pypy/site-packages/persistent/__init__.py", line 20, in <module>
        from cPickleCache import PickleCache
    ImportError: unable to load extension module '/home/aroldo/tmp/python/tmp-env-pypy/site-packages/persistent/cPickleCache.pypy-18.so': /home/aroldo/tmp/python/tmp-env-pypy/site-packages/persistent/cPickleCache.pypy-18.so: undefined symbol: _Py_ForgetReference
    (tmp-env-pypy)aroldo@aroldo-laptop:~/tmp/python/tmp-env-pypy$
On Mon, Feb 27, 2012 at 1:53 AM, Aroldo Souza-Leite <asouzaleite@gmx.de> wrote:
Hi list,
just to keep you informed:
the ZODB3 can be successfully installed in a PyPy-1.8 virtual environment, but the tests fail. Below is the whole bash session I tried on Ubuntu LTS. For the PyPy developers the interesting part might be the last section, "Running the ZODB3 tests".
Thanks for the great job you are doing.
Aroldo.
Hey,

It seems ZODB is using an internal API, notably _Py_ForgetReference. On PyPy it probably should not do anything, so you can maybe try with "#define _Py_ForgetReference(obj)", but I don't actually know.

Cheers,
fijal
Hi, On Mon, Feb 27, 2012 at 16:29, Maciej Fijalkowski <fijall@gmail.com> wrote:
On PyPy it probably should not do anything so you can maybe try with #define _Py_ForgetReference(obj), but I don't actually know.
Yes, it looks like it doesn't do anything in release builds of CPython. It's there for debugging. A bientôt, Armin.
Hi Armin, hi Maciej, hi list, thanks, I'll try and take the issue to the zodb-dev@zope.org. Cheers. Aroldo. Am 27.02.2012 16:48, schrieb Armin Rigo:
Hi,
On Mon, Feb 27, 2012 at 16:29, Maciej Fijalkowski<fijall@gmail.com> wrote:
On PyPy it probably should not do anything so you can maybe try with #define _Py_ForgetReference(obj), but I don't actually know. Yes, it looks like it doesn't do anything in release builds of CPython. It's there for debugging.
A bientôt,
Armin.
Hi, On Tue, Feb 28, 2012 at 10:09, Aroldo Souza-Leite <asouzaleite@gmx.de> wrote:
thanks, I'll try and take the issue to the zodb-dev@zope.org.
Ah, that's not what I meant. I meant that we can just add a do-nothing _Py_ForgetReference in PyPy and be done. A bientôt, Armin.
Hi, Am 28.02.2012 10:42, schrieb Armin Rigo:
Hi,
On Tue, Feb 28, 2012 at 10:09, Aroldo Souza-Leite<asouzaleite@gmx.de> wrote:
thanks, I'll try and take the issue to the zodb-dev@zope.org. Ah, that's not what I meant. I meant that we can just add a do-nothing _Py_ForgetReference in PyPy and be done.
A bientôt,
Armin.

Oh, does that mean that this problem could be solved by the PyPy people?
Cheers. Aroldo.
Hi, On Tue, Feb 28, 2012 at 10:55, Aroldo Souza-Leite <asouzaleite@gmx.de> wrote:
Oh, does that mean that this problem could be solved by the PyPy people?
Yes. Fixed in b741ab752493. The easiest is to wait for next night's automatic builds, but you can also try it out directly by replacing the file Include/object.h with the latest version of https://bitbucket.org/pypy/pypy/raw/default/pypy/module/cpyext/include/objec... . A bientôt, Armin.
Hi list, now ZODB3 and zope.testing can be pip-installed in a PyPy virtual environment and the tests can be called. But the tests fail. The same tests pass in CPython2.7. Could somebody please make a short comment on this before I eventually forward the case to the ZODB department? The attachment contains the complete bash session in Sphinx format with "Test results" as its last subsection. This is probably the only interesting part for developers, so that I'm reproducing it below. Thanks and cheers. Aroldo. -------------------<Test results> (tmp-env-pypy)aroldo@aroldo-laptop:~$python tmp-env-pypy/site-packages/ZODB/te sts/testZODB.py output: EEEEEEEEEE ====================================================================== ERROR: checkExplicitTransactionManager (__main__.ZODBTests) ---------------------------------------------------------------------- Traceback (most recent call last): File "tmp-env-pypy/site-packages/ZODB/tests/testZODB.py", line 159, in checkE xplicitTransactionManager File "/home/aroldo/tmp-env-pypy/lib-python/modified-2.7/UserDict.py", line 50 , in has_key def has_key(self, key): return key in self.data File "/home/aroldo/tmp-env-pypy/site-packages/persistent/mapping.py", line 30 , in __get__ return self.func(inst) File "/home/aroldo/tmp-env-pypy/site-packages/persistent/mapping.py", line 99 , in data data = self.__dict__.pop('_container') KeyError: '_container' ====================================================================== ERROR: checkExportImport (__main__.ZODBTests) ---------------------------------------------------------------------- Traceback (most recent call last): File "tmp-env-pypy/site-packages/ZODB/tests/testZODB.py", line 54, in checkEx portImport File "tmp-env-pypy/site-packages/ZODB/tests/testZODB.py", line 46, in populat e File "/home/aroldo/tmp-env-pypy/site-packages/persistent/mapping.py", line 63 , in __setitem__ self.__super_setitem(key, v) File "/home/aroldo/tmp-env-pypy/lib-python/modified-2.7/UserDict.py", line 29 , in __setitem__ def __setitem__(self, key, item): self.data[key] = item File "/home/aroldo/tmp-env-pypy/site-packages/persistent/mapping.py", line 30 , in __get__ return self.func(inst) File "/home/aroldo/tmp-env-pypy/site-packages/persistent/mapping.py", line 99 , in data data = self.__dict__.pop('_container') KeyError: '_container' ====================================================================== ERROR: checkExportImportAborted (__main__.ZODBTests) ---------------------------------------------------------------------- Traceback (most recent call last): File "tmp-env-pypy/site-packages/ZODB/tests/testZODB.py", line 122, in checkE xportImportAborted File "tmp-env-pypy/site-packages/ZODB/tests/testZODB.py", line 54, in checkEx portImport File "tmp-env-pypy/site-packages/ZODB/tests/testZODB.py", line 46, in populat e File "/home/aroldo/tmp-env-pypy/site-packages/persistent/mapping.py", line 63 , in __setitem__ self.__super_setitem(key, v) File "/home/aroldo/tmp-env-pypy/lib-python/modified-2.7/UserDict.py", line 29 , in __setitem__ def __setitem__(self, key, item): self.data[key] = item File "/home/aroldo/tmp-env-pypy/site-packages/persistent/mapping.py", line 30 , in __get__ return self.func(inst) File "/home/aroldo/tmp-env-pypy/site-packages/persistent/mapping.py", line 99 , in data data = self.__dict__.pop('_container') KeyError: '_container' ====================================================================== ERROR: checkFailingCommitSticks (__main__.ZODBTests) 
---------------------------------------------------------------------- Traceback (most recent call last): File "tmp-env-pypy/site-packages/ZODB/tests/testZODB.py", line 289, in checkF ailingCommitSticks File "/home/aroldo/tmp-env-pypy/site-packages/persistent/mapping.py", line 63 , in __setitem__ self.__super_setitem(key, v) File "/home/aroldo/tmp-env-pypy/lib-python/modified-2.7/UserDict.py", line 29 , in __setitem__ def __setitem__(self, key, item): self.data[key] = item File "/home/aroldo/tmp-env-pypy/site-packages/persistent/mapping.py", line 30 , in __get__ return self.func(inst) File "/home/aroldo/tmp-env-pypy/site-packages/persistent/mapping.py", line 99 , in data data = self.__dict__.pop('_container') KeyError: '_container' ====================================================================== ERROR: checkFailingSavepointSticks (__main__.ZODBTests) ---------------------------------------------------------------------- Traceback (most recent call last): File "tmp-env-pypy/site-packages/ZODB/tests/testZODB.py", line 334, in checkF ailingSavepointSticks File "/home/aroldo/tmp-env-pypy/site-packages/persistent/mapping.py", line 63 , in __setitem__ self.__super_setitem(key, v) File "/home/aroldo/tmp-env-pypy/lib-python/modified-2.7/UserDict.py", line 29 , in __setitem__ def __setitem__(self, key, item): self.data[key] = item File "/home/aroldo/tmp-env-pypy/site-packages/persistent/mapping.py", line 30 , in __get__ return self.func(inst) File "/home/aroldo/tmp-env-pypy/site-packages/persistent/mapping.py", line 99 , in data data = self.__dict__.pop('_container') KeyError: '_container' ====================================================================== ERROR: checkMultipleUndoInOneTransaction (__main__.ZODBTests) ---------------------------------------------------------------------- Traceback (most recent call last): File "tmp-env-pypy/site-packages/ZODB/tests/testZODB.py", line 403, in checkM ultipleUndoInOneTransaction File "/home/aroldo/tmp-env-pypy/site-packages/persistent/mapping.py", line 63 , in __setitem__ self.__super_setitem(key, v) File "/home/aroldo/tmp-env-pypy/lib-python/modified-2.7/UserDict.py", line 29 , in __setitem__ def __setitem__(self, key, item): self.data[key] = item File "/home/aroldo/tmp-env-pypy/site-packages/persistent/mapping.py", line 30 , in __get__ return self.func(inst) File "/home/aroldo/tmp-env-pypy/site-packages/persistent/mapping.py", line 99 , in data data = self.__dict__.pop('_container') KeyError: '_container' ====================================================================== ERROR: checkResetCache (__main__.ZODBTests) ---------------------------------------------------------------------- Traceback (most recent call last): File "tmp-env-pypy/site-packages/ZODB/tests/testZODB.py", line 129, in checkR esetCache File "tmp-env-pypy/site-packages/ZODB/tests/testZODB.py", line 46, in populat e File "/home/aroldo/tmp-env-pypy/site-packages/persistent/mapping.py", line 63 , in __setitem__ self.__super_setitem(key, v) File "/home/aroldo/tmp-env-pypy/lib-python/modified-2.7/UserDict.py", line 29 , in __setitem__ def __setitem__(self, key, item): self.data[key] = item File "/home/aroldo/tmp-env-pypy/site-packages/persistent/mapping.py", line 30 , in __get__ return self.func(inst) File "/home/aroldo/tmp-env-pypy/site-packages/persistent/mapping.py", line 99 , in data data = self.__dict__.pop('_container') KeyError: '_container' ====================================================================== ERROR: checkResetCachesAPI (__main__.ZODBTests) 
---------------------------------------------------------------------- Traceback (most recent call last): File "tmp-env-pypy/site-packages/ZODB/tests/testZODB.py", line 139, in checkR esetCachesAPI File "tmp-env-pypy/site-packages/ZODB/tests/testZODB.py", line 46, in populat e File "/home/aroldo/tmp-env-pypy/site-packages/persistent/mapping.py", line 63 , in __setitem__ self.__super_setitem(key, v) File "/home/aroldo/tmp-env-pypy/lib-python/modified-2.7/UserDict.py", line 29 , in __setitem__ def __setitem__(self, key, item): self.data[key] = item File "/home/aroldo/tmp-env-pypy/site-packages/persistent/mapping.py", line 30 , in __get__ return self.func(inst) File "/home/aroldo/tmp-env-pypy/site-packages/persistent/mapping.py", line 99 , in data data = self.__dict__.pop('_container') KeyError: '_container' ====================================================================== ERROR: checkSavepointDoesntGetInvalidations (__main__.ZODBTests) ---------------------------------------------------------------------- Traceback (most recent call last): File "tmp-env-pypy/site-packages/ZODB/tests/testZODB.py", line 204, in checkS avepointDoesntGetInvalidations File "/home/aroldo/tmp-env-pypy/site-packages/persistent/mapping.py", line 63 , in __setitem__ self.__super_setitem(key, v) File "/home/aroldo/tmp-env-pypy/lib-python/modified-2.7/UserDict.py", line 29 , in __setitem__ def __setitem__(self, key, item): self.data[key] = item File "/home/aroldo/tmp-env-pypy/site-packages/persistent/mapping.py", line 30 , in __get__ return self.func(inst) File "/home/aroldo/tmp-env-pypy/site-packages/persistent/mapping.py", line 99 , in data data = self.__dict__.pop('_container') KeyError: '_container' ====================================================================== ERROR: checkTxnBeginImpliesAbort (__main__.ZODBTests) ---------------------------------------------------------------------- Traceback (most recent call last): File "tmp-env-pypy/site-packages/ZODB/tests/testZODB.py", line 257, in checkT xnBeginImpliesAbort File "/home/aroldo/tmp-env-pypy/site-packages/persistent/mapping.py", line 63 , in __setitem__ self.__super_setitem(key, v) File "/home/aroldo/tmp-env-pypy/lib-python/modified-2.7/UserDict.py", line 29 , in __setitem__ def __setitem__(self, key, item): self.data[key] = item File "/home/aroldo/tmp-env-pypy/site-packages/persistent/mapping.py", line 30 , in __get__ return self.func(inst) File "/home/aroldo/tmp-env-pypy/site-packages/persistent/mapping.py", line 99 , in data data = self.__dict__.pop('_container') KeyError: '_container' ---------------------------------------------------------------------- Ran 10 tests in 0.439s FAILED (errors=10) -----</Test results>
Hi, On Fri, Mar 2, 2012 at 11:03, Aroldo Souza-Leite <asouzaleite@gmx.de> wrote:
data = self.__dict__.pop('_container') KeyError: '_container'
Last I heard, the Persistent base class, written in C, uses old tricks that are kind of deprecated; if I remember correctly, before Python 2.2, it was known as the place that introduced the trick called "extension classes" to Python, which later became "new-style classes" in Python 2.2. So I would not be surprised if that causes the user-visible __dict__ of its instances to miss an attribute like "_container" or "data", and so be generally unsupported by PyPy's cpyext. If you really care about it, you may want to rewrite at least the base class, persistent.Persistent, in pure Python. A bientôt, Armin.
2012/3/21 Armin Rigo <arigo@tunes.org>:
On Fri, Mar 2, 2012 at 11:03, Aroldo Souza-Leite <asouzaleite@gmx.de> wrote:
data = self.__dict__.pop('_container') KeyError: '_container'
Last I heard, the Persistent base class, written in C, uses old tricks that are kind of deprecated; if I remember correctly, before Python 2.2, it was known as the place that introduced the trick called "extension classes" to Python, which later became "new-style classes" in Python 2.2.
The answer is a bit simpler: Persistent.__getstate__ uses _PyObject_GetDictPtr(), which always returns NULL in cpyext. So the state is None and data is lost. This function is nearly impossible to implement with cpyext. In the same file (cPersistence.c) you have:

    PyObject **dict = _PyObject_GetDictPtr(self);
    if (!*dict)
        *dict = PyDict_New();
    PyDict_Update(*dict, state);

This code should probably be rewritten with a more conventional API, PyObject_GetAttrString(self, "__dict__") for example, or even with pure Python code. -- Amaury Forgeot d'Arc
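For concreteness, the suggested rewrite might look roughly like this (the helper name and error handling are illustrative, not the actual cPersistence.c code):

    #include <Python.h>

    /* Merge the pickled state into the instance __dict__ through the regular
       attribute protocol instead of _PyObject_GetDictPtr(), which cpyext
       cannot provide. */
    static int
    update_dict_from_state(PyObject *self, PyObject *state)
    {
        PyObject *dict = PyObject_GetAttrString(self, "__dict__");
        if (dict == NULL)
            return -1;
        if (PyDict_Update(dict, state) < 0) {
            Py_DECREF(dict);
            return -1;
        }
        Py_DECREF(dict);
        return 0;
    }

This keeps the semantics of "update the instance dict from the state" while only using calls that cpyext implements.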
Hi, On Wed, Mar 21, 2012 at 20:46, Amaury Forgeot d'Arc <amauryfa@gmail.com> wrote:
Persistent.__getstate__ uses _PyObject_GetDictPtr(), which always returns NULL in cpyext.
Does it make sense? Shouldn't it be unimplemented, or raise a warning, or something? A bientôt, Armin.
2012/3/22 Armin Rigo <arigo@tunes.org>:
On Wed, Mar 21, 2012 at 20:46, Amaury Forgeot d'Arc <amauryfa@gmail.com> wrote:
Persistent.__getstate__ uses _PyObject_GetDictPtr(), which always returns NULL in cpyext.
Does it make sense? Shouldn't it be unimplemented, or raise a warning, or something?
The first time I encountered this function it was in SWIG generated code: https://github.com/klickverbot/swig/blob/244c758f0d32d19856d1e69011aa79f5bf3... which gracefully falls back to a regular PyObject_GetAttr() if the dict pointer is not provided. -- Amaury Forgeot d'Arc
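The fallback pattern described here boils down to something like the following sketch (a hypothetical helper, not the actual SWIG output): use the dict pointer as a fast path and fall back to the ordinary attribute lookup when the runtime, e.g. cpyext, returns NULL:

    #include <Python.h>

    static PyObject *
    get_attr_with_fallback(PyObject *self, PyObject *name)
    {
        PyObject **dictptr = _PyObject_GetDictPtr(self);
        if (dictptr != NULL && *dictptr != NULL) {
            PyObject *value = PyDict_GetItem(*dictptr, name);  /* borrowed */
            if (value != NULL) {
                Py_INCREF(value);
                return value;
            }
        }
        /* Slower but portable path; under cpyext this is the only one taken. */
        return PyObject_GetAttr(self, name);
    }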
Hi,

I just tested NumPyPy a bit. I got very long run times for some tests. After some profiling, I identified the array constructor as the main time sink. This is a small example that makes the point:

    import cProfile

    try:
        import numpy
    except ImportError:
        import numpypy as numpy

    def test():
        r = range(int(1e7))  # or 1e6
        numpy.array(r)

    cProfile.run('test()')

The numbers are below. NumPyPy is a factor of five or more slower than NumPy at creating an array of the same size from an existing list. I am just curious what the reason is and whether you see this going away in the near future?

Thanks,
Mike

Mac OS X 10.7, Python 2.7.2, NumPy 2.0.0, 1e6

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.011    0.011    0.255    0.255 <string>:1(<module>)
        1    0.002    0.002    0.244    0.244 constr.py:12(test)
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
        1    0.210    0.210    0.210    0.210 {numpy.core.multiarray.array}
        1    0.032    0.032    0.032    0.032 {range}

Mac OS X 10.7, PyPy 1.8 with NumPyPy, 1e6

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000    0.955    0.955 <string>:1(<module>)
        1    0.000    0.000    0.955    0.955 constr.py:12(test)
        1    0.955    0.955    0.955    0.955 {_numpypy.array}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
        1    0.000    0.000    0.000    0.000 {range}

Mac OS X 10.7, Python 2.7.2, NumPy 2.0.0, 1e7

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.166    0.166    2.586    2.586 <string>:1(<module>)
        1    0.016    0.016    2.420    2.420 constr.py:12(test)
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
        1    2.065    2.065    2.065    2.065 {numpy.core.multiarray.array}
        1    0.339    0.339    0.339    0.339 {range}

Mac OS X 10.7, PyPy 1.8 with NumPyPy, 1e7

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000   10.169   10.169 <string>:1(<module>)
        1    0.000    0.000   10.169   10.169 constr.py:12(test)
        1   10.169   10.169   10.169   10.169 {_numpypy.array}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
        1    0.000    0.000    0.000    0.000 {range}

Windows, Python 2.6.5, NumPy 1.6.1, 1e6

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.009    0.009    0.242    0.242 <string>:1(<module>)
        1    0.000    0.000    0.234    0.234 constr.py:12(test)
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
        1    0.191    0.191    0.191    0.191 {numpy.core.multiarray.array}
        1    0.042    0.042    0.042    0.042 {range}

Windows, PyPy 1.8 with NumPyPy, 1e6

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000    1.375    1.375 <string>:1(<module>)
        1    0.000    0.000    1.375    1.375 constr.py:8(test)
        1    1.375    1.375    1.375    1.375 {_numpypy.array}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
        1    0.000    0.000    0.000    0.000 {range}

Windows, Python 2.6.5, NumPy 1.6.1, 1e7

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.092    0.092    2.775    2.775 <string>:1(<module>)
        1    0.002    0.002    2.683    2.683 constr.py:12(test)
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
        1    2.254    2.254    2.254    2.254 {numpy.core.multiarray.array}
        1    0.427    0.427    0.427    0.427 {range}

Windows, PyPy 1.8 with NumPyPy, 1e7

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000   13.937   13.937 <string>:1(<module>)
        1    0.001    0.001   13.937   13.937 constr.py:12(test)
        1   13.936   13.936   13.936   13.936 {_numpypy.array}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
        1    0.000    0.000    0.000    0.000 {range}
Hi Mike, On 02/27/2012 04:31 PM, Mike Müller wrote:
I just tested NumPyPy a bit. I got very long run times for some tests. After some profiling, I identified the array constructor as the main time sink.
I opened an issue so that it doesn't get lost: https://bugs.pypy.org/issue1074 Cheers, Carl Friedrich
On Mon, Feb 27, 2012 at 7:48 AM, Carl Friedrich Bolz <cfbolz@gmx.de> wrote:
Hi Mike,
On 02/27/2012 04:31 PM, Mike Müller wrote:
I just tested NumPyPy a bit. I got very long run times for some tests. After some profiling, I identified the array constructor as the main time sink.
I opened an issue so that it doesn't get lost:
https://bugs.pypy.org/issue1074
Cheers,
Carl Friedrich
Yes, array creation is kind of slow. We should look into that
Stefan Behnel, 26.02.2012 20:54:
Amaury Forgeot d'Arc, 18.02.2012 15:41:
2012/2/18 Stefan Behnel
Here's an example.
Python code:
import sys

def print_excinfo():
    print(sys.exc_info())
Cython code:
from stuff import print_excinfo
try:
    raise TypeError
except TypeError:
    print_excinfo()
With the code removed, Cython will not store the TypeError in sys.exc_info(), so the Python code cannot see it. This may seem like an unimportant use case (who uses sys.exc_info() anyway, right?), but this becomes very visible when the code that uses sys.exc_info() is not user code but CPython itself, e.g. when raising another exception or when inspecting frames. Things grow really bad here, especially in Python 3.
I think I understand now, thanks for your example. Things are a bit simpler in PyPy because these exceptions are stored in the frame that is currently handling it. At least better than CPython which stores it in one place, and has to somehow save the state of the previous frames. Did you consider adding such a function to CPython? "PyErr_SetCurrentFrameExceptionInfo"?
For the record, pypy could implement it as:

    space.getexecutioncontext().gettopframe_nohidden().last_exception = operationerr

i.e. the thing returned by sys.exc_info().
I've dropped a patch for CPython in the corresponding tracker ticket:
http://bugs.python.org/issue14098
The (trivial) implementation of the two functions is at the end of this file:
http://bugs.python.org/file24613/exc_info_capi.patch
Could you add them to PyPy?
Anyone? Stefan
On Mon, Mar 19, 2012 at 8:44 AM, Stefan Behnel <stefan_ml@behnel.de> wrote:
Stefan Behnel, 26.02.2012 20:54:
Amaury Forgeot d'Arc, 18.02.2012 15:41:
2012/2/18 Stefan Behnel
Here's an example.
Python code:
import sys

def print_excinfo():
    print(sys.exc_info())
Cython code:
from stuff import print_excinfo
try:
    raise TypeError
except TypeError:
    print_excinfo()
With the code removed, Cython will not store the TypeError in sys.exc_info(), so the Python code cannot see it. This may seem like an unimportant use case (who uses sys.exc_info() anyway, right?), but this becomes very visible when the code that uses sys.exc_info() is not user code but CPython itself, e.g. when raising another exception or when inspecting frames. Things grow really bad here, especially in Python 3.
I think I understand now, thanks for your example. Things are a bit simpler in PyPy because these exceptions are stored in the frame that is currently handling it. At least better than CPython which stores it in one place, and has to somehow save the state of the previous frames. Did you consider adding such a function to CPython? "PyErr_SetCurrentFrameExceptionInfo"?
For the record, pypy could implement it as:

    space.getexecutioncontext().gettopframe_nohidden().last_exception = operationerr

i.e. the thing returned by sys.exc_info().
I've dropped a patch for CPython in the corresponding tracker ticket:
http://bugs.python.org/issue14098
The (trivial) implementation of the two functions is at the end of this file:
http://bugs.python.org/file24613/exc_info_capi.patch
Could you add them to PyPy?
Anyone?
Stefan
Hi Stefan. A lot of people have been completely busy at PyCon. As far as I know they're trying to recover from jetlag and lots of conference. Give them a bit of a breather ;-)
Hi, On Mon, Mar 19, 2012 at 09:11, Maciej Fijalkowski <fijall@gmail.com> wrote:
The (trivial) implementation of the two functions is at the end of this file:
http://bugs.python.org/file24613/exc_info_capi.patch
Could you add them to PyPy?
The new functions make sense to me. Anyone with cpyext knowledge can add them to PyPy. I can probably do it if needed, although I don't know cpyext very well. (Alternatively, coming up yourself with a patch would be the best way to move things forward :-) A bientôt, Armin.
Armin Rigo, 21.03.2012 17:55:
On Mon, Mar 19, 2012 at 09:11, Maciej Fijalkowski wrote:
The (trivial) implementation of the two functions is at the end of this file:
http://bugs.python.org/file24613/exc_info_capi.patch
Could you add them to PyPy?
The new functions make sense to me. Anyone with cpyext knowledge can add them to PyPy. I can probably do it if needed, although I don't know cpyext very well. (Alternatively, coming up yourself with a patch would be the best way to move things forward :-)
Yes, apparently so. However, we're currently busy with getting Cython 0.16 out the door, and I haven't found the time so far to set up a PyPy development environment to do anything useful with it. I'd be glad if this could get finished soon, because then I could finally merge the Cython side of this into mainline. This has been a major blocker for a while now, and exception handling and generators aren't exactly rare features in user code. Stefan
On Sun, Apr 1, 2012 at 3:04 PM, Stefan Behnel <stefan_ml@behnel.de> wrote:
Armin Rigo, 01.04.2012 12:31:
Hi Stefan,
Done in 623bcea85df3.
Thanks, Armin!
Would have taken me a while to figure these things out.
I'll give it a try with the next nightly.
Stefan
You can create your own nightly by clicking "force build" on the buildbot.
On Sun, Apr 1, 2012 at 3:25 PM, Maciej Fijalkowski <fijall@gmail.com> wrote:
On Sun, Apr 1, 2012 at 3:04 PM, Stefan Behnel <stefan_ml@behnel.de> wrote:
Armin Rigo, 01.04.2012 12:31:
Hi Stefan,
Done in 623bcea85df3.
Thanks, Armin!
Would have taken me a while to figure these things out.
I'll give it a try with the next nightly.
Stefan
You can create your own nightly by clicking "force build" on the buildbot.
Ah, maybe worth noting: the -jit builders are the ones that create the nightlies you're interested in. I'll cancel the other ones. Cheers, fijal
Maciej Fijalkowski, 01.04.2012 15:42:
On Sun, Apr 1, 2012 at 3:25 PM, Maciej Fijalkowski wrote:
On Sun, Apr 1, 2012 at 3:04 PM, Stefan Behnel wrote:
Armin Rigo, 01.04.2012 12:31:
Hi Stefan,
Done in 623bcea85df3.
Thanks, Armin!
Would have taken me a while to figure these things out.
I'll give it a try with the next nightly.
You can create your own nightly by clicking "force build" on the buildbot.
Ah, maybe worth noting: the -jit builders are the ones that create the nightlies you're interested in. I'll cancel the other ones.
Right, it's not immediately obvious which build job triggers what kind of output. I'm currently using the nojit version because it hand-wavingly appeared more stable than the jit version so far, and I doubt that there's any benefit for us to test against a jit build. Specifically, I'm interested in this file: http://buildbot.pypy.org/nightly/trunk/pypy-c-nojit-latest-linux64.tar.bz2 Stefan
On Sun, Apr 1, 2012 at 3:48 PM, Stefan Behnel <stefan_ml@behnel.de> wrote:
Maciej Fijalkowski, 01.04.2012 15:42:
On Sun, Apr 1, 2012 at 3:25 PM, Maciej Fijalkowski wrote:
On Sun, Apr 1, 2012 at 3:04 PM, Stefan Behnel wrote:
Armin Rigo, 01.04.2012 12:31:
Hi Stefan,
Done in 623bcea85df3.
Thanks, Armin!
Would have taken me a while to figure these things out.
I'll give it a try with the next nightly.
You can create your own nightly by clicking "force build" on the buildbot.
Ah, maybe worth noting: the -jit builders are the ones that create the nightlies you're interested in. I'll cancel the other ones.
Right, it's not immediately obvious which build job triggers what kind of output. I'm currently using the nojit version because it hand-wavingly appeared more stable than the jit version so far, and I doubt that there's any benefit for us to test against a jit build. Specifically, I'm interested in this file:
http://buildbot.pypy.org/nightly/trunk/pypy-c-nojit-latest-linux64.tar.bz2
Stefan
It's totally not :) Then it's the applevel one. I'll poke it.
Stefan Behnel, 01.04.2012 15:04:
Armin Rigo, 01.04.2012 12:31:
Hi Stefan,
Done in 623bcea85df3.
Thanks, Armin!
Would have taken me a while to figure these things out.
I'll give it a try with the next nightly.
Hmm, looks broken: http://buildbot.pypy.org/builders/pypy-c-jit-linux-x86-64/builds/810/steps/t... Stefan
On Sun, Apr 1, 2012 at 3:51 PM, Stefan Behnel <stefan_ml@behnel.de> wrote:
Stefan Behnel, 01.04.2012 15:04:
Armin Rigo, 01.04.2012 12:31:
Hi Stefan,
Done in 623bcea85df3.
Thanks, Armin!
Would have taken me a while to figure these things out.
I'll give it a try with the next nightly.
Hmm, looks broken:
http://buildbot.pypy.org/builders/pypy-c-jit-linux-x86-64/builds/810/steps/t...
Stefan
Fixing fixing...
Stefan Behnel, 18.02.2012 09:48:
Once we have the test suite runnable, we can set up a PyPy instance on our CI server to get feed-back on any advances.
I've set up a build job for my development branch here: https://sage.math.washington.edu:8091/hudson/job/cython-scoder-pypy-nightly/ It builds and tests against the latest PyPy-c-jit nightly build, so that we get timely feedback for changes on either side. When is a good time to download the latest nightly build, BTW? Stefan
Stefan Behnel, 18.02.2012 16:29:
Stefan Behnel, 18.02.2012 09:48:
Once we have the test suite runnable, we can set up a PyPy instance on our CI server to get feed-back on any advances.
I've set up a build job for my development branch here:
https://sage.math.washington.edu:8091/hudson/job/cython-scoder-pypy-nightly/
It builds and tests against the latest PyPy-c-jit nightly build, so that we get timely feedback for changes on either side.
And now the question is: how do I debug into PyPy? From the nightly build, I don't get any debugging symbols in gdb, just a useless list of call addresses (running the ref-counting related "arg_incref" test here):

"""
#0  0x0000000000ef93ef in ?? ()
#1  0x0000000000fca0cb in PyDict_Next ()
#2  0x00007f2564be8f6c in __pyx_pf_10arg_incref_f () from /levi/scratch/robertwb/hudson/hudson/jobs/cython-scoder-pypy-nightly/workspace/BUILD/run/c/arg_incref.pypy-18.so
#3  0x00007f2564be8dd3 in __pyx_pw_10arg_incref_1f () from /levi/scratch/robertwb/hudson/hudson/jobs/cython-scoder-pypy-nightly/workspace/BUILD/run/c/arg_incref.pypy-18.so
#4  0x000000000109e375 in ?? ()
#5  0x00000000010026e4 in ?? ()
[a couple of hundred more skipped that look like the two above]
"""

Aren't debugging symbols enabled for the nightly builds, or is this what PyPy's JIT gives you? I used this file:

http://buildbot.pypy.org/nightly/trunk/pypy-c-jit-latest-linux64.tar.bz2

And I guess source-level debugging isn't really available for the 37MB pypy file either, is it?

BTW, I've also run into a problem with distutils under PyPy. The value of the CFLAGS environment variable is not being split into separate options, so that gcc complains about "-O" not accepting the value "0 -ggdb -fPIC" when I pass CFLAGS="-O0 -ggdb -fPIC". So I can currently only pass a single CFLAGS option (my choice obviously being "-ggdb").

Stefan
2012/2/18 Stefan Behnel <stefan_ml@behnel.de>
And now the question is: how do I debug into PyPy? From the nightly build, I don't get any debugging symbols in gdb, just a useless list of call addresses (running the ref-counting related "arg_incref" test here):
""" #0 0x0000000000ef93ef in ?? () #1 0x0000000000fca0cb in PyDict_Next ()
This one I know: It's a bug in our implementation of PyDict_Next() that I fixed today with 568fc4237bf8: http://mail.python.org/pipermail/pypy-commit/2012-February/059826.html -- Amaury Forgeot d'Arc
Amaury Forgeot d'Arc, 18.02.2012 21:20:
2012/2/18 Stefan Behnel
And now the question is: how do I debug into PyPy? From the nightly build, I don't get any debugging symbols in gdb, just a useless list of call addresses (running the ref-counting related "arg_incref" test here):
""" #0 0x0000000000ef93ef in ?? () #1 0x0000000000fca0cb in PyDict_Next ()
This one I know: It's a bug in our implementation of PyDict_Next() that I fixed today with 568fc4237bf8: http://mail.python.org/pipermail/pypy-commit/2012-February/059826.html
Cool, thanks! We'll see the result on the next run then. Sounds like Cython's test suite could prove to be a rather good test harness for PyPy as well. Stefan
On Sat, Feb 18, 2012 at 3:24 PM, Stefan Behnel <stefan_ml@behnel.de> wrote:
Amaury Forgeot d'Arc, 18.02.2012 21:20:
2012/2/18 Stefan Behnel
And now the question is: how do I debug into PyPy? From the nightly build, I don't get any debugging symbols in gdb, just a useless list of call addresses (running the ref-counting related "arg_incref" test here):
""" #0 0x0000000000ef93ef in ?? () #1 0x0000000000fca0cb in PyDict_Next ()
This one I know: It's a bug in our implementation of PyDict_Next() that I fixed today with 568fc4237bf8: http://mail.python.org/pipermail/pypy-commit/2012-February/059826.html
Cool, thanks! We'll see the result on the next run then.
Sounds like Cython's test suite could prove to be a rather good test harness for PyPy as well.
Stefan
Yes, ATM our cpyext test suite is basically built by reading the docs and writing our own tests, unlike the rest of our test suite, which draws from the CPython test suite and from bugs found by the great many Python programs (Django and SQLAlchemy in particular have a long history of finding bugs in every single nook and cranny).

Alex

--
"I disapprove of what you say, but I will defend to the death your right to say it." -- Evelyn Beatrice Hall (summarizing Voltaire)
"The people's good is the highest law." -- Cicero
Amaury Forgeot d'Arc, 18.02.2012 21:20:
2012/2/18 Stefan Behnel
And now the question is: how do I debug into PyPy? From the nightly build, I don't get any debugging symbols in gdb, just a useless list of call addresses (running the ref-counting related "arg_incref" test here):
""" #0 0x0000000000ef93ef in ?? () #1 0x0000000000fca0cb in PyDict_Next ()
This one I know: It's a bug in our implementation of PyDict_Next() that I fixed today with 568fc4237bf8: http://mail.python.org/pipermail/pypy-commit/2012-February/059826.html
It passes this test now.

https://sage.math.washington.edu:8091/hudson/view/dev-scoder/job/cython-scod...

After continuing a bit, it next crashes in the "builtin_abs" test. The code in this test grabs a reference to the "__builtins__" module at the start of the module init code and stores it in a static variable. Then, in a test function, it tries to get the "abs" function from it, and that crashes PyPy in the PyObject_GetAttr() call. In gdb, it looks like the module got corrupted somehow; at least its ob_type reference points to dead memory. Could this be another one of those borrowed reference issues? The module reference is retrieved using PyImport_AddModule(), which returns a borrowed reference.

Some more errors that I see in the logs up to that point, which all hint at missing bits of the C-API implementation:

specialfloatvals.c:490: error: ‘Py_HUGE_VAL’ undeclared
autotestdict_cdef.c:2087: error: ‘PyWrapperDescrObject’ undeclared
bufaccess.c:22714: error: ‘PyBoolObject’ undeclared
bufaccess.c:22715: error: ‘PyComplexObject’ undeclared
buffmt.c:2589: warning: implicit declaration of function ‘PyUnicode_Replace’

Stefan
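For reference, the pattern under suspicion is roughly the following (an illustrative sketch, not the actual generated module code); whether the generated code takes the Py_INCREF() or cpyext mishandles the borrowed reference is exactly the open question here:

    #include <Python.h>

    static PyObject *cached_builtins = NULL;

    static int
    cache_builtins_module(void)
    {
        /* PyImport_AddModule() returns a borrowed reference ... */
        PyObject *mod = PyImport_AddModule("__builtin__");
        if (mod == NULL)
            return -1;
        /* ... so an owned reference is needed before storing it away. */
        Py_INCREF(mod);
        cached_builtins = mod;
        return 0;
    }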
Amaury Forgeot d'Arc, 19.02.2012 18:04:
2012/2/19 Stefan Behnel
bufaccess.c:22714: error: ‘PyBoolObject’ undeclared
bufaccess.c:22715: error: ‘PyComplexObject’ undeclared
Why are these structures needed? Would Cython allow them to be only aliases to PyObject?
Not sure about the PyBoolObject. It's being referenced in the module setup code of the test module above, but doesn't seem to be used at all after that. Looks like a bug to me. I agree that PyBoolObject doesn't actually provide anything useful. That's different for PyComplexObject, which allows direct unboxed access to the real and imaginary number fields. Cython makes use of that for interfacing between C/C++ complex and Python complex. Regarding the PyWrapperDescrObject which I also mentioned in my last mail, I noticed that you already had a work-around for that in your initial patch. I'll see if I can get that implemented in a cleaner way (should be done at C compile time). Stefan
2012/2/19 Stefan Behnel <stefan_ml@behnel.de>
That's different for PyComplexObject, which allows direct unboxed access to the real and imaginary number fields. Cython makes use of that for interfacing between C/C++ complex and Python complex.
Why don't you use PyComplex_AsCComplex or other similar API for this? -- Amaury Forgeot d'Arc
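For reference, unboxing through that API looks roughly like this (a sketch with a made-up helper name):

    #include <Python.h>

    static int
    unbox_complex(PyObject *obj, double *real, double *imag)
    {
        Py_complex c = PyComplex_AsCComplex(obj);
        if (c.real == -1.0 && PyErr_Occurred())
            return -1;   /* conversion failed */
        *real = c.real;
        *imag = c.imag;
        return 0;
    }

This goes through the public conversion API instead of peeking into the PyComplexObject struct, so it does not depend on the object layout.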
Amaury Forgeot d'Arc, 19.02.2012 23:10:
2012/2/19 Stefan Behnel
That's different for PyComplexObject, which allows direct unboxed access to the real and imaginary number fields. Cython makes use of that for interfacing between C/C++ complex and Python complex.
Why don't you use PyComplex_AsCComplex or other similar API for this?
You are right, there is really just one mention of that in the code base, so it's easy to fix for non-CPython. Doing this, I noticed a bug in the standard declarations of CPython's C-API that Cython ships for user code, which leads to an accidental and useless reference to PyBoolObject and PyComplexObject in the generated modules. I'll find a way to fix those, too, so don't worry about them. Stefan
Hi, adding to the list. Stefan Behnel, 19.02.2012 12:10:
Some more errors that I see in the logs up to that point, which all hint at missing bits of the C-API implementation:
specialfloatvals.c:490: error: ‘Py_HUGE_VAL’ undeclared
CPython simply defines this as

    #ifndef Py_HUGE_VAL
    #define Py_HUGE_VAL HUGE_VAL
    #endif

to allow users to override it on buggy platforms.
buffmt.c:2589: warning: implicit declaration of function ‘PyUnicode_Replace’
Some more missing parts of the C-API:

- PyUnicode_Tailmatch
- PyFrozenSet_Type

Regarding the PyUnicode_*() functions, I could disable their usage when compiling against PyPy. Would that be helpful? I wouldn't expect them to be hard to implement, though. And disabling means that we'd have to remember to re-enable them when they become available at some point ... Does PyPy's cpyext define a version that we could base that decision on?

Stefan
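The special-casing could be keyed off a preprocessor check. Assuming cpyext's headers define a PYPY_VERSION macro (treat the macro name as an assumption), a sketch for the PyUnicode_Replace() case might look like this (the helper name is made up):

    #include <Python.h>

    /* Replace all occurrences of `old` in `ustring` by `repl`, avoiding
       PyUnicode_Replace() where it is not provided. */
    static PyObject *
    unicode_replace_compat(PyObject *ustring, PyObject *old, PyObject *repl)
    {
    #ifdef PYPY_VERSION
        return PyObject_CallMethod(ustring, "replace", "OO", old, repl);
    #else
        return PyUnicode_Replace(ustring, old, repl, -1);
    #endif
    }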
Stefan Behnel, 22.02.2012 21:39:
adding to the list.
Some more errors that I see in the logs up to that point, which all hint at missing bits of the C-API implementation:
specialfloatvals.c:490: error: ‘Py_HUGE_VAL’ undeclared
CPython simply defines this as
#ifndef Py_HUGE_VAL
#define Py_HUGE_VAL HUGE_VAL
#endif
to allow users to override it on buggy platforms.
buffmt.c:2589: warning: implicit declaration of function ‘PyUnicode_Replace’
Some more missing parts of the C-API:
- PyUnicode_Tailmatch
- PyFrozenSet_Type
- PyUnicode_GetMax
- the Unicode character type functions: Py_UNICODE_ISTITLE(), Py_UNICODE_ISALPHA(), Py_UNICODE_ISDIGIT(), Py_UNICODE_ISNUMERIC()

Our exec/eval implementation is broken because these are missing: PyCode_Check(), PyCode_GetNumFree(), PyEval_EvalCode(), PyEval_MergeCompilerFlags(), PyCF_SOURCE_IS_UTF8, PyRun_StringFlags(). I doubt that they will be all that trivial to implement, so I can live with not having them for a while. Code that uses exec/eval will just fail to compile for now.

I had to disable the following tests from Cython's test suite because they either crash PyPy or at least corrupt it in a way that infects subsequent tests:

    bufaccess cascadedassignment control_flow_except_T725 exarkun
    exceptions_nogil extended_unpacking_T235 fused_def fused_cpdef
    literal_lists memoryview moduletryexcept purecdef
    property_decorator_T593 setjmp special_methods_T561_py2 tupleassign
    tryexcept tuple_unpack_string type_slots_nonzero_bool

With those taken out, I get an otherwise complete test run:

https://sage.math.washington.edu:8091/hudson/view/dev-scoder/job/cython-scod...

Here are the test results:

https://sage.math.washington.edu:8091/hudson/view/dev-scoder/job/cython-scod...

It obviously runs longer than a CPython run (22 vs. 7 minutes), even though the runtime is normally dominated by the C compiler runs. However, having learned a bit about the difficulties that PyPy has in emulating the C-API, I'm actually quite impressed how much of this just works at all. Well done.

And last but not least, over on python-dev, MvL proposed these two simple functions for accessing the tstate->exc_* fields:

- PyErr_GetExcInfo(PyObject** type, PyObject** value, PyObject** tb)
- PyErr_SetExcInfo(PyObject* type, PyObject* value, PyObject* tb)

http://thread.gmane.org/gmane.comp.python.devel/129787/focus=129792

Makes sense to me. Getting those in would fix our generator/coroutine implementation, amongst other things.

Stefan
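To make the intended use concrete, here is a sketch of how generated code could save and restore the handled exception around a nested call, using the two functions with the signatures quoted above (the wrapper itself is illustrative, not the proposed patch):

    #include <Python.h>

    static void
    call_with_saved_exc_info(void (*body)(void))
    {
        PyObject *type, *value, *tb;

        PyErr_GetExcInfo(&type, &value, &tb);   /* hands out new references (or NULLs) */
        body();                                 /* may overwrite sys.exc_info() */
        PyErr_SetExcInfo(type, value, tb);      /* expected to steal the references back */
    }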
2012/2/18 Stefan Behnel <stefan_ml@behnel.de>
And now the question is: how do I debug into PyPy?
Part of the answer: I never debug pypy. Even with debug symbols, the (generated) code is so complex that most of the time you cannot get anything interesting beyond the function names. But pypy is written in RPython, which can run on top of CPython. Even the cpyext layer can be interpreted; when the extension module calls PyDict_Next(), it actually steps into cpyext/dictobject.py! And it's really fun to add print statements, conditional breakpoints, etc. and experiment with the code without retranslating everything. -- Amaury Forgeot d'Arc
participants (9)

- Alex Gaynor
- Amaury Forgeot d'Arc
- Armin Rigo
- Aroldo Souza-Leite
- Carl Friedrich Bolz
- Maciej Fijalkowski
- Martijn Faassen
- Mike Müller
- Stefan Behnel