Re: [pypy-dev] [pypy-commit] pypy default: investigate mark_opaque_ptr
Hi Hakan, On 13 October 2014 11:52, Hakan Ardo <hakan@debian.org> wrote:
On Mon, Oct 13, 2014 at 11:45 AM, Armin Rigo <arigo@tunes.org> wrote:
Hi Hakan,
On 13 October 2014 10:11, Hakan Ardo <hakan@debian.org> wrote:
mark_opaque_ptr is used by unrolling to prevent moving getfield_gc(p1) into the short preamble is p1 if opaque. This is needed since the pointer might be pointing to something of a different type than what it was pointing to during the tracing. In which case the execution of the trace will be aborted before the original position of the getfield_gc but after the short preamble is executed.
Thanks for the confirmation. This is the only issue, and the only reason for mark_opaque_ptr, right? I've already written it down in bbebe6918aa9.
Yes as far as I can remember. You have ofcourse the exact same issue with getarrayitem_* and friends...
Ah, indeed. I'm thinking about a more involved fix, prompted by https://bitbucket.org/pypy/pypy/issue/1886 . Would it work? The idea would be to allow moving the getfield_gc, but in case it was on an opaque pointer, add a new "guard_gctype" operation in the short preamble. This is possible (and easy) with our own GCs, but wouldn't work with Boehm, so it would be conditional... Armin PS: you're still using pypy-dev@codespeak.net; I think this address stopped working by now. I fixed it to pypy-dev@python.org.
On Mon, Oct 13, 2014 at 12:24 PM, Armin Rigo <arigo@tunes.org> wrote:
Yes as far as I can remember. You have ofcourse the exact same issue with getarrayitem_* and friends...
Ah, indeed.
I'm thinking about a more involved fix, prompted by https://bitbucket.org/pypy/pypy/issue/1886 . Would it work? The idea would be to allow moving the getfield_gc, but in case it was on an opaque pointer, add a new "guard_gctype" operation in the short preamble. This is possible (and easy) with our own GCs, but wouldn't work with Boehm, so it would be conditional...
Maybe. I cant see why not. If the opaque pointer shows up in the VirtualState there might be some similar fixes needed there. I.e. there is probably cases where VirtualState.generate_guards needs to generate a guard_gctype aswell. I think the problematic case is actually when mixing with arrays. There will already be a guard_class protecting the getfiled in the short preamble, but it crashes when operating on an array. Also, there is no guard_arraytype. So mixing different types of arrays will crash. There is logic to generate guards for the arraylen and place those in short preamble.
PS: you're still using pypy-dev@codespeak.net; I think this address stopped working by now. I fixed it to pypy-dev@python.org.
Sorry (I got no bounce though). -- Håkan Ardö
Hi Hakan, On 13 October 2014 13:10, Hakan Ardo <hakan@debian.org> wrote:
I think the problematic case is actually when mixing with arrays. There will already be a guard_class protecting the getfiled in the short preamble, but it crashes when operating on an array. Also, there is no guard_arraytype. So mixing different types of arrays will crash. There is logic to generate guards for the arraylen and place those in short preamble.
I think it should work. What I have in mind for GUARD_GCTYPE would be to check the GC header of the object; all GC pointers have one, with our own GCs --- even GcArrays. So GUARD_GCTYPE would check that the object is exactly of the correct type: the right kind of GcArray, or the right kind of GcStruct that is not necessarily an RPython instance. It would be inserted before we do more checks with that pointer, whenever the pointer may originally be opaque. Btw, a GUARD_CLASS on an opaque pointer that is not actually an RPython instance may also crash, depending on the translation settings. I think it works without problem if "gcremovetypeptr" is True (the default on 64-bit, but not on 32-bit). If it is False, then we have a very unlikely but possible false positive where the check is done by reading the 2nd word of the object, just after the GC header, and checking that it is exactly some value. If the object is actually, say, a tuple of integers, then it would be possible that the first integer has exactly the same value as expected. So unless I'm missing something else, we have a potential bug that involves creating a range(a, b, c) with a being exactly the same as the address of some vtable thing... This would also be resolved if in this case we put a GUARD_GCTYPE first (and then don't need an additional GUARD_CLASS). A bientôt, Armin.
On Mon, Oct 13, 2014 at 6:09 PM, Armin Rigo <arigo@tunes.org> wrote:
I think it should work.
I'm sure your right :)
What I have in mind for GUARD_GCTYPE would be to check the GC header of the object; all GC pointers have one, with our own GCs --- even GcArrays. So GUARD_GCTYPE would check that the object is exactly of the correct type: the right kind of GcArray, or the right kind of GcStruct that is not necessarily an RPython instance. It would be inserted before we do more checks with that pointer, whenever the pointer may originally be opaque.
Are we certain opaque pointers always refer to GC-objects?
Btw, a GUARD_CLASS on an opaque pointer that is not actually an RPython instance may also crash, depending on the translation settings. I think it works without problem if "gcremovetypeptr" is True (the default on 64-bit, but not on 32-bit). If it is False, then we have a very unlikely but possible false positive where the check is done by reading the 2nd word of the object, just after the GC header, and checking that it is exactly some value. If the object is actually, say, a tuple of integers, then it would be possible that the first integer has exactly the same value as expected. So unless I'm missing something else, we have a potential bug that involves creating a range(a, b, c) with a being exactly the same as the address of some vtable thing...
Right, but mark_opaque_ptr prevents this case as well by not moving operations using opaque pointers into the short preamble. So I guess we are currently fine? -- Håkan Ardö
Hi Hakan, On 13 October 2014 18:59, Hakan Ardo <hakan@debian.org> wrote:
Are we certain opaque pointers always refer to GC-objects?
Yes.
Right, but mark_opaque_ptr prevents this case as well by not moving operations using opaque pointers into the short preamble. So I guess we are currently fine?
I'm not sure: the logic in optimizeopt/heap.py seems to be "if it's an opaque pointer, and if we don't know its class, then don't move it". According to this logic, operations on RPython instances (like GUARD_CLASS) can still be moved to the short preamble. Maybe I'm missing something else? A bientôt, Armin.
On Tue, Oct 14, 2014 at 9:26 AM, Armin Rigo <arigo@tunes.org> wrote:
I'm not sure: the logic in optimizeopt/heap.py seems to be "if it's an opaque pointer, and if we don't know its class, then don't move it". According to this logic, operations on RPython instances (like GUARD_CLASS) can still be moved to the short preamble. Maybe I'm missing something else?
I see. Yes, there is the logic of ShortBoxes.assumed_classes which is supposed to make sure never to inline a short_preamble in cases where the class of the opaque pointer is not already known. This logic could probably be killed with your solution. -- Håkan Ardö
Hi again, for more details, look at 77dce024e344. It takes the original approach of not touching operations using opaque pointers at all into one that allows getfield_gc(p,...) with p opaque in the short_preamble in some specific cases. On Tue, Oct 14, 2014 at 9:55 AM, Hakan Ardo <hakan@debian.org> wrote:
On Tue, Oct 14, 2014 at 9:26 AM, Armin Rigo <arigo@tunes.org> wrote:
I'm not sure: the logic in optimizeopt/heap.py seems to be "if it's an opaque pointer, and if we don't know its class, then don't move it". According to this logic, operations on RPython instances (like GUARD_CLASS) can still be moved to the short preamble. Maybe I'm missing something else?
I see. Yes, there is the logic of ShortBoxes.assumed_classes which is supposed to make sure never to inline a short_preamble in cases where the class of the opaque pointer is not already known. This logic could probably be killed with your solution.
-- Håkan Ardö
-- Håkan Ardö
participants (2)
-
Armin Rigo
-
Hakan Ardo