Hi all,

I am looking into the last failing case, "TestMicroNumPy::()::test_reduce_logical_and", but I don't quite understand what this test means. The test case fails with:

```
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
Loops don't match
=================
loop id = None
('operation mismatch',)
<could not determine information>
Ignore ops: []
Got:
===== HERE =====
guard_not_invalidated(descr=<Guard0x4005356a88>)
i39 = int_and(i35, 7)
i40 = int_is_zero(i39)
guard_true(i40, descr=<Guard0x40053a0e30>)
f41 = raw_load_f(i9, i35, descr=<ArrayF 8>)
i43 = float_ne(f41, 0.000000)
guard_true(i43, descr=<Guard0x40053a0e60>)
i45 = int_add(i28, 1)
i47 = int_add(i35, 8)
i48 = int_ge(i45, i36)
guard_false(i48, descr=<Guard0x40053a0e90>)
jump(p29, i45, p2, i47, p4, p6, i9, i36, descr=TargetToken(274965300512))
Expected:
i10096 = int_and(i29, 7)
i10097 = int_is_zero(i10096)
guard_true(i10097, descr=...)
guard_not_invalidated(descr=...)
f31 = raw_load_f(i9, i29, descr=<ArrayF 8>)
i32 = float_ne(f31, 0.000000)
guard_true(i32, descr=...)
i36 = int_add(i24, 1)
i37 = int_add(i29, 8)
i38 = int_ge(i36, i30)
guard_false(i38, descr=...)
jump(..., descr=...)
```

IIUC, the difference is that guard_not_invalidated is at a different location. But I don't understand why the backend can affect the logs in the 'jit-log-opt-' tag.

Also, I found that reduce_logical_and (failed) and reduce_logical_xor (passed) are very different. Is there more information on the details of this test? Any ideas for debugging this test case are very welcome! Thanks.

Regards,
Logan

p.s. I have covered almost all the test cases. Except for the one described above, the other test cases are either passing, classified as XFAIL (not supportable), or related to the environment (e.g. schroot/qemu). I will try to run it on the real board this weekend.

On Wed, Feb 21, 2024 at 9:57 PM Logan Chien <tzuhsiang.chien@gmail.com> wrote:
Hi Armin,
Thank you for the reply.
Luckily, I found the bug. It was in my write barrier card marking implementation: I misunderstood what the AArch64 MVN instruction does when I was porting the code. After fixing it, both test cases (test_zipfile64 and test_tokenize) pass.
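For anyone porting similar code, here is a minimal sketch of what MVN computes, modelled in Python on 64-bit values. MVN is a bitwise NOT, and the RISC-V counterpart is the `not rd, rs` pseudo-instruction (`xori rd, rs, -1`). The message above does not say what the misreading was, so the contrast with NEG is only an illustrative assumption; the helper names are made up for this sketch.

```python
MASK64 = (1 << 64) - 1  # model a 64-bit general-purpose register


def mvn(x):
    """AArch64 `MVN Xd, Xm`: bitwise NOT of the source register."""
    return ~x & MASK64


def neg(x):
    """AArch64 `NEG Xd, Xm`: two's-complement negation, shown for contrast."""
    return -x & MASK64


def riscv_not(x):
    """RISC-V `not rd, rs`, i.e. `xori rd, rs, -1` (XOR with all ones)."""
    return (x ^ MASK64) & MASK64


if __name__ == "__main__":
    x = 0x0F
    print(hex(mvn(x)))        # bitwise complement of x
    print(hex(riscv_not(x)))  # same value as mvn(x)
    print(hex(neg(x)))        # differs from mvn(x) by exactly one
```

The two operations are off by one from each other (`~x == -x - 1`), which is easy to miss when a mask is only ever tested with a few values.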
Now, I am looking into test_json. Earlier, I thought it was an XFAIL because the -O2 build was failing too. But after adding `@settings(suppress_health_check=[HealthCheck.too_slow])` to `test_json.test_roundtrip`, I could run it in a reasonable time.
However, it was extremely slow when I ran the same test with the `-Ojit` build. According to `PYPYLOG=jit:log.txt`, the JIT compiler kept building the same (or similar) bridge. Statistics showed that the RISC-V JIT had compiled more than 3000 bridges by the time I interrupted it with Ctrl-C, whereas the X86 JIT build compiled only 900 bridges over the complete run. I will try to figure out the failing guard op first.
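A rough way to get such statistics is to tally bridge headers in the log. The sketch below assumes that each compiled bridge is announced by a line of the form `# bridge out of Guard 0x... with N ops` inside the `jit-log-opt-bridge` sections; that line format, the regular expression, and the function names are assumptions to check against the actual `log.txt`.

```python
import re
from collections import Counter

# Assumed header line format for a compiled bridge:
#   "# bridge out of Guard 0x4005356a88 with 12 ops"
BRIDGE_RE = re.compile(r"# bridge out of Guard (0x[0-9a-fA-F]+) with (\d+) ops")


def count_bridges(lines):
    """Return a Counter mapping guard id -> number of bridge headers seen."""
    counts = Counter()
    for line in lines:
        match = BRIDGE_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


def report(path, top=10):
    """Print the total bridge count and the guards with the most bridges."""
    with open(path, errors="replace") as f:
        counts = count_bridges(f)
    print("total bridges:", sum(counts.values()))
    for guard, n in counts.most_common(top):
        print(guard, n)
```

Calling `report("log.txt")` on both the RISC-V and the X86 logs gives comparable totals. If no guard id repeats, the "similar" bridges come from different guards, and grouping by op count or by the first few operations may be more telling.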
Regards, Logan
On Mon, Feb 19, 2024 at 10:05 PM Armin Rigo <armin.rigo@gmail.com> wrote:
Hi Logan,
On Tue, 20 Feb 2024 at 05:08, Logan Chien <tzuhsiang.chien@gmail.com> wrote:
This should just be #defined to do nothing with Boehm, maybe in rpython/translator/c/src/mem.h
With this change and a few RISC-V backend fixes (related to self.cpu.vtable_offset), I can build and run a JIT+BoehmGC PyPy.
Cool! I also got a pull request merged into the main branch with this change, and it does indeed fix boehm builds.
This configuration (JIT+BoehmGC) can pass test_tokenize and test_zipfile64 (from lib_python_tests.py).
Thus, my next step will focus on the differences between JIT+BoehmGC and JIT+IncminimarkGC.
A problem that came up a lot in other backends is a specific input instruction that the backend emits with specific registers. When you run into the bad case, the emitted code reuses a register *before* reading the same register, assuming that it still contains its old value. It's entirely dependent on register allocation, and if you run it with boehm then the sequence of instructions is slightly different and that might be the reason that the bug doesn't show up then. If you get two failures with incminimark and none with boehm, then it sounds more likely that the case involves one of the incminimark-only constructions---but it's also possible the bug is somewhere unrelated and it's purely bad luck...
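To make that failure mode concrete, here is a minimal sketch on a toy register machine. It is not taken from any real backend: the lowering of `rd = base + index * 8`, the register names, and the helper functions are all hypothetical. The buggy version writes `rd` before it has read `base`, so it only goes wrong when the register allocator happens to pick `rd == base`.

```python
def run(program, regs):
    """Interpret a tiny three-operand register program on a copy of regs."""
    regs = dict(regs)
    for op, rd, a, b in program:
        if op == "slli":    # rd = regs[a] << immediate b
            regs[rd] = regs[a] << b
        elif op == "add":   # rd = regs[a] + regs[b]
            regs[rd] = regs[a] + regs[b]
        else:
            raise ValueError(op)
    return regs


def emit_buggy(rd, base, index):
    """Lower rd = base + index * 8, writing rd before base has been read."""
    return [("slli", rd, index, 3), ("add", rd, rd, base)]


def emit_fixed(rd, base, index, scratch="t0"):
    """Same lowering, but the intermediate value lives in a scratch register."""
    return [("slli", scratch, index, 3), ("add", rd, scratch, base)]


if __name__ == "__main__":
    regs = {"a0": 1000, "a1": 2, "a2": 0, "t0": 0}
    # Distinct destination: the buggy lowering happens to work.
    print(run(emit_buggy("a2", "a0", "a1"), regs)["a2"])
    # Destination aliases base: a0 is clobbered before it is read.
    print(run(emit_buggy("a0", "a0", "a1"), regs)["a0"])
    # Going through a scratch register is safe under the same aliasing.
    print(run(emit_fixed("a0", "a0", "a1"), regs)["a0"])
```

A test that forces every aliasing combination of input and output registers tends to catch this kind of bug much earlier than a full translated build does.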
Armin