[Cython] remaining open issues for 0.17

mark florisson markflorisson88 at gmail.com
Fri Aug 10 22:19:38 CEST 2012


On 10 August 2012 19:54, Robert Bradshaw <robertwb at gmail.com> wrote:
> On Fri, Aug 10, 2012 at 9:25 AM, Robert Bradshaw <robertwb at gmail.com> wrote:
>> On Thu, Aug 9, 2012 at 10:04 AM, Stefan Behnel <stefan_ml at behnel.de> wrote:
>>> mark florisson, 09.08.2012 18:51:
>>>> On 9 August 2012 16:36, Stefan Behnel wrote:
>>>>> Stefan Behnel, 09.08.2012 14:31:
>>>>>> mark florisson, 07.08.2012 11:09:
>>>>>>> I thought the 32 bit issue was resolved? You pushed a fix and I fixed
>>>>>>> some tests, so it passed for me. I can run it again to check...
>>>>>>
>>>>>> I don't know. Yaroslav replied to your mail saying that your fixes didn't
>>>>>> change anything for the Debian builds. Let's see what he has to say about
>>>>>> the second beta.
>>>>>
>>>>> He ran them through the build servers. Here are the latest results:
>>>>>
>>>>> https://buildd.debian.org/status/package.php?p=cython&suite=experimental
>>>>>
>>>>> (hint: click on "all" in the Logs column, then click on the build result to
>>>>> see the log output)
>>>>>
>>>>> The i386 tests show 4 errors related to fused types in the NumPy tests:
>>>>>
>>>>> https://buildd.debian.org/status/fetch.php?pkg=cython&arch=i386&ver=0.17~beta1-2&stamp=1343428255
>>>>>
>>>>> >From a quick look, they might really be problems in the tests rather than
>>>>> the code we generate.
>>>>>
>>>>> It also looks like my signed char fix was not sufficient, as you can see in
>>>>> the memslice tests here, e.g. in the
>>>>> "memslice.__test__.test_memslice_struct_with_arrays" test:
>>>>>
>>>>> https://buildd.debian.org/status/fetch.php?pkg=cython&arch=s390&ver=0.17~beta1-2&stamp=1343418358
>>>>>
>>>>> It now prints
>>>>>
>>>>> """
>>>>> ValueError: Buffer dtype mismatch, expected 'char' but got 'unsigned char'
>>>>> in 'ArrayStruct.chars'
>>>>> """
>>>>>
>>>>> Well, "char" and "unsigned char" are supposed to be the same on this
>>>>> platform, but I guess it can't know that. I wonder if "char" and "signed
>>>>> char" are considered identical on other platforms ...
>>>>>
>>>>> To fix the tests, it might be enough to use an explicit "unsigned char"
>>>>> instead of plain "char" for the types. Not sure if that's a good idea or if
>>>>> there actually is a bug hidden below this test failure. At least, it smells
>>>>> like an inconvenience for users to get that error.
>>>>
>>>> I think these tests are fine (unlike the fused test I fixed), it's a
>>>> bug somewhere. I'm not sure where though, because the code that
>>>> creates the format string uses the type info struct (with the C
>>>> compile-time unsigned check), and so does the buffer format parser.
>>>> The code creating the type strings are in the MemoryView.pyx and
>>>> Buffer.c utility files at the bottom. It does
>>>>
>>>> if (type->is_unsigned)
>>>>     *buf = toupper(*buf);
>>>>
>>>> So whatever I feed it on my system (char, signed char or unsigned
>>>> char), it generates the right format string for me.
>>>
>>> I think this needs some more investigation directly on the failing systems.
>>> It'll be much easier to figure this out with a debugger.
>>
>> Strangely enough, I wasn't even able to reproduce this passing
>> "-funsigned-char" so yes, I think it may require access to one of
>> these systems. (Perhaps I'd need to compile Python and/or NumPy with
>> this option as well...)
>>
>>>> Perhaps we can just ignore these test failures, I don't think many
>>>> people are using memoryviews with characters. Changing them to signed
>>>> or unsigned char may fix the problem for the tests, though.
>>>
>>> If we can't find a way to fix this in time, I'm leaning towards that, too.
>>> However, working around a bug in a test is a sure way to forget about the
>>> bug and consider everything working. Could you copy the failing tests into
>>> a separate test case, disable that in bugs.txt and then change the types in
>>> the original test modules to whatever you think would make the tests work?
>>
>> I'll take a stab at this.
>
> OK, the problem boiled down to using 'b' or 'B' for the format string
> for char rather than 'c'. See, e.g.
> http://docs.python.org/library/array.html .

That confounds me, I think I need some hand-holding :). So we have a
type info struct, from which we create a format string containing 'b'
or 'B', depending on signed-ness (at C compile time). We then at
buffer format string parsing time use this same type info struct, and
we check the size (match) and we check the group of the character in
the format string ('b' -> 'I' and 'B' -> 'U') against the expected
group from the type info struct (which also depends on the
signed-ness). At which point does this go wrong?

> I've pushed a fix at
> https://github.com/robertwb/cython/commit/b0539cbc32c200a09b1fbddf2d6943e92aec2f3e
> , but it'd be good to have another set of eyes with someone more
> familiar with this code to be sure I didn't miss anything. It also
> involved a lot of changes to the tests
> https://github.com/robertwb/cython/commit/6bcc8fd8419e6e4079344788023029736610d5aa
> so is not exactly backwards compatible, though perhaps more correct.

I think that's ok, people don't typically write these format strings
manually. Thanks for the fixes!

> It'd be good to see if this actually fixes the issues.
>
> - Robert
> _______________________________________________
> cython-devel mailing list
> cython-devel at python.org
> http://mail.python.org/mailman/listinfo/cython-devel


More information about the cython-devel mailing list