[Cython] Fixing NumPy support for Python 3 (Stefan, please help!)

Stefan Behnel stefan_ml at behnel.de
Fri Feb 18 06:19:32 CET 2011


Lisandro Dalcin, 17.02.2011 17:24:
> On 17 February 2011 12:16, Stefan Behnel wrote:
>> Lisandro Dalcin, 17.02.2011 15:32:
>>>
>>> Stefan, what do you think about the patch below? This hunk is part of
>>> a series of fixes required to get numpy-dev working under Python 3.2.
>>> The root of the issue is that __cythonbufferdefaults__ keys & values
>>> end up being "bytes" (this coercion is triggered in Interpreter.py).
>>>
>>>
>>> diff --git a/Cython/Compiler/ExprNodes.py b/Cython/Compiler/ExprNodes.py
>>> index 5b339da..b72deef 100755
>>> --- a/Cython/Compiler/ExprNodes.py
>>> +++ b/Cython/Compiler/ExprNodes.py
>>> @@ -12,6 +12,7 @@ cython.declare(error=object, warning=object, warn_once=object,
>>>                  Builtin=object, Symtab=object, Utils=object, find_coercion_error
>>>                  debug_disposal_code=object, debug_temp_alloc=object, debug_coerc
>>>
>>> +import sys
>>>   import operator
>>>
>>>   from Errors import error, warning, warn_once, InternalError, CompileError
>>> @@ -1136,6 +1137,8 @@ class StringNode(PyConstNode):
>>>           return self.result_code
>>>
>>>       def compile_time_value(self, env):
>>> +        if sys.version_info[0] >= 3 and self.unicode_value:
>>
>> You must use "self.unicode_value is not None" here, it may be the empty
>> string.
>>
>>> +            return self.unicode_value
>>>           return self.value
>>
>> Ok, that's a tricky one. Just because the compilation is running in Py3
>> doesn't mean that the correct compile-time value is a Unicode string - we
>> don't know what it'll be used for.
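
On the "is not None" point above, note that a plain truth test would
also skip empty string literals, because their unicode_value is an
empty (and therefore false) unicode string. In plain Python:

    u = u''             # unicode_value of an empty string literal
    bool(u)             # False - "if u:" would wrongly fall through
    u is not None       # True  - the test we actually need
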
>
> OK, I've found an alternative workaround. What do you think?
>
> diff --git a/Cython/Compiler/Interpreter.py b/Cython/Compiler/Interpreter.py
> index 83cb184..9fb5fe5 100644
> --- a/Cython/Compiler/Interpreter.py
> +++ b/Cython/Compiler/Interpreter.py
> @@ -6,6 +6,7 @@ For now this only covers parse tree to value conversion of
>   compile-time values.
>   """
>
> +import sys
>   from Nodes import *
>   from ExprNodes import *
>   from Errors import CompileError
> @@ -44,6 +45,10 @@ def interpret_compiletime_options(optlist, optdict, type_env=None, type_args=())
>               else:
>                   raise CompileError(node.pos, "Type not allowed here.")
>           else:
> +            if (sys.version_info[0] >= 3 and
> +                isinstance(node, StringNode) and
> +                node.unicode_value is not None):
> +                return (node.unicode_value, node.pos)
>               return (node.compile_time_value(empty_scope), node.pos)
>
>       if optlist:
> @@ -52,6 +57,7 @@ def interpret_compiletime_options(optlist, optdict, type_env=None, type_args=())
>           assert isinstance(optdict, DictNode)
>           new_optdict = {}
>           for item in optdict.key_value_pairs:
> -            new_optdict[item.key.value] = interpret(item.value, item.key.value)
> +            new_key, dummy = interpret(item.key, None)
> +            new_optdict[new_key] = interpret(item.value, item.key.value)
>           optdict = new_optdict
>       return (optlist, new_optdict)
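
For reference, the symptom this works around (if I read it correctly)
is that under Python 3 the buffer option keys come out of the
interpreter as bytes, so later lookups with str keys silently miss -
e.g., with made-up option values:

    optdict = {b'mode': 'c', b'ndim': 2}    # keys coerced to bytes
    optdict.get('mode')                     # -> None under Python 3

The extra interpret() call on item.key re-derives a unicode key so
that the lookup matches again.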

This still doesn't look right to me. It just does the same thing in a
different place. Actually, I'm not sure there is a way to "get it right".


>> Doing the above will do the wrong thing
>> e.g. in this case:
>>
>>     DEF const_x = "abc"
>>     cdef str x = const_x
>>
>> The problem is: it is broken already, returning self.value is wrong because
>> it drops available type information by returning plain bytes instead of str.
>> And simply returning self.unicode_value in Py3 doesn't fix that.
>
> I see... So the correct compile-time value for StringNode should
> depend on options.language_level, right?

Hmmm, I'm not sure that would solve it. Py2 str has the property of 
changing its type depending on the runtime environment: the same literal 
is a bytes string under Python 2 and a unicode string under Python 3. So 
this is actually independent of the language_level (-3 has easy semantics 
here).
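
To make that concrete, the same source line means different things
depending on the interpreter it eventually runs under, no matter how
it was compiled:

    cdef str x = "abc"
    # run under Python 2: x holds the bytes string 'abc'
    # run under Python 3: x holds the unicode string 'abc'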

I mean, it won't even solve the problem at hand, because the code could 
still be Py2 code but require a unicode string value, simply because it 
gets *compiled* under Python 3. It shouldn't depend on the compile-time 
environment at all, but the NumPy problem shows that it sometimes has to.

I think we have to find a way to keep the double bytes/unicode string 
identity alive during runtime processing, up to the point where we can (or 
have to) decide what to make of it.
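
Roughly - and this is only a sketch with made-up names, not a worked-out
design - something like a constant that carries both identities:

    class DualString(object):
        """Compile-time string constant that keeps both the bytes and
        the unicode representation until a use site forces a choice."""
        def __init__(self, byte_value, unicode_value):
            self.byte_value = byte_value        # what Py2 str code sees
            self.unicode_value = unicode_value  # what Py3 str code sees
        def as_bytes(self):
            return self.byte_value
        def as_unicode(self):
            return self.unicode_value

Coercion points (buffer option handling, DEF substitution, ...) would
then have to ask for as_bytes() or as_unicode() explicitly, instead of
guessing from the Python version the compiler itself happens to run
under.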

Stefan

