[Cython] automatic character conversion problem

Stefan Behnel stefan_ml at behnel.de
Tue May 3 06:54:21 CEST 2011


[moving this to cython-users]

Robert Bradshaw, 03.05.2011 06:38:
> On Fri, Apr 29, 2011 at 6:57 AM, Hans Terlouw wrote:
>> Recently I encountered a problem with Cython's automatic char* to string
>> conversion (Cython version 0.14.1). I'll attach two sample source files. The
>> first one, char2str_a.pyx prints "The quick...", just as I expected. But the
>> second example prints "... lazy dog.". In the original situation I had a call
>> to free() instead of the call to strcpy() which I use here for illustration
>> purposes. Then I got unpredictable results. Apparently the Python string
>> object keeps referring to the C char* a bit longer than I would expect. A
>> previous version (0.11.2) didn't have this problem.
>
> This is due to type inference: in the second example, p_str is
> inferred to be of type char*.

Just to make this a bit clearer:

>> cdef extern from "stdlib.h":
>>    void free(void* ptr)
>>    void* malloc(size_t size)
>>
>> cdef extern from "string.h":
>>    char *strcpy(char *dest, char *src)
>>
>> def char2str():
>>    cdef char *c_str_a = <char*>malloc(80)
>>    cdef char *c_str_b = "The quick...   "
>>    cdef char *c_str_c = "... lazy dog.  "
>>
>>    strcpy(c_str_a, c_str_b)
>>
>>    p_str = c_str_a
>>    strcpy(c_str_a, c_str_c)
>>    p_str = p_str.rstrip()
>>    print p_str

In this example, p_str is assigned both a char* and a Python object, so
type inference makes it a Python object. The first assignment is therefore
a copy operation that creates a Python bytes object, which the subsequent
strcpy() cannot affect, and the second assignment stores the object
returned from the .rstrip() call.


>> cdef extern from "stdlib.h":
>>    void free(void* ptr)
>>    void* malloc(size_t size)
>>
>> cdef extern from "string.h":
>>    char *strcpy(char *dest, char *src)
>>
>> def char2str():
>>    cdef char *c_str_a = <char*>malloc(80)
>>    cdef char *c_str_b = "The quick...   "
>>    cdef char *c_str_c = "... lazy dog.  "
>>
>>    strcpy(c_str_a, c_str_b)
>>
>>    p_str = c_str_a
>>    strcpy(c_str_a, c_str_c)
>>    print p_str.rstrip()

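Here, p_str is only assigned once from a pointer, so the type is inferred
as a char*, and the first assignment is a pointer assignment, not a copy
operation. The strcpy() that follows overwrites the buffer that p_str still
points to, and the conversion to a Python object only happens at the
.rstrip() call, which therefore sees the new contents.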
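
The same copy-versus-pointer distinction can be reproduced outside Cython.
What follows is only a rough sketch in plain Python using the standard
ctypes module, not what Cython generates; the function name alias_vs_copy
and its variable names are made up for illustration. Reading .value from the
buffer builds a bytes copy (like the first example), while casting to
c_char_p merely keeps a pointer into the same memory (like the second).

```python
import ctypes


def alias_vs_copy():
    """Hypothetical helper: contrast copying a C buffer with pointing at it."""
    # An 80-byte, zero-filled C char array, standing in for malloc(80).
    buf = ctypes.create_string_buffer(80)
    buf.value = b"The quick...   "

    # Reading .value builds a new bytes object: a copy of the buffer contents.
    copied = buf.value

    # Casting yields a char* that still points into buf's memory (no copy).
    alias = ctypes.cast(buf, ctypes.c_char_p)

    # Overwrite the buffer, as the second strcpy() does in the examples.
    buf.value = b"... lazy dog.  "

    # The copy is unchanged; the pointer is only read now and sees new data.
    return copied.rstrip(), alias.value.rstrip()


if __name__ == "__main__":
    copied, aliased = alias_vs_copy()
    print(copied)
    print(aliased)
```

In Cython itself, declaring the variable as a Python object before
assigning to it (for example "cdef bytes p_str = c_str_a") should force the
copy at the assignment, regardless of what type inference would pick.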

You can see the difference with "cython -a", which generates an HTML 
representation of your code that highlights Python object operations. 
(Click on a source line to see the C code).

Stefan

