On 23/09/14 09:52, Armin Rigo wrote:
Hi Lefteris,
On 22 September 2014 19:37, Eleytherios Stamatogiannakis <estama@gmail.com> wrote:
b = unicode( ffi.buffer( clib.getString(...) ) ,'utf-8')
because it'll only return the first character of getString, due to being declared as a 'char*'.
The issue is only that ffi.buffer() tries to guess how long a buffer you're giving it, and with "char *" the guess is one (only ffi.string() has logic to look for the final null character in the array).
If only ffi.string() has logic to look for the final null character, then how can the example below work?
>>> teststr = ffi.new('char[]', 'asdfasdfasdfasdfasdfasdf')
>>> unicode(ffi.buffer(teststr), 'utf-8')
u'asdfasdfasdfasdfasdfasdf\x00'
The above doesn't explicitly set the length in ffi.buffer(). There is still one problem with ffi.buffer(), namely the trailing "\x00" it leaves in the result, but otherwise it works with only one copy to go from a char* to a Python unicode string. The problem is that I cannot declare a C function as returning a char[], so I cannot get ffi.buffer() to behave on its results the way it does on "teststr" above.
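A minimal sketch of why the "teststr" case behaves differently, written for Python 3 (bytes initializers, str() in place of unicode()). No null-character search seems to be involved: ffi.new('char[]', ...) yields an array cdata whose size is part of its type, so ffi.buffer() covers the whole array, while a plain "char *" only tells it the size of one char. The ffi.cast() below is a stand-in for a C function returning "char *"; the variable names are made up for illustration.

```python
import cffi

ffi = cffi.FFI()

# An owned array: its cdata type is "char[6]", so the size is part of the type.
arr = ffi.new("char[]", b"hello")

# The same memory seen through a plain pointer, the way a C function declared
# as returning "char *" would hand it back; the length is no longer known.
ptr = ffi.cast("char *", arr)

array_view_len = len(ffi.buffer(arr))    # whole array, including the final NUL
pointer_view_len = len(ffi.buffer(ptr))  # sizeof(char), i.e. a single byte

# Passing an explicit size leaves the trailing NUL out of the array view.
text = str(ffi.buffer(arr, len(arr) - 1), "utf-8")
```

The last line also shows one way around the trailing "\x00": when the array length is known, an explicit size of len(arr) - 1 keeps the terminator out of the decoded string.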
You need to get its length explicitly, for example like this:
p = clib.getString(...)    # a "char *"
length = clib.strlen(p)    # the standard strlen() function from C
b = unicode(ffi.buffer(p, length), 'utf-8')
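A self-contained sketch of this suggestion next to the ffi.string() route, for Python 3. It assumes a POSIX system, where ffi.dlopen(None) gives access to the standard C library; the "backing" array and the cast are hypothetical stand-ins for clib.getString(...), which is not available here.

```python
import cffi

ffi = cffi.FFI()
ffi.cdef("size_t strlen(const char *s);")
libc = ffi.dlopen(None)  # assumption: POSIX, where None means the standard C library

# Stand-in for clib.getString(...): a NUL-terminated UTF-8 string that is
# reachable only through a "char *".
backing = ffi.new("char[]", "καλημέρα".encode("utf-8"))
p = ffi.cast("char *", backing)

# Route 1: one extra C call to find the length, then decode from C memory.
length = libc.strlen(p)
via_buffer = str(ffi.buffer(p, length), "utf-8")

# Route 2: ffi.string() scans for the NUL and copies into a bytes object,
# which is then decoded.
via_string = ffi.string(p).decode("utf-8")
```

Both routes should produce the same text; which one is cheaper depends on the string length and on the per-call overhead, so it is worth timing them on representative data.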
I've tried that, and the overhead of the second call is more or less equal to the cost of the copy when using ffi.string().

Kind regards,

l.