Undeterministic strxfrm?

Tuomas tuomas.vesterinen at pp.inet.fi
Wed Sep 5 10:27:36 CEST 2007


Gabriel Genellina wrote:
> I think it's not in the issue tracker - see  
> http://xforce.iss.net/xforce/xfdb/34060
> The fix is already in 2.5.1  
> http://www.python.org/download/releases/2.5.1/NEWS.txt

Thanks Gabriel, I'll try Python 2.5.1.

>> Reading rev 54669, it seems to me that the bug is not fixed. The man
>> page says:
>>
>> STRXFRM(3): ... size_t strxfrm(char *dest, const char *src, size_t n);
>> ... The first n characters of the transformed string
>> are placed in dest. The transformation is based on the program’s
>> current locale for category LC_COLLATE.
>> ... The strxfrm() function returns the number of bytes required to
>> store the transformed string in dest excluding the terminating ‘\0’
>> character. If the value returned is n or more, the contents of dest are
>> *indeterminate*.
>>
>> According to the man page, Python should know the size of the result it
>> expects and not trust the size strxfrm returns. I don't completely
>> understand the collation algorithm, but it should offer different levels
>> of collation, so Python should also offer those levels as a second
>> parameter. However, strxfrm doesn't offer any more parameters either,
>> although there is another function, strcasecmp. So Python should be able
>> to calculate the expected size before calling strxfrm or strcasecmp. I
>> don't know how that is possible. Maybe strcoll knows better, and I should
>> drop strxfrm and use strcoll instead. The cost is that the search key
>> has to be converted at every step of the search.
> 
> 
> No. That's why strxfrm is called twice: the first call returns the
> required buffer size, the buffer is resized, and strxfrm is called
> again. That's a rather common sequence when buffer sizes are not known
> in advance.
> [Note that it is `dest` that is indeterminate, NOT the function's return
> value, which is always the required buffer size]
> 
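
For reference, the two-call sequence described above can be sketched from
Python with ctypes. This is only an illustration, not the code from
rev 54669. It assumes a POSIX system where ctypes.CDLL(None) exposes the C
library's strxfrm, and the helper name two_call_strxfrm is made up for this
example.

```python
import ctypes
import locale

# Assumption: on a POSIX system, CDLL(None) gives access to libc symbols.
libc = ctypes.CDLL(None)
libc.strxfrm.restype = ctypes.c_size_t
libc.strxfrm.argtypes = [ctypes.c_char_p, ctypes.c_char_p, ctypes.c_size_t]

# Pin LC_COLLATE to the "C" locale so the sketch does not depend on
# which locales happen to be installed.
locale.setlocale(locale.LC_COLLATE, "C")


def two_call_strxfrm(src):
    """Hypothetical helper: transform the bytes `src` with two strxfrm calls."""
    # First call: n == 0, so nothing is written to dest (NULL is permitted).
    # The return value is the size needed, excluding the terminating NUL.
    needed = libc.strxfrm(None, src, 0)

    # Second call: the buffer has room for the result plus the NUL.
    buf = ctypes.create_string_buffer(needed + 1)
    written = libc.strxfrm(buf, src, needed + 1)

    # Only if the return value were >= the buffer size would the
    # contents of dest be indeterminate.
    if written >= needed + 1:
        raise RuntimeError("transformed string did not fit in the buffer")
    return buf.value
```

The return value of the first call is trusted only as a size. The contents
of the buffer are used only after the second call, once the buffer is known
to be large enough.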

OK, I drew conclusions too quickly from the man page text without knowing
the details.
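
On the strcoll-versus-strxfrm question from earlier in the thread, here is a
small sketch of the trade-off, written for a current Python 3 rather than
2.5. The word list and the "C" locale are assumptions chosen only to keep
the example reproducible. With a real locale, for example a Finnish one, the
resulting order would differ.

```python
import bisect
import functools
import locale

# The "C" locale is used so the example does not depend on installed locales.
locale.setlocale(locale.LC_COLLATE, "C")

words = ["banana", "Cherry", "apple"]

# strxfrm: each word is transformed once, then plain comparisons are used.
by_key = sorted(words, key=locale.strxfrm)

# strcoll: the collation work is repeated for every pairwise comparison.
by_cmp = sorted(words, key=functools.cmp_to_key(locale.strcoll))

# For searching, transform the stored words once and the search key once,
# then bisect over the transformed keys.
keys = [locale.strxfrm(w) for w in by_key]
pos = bisect.bisect_left(keys, locale.strxfrm("banana"))
```

Both sorts should give the same order. With strxfrm the search key is
converted only once per lookup, not at every step of the search.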

Tuomas


