[Python-3000] string C API
Marcin 'Qrczak' Kowalczyk
qrczak at knm.org.pl
Sat Sep 16 11:53:51 CEST 2006
Greg Ewing <greg.ewing at canterbury.ac.nz> writes:
> That places a burden on all creators of strings to ensure
> that they are in the minimal format, which could be
> inconvenient for some operations, e.g. taking a substring
> could require making an extra pass to re-code the data.
Yes, but taking a substring already takes time linear in the
length of the substring.
Allocating a string from a C array of wide characters (which
determines the format from the contents) will be written once and
called as a function.
Most strings are ASCII, so most of the time there is no need to check
whether the substring could become even narrower.
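A minimal sketch of the idea, not an actual or proposed CPython API: the
NarrowStr type and the functions minimal_width, str_alloc, str_from_wide,
str_get and str_substring are hypothetical names invented for illustration.
It assumes three widths (1, 2 or 4 bytes per character) and that code points
below 0x100 fit the 1-byte form. The constructor scans the wide array once to
pick the width; the substring routine skips that scan when the source is
already in the narrowest form.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical string object; width is bytes per character (1, 2 or 4). */
typedef struct {
    size_t length;
    int width;
    void *data;
} NarrowStr;

/* Smallest width able to hold every code point in chars[0..n). */
static int minimal_width(const uint32_t *chars, size_t n)
{
    uint32_t max = 0;
    for (size_t i = 0; i < n; i++)
        if (chars[i] > max)
            max = chars[i];
    return max < 0x100 ? 1 : max < 0x10000 ? 2 : 4;
}

/* Store chars[0..n) at the given width; the caller guarantees they fit. */
static NarrowStr *str_alloc(const uint32_t *chars, size_t n, int width)
{
    NarrowStr *s = malloc(sizeof *s);
    if (s == NULL)
        return NULL;
    s->data = malloc(n ? n * (size_t)width : 1);
    if (s->data == NULL) {
        free(s);
        return NULL;
    }
    s->length = n;
    s->width = width;
    for (size_t i = 0; i < n; i++) {
        if (width == 1)
            ((uint8_t *)s->data)[i] = (uint8_t)chars[i];
        else if (width == 2)
            ((uint16_t *)s->data)[i] = (uint16_t)chars[i];
        else
            ((uint32_t *)s->data)[i] = chars[i];
    }
    return s;
}

/* Written once: the format is determined from the contents. */
NarrowStr *str_from_wide(const uint32_t *chars, size_t n)
{
    return str_alloc(chars, n, minimal_width(chars, n));
}

/* Read character i regardless of the storage width. */
static uint32_t str_get(const NarrowStr *s, size_t i)
{
    if (s->width == 1)
        return ((const uint8_t *)s->data)[i];
    if (s->width == 2)
        return ((const uint16_t *)s->data)[i];
    return ((const uint32_t *)s->data)[i];
}

/* Copy out a substring in canonical (minimal) width; NULL on bad range
   or allocation failure. Uses a temporary buffer for simplicity. */
NarrowStr *str_substring(const NarrowStr *s, size_t start, size_t len)
{
    if (start > s->length || len > s->length - start)
        return NULL;
    uint32_t *tmp = malloc(len ? len * sizeof *tmp : 1);
    if (tmp == NULL)
        return NULL;
    for (size_t i = 0; i < len; i++)
        tmp[i] = str_get(s, start + i);
    /* A 1-byte source cannot become narrower, so skip the extra scan. */
    int width = s->width == 1 ? 1 : minimal_width(tmp, len);
    NarrowStr *r = str_alloc(tmp, len, width);
    free(tmp);
    return r;
}
```

Both the copy and the optional rescan are linear in the substring length, so
re-deriving the width does not change the asymptotic cost of taking a
substring.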
> It would also preclude the possibility of representing
> a substring as a view.
If views were implemented on the level of C pointers, then views would
not have the property of being in the canonical representation wrt.
character width. It's still valuable I think to use a more compact
representation if it would affect most strings.
> I don't see any great advantage given by this restriction
Keeping the canonical representation is not very important. It just
ensures that the advantage of having a more compact representation
is taken as often as possible, even if the string has been cut from
another string which contained a wide character.
__("< Marcin Kowalczyk
\__/ qrczak at knm.org.pl