On 6/6/2014 4:53 AM, Hrvoje Niksic wrote:
On 06/04/2014 05:52 PM, Mark Lawrence wrote:
Out of idle curiosity is there anything that stops MicroPython, or any other implementation for that matter, from providing views of a string rather than copying every time? IIRC memoryviews in CPython rely on the buffer protocol at the C API level, so since strings don't support this protocol you can't take a memoryview of them. Could this actually be implemented in the future, is the underlying C code just too complicated, or what?
Memory view of Unicode strings is controversial for two reasons:
1. It exposes the internal representation of the string. If memoryviews of strings were supported in Python 3, PEP 393 would not have been possible (without breaking that feature).
2. Even if it were OK to expose the internal representation, it might not be what the users expect. For example, memoryview("Hrvoje") would return a view of a 6-byte buffer, while memoryview("Nikšić") would return a view of a 12-byte UCS-2 buffer. The user of a memory view might expect to get UCS-2 (or UCS-4, or even UTF-8) in all cases.
An implementation that decided to export strings as memory views might be forced to make a decision about internal representation of strings, and then stick to it.
The byte objects don't have these issues, which is why in Python 2.7 memoryview("foo") works just fine, as does memoryview(b"foo") in Python 3.
The other problem is that a small slice view of a large object keeps the large object alive, so a view user needs to think carefully about whether to make a copy or create a view, and later to copy views to delete the base object. This is not for beginners. -- Terry Jan Reedy