[Tutor] string codes
eryksun
eryksun at gmail.com
Tue Nov 26 18:48:35 CET 2013
On Tue, Nov 26, 2013 at 6:34 AM, Steven D'Aprano <steve at pearwood.info> wrote:
>
> I think that views would be useful for *very large strings*, but very
> large probably means a lot larger than you might think. For small
> strings, say under a few hundred or perhaps even thousand characters,
> making a copy of the substring will probably be faster.
>
> I say "probably", but I'm only guessing, because strings in Python don't
> have views. (Perhaps they should?)
In 2.7 and 3.x, you can use a memoryview for bytes, bytearray, etc.
Unicode strings don't support the new buffer interface. 2.x has a
buffer type, but slicing it yields a raw byte string in the build's
internal encoding (UTF-16 or UTF-32):
>>> b = buffer(u'abcd')
>>> b[:8]
'a\x00\x00\x00b\x00\x00\x00'
>>> b[:8].decode('utf-32')
u'ab'
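For byte-oriented data in 3.x, here is a minimal sketch of what a view buys you over a copy. It assumes a bytearray as the underlying buffer, and the names (data, view, head, copy) are only illustrative. Slicing the memoryview doesn't copy, so a later write to the buffer shows up through the slice:

```python
# Illustrative names; not from the original post.
data = bytearray(b'abcdefgh')
view = memoryview(data)

# Slicing a memoryview gives another view on the same buffer, not a copy.
head = view[:4]

# Slicing the bytearray itself gives an independent copy.
copy = data[:4]

# Mutate the underlying buffer in place.
data[0] = ord('z')

print(head.tobytes())  # b'zbcd' -- the view sees the change
print(bytes(copy))     # b'abcd' -- the copy does not
```

Whether the view is actually faster than the copy for a given size is a separate question that only timing can settle.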
In 3.3, a memoryview can compare strided views:
>>> b = b'a**b**c**d**'
>>> v = memoryview(b)
>>> v[::3].tobytes()
b'abcd'
>>> v[::3] == b'abcd'
True
http://docs.python.org/3.3/library/stdtypes.html#memory-views
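To see what "strided" means here, a short sketch for 3.3 or later that inspects the view from the example above (the name s is only illustrative). The slice keeps pointing at the original bytes and steps over them three at a time:

```python
b = b'a**b**c**d**'
v = memoryview(b)
s = v[::3]              # every third byte, still a view on b

print(s.strides)        # (3,) -- bytes to step between elements
print(len(s))           # 4
print(s.contiguous)     # False -- the elements are not adjacent in memory
print(s.tolist())       # [97, 98, 99, 100]
print(s == b'abcd')     # True -- compared element by element in 3.3+
```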
In earlier versions, memoryview compares the raw bytes and supports
comparison only for contiguous views; comparing a strided view raises
NotImplementedError. For example, in 2.7, with v defined as above:
>>> try: v[::3] == b'abcd'
... except NotImplementedError: print ':-('
...
:-(
http://docs.python.org/3.2/library/stdtypes.html#memoryview-type
http://docs.python.org/2.7/library/stdtypes.html#memoryview-type