[Python-3000] Droping find/rfind?
Ron Adam
rrr at ronadam.com
Fri Aug 25 20:59:46 CEST 2006
Nick Coghlan wrote:
> Fredrik Lundh wrote:
>> Nick Coghlan wrote:
>>
>>>> Nick Coghlan wrote:
>>>>
>>>>> With a variety of "view types", that work like the corresponding builtin type,
>>>>> but reference the original data structure instead of creating copies
>>>> support for string views would require some serious interpreter surgery, though,
>>>> and probably break quite a few extensions...
>>> Why do you say that?
>> because I happen to know a lot about how Python's string types are
>> implemented ?
>
> I believe you're thinking about something far more sophisticated than what I'm
> suggesting. I'm just talking about a Python data type in a standard library
> module that trades off slower performance with smaller strings (due to extra
> method call overhead) against improved scalability (due to avoidance of
> copying strings around).
>
>>> make a view of it
>> so to make a view of a string, you make a view of it ?
>
> Yep - by using all those "start" and "stop" optional arguments to builtin
> string methods to implement the methods of a string view in pure Python. By
> creating the string view all you would really be doing is a partial
> application of start and stop arguments on all of the relevant string methods.
>
> I've included an example below that just supports __len__, __str__ and
> partition(). The source object survives for as long as the view does - the
> idea is that the view should only last while you manipulate the string, with
> only real strings released outside the function via return statements or yield
> expressions.
>>> self.source = "%s" % source
I think this should be:

    self.source = source

Otherwise you are making copies of the source, which is what you
are trying to avoid. I'm not sure whether Python would reuse the
self.source string, but I wouldn't count on it.
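
To illustrate the point, here is a minimal sketch (the class name
StrViewSketch is made up for this example and is not Nick's code). It
stores the source by plain assignment, so the view holds a reference to
the very same string object rather than a formatted copy:

```python
class StrViewSketch:
    """Hypothetical, pared-down string view that keeps a reference only."""

    def __init__(self, source, start=None, stop=None):
        # Plain assignment: no new string is built here.
        self.source = source
        self.start = 0 if start is None else start
        self.stop = len(source) if stop is None else stop

    def __str__(self):
        # A real (small) string is only created when one is asked for.
        return self.source[self.start:self.stop]


big = "x" * 1000000
view = StrViewSketch(big, 10, 20)

# The view shares the original object; nothing was copied.
assert view.source is big
```

Whether "%s" % source hands back the same object is an implementation
detail, which is why the plain assignment is the safer choice.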
It might be nice if slice objects could be used in more ways in Python.
That may work in most cases where you would want a string view.
An example of a slice version of partition would be (not tested):
    def slice_partition(s, sep, sub_slice=None):
        if sub_slice is None:
            sub_slice = slice(len(s))
        found_slice = find_slice(s, sep, sub_slice)
        prefix_slice = slice(sub_slice.start, found_slice.start)
        rest_slice = slice(found_slice.stop, sub_slice.stop)
        return ( prefix_slice,
                 found_slice,
                 rest_slice )

    # implementation of find_slice left to readers.
    def find_slice(s, sub, sub_slice=None):
        ...
        return found_slice
Of course this isn't needed for short strings, but it might be worthwhile
when used with very long strings.
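
For anyone who wants something runnable, here is one possible way to fill
in find_slice, as a sketch only. Two things are assumptions of mine and
not part of the outline above: when the separator is missing, find_slice
returns an empty slice at the end of the searched range (so the result
mirrors str.partition's "whole, empty, empty"), and the incoming slice is
normalized with slice.indices() so that None bounds are handled.

```python
def find_slice(s, sub, sub_slice=None):
    """Return a slice covering the first occurrence of sub in s[sub_slice].

    Assumption: if sub is not found, return an empty slice positioned at
    the end of the searched range.
    """
    if sub_slice is None:
        sub_slice = slice(None)
    start, stop, step = sub_slice.indices(len(s))
    if step != 1:
        raise ValueError("only contiguous slices are supported")
    pos = s.find(sub, start, stop)
    if pos == -1:
        return slice(stop, stop)
    return slice(pos, pos + len(sub))


def slice_partition(s, sep, sub_slice=None):
    """Like str.partition, but return three slices into s, not copies."""
    if sub_slice is None:
        sub_slice = slice(None)
    # Normalize once so both helpers agree on concrete bounds.
    start, stop, _ = sub_slice.indices(len(s))
    found = find_slice(s, sep, slice(start, stop))
    return slice(start, found.start), found, slice(found.stop, stop)


s = 'foo{spam eggs}bar'
prefix, found, rest = slice_partition(s, '{')
print(s[prefix], s[found], s[rest])

# Keep working on the remainder without creating intermediate strings.
first, found2, rest2 = slice_partition(s, ' ', rest)
print(s[first], s[rest2])
```

No substring is created until a slice is actually applied to s, so the
copying cost is only paid for the pieces that are finally used.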
> # Simple string view example
> class strview(object):
>      def __new__(cls, source, start=None, stop=None):
>          self = object.__new__(cls)
>          self.source = "%s" % source
>          self.start = start if start is not None else 0
>          self.stop = stop if stop is not None else len(source)
>          return self
>      def __str__(self):
>          return self.source[self.start:self.stop]
>      def __len__(self):
>          return self.stop - self.start
>      def partition(self, sep):
>          _src = self.source
>          try:
>              startsep = _src.index(sep, self.start, self.stop)
>          except ValueError:
>              # Separator wasn't found!
>              return self, _NULL_STR, _NULL_STR
>          # Return new views of the three string parts
>          endsep = startsep + len(sep)
>          return (strview(_src, self.start, startsep),
>                  strview(_src, startsep, endsep),
>                  strview(_src, endsep, self.stop))
>
> _NULL_STR = strview('')
>
> def splitview(s):
>      rest = strview(s)
>      while 1:
>          prefix, found, rest = rest.partition("{")
>          if prefix:
>              yield (None, str(prefix))
>          if not found:
>              break
>          first, found, rest = rest.partition(" ")
>          if not found:
>              break
>          second, found, rest = rest.partition("}")
>          if not found:
>              break
>          yield (str(first), str(second))
>
> >>> list(splitview('foo{spam eggs}bar{foo bar}'))
> [(None, 'foo'), ('spam', 'eggs'), (None, 'bar'), ('foo', 'bar')]