[Python-Dev] Iterable String Redux (aka String ABC)

Raymond Hettinger python at rcn.com
Wed May 28 23:17:11 CEST 2008


>>> I'm not against this, but so far I've not been able to come up with a
>>> good set of methods to endow the String ABC with.
>> 
>> If we stay minimalistic we could consider that the three basic operations that
>> define a string are:
>> - testing for substring containment
>> - splitting on a substring into a list of substrings
>> - slicing in order to extract a substring
>>
>> Which gives us ['__contains__', 'split', '__getitem__'], and expands intuitively
>> to ['__contains__', 'find', 'index', 'split', 'rsplit', '__getitem__'].

With the Sequence ABC, you already get index, __contains__, __len__, count,
__iter__, and __getitem__.  What's the benefit of an additional ABC
with just three more methods?  What can be learned from any known use
cases for abstract strings?  (IIRC, idlelib has an interesting example of
subclassing UserString.)  Is there anything about this proposal that is
intrinsically texty?
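
As a rough illustration of what Sequence already provides, here is a
minimal sketch.  CodePoints is a hypothetical class invented for this
example, and the import uses the modern collections.abc location.  Only
__getitem__ and __len__ are written by hand; the rest comes from the
ABC's mixin methods.

```python
from collections.abc import Sequence


class CodePoints(Sequence):
    """Hypothetical text-like class storing code points (illustration only)."""

    def __init__(self, text):
        self._points = [ord(ch) for ch in text]

    def __getitem__(self, i):
        # Slices return a new CodePoints; integer indexes return one character.
        if isinstance(i, slice):
            return CodePoints("".join(chr(p) for p in self._points[i]))
        return chr(self._points[i])

    def __len__(self):
        return len(self._points)


s = CodePoints("banana")
has_n = "n" in s          # __contains__ mixin, element-wise test
first_a = s.index("a")    # index mixin
n_count = s.count("a")    # count mixin
chars = list(s)           # __iter__ mixin, built on __getitem__
has_an = "an" in s        # no substring semantics: compares single elements
```

Note that the inherited __contains__ compares individual elements, so the
substring test from the quoted proposal is one thing Sequence does not
give you.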

In the 3.0 world, text is an abstract sequence of code points.  
Would you want to require an encode() method so there will
always be a way to make it concrete?

The split()/rsplit() methods have a complex specification.
Including them may make it hard to write a compliant class.
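
A few calls on the built-in str show some of the cases a compliant
class would have to reproduce.  This is only a sample of the behaviour,
not a full specification; the variable names are illustrative.

```python
# No separator: runs of whitespace collapse and no empty strings appear.
whitespace_default = "a  b".split()

# Explicit separator: every occurrence splits, so empty strings can appear.
explicit_sep = "a  b".split(" ")

# The empty string behaves differently in the two modes.
empty_default = "".split()
empty_explicit = "".split(",")

# maxsplit with the whitespace mode strips leading whitespace
# but leaves the unsplit remainder untouched.
limited = " a b c ".split(None, 1)

# rsplit counts its splits from the right-hand end.
from_right = "a,b,c".rsplit(",", 1)
```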

From what's been discussed so far, I don't see any advantage
of isinstance(o, String) over hasattr(o, 'encode') or somesuch.
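
To make the comparison concrete, here is a sketch under stated
assumptions: String below is a hypothetical ABC (no such class exists in
the standard library) whose only requirement is an encode() method, with
a __subclasshook__ so that any class defining encode() passes the
isinstance() test.  Rope is likewise an invented example class.

```python
from abc import ABC, abstractmethod


class String(ABC):
    """Hypothetical String ABC requiring only encode() (assumption)."""

    @abstractmethod
    def encode(self, encoding="utf-8", errors="strict"):
        ...

    @classmethod
    def __subclasshook__(cls, C):
        # Accept any class that defines encode() somewhere in its MRO.
        if cls is String:
            if any("encode" in B.__dict__ for B in C.__mro__):
                return True
        return NotImplemented


class Rope:
    """Invented text-like class that never inherits from String."""

    def __init__(self, *pieces):
        self.pieces = pieces

    def encode(self, encoding="utf-8", errors="strict"):
        return "".join(self.pieces).encode(encoding, errors)


def is_texty_by_abc(o):
    return isinstance(o, String)


def is_texty_by_attr(o):
    return hasattr(o, "encode")


samples = ["abc", Rope("ab", "c"), b"abc", [1, 2], 42]
abc_answers = [is_texty_by_abc(o) for o in samples]
attr_answers = [is_texty_by_attr(o) for o in samples]
```

With an ABC this thin, the two checks agree on every sample, which is
the point of the question above.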


Raymond






