[Python-Dev] Iterable String Redux (aka String ABC)
Bill Janssen
janssen at parc.com
Wed May 28 23:33:17 CEST 2008
> >>> I'm not against this, but so far I've not been able to come up with a
> >>> good set of methods to endow the String ABC with.
> >>
> >> If we stay minimalistic we could consider that the three basic operations that
> >> define a string are:
> >> - testing for substring containment
> >> - splitting on a substring into a list of substrings
> >> - slicing in order to extract a substring
> >
> >> Which gives us ['__contains__', 'split', '__getitem__'], and expands intuitively
> >> to ['__contains__', 'find', 'index', 'split', 'rsplit', '__getitem__'].
>
> With the Sequence ABC, you already get index, __contains__, __len__, count,
> __iter__, and __getitem__. What's the benefit of an additional ABC
> with just three more methods? What can be learned from any known use
> cases for abstract strings (iirc, idlelib has an interesting example of
> subclassing UserString)? Is there anything about this proposal that is
> intrinsically texty?
>
> In the 3.0 world, text is an abstract sequence of code points.
> Would you want to require an encode() method so there will
> always be a way to make it concrete?
I would.
> The split()/rsplit() methods have a complex specification.
> Including them may make it hard to write a compliant class.
> From what's been discussed so far, I don't see any advantage
> of isinstance(o, String) over hasattr(o, 'encode') or somesuch.
Look, even if there were *no* additional methods, it's worth adding
the base class, just to differentiate the class from the Sequence, as
a marker, so that those of us who want to ask "isinstance(o, String)"
can do so.
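
To make the marker idea concrete, here is a minimal sketch. It assumes a
hypothetical String ABC (no such class exists in the standard library) that
adds no methods beyond Sequence and serves only as an isinstance() target.
The flatten() helper is likewise illustrative; it shows the usual reason for
wanting the check, namely treating strings as atoms rather than iterating
over their characters.

```python
from collections.abc import Sequence


class String(Sequence):
    """Hypothetical marker ABC: no methods beyond what Sequence defines."""
    __slots__ = ()


# Register the built-in text type as a virtual subclass of the marker.
String.register(str)


def flatten(obj):
    """Flatten nested sequences, treating anything marked String as an atom."""
    if isinstance(obj, String) or not isinstance(obj, Sequence):
        return [obj]
    out = []
    for item in obj:
        out.extend(flatten(item))
    return out


print(isinstance("abc", String))    # True: str was registered above
print(isinstance([1, 2], String))   # False: a list is only a Sequence
print(flatten(["ab", ["cd", [1, 2]], "e"]))
```

Without the marker, flatten() would have to special-case str by name or probe
for an attribute such as 'encode'; with it, any class registered as (or
derived from) String gets the same treatment.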
Personally, I'd add in all the string methods to that class, in all
their gory complexity. Those who need a compliant class should
subclass the String base class, and override/add what they need.
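
As a rough illustration of "subclass and override what you need", the sketch
below uses collections.UserString as a stand-in for a full-featured String
base class; that substitution is an assumption, since the proposed class does
not exist. The CaseInsensitiveText name is hypothetical. Only __contains__ is
overridden; every other string method is inherited.

```python
from collections import UserString


class CaseInsensitiveText(UserString):
    """Hypothetical string-like class with case-insensitive containment."""

    def __contains__(self, item):
        # Compare lowercased forms; str() also accepts UserString arguments.
        return str(item).lower() in self.data.lower()


s = CaseInsensitiveText("Hello World")

print("WORLD" in s)          # uses the overridden __contains__
print(s.upper())             # inherited; returns another CaseInsensitiveText
print(isinstance(s, str))    # False: the class is string-like but not a str
```

The last line is the case a String ABC would address: the object behaves like
text but fails an isinstance() check against str.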
Bill