[Python-ideas] strings as iterables - from str.startswith taking any iterator instead of just tuple

Fri Jan 3 11:19:35 CET 2014

On 01/03/2014 04:54 AM, Alexander Heger wrote:
>> By designing an API that doesn't require such overloading.
>>
>> On Thursday, January 2, 2014, Alexander Heger wrote:
>>>
>>>>>     isinstance(x, Iterable) and not isinstance(x, str)
>>>>
>>>> If you find yourself typing that a lot I think you have a bigger problem
>>>> though.
>>>
>>> How do you replace this?
>
> for my applications this seemed the most natural way - have the method
> deal with what it is fed, which could be strings or any kind of
> collections or iterables of strings.  But never would I want to
> disassemble strings into characters.  From the previous message I
> gather that I am not the only one with this application case.
>
> Generally, I find strings being iterables of characters as useful as
> if integers were iterables of bits.  They should just be units.  They
> already start out being not mutable.  I think it would be a positive
> design change for Python 4 to make them units instead of being
> iterables.  At least for me, there is much fewer applications where
> the latter is useful than where it requires extra code.  Overall, it
> makes the language less clean that a string is an iterable; a special
> case we always have to code around.
>
> I know it will break a lot of existing code, but so did the string
> change from py2 to 3.  (It would break very few of my codes, though.)

I agree there is an occasional need, which I have also met in real code: it was 
parse result data, which can be a string (terminal patterns, which really "eat" 
part of the source) or a list (or other "true" iterable collection, for composite 
or repetitive patterns). But the case is rare because it requires a coincidence 
of conditions:
* both strings and collections may come as input
* both are valid, from the point of view of the app's logic
* one wants to iterate collections, but not strings
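
A minimal sketch of such a case, assuming a hypothetical parse result in which 
terminals are plain strings and composite patterns are nested lists (the name 
leaves() and the sample data are made up for illustration, not taken from any 
real parser):

```python
from collections.abc import Iterable


def leaves(result):
    """Yield the matched strings of a nested parse result (hypothetical shape)."""
    if isinstance(result, str):
        # Terminal pattern: the string is a unit, never split into characters.
        yield result
    elif isinstance(result, Iterable):
        # Composite or repetitive pattern: descend into each child.
        for child in result:
            yield from leaves(child)
    else:
        raise TypeError(f"unexpected parse result: {type(result).__name__}")


tree = ["let", ["x", "="], ["1", ["+", "2"]]]
print(list(leaves(tree)))  # ['let', 'x', '=', '1', '+', '2']
```

The string test has to come first: a one-character string iterates to itself, 
so without it the recursion would never terminate.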

On the other hand, I find that you much too quickly dismiss the real and very 
common need to iterate strings (over their lowest units, code points), apparently 
on the sole basis that in your own programming practice you don't need/want it.

We should not make iterating strings a special case (e.g. by requiring an explicit 
call to an iterator, as in "for ucode in s.ucodes()"), because the case is so 
common. Instead, we may consider finding a way to exclude strings in some 
collection traversal idiom (for which I have no good proposal: the obvious one 
would be .items(), but it's already used with a different meaning), which would 
for instance raise an exception on strings because they don't match the idiom 
("str object has no 'items' attribute").
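
One way such an idiom could be approximated today is with a helper function 
rather than a method. This is only a sketch: the name elements() is 
hypothetical, and rejecting bytes along with str is an extra assumption:

```python
def elements(collection):
    """Return an iterator over a collection, refusing strings (hypothetical helper)."""
    if isinstance(collection, (str, bytes)):
        # Strings do not match the "collection traversal" idiom, so fail loudly.
        raise TypeError(
            f"{type(collection).__name__!r} object is not a collection of items"
        )
    return iter(collection)


for name in elements(["spam", "eggs"]):
    print(name)

try:
    elements("spam")
except TypeError as exc:
    print("rejected:", exc)
```

Plain "for c in s" keeps working on strings as before; only code that opts into 
elements() gets the exception.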

Denis