On 28-May-08, at 5:44 PM, Greg Ewing wrote:
> Mike Klaas wrote:
> > In my perfect world, strings would be indicable
> > and sliceable, but
> An object that was indexable but not iterable would be a very
> strange thing. If it has __len__ and __getitem__, there's nothing
> to stop you iterating over it by hand anyway, so disallowing
> __iter__ would just seem perverse.
Python has a beautiful abstraction in iteration: iter() is a generic
function that allows you to lazily consume a sequence of objects, whether
it be a list, tuple, custom iterator, generator, or what have you.
It is trivial to write your code to be agnostic to the type of
iterable passed in. Almost anything else a consumer of your code
passes in will result in an immediate exception.
Unfortunately, Python has two extremely common data types which do not
fail when this generic function is applied to them, and which instead
almost always return a result that is not desired: iteration proceeds
over the characters of the string, a behaviour which is rarely needed
in practice due to the wealth of string methods available.
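To make the complaint concrete, here is a minimal sketch (the helper name first_items is made up for illustration). A generic consumer built on iter() rejects a non-iterable immediately, but quietly accepts a string and hands back characters:

```python
def first_items(iterable, n=2):
    """Return up to the first n items of any iterable, consumed lazily."""
    it = iter(iterable)  # raises TypeError for non-iterables such as 42
    result = []
    for _ in range(n):
        try:
            result.append(next(it))
        except StopIteration:
            break
    return result


print(first_items(["spam", "eggs", "ham"]))  # ['spam', 'eggs']
print(first_items("spam"))                   # ['s', 'p'] -- no error raised

try:
    first_items(42)
except TypeError as exc:
    print("rejected:", exc)
```

The integer fails loudly at the call site; the string fails silently, somewhere downstream.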
I agree that it would be perverse to disallow iterating over a
string. I just wish that the way to do that wasn't glommed on to the
As it stands, any consumer of iterables has to keep strings in mind.
It is particularly irksome when the target input is an iterable of
strings. I recall a function that accepts a list/iterable of item
keys, hashes them, and then retrieves values based on the item hashes
(usually over the network, so it is necessary to batch requests).
This function is often used in the interactive interpreter, and it is
thus very prone to being passed a string rather than a list. There
was no good way to prevent the (frequent) mysterious "not found"
errors save adding an explicit type check for basestring.
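For illustration, the workaround looks roughly like the sketch below. The function name batch_keys, the choice of SHA-256, and the batch size are all assumptions made for this example, and the network retrieval step is left out. It is written for Python 3, where the check is against str and bytes; in Python 2 it would be against basestring.

```python
import hashlib


def batch_keys(keys, batch_size=2):
    """Hash each key and group the hashes into batches.

    Hypothetical stand-in for the function described above; the actual
    retrieval over the network is omitted.
    """
    # A lone string is iterable, but here it is almost certainly a mistake,
    # so reject it up front instead of hashing its individual characters.
    if isinstance(keys, (str, bytes)):
        raise TypeError("expected an iterable of keys, got a single string")
    hashes = [hashlib.sha256(k.encode("utf-8")).hexdigest() for k in keys]
    return [hashes[i:i + batch_size] for i in range(0, len(hashes), batch_size)]


print(len(batch_keys(["item1", "item2", "item3"])))  # 2 batches

try:
    batch_keys("item1")  # the typical interactive slip
except TypeError as exc:
    print("rejected:", exc)
```

Without the isinstance check, the second call would hash "i", "t", "e", "m", "1" separately and report five keys as not found.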
Strings already behave slightly differently from the way other
sequences act: a string is the only sequence for which 'seq in seq' is
true, and the only sequence for which 'x in seq' can be true but
'any(x==item for item in seq)' is false. Abstractions are sometimes
imperfect: this is why there is an explicit typecheck for strings in
the sum() builtin.
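These differences are easy to check at the prompt. The snippet below is only a small demonstration, using a list for contrast; the variable names are arbitrary:

```python
s = "abc"
lst = [1, 2, 3]

# Substring semantics: a string contains itself, a list does not.
print(s in s)      # True
print(lst in lst)  # False

# 'x in seq' can hold even though no single item of seq equals x.
print("ab" in s)                        # True
print(any("ab" == item for item in s))  # False

# sum() refuses strings via an explicit type check.
try:
    sum(["a", "b"], "")
except TypeError as exc:
    print("sum() rejected strings:", exc)
```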
I'll stop here as I realize that the likelihood that this will be
accepted is terribly small, especially considering the late stage of
the process. But I would be willing to develop a patch that
implements this behaviour on the off chance it is.