[Python-3000] Making strings non-iterable
Nick Coghlan
ncoghlan at gmail.com
Wed Apr 19 12:17:50 CEST 2006
Barry Warsaw wrote:
> On Mon, 2006-04-17 at 16:08 -0700, Brian Harring wrote:
>
>> The issue I'm seeing is that the wart you're pointing at is a general
>> issue not limited to strings- everyone sooner or later has flattening
>> code that hits the "recursively iterate over this container, except
>> for instances of these classes".
>
> I wouldn't want to generalize this, but it is not infrequent that people
> mistakenly iterate over strings when they want to treat them atomically.
> difflib notwithstanding, and keeping Guido's pronouncement in mind, I
> do think people want to treat strings atomically much more often than
> they want to treat them as a sequence of characters.
Raymond brought something like this up on python-dev over a year ago. The last
message in the thread was from me [1], with a proposed itertools.walk function
that provided:
- the choice of depth-first or breadth-first iteration
- a simple stop-list of types not to be iterated over
- protection against infinite recursion due to simple cycles (such as length 1 strings)
- the option to use an iterator factory other than the builtin iter
The last feature allows iterables to be excluded from iteration based on
factors other than an isinstance check or whether or not they're
self-recursive. I'd be inclined to call YAGNI on it, except that it works well
with unbound methods. For example, a tree walking function could be written:
def walk_tree(root, depth_first=True):
    # Walk the tree, yielding all leaf elements
    return itertools.walk(root, depth_first, iter_factory=TreeNode.__iter__)
Unfortunately, the thread fizzled without generating any additional interest.
I don't recall the topic really coming up since then.
Example usage:
Py> seq
[['123', '456'], 'abc', 'abc', 'abc', 'abc', ['xyz']]
Py> list(walk(seq))
['123', '456', 'abc', 'abc', 'abc', 'abc', 'xyz']
Py> list(walk(seq, depth_first=False))
['abc', 'abc', 'abc', 'abc', '123', '456', 'xyz']
Py> list(walk(seq, atomic_types=()))
['1', '2', '3', '4', '5', '6', 'a', 'b', 'c', 'a', 'b', 'c', 'a', 'b', 'c', 'a',
'b', 'c', 'x', 'y', 'z']
Py> list(walk(seq, depth_first=False, atomic_types=()))
['a', 'b', 'c', 'a', 'b', 'c', 'a', 'b', 'c', 'a', 'b', 'c', '1', '2', '3', '4',
'5', '6', 'x', 'y', 'z']
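For anyone who wants to experiment, the following is a minimal sketch of a
walk function that is consistent with the feature list and the example output
above. It is not the implementation from the python-dev post [1]. The default
value of atomic_types, the helper _is_self_cycle, and the way cycles are
detected (a child that is its own parent, or a string that iterates to an
equal string) are all assumptions made for illustration, and the sketch
targets modern Python 3 rather than the Python of 2006.

```python
from collections import deque


def _is_self_cycle(child, parent):
    """Hypothetical helper: True if iterating parent hands back parent again."""
    # A container that directly contains itself, e.g. a list appended to itself.
    if child is parent:
        return True
    # A length-1 string yields an equal string when iterated. Compare by value,
    # since the identity of such strings is an implementation detail.
    return isinstance(child, str) and isinstance(parent, str) and child == parent


def walk(root, depth_first=True, atomic_types=(str, bytes), iter_factory=iter):
    """Yield the leaf items of a nested iterable (illustrative sketch only).

    The default for atomic_types is an assumption, not taken from the post.
    """

    def children_of(item):
        # Return an iterator over item, or None if item should be a leaf.
        if isinstance(item, atomic_types):
            return None
        try:
            return iter(iter_factory(item))
        except TypeError:
            # Not iterable according to iter_factory: treat it as a leaf.
            return None

    top = children_of(root)
    if top is None:
        yield root
        return

    # Each entry pairs a container with its partially consumed iterator.
    pending = deque([(root, top)])
    while pending:
        parent, children = pending.pop() if depth_first else pending.popleft()
        for child in children:
            if _is_self_cycle(child, parent):
                yield child
                continue
            grandchildren = children_of(child)
            if grandchildren is None:
                yield child
            elif depth_first:
                # Suspend the parent and descend into the child first.
                pending.append((parent, children))
                pending.append((child, grandchildren))
                break
            else:
                # Breadth-first: finish this level before visiting the child.
                pending.append((child, grandchildren))
```

Using one deque as either a stack or a queue keeps the depth-first and
breadth-first paths in a single loop. Only direct self-containment is blocked;
longer cycles (a list that contains a list that contains the first list) would
still loop forever in this sketch.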
Cheers,
Nick.
[1] http://mail.python.org/pipermail/python-dev/2005-March/052245.html
--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia
---------------------------------------------------------------
http://www.boredomandlaziness.org