[Python-3000] Making strings non-iterable

Nick Coghlan ncoghlan at gmail.com
Wed Apr 19 12:17:50 CEST 2006


Barry Warsaw wrote:
> On Mon, 2006-04-17 at 16:08 -0700, Brian Harring wrote:
> 
>> The issue I'm seeing is that the wart you're pointing at is a general 
>> issue not limited to strings- everyone sooner or later has flattening 
>> code that hits the "recursively iterate over this container, except 
>> for instances of these classes".  
> 
> I wouldn't want to generalize this, but it is not infrequent that people
> mistakenly iterate over strings when they want to treat them atomically.
> difflib notwithstanding, and keeping Guido's pronouncement in mind, I
> do think people want to treat strings atomically much more often than
> they want to treat them as a sequence of characters.

Raymond brought something like this up on python-dev over a year ago. The last 
message in the thread was from me [1], with a proposed itertools.walk function 
that provided:

   - the choice of depth-first or breadth-first iteration
   - a simple stop-list of types not to be iterated over
   - protection against infinite recursion due to simple cycles (such as
     length 1 strings)
   - the option of using an iterator factory other than the builtin iter

The last feature allows iterables to be excluded from iteration based on 
factors other than an isinstance check or whether or not they're 
self-recursive. I'd be inclined to call YAGNI on it, except that it works well 
with unbound methods. For example, a tree walking function could be written:

   def walk_tree(root, depth_first=True):
       # Walk the tree, yielding all leaf elements
       return itertools.walk(root, depth_first, iter_factory=TreeNode.__iter__)
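To make the iter_factory idea concrete, here is a rough, self-contained Python 3 sketch. TreeNode, node_iter and walk_leaves are hypothetical names invented for this illustration; none of them is the proposed itertools.walk. One assumption worth flagging: the unbound-method trick relies on Python 2 raising TypeError when an unbound method is called on an instance of the wrong type. Python 3 has no unbound methods, so the sketch does that check by hand.

```python
class TreeNode:
    """Hypothetical tree node; only instances of this class are descended into."""

    def __init__(self, *children):
        self.children = children

    def __iter__(self):
        return iter(self.children)


def node_iter(obj):
    # Stand-in for passing TreeNode.__iter__ directly: reject anything that
    # is not a TreeNode, as a Python 2 unbound method would have done.
    if not isinstance(obj, TreeNode):
        raise TypeError("not a TreeNode")
    return TreeNode.__iter__(obj)


def walk_leaves(root, iter_factory):
    # Minimal depth-first walker: anything iter_factory rejects is a leaf.
    try:
        children = iter_factory(root)
    except TypeError:
        yield root
        return
    for child in children:
        yield from walk_leaves(child, iter_factory)


tree = TreeNode(TreeNode("ab", [1, 2]), "cd")
leaves = list(walk_leaves(tree, node_iter))
print(leaves)  # ['ab', [1, 2], 'cd']
```

Strings and lists come out whole because the factory refuses to iterate over them, with no stop-list or cycle check involved.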

Unfortunately, the thread fizzled without generating any additional interest. 
I don't recall the topic really coming up since then.

Example usage:

Py> seq
[['123', '456'], 'abc', 'abc', 'abc', 'abc', ['xyz']]
Py> list(walk(seq))
['123', '456', 'abc', 'abc', 'abc', 'abc', 'xyz']
Py> list(walk(seq, depth_first=False))
['abc', 'abc', 'abc', 'abc', '123', '456', 'xyz']
Py> list(walk(seq, atomic_types=()))
['1', '2', '3', '4', '5', '6', 'a', 'b', 'c', 'a', 'b', 'c', 'a', 'b', 'c', 'a',
   'b', 'c', 'x', 'y', 'z']
Py> list(walk(seq, depth_first=False, atomic_types=()))
['a', 'b', 'c', 'a', 'b', 'c', 'a', 'b', 'c', 'a', 'b', 'c', '1', '2', '3', '4',
   '5', '6', 'x', 'y', 'z']
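For anyone who wants to reproduce the session above without digging up [1], the following is a minimal Python 3 sketch of a function with the same behaviour. It is an assumption-laden reconstruction, not the code from the linked message: the parameter names follow the calls shown above, the default stop-list is taken to be (str,), and the cycle check is simplified to cover only the length 1 string case.

```python
from collections import deque


def walk(iterable, depth_first=True, atomic_types=(str,), iter_factory=iter):
    """Yield the leaf items of a nested iterable (illustrative sketch only)."""

    def children(item):
        # Return an iterator over item's children, or None if item is a leaf.
        if isinstance(item, atomic_types):
            return None  # on the stop-list
        if isinstance(item, str) and len(item) == 1:
            return None  # a length 1 string would iterate to itself forever
        try:
            # iter_factory is assumed to return an iterator, as iter() does
            return iter_factory(item)
        except TypeError:
            return None  # not iterable at all

    if depth_first:
        stack = [iter_factory(iterable)]
        while stack:
            for item in stack[-1]:
                sub = children(item)
                if sub is None:
                    yield item
                else:
                    stack.append(sub)  # descend now, resume the parent later
                    break
            else:
                stack.pop()  # current iterator is exhausted
    else:
        queue = deque([iter_factory(iterable)])
        while queue:
            for item in queue.popleft():
                sub = children(item)
                if sub is None:
                    yield item
                else:
                    queue.append(sub)  # visit after the current level


seq = [['123', '456'], 'abc', 'abc', 'abc', 'abc', ['xyz']]
print(list(walk(seq)))
print(list(walk(seq, depth_first=False)))
```

The breadth-first branch yields a leaf as soon as it is seen and only defers the containers, which is why the 'abc' entries come out ahead of '123' in the second call.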

Cheers,
Nick.

[1] http://mail.python.org/pipermail/python-dev/2005-March/052245.html

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://www.boredomandlaziness.org

