[Python-3000] Making strings non-iterable

Thu Apr 13 20:01:04 CEST 2006

I propose that strings (unicode/text) shouldn't be iterable.  Seeing this:

<ul>
  <li> i
  <li> t
  <li> e
  <li> m
  <li>
  <li> 1
</ul>

a few too many times... it's annoying.  Instead, I propose that strings 
get a list-like view on their characters.  Oh synergy!

Thus you would do:

   for c in a_string.chars():
       print c

This view would have the full complement of list methods (like .count(), 
.index(), etc), and would not have string methods (like .upper()).
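
To make the idea concrete, here is a rough sketch of what such a view might look like, written in present-day Python 3 syntax.  The names CharsView and chars() are hypothetical (strings have no .chars() method, so this uses a free function), and the view is read-only, so it only covers the non-mutating list methods:

```python
from collections.abc import Sequence


class CharsView(Sequence):
    """Hypothetical read-only, list-like view over a string's characters."""

    def __init__(self, text):
        self._text = text

    def __len__(self):
        return len(self._text)

    def __getitem__(self, index):
        # Slicing a view yields another view, not a bare string.
        if isinstance(index, slice):
            return CharsView(self._text[index])
        return self._text[index]

    def __repr__(self):
        return "CharsView(%r)" % self._text


def chars(text):
    """Stand-in for the proposed a_string.chars() method (an assumption)."""
    return CharsView(text)


view = chars("banana")
for c in view:
    print(c)
```

The Sequence base class supplies .count(), .index(), iteration and membership tests from __len__ and __getitem__ alone, while string methods such as .upper() are simply absent from the view.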

Iterating over strings causes frequent hard bugs (bad data, as opposed 
to exceptions, which make for easy bugs), as the bug can manifest itself 
far from its origin.  Also, strings aren't containers.  Because 
Python has no characters, only strings, strings look like 
they contain strings, and those strings in turn contain themselves.  It 
just doesn't make sense.  And it is because a string and the characters 
it contains are interchangeable (they are both strings) that the 
resulting bugs can persist without exceptions.
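
A small illustration of how this kind of bug arises, in present-day Python 3 syntax.  The function render_list() is a made-up example, not code from any real project; it expects a list of strings but is handed a single string by mistake:

```python
def render_list(items):
    """Render each item as an HTML list entry (hypothetical helper)."""
    lines = ["<ul>"]
    for item in items:
        lines.append("  <li> " + item)
    lines.append("</ul>")
    return "\n".join(lines)


# Intended use: a list of strings.
good = render_list(["item 1", "item 2"])

# Mistaken use: a bare string.  No exception is raised; each character
# is itself a string, so the loop body accepts it without complaint.
bad = render_list("item 1")
print(bad)

# A one-character string contains itself as its only element.
c = "i"
assert c[0] == c and c[0][0] == c
```

The second call produces one list entry per character, and nothing in the function can tell the difference between a string and one of its own characters.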

Should bytes be iterable as well?  Because bytes (the container) and 
integers are not interchangeable, the problems that occur with strings 
seem much less likely, and the container-like nature of bytes is 
clearer.  So I don't propose that this affect bytes in any way.
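
For comparison, here is how the same mistake plays out with bytes in Python 3 as it was eventually released, where iterating over a bytes object yields integers.  The helper render_entry() is invented for the example:

```python
def render_entry(item):
    """Build one list entry from a bytes item (hypothetical helper)."""
    return b"  <li> " + item


data = b"item 1"

# Iterating bytes yields integers, not length-1 bytes objects.
elements = list(data)
print(elements)

# Passing an element where a bytes object is expected fails immediately
# instead of silently producing bad data.
try:
    render_entry(elements[0])
    failed = False
except TypeError:
    failed = True
print(failed)
```

Because the element type differs from the container type, the error surfaces at the point of misuse rather than far from it.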

Questions:

* .chars() doesn't return characters; should it be named something else?

* Should it be a method that is called?  dict.keys() has a legacy, but 
this does not.  There is presumably very little overhead to getting this 
view.  However, symmetry with the only other views we are considering 
(dictionary views) would indicate it should be a method.  Also, there 
are no attributes on strings currently.

* Are there other views on strings?  Can string->byte encoding be 
usefully seen as a view in some cases?

-- 
Ian Bicking  /  ianb at colorstudy.com  /  http://blog.ianbicking.org