[Python-Dev] OOps (was: No 1.6! (was Re: A REALLY COOL PYTHON
FEATURE:))
Christian Tismer
tismer@tismer.com
Mon, 22 May 2000 14:40:51 +0200
Hi, I'm back from White Russia (yup, a survivor) :-)
Tim Peters wrote:
>
> [Christian Tismer]
> > ...
> > Then a string should better not be a sequence.
> >
> > The number of places where I really used the string sequence
> > protocol to take advantage of it is outnumbered by a factor
> > of ten by cases where I forgot to tupleise and got a bad
> > result. A traceback is better than a sequence here.
>
> Alas, I think
>
>     for ch in string:
>         muck w/ the character ch
>
> is a common idiom.
Sure.
And now for my proposal:
Strings should be strings, but not sequences.
Slicing is ok, and it will always yield strings.
Indexing would either
  a - yield nothing but an exception, or
  b - yield integers instead of 1-char strings
The above idiom would read like this:
Version a: Access string elements via a coercion like tuple() or list():
    for ch in tuple(string):
        muck w/ the character ch
Version b: Access string elements as integer codes:
    for c in string:
        # either:
        ch = chr(c)
        muck w/ the character ch
        # or:
        muck w/ the character code c
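As a rough illustration of the two versions (a sketch, not part of the proposal itself): in today's Python the bytes type happens to behave like version b, since indexing and iteration yield integers while slicing still yields bytes, and tuple() on a str gives the explicit coercion of version a. The names text, data, chars_a, codes, chars_b and head are made up for this example.

```python
text = "abc"

# Version a: ask for the elements explicitly via a coercion.
chars_a = list(tuple(text))

# Version b: bytes objects yield integer codes on indexing and iteration.
data = b"abc"
codes = list(data)
chars_b = [chr(c) for c in codes]

# Slicing still yields the same type, never integers.
head = data[0:1]

print(chars_a, codes, chars_b, head)
```

Note that in both versions the loop body has to say what it wants, either elements or character codes, which is the point of the proposal.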
> > oh-what-did-I-say-here--duck--but-isn't-it-so--cover-ly y'rs - chris
>
> The "sequenceness" of strings does get in the way often enough. Strings
> have the amazing property that, since characters are also strings,
>
>     while 1:
>         string = string[0]
>
> never terminates with an error. This often manifests as unbounded recursion
> in generic functions that crawl over nested sequences (the first time you
> code one of these, you try to stop the recursion on a "is it a sequence?"
> test, and then someone passes in something containing a string and it
> descends forever). And we also have that
>
> format % values
>
> requires "values" to be specifically a tuple rather than any old sequence,
> else the current
>
> "%s" % some_string
>
> could be interpreted the wrong way.
>
> There may be some hope in that the "for/in" protocol is now conflated with
> the __getitem__ protocol, so if Python grows a more general iteration
> protocol, perhaps we could back away from the sequenceness of strings
> without harming "for" iteration over the characters ...
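To make the recursion problem concrete, here is a minimal sketch (the function name flatten and the sample data are invented for illustration). A generic crawler has to special-case strings, because a 1-char string is itself a sequence whose only element is that same string. The last line shows the related tuple special case of the % operator.

```python
def flatten(item):
    """Return the leaves of an arbitrarily nested sequence as a flat list."""
    # Without this guard the recursion never bottoms out on a string,
    # since "a"[0] == "a" is again a sequence.
    if isinstance(item, (str, bytes)):
        return [item]
    try:
        children = iter(item)
    except TypeError:
        # Not iterable at all: a plain leaf such as an int.
        return [item]
    leaves = []
    for child in children:
        leaves.extend(flatten(child))
    return leaves


# Indexing a non-empty string any number of times still gives a string.
s = "hello"
for _ in range(1000):
    s = s[0]

nested = [1, ["ab", (2, [3])], "c"]
print(flatten(nested), repr(s))

# The % operator special-cases tuples for the same reason.
print("%s" % "abc", "%s and %s" % ("a", "b"))
```

If strings were not sequences, the isinstance guard would be unnecessary: the "is it iterable?" test alone would stop the descent.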
O-K!
We seem to have reached a similar conclusion: it would be better if
strings were not sequences after all. How to achieve this is
a problem, of course.
Oh, there is another idiom possible!
How about this, after we have the new string methods :-)
    for ch in string.split():
        muck w/ the character ch
Ok, in the long term, we need to rethink iteration of course.
ciao - chris
--
Christian Tismer :^) <mailto:tismer@appliedbiometrics.com>
Applied Biometrics GmbH : Have a break! Take a ride on Python's
Kaunstr. 26 : *Starship* http://starship.python.net
14163 Berlin : PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF
where do you want to jump today? http://www.stackless.com