re: Sets BOF / for in dict

I've spoken with Barbara Fuller (IPC9 org.); the two openings for a BOF on sets are breakfast or lunch on Wednesday the 7th. I'd prefer breakfast (less chance of me missing my flight :-); is there anyone who's interested in attending who *can't* make that time, but *could* make lunch? And meanwhile:
Ka-Ping Yee: - the key:value syntax suggested by Guido (i like it quite a lot)
Greg Wilson: Did another quick poll; feeling here is that if for key:value in dict: works, then: for index:value in sequence: would also be expected to work. If the keys to the dictionary are (for example) 2-element tuples, then: for (left, right):value in dict: would also be expected to work, just as: for ((left, right), value) in dict.items(): now works. Question: would the current proposal allow NumPy arrays (just as an example) to support both: for index:value in numPyArray: where 'index' would get tuples like '(0, 3, 2)' for a 3D array, *and* for (i, j, k):value in numPyArray: If so, then yeah, it would tidy up a fair bit of my code... Thanks, Greg
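For comparison, both loops Greg describes can be approximated in ordinary Python without new syntax. This is a minimal sketch, assuming a modern Python: `enumerate` was only added later (Python 2.3), and the variable names are purely illustrative.

```python
# Tuple keys can already be unpacked in the loop target when iterating items().
edges = {("a", "b"): 1, ("b", "c"): 2}
unpacked = []
for (left, right), value in edges.items():
    unpacked.append((left, right, value))

# The "index:value" pairing over a sequence corresponds to enumerate().
letters = ["x", "y", "z"]
indexed = []
for index, value in enumerate(letters):
    indexed.append((index, value))
```

The proposed `key:value` spelling would move the same pairing into the `for` statement itself instead of going through a method or function call.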

On Sun, Feb 04, 2001 at 09:19:47AM -0500, Greg Wilson wrote:
There is no real technical reason for it not to work. From a grammar point of view, for left, right:value in dict: would also work fine. (the grammar would be: 'for' exprlist [':' exprlist] 'in' testlist: and since there can't be a colon inside an exprlist, it's not ambiguous.) The main problem is whether you *want* that to work :) -- Thomas Wouters <thomas@xs4all.net> Hi! I'm a .signature virus! copy me into your .signature file to help me spread!

Fine with me.
Yes, that's all non-controversial.
That's up to the NumPy array! Assuming that we introduce this together with iterators, the default NumPy iterator could be made to iterate over all three index sets simultaneously; there could be other iterators to iterate over a selection of index sets (e.g. to iterate over the rows). However the iterator can't be told what form the index has. --Guido van Rossum (home page: http://www.python.org/~guido/)
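The point that the iterator cannot be told what form the index has can be illustrated without NumPy. The sketch below uses plain nested lists; the helper `index_value_pairs` and its `shape` argument are hypothetical stand-ins, not any real array API. The iterator always yields an index tuple, and it is the loop target that decides whether to unpack it.

```python
from itertools import product

def index_value_pairs(nested, shape):
    """Yield (index_tuple, value) for a nested-list 'array' of the given shape."""
    for index in product(*(range(n) for n in shape)):
        value = nested
        for i in index:
            value = value[i]          # descend one dimension per index component
        yield index, value

grid = [[[0, 1], [2, 3]], [[4, 5], [6, 7]]]   # shape (2, 2, 2)

# Same iterator, two different loop targets.
whole = [(index, value) for index, value in index_value_pairs(grid, (2, 2, 2))]
split = [(i, j, k, value) for (i, j, k), value in index_value_pairs(grid, (2, 2, 2))]
```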

Greg Wilson wrote:
Depends on the time frame of "breakfast" ;-)
Two things: 1. the proposed syntax key:value does away with the easy to parse Python block statement syntax 2. why can't we use the old 'for x,y,z in something:' syntax and instead add iterators to the objects in question ? for key, value in object.iterator(): ... this doesn't only look better, it also allows having different iterators for different tasks (e.g. to iterate over values, key, items, row in a matrix, etc.) -- Marc-Andre Lemburg ______________________________________________________________________ Company: http://www.egenix.com/ Consulting: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/

[MAL]
Depends on the time frame of "breakfast" ;-)
Does this mean you'll be at the conference? That would be excellent!
This should become the PEP. I propose that we try to keep this discussion off python-dev, and that the PEP author(s?) set up a separate discussion list (e.g. at egroups) to keep the PEP feedback coming. I promise I'll subscribe to such a list. --Guido van Rossum (home page: http://www.python.org/~guido/)

On Mon, 5 Feb 2001, M.-A. Lemburg wrote:
Oh, come on. Slices and dictionary literals use colons too, and there's nothing wrong with that. Blocks are introduced by a colon at the *end* of a line.
Because there's no good answer for "what does iterator() return?" in this design. (Trust me; i did think this through carefully.) Try it. How would you implement the iterator() method? The PEP *is* suggesting that we add iterators to the objects -- just not that we explicitly call them. In the 'for' loop you've written, iterator() returns a sequence, not an iterator. -- ?!ng

Ka-Ping Yee wrote:
Slices and dictionaries enclose the two parts in brackets -- this places the colon into a visible context. for ... in ... : does not provide much of a context.
The .iterator() method would have to return an object which provides an iterator API (at C level to get the best performance). For dictionaries, this object could carry the needed state (current position in the dictionary table) and use the PyDict_Next() for the internals. Matrices would have to carry along more state (one integer per dimension) and could access the internal matrix representation directly using C functions. This would give us: speed, flexibility and extensibility which the syntax hacks cannot provide; e.g. how would you specify to iterate backwards over a sequence using that notation or diagonal for a matrix ?
No, it should return a forward iterator. -- Marc-Andre Lemburg ______________________________________________________________________ Company: http://www.egenix.com/ Consulting: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/

On Mon, 5 Feb 2001, M.-A. Lemburg wrote:
For crying out loud! '\':' requires that you tokenize the string before you know that the colon is part of the string. Triple-quotes force you to tokenize carefully too. There is *nothing* that this stay-away-from-colons argument buys you. For a human skimming over source code -- i repeat, the visual hint is "colon at the END of a line".
Okay, provide an example. Write this iterator() method in Python. Now answer: how does 'for' know whether the thing to the right of 'in' is an iterator or a sequence?
This is already exactly what the PEP proposes for the implementation of sq_iter.
This would give us: speed, flexibility and extensibility which the syntax hacks cannot provide;
The PEP is not just about syntax hacks. It's an iterator protocol. It's clear that you haven't read it. *PLEASE* read the PEP before continuing to discuss it. I quote:
| Rationale
|
| If all the parts of the proposal are included, this addresses many
| concerns in a consistent and flexible fashion. Among its chief
| virtues are the following three -- no, four -- no, five -- points:
|
| 1. It provides an extensible iterator interface.
|
| 2. It resolves the endless "i indexing sequence" debate.
|
| 3. It allows performance enhancements to dictionary iteration.
|
| 4. It allows one to provide an interface for just iteration
|    without pretending to provide random access to elements.
|
| 5. It is backward-compatible with all existing user-defined
|    classes and extension objects that emulate sequences and
|    mappings, even mappings that only implement a subset of
|    {__getitem__, keys, values, items}.
I can take out the Monty Python jokes if you want. I can add more jokes if that will make you read it. Just read it, i beg you.
No differently from what you are suggesting, at the surface: for item in sequence.backwards(): for item in matrix.diagonal(): The difference is that the thing on the right of 'in' is always considered a sequence-like object. There is no ambiguity and no magic rule for deciding when it's a sequence and when it's an iterator. -- ?!ng "There's no point in being grown up if you can't be childish sometimes." -- Dr. Who

Ka-Ping Yee wrote:
Oh well, perhaps you are right and we should call things like key:value association and be done with it.
Simple: have the for-loop test for a type slot and have it fallback to __getitem__ in case it doesn't find the slot API.
Sorry, Ping, I didn't know you have a PEP for iterators already. ...reading it...
Done. Didn't know it exists, though (why isn't the PEP# in the subject line ?). Even after reading it, I still don't get the idea behind adding "Mapping Iterators" and "Sequence Iterators" when both of these are only special implementations of the single "Iterator" interface. Since the object can have multiple methods to construct iterators, all you need is *one* iterator API. You don't need a slot which returns an iterator object -- leave that decision to the programmer, e.g. you can have:
    for key in dict.xkeys():
    for value in dict.xvalues():
    for items in dict.xitems():
    for entry in matrix.xrow(1):
    for entry in matrix.xcolumn(2):
    for entry in matrix.xdiag():
    for i,element in sequence.xrange():
All of these method calls return special iterators for one specific task and all of them provide a slot which is callable without argument and yields the next element of the iteration. Iteration is terminated by raising an IndexError just like with __getitem__. Since for-loops can check for the type slot, they can use an optimized implementation which avoids the creation of temporary integer objects and leave the state-keeping to the iterator which can usually provide a C based storage for it with much better performance. Note that with this kind of interface, there is no need to add "Mapping Iterators" or "Sequence Iterators" as special cases, since these are easily implemented using the above iterators.
-- Marc-Andre Lemburg ______________________________________________________________________ Company: http://www.egenix.com/ Consulting: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/
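One way to picture the task-specific iterators described in this message is sketched below. It is only an illustration under stated assumptions: the method name `nextitem` stands in for the proposed C-level slot, and the `Matrix`, `DiagonalIterator`, and `drain` names are hypothetical. Each iterator carries its own state and signals the end by raising IndexError, as the message suggests.

```python
class DiagonalIterator:
    """Walks the main diagonal of a square list-of-lists matrix."""

    def __init__(self, rows):
        self._rows = rows
        self._pos = 0              # all iteration state lives in the iterator

    def nextitem(self):            # hypothetical Python spelling of the slot
        if self._pos >= len(self._rows):
            raise IndexError("iteration finished")
        value = self._rows[self._pos][self._pos]
        self._pos += 1
        return value


class Matrix:
    def __init__(self, rows):
        self.rows = rows

    def xdiag(self):
        """Return a fresh iterator for one specific task."""
        return DiagonalIterator(self.rows)


def drain(iterator):
    """Call nextitem() until IndexError, collecting the results."""
    items = []
    while True:
        try:
            items.append(iterator.nextitem())
        except IndexError:
            return items


diagonal = drain(Matrix([[1, 2], [3, 4]]).xdiag())
```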

On 5 Feb 2001, M.-A. Lemburg wrote:
For the third time: write an example, please. It will help a lot.
Sorry, Ping, I didn't know you have a PEP for iterators already.
I posted it on this very boutique (i mean, mailing list) a week ago and messages have been going back and forth on its thread since then. On 31 Jan 2001, Ka-Ping Yee wrote: | Okay, i have written a draft PEP that tries to combine the | "elt in dict", custom iterator, and "for k:v" issues into a | coherent proposal. Have a look: | | http://www.lfw.org/python/pep-iterators.txt | http://www.lfw.org/python/pep-iterators.html Okay. I apologize for my impatient tone, as it comes from the expectation that anyone would have read the document before trying to discuss it. I am very happy to get *new* information, the discovery of new errors in my thinking, better and interesting arguments; it's just that it's exasperating to see arguments repeated that were already made, or objections raised that were already carefully thought out and addressed. From now on, i'll stop resisting the urge to paste the text of proposals inline (instead of politely posting just URLs) so you won't miss them.
Done. Didn't know it exists, though (why isn't the PEP# in the subject line ?).
It didn't have a number at the time i posted it. Thank you for updating the subject line.
Three points:
1. We have syntactic support for mapping creation and lookup, and syntactic support for mapping iteration should mirror it.
2. IMHO
       for key:value in dict:
   is much easier to read and explain than
       for (key, value) in dict.xitems():
   (Greg? Could you test this claim with a survey question?) To the newcomer, the former is easy to understand at a surface level. The latter exposes the implementation (an implementation that is still there in PEP 234, but that the programmer only has to worry about if they are going deeper and writing custom iteration behaviour). This separates the work of learning into two small, digestible pieces.
3. Furthermore, this still doesn't solve the backward-compatibility problem that PEP 234 takes great care to address! If you write your for-loops
       for (key, value) in dict.xitems():
   then you are screwed if you try to replace dict with any kind of user-implemented dictionary-like replacement (since you'd have to go back and implement the xitems() method on everything). If, in order to maintain compatibility with the existing de-facto dictionary interface, you write your for-loops
       for (key, value) in dict.items():
   then now you are screwed if dict is a built-in dictionary, since items() is supposed to construct a list, not an iterator.
These are fine, since matrices are not core data types with syntactic support or a de-facto emulation protocol.
for i,element in sequence.xrange():
This is just as bad as the xitems() issue above -- probably worse -- since nobody implements xrange() on sequence-like objects, so now you've broken compatibility with all of those. We want this feature to smoothly extend and work with existing objects with a minimum of rewriting, ideally none. PEP 234 achieves this ideal.
This statement, i believe, is orthogonal to both proposals.
I think this really just comes down to one key difference between our points of view here. Correct me if you disagree: You seem to be suggesting that we should only consider a protocol for sequences, whereas PEP 234 talks about both sequences and mappings. I argue that consideration for mappings is worthwhile because:
1. Dictionaries are a built-in type with syntactic and core implementation support.
2. Iteration over dictionaries is very common and should be spelled in an easily understood fashion.
3. Both sequence and mapping protocols are formalized in the core (with PySequenceMethods and PyMappingMethods).
4. Both sequence and mapping protocols are documented and used in Python (__getitem__, keys, values, etc.).
5. There are many, many sequence-like and mapping-like objects out there, implemented both in Python and in C, which adhere to these protocols.
(There is also the not-insignificant side benefit of finally having a decent way to get the indices while you're iterating over a sequence, which i've wanted fairly often.) -- ?!ng

Ka-Ping Yee wrote:
Ping, what do you need an example for ? The above sentence says it all:
    for x in obj: ...
This will work as follows:
1. if obj exposes the iteration slot, say tp_nextitem, the for loop will call this slot without argument and assign the returned object to x
2. if obj does not expose tp_nextitem, then the for loop will construct an integer starting at 0 and pass this to the sq_item slot or __getitem__ method and assign the returned value to x; the integer is then replaced with an incremented integer
3. both techniques work until the slot or method in question raises an IndexError exception
The current implementation doesn't have 1. This is the only addition it takes to get iterators to work together well with the for-loop -- there are no backward compatibility issues here, because the tp_nextitem slot will be a new one. Since the for-loop can avoid creating temporary integers, iterations will generally run a lot faster than before. Also, iterators have access to the object's internal representation, so data access is also faster.
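The three steps above can be sketched in Python roughly as follows. This is an emulation under assumptions, not the interpreter's actual loop: the attribute name `nextitem` is a stand-in for the proposed `tp_nextitem` slot, and `for_each` and `Countdown` are hypothetical names introduced only for illustration.

```python
def for_each(obj):
    """Generator emulating the proposed for-loop dispatch."""
    nextitem = getattr(obj, "nextitem", None)   # stand-in for tp_nextitem
    if nextitem is not None:
        # Step 1: call the slot without arguments until it raises IndexError.
        while True:
            try:
                yield nextitem()
            except IndexError:
                return
    # Step 2: fall back to __getitem__ with 0, 1, 2, ...
    index = 0
    while True:
        try:
            yield obj[index]
        except IndexError:          # Step 3: IndexError ends either variant
            return
        index += 1


class Countdown:
    """Object that only exposes the next-item slot."""

    def __init__(self, start):
        self._n = start

    def nextitem(self):
        if self._n <= 0:
            raise IndexError
        self._n -= 1
        return self._n + 1


from_slot = list(for_each(Countdown(3)))
from_getitem = list(for_each([10, 20, 30]))
```

Existing sequences take the second path unchanged, which is the backward-compatibility argument made in the message.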
I must have missed those postings... don't have time to read all of python-dev anymore :-(
Tuples are well-known basic Python types. Why should (key,value) be any harder to understand than key:value? What would you tell a newbie that writes: for key:value in sequence: .... where sequence is a list of tuples and finds that this doesn't work ? Besides, the items() method has been around for ages, so switching from .items() to .xitems() in programs will be just as easy as switching from range() to xrange(). I am -0 on the key:value thingie. If you want it as a way to construct or split associations, fine. But it is really not necessary to be able to iterate over dictionaries.
Why is that ? You'd just have to add .xitems() to UserDict and be done with it. This is how we have added new dictionary methods all along. I don't see your point here. Sure, if you want to use a new feature you will have to think about whether it can be used with your data-types. What you are trying to do here is maintain forward compatibility at the cost of making iteration much more complicated than it really is.
I'm not breaking backward compatibility -- the above will still work like it has before since lists don't have the tp_nextitem slot.
Again, you are trying to achieve forward compatibility. If people want better performance, then they will have to add new functionality to their types -- one way or another.
No. I'm suggesting to add a low-level "give me the next item in the bag" and move the "how to get the next item" logic into an iterator object. This will still allow you to iterate over sequences and mappings, so I don't understand why you keep arguing for adding new syntax and slots to be able to iterate over dictionaries.
Agreed. I'd suggest to implement generic iterators which implement your suggestions and put them into the builtins or a special iterator module...
    from iterators import xitems, xkeys, xvalues
    for key, value in xitems(dict):
    for key in xkeys(dict):
    for value in xvalues(dict):
Other objects can then still have their own iterators by exposing special methods which construct special iterators. The for-loop will continue to work as always and happily accept __getitem__ compatible or tp_nextitem compatible objects as right-hand argument. -- Marc-Andre Lemburg ______________________________________________________________________ Company: http://www.egenix.com/ Consulting: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/
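A rough sketch of what such generic helpers might look like is given below. It assumes modern Python generators, which did not exist at the time of this thread (they arrived in Python 2.2), and it relies only on `keys()` and `__getitem__`. The `iterators` module named in the message is hypothetical, and `TinyMap` is an illustrative stand-in for a user-written mapping.

```python
def xkeys(mapping):
    """Yield keys one at a time for anything exposing keys()."""
    for key in mapping.keys():
        yield key


def xvalues(mapping):
    """Yield values, looked up through __getitem__."""
    for key in mapping.keys():
        yield mapping[key]


def xitems(mapping):
    """Yield (key, value) pairs using only keys() and __getitem__."""
    for key in mapping.keys():
        yield key, mapping[key]


class TinyMap:
    """Minimal mapping-like class: no items(), no xitems()."""

    def __init__(self, data):
        self._data = dict(data)

    def keys(self):
        return list(self._data)

    def __getitem__(self, key):
        return self._data[key]


pairs_from_dict = sorted(xitems({"a": 1, "b": 2}))
pairs_from_tiny = sorted(xitems(TinyMap({"a": 1, "b": 2})))
```

Because the helpers are free functions, they work on mapping-like objects that never implemented an `xitems()` method of their own.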

On Tue, 6 Feb 2001, M.-A. Lemburg wrote:
*sigh* I give up. I'm not going to ask again. Real examples are a good idea when considering any proposal. (a) When you do a real example, you usually discover mistakes or things you didn't think of in your design. (b) We can compare it directly to other examples to see how easy or hard it is to write and understand code that uses the new protocol. (c) We can come up with interesting cases in practice to see if there are limitations in any proposal. Now that you have a proposal in slightly more detail, a few missing pieces are evident. How would you implement a *Python* class that supports iteration? For instance, write something that has the effect of the FileLines class in PEP 234. How would you implement an object that can be iterated over more than once, at the same time or at different times? It's not clear to me how the single tp_nextitem slot can handle that.
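The concern about iterating more than once can be made concrete with a small sketch. It assumes, purely for illustration, that the next-item slot lives on the container itself and keeps its cursor there; `Deck`, `nextitem`, and `loop` are hypothetical names. This is one reading of the objection, not a description of either proposal's final design.

```python
class Deck:
    """Container that keeps the iteration cursor on itself."""

    def __init__(self, cards):
        self._cards = cards
        self._pos = 0

    def nextitem(self):            # hypothetical stand-in for tp_nextitem
        if self._pos >= len(self._cards):
            raise IndexError
        card = self._cards[self._pos]
        self._pos += 1
        return card


def loop(obj):
    """Drive the slot until IndexError, as a for-loop would."""
    while True:
        try:
            yield obj.nextitem()
        except IndexError:
            return


deck = Deck(["A", "K", "Q"])

# Nested loops share one cursor, so the inner loop consumes the outer one's items.
nested = [(a, b) for a in loop(deck) for b in loop(deck)]

# A later loop over the same object finds the cursor already exhausted.
second_pass = list(loop(deck))
```

With three cards one would expect nine pairs from the nested loop. Keeping the cursor in a separate iterator object created per loop avoids both problems.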
Again, completely orthogonal to both proposals. Regardless of the protocol, if you're implementing the iterator in C, you can use raw integers and internal access to make it fast.
It's mainly the business of calling the method and rearranging the data that i'm concerned about.
Example 1:
    dict = {1: 2, 3: 4}
    for (key, value) in dict.items():
Explanation: The "items" method on the dict converts {1: 2, 3: 4} into a list of 2-tuples, [(1, 2), (3, 4)]. Then (key, value) is matched against each item of this list, and the two parts of each tuple are unpacked.
Example 2:
    dict = {1: 2, 3: 4}
    for key:value in dict:
Explanation: The "for" loop iterates over the key:value pairs in the dictionary, which you can see are 1:2 and 3:4.
"key:value doesn't look like a tuple, does it?"
It's not the same. xrange() is a built-in function that you call; xitems() is a method that you have to *implement*.
...and cgi.FieldStorage, and dumbdbm._Database, and rfc822.Message, and shelve.Shelf, and bsddbmodule, and dbmmodule, and gdbmmodule, to name a few. Even if you expect (or force) people to derive all their dictionary-like Python classes from UserDict (which they don't, in practice), you can't derive C objects from UserDict.
What i mean is that Python programmers would no longer know how to write their 'for' loops. Should they use 'xitems', thus dooming their loop never to work with the majority of user-implemented mapping-like objects? Or should they use 'items', thus dooming their loop to run inefficiently on built-in dictionaries?
Okay, i agree, it's forward compatibility. But it's something worth going for when you're trying to come up with a protocol. -- ?!ng "There's no point in being grown up if you can't be childish sometimes." -- Dr. Who

Ka-Ping Yee wrote:
I was just throwing in ideas, not a complete proposal. If that's what you want I can write up a complete proposal too and maybe even a patch to go with it. Exposing the tp_nextitem slot in Python classes via a __nextitem__ slot wouldn't be much of a problem. What I wanted to get across is the general idea behind my view of an iteration API and I believe that this idea has been made clear: I want a low-level API and move all the complicated object specific details into separate iterator objects. I don't see a point in trying to add complicated machinery to Python just to be able to iterate fast over some of the builtin types by special casing each object type. Let's please not add more special cases to the core.
Put all that logic into the iterator objects. These can be as complicated as needed, either trying to work in generic ways, special cased for some builtin types or be specific to a single type.
Again, if you prefer the key:value notation, fine. This is orthogonal to the iteration API though and really only touches the case of mappings.
You can put all that special logic into special iterators, e.g. a xitems iterator (see the end of my post).
The same applies to your proposed interface: people will have to write new code in order to be able to use the new technology. I don't see that as a problem, though.
Hey, people who care will be aware of this difference. It is very easy to test for interfaces in Python, so detecting the best method (in case it matters) is simple.
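Testing for the interface, as suggested here, might look like the following sketch. The `xitems()` method is the hypothetical lazy variant discussed in this thread, and `iter_items` and `LazyMap` are illustrative names only.

```python
def iter_items(mapping):
    """Prefer a lazy xitems() when the object has one, else fall back to items()."""
    xitems = getattr(mapping, "xitems", None)
    if callable(xitems):
        return xitems()
    return mapping.items()


class LazyMap:
    """Mapping-like object that offers the hypothetical xitems() method."""

    def __init__(self, data):
        self._data = dict(data)

    def xitems(self):
        return iter(self._data.items())


from_lazy = list(iter_items(LazyMap({"a": 1})))
from_plain = list(iter_items({"a": 1}))
```

This keeps a single loop working on both kinds of object, at the cost of routing every loop through a helper, which is the trade-off the two sides are arguing about.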
Sure, but is adding special cases everywhere really worth it ?

"M.-A. Lemburg" wrote:
Ka-Ping Yee wrote:
<snip/>
Sorry about sneaking in. I do in fact think that the syntax addition of key:value is easier to understand. Beginners know the { key:value } syntax, so this is just natural. Giving the newbie an error in your above example is a step toward clarity, avoiding hard-to-find errors if somebody has a list of tuples and the above happens to work somehow, although he forgot to use .xitems().
It has been around for years, but key:value might be better. A little faster for sure since we don't build extra tuples.
You really wouldn't stick with UserDict, but implement this on every object for speed. The key:value proposal is not only stronger through its extra syntactical strength, it is also smaller in code-size to implement. Having to force every "iterable" object to support a modified view of it via xitems() doesn't even look elegant to me. It forces key/value pairs to go through tupleization only for syntactical reasons. A weakness, not a strength. Object orientation reaches its limits here. If access to keys and values can be provided by a single implementation for all affected objects without adding new methods, this suggests to me that it is right to do so. +1 on key:value - ciao - chris -- Christian Tismer :^) <mailto:tismer@tismer.com> Mission Impossible 5oftware : Have a break! Take a ride on Python's Kaunstr. 26 : *Starship* http://starship.python.net 14163 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF where do you want to jump today? http://www.stackless.com

Christian Tismer wrote:
The problem is that key:value in sequence has a meaning under PEP234: key is the current index, value the tuple.
Small tuples are cheap and kept on the free list. I don't even think that key:value can do without them. Anyway, I've already said that I'm -0 on these thingies -- I would be +1 if Ping were to make key:value full flavoured associations (Jim Fulton has written a lot about these some years ago; I think they originated from SmallTalk).
...but it's a special case which we don't really need and it *only* works for mappings and then only if the mapping supports the new slots and methods required by PEP234. I don't buy the argument that PEP234 buys us fast iteration for free. Programmers will still have to write the code to implement the new slots and methods.
Hey, tuples are created for *every* function call, even C calls -- you can't be serious about getting much of a gain here ;-) -- Marc-Andre Lemburg ______________________________________________________________________ Company: http://www.egenix.com/ Consulting: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/

"M.-A. Lemburg" wrote:
Why is this a problem? It is just fine.
a) I don't see the point of telling me about Python's implementation, other than hair-splitting. Speed is not the point, it will just be faster. b) I think it can. But the point is the cleaner syntax which unambiguously gets you keys and values, whenever the thing on the right can be indexed.
I didn't read that yet. Would it contradict Ping's version or could it be extended later? ...
You are reducing my arguments to speed always, not me. I don't care about a tuple. But I think we can save code. Smaller *and* not slower is what I like. no offence - ly y'rs - chris -- Christian Tismer :^) <mailto:tismer@tismer.com> Mission Impossible 5oftware : Have a break! Take a ride on Python's Kaunstr. 26 : *Starship* http://starship.python.net 14163 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF where do you want to jump today? http://www.stackless.com

Christian Tismer wrote:
I'm not telling you (I know you know ;), but others on this list who may not be aware of this fact.
Ping's version would hide this detail under the cover: dictionaries would sort of implement the sequence protocol and then return associations. I don't think this is much of a problem though.
At the cost of:
* special casing the for-loop implementation for sequences, mappings
* adding half a dozen new slots and methods
* moving all the complicated details into the for-loop implementation instead of keeping them in separate modules or object specific implementations
Perhaps we are just discussing the wrong things: I believe that Ping's PEP could easily be implemented on top of my idea (or vice-versa depending on how you look at it) of what the iteration interface should look like. -- Marc-Andre Lemburg ______________________________________________________________________ Company: http://www.egenix.com/ Consulting: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/

OK, now here's the hard one. Clearly,
    (a) for i in someList:
has to continue to mean "iterate over the values". We've agreed that:
    (b) for k:v in someDict:
means "iterate through the items". (a) looks like a special case of (b). I therefore asked my colleagues to guess what:
    (c) for x in someDict:
did. They all said, "Iterates through the _values_ in the dict", by analogy with (a). I then asked, "How do you iterate through the keys in a dict, or the indices in a list?" They guessed:
    (d) for x: in someContainer:
(note the colon trailing the iterator variable name). I think that the combination of (a) and (b) implies (c), which leads in turn to (d). Is this what we want? I gotta say, when I start thinking about how many problems my students are going to bring me when accidentally adding or removing a colon in the middle of a 'for' statement changes the iteration space from keys to values, I start feeling queasy... Thanks, Greg

On Mon, 5 Feb 2001, Greg Wilson wrote:
The PEP explicitly proposes that (c) be an error, because i anticipated and specifically wanted to avoid this ambiguity. Have you had a good look at it? I think your survey shows that the PEP made the right choices. That is, it supports the position that if 'for key:value' is supported, then 'for key:' and 'for :value' should be supported, but 'for x in dict:' should not. It also shows that 'for index:' should be supported on sequences, which the PEP suggests. -- ?!ng

[GVW]
[Ping]
But then we should review the wisdom of using "if x in dict" as a shortcut for "if dict.has_key(x)" again. Everything is tied together! --Guido van Rossum (home page: http://www.python.org/~guido/)

But then we should review the wisdom of using "if x in dict" as a shortcut for "if dict.has_key(x)" again. Everything is tied together!
yeah, don't forget unpacking assignments: assert len(dict) == 3 { k1:v1, k2:v2, k3:v3 } = dict Cheers /F

On Mon, 5 Feb 2001, Fredrik Lundh wrote:
I think this is a total non-issue for the following reasons: 1. Recall the original philosophy behind the list/tuple split. Lists and dicts are usually variable-length homogeneous structures, and therefore it makes sense for them to be mutable. Tuples are usually fixed-length heterogeneous structures, and so it makes sense for them to be immutable and unpackable. 2. In all the Python programs i've ever seen or written, i've never known or expected a dictionary to have a particular fixed length. 3. Since the items come back in random order, there's no point in binding individual ones to individual variables. It's only ever useful to iterate over the key/value pairs. In short, i can't see how anyone would ever want to do this. (Sorry for being the straight man, if you were in fact joking...) -- ?!ng "There's no point in being grown up if you can't be childish sometimes." -- Dr. Who

On Mon, 5 Feb 2001, Guido van Rossum wrote:
Okay. Here's the philosophy; i'll describe my thinking more explicitly. Presumably we can all agree that if you ask to iterate over things "in" a sequence, you clearly want the items in the sequence, not their integer indices. You care about the data *you* put in the container. In the case of a list, you care about the items more than these additional integers that got supplied as a result of using an ordered data structure. So the meaning of for thing in sequence: is pretty clear. The meaning of for thing in mapping: is less clear, since both the keys and the values are interesting data to you. If i ask you to "get me all the things in the dictionary", it's not so obvious whether you should get me a list of just the words, just the definitions, or both (probably both, i suppose). But, if i ask you to "find 'aardvark' in the dictionary" or i ask you "is 'aardvark' in the dictionary?" it's completely obvious what i mean. "if key in dict:" makes sense both by this analogy to common use, and by an argument from efficiency (even the most rudimentary understanding of how a dictionary works is enough to see why we look up keys rather than values). In fact, we *call* it a dictionary because it works like a real dictionary: it's designed for data lookup in one direction, from key to value. "if thing in container" is about *finding* something specific. "for thing in container" is about getting everything out. Now, i know this isn't the strongest argument in the world, and i can see the potential objection that the two aren't consistent, but i think it's a very small thing that only has to be explained once, and then is easy to remember and understand. I consider this little difference less of an issue than the hasattr/has_key inconsistency that it will largely replace. We make expectations clear: for item in sequence: continues to mean, "i expect a sequence", exactly as it does now. When not given a sequence, the 'for' loop complains. 
Nothing could break, as the interpretation of this loop is unchanged. These three forms:
    for k:v in anycontainer:
    for k: in anycontainer:
    for :v in anycontainer:
mean: "i am expecting any indexable thing, where ctr[k] = v". As far as the syntax goes, that's all there is to it:
    for item in sequence:       # only on sequences
    for k:v in anycontainer:    # get keys and values on anything
    for k: in anycontainer:     # just keys
    for :v in anycontainer:     # just values
-- ?!ng "There's no point in being grown up if you can't be childish sometimes." -- Dr. Who
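The stated meaning, "any indexable thing, where ctr[k] = v", can be approximated with a helper function in present-day Python. This is only a sketch of the semantics and not the mechanism PEP 234 describes; `pairs` is a hypothetical name, and distinguishing mappings by the presence of `keys()` is an assumption made for the example.

```python
def pairs(container):
    """Yield (k, v) such that container[k] == v, for mappings and sequences."""
    if hasattr(container, "keys"):
        # Mapping-like: the keys are whatever keys() reports.
        for k in container.keys():
            yield k, container[k]
    else:
        # Sequence-like: the keys are the integer indices.
        for k in range(len(container)):
            yield k, container[k]


words = {"aardvark": "an animal"}
letters = ["a", "b"]

both = list(pairs(words)) + list(pairs(letters))   # like "for k:v in ..."
only_keys = [k for k, _ in pairs(letters)]          # like "for k: in ..."
only_values = [v for _, v in pairs(words)]          # like "for :v in ..."

# Membership tests look at keys, matching the "find 'aardvark'" analogy.
found = "aardvark" in words
```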

On Mon, Feb 05, 2001 at 02:22:50PM -0500, Greg Wilson wrote:
OK, now here's the hard one. Clearly,
No shit. I ran into all of this while trying to figure out how to quick-hack implement it. My brain exploded while trying to grasp all implications, which is why I've been quiet on this issue -- I'm healing ;-P
(a) for i in someList: has to continue to mean "iterate over the values". We've agreed that:
(b) for k:v in someDict: means "iterate through the items". (a) looks like a special case of (b).
I'm still not sure if I like the special syntax to iterate over dictionaries. Are we talking about iterators, or about special syntax to use said iterators in the niche application of dicts and mapping interfaces ? :)
I therefore asked my colleagues to guess what:
(c) for x in someDict:
did. They all said, "Iterates through the _values_ in the dict", by analogy with (a).
But how baffled were they when it didn't do what they expected it to do ? Did they go, 'oh shit, now what' ?
> I then asked, "How do you iterate through the keys in a dict, or the indices in a list?" They guessed:
> (d) for x: in someContainer:
Again, how baffled were they when you said it wasn't going to work ? Because (c) and (d) are just very light syntactic powdered sugar substitutes for 'k:v' where you just don't use one of the two. The extra name binding operation isn't going to cost you enough to really worry about, IMHO. -- Thomas Wouters <thomas@xs4all.net> Hi! I'm a .signature virus! copy me into your .signature file to help me spread!

Greg Wilson wrote:
Depends on the time frame of "breakfast" ;-)
Two things:

1. the proposed syntax key:value does away with the easy to parse Python block statement syntax

2. why can't we use the old 'for x,y,z in something:' syntax and instead add iterators to the objects in question ?

    for key, value in object.iterator():
        ...

this doesn't only look better, it also allows having different iterators for different tasks (e.g. to iterate over values, key, items, row in a matrix, etc.)

--
Marc-Andre Lemburg
______________________________________________________________________
Company: http://www.egenix.com/
Consulting: http://www.lemburg.com/
Python Pages: http://www.lemburg.com/python/

[MAL]
Depends on the time frame of "breakfast" ;-)
Does this mean you'll be at the conference? That would be excellent!
This should become the PEP. I propose that we try to keep this discussion off python-dev, and that the PEP author(s?) set up a separate discussion list (e.g. at egroups) to keep the PEP feedback coming. I promise I'll subscribe to such a list. --Guido van Rossum (home page: http://www.python.org/~guido/)

On Mon, 5 Feb 2001, M.-A. Lemburg wrote:
Oh, come on. Slices and dictionary literals use colons too, and there's nothing wrong with that. Blocks are introduced by a colon at the *end* of a line.
Because there's no good answer for "what does iterator() return?" in this design. (Trust me; i did think this through carefully.) Try it. How would you implement the iterator() method? The PEP *is* suggesting that we add iterators to the objects -- just not that we explicitly call them. In the 'for' loop you've written, iterator() returns a sequence, not an iterator. -- ?!ng

Ka-Ping Yee wrote:
Slices and dictionaries enclose the two parts in brackets -- this places the colon into a visible context. for ... in ... : does not provide much of a context.
The .iterator() method would have to return an object which provides an iterator API (at C level to get the best performance). For dictionaries, this object could carry the needed state (current position in the dictionary table) and use the PyDict_Next() for the internals. Matrices would have to carry along more state (one integer per dimension) and could access the internal matrix representation directly using C functions. This would give us: speed, flexibility and extensibility which the syntax hacks cannot provide; e.g. how would you specify to iterate backwards over a sequence using that notation or diagonal for a matrix ?
No, it should return a forward iterator. -- Marc-Andre Lemburg ______________________________________________________________________ Company: http://www.egenix.com/ Consulting: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/

On Mon, 5 Feb 2001, M.-A. Lemburg wrote:
For crying out loud! '\':' requires that you tokenize the string before you know that the colon is part of the string. Triple-quotes force you to tokenize carefully too. There is *nothing* that this stay-away-from-colons argument buys you. For a human skimming over source code -- i repeat, the visual hint is "colon at the END of a line".
Okay, provide an example. Write this iterator() method in Python. Now answer: how does 'for' know whether the thing to the right of 'in' is an iterator or a sequence?
This is already exactly what the PEP proposes for the implementation of sq_iter.
> This would give us: speed, flexibility and extensibility which the syntax hacks cannot provide;
The PEP is not just about syntax hacks. It's an iterator protocol. It's clear that you haven't read it. *PLEASE* read the PEP before continuing to discuss it. I quote:

| Rationale
|
|     If all the parts of the proposal are included, this addresses many
|     concerns in a consistent and flexible fashion. Among its chief
|     virtues are the following three -- no, four -- no, five -- points:
|
|     1. It provides an extensible iterator interface.
|
|     2. It resolves the endless "i indexing sequence" debate.
|
|     3. It allows performance enhancements to dictionary iteration.
|
|     4. It allows one to provide an interface for just iteration
|        without pretending to provide random access to elements.
|
|     5. It is backward-compatible with all existing user-defined
|        classes and extension objects that emulate sequences and
|        mappings, even mappings that only implement a subset of
|        {__getitem__, keys, values, items}.

I can take out the Monty Python jokes if you want. I can add more jokes if that will make you read it. Just read it, i beg you.
No differently from what you are suggesting, at the surface:

    for item in sequence.backwards():
    for item in matrix.diagonal():

The difference is that the thing on the right of 'in' is always considered a sequence-like object. There is no ambiguity and no magic rule for deciding when it's a sequence and when it's an iterator.

-- ?!ng

"There's no point in being grown up if you can't be childish sometimes." -- Dr. Who
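To make "the thing on the right of 'in' is always a sequence-like object" concrete, here is a small sketch of what a `backwards()` method could return. The classes `Backwards` and `Seq` are invented for this illustration; the only thing relied on is that a `for` loop accepts any object whose `__getitem__` takes 0, 1, 2, ... and raises IndexError at the end.

```python
class Backwards:
    """Sequence-like view that presents another sequence in reverse order."""

    def __init__(self, seq):
        self.seq = seq

    def __len__(self):
        return len(self.seq)

    def __getitem__(self, i):
        # The for-loop probes indices 0, 1, 2, ... and stops on IndexError.
        if not 0 <= i < len(self.seq):
            raise IndexError(i)
        return self.seq[len(self.seq) - 1 - i]


class Seq(list):
    """Hypothetical list subclass offering the backwards() spelling."""

    def backwards(self):
        return Backwards(self)


collected = []
for item in Seq("abc").backwards():
    collected.append(item)
```

Nothing in the loop has to decide whether it was handed a sequence or an iterator: `Backwards` is just another object that answers integer indexing.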

Ka-Ping Yee wrote:
Oh well, perhaps you are right and we should call things like key:value association and be done with it.
Simple: have the for-loop test for a type slot and have it fallback to __getitem__ in case it doesn't find the slot API.
Sorry, Ping, I didn't know you have a PEP for iterators already. ...reading it...
Done. Didn't know it exists, though (why isn't the PEP# in the subject line ?).

Even after reading it, I still don't get the idea behind adding "Mapping Iterators" and "Sequence Iterators" when both of these are only special implementations of the single "Iterator" interface.

Since the object can have multiple methods to construct iterators, all you need is *one* iterator API. You don't need a slot which returns an iterator object -- leave that decision to the programmer, e.g. you can have:

    for key in dict.xkeys():
    for value in dict.xvalues():
    for items in dict.xitems():
    for entry in matrix.xrow(1):
    for entry in matrix.xcolumn(2):
    for entry in matrix.xdiag():
    for i,element in sequence.xrange():

All of these method calls return special iterators for one specific task and all of them provide a slot which is callable without argument and yields the next element of the iteration. Iteration is terminated by raising an IndexError just like with __getitem__.

Since for-loops can check for the type slot, they can use an optimized implementation which avoids the creation of temporary integer objects and leave the state-keeping to the iterator which can usually provide a C based storage for it with much better performance.

Note that with this kind of interface, there is no need to add "Mapping Iterators" or "Sequence Iterators" as special cases, since these are easily implemented using the above iterators.
-- Marc-Andre Lemburg ______________________________________________________________________ Company: http://www.egenix.com/ Consulting: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/

On 5 Feb 2001, M.-A. Lemburg wrote:
For the third time: write an example, please. It will help a lot.
> Sorry, Ping, I didn't know you have a PEP for iterators already.
I posted it on this very boutique (i mean, mailing list) a week ago and messages have been going back and forth on its thread since then.

On 31 Jan 2001, Ka-Ping Yee wrote:
| Okay, i have written a draft PEP that tries to combine the
| "elt in dict", custom iterator, and "for k:v" issues into a
| coherent proposal. Have a look:
|
|     http://www.lfw.org/python/pep-iterators.txt
|     http://www.lfw.org/python/pep-iterators.html

Okay. I apologize for my impatient tone, as it comes from the expectation that anyone would have read the document before trying to discuss it. I am very happy to get *new* information, the discovery of new errors in my thinking, better and interesting arguments; it's just that it's exasperating to see arguments repeated that were already made, or objections raised that were already carefully thought out and addressed.

From now on, i'll stop resisting the urge to paste the text of proposals inline (instead of politely posting just URLs) so you won't miss them.
> Done. Didn't know it exists, though (why isn't the PEP# in the subject line ?).
It didn't have a number at the time i posted it. Thank you for updating the subject line.
Three points:

1. We have syntactic support for mapping creation and lookup, and syntactic support for mapping iteration should mirror it.

2. IMHO

       for key:value in dict:

   is much easier to read and explain than

       for (key, value) in dict.xitems():

   (Greg? Could you test this claim with a survey question?)

   To the newcomer, the former is easy to understand at a surface level. The latter exposes the implementation (an implementation that is still there in PEP 234, but that the programmer only has to worry about if they are going deeper and writing custom iteration behaviour). This separates the work of learning into two small, digestible pieces.

3. Furthermore, this still doesn't solve the backward-compatibility problem that PEP 234 takes great care to address! If you write your for-loops

       for (key, value) in dict.xitems():

   then you are screwed if you try to replace dict with any kind of user-implemented dictionary-like replacement (since you'd have to go back and implement the xitems() method on everything).

   If, in order to maintain compatibility with the existing de-facto dictionary interface, you write your for-loops

       for (key, value) in dict.items():

   then now you are screwed if dict is a built-in dictionary, since items() is supposed to construct a list, not an iterator.
These are fine, since matrices are not core data types with syntactic support or a de-facto emulation protocol.
> for i,element in sequence.xrange():
This is just as bad as the xitems() issue above -- probably worse -- since nobody implements xrange() on sequence-like objects, so now you've broken compatibility with all of those. We want this feature to smoothly extend and work with existing objects with a minimum of rewriting, ideally none. PEP 234 achieves this ideal.
This statement, i believe, is orthogonal to both proposals.
I think this really just comes down to one key difference between our points of view here. Correct me if you disagree:

You seem to be suggesting that we should only consider a protocol for sequences, whereas PEP 234 talks about both sequences and mappings.

I argue that consideration for mappings is worthwhile because:

1. Dictionaries are a built-in type with syntactic and core implementation support.

2. Iteration over dictionaries is very common and should be spelled in an easily understood fashion.

3. Both sequence and mapping protocols are formalized in the core (with PySequenceMethods and PyMappingMethods).

4. Both sequence and mapping protocols are documented and used in Python (__getitem__, keys, values, etc.).

5. There are many, many sequence-like and mapping-like objects out there, implemented both in Python and in C, which adhere to these protocols.

(There is also the not-insignificant side benefit of finally having a decent way to get the indices while you're iterating over a sequence, which i've wanted fairly often.)

-- ?!ng

Ka-Ping Yee wrote:
Ping, what do you need an example for ? The above sentence says it all:

    for x in obj:
        ...

This will work as follows:

1. if obj exposes the iteration slot, say tp_nextitem, the for loop will call this slot without argument and assign the returned object to x

2. if obj does not expose tp_nextitem, then the for loop will construct an integer starting at 0 and pass this to the sq_item slot or __getitem__ method and assign the returned value to x; the integer is then replaced with an incremented integer

3. both techniques work until the slot or method in question returns an IndexError exception

The current implementation doesn't have 1. This is the only addition it takes to get iterators to work together well with the for-loop -- there are no backward compatibility issues here, because the tp_nextitem slot will be a new one.

Since the for-loop can avoid creating temporary integers, iterations will generally run a lot faster than before. Also, iterators have access to the object's internal representation, so data access is also faster.
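The three steps above can be emulated in plain Python to see how the dispatch would behave. This is a sketch under stated assumptions: `run_for_loop` and `Countdown` are invented names, and a method called `__nextitem__` is used here as a Python-level stand-in for the C-level tp_nextitem slot described in the message.

```python
def run_for_loop(obj, body):
    """Emulate `for x in obj: body(x)` under the rules described above."""
    next_item = getattr(obj, "__nextitem__", None)
    if next_item is not None:
        # Step 1: the object exposes the iteration slot, so call it with
        # no argument until it signals the end with IndexError (step 3).
        while True:
            try:
                x = next_item()
            except IndexError:
                break
            body(x)
        return
    # Step 2: no slot, so fall back to integer indexing from 0 upwards,
    # again stopping on IndexError (step 3).
    index = 0
    while True:
        try:
            x = obj[index]
        except IndexError:
            break
        body(x)
        index += 1


class Countdown:
    """Hypothetical iterator-style object: keeps its own state, no indexing."""

    def __init__(self, start):
        self.current = start

    def __nextitem__(self):
        if self.current <= 0:
            raise IndexError("countdown finished")
        self.current -= 1
        return self.current + 1


seen = []
run_for_loop(Countdown(3), seen.append)    # takes the slot path
run_for_loop(["a", "b"], seen.append)      # takes the __getitem__ fallback
```

The list takes the fallback path because it has no `__nextitem__`, which is the backward-compatibility point being made: existing sequences keep working unchanged.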
I must have missed those postings... don't have time to read all of python-dev anymore :-(
Tuples are well-known basic Python types. Why should (key,value) be any harder to understand than key:value? What would you tell a newbie that writes:

    for key:value in sequence:
        ....

where sequence is a list of tuples and finds that this doesn't work ?

Besides, the items() method has been around for ages, so switching from .items() to .xitems() in programs will be just as easy as switching from range() to xrange().

I am -0 on the key:value thingie. If you want it as a way to construct or split associations, fine. But it is really not necessary to be able to iterate over dictionaries.
Why is that ? You'd just have to add .xitems() to UserDict and be done with it. This is how we have added new dictionary methods all along. I don't see your point here. Sure, if you want to use a new feature you will have to think about whether it can be used with your data-types. What you are trying to do here is maintain forward compatibility at the cost of making iteration much more complicated than it really is.
I'm not breaking backward compatibility -- the above will still work like it has before since lists don't have the tp_nextitem slot.
Again, you are trying to achieve forward compatibility. If people want better performance, then they will have to add new functionality to their types -- one way or another.
No. I'm suggesting to add a low-level "give me the next item in the bag" and move the "how to get the next item" logic into an iterator object. This will still allow you to iterate over sequences and mappings, so I don't understand why you keep arguing for adding new syntax and slots to be able to iterate over dictionaries.
Agreed. I'd suggest to implement generic iterators which implement your suggestions and put them into the builtins or a special iterator module...

    from iterators import xitems, xkeys, xvalues

    for key, value in xitems(dict):
    for key in xkeys(dict):
    for value in xvalues(dict):

Other objects can then still have their own iterators by exposing special methods which construct special iterators. The for-loop will continue to work as always and happily accept __getitem__ compatible or tp_nextitem compatible objects as right-hand argument.

--
Marc-Andre Lemburg
______________________________________________________________________
Company: http://www.egenix.com/
Consulting: http://www.lemburg.com/
Python Pages: http://www.lemburg.com/python/
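A rough idea of what such generic helpers might look like, written in present-day Python with generators (which did not exist when this message was sent). The function bodies are an assumption: they rely only on keys() and __getitem__, so that objects implementing just that subset of the mapping protocol are accepted. `TinyMapping` is an invented class used to show that case.

```python
def xkeys(mapping):
    """Lazily yield the keys of anything that offers keys()."""
    for key in mapping.keys():
        yield key


def xvalues(mapping):
    """Lazily yield the values, looked up one key at a time."""
    for key in mapping.keys():
        yield mapping[key]


def xitems(mapping):
    """Lazily yield (key, value) pairs without building a full list."""
    for key in mapping.keys():
        yield key, mapping[key]


class TinyMapping:
    """Hypothetical mapping-like class with only keys() and __getitem__."""

    def __init__(self, **data):
        self._data = data

    def keys(self):
        return list(self._data)

    def __getitem__(self, key):
        return self._data[key]
```

Because the helpers are free functions rather than methods, `xitems(TinyMapping(a=1, b=2))` works even though `TinyMapping` was never taught about `xitems`.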

On Tue, 6 Feb 2001, M.-A. Lemburg wrote:
*sigh* I give up. I'm not going to ask again.

Real examples are a good idea when considering any proposal.

(a) When you do a real example, you usually discover mistakes or things you didn't think of in your design.

(b) We can compare it directly to other examples to see how easy or hard it is to write and understand code that uses the new protocol.

(c) We can come up with interesting cases in practice to see if there are limitations in any proposal.

Now that you have a proposal in slightly more detail, a few missing pieces are evident.

How would you implement a *Python* class that supports iteration? For instance, write something that has the effect of the FileLines class in PEP 234.

How would you implement an object that can be iterated over more than once, at the same time or at different times? It's not clear to me how the single tp_nextitem slot can handle that.
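The question about iterating more than once can be illustrated with a small experiment. This is only a sketch: `Cursor` and `drain` are invented names, and `__nextitem__` is used as a hypothetical Python-level spelling of the tp_nextitem slot. It compares two nested loops that share one stateful object with two nested loops that each get their own.

```python
class Cursor:
    """Iterator-style object: each call to __nextitem__ advances one step."""

    def __init__(self, data):
        self.data = data
        self.pos = 0

    def __nextitem__(self):
        if self.pos >= len(self.data):
            raise IndexError("exhausted")
        item = self.data[self.pos]
        self.pos += 1
        return item


def drain(obj):
    """Yield items from obj.__nextitem__() until it raises IndexError."""
    while True:
        try:
            item = obj.__nextitem__()
        except IndexError:
            return
        yield item


data = [0, 1, 2]

# One cursor shared by both loops: the inner loop uses up the state that
# the outer loop was counting on.
shared = Cursor(data)
shared_pairs = [(a, b) for a in drain(shared) for b in drain(shared)]

# A fresh cursor per loop: each loop keeps its own position.
fresh_pairs = [(a, b) for a in drain(Cursor(data)) for b in drain(Cursor(data))]
```

With the shared cursor only two pairs come out instead of nine, which is why some way of obtaining a new iterator for each loop is needed in either proposal.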
Again, completely orthogonal to both proposals. Regardless of the protocol, if you're implementing the iterator in C, you can use raw integers and internal access to make it fast.
It's mainly the business of calling the method and rearranging the data that i'm concerned about.

Example 1:

    dict = {1: 2, 3: 4}
    for (key, value) in dict.items():

Explanation: The "items" method on the dict converts {1: 2, 3: 4} into a list of 2-tuples, [(1, 2), (3, 4)]. Then (key, value) is matched against each item of this list, and the two parts of each tuple are unpacked.

Example 2:

    dict = {1: 2, 3: 4}
    for key:value in dict:

Explanation: The "for" loop iterates over the key:value pairs in the dictionary, which you can see are 1:2 and 3:4.
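Example 1 can be run more or less as written, which makes the intermediate "rearranging" step visible. A small sketch, with two adjustments that are not in the original: the variable is called `d` to avoid shadowing the built-in name, and `list()` is wrapped around `items()` because in current Python 3 `items()` returns a view rather than the list described in the explanation.

```python
d = {1: 2, 3: 4}

# items() rearranges the mapping into 2-tuples; list() makes that visible.
as_pairs = list(d.items())

# Each 2-tuple is then matched against (key, value) and unpacked.
unpacked = []
for (key, value) in d.items():
    unpacked.append((key, value))
```

Example 2 has no runnable counterpart here, since the `key:value` loop target is the syntax under discussion.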
"key:value doesn't look like a tuple, does it?"
It's not the same. xrange() is a built-in function that you call; xitems() is a method that you have to *implement*.
...and cgi.FieldStorage, and dumbdbm._Database, and rfc822.Message, and shelve.Shelf, and bsddbmodule, and dbmmodule, and gdbmmodule, to name a few. Even if you expect (or force) people to derive all their dictionary-like Python classes from UserDict (which they don't, in practice), you can't derive C objects from UserDict.
What i mean is that Python programmers would no longer know how to write their 'for' loops. Should they use 'xitems', thus dooming their loop never to work with the majority of user-implemented mapping-like objects? Or should they use 'items', thus dooming their loop to run inefficiently on built-in dictionaries?
Okay, i agree, it's forward compatibility. But it's something worth going for when you're trying to come up with a protocol. -- ?!ng "There's no point in being grown up if you can't be childish sometimes." -- Dr. Who

Ka-Ping Yee wrote:
I was just throwing in ideas, not a complete proposal. If that's what you want I can write up a complete proposal too and maybe even a patch to go with it. Exposing the tp_nextitem slot in Python classes via a __nextitem__ slot wouldn't be much of a problem. What I wanted to get across is the general idea behind my view of an iteration API and I believe that this idea has been made clear: I want a low-level API and move all the complicated object specific details into separate iterator objects. I don't see a point in trying to add complicated machinery to Python just to be able to iterate fast over some of the builtin types by special casing each object type. Let's please not add more special cases to the core.
Put all that logic into the iterator objects. These can be as complicated as needed, either trying to work in generic ways, special cased for some builtin types or be specific to a single type.
Again, if you prefer the key:value notation, fine. This is orthogonal to the iteration API though and really only touches the case of mappings.
You can put all that special logic into special iterators, e.g. a xitems iterator (see the end of my post).
The same applies to your proposed interface: people will have to write new code in order to be able to use the new technology. I don't see that as a problem, though.
Hey, people who care will be aware of this difference. It is very easy to test for interfaces in Python, so detecting the best method (in case it matters) is simple.
Sure, but is adding special cases everywhere really worth it ?

"M.-A. Lemburg" wrote:
Ka-Ping Yee wrote:
<snip/>
Sorry about sneaking in. I do in fact think that the syntax addition of key:value is easier to understand. Beginners know the { key:value } syntax, so this is just natural. Giving him an error in your above example is a step to clarity, avoiding hard to find errors if somebody has a list of tuples and the above happens to work somehow, although he forgot to use .xitems().
It has been around for years, but key:value might be better. A little faster for sure since we don't build extra tuples.
You really wouldn't stick with UserDict, but implement this on every object for speed.

The key:value proposal is not only stronger through its extra syntactical strength, it is also smaller in code-size to implement. Having to force every "iterable" object to support a modified view of it via xitems() even doesn't look elegant to me. It forces key/value pairs to go through tupleization only for syntactical reasons. A weakness, not a strength. Object orientation gets at its limits here. If access to keys and values can be provided by a single implementation for all affected objects without adding new methods, this suggests to me that it is right to do so.

+1 on key:value - ciao - chris

--
Christian Tismer :^) <mailto:tismer@tismer.com>
Mission Impossible 5oftware : Have a break! Take a ride on Python's
Kaunstr. 26 : *Starship* http://starship.python.net
14163 Berlin : PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF
where do you want to jump today? http://www.stackless.com

Christian Tismer wrote:
The problem is that key:value in sequence has a meaning under PEP234: key is the current index, value the tuple.
Small tuples are cheap and kept on the free list. I don't even think that key:value can do without them. Anyway, I've already said that I'm -0 on these thingies -- I would be +1 if Ping were to make key:value full flavoured associations (Jim Fulton has written a lot about these some years ago; I think they originated from SmallTalk).
...but it's a special case which we don't really need and it *only* works for mappings and then only if the mapping supports the new slots and methods required by PEP234. I don't buy the argument that PEP234 buys us fast iteration for free. Programmers will still have to write the code to implement the new slots and methods.
Hey, tuples are created for *every* function call, even C calls -- you can't be serious about getting much of a gain here ;-) -- Marc-Andre Lemburg ______________________________________________________________________ Company: http://www.egenix.com/ Consulting: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/

"M.-A. Lemburg" wrote:
Why is this a problem? It is just fine.
a) I don't see a point to tell me about Python's implementation but for hair-splitting. Speed is not the point, it will just be faster. b) I think it can. But the point is the cleaner syntax which unambiguously gets you keys and values, whenever the thing on the right can be indexed.
I didn't read that yet. Would it contradict Ping's version or could it be extended later? ...
You are reducing my arguments to speed always, not me. I don't care about a tuple. But I think we can save code. Smaller *and* not slower is what I like. no offence - ly y'rs - chris -- Christian Tismer :^) <mailto:tismer@tismer.com> Mission Impossible 5oftware : Have a break! Take a ride on Python's Kaunstr. 26 : *Starship* http://starship.python.net 14163 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF where do you want to jump today? http://www.stackless.com

Christian Tismer wrote:
I'm not telling you (I know you know ;), but others on this list who may not be aware of this fact.
Ping's version would hide this detail under the cover: dictionaries would sort of implement the sequence protocol and then return associations. I don't think this is much of a problem though.
At the cost of:

* special casing the for-loop implementation for sequences, mappings
* adding half a dozen new slots and methods
* moving all the complicated details into the for-loop implementation instead of keeping them in separate modules or object specific implementations

Perhaps we are just discussing the wrong things: I believe that Ping's PEP could easily be implemented on top of my idea (or vice-versa depending on how you look at it) of what the iteration interface should look like.

--
Marc-Andre Lemburg
______________________________________________________________________
Company: http://www.egenix.com/
Consulting: http://www.lemburg.com/
Python Pages: http://www.lemburg.com/python/

OK, now here's the hard one. Clearly,

(a) for i in someList:

has to continue to mean "iterate over the values". We've agreed that:

(b) for k:v in someDict:

means "iterate through the items". (a) looks like a special case of (b). I therefore asked my colleagues to guess what:

(c) for x in someDict:

did. They all said, "Iterates through the _values_ in the dict", by analogy with (a). I then asked, "How do you iterate through the keys in a dict, or the indices in a list?" They guessed:

(d) for x: in someContainer:

(note the colon trailing the iterator variable name).

I think that the combination of (a) and (b) implies (c), which leads in turn to (d). Is this what we want? I gotta say, when I start thinking about how many problems my students are going to bring me when accidentally adding or removing a colon in the middle of a 'for' statement changes the iteration space from keys to values, I start feeling queasy...

Thanks,
Greg

On Mon, 5 Feb 2001, Greg Wilson wrote:
The PEP explicitly proposes that (c) be an error, because i anticipated and specifically wanted to avoid this ambiguity. Have you had a good look at it? I think your survey shows that the PEP made the right choices. That is, it supports the position that if 'for key:value' is supported, then 'for key:' and 'for :value' should be supported, but 'for x in dict:' should not. It also shows that 'for index:' should be supported on sequences, which the PEP suggests. -- ?!ng

[GVW]
[Ping]
But then we should review the wisdom of using "if x in dict" as a shortcut for "if dict.has_key(x)" again. Everything is tied together! --Guido van Rossum (home page: http://www.python.org/~guido/)

> But then we should review the wisdom of using "if x in dict" as a shortcut for "if dict.has_key(x)" again. Everything is tied together!
yeah, don't forget unpacking assignments: assert len(dict) == 3 { k1:v1, k2:v2, k3:v3 } = dict Cheers /F

On Mon, 5 Feb 2001, Fredrik Lundh wrote:
I think this is a total non-issue for the following reasons:

1. Recall the original philosophy behind the list/tuple split. Lists and dicts are usually variable-length homogeneous structures, and therefore it makes sense for them to be mutable. Tuples are usually fixed-length heterogeneous structures, and so it makes sense for them to be immutable and unpackable.

2. In all the Python programs i've ever seen or written, i've never known or expected a dictionary to have a particular fixed length.

3. Since the items come back in random order, there's no point in binding individual ones to individual variables. It's only ever useful to iterate over the key/value pairs.

In short, i can't see how anyone would ever want to do this. (Sorry for being the straight man, if you were in fact joking...)

-- ?!ng

"There's no point in being grown up if you can't be childish sometimes." -- Dr. Who

participants (8)
- Christian Tismer
- Fredrik Lundh
- Greg Wilson
- Guido van Rossum
- Ka-Ping Yee
- M.-A. Lemburg
- Thomas Wouters
- Tim Peters