Re: [Python-ideas] Revised**12 PEP on Yield-From

Erik Groeneveld wrote:
I think you're expecting a bit much from yield-from. In a situation like this, I wouldn't use yield to receive the values. I'd read them from some kind of buffering object that allows peeking ahead however far is needed. -- Greg

2009/4/23 Greg Ewing <greg.ewing@canterbury.ac.nz>:
Well, you asked for practical applications, and here is one. I hope to be able to use yield-from in Weightless instead of its compose function. However, I do not see how a yield-from without support for splitting boundaries could be combined with my own code to do the latter. If this combination is not possible, I would be forced to keep using compose instead of yield-from, which I would regret. So I am expecting at least a yield-from that can be combined orthogonally with my boundary splitting code (and other things, see below). At present it can't, because there is no way to detect or intercept a yield-from.
Well, the whole point of using coroutines is to avoid buffering. I'll try to elaborate on this point a bit, and I hope I can convince you and others to investigate what the consequences of this type of application could be for the usage or implementation of yield-from.

When generalized generators were introduced, many people immediately saw the advantage of using them for thread-less I/O: tie a generator to a socket. I took the challenge and found it to be extraordinarily complicated. Back to that later; first a little background.

I started with Michael Jackson's now more than 30-year-old JSP theory about structuring programs based on the input and output streams they process. It is all based on coroutines. His assumptions about memory and storage latency of mainframes are valid today for web servers. The idea basically boils down to decomposing a data-processing program into coroutines, as easily as you are used to doing with functions. A programmer would be able to 'call' subcoroutines as if they were functions, without needing to dive into subtle and hard-to-understand differences or inconsistencies between the two.

It took me two years to get it right. Every time I switched to the role of 'a programmer', I got stuck with code not working as expected, incomprehensible stack-traces etc. Others were even more puzzled. It was not transparent in its usage and I had to go back to the workbench. But what a reward when it finally worked! I have never seen such simple, easy-to-read code for, for example, an HTTP server. Notoriously difficult bugs in my callback-based HTTP server that I was not able to solve just vanished. I am still impressed by the cleanness of the code and I keep wondering: 'can it really be that simple?' Was this really conceived more than 30 years ago? Jackson must have been a genius!

Since then I have presented this at the British SPA conference and to two Dutch Pythonist groups. I assembled a callback-vs-coroutine test case which clearly demonstrates the differences in amount of code, readability of code and locality of change when adding features. People explicitly appreciated the intuitive behavior for a programmer. (All documented at http://weightless.io and code fragments in svn.)

Back to why it was so complicated.

First of all, as you already know, it is not possible to use just a straightforward for-loop to delegate to another coroutine. The yield-from proposal covers all of this, I believe.

Secondly, if something goes wrong and a stack-trace is printed, this stack-trace would not reflect the proper sequence in which coroutines were called (this really makes a programmer go mad!), at least not without additional effort to maintain an explicit call stack with each generator on it, and using this to adjust the stack-trace when needed. (This is why I asked if the coroutine will be on the call-stack and hence be visible in a stack-trace.)

Thirdly, there seems to be some sort of unspoken 'protocol' with generators. A next() is actually send(None) and vaguely means 'I want data'. In the same vein, 'x = yield' actually is 'x = yield None' and also vaguely means 'I want data'. So the None seems to play a special role. I hesitated a lot, but I had to apply this 'protocol' to coroutines, otherwise it was next to impossible to work with them as 'the programmer'; it requires constant checking of what happened. Funnily enough, it turned out to be a major breakthrough in making it transparent to a programmer.

Fourthly, there is the issue of boundary clashes. These are common in any data-processing problem. The input data is simply not structured or tokenized according to boundaries on a certain level of abstraction. This is the *very reason* to use coroutines, and Jackson describes elegant ways to solve the problem. JSP requires a lookahead and the coroutines must have some way to support this. (Introducing a stream or buffer would put us back to where we started, of course.) After several tries I settled on a push-back mechanism, as this was the most elegant way (IMHO) to solve it. (This is why I suggested 'return value, pushbackdata'.)

At this point I hope I have gained your interest in this kind of data-processing application and I hope that we can have a fruitful discussion about it. Also, I would like to hear what other kinds of applications you have in mind.

Best regards
Erik
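The third point above, that next() is really send(None) and that a bare 'x = yield' is 'x = yield None', can be checked with a few lines of plain Python. This is only a minimal sketch; the generator name `collector` and the variable names are made up for illustration.

```python
def collector():
    """Collect three sent values, then return them."""
    received = []
    while len(received) < 3:
        value = yield            # identical to: value = yield None
        received.append(value)
    return received


gen = collector()
first = next(gen)          # runs to the first yield, which yields None
second = gen.send(None)    # indistinguishable from next(gen) inside the generator
third = gen.send("data")   # a real value; the generator again answers None

try:
    next(gen)              # delivers None once more and completes the list
except StopIteration as stop:
    result = stop.value    # the generator's return value
```

Inside `collector` there is no way to tell whether a None arrived through next() or through send(None), which is why None ends up acting as the 'I want data' signal in both directions.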

Erik Groeneveld wrote:
Generators will be allowed to return tuples under the PEP, just like normal functions. So what's wrong with doing something like the following?

    def dummy_example():
        pushback = None
        while 1:
            item, pushback = yield from read_item(pushback)
            process_item(item)

    def read_item(init_data=None):
        if init_data is not None:
            # Initialise state based on init_data
        else:
            # Use default initial state
        # Read enough data to get a complete item
        # Since this is a coroutine, there will be at least 1 yield
        return item, excess_data

Cheers, Nick.

-- 
Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
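For readers who want to run the idea, here is one possible filling-in of Nick's sketch on a Python that supports yield from. The details are assumptions made for illustration and are not part of his message: items are newline-terminated strings, chunks arrive via send(), and `process_item` is passed in as a parameter.

```python
def read_item(init_data=None):
    """Gather chunks until a newline appears; return (item, excess_data)."""
    data = init_data or ""          # start from pushed-back data, if any
    while "\n" not in data:
        data += yield               # ask whoever drives us for another chunk
    item, _, excess = data.partition("\n")
    return item, (excess or None)   # excess belongs to the next item


def dummy_example(process_item):
    pushback = None
    while True:
        # The caller threads the excess data back in by hand on every call.
        item, pushback = yield from read_item(pushback)
        process_item(item)


items = []
coro = dummy_example(items.append)
next(coro)                          # advance to the first 'I want data' yield
for chunk in ["ab", "c\nde", "f\ngh\ni"]:
    coro.send(chunk)                # chunk boundaries ignore item boundaries
```

After the three chunks, `items` holds the complete items "abc", "def" and "gh", while the trailing "i" is still waiting inside the current `read_item` call.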

Hi Nick,

2009/4/23 Nick Coghlan <ncoghlan@gmail.com>:
Well, there is nothing wrong with this code, but I don't want to repeat it for every generator and every generator 'call', just because one of them *might* have excess data. I would like a generic solution to have this code only once, but I can't see a solution yet. Erik
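One way to keep the push-back handling in a single place is a small trampoline that drives nested generators itself, instead of relying on yield from. The following is a hypothetical sketch of that idea only: `compose` borrows its name from the thread but is not Weightless's implementation, and `Done`, `read_line` and `read_request` are invented for this example. It assumes a generator yields a sub-generator to 'call' it, yields None to ask for data, and yields anything else as output.

```python
from collections import deque
from types import GeneratorType


class Done:
    """Hypothetical return wrapper: a result plus input to push back."""
    def __init__(self, value, *pushback):
        self.value = value
        self.pushback = pushback


def compose(root):
    stack = [root]        # explicit call stack of running generators
    pending = deque()     # pushed-back data, replayed before new input
    message = None
    while stack:
        try:
            request = stack[-1].send(message)
        except StopIteration as stop:
            stack.pop()
            message = stop.value
            if isinstance(message, Done):
                pending.extendleft(reversed(message.pushback))
                message = message.value   # becomes the caller's result
            continue
        if isinstance(request, GeneratorType):
            stack.append(request)         # 'call' the sub-generator
            message = None
        elif request is None:             # 'I want data'
            message = pending.popleft() if pending else (yield)
        else:
            yield request                 # pass output to the outside
            message = None
    return message


def read_line():
    data = ""
    while "\n" not in data:
        data += yield
    line, _, rest = data.partition("\n")
    return Done(line, rest) if rest else line


def read_request():
    request_line = yield read_line()      # no push-back code at the call site
    header = yield read_line()
    yield (request_line, header)


pipeline = compose(read_request())
next(pipeline)                            # None: the pipeline wants data
pipeline.send("GET / HT")                 # None again: line still incomplete
parsed = pipeline.send("TP/1.0\nHost: a\nX")
```

Only `read_line` knows that it may have read too much; `read_request` and any other caller stay free of push-back code. The price is exactly what the message above describes: the trampoline has to intercept each 'call', which a plain yield from does not allow.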
