Hi friends, my tasklets are flying. Now I am at the point where I'm worst suited for: Design an interface. First of all what it is: We have a "tasklet", a tiny object around chains of frames, with some additional structure that keeps C stack snippets. Ignore the details, tasklets are simply the heart of Stackless' coro/uthread/anything. The tstate maintains two lists of tasklets. One is the list of all running (or better "runnable"?) tasklets. These tasklets are not in "value state", they don't want to transmit a value. They can be scheduled as you like it. The other list keeps record of blocked tasklets. These are tasklets which are in "value state", they are waiting to do some transmission. Whenever a tasklet calls an other tasklet's "transfer" method for data transfer, the following happens: - the other tasklet is checked to be in blocked state. - the tasklet is removed from the runnable list, - it is blocked - data is transferred - the other tasklet is unblocked - the other tasklet is inserted into the runnables - the other tasklet is run The "transfer" word comes from Modula II. It implements coroutine behavior. Then, I have something like the ICON suspend, and I gave it the name "suspend" for now, since yield is in use. Suspend is a one-sided thing, and it is also needed to initiate a blocked state at all. Thre is a "client" variable that holds a reference to the tasklet that called "transfer". When suspend is called, then we have two cases: 1) There is another tasklet in the client variable. We take the client and call client.transfer(data) 2) There is no client set already. We go into blocked state and wait, until some tasklet transfers to us. What suspend does is yielding (like in generators), but also initial blocking, providing targets for transfer to jump to. What I'm missing is name and concept of an opposite method: Finish the data transfer, but leave both partners in runnable state. Ok, here a summary of what I have. Please tell me what you think, and what you'd like to change. stackless module: ----------------- schedule() switch to the next task in the runnable list. taskoutlet(func) call it with a function, and it generates a generator for tasklets. ret = suspend(value) initiates data exchange, see above. The current tasklet gets blocked. If client is set already, a transfer is performed. Example: def demo(n): print n factory = taskoutlet(demo) t = factory(42) # this is now a tasklet with bound arguments. tasklet methods: ---------------- t.insert() inserts t into the according tasklet ring at the "end". if t is in a ring already, it is removed first. The ring is either "runnables" or "blocked", depending on t's state. t.remove() removes t from whatever ring it is in. t.run() inserts t into runnables and switches to it immediately. ret = t.transfer(value) unblocks t, tansfers data, blocks myself. *Wanted* Again, What is the name of this function/method, and where does it belong? It - unblocks another tasklet - transfers data - does not block anything - schedules the other tasklet Or is this a bad design at all? If so, please help me as well. Thanks in advance - ciao - chris p.s.: Not all of the above is implemented already. -- Christian Tismer :^) mailto:tismer@tismer.com Mission Impossible 5oftware : Have a break! Take a ride on Python's Kaunstr. 26 : *Starship* http://starship.python.net/ 14163 Berlin : PGP key -> http://wwwkeys.pgp.net/ PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF where do you want to jump today? http://www.stackless.com/
Now I am at the point where I'm worst suited for: Design an interface.
Please tell me what you think, and what you'd like to change.
It's not clear exactly what you're after here. Are you trying to define the lowest-level interface upon which everything else will be built? If so, I think what you have presented is FAR too complex. It seems to me you need only two things: (1) A constructor for new tasklets: t = tasklet(f) Takes a callable object f of no parameters and returns a tasklet which will execute the code of f. The tasklet is initially suspended and does not execute any of f's code until it is switched to for the first time. (2) A way of switching to another tasklet: t.transfer() Suspends the currently-running tasklet and resumes tasklet t were it last left off. This will either be at the beginning or where it last called the transfer() of another tasklet. All the other stuff you talk about -- passing values between tasklets, rings of runnable tasklets, scheduling policies, etc -- can all be implemented in Python on top of these primitives. Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+
Greg Ewing wrote:
Now I am at the point where I'm worst suited for: Design an interface.
Please tell me what you think, and what you'd like to change.
It's not clear exactly what you're after here. Are you trying to define the lowest-level interface upon which everything else will be built? If so, I think what you have presented is FAR too complex.
The old Stackless with its continuations was at lowest possible level, in a sense. What I now try to do is a compromise: I would like to build the simplest possible but powerful set of methods. At the same time, I'd like to keep track of tasklets, since they are now containing vitual information about stack state, and I cannot afford to loose one of them, or we'll crash. The little doubly-linked list maintenance is very cheap to do. So my basic idea was to provide what is needed to get uthreads at very high speed, without the ned to use Python for the basic machinery.
It seems to me you need only two things:
<snip/> Yes, I need these two things, and some more.
All the other stuff you talk about -- passing values between tasklets, rings of runnable tasklets, scheduling policies, etc -- can all be implemented in Python on top of these primitives.
Sure it can, with one exception: My tasklets will also support threading, that is they will become auto-scheduled if the user switches this on. But auto-scheduled frames are a diffeent kind of thing than those which are in "waiting for data" state. I need to distinguish them or I will crash. That's the reason why I keep these linked lists. Switching to the wrong tasklet should be rock solid in the kernel, this is nothing that I want people to play with from Python. Thanks a lot anyway, I'll try to make it even simpler. ciao - chris -- Christian Tismer :^) mailto:tismer@tismer.com Mission Impossible 5oftware : Have a break! Take a ride on Python's Kaunstr. 26 : *Starship* http://starship.python.net/ 14163 Berlin : PGP key -> http://wwwkeys.pgp.net/ PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF where do you want to jump today? http://www.stackless.com/
Christian Tismer
I'd like to keep track of tasklets, since they are now containing vitual information about stack state, and I cannot afford to loose one of them, or we'll crash.
I'm not sure what the problem is here. A tasklet isn't going to go away until there are no more references to it anywhere, and once that happens, there is no longer any way of switching to it.
So my basic idea was to provide what is needed to get uthreads at very high speed, without the ned to use Python for the basic machinery.
Well, the higher-level stuff doesn't *have* to be implemented in Python. But I think it should in principle be possible. Then you can experiment with different designs for higher-level faciliies in Python, find out which ones are most useful, and re-code those in C later.
My tasklets will also support threading, that is they will become auto-scheduled if the user switches this on.
I'm not sure how this is going to interact with the facility for switching to a specific tasklet. Seems to me that, in the presence of pre-emptive scheduling, it no longer makes sense to do so, since some other tasklet could easily get scheduled a moment later. The most you can do is say "I don't want to run any more now, let some other tasklet have a go". So it appears that we already have two distinct layers of functionality here: a low-level, non-preemptive layer where we explicitly switch from one tasklet to another, and a higher-level, preemptive one where we let the scheduler take care of picking what to run next. These two layers should be clearly separated, with the higher one built strictly on the facilities provided by the lower one. In particular, there should be exactly one way of switching between tasklets, i.e. by calling t.transfer(). Preemptive switching should be done by some kind of signal or event handler which does this.
But auto-scheduled frames are a diffeent kind of thing than those which are in "waiting for data" state. I need to distinguish them or I will crash.
If you get rid of the idea of passing values between tasklets as part of the switching process, then this distinction disappears. I think that value-passing and tasklet-switching are orthogonal activities and would be better decoupled. Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+
Greg Ewing wrote:
Christian Tismer
:
...
But auto-scheduled frames are a diffeent kind of thing than those which are in "waiting for data" state. I need to distinguish them or I will crash.
If you get rid of the idea of passing values between tasklets as part of the switching process, then this distinction disappears. I think that value-passing and tasklet-switching are orthogonal activities and would be better decoupled.
Hmm, first I thought you were wrong: Any Python function that calls something, may it be a stackless schedule function or something else, expects a value to be returned. Always and ever. But when I have a scheduler counter built into the Python interpreter loop, then a schedule will happen *between* opcodes. Such a frame is not awaiting data, and therefor not suitable to be switched to by one which is in data transfer. Now I see it: You mean I can make this schedule function behave like a normal function call, that accepts and drops a dummy value? In fact, this would make all tasklets compatible. thinking - thanks - chris -- Christian Tismer :^) mailto:tismer@tismer.com Mission Impossible 5oftware : Have a break! Take a ride on Python's Kaunstr. 26 : *Starship* http://starship.python.net/ 14163 Berlin : PGP key -> http://wwwkeys.pgp.net/ PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF where do you want to jump today? http://www.stackless.com/
Greg Ewing
It's not clear exactly what you're after here. Are you trying to define the lowest-level interface upon which everything else will be built? If so, I think what you have presented is FAR too complex.
It seems to me you need only two things: ... t = tasklet(f) t.transfer()
(Sorry if I missed something -- I've been *way* busy lately and haven't been giving this much attention -- that said...) But (if I understand the current plan) we will need mechanisms internal to the Python interpreter to transfer values and maintain blocked/running state anyway; since when you generate a tasklet and run it: t = tasklet(f) t.transfer() That may cause many more tasklets to be generated, run, and destroyed that you don't ever see ... recursions/function calls in f, and only-Christian-knows what else... so the transfer value mechanism might as well be built in. I haven't thought enough about the "unamed produce-and-continue function" to decide how exactly it should work. I have two concerns in implementing uthreads this way (scheduler in C): 1 -- there doesn't seem to be anyway to "kill" a tasklet 2 -- the scheduling algorithm will be hard to tune (we'll probably *at least* need tasklet priority...) Maybe there should still be a "timeslice" function so an in-Python scheduler can be written? -- -Jas -------------------- www.maya.com Jeff Senn | / / |-/ \ / /|® Chief Technologist | /|/| |/ o | /-| Head of R&D | Taming Complexity®
Jeff Senn
That may cause many more tasklets to be generated, run, and destroyed that you don't ever see ... recursions/function calls in f, and only-Christian-knows what else... so the transfer value mechanism might as well be built in.
I don't understand what you mean. Are you saying that every function call creates a new tasklet? That stack frame == tasklet? If that's the case, then we're back to continuations! But I don't think so, because Christian said that a tasklet contains "a chain of frames", not just one frame.
2 -- the scheduling algorithm will be hard to tune (we'll probably *at least* need tasklet priority...) Maybe there should still be a "timeslice" function so an in-Python scheduler can be written?
The huge variety of possible scheduling policies is all the more reason *not* to make scheduling part of the core functionality. The user should be free to implement his own scheduling layer on top of the primitives if he doesn't like what is provided. Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+
Jeff Senn wrote:
Greg Ewing
writes: It's not clear exactly what you're after here. Are you trying to define the lowest-level interface upon which everything else will be built? If so, I think what you have presented is FAR too complex.
It seems to me you need only two things:
...
t = tasklet(f) t.transfer()
(Sorry if I missed something -- I've been *way* busy lately and haven't been giving this much attention -- that said...)
But (if I understand the current plan) we will need mechanisms internal to the Python interpreter to transfer values and maintain blocked/running state anyway; since when you generate a tasklet and run it:
t = tasklet(f) t.transfer()
That may cause many more tasklets to be generated, run, and destroyed that you don't ever see ... recursions/function calls in f, and only-Christian-knows what else... so the transfer value mechanism might as well be built in.
I think all these little things are cheap to implement.
I haven't thought enough about the "unamed produce-and-continue function" to decide how exactly it should work.
Somebody named it "resume", and together with "suspend" we get a nice couple. On the other hand: I'm not sure whether resume should block its caller. I'm very undecided after all the input I got, if it is in fact better to forget data transfer completely by now and just make switching primitives which always work.
I have two concerns in implementing uthreads this way (scheduler in C):
1 -- there doesn't seem to be anyway to "kill" a tasklet
Not yet, but I want to provide an exception to kill tasklets. Also it will be prossible to just pick it off and drop it, but I'm a little concerned about the C stack inside. This might be the last resort if the exception doesn't work.
2 -- the scheduling algorithm will be hard to tune (we'll probably *at least* need tasklet priority...) Maybe there should still be a "timeslice" function so an in-Python scheduler can be written?
We had the timeslice function, yes. I think to make things simpler this time and just periodically call the scheduler which is written in C. I also have a rough concept of priorities which can be very cheaply implemented. Maybe I implement some default behavior, but allow these objects to be subclassed from Python? ciao - chris -- Christian Tismer :^) mailto:tismer@tismer.com Mission Impossible 5oftware : Have a break! Take a ride on Python's Kaunstr. 26 : *Starship* http://starship.python.net/ 14163 Berlin : PGP key -> http://wwwkeys.pgp.net/ PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF where do you want to jump today? http://www.stackless.com/
Christian Tismer
Now I see it: You mean I can make this schedule function behave like a normal function call, that accepts and drops a dummy value?
Yes, that's right. (Or more precisely, it would take no parameters and return None.)
when I have a scheduler counter built into the Python interpreter loop
I can see the attraction of having pre-emption built in this way -- it would indeed be extremely efficient. But I think you need to make a decision about whether your tasklet model is going to be fundamentally pre-emptive or fundamentally non-pre-emptive, because, as I said before, the notion of switching to a specific tasklet is incompatible with pre-emptive scheduling. If you want to go with a fundamentally pre-emptive model, I would suggest the following primitives: t = tasklet(f) Creates a new tasklet executing f. The new tasklet is initially blocked. t.block() Removes tasklet t from the set of runnable tasklets. t.unblock() Adds tasklet t to the set of runnable tasklets. current_tasklet() A built-in function which returns the currently running tasklet. Using this model, a coroutine switch would be implemented using something like def transfer(t): "Transfer from the currently running tasklet to t." t.unblock() current_tasklet().block() although some locking may be needed in there somewhere. Have to think about that some more. For sending values from one tasklet to another, I think I'd use an intermediate object to mediate the transfer, something like a channel in Occam: c = channel() # tasklet 1 does: c.send(value) # tasklet 2 does: value = c.receive() Tasklet 1 blocks at the send() until tasklet 2 reaches the receive(), or vice versa if tasklet 2 reaches the receive() first. When they're both ready, the value is transferred and both tasklets are unblocked. The advantage of this is that it's more symmetrical. Instead of one tasklet having to know about the other, they don't know about each other but they both know about the intermediate object.
I want to provide an exception to kill tasklets. Also it will be prossible to just pick it off and drop it, but I'm a little concerned about the C stack inside.
As I said before, if there are no references left to a tasklet, there's no way it can ever be switched to again, so its C stack is no longer relevant. Unless you can have return addresses from one C stack pointing into another, or something... can you? Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+
Greg Ewing wrote:
Christian Tismer
:
...
I can see the attraction of having pre-emption built in this way -- it would indeed be extremely efficient.
But I think you need to make a decision about whether your tasklet model is going to be fundamentally pre-emptive or fundamentally non-pre-emptive, because, as I said before, the notion of switching to a specific tasklet is incompatible with pre-emptive scheduling.
Yes. I will go a bit further and adapt the process model of the Alef language a bit.
If you want to go with a fundamentally pre-emptive model, I would suggest the following primitives:
[blocking stuff - ok]
Using this model, a coroutine switch would be implemented using something like
def transfer(t): "Transfer from the currently running tasklet to t." t.unblock() current_tasklet().block()
although some locking may be needed in there somewhere. Have to think about that some more.
For sending values from one tasklet to another, I think I'd use an intermediate object to mediate the transfer, something like a channel in Occam:
c = channel()
# tasklet 1 does: c.send(value)
# tasklet 2 does: value = c.receive()
Tasklet 1 blocks at the send() until tasklet 2 reaches the receive(), or vice versa if tasklet 2 reaches the receive() first. When they're both ready, the value is transferred and both tasklets are unblocked.
The advantage of this is that it's more symmetrical. Instead of one tasklet having to know about the other, they don't know about each other but they both know about the intermediate object.
Yes. This all sounds very familiar to me. In private conversation with Russ Cox, Bell Labs, I learned about rendevouz techniques which are quite similar. Having read the Alef user guide which can be found at http://plan9.bell-labs.com/who/rsc/thread.html http://plan9.bell-labs.com/who/rsc/ug.pdf I got the following picture: (Thanks to Russ Cox, these are his ideas!) We use a two-level structure. Toplevel is something similar to threads, processes in Alef language. These are pre-emptively scheduled by an internal scheduler that switches after every n opcodes. These threads are groups of tasklets, which have collaborative scheduling between them. This gives us a lot of flexibility: If people prefer thread-like behavior, they can use the system provided approach and just use the toplevel layer with just one tasklet in it. Creating new tasklets inside a process then has coroutine-like behavior. I'm just busy designing the necessary structures, things should not get too complicated on the C level.
I want to provide an exception to kill tasklets. Also it will be prossible to just pick it off and drop it, but I'm a little concerned about the C stack inside.
As I said before, if there are no references left to a tasklet, there's no way it can ever be switched to again, so its C stack is no longer relevant. Unless you can have return addresses from one C stack pointing into another, or something... can you?
Well, the problem is that an extension *might* be sitting inside a tasklet's stack with a couple of allotted objects. I would assume that the extension frees these objects when I send an exception to the tasklet. But without that, I cannot be sure if all resources are freed. thanks a lot for your help - chris -- Christian Tismer :^) mailto:tismer@tismer.com Mission Impossible 5oftware : Have a break! Take a ride on Python's Kaunstr. 26 : *Starship* http://starship.python.net/ 14163 Berlin : PGP key -> http://wwwkeys.pgp.net/ PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF where do you want to jump today? http://www.stackless.com/
On 20 Feb 2002 at 16:01, Greg Ewing wrote: [Christian's plea]
It seems to me you need only two things:
(1) A constructor for new tasklets:
t = tasklet(f)
[snip]
(2) A way of switching to another tasklet:
t.transfer()
[snip]
All the other stuff you talk about -- passing values between tasklets, rings of runnable tasklets, scheduling policies, etc -- can all be implemented in Python on top of these primitives.
Unless you've got a way to detect or pass tasklet's through transfer, you don't have enough. -- Gordon http://www.mcmillan-inc.com/
Gordon McMillan
Unless you've got a way to detect or pass tasklet's through transfer, you don't have enough.
You'll have to elaborate. I don't have any idea what you mean by that! Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+
On 21 Feb 2002 at 15:08, Greg Ewing wrote:
Gordon McMillan
: Unless you've got a way to detect or pass tasklet's through transfer, you don't have enough.
You'll have to elaborate. I don't have any idea what you mean by that!
You need a way to refer to "this" tasklet from Python, and pass that to the "other" tasklet. Alternatively, you need "the tasklet that transferred to me". This is implicit in generators; it needs to be explicit to do coroutines. You can't write a scheduler in Python without it - you need the client tasklets to transfer to the scheduler tasklet. -- Gordon http://www.mcmillan-inc.com/
Gordon McMillan
You need a way to refer to "this" tasklet from Python
Yes, that occurred to me as well. Would a built-in function called current_tasklet() provide what you want? Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+
On Tue, 2002-02-19 at 11:15, Christian Tismer wrote:
Again, What is the name of this function/method, and where does it belong? It - unblocks another tasklet - transfers data - does not block anything - schedules the other tasklet
Or is this a bad design at all?
In our current system, this function is called 'schedule()'; and it takes an 'args' tuple. It doesn't transfer control, it just makes the other coro ready to run ASAP. [i.e., next trip through the event loop it will be added to the set of 'runnable' coros]. Here is our coro::condition_variable::wake_one() for context: def wake_one (self, args=()): for coro in self._waiting: try: schedule (coro, args) except ScheduleError: pass else: self._waiting.pop(0) return 1 else: return 0 [ScheduleError is thrown if the coro has already been scheduled] -Sam
participants (5)
-
Christian Tismer
-
Gordon McMillan
-
Greg Ewing
-
Jeff Senn
-
Sam Rushing