RE: Fake threads (was [Python-Dev] ActiveState & fork & Perl)

June 30, 1999

      [Guido]
...
> I guess it's all in the perspective.  99.99% of all thread apps I've
> ever written use threads primarily to overlap I/O -- if there wasn't
> I/O to overlap I wouldn't use a thread.  I think I share this
> perspective with most of the thread community (after all, threads
> originate in the OS world where they were invented as a replacement
> for I/O completion routines).

Different perspective indeed!  Where I've been, you never used something as
delicate as a thread to overlap I/O, you instead used the kernel-supported
asynch Fortran I/O extensions <0.7 wink>.

Those days are long gone, and I've adjusted to that.  Time for you to leave
the past too <wink>:  by sheer numbers, most of the "thread community"
*today* is to be found typing at a Windows box, where cheap & reliable
threads are a core part of the programming culture.  They have better ways
to overlap I/O there too.  Throwing explicit threads at this is like writing
a recursive Fibonacci number generator in Scheme, but building the recursion
yourself by hand out of explicit continuations <wink>.
...
...
> As far as I can tell, all the examples you give are easily done using
> coroutines.  Can we call whatever you're asking for coroutines instead
> of fake threads?

I have multiple agendas, of course.  What I personally want for my own work
is no more than Icon's generators, formally "semi coroutines", and easily
implemented in the interpreter (although not the language) as it exists
today.

Coroutines, fake threads and continuations are much stronger than
generators, and I expect you can fake any of the first three given either of
the others.  Generators fall out of any of them too (*you* implemented
generators once using Python threads, and I implemented general
coroutines -- "fake threads" are good enough for either of those).

So, yes, for that agenda any means of suspending/resuming control flow can
be made to work.  I seized on fake threads because Python already has a
notion of threads.
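
The claim that generators fall out of threads can be illustrated with a minimal sketch in present-day Python 3 (not the Python of this thread).  The names ThreadGenerator, emit and backwards_producer are hypothetical, and this is not the implementation referred to above, only one way the idea could look: the producer runs in its own thread, and two one-slot queues pass control back and forth so only one side runs at a time.

```python
import queue
import threading

_DONE = object()  # sentinel marking the end of the sequence


class ThreadGenerator:
    """Iterator whose values come from producer(emit, *args) run in a thread."""

    def __init__(self, producer, *args):
        self._values = queue.Queue(maxsize=1)   # producer -> consumer
        self._resume = queue.Queue(maxsize=1)   # consumer -> producer
        self._finished = False
        worker = threading.Thread(
            target=self._run, args=(producer, args), daemon=True)
        worker.start()

    def _emit(self, value):
        self._values.put(value)   # hand one value to the consumer ...
        self._resume.get()        # ... then sleep until the next one is wanted

    def _run(self, producer, args):
        self._resume.get()        # do nothing until the first value is requested
        try:
            producer(self._emit, *args)
        finally:
            self._values.put(_DONE)

    def __iter__(self):
        return self

    def __next__(self):
        if self._finished:
            raise StopIteration
        self._resume.put(None)    # let the producer run to its next emit()
        value = self._values.get()
        if value is _DONE:
            self._finished = True
            raise StopIteration
        return value


def backwards_producer(emit, s):
    # Hypothetical producer: emits the items of s from last to first.
    for i in range(len(s) - 1, -1, -1):
        emit(s[i])


print(list(ThreadGenerator(backwards_producer, "abc")))  # ['c', 'b', 'a']
```

A consumer that abandons the iteration early leaves the worker thread blocked; the sketch marks it as a daemon thread so that it does not keep the process alive.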

A second agenda is that Python could be a lovely language for *learning*
thread programming; the threading module helps, but fake threads could
likely help more by e.g. detecting deadlocks (and pointing them out) instead
of leaving a thread newbie staring at a hung system without a clue.
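
To make the deadlock-detection idea concrete, here is a toy sketch in present-day Python 3, under the assumption that fake threads are modeled as generators driven by a round-robin scheduler.  Everything in it (run_fake_threads, Deadlock, the "acquire"/"release" requests, the lock names) is hypothetical and invented for illustration.  Because the scheduler sees every lock request, it can tell when no fake thread is able to run and say why.

```python
from collections import deque


class Deadlock(Exception):
    """Raised when every unfinished fake thread is blocked on a lock."""


def run_fake_threads(tasks):
    """Run generator-based fake threads round-robin until all finish.

    tasks maps a task name to a generator that yields ("acquire", lock)
    or ("release", lock) requests.
    """
    ready = deque(tasks.items())   # (name, generator) pairs able to run
    owner = {}                     # lock name -> name of the task holding it
    waiting = {}                   # task name -> (generator, lock it wants)

    while ready:
        name, task = ready.popleft()
        try:
            op, lock = next(task)
        except StopIteration:
            continue               # this fake thread has finished
        if op == "acquire":
            if lock in owner:
                waiting[name] = (task, lock)   # blocked: not rescheduled
                continue
            owner[lock] = name
        elif op == "release":
            owner.pop(lock, None)
            # Hand the lock to one fake thread that was waiting for it.
            for other in sorted(waiting):
                other_task, wanted = waiting[other]
                if wanted == lock:
                    owner[lock] = other
                    del waiting[other]
                    ready.append((other, other_task))
                    break
        ready.append((name, task))

    if waiting:
        # Nothing can run, yet some fake threads never finished.
        parts = []
        for name in sorted(waiting):
            lock = waiting[name][1]
            parts.append(f"{name} waits for {lock} (held by {owner[lock]})")
        raise Deadlock(", ".join(parts))


def worker(first, second):
    # Takes two locks in the given order, then releases them.
    yield ("acquire", first)
    yield ("acquire", second)
    yield ("release", second)
    yield ("release", first)


try:
    run_fake_threads({"A": worker("x", "y"), "B": worker("y", "x")})
except Deadlock as exc:
    print("deadlock:", exc)
```

With both workers taking the locks in the same order, the same scheduler runs to completion without complaint.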

A third agenda is related to Mark & Greg's, making Python's threads "real
threads" under Windows.  The fake thread agenda doesn't tie into that,
except to confuse things even more if you take either agenda seriously <0.5
frown>.
...
> I think that when you mention threads, green or otherwise colored,
> most people who are at all familiar with the concept will assume they
> provide I/O overlapping, except perhaps when they grew up in the
> parallel machine world.

They didn't suggest I/O to me at all, but I grew up in the disqualified
world <wink>; doubt they would to a Windows programmer either (e.g., my
employer ships heavily threaded Windows apps of various kinds, and
overlapped I/O isn't a factor in any of them; it's mostly a matter of
algorithm factoring to keep the real-time incestuous subsystems from growing
impossibly complex, and in some of the very expensive apps also a need to
exploit multiple processors).  BTW, I called them "fake" threads to get away
from whatever historical baggage comes attached to "green".
...
> Certainly all examples I give in my never-completed thread tutorial
> (still available at
> http://www.python.org/doc/essays/threads.html) use I/O as the primary
> motivator --

The preceding "99.99% of all thread apps I've ever written use threads
primarily to overlap I/O" may explain this <wink>.  BTW, there is only one
example there, which rather dilutes the strength of the rhetorical "all" ...
...
> this kind of example appeals to simple souls (e.g. downloading more than
> one file in parallel, which they probably have already seen in action in
> their web browser), as opposed to generators or pipelines or coroutines
> (for which you need to have some programming theory background to
> appreciate the powerful abstraction possibilities they give).

I don't at all object to using I/O as a motivator, but the latter point is
off base.  There is *nothing* in Comp Sci harder to master than thread
programming!  It's the pinnacle of perplexity, the depth of despair, the
king of confusion (stop before I exaggerate <wink>).

Generators in particular get re-invented often as a much simpler approach to
suspending a subroutine's control flow; indeed, Icon's primary audience is
still among the humanities, and even dumb linguists <wink> don't seem to
have notable problems picking it up.  Threads have all the complexities of
the other guys, plus races, deadlocks, starvation, load imbalance,
non-determinism and non-reproducibility.

Threads simply aren't simple-soul material, no matter how pedestrian a
motivating *example* may be.  I suspect that's why your tutorial remains
unfinished:  you had no trouble describing the problem to be solved, but got
bogged down in mushrooming complications describing how to use threads to
solve it.  Even so, the simple example at the end is already flawed ("print"
isn't atomic in Python, so the

    print len(text), url

may print the len(text) from one thread followed by the url from another).

It's not hard to find simple-soul examples for generators either (coroutines
& continuations *are* hard to motivate!), especially since Python's
for/__getitem__ protocol is already a weak form of generator, and xrange
*is* a full-blown generator; e.g., a common question on c.l.py is how to
iterate over a sequence backwards:

    for x in backwards(sequence):
        print x

    def backwards(s):
        for i in xrange(len(s)-1, -1, -1):
            suspend s[i]

Nobody needs a comp sci background to understand what that *does*, or why
it's handy.  Try iterating over a tree structure instead & then the *power*
becomes apparent; this isn't comp-sci-ish either, unless we adopt a "if
they've heard of trees, they're impractical dreamers" stance <wink>.  BTW,
iterating over a tree is what os.path.walk does, and a frequent source of
newbie confusion (they understand directory trees, they don't grasp the
callback-based interface; generating (dirname, names) pairs instead would
match their mental model at once).  *This* is the stuff for simple souls!
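
For readers on present-day Python 3, where generators exist and the keyword is spelled yield rather than the suspend used above, the same ideas can be sketched as follows.  The helpers leaves and dir_pairs are hypothetical names; dir_pairs is built on os.walk, the generator-based directory walker in the current standard library, and only approximates the (dirname, names) shape described above.

```python
import os


def backwards(s):
    """Yield the items of sequence s from last to first."""
    for i in range(len(s) - 1, -1, -1):
        yield s[i]


def leaves(tree):
    """Yield the leaves of a tree of nested lists/tuples, left to right."""
    for node in tree:
        if isinstance(node, (list, tuple)):
            yield from leaves(node)   # recurse into a subtree
        else:
            yield node


def dir_pairs(top):
    """Yield (dirname, names) pairs for top and every directory below it."""
    for dirname, subdirs, files in os.walk(top):
        yield dirname, sorted(subdirs + files)


for x in backwards("abc"):
    print(x)

print(list(leaves([1, [2, [3, 4]], 5])))  # [1, 2, 3, 4, 5]
```

The caller of dir_pairs writes an ordinary for loop over the pairs, with no callback to pass in.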
...
> Another good use of threads (suggested by Sam) is for GUI programming.
> An old GUI system, News by David Rosenthal at Sun, used threads
> programmed in PostScript -- very elegant (and it failed for other
> reasons -- if only he had used Python instead :-).
>
> On the other hand, having written lots of GUI code using Tkinter, the
> event-driven version doesn't feel so bad to me.  Threads would be nice
> when doing things like rubberbanding, but I generally agree with
> Ousterhout's premise that event-based GUI programming is more reliable
> than thread-based.  Every time your Netscape freezes you can bet
> there's a threading bug somewhere in the code.

I don't use Netscape, but I can assure you the same is true of Internet
Explorer -- except there the threading bug is now somewhere in the OS <0.5
wink>.

Anyway,

1) There are lots of good uses for threads, and especially in the Windows
and (maybe) multiprocessor NumPy worlds.  Those would really be happier with
"free-threaded threads", though.

2) Apart from pedagogical purposes, there probably isn't a use for my "fake
threads" that couldn't be done easier & better via a more direct (coroutine,
continuation) approach; and if I had fake threads, the first thing I'd do
for me is rewrite the generator and coroutine packages to use them.  So,
yes:  you win <wink>.

3) Python's current threads are good for overlapping I/O.  Sometimes.  And
better addressed by Sam's non-threaded "select" approach when you're dead
serious about overlapping lots of I/O.  They're also beaten into service
under Windows, but not without cries of anguish from Greg and Mark.
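
The non-threaded "select" style mentioned in point 3 can be sketched in present-day Python 3 with the standard selectors module, which wraps select and its relatives.  This is an illustration under stated assumptions, not Sam's code: read_all and the socket-pair setup are hypothetical, and local socket pairs stand in for real network connections.

```python
import selectors
import socket


def read_all(sources):
    """Read several sockets to EOF in a single thread.

    sources maps a label to a connected socket; returns label -> bytes.
    """
    selector = selectors.DefaultSelector()
    received = {}
    for label, sock in sources.items():
        sock.setblocking(False)
        selector.register(sock, selectors.EVENT_READ, data=label)
        received[label] = b""

    while selector.get_map():              # loop while any socket is registered
        for key, _events in selector.select():
            try:
                chunk = key.fileobj.recv(4096)
            except BlockingIOError:
                continue                   # not actually ready; wait again
            if chunk:
                received[key.data] += chunk
            else:                          # empty read: the peer closed its end
                selector.unregister(key.fileobj)
    selector.close()
    return received


a_read, a_write = socket.socketpair()
b_read, b_write = socket.socketpair()
a_write.sendall(b"first")
b_write.sendall(b"second")
a_write.close()
b_write.close()

result = read_all({"a": a_read, "b": b_read})
print(result)  # {'a': b'first', 'b': b'second'}
a_read.close()
b_read.close()
```

One loop services whichever socket has data, so no thread ever sits blocked on a single slow connection.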

I don't know, Guido -- if all you wanted threads for was to speed up a
little I/O in as convoluted a way as possible, you may have been witness to
the invention of the wheel but missed that ox carts weren't the last
application <wink>.

nevertheless-ox-carts-may-be-the-best-ly y'rs  - tim

Tim Peters