[Edu-sig] Comments on Kay's Reinvention of Programming proposal (was Re: More Pipeline News)

Paul D. Fernhout pdfernhout at kurtz-fernhout.com
Mon Mar 19 06:26:50 CET 2007


kirby urner wrote (way back on 2/16/2007):
> In the meantime, the Kay Group is thinking to reinvent
> the wheel and implement the whole show in like just
> 20K lines of code, intended for pedagogical purposes
> (not just pie in the sky).
> 
> http://irbseminars.intel-research.net/AlanKayNSF.pdf
> 
> More power to 'em, though of course OLPC isn't waiting
> for that project to finish (I doubt Kay would want it to --
> kids need Net access today -- tomorrow is not soon
> enough).  And besides, we already have Python.

Thanks for posting that link.

Finally got around to reading this entire proposal (i.e. finally got
around to having my Debian system talk via CUPS to the networked HP
printer in another building so I could print it -- and I should start by
thanking Alan Kay and his original Xerox PARC group for making that sort
of thing possible. :-).

These are some comments on:
   "Steps Toward The Reinvention of Programming: A Compact And Practical
Model of Personal Computing As A Self-Exploratorium"
  Alan Kay, Dan Ingalls, Yoshiki Ohshima, Ian Piumarta, Andreas Raab
  http://irbseminars.intel-research.net/AlanKayNSF.pdf
Related slides by Ian Piumarta:
  http://piumarta.com/papers/EE380-2007-slides.pdf
Related webcast by Ian Piumarta:
  http://lang.stanford.edu/courses/ee380/070214/070214-ee380-300.wmv

Obviously, there is only so much you can fit into a proposal, so these
comments don't mean Kay's group has not thought about the following.
And also, any proposal has to pick and choose what it emphasizes. So I
will bring up issues I think are important, but that does not mean the
project would necessarily be better off if it addressed all these
points, given its limited resources and attention.

To begin, there is no exploration of Gordon Moore's law and the
implications that enable this entire project. That is a surprise, as it
is the continued advances in processing power that allow avoiding
optimization and so following the path outlined: using general
techniques and writing simpler code (and perhaps then adding an
optimization layer later). It's not like previous generations of
implementors were all
dopes; they just often were working under different constraints. There
is one mention on page 14: "Moore's Law gives us different scales to
deal with every few years," but no discussion of how that has changed
what is worth doing in what contexts (or how it is likely to change).
Computers are about 100X faster for the same price than ten or so years
ago, they will be about 100X-1000X faster for the same price in another
ten or so years (hard to believe, but consider the 1MHz 8-bit C64 of
circa 1983 compared to a current 3GHz multi-core 64-bit desktop:
probably 10000X-100000X faster, with 10000X more memory, for about the
same amount of money roughly twenty-five years later). Moore's law can
be considered generally in
terms of exponential growth in technological capacity (or reducing
costs). See:
  http://www.kurzweilai.net/articles/art0134.html?printable=1
This trend is driving the re-examination of many "old" ideas in computer
science which are now suddenly feasible, especially in AI techniques
(which this proposal hints at). Hans Moravec talks about that a lot,
such as here:
  "When will computer hardware match the human brain?"
  http://www.transhumanist.com/volume1/moravec.htm
Essentially, Moravec argues that in the 1950s important researchers got
1-MIPS supercomputers (of that day), then in the 1960s more of them got
1-MIPS mainframes, then in the 1970s they all got bumped down to 1-MIPS
minicomputers, then in the 1980s bumped further down to 1-MIPS desktop
microcomputers. Only since 1990 have the desktop micros moved up the
scale of performance. Moravec talks about AI researchers, but I think
this trend may apply in other areas too. It also dovetails with the
general crunch in science funding as exponential growth of PhDs met
limited dollars and steady enrollments around 1970:
  http://www.its.caltech.edu/~dg/crunch_art.html
So in that sense, this proposal fails to introspect on how it is part of
a general trend -- suddenly lots of things are possible with computers
that weren't before due to performance constraints -- and one of the big
possibilities is to discard optimization for elegance in a lot of areas.
And maybe, paradoxically, through elegance, eventually discover even
greater performance. But I think this is a key failure -- to plan to
develop future computing tools while not wrestling with this fundamental
exponential trend in a serious way.
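
As a quick sanity check on those numbers, here is a back-of-the-envelope
sketch in Python (the eighteen-month doubling period is an assumption
for illustration only):

  # Rough check of the Moore's law figures discussed above;
  # returns the expected price/performance multiple after `years`.
  def speedup(years, doubling_period=1.5):
      return 2 ** (years / doubling_period)

  for years in (10, 20, 24):
      print("after %2d years: about %6.0fX" % (years, speedup(years)))

That yields roughly 100X after ten years and some tens of thousands X
over the C64-to-now gap, in line with the estimates above.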

There is also no mention of the other Moore -- Charles Moore and Forth.
That is surprising, as a lot of the proposal is sort of about
reinventing Forth and following Forth ideals (simplicity, unification,
extendability, threading, chained words, nested dictionaries, etc.).
Chuck Moore accomplished an entire multi-tasking, multi-user, real-time
OS, including its own compiler, linker, loader, and editor, in about
32K bytes, running on slow 1-MIPS minicomputer processors of the 1970s.
(That 32K was for a bells-and-whistles system -- the core of Forth is
only 1K or 2K bytes.) The 20,000 lines of code proposed here for a
newer Smalltalk, at roughly 50 bytes a line, would be about 1000K (or
1MB, though likely less compiled) and probably will need a CPU 1000X
more powerful than what Chuck worked with in the 1970s to work
acceptably. So there is still a lot of bloat, even as Kay writes of how
difficult this will be to accomplish. Granted, those early Forths were
limited in the graphics they could display -- but there is no clear
discussion here of why graphics by itself would add so much bloat. By
the way, most modern Forths don't bootstrap (a big focus of this
proposal), they "meta compile".
  http://www.ultratechnology.com/meta.html
From there: "Forth Meta Compilation ... The simplicity of the Forth
language makes it easy to not only extend the language into a tongue
that is perfectly suited to a particular problem but rewrite the
language from the ground up as you see fit. It is a language that
encourages experimentation with problems, thought, and the language
itself. ..." Sounds a lot like this proposal, does it not? Yet there is
no reference to Forth anywhere in the proposal.
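
To make "the core of Forth is only 1K or 2K bytes" concrete: the whole
architecture is little more than a data stack plus a dictionary of
named words. Here is a toy sketch of that idea in Python (obviously not
Moore's Forth, and this tiny word set is just for illustration):

  # A toy Forth-like interpreter: a data stack plus a dictionary
  # of executable "words" is essentially the whole machine.
  stack = []
  words = {
      "dup": lambda: stack.append(stack[-1]),
      "+":   lambda: stack.append(stack.pop() + stack.pop()),
      "*":   lambda: stack.append(stack.pop() * stack.pop()),
      ".":   lambda: print(stack.pop()),
  }

  def interpret(source):
      for token in source.split():
          if token in words:
              words[token]()            # execute a defined word
          else:
              stack.append(int(token))  # anything else is a literal

  interpret("3 dup * 4 dup * + .")      # prints 25, i.e. 3*3 + 4*4

Extending the language is just adding entries to that dictionary --
which is also why meta compiling a new Forth on top of an old one feels
so natural.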

QNX was an early microkernel message-passing distributed OS for x86
hardware, roughly contemporary with MS-DOS. Oh, if only IBM had used it
instead... It had a 44K kernel back then in the early 1980s, and you could
effortlessly use resources on any other computer in the network. From:
  http://en.wikipedia.org/wiki/QNX
"QNX interprocess communication consists of sending a message from one
process to another and waiting for a reply. This is a single operation,
called MsgSend."
So at least some solutions have long existed to the OS-level message
passing issue (including in device driver design). It's more an issue of
just getting people to use a common convention, and has been for
decades. Kay's group could even just add such message passing support
and interfaces as a convention to, say, the Linux kernel or even just
GNOME with GTK graphics -- and get a lot more users a lot more quickly.
But hacking GNOME and arguing on those lists would probably not be as
much fun for Kay's group as making a new system. :-) I'll agree making a
new system may well be more productive than arguing, too, of course.
:-) See:
  "Linus versus GNOME"
  http://www.desktoplinux.com/news/NS8745257437.html
Note: the Tcl/Tk widget toolkit also has a widely used "send" command:
  http://wiki.tcl.tk/1055
People use it all the time for gluing such GUI systems together.
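
The convention itself is simple enough to sketch in a few lines of
Python (this is just the general send/receive/reply pattern, not the
actual QNX API; the names here are made up):

  # A minimal synchronous send/receive/reply sketch using threads.
  import queue, threading

  class Server:
      def __init__(self, handler):
          self.handler = handler
          self.mailbox = queue.Queue()
          threading.Thread(target=self._serve, daemon=True).start()

      def _serve(self):
          while True:
              message, reply_box = self.mailbox.get()   # receive
              reply_box.put(self.handler(message))      # reply

      def send(self, message):
          # From the caller's view this is one operation: send a
          # message and block until the reply comes back.
          reply_box = queue.Queue()
          self.mailbox.put((message, reply_box))
          return reply_box.get()

  echo = Server(lambda message: "echo: " + message)
  print(echo.send("hello"))   # -> echo: hello

The hard part was never the mechanism; it is getting everyone to route
their interactions through one such convention.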

Also, ultimately, even if this general approach succeeds using all of
the key ideas outlined, you will get a lot of abstractions built on top
of the common message passing system. In the same way, for example, we
get a lot of different web applications built on top of http or even
just plain sockets, perhaps exchanging XML documents with various
semantics. But, ultimately, these different abstractions may not be
compatible, or may require interface layers or complex code to deal
with various XML interchange formats that disagree about the basic
meaning of common terms (e.g. what a "purchase order" is). Kay's
proposal ignores that problem -- and it is a really hard one relating to
shared meaning. And the failure to address it is one reason Squeak has
remained mired in bit rot and incompatible packages (which people are
now struggling heroically to fix).
"About Squeak" -- Andres Valloud (and some comments by me)
http://blogten.blogspot.com/2004/12/about-squeak.html
 "Belling the cat of complexity" -- Paul Fernhout
http://lists.squeakfoundation.org/pipermail/squeak-dev/2000-June/001371.html
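
To make the shared-meaning problem concrete, here is a tiny Python
example with two hypothetical "purchase order" formats that share
syntax but not semantics (both formats are invented for illustration):

  # Both documents parse fine; they disagree on what <price> means.
  import xml.etree.ElementTree as ET

  # In format A, <price> is the price per unit.
  order_a = ET.fromstring(
      "<order><item>widget</item><price>5</price><qty>10</qty></order>")
  # In format B, <price> is the total for the whole line.
  order_b = ET.fromstring(
      "<order><item>widget</item><price>50</price><qty>10</qty></order>")

  def total(order):  # written against format A's semantics
      return int(order.find("price").text) * int(order.find("qty").text)

  print(total(order_a))   # 50, as intended
  print(total(order_b))   # 500 -- no error raised, just wrong meaning

No amount of common message passing machinery catches that kind of
disagreement; it has to be negotiated at the level of meaning.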

The proposal talks about version control and rolling back state and
running programs backwards -- but provides no substance to go with the
ideal. (I've been plugging away at the Pointrel data repository system,
based on modeling incremental relationships using triads, for
twenty-five years or so, so I know a little of that area. :-) Obviously
it's doable -- for example, OCaml's debugger does it with checkpoints
and recomputing from the last checkpoint. Anyway, I'd have liked to see
more than handwaving there, and I'll be curious to see how they pursue
it.
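
For what it's worth, the checkpoint-and-replay idea can be sketched in
a few lines; this is just the general technique, not OCaml's or
Pointrel's actual mechanism:

  # Roll back by restoring a checkpoint and replaying logged events.
  import copy

  class Replayable:
      def __init__(self, state):
          self.checkpoint = copy.deepcopy(state)
          self.state = state
          self.log = []                 # events since the checkpoint

      def apply(self, event):
          self.log.append(event)
          event(self.state)

      def rollback(self, n):
          # Undo the last n events by replaying the earlier ones.
          self.state = copy.deepcopy(self.checkpoint)
          del self.log[len(self.log) - n:]
          for event in self.log:
              event(self.state)

  s = Replayable({"x": 0})
  s.apply(lambda st: st.update(x=st["x"] + 1))
  s.apply(lambda st: st.update(x=st["x"] * 10))
  print(s.state)   # {'x': 10}
  s.rollback(1)
  print(s.state)   # {'x': 1}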

There is once again the common criticism leveled at Smalltalk of being
too self-contained. Compare this proposal with one that suggested making
tools that could be used like a telescope or a microscope for relating
to code packages in other languages -- to use them as well as possible
on their own terms (or perhaps virtualized internally). Consider how the
proposal suggests scripting all the way down -- yet how are the
scripting tools built in Squeak? Certainly not with the scripting
language. And consider that there are always barriers to any system -- where
you hit the OS, or CPU microcode, or proprietary hardware
specifications, or even unknowns in quantum physics, and so on. :-) So
every system has limits. But by pretending this one will not, this
project may miss out on the whole issue of interfacing to systems beyond
those limits in a coherent way. For example -- OpenGL works, represents
an API that (counting its IRIS GL ancestry) has matured over nearly a
quarter century, and is linked with lots of hardware and software
implementations; why not build on it instead of "bitblt"? Or consider
how Jython can use Swing and other such Java libraries easily. And
consider one failure the proposal itself plans for: the decision to
punt on printing -- no attention there to the difficulties of
interfacing with a diversity of OS services as a guest, or of relating
to prior art. This is where Python
often shines as a system (language plus libraries plus community) -- and
Python is based on a different design philosophy, or perhaps a
different design ethic. Python has also prioritized "modularity" from the
beginning, which has made a lot of these issues more manageable; Kay's
proposal talks a lot about internal integration, but the word
"modularity" is not even in the document. In this sense, I think Kay's
group are repeating mistakes of the past and also dodging another very
difficult issue -- one that Python somehow has more-or-less gotten right
in its own way. It's their right to do their own thing; just pointing
out a limitation here.
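
On the Jython point above, the interoperability really is that direct:
Java's Swing widgets become ordinary Python objects. A minimal sketch
(relying on Jython's bean-property keyword arguments; the toy window is
mine):

  # Run under Jython: Swing used directly from Python code.
  from javax.swing import JFrame, JButton

  def greet(event):
      print("Hello from Swing via Jython")

  frame = JFrame("Jython + Swing", size=(240, 100),
                 defaultCloseOperation=JFrame.EXIT_ON_CLOSE)
  frame.add(JButton("Greet", actionPerformed=greet))
  frame.visible = True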

There is also a sense of wanting a system supporting multiple levels of
programming ("scripting all the way down"; the word "level" comes up
again and again). Yet there does not seem to be anything deep in the
theory of the system about multiple levels (see also the earlier point
on abstractions). There is handwaving about commonality of
implementation and perhaps syntax of message passing -- but there is
nothing about the deeper issue of moving between different semantic
levels (an example of this sort of problem is trying to relate an
error message from the bowels of a system back to the part of the
specification it corresponds to -- this is a huge stumbling block for
most beginners, and the sort of thing compiler writers spend much more
time on than outsiders might expect). I think the avoidance of this
problem is another variant on the issue of interoperability. When a
system becomes a set of levels, or nested abstractions, then
interoperability with parts of yourself becomes an issue as well. :-) I
don't have an answer to this problem, but it's an example of the sort of
deep problem that the future of computing may revolve around, and which,
as far as I see, is unaddressed here. LearningWorks, a Smalltalk
environment for learning, explored some of these issues by modifying the
debugger to omit some parts of stack traces, to focus the novice on
problems in their own code.
  "The LearningWorks development and delivery frameworks"
  http://portal.acm.org/citation.cfm?doid=262793.262808
I'm not saying that modifying the debugger is a general solution to this
problem; it's just one example of trying to address part of a large
problem of "context".

Moving from the abstract to the concrete: in general, big successes in
widespread adoption of a practical new system -- Skype, for example --
pick a good platform of robust-enough technologies and meet an
important need. Today, the JVM (Java) and OpenGL are fairly solid
platforms to build systems on. Is portability or bootstrapping still an
interesting question? It might perhaps be more
interesting to see what you could build on, say, the JVM (prototyped via
Jython?) and OpenGL using all of the same good ideas in the proposal.
Would these goals really lose very much if first implemented on that
platform compared to C++ and who knows what windowing toolkits? Perhaps
something I might think about exploring myself. :-) Yes, I know such a
suggestion goes against the expressed intent of creating a system that
makes sense all the way down to the metal -- but that is part of my
point: is there really as much benefit to that these days as one might
think, compared to a system which supplies software telescopes and code
microscopes to look at the network of processes around you, whatever
language they are in? Why get an entire new community to waste its time
porting this new framework everywhere when a lot of engineering has
already gone into this problem? Why not stand on the shoulders of
giants, and make something that goes far, far beyond the
JVM/Java/Jython/Swing/etc. and OpenGL as plain toolkits? What sort of
software do people really need these days anyway?

There is a mention at the end about needing solutions for high schools,
but they already have Python and lots of good libraries for it. So,
there is talk about reinventing curricula, but all the focus is on
making the technology. Art would echo this point, I would think, if he
were still around. Personally, I am all for better computer technology,
and think it will help kids learn in various indirect ways, but it's a
little sad that you often just can't float an R&D proposal on those
merits alone -- just on making better software development environments
for *everyone*.

Nonetheless, Kay's proposal is exactly the kind of exploratory research
the NSF should be funding. These comments aren't meant as criticisms of
the project's worth. If anything, they show how worthy it already is:
it sparks new ideas and is good enough to be worth commenting on at
length. IMHO the NSF should fund a dozen projects like this, or even
hundreds. And if anyone can accomplish these goals, it will likely be
Kay and his group. There are lots of great ideas in this proposal (many
of the best ones seem to be about moving Smalltalk internals in a more
functional-programming direction). The "Albert" approach seems both
sound and innovative for O.O. bootstrapping (while being a little
Forth-like?). I wish them the best of success with all of it.
This project will at the very least produce a lot of good ideas for
others to build on -- and make them available under free or open source
licenses (even for Python to use). That's a good thing. And I may play
with some of them myself (especially the potentially infinite late
binding of dispatchers, which is a really cool idea).
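
For instance, here is my rough reading of the late-bound dispatcher
idea, sketched in Python (my own toy, not the proposal's "Albert"
design): every message send goes through a dispatcher that is looked up
late, on each send, so it can be replaced -- or chained -- while the
object is running.

  # Messages route through whatever dispatcher is installed now.
  class Dispatched(object):
      def __init__(self, dispatcher):
          self.dispatcher = dispatcher

      def __getattr__(self, name):
          # Called for unknown attributes: delegate the "message"
          # to the dispatcher chosen at call time.
          return lambda *args: self.dispatcher(self, name, args)

  def base_dispatch(receiver, message, args):
      return "no handler for %s%r" % (message, args)

  def logging_dispatch(receiver, message, args):
      print("send: %s %r" % (message, args))         # observe...
      return base_dispatch(receiver, message, args)  # ...delegate

  obj = Dispatched(base_dispatch)
  print(obj.greet("world"))    # no handler for greet('world',)
  obj.dispatcher = logging_dispatch
  obj.greet("world")           # same object, every send now traced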

I also liked Kay's use of the term "singularity" to describe a point in
the bootstrapping process where a system becomes self-defining and
self-inspectable. That's always one issue that has held me back from
building a Smalltalk on top of Forth -- realizing that at some point
you really don't care much about the level below, as your Forth
interpreter loop becomes dominated by a Smalltalk message-passing loop,
so why not code the base level in C/C++ to get more speed? The argument
for having Forth available a decade ago was that you could code fast
extensions without needing a big C compiler lying around. Today,
especially on GNU/Linux systems, that is much less of an issue than it
was ten years ago, since the C compiler is probably always there (at
least on free systems), disk space to store the compiler is cheap,
memory to run the compiler is cheap, and the community support for GCC
is so large that its complexity becomes manageable through means other
than understanding it all yourself (which was easy to do with Forth,
but not with GCC).

Forth has a simplicity and understandability along the same lines Kay's
proposal moves towards. Yet, I think some of the same forces that lead
people to use GCC instead of Forth for most projects will lead this new
project to become marginalized -- unless it addresses at least the
interoperability issue outlined above. (*) The other points I raise,
such as managing differing abstractions, are important, but the
solutions for them will likely come out of the masses of people working
in this area, and in that sense may show the strength of this
proposal's intended approach of getting a lot of people involved in it.
Still, we can see the
deep differences between Python and Smalltalk (or its possible
successor) in terms of emphasis in the community on the importance of
interoperating with lots of existing systems. And I find this sad,
because there is no inherent reason a system like the one Alan Kay and
his colleagues propose could not excel at interoperability (such as
Jython achieves for Java) -- it is more an issue of emphasis and
perspective. So in that sense, I think Kay's project could greatly
benefit from at least a little of the Python spirit, just as Python
could benefit from a lot of the ideas Kay and Smalltalk have long had
going for them (like "edit and continue" or message passing).

--Paul Fernhout

(*) Unlike Smalltalk, Forth also has trouble being self-documenting
because most temporary values on the stack go unnamed; that is IMHO its
greatest weakness as a language. You can compensate with comments, but
that requires extra effort both to write and to understand.
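
To illustrate in Python-ish terms, compare the same computation done
stack-style (as Forth would) and with named temporaries:

  # Stack style: anonymous values; the reader simulates the stack.
  stack = [3]
  stack.append(stack[-1])                   # dup
  stack.append(stack.pop() * stack.pop())   # *  -> 9
  stack.append(4)
  stack.append(stack[-1])                   # dup
  stack.append(stack.pop() * stack.pop())   # *  -> 16
  stack.append(stack.pop() + stack.pop())   # +  -> 25

  # Named temporaries: the intent is readable at a glance.
  width, height = 3, 4
  result = width * width + height * height  # 25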

