[Edu-sig] Shuttleworth Summit
Paul D. Fernhout
pdfernhout at kurtz-fernhout.com
Fri Apr 21 15:04:27 CEST 2006
Ian Bicking wrote:
> Paul D. Fernhout wrote:
>> [I
>> personally think the Squeak approach would be more stable and
>> maintainable though, just 2000 lines of core C to port per platform,
>> with widgets built on that, and a dynamic loading facility for other
>> native code.]
>
> I'm not clear what the advantage of this kind of strategy is over
> CPython. Sure, 2000 lines of C is easier to port, but CPython is
> ported, so that's not a problem. The graphical layer isn't portable,
> but pygame is fairly portable and runs on a more optimized layer (SDL)
> than what Squeak runs on (AFAIK -- though I haven't paid any attention
> to what their graphical infrastructure is like for years).
>
> I guess I just don't understand the complaints about Python graphics.
> Sure, there's work to do, but the core graphics capabilities provide a
> solid foundation, in addition to some good higher level things as well
> (like VPython). If Squeak has some good higher-level ideas, then those
> would be ported, I don't see any way you could leverage the Squeak code
> directly.
I think the main issue is not graphics; it is more the cross-platform
development environment. But there are really several issues when you look
at Squeak:
* cross-platform graphics and other systems (e.g. sound, files, sockets).
* cross-platform development tools using those graphics, and these are good
tools, including complete source version history, cross referencing
function use, object inspection, and so on.
* cross-platform object store (saving the system state)
* most of the system (including widgets and most of the VM) is written and
maintained in the native language of Smalltalk (though in some cases
translated to C), and so is cross-platform
* because the system is so self-contained, it is easier to make it run on
bare hardware or as a browser plugin.
I've been mostly doing Python the past few years, so my Squeak knowledge
may be a little out of date (the last two years have seen some major
changes), but basically, there are four parts to the overall Squeak
architecture: about 0.1% base platform specific code (C), about 1% VM code
(C generated from Smalltalk), about 2% loadable code (any language, maybe
some generated from Smalltalk), and about 96.9% the rest of the object
system (pure Smalltalk). (The percentages are just my guess of approximate
code size to give you a feel for it). Some more details on the parts follow.
One part is about 2000 lines of code which mostly support displaying a
bitmapped window on the screen, handling mouse and keyboard events,
talking to files, the network, and a basic sound system. [Some of these
parts might be commented out in various situations, like running headless,
e.g. "Embedded Squeak".] This layer could potentially be used by Python as
is, although it probably would need to be tweaked a little to have fewer
dependencies on the rest of the Squeak system (i.e. it may expect a
certain object record format). But it is true that one could use SDL (or
even wxWidgets, or OpenGL) to supply many of these services. In theory,
one might be able to make use of libraries like the Apache Portable Runtime
http://apr.apache.org/
to do some of this too. The Python way would probably be to use these
prebuilt systems and accept the extra footprint and the more limited
portability to exotic or bare hardware. Squeak as an idea has long
included the goal of running on bare hardware, which has been demoed, but
I'm assuming most people here would be content with running just on
GNU/Linux, Mac OS X, and Windows, which any combination of SDL, OpenGL,
wxWidgets, and APR would cover.
Another is a larger amount of C code which is generated by translating a
subset of Smalltalk to C -- this is what defines the bulk of the bytecode
processing VM plus related support routines. Since this is written in a
subset of Smalltalk, it is possible to run this VM code within a Smalltalk
system (as simulation in a sense) and develop and debug it with a comfy
environment. PyPy's RPython is somewhat similar in purpose to Squeak's
"Slang".
Then there are dynamically loadable modules which can be written in any
language. Note that some of these modules may be handcoded C or C++, but
others might be written in Smalltalk and translated to C using Slang for
efficiency. There is a Squeak 3D engine that took this approach, starting
out merged into the VM and then becoming a module. The more complex sound
primitives written in Smalltalk also migrated out of the VM and into
modules. One needs a common cross-platform interface for this. Again,
perhaps APR might help? One could try using ideas from the related Squeak
codebase.
Then there is the rest of the Squeak system (including compiler and
development tools), which is written in Smalltalk and works on top of the
previous three layers, running on the VM. In practice all GUI widgets are
defined at this top level in Smalltalk (unless you did something funky
with a loadable module for calling wxWidgets or native widgets and such). When a
Smalltalk "image" is written out (or read in), what is written (or read)
is the structure of objects at this fourth layer, and that layer is
written and loadable in a completely cross-platform way, meaning that for
all the core development tools and so on, you can move your image from one
machine to another and just run it. (Of course, if you depend on
platform-specific dynamically loadable modules, like for Surround Sound,
that part might not work.)
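To make the image idea concrete for Python readers, here is a minimal
sketch using the standard pickle module. It is only an analogy, not how
Squeak does it: pickle saves data objects, not running processes or
compiled code, and the names snapshot, restore, world, and window are made
up for illustration.

```python
import pickle


def snapshot(root):
    """Serialize everything reachable from root into a bytes 'image'."""
    return pickle.dumps(root, protocol=pickle.HIGHEST_PROTOCOL)


def restore(image):
    """Rebuild the object graph from a previously saved 'image'."""
    return pickle.loads(image)


# A tiny object graph with a cycle: the window points back at its world.
world = {"name": "demo", "windows": []}
window = {"title": "Inspector", "owner": world}
world["windows"].append(window)

image = snapshot(world)   # platform-neutral bytes; could be written to a file
copy = restore(image)     # could just as well happen on another machine

# The cycle survives the round trip: the window's owner is the restored world.
restored_window = copy["windows"][0]
```

The bytes produced this way can be moved between machines, which is the
property that matters here; anything that depends on native modules would
still have to be present on the receiving side.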
I think a Python using a similar architecture would be pretty neat. But, a
Python system running on top of SWT or wxWidgets (with some of its own
widgets) and with access to the Apache Portable Runtime library might get
many of the benefits at little cost. It wouldn't get all the benefits --
Squeak can supply a full GUI with development environment and compiler in
a little over one megabyte and run on bare hardware with just a little
more glue, but it would be a nice start.
For me, a big issue is transparency, not "graphics" by itself. I like as
much of the system to be accessible from within itself as possible. I get
frustrated, say, when I can't drill down into the code of wxWidgets easily
(and see it as Python). Squeak has that kind of transparency most of the
time. (Not always, because there is a VM, but mostly, and even the VM can
be self-hosted and simulated). Still, there are issues with that
transparency for beginners (who get confused by seeing too much code at
once, or who break parts of the system they should not mess with at first,
such as making *all* windows hang when opened, which kills the debugger).
But even experienced users can suffer, when they hang the system by
using too many objects or changing core base classes. So I like the
promising idea of developing and debugging across images (or VMs) -- that
is, you develop using tools in a VM you are not also changing, but they
work across a socket to talk to another VM where your application is
running. I have one prototype that does that somewhat (for a custom
language); I built a socket server into the VM. I realized later it would
probably be better to build a client into the VM instead, and have just one
redirecting server on the machine, which coordinates client debuggers and
client applications; that way you just use one common port, and can debug
multi-VM stuff as well.
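A rough sketch of the redirecting-server idea, under loud assumptions: it
is not the prototype mentioned above, the Broker class and the message
fields ("to", "from", "op") are invented for illustration, and
socket.socketpair() stands in for clients connecting to one well-known TCP
port so that the example runs in a single process.

```python
import json
import socket


def send_msg(sock, msg):
    """Send one JSON message terminated by a newline."""
    sock.sendall(json.dumps(msg).encode("utf-8") + b"\n")


def recv_msg(sock):
    """Read bytes up to the next newline and decode one JSON message."""
    buf = b""
    while not buf.endswith(b"\n"):
        chunk = sock.recv(1)
        if not chunk:
            raise ConnectionError("peer closed the connection")
        buf += chunk
    return json.loads(buf.decode("utf-8"))


class Broker:
    """Hypothetical redirecting server: forwards messages between named clients."""

    def __init__(self):
        self.clients = {}  # client name -> broker-side socket

    def connect(self, name):
        """Register a client and return the client's end of the connection."""
        broker_end, client_end = socket.socketpair()
        self.clients[name] = broker_end
        return client_end

    def route_one(self, sender):
        """Read one message from sender and forward it to msg['to']."""
        msg = recv_msg(self.clients[sender])
        msg["from"] = sender
        send_msg(self.clients[msg["to"]], msg)
        return msg


broker = Broker()
debugger = broker.connect("debugger")  # tools live in this "VM"
app = broker.connect("app")            # application under development

# The debugger asks the application about one of its objects.
send_msg(debugger, {"to": "app", "op": "inspect", "name": "counter"})
broker.route_one("debugger")
request = recv_msg(app)

# The application answers whoever asked, again through the broker.
send_msg(app, {"to": request["from"], "op": "result", "value": 42})
broker.route_one("app")
reply = recv_msg(debugger)

for sock in [debugger, app] + list(broker.clients.values()):
    sock.close()
```

Because both sides are clients of the broker, a hung application cannot
take the debugger down with it, and more VMs can be added just by
registering more names.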
I also have a feeling about complexity and Squeak, which is related to the
struggles the Squeak community has had managing rapid changes to core
Squeak features. Building on the Squeak vision of the image, my feeling
is that one should have one image per application (which is somewhat
more how Python acts in practice) rather than try to bundle several
applications together into one image. That way, applications that work would stay
working, rather than break every time somebody modifies the base system.
Disk space and memory and bandwidth are so cheap now, and human time is so
expensive, so why not just have lots of little applications and VMs running
at the same time? Sure, maybe later one can do as Java 1.5 does and have some
VM sharing, but at the start, I'd rather see lots of robust independent
applets with wildly different versions of every library, but see them all
working and relatively bulletproof while other innovations were going on
in the community. But that would require easier cross-image
communications, perhaps made easier by the system I outlined above for
debugging and remote development. This would of course require some rarely
changing common communications protocol (or at least, one with versions).
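One way to read "a protocol with versions" is to stamp every message with
a version and refuse incompatible ones. The sketch below assumes a JSON
envelope and a (major, minor) scheme where only a differing major number
is incompatible; PROTOCOL_VERSION, encode, and decode are hypothetical
names, not part of any existing system.

```python
import json

PROTOCOL_VERSION = (1, 0)  # hypothetical (major, minor) version


def encode(op, payload):
    """Wrap an operation and its payload in a versioned envelope."""
    return json.dumps(
        {"version": list(PROTOCOL_VERSION), "op": op, "payload": payload}
    )


def decode(text):
    """Unwrap an envelope, rejecting messages from another major version."""
    msg = json.loads(text)
    major, _minor = msg["version"]
    if major != PROTOCOL_VERSION[0]:
        raise ValueError("incompatible protocol major version: %d" % major)
    return msg["op"], msg["payload"]


op, payload = decode(encode("ping", {"seq": 1}))
```

An application image built against version 1.3 could then keep talking to
one built against 1.0, while a 2.x peer would be rejected up front rather
than failing in some obscure way later.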
Anyway, as Alan Kay says, "burn the diskpacks", which in this case I would
suggest means: don't just try to ape Squeak in Python, but, building on
Python's (and Squeak's) strengths, and paying attention to the lessons
learned from viewing them as experiments, build something better.
> As for actually integrating with Smalltalk, I suspect embedding the
> Squeak VM in Python is feasible.
There are a couple of ways to do this:
* Have two VMs side by side (but communication is a pain)
* Have one unified VM (but one language may suffer or need to change
somewhat for ease of doing this).
The VMs can be written directly in C (or whatever: C++, Objective C, or
OCaml :-). Or they could be written in a subset or derivative of Python
(PyPy's RPython) or Smalltalk (see its implementation language, Slang:
http://minnow.cc.gatech.edu/squeak/2267 ). Or one language could be
written on top of another (though performance might be a big issue if the
object models mismatch).
In this case, given Python is more flexible in using dictionaries for
objects (though slower), I would suggest it would make more sense to use a
unified VM approach, and put a Smalltalk-like syntax on top of an existing
Python VM and Python object model (maybe with a couple of tweaks) and just
see how far that goes. This would also have the benefit of making it easy to
write a "Self"-like prototype-based Smalltalk on top of Python, using Python's
dictionary mechanisms. Squeak's newer GUI (Morphic) is prototype oriented,
and is derived from work on Self. I already have a variant of a Smalltalk
parser written in Python, and they are not that hard to do (Smalltalk is
an easy language to parse).
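As a minimal sketch of what a "Self"-like object on top of Python
dictionaries might look like (this is not the parser mentioned above, and
Proto, lookup, send, and clone are names invented for the example): slots
live in a plain dict, lookups delegate to a parent, and Smalltalk keyword
selectors such as x:y: are just dictionary keys.

```python
class Proto:
    """Hypothetical Self-style object: dict slots plus parent delegation."""

    def __init__(self, parent=None, **slots):
        self.parent = parent
        self.slots = dict(slots)

    def lookup(self, name):
        """Search this object's slots, then its parent chain."""
        obj = self
        while obj is not None:
            if name in obj.slots:
                return obj.slots[name]
            obj = obj.parent
        raise AttributeError(name)

    def send(self, selector, *args):
        """Send a message: call method slots, return data slots as-is."""
        value = self.lookup(selector)
        if callable(value):
            return value(self, *args)  # receiver is passed explicitly
        return value

    def clone(self, **slots):
        """Make a child object that inherits from this one."""
        return Proto(parent=self, **slots)


def set_xy(self, x, y):
    # Writes go to the receiver's own slots, never to the prototype.
    self.slots["x"] = x
    self.slots["y"] = y
    return self


point = Proto(x=0, y=0)
point.slots["x:y:"] = set_xy   # Smalltalk-style keyword selector as a key

p = point.clone()
p.send("x:y:", 3, 4)           # roughly "p x: 3 y: 4" in Smalltalk syntax
```

After this, p answers 3 and 4 for x and y while the prototype point still
answers 0 for both; a Smalltalk-like front end would only need to turn
parsed message sends into calls to send.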
I've said a lot of nice things about Squeak, but I'll add here why I use
Python instead. As a Squeak negative, a big issue for me (others will
disagree) is the license and licensing history. Anything that Disney
touches scares me, for example, and I don't think the stuff developed when
the Squeak team was at Disney is clearly licensed (I kept raising the
Python licensing problems example (CNRI claiming it was never formally
licensed), and that was just dealing with a non-profit!).
http://www.python.org/download/releases/1.6/license_faq/
The Squeak license even as it is isn't formally "open source" or "free"
for several reasons. I could have fixed Squeak's technical issues (and it
has several I have not mentioned), but I could never get past the license,
so after it seemed no one "in charge" cared much about fixing it the way
I wanted it fixed, and there seemed to be little community interest in
starting a from-scratch reimplementation, I moved on. On Python's plus side, it has better
and more libraries than Squeak, has a bigger community, has a C-like
syntax the masses find more acceptable (I still prefer Smalltalk's keyword
syntax though, along with blocks in control structures, though I like
indentation), it has a relatively good licensing history, and it has
widespread commercial use (good for consulting). Python misses many of
Smalltalk's development tools overall, but those are more easily remedied
by a programmer than changing a license set in stone by two big
corporations, creating a trap which could spring shut at any moment. As I
say, people disagree with my perception of the license. Squeak's still a
neat system, and for most people seems free enough. But Python has had
more traction and really is free.
--Paul Fernhout