PySqueak issues: image storage
In trying to think further through what would be involved in supporting Squeak-like (or just Smalltalk-like) capabilities for Python in constructivist education, I think, on reflection, that the single biggest issue is the Squeak/Smalltalk "image". There are lots of other issues, but I now think those are more easily solved by just programming, often using existing Python libraries, than this one, which reflects deeper issues.

For those of you unfamiliar with the notion of a Smalltalk "image", it is essentially this: you can pick "save image" from your running Squeak environment and the whole running VM's object memory is saved to one file. Then when you start up Squeak and specify your image file, everything is back the way you had it when you saved (in theory) -- windows, open files, open socket connections, everything. In practice, open socket connections and open files sometimes can't be reopened (the port may be in use for a server socket, or a server may be down for a client socket), files may have been deleted or locked, a database may have changed, and so on. (And making this work under the covers can be non-trivial.) But, at least, the system tries to put everything back the way it was, and it almost always does a perfect job for plain GUIs.

What's more, you can bring this file to any machine on any OS and processor that runs a fairly similar version of Squeak (like moving from Mac/PPC to GNU/Linux/AMD64), and just start it up and you are back where you saved.

And here is the key point: you get all this for "free" when you work with Squeak. You generally don't have to use a "pickle" library yourself, or write object saving and loading code for your windows, or do anything like that. It just works.

Why is the image so important for the novice user in a constructivist educational setting?
If a learner is learning by building, then it stands to reason that tomorrow they would more often like to build on top of what they have already built than start from scratch. Starting from scratch can be a good learning experience, no doubt, but it gets old if you do it over and over again. So, when a learner can save their image and reload it tomorrow, they are exactly where they left off. If they are in the midst of building a simulation by dragging graphical widgets around, then they just save it, and tomorrow there it is. And here is the key: they never had to write any saving or loading code to do this. And neither did the author of the underlying package they are using. This is a big thing IMHO. It lets the learner focus more on what they are doing without the distraction of "how can I save and load this".

Now, in practice, Squeak does save and load things outside the image. It uses files to store code, communicates over sockets, uses databases, and so on. In some ways, using an external file to save, say, your email seems much safer than storing it just in an image (which is a binary file and could easily get corrupted). To do that, an application has to use some explicit form of representing objects as text and reading them back in itself. If you want to share a small part of your image, like just the simulation you wrote, then it makes sense to be able to export that part (preferably as text) and share it (most Smalltalks have some support for this, equivalent to pickle or XML object writing). So the image isn't everything. But for the basics of where windows are, what widgets are in them, and what dynamic objects exist in the environment, it does a great job.

So, where is the Python PySqueak problem? The graphics widget set is the biggest issue (sockets, files, databases, etc. are hard, but simpler). Smalltalks are designed from the ground up to do this.
In Squeak's case, all the widgets are native to Squeak (implemented in Smalltalk rather than supplied by the OS), so their state is defined purely by Smalltalk objects. In other Smalltalks that use the platform's native widgets, how the windows will be created and then recreated on reloading an image is thought through at the beginning, and that part is usually hidden from the user. Figuring out what native GUI widgets are up on the screen, and how to save that and reload it, all transparently to the user, is a non-trivial thing, and it varies from widget set to widget set.

Since Python wants to be platform agnostic, and since there are a lot of widget sets out there (x, wx, tk, swt, swing, gtk, qt, mac flavors, mozilla and other web browser widgets, etc.), plus a lot of code written for them, that means a bit of a problem. I think in theory one could write widget set savers and reloaders that know enough about a specific widget set to walk the tree of displayed widgets and rebuild it. But that is a lot of work that needs to be done for each widget set. And then, ideally, it all needs to work with existing widget-using code that was not written against a well-designed library that hides these reloading problems.

So, I think that would be the biggest issue to solve -- giving Python an image capacity. It doesn't have to be solved, because one could always continue with the Python assumption that applications will be written to save and load their own state. One could also build that into specific educational constructivist widget sets like an eToys clone. But it remains a mismatch in philosophy.

This isn't a plug for C++, but consider:
  http://www.whysmalltalk.com/quotes/index.htm
  "Smalltalk is the best Smalltalk around" [on using C++ to code dynamic language idioms more appropriately done in Lisp or Smalltalk] - Bjarne Stroustrup

Which is meant more or less as a reminder that languages have things they do well and things they do not so well, and philosophies and communities built around them.
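As a rough illustration of the "walk the tree of displayed widgets and rebuild it" idea above, here is a minimal sketch. It assumes a toy `Widget` class standing in for a real toolkit's widgets; `Widget`, `snapshot`, `rebuild`, `to_text`, and `from_text` are hypothetical names made up for this example, not part of tk, wx, gtk, or any other widget set. A real saver would need toolkit-specific code to read each widget's properties and to recreate it.

```python
import json


class Widget:
    """Toy stand-in for a toolkit widget (hypothetical, not a real toolkit class)."""

    def __init__(self, kind, **props):
        self.kind = kind          # e.g. "window" or "button"
        self.props = dict(props)  # plain values such as position or label text
        self.children = []

    def add(self, child):
        self.children.append(child)
        return child


def snapshot(widget):
    """Walk the widget tree and describe it using only dicts, lists, and plain values."""
    return {
        "kind": widget.kind,
        "props": widget.props,
        "children": [snapshot(child) for child in widget.children],
    }


def rebuild(description):
    """Recreate a widget tree from a description produced by snapshot()."""
    widget = Widget(description["kind"], **description["props"])
    for child_description in description["children"]:
        widget.add(rebuild(child_description))
    return widget


def to_text(widget):
    """Serialize the tree as JSON text that could be written to a file."""
    return json.dumps(snapshot(widget), indent=2)


def from_text(text):
    """Rebuild a widget tree from text produced by to_text()."""
    return rebuild(json.loads(text))


if __name__ == "__main__":
    window = Widget("window", x=10, y=20, title="Simulation")
    window.add(Widget("button", label="Run"))
    restored = from_text(to_text(window))
    print(restored.kind, restored.children[0].props)
```

The sketch only captures static layout. Event handlers, open files, and sockets attached to widgets are exactly the parts that need per-toolkit and per-application thought.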
So, while it is possible to ape Squeak Smalltalk in Python (including an image), is that worth doing? If you want everything Smalltalk (or Squeak) has to offer, then you can use Smalltalk (or Squeak) and live with its other limitations (its license, stability, and world view). I'm not sure there is that much value in reinventing that wheel. And it's also probably much easier to just put a Python parser into Squeak than to reinvent Squeak on Python. So I continue to think that what is interesting (and challenging) about the notion of a PySqueak is to try to understand the core issues that Squeak tries to solve (in this case, making it easy to save and load the current state of an object system, including that of GUIs) and to think of Python-oriented approaches to them.

Again though, the notion of having a Python image could be rejected as a goal. Even Squeak uses external files for various tasks. When the outside world changes, the inside world in the image gets out of date. More and more community-oriented applications (including Squeak's Croquet shared 3D world, but also others down to simple web applets or HTML forms) rely on storing state in outside servers or across peers, and so in practice need to reload from the network on startup anyway. Keeping everything in one image mixes application and user data, and also often leads to an unmanaged growth in complexity as the image accumulates clutter. When the Squeak VM changes, one needs to clone the image into a new format; in practice, end users aren't going to want to do that with their old images, so they just start over from scratch, filing code or application state out and back in. Squeak stores its source code history and other changes outside the image in a couple of text files (in part to rebuild the image if it crashes or gets corrupted).
Even within the Squeak community, there are constant pressures to be able to build an image from a textual representation (something not trivial to do, and historically resisted by the main people in the project, perhaps out of sentimentality? or compassion? as the Squeak image has roots as a living thing going back more than 20 years). However, even with pressure to be able to build an image from scratch, something every Python program essentially does by running from *.py files, I'm sure almost no one in the Squeak community would want to lose saving and loading their current image.

So anyway, that's the outline of one of the biggest Squeak->Python issues IMHO. And, I'd caution, it is one that is easy to dismiss as unimportant if you do not have experience working with images, the same way it is easy to dismiss something like garbage collection as unimportant if you are comfortable working in C++. Still, you can patch garbage collection onto C++ (in an awkward fashion), and one could probably patch images onto Python somehow (if that were desired). It's mostly a matter of considering how images interact with the Python glue philosophy, and also, in this case, with an educational constructivist setting, where I think saving state easily is a big win for everybody, especially in a classroom with very short time periods for doing a bit of exploring and constructing, but potentially lots of them over the course of months.

The default right now is that every Python application must invent its own way of saving its state. So, should that be revisited? Or perhaps this instead points to a need to improve pickle or create a community-wide widget reloading standard?

It occurs to me, just now, at the end, as I revise this, what the Python way might be. :-) And it is for the application to write its state as a Python text file! Something I need to muse over. :-)

--Paul Fernhout
Learning by writing. :-)
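A minimal sketch of that closing idea, saving state as Python source text, might look like the following. It assumes the state is built only from plain Python literals (dicts, lists, tuples, strings, numbers); the function names `state_to_source` and `source_to_state` and the variable name `state` are made up for this illustration. Reading uses `ast.literal_eval` rather than `exec`, a design choice made here so that loading a saved file cannot run arbitrary code.

```python
import ast
import pprint


def state_to_source(state):
    """Render a structure of plain Python literals as a 'state = ...' source line."""
    return "state = " + pprint.pformat(state) + "\n"


def source_to_state(source):
    """Read back text written by state_to_source() without executing it."""
    tree = ast.parse(source)
    if len(tree.body) != 1 or not isinstance(tree.body[0], ast.Assign):
        raise ValueError("expected a single 'state = ...' assignment")
    # literal_eval accepts an AST node and only allows literal values.
    return ast.literal_eval(tree.body[0].value)


if __name__ == "__main__":
    state = {"windows": [{"title": "Simulation", "pos": (10, 20)}], "step": 3}
    text = state_to_source(state)
    print(text)
    print(source_to_state(text) == state)
```

The output is ordinary Python that a learner could open, read, and edit by hand, which is part of the appeal over a binary image. Live objects such as widgets would first have to be reduced to literals like these.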