[Idle-dev] Forward progress with full backward compatibility

Bruce Sherwood bas@andrew.cmu.edu
Sun, 09 Apr 2000 22:46:30 -0400


Dave Scherer and I have talked quite a bit about possible ways to introduce
a changed rule for division, so that 1/2 would be 0.5 rather than 0. This is
extremely important to our plans to have freshman physics students do
scientific computer modeling in Python, with real-time 3D output, which we
are planning to do in the fall.

A problem with making such a change is of course the danger of breaking
existing Python programs. Dave has come up with a clever scheme to provide
forward progress on this issue with complete backward compatibility. His
proposal involves remarkably little coding, which he has already tried out.
He intends to make a specific proposal soon.

More generally, it would seem useful to have a mechanism for
incremental yet politically acceptable improvements to Python, with full
backward compatibility. I offer a history of how this was achieved in the
TUTOR family of languages, and a bit about the political experience
associated with making incremental, seemingly incompatible changes in a
language. I showed this to Guido, who suggested that I post it to the
Idle newsgroup as being of general interest.

In the case of TUTOR (initiated in 1967 by Paul Tenczar), and to a large
extent MicroTutor (initiated in 1977 by David Andersen), compatibility was
achieved by a mechanism no longer available. TUTOR was the
graphics-oriented programming language of the PLATO system, which ran on a
large scientific mainframe (a CDC Cyber). In the early 1970s there was only
one PLATO computer-based education system, located at the University of
Illinois and running hundreds of graphics terminals. Later there were a
number of PLATO systems, but with links among the systems. All TUTOR source
code was stored centrally on the mainframe disks. This made it possible to
run source conversions centrally to change the language. From one day to
the next you found your source files changed in small ways so that the
programs continued to work as before but with new possibilities available
in the language.

This source-conversion mechanism for TUTOR was essential for a subtle and
interesting reason. The cores of languages such as C, Pascal, and Python
are relatively clean and compact, with a small number of constructs. This
facilitates the design and maintenance of compilers. Graphics libraries for
such languages, however, are notoriously huge and cumbersome. TUTOR, on the
other hand, in order to support the development of educational software by
nonexpert faculty programmers, was necessarily graphics-oriented and
therefore had a significantly larger number of constructs, though vastly
fewer than are found in standard graphics libraries.

Because of the larger arena covered by TUTOR, it was not feasible to design
a nice clean small language from the start, because no one knew quite what
would be needed. The language was not so much created as evolved, within a
community of diverse users. And there was much more scope for design
errors, which became evident only with time and use. Since the language
could not evolve without mistakes, source conversion was important in making
it possible to improve the language without invalidating old programs. (Recently
there was a talk at Carnegie Mellon by one of the creators of Java, and I
was amused to see in the abstract some comments to the effect that when
you're dealing with graphics the language has to evolve within a community
of diverse users; you can't just design some small language core once and
for all.)

It was politically important that with this mechanism in place, users of
TUTOR built up confidence that their work would not be lost due to change,
and therefore they had little reason to fear or oppose change. Changes in
the language were no big deal.

The situation with cT (initiated in 1985 by Bruce Sherwood) was somewhat
different. It was no longer the case that all source code was stored
centrally. And even though the design of cT benefitted enormously from
experience with TUTOR and MicroTutor, still the requirements for a
novice-friendly graphics language in a windows and mouse world posed new
challenges. Also challenging were the rapidly changing hardware and
operating systems of Unix, Mac, and PC. So again it was important to have a
mechanism for being able to change the language while preserving backward
compatibility. The mechanisms I will describe, due to David Andersen, have
had the effect that there are graphics- and mouse-oriented cT programs
written on Unix in 1985 that work today on diverse platforms, without
change to the source code. Needless to say, this is not true of any
educational software written in C within the Carnegie Mellon Andrew project
in the 1980s, due to changes in operating systems and graphics environments.

When you create a new file in the cT editor today, it automatically starts
with a pseudostatement "$syntaxlevel 2". This is a directive to the editor
and to the compiler. If you open an old file containing "$syntaxlevel
1", an automatic source conversion is run, and then the initial
pseudostatement is changed to "$syntaxlevel 2".

This conversion has two mechanisms at its disposal. It can make actual
changes to source code, though this is less often used. The more commonly
used mechanism is to add one or more additional pseudostatements at the
start of the file. For example, the initial color model in cT proved to be
inadequate as changes occurred in the color models for Mac and PC. There
was no way to change source code automatically to match needed changes in
the language. So the pseudostatement "$oldcolor" was added to old programs
when updating the syntax level. Old programs containing "$oldcolor"
continue to work as before.
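
As a rough illustration only, the upgrade step described above might be
sketched in Python as follows. This is not the cT converter, whose code is
not shown here. The names upgrade_source, CURRENT_LEVEL and
COMPAT_DIRECTIVES are hypothetical, and the handling of a file with no
"$syntaxlevel" line is an assumption.

```python
# Hypothetical sketch; this is not the actual cT converter.
CURRENT_LEVEL = 2

# Compatibility pseudostatements to add when leaving a given syntax level.
# Only "$oldcolor" is named in the text; the table shape is an assumption.
COMPAT_DIRECTIVES = {1: ["$oldcolor"]}


def upgrade_source(lines):
    """Return the source lines upgraded to CURRENT_LEVEL."""
    if not lines or not lines[0].startswith("$syntaxlevel"):
        return list(lines)  # assumption: leave files without a directive alone
    level = int(lines[0].split()[1])
    if level >= CURRENT_LEVEL:
        return list(lines)  # already current, nothing to do
    added = []
    for old in range(level, CURRENT_LEVEL):
        added.extend(COMPAT_DIRECTIVES.get(old, []))
    # New header, then the compatibility directives, then the untouched body.
    return ["$syntaxlevel %d" % CURRENT_LEVEL] + added + list(lines[1:])
```

Running the function on a file that is already at the current level
returns it unchanged, so the conversion can safely be applied every time a
file is opened.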

A natural thing to try after such a conversion is to comment out
"$oldcolor" and try running the program. If it runs fine, you can delete
the pseudostatement. But there is no penalty to leaving it there. Nor is
there a major penalty in terms of application bloat. Typically the amount
of code involved in maintaining an older capability is a very small
fraction of the entire application. And it need not affect execution speed
at all, since the compiler may generate different output depending on the
presence or absence of the pseudostatement. (This is the case with Dave
Scherer's proposed change to Python division.)
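
The point about execution speed can be illustrated with a small sketch in
present-day Python. It is not Dave Scherer's proposal, and the directive
name "$olddivision" and the function compile_divide are hypothetical. The
choice between the two rules is made once, when the program is compiled,
so nothing is tested each time a division is executed. The old rule is
modeled here for integer operands only.

```python
import operator

# Hypothetical directive name, used only for illustration.
OLD_DIVISION = "$olddivision"


def compile_divide(directives):
    """Pick the division routine once, at compile time."""
    if OLD_DIVISION in directives:
        return operator.floordiv  # old rule for integers: 1/2 gives 0
    return operator.truediv       # new rule: 1/2 gives 0.5


old_div = compile_divide({"$olddivision"})
new_div = compile_divide(set())
print(old_div(1, 2), new_div(1, 2))  # prints: 0 0.5
```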

I have the impression that the Python community has been somewhat paralyzed
and polarized over discussions of language change. Having a mechanism for
moving forward but with full backward compatibility would make people much
less fearful of change.

Bruce Sherwood