RFC: Viper: yet another python implementation

Sun Aug 15 12:15:46 EDT 1999

On 11 Aug 1999 08:21:14 GMT, m.faassen at vet.uu.nl (Martijn Faassen) wrote:

>It sounds like my (vaporware) Swallow idea. The difference with Viper
>is that Viper apparently is going to do type inference, while Swallow
>would (at least in the first stages) make do with nonintrusive type
>declarations, somewhat like this:
>
>__types__ = { 
>  foo : functionType(args={a: intType, b: intType}, returns=listType(intType)),
>  bar : functionType(args={}, returns=None)
>}

	This is neatly compatible with existing Python code.

>> The interpreter aims to be compatible with 'pure python':
>> python not using external C modules, and not fiddling
>> implementation details too much. Some of the fiddles
>> will work, and some will not. 
>
>This is a good idea, as pure Python will often still be the development
>language (while Swallow/Viper would only be invoked to compile and
>optimize things later on, at least in the early stages).

	Originally, I planned to recommend development
in Python, and just using Viper for compilation: that is,
I intended to ignore any sophisticated debugging support.
However, I'm beginning to swing the other way now, and
try to provide _better_ debugging than Python.

	The reason is pragmatic: without the debugging
support, I can't debug Viper itself. 

>> Compatibility:
>> 	4) range is a keyword
>
>Why this? Perhaps instead you could make 'range' and some other builtin
>functions 'special functions'. 

	Yes, that's the idea.

>The idea is that like module variables
>(and some other set of things perhaps) they cannot be rebinded to something
>else, so that the optimizer can do its stuff.

	That is what I meant by saying it was a keyword.

>> Extensions:

>I wouldn't aim for *any* extensions to start with. In order for such a
>project to succeed you want Viper code to run in Python and vice versa, 
>unless there's serious messing with internals. At least make it possible
>to turn these extensions *off* so they'll be errors.

	There are two kinds of extensions: those requiring
extra keywords, and those that 'fit naturally' into python.

	Extensions requiring extra keywords can be turned
off with a compiler switch that disables the lookup in the lexer
that maps names to keywords. Then the parser will not
recognize these extensions.

	The other kind of extensions include things
like lexical scoping and slicings, where existing or extended syntax is given new
semantics. Sometimes (hopefully rarely) this may lead to an
incompatibility. For example nested functions in Viper
are bound to the activation record of the enclosing
function. As such, variables in the enclosing scope will
hide globals, changing the semantics.

>Hm, I wish you luck, especially with the type inference...

	Preliminary tests indicate reasonable results
determining types. In critical cases such as inner loops,
it is usually possible:

	s = ""
	for  i in range(1,10): s = s + str(i)

Obviously i is an integer here, and s a string.

>Does viperc compile Python to C code? 

	I don't know yet. In the first instance,
a run time is still needed that works like existing Python,
where type inference fails. Therefore, I still need an
interpreter. So Viperi, the interpreter, is the first step.
[It doesn't support 'global' at the moment, so I cannot
yet run PyStone. I'd like to eliminate 'global']

	I will try to optimise it first, before considering
how to build the compiler, and what back ends to implement.
It is possible the first back end will actually general Ocaml
rather than C (since Viper is currently implemented in ocaml).

>> 	Generally, compiler optimisations
>> will go hand in hand with restrictions -- to optimise
>> something, information is required, and that will
>> require giving up some dynamism.
>
>So you could do with Swallow's "type system" (big word for nothing there),
>right? 

	Yes. It is useful to have optional typing, to 
help make the type inference work. I was considering:

	def f(x:int): ...

as the syntax ( because this is fairly conventional,
and a pure extension). However, it isn't
backward compatible, whereas your idea is,
so I will look at it. [The method resembles the
'traits' templates in C++ STL]

>> 	It is difficult to optimise individual
>> modules. 

>Hm, but the possibility to optimise individual modules is what attracts
>many people to this idea. 

	I know.  However, existing attempts to
build a python compiler have not succeeded in
producing code that is much more efficient than
the bytecode interpreter. 

	Why is this? I suspect it is because
a function in Python is often unintentionally
more polymorphic than actually desired: for example:

	def add(i,j): return i + j

Due to the unfortunate overloading of + on
numbers and strings, it is not possible to
determine if 'add' adds numbers or concatenates
strings. However, this can be determined by examining
_calls_ to the function add.

It is possible that both are done: the function is actually
used polymorphically .. but not necessarily in the
same program

	Perhaps a _worse_ case is this:

	def get_attribute(a,b):
		try: return get_attr(a,b)
		except: return None

Here, the return type could be anything at all.
If we knew 'a' was an object with only string
attributes, the return type would be either
string or None.

	In that case, the function is monomorphic,
but the return type is the sum of string and NoneType.

>  * Swallow now optimises these functions by translating them to C, and
>    you can use them in Python (just like C extension modules).
>
>For classes you'd need some extra support (and more restrictions) but that
>was phase 2. My advice is to start as small and unambitious as possible
>for such an ambitious project. :)

	I agree. However, it turns out adding extensions is often
trivial, whereas providing compatibility is often difficult. Similarly,
it may be more difficult to generate a function than a whole
module, and more difficult to do that than build a whole
program.

	In particular, building a whole program is easiest,
since it doesn't require interfacing to python. That interface
is non-trivial: binding to the C API, and then the bytecode
interpreter, and then again, the result is different if you
were using JPython.

	So it may be easier to just provide
a binding at a single level, namely 
'pure python' source code, than try to build something
that interfaces to existing Python.

	In addition, Python 1.6 may have different
internals, and so may Python 2.x. This creates
a maintenance nightmare which may make the project
less useful.

	OTOH, many python 'built in C modules'
have corresponding Pure Python sources, for example
the string module. In that case, if the compiler works,
there is no need for the C module.

	Another example is Lemburgs mxTools:
I have a 'Pure python' implementation of it,
so I don't need the C version, thus eliminating
a  maintenance problem.

	Another example (which is now working):
Python has a module 'exceptions'. I have no idea
if Python actually compiles this module and uses it,
however, Viperi currently does. That is, it actually
imports the exceptions module and uses the
classes in it for exceptions.

	If Guido adds a new exception,
or changes the semantics, implementation,
or interface, I have a rock solid python compatible
specification to implement, and half the work
will be done by Viper itself. [Not all, since the
internal exceptions stll have to be mapped to the
python ones]

John Max Skaller                ph:61-2-96600850              
mailto:skaller at maxtal.com.au       10/1 Toxteth Rd 
http://www.maxtal.com.au/~skaller  Glebe 2037 NSW AUSTRALIA