[pypy-dev] External RPython mailing list
bhartsho at yahoo.com
Thu Sep 16 02:44:35 CEST 2010
Porting ShedSkin to use the PyPy translation toolchain sounds like a good idea on the surface, but it's not if we look at the details. The first issue is legal: PyPy is MIT licensed, which works very well when integrated into commercial software. But ShedSkin uses the GNU GPL3, so importing any of its code (or code that imports GPL code, etc.) into the user's compiled program also binds it to the GPL, which is no good for commercial software. ShedSkin within the PyPy toolchain may taint the user's program with GPL code because some RPython programs will import from rlib (which may in some way depend on Mark's GPL code).
The second issue is technical: not much is gained for the likely great amount of effort it would take to merge ShedSkin. Let's look at what is gained:
1. mutable globals
2. None and (int,float) are intermixable as attributes on an instance because ShedSkin has some limited support for dynamic-sub-types. (PyPy cannot mix None with int and float)
3. operator overloading (except for __iter__ and __call__), PyPy only allows overloading __init__ and __del__
#1 would be nice to have, but it's an easy workaround to use singleton instances.
#2, no big advantage.
#3, this is a big advantage; I wish I could at least overload __getattr__ in PyPy Rpy.
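A minimal sketch of the singleton workaround mentioned in #1, written as plain Python. The names `_State`, `state`, and `bump` are made up for illustration. The idea is that RPython treats module-level globals as constants, so anything that needs to change is moved onto the attributes of a single prebuilt instance.

```python
class _State(object):
    """Holds what would otherwise be mutable module-level globals."""

    def __init__(self):
        self.counter = 0


# The global binding itself never changes; only its attributes do.
state = _State()


def bump(step):
    # Mutate the attribute instead of rebinding a global name.
    state.counter += step
    return state.counter
```

Calling `bump(2)` and then `bump(3)` leaves `state.counter` at 5, with no `global` statement anywhere.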
ShedSkin is behind PyPy Rpy in the following areas:
1. no getattr, hasattr etc.
2. *args (Mark says he can bring it back but only for homogeneous types (PyPy supports *args with non-homogeneous types))
3. passing method references
4. no interface to C
5. mixed-type tuples are limited to length two (PyPy allows for any length)
6. multiple inheritance
The ShedSkin readme itself states that "the type inference techniques employed by Shed Skin currently do not scale very well beyond several hundred lines of code" and recommends ShedSkin only for small programs. The lack of the 6 items above is an additional reason why ShedSkin is not ideal for writing large programs. These are serious limitations for a large program, especially #4: not having an easy way to interface with C is a huge show-stopper, while PyPy Rpy has an amazingly simple way to interface with C (rffi). The other disadvantages of ShedSkin are slow object allocation (Phil Hassey did a test showing ShedSkin 30% slower than RPython), and that it only translates to C++ while PyPy can translate to C, Java, and C#.
There are simple ways to improve PyPy Rpy that will benefit both the PyPy project and those who strictly want to use the translation toolchain. I've been following the progress of this year's Google Summer of Code projects, and I see that a big stumbling block for everybody was RPython. PyPy today would have a better 64bit JIT, faster ctypes, and better numpy support if RPython itself were better. Areas that could be improved:
. iteration over tuples of any length (with any mixed types)
. overloading __getattr__, __setattr__
. pickle support; if it's limited, that's ok.
. rstruct is incomplete
. llvm backend, what happened to llvm support?
. not having to define dummy functions on the base class to prevent 'demotion'
. not having to use the hack `assert isinstance(a,MySubClass)` to call methods with incompatible signatures.
. we already have the decorator: @specialize.argtype(1), why can't we have @specialize.argtype(*) so that all arguments can have flexible types?
. methods stored in a list for easy dispatch can not have mismatched signatures.
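For readers who have not run into the `assert isinstance` hack from the list above, here is a rough sketch in plain Python. `Shape`, `Circle`, `Square`, and `describe` are hypothetical names, and the sketch assumes the RPython annotator uses the assert to narrow the variable to the subclass, so that a method whose signature differs between subclasses can still be called.

```python
class Shape(object):
    # Deliberately no area() here: the subclasses disagree on its signature.
    pass


class Circle(Shape):
    def __init__(self, radius):
        self.radius = radius

    def area(self, pi):
        # Takes one extra argument.
        return pi * self.radius * self.radius


class Square(Shape):
    def __init__(self, side):
        self.side = side

    def area(self):
        # Takes no extra argument.
        return self.side * self.side


def describe(shape):
    if isinstance(shape, Circle):
        # Under CPython this assert is redundant; it is there for the annotator.
        assert isinstance(shape, Circle)
        return shape.area(3.0)
    assert isinstance(shape, Square)
    return shape.area()
```

Under CPython the asserts do nothing useful, which is why it reads as a hack: they exist only to tell the translator which subclass it is looking at.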
I agree with Fijal: CPython module extension should be a low priority. There is a big speed overhead when passing data back and forth from CPython, and speed is the whole point of going through the trouble of writing in RPython. Those who are less concerned with speed and want CPython module extensions can use Cython, which is well tested. Those who are interested in RPython and want a simple way to get started can use ShedSkin to make an extension module, and then migrate their module to a standalone app with PyPy if they choose.