[Cython] History of SWIG and applicability to Cython

Stefan Behnel stefan_ml at behnel.de
Mon Aug 29 16:33:15 CEST 2011


Hi,

here's an interesting history wrap-up of SWIG, by its original author.

One thing that struck me when I read this (or rather when I was half-way 
through) was that it might be possible to use SWIG as a glue code generator 
for .pxd files and trivial wrapper code. Not sure how hard that would be - 
it does sound like such a complex system could also be somewhat tricky to 
extend ...

Stefan


-------- Original-Message --------
Subject: SWIG (was Re: Ctypes and the stdlib)
Date: Mon, 29 Aug 2011 07:41:23 -0500
From: David Beazley <dave... at dabeaz.com>
To: python-dev... at python.org
CC: David Beazley <dave... at dabeaz.com>

On Mon, Aug 29, 2011 at 12:27 PM, Guido van Rossum <guido... at python.org> wrote:

> I wonder if for
> this particular purpose SWIG isn't the better match. (If SWIG weren't
> universally hated, even by its original author. :-)

Hate is probably a strong word, but as the author of Swig, let me chime in 
here ;-).   I think there are probably some lessons to be learned from Swig.

As Nick noted, Swig is best suited when you have control over both sides 
(C/C++ and Python) of whatever code you're working with.  In fact, the 
original motivation for  Swig was to give application programmers 
(scientists in my case), a means for automatically generating the Python 
bindings to their code.  However, there was one other important 
assumption--and that was the fact that all of your "real code" was going to 
be written in C/C++ and that the Python scripting interface was just an 
optional add-on (perhaps even just a throw-away thing).  Keep in mind, Swig 
was first created in 1995 and at that time, the use of Python (or any 
similar language) was a pretty radical idea in the sciences.  Moreover, 
there was a lot of legacy code that people just weren't going to abandon. 
Thus, I always viewed Swig as a kind of transitional vehicle for getting 
people to use Python who might otherwise not even consider it.   Getting 
back to Nick's point though, to really use Swig effectively, it was always 
known that you might have to reorganize or refactor 
your C/C++ code to make it more Python friendly.  However, due to the 
automatic wrapper generation, you didn't have to do it all at once. 
Basically your code could organically evolve and Swig would just keep up 
with whatever you were doing.  In my projects, we'd usually just tuck Swig 
away in some Makefile somewhere and forget about it.

One of the major complexities of Swig is the fact that it attempts to parse 
C/C++ header files.   This very notion is actually a dangerous trap waiting 
for anyone who wants to wander into it.  You might look at a header file 
and say, well how hard could it be to just grab a few definitions out of 
there?   I'll just write a few regexes or come up with some simple hack for 
recognizing function definitions or something.   Yes, you can do that, but 
you're immediately going to find that whatever approach you take starts to 
break down into horrible corner cases.   Swig started out like this and 
quickly turned into a quagmire of esoteric bug reports.  All sorts of 
problems with preprocessor macros, typedefs, missing headers, and other 
things.  For a while, I would get these bug reports that would go something 
like "I had this C++ class inside a namespace with an abstract method 
taking a typedef'd const reference to this smart pointer ..... and Swig 
broke."   Hell, I can't even understand the bug report let alone know how 
to fix it.  Almost all of these 
bugs were due to the fact that Swig started out as a hack and didn't really 
have any kind of solid conceptual foundation for how it should be put together.
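
To illustrate the kind of breakdown described above, here is a minimal 
sketch in Python.  The pattern NAIVE_PROTOTYPE and the sample header 
snippets are hypothetical, made up for this illustration; they are not 
taken from Swig itself.  The sketch only shows how a "simple hack" for 
recognizing function declarations handles the easy case, silently misses 
ordinary declarations, and reports a statement as if it were a prototype.

```python
import re

# Hypothetical naive pattern (an assumption for illustration only):
#   <one-word return type> <name>(<args>);
NAIVE_PROTOTYPE = re.compile(
    r"^\s*(\w+)\s+(\w+)\s*\(([^)]*)\)\s*;", re.MULTILINE
)


def find_prototypes(header_text):
    """Return (return_type, name, args) tuples matched by the naive pattern."""
    return [m.groups() for m in NAIVE_PROTOTYPE.finditer(header_text)]


# The easy case works as hoped.
SIMPLE = "int add(int a, int b);\n"

# Perfectly ordinary declarations that the pattern silently misses:
# pointer return types, multi-word types, export macros, and functions
# returning function pointers.
MISSED = (
    "const char *name(void);\n"
    "unsigned long count(void);\n"
    "EXPORT int mul(int a, int b);\n"
    "void (*get_handler(int sig))(int);\n"
)

# A statement inside an inline function that is wrongly reported as a
# prototype with return type "return".
FALSE_POSITIVE = (
    "static inline int twice(int x) {\n"
    "    return add(x, x);\n"
    "}\n"
)

if __name__ == "__main__":
    print(find_prototypes(SIMPLE))
    print(find_prototypes(MISSED))
    print(find_prototypes(FALSE_POSITIVE))
```

Each of these cases can be patched individually, but every patch tends to 
expose further corner cases (preprocessor macros, typedefs, namespaces), 
which is the trap the paragraph above warns about.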

If you flash forward a bit, from about 2001-2004 there was a very serious 
push to fix these kinds of issues.  Although it was not a complete rewrite 
of Swig, there were a huge number of changes to how it worked during this 
time.  Swig grew a fully compatible C++ preprocessor that fully supported 
macros.  A complete C++ type system was implemented including support for 
namespaces, templates, and even such things as template partial 
specialization.  Swig evolved into a multi-pass compiler that was doing all 
sorts of global analysis of the interface.   Just to give you an idea, Swig 
would do things such as automatically detect/wrap C++ smart pointers.  It 
could wrap overloaded C++ methods/functions.  Also, if you had a C++ class 
with virtual methods, it would only make one Python wrapper function and 
then reuse it across all wrapped subclasses.

Under the covers of all of this, the implementation basically evolved into 
a sophisticated macro preprocessor coupled with a pattern matching engine 
built on top of the C++ type system.   For example, you could write 
patterns that matched specific C++ types (the much hated "typemap" feature) 
and you could write patterns that matched entire C++ declarations.  This 
whole pattern matching approach had huge power if you knew what you were 
doing.  For example, I had a graduate student working on adding "contracts" 
to Swig--something that was being funded by an NSF grant.   It was cool and 
mind boggling all at once.

In hindsight however, I think the complexity of Swig has exceeded anyone's 
ability to fully understand it (including my own).  For example, to even 
make sense of what's happening, you have to have a pretty solid grasp of 
the C/C++ type system (easier said than done).   Couple that with all sorts 
of crazy pattern matching, low-level code fragments, and a ton of macro 
definitions, your head will literally explode if you try to figure out 
what's happening.   So far as I know, recent versions of Swig have even 
combined all of this type-pattern matching with regular expressions.  I 
can't even fathom it.

Sadly, my involvement with Swig was an unfortunate casualty of my academic 
career biting the dust.  By 2005, I was so burned out of working on it and 
so sick of what I was doing, I quite literally put all of my computer stuff 
aside to go play in a band for a few years.   After a few years, I came 
back to programming (obviously), but not to keep working on the same stuff. 
In particular, I will die quite happy if I never have to look at 
another line of C++ code ever again.  No, I would much rather fling my 
toddlers, ride my bike, play piano, or do just about anything than ever do 
that again.   Although I still subscribe to the Swig mailing lists and watch 
what's happening, I'm not active with it at the moment.

I've sometimes thought it might be interesting to create a Swig replacement 
purely in Python.  When I work on the PLY project, this is often what I 
think about.   In that project, I've actually built a number of the parsing 
tools that would be useful in creating such a thing.   The only catch is 
that when I start thinking along these lines, I usually reach a point where 
I say "nah, I'll just write the whole application in Python."

Anyways, this is probably way more than anyone wants to know about Swig. 
Getting back to the original topic of using it to make standard library 
modules, I just don't know.   I think you probably could have some success 
with an automatic code generator of some kind.  I'm just not sure it should 
take the Swig approach of parsing C++ headers.  I think you could do better.

Cheers,
Dave

P.S. By the way, if people want to know a lot more about Swig internals, 
they should check out the PyCon 2008 presentation I gave about it. 
http://www.dabeaz.com/SwigMaster/


