
Hi,
Is it possible to use SWIG to parse C/C++, and provide an interface for me to generate some code? I thought it might be good to have SWIG help generate expy (see http://expy.sourceforge.net) files, then generate the python extension via expy.
Yingjie

On 27-Apr-2010, at 08:30 , Yingjie Lan wrote:
Hi,
Is it possible to use SWIG to parse C/C++, and provide an interface for me to generate some code? I thought it might be good to have SWIG help generate expy (see http://expy.sourceforge.net) files, then generate the python extension via expy.
I would be very interested in a universal intermediate format for all the interface generators. I'm still using a version of Guido's old bgen, now grudgingly extended to handle C++ and do bidirectional bridging between Python and C++, and while I love and cherish the code generator, the C++ parser is, uhm... challenging. Parsing C++ with per-line regular expressions is no fun :-)
I looked at gccxml at some point, as well as at some of the competing Python interface generators, but it went nowhere. gccxml output is far too detailed, and the output is too much of a simple parse tree to be of any use. The intermediate formats of the other interface generators I looked at were all too inaccessible.
Maybe we can come up with something decent in this group?
If there is enough interest: I can start by describing bgen's intermediate format, and if other people do the same for theirs we may be able to get to common ground...
Jack Jansen, <Jack.Jansen@cwi.nl>, http://www.cwi.nl/~jack If I can't dance I don't want to be part of your revolution -- Emma Goldman

[changing subject appropriately]
Jack Jansen, 01.05.2010 23:40:
Certainly not the right tool here. Clang appears to work quite well for both C and C++.
That's likely because the parser requirements grow with the tool itself.
Maybe we can come up with something decent in this group?
I think it really makes sense to do this. A suitable level of detail for all generators may not be immediately obvious, but should be doable.
Please do. I'll ask over at the Cython-users list to see if others have something to contribute to this discussion.
Stefan

On 2-May-2010, at 07:27 , Stefan Behnel wrote:
Ok, here goes. People interested in a (slightly) more complete writeup can read <http://homepages.cwi.nl/~jack/presentations/nluug-praatje.pdf>, but here are the basics.
The bgen intermediate format is a Python file. Each C or C++ definition is transformed into a few lines of Python code that describe the definition. Here is an example (manually entered, so probably incorrect :-):
--------- test.h:
int increment(int value);
void print(const char *string);
void clear(int *location);

---------- intermediate code:
f = Function(int, 'increment', (int, 'value', InMode))
functions.append(f)

f = Function(void, 'print', (char_ptr, 'string', InMode))
functions.append(f)

f = Function(void, 'clear', (int, 'location', OutMode))
functions.append(f)
Those are the basics. There is a little mangling of names going on, as you can see in the second function (const char * becomes char_ptr), so that the C type is representable as a Python identifier.
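To make the idea concrete, here is a minimal sketch of how such an intermediate file could be loaded and how the name mangling might work. Everything in it is a hypothetical stand-in written for illustration: the Function class, the mangle_type and load_intermediate helpers, and the choice to represent types as plain strings are assumptions, not bgen's actual implementation.

```python
import re

# Hypothetical stand-ins for the names the intermediate file refers to.
InMode, OutMode = "InMode", "OutMode"


class Function:
    """Plain record for one C function: return type, name, argument tuples."""

    def __init__(self, returntype, name, *args):
        self.returntype = returntype
        self.name = name
        self.args = args


def mangle_type(c_type):
    """Turn a C type spelling into a Python identifier.

    Assumption: 'const' is dropped and '*' becomes 'ptr', which matches
    the 'const char *' -> 'char_ptr' mangling in the example.
    """
    words = re.findall(r"[A-Za-z_]\w*|\*", c_type)
    parts = ["ptr" if w == "*" else w for w in words if w != "const"]
    return "_".join(parts)


INTERMEDIATE = """
f = Function(int, 'increment', (int, 'value', InMode))
functions.append(f)
f = Function(void, 'print', (char_ptr, 'string', InMode))
functions.append(f)
f = Function(void, 'clear', (int, 'location', OutMode))
functions.append(f)
"""


def load_intermediate(source):
    """Execute intermediate code with type names bound to opaque strings."""
    namespace = {
        "Function": Function,
        "InMode": InMode,
        "OutMode": OutMode,
        # Types carry no C semantics here; they are just names.
        "int": "int",
        "void": "void",
        "char_ptr": "char_ptr",
        "functions": [],
    }
    exec(source, namespace)
    return namespace["functions"]


functions = load_intermediate(INTERMEDIATE)
for f in functions:
    print(f.returntype, f.name, f.args)
```

Because the format is ordinary Python, a consumer only has to supply the names the file uses; no separate parser for the intermediate format is needed.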
But, as you can see in the third function, there is a little more to it: patterns are applied before outputting the intermediate format. One of the patterns has turned the expected (int_ptr, 'location', InMode) argument into (int, 'location', OutMode). The current implementation applies the patterns before creating the intermediate format, but for a future implementation I would much prefer to make that a separate step (so it would read intermediate code and write intermediate code).
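A pattern pass that reads intermediate records and writes intermediate records could look roughly like the sketch below. It is only an illustration under stated assumptions: records are plain (returntype, name, args) tuples, types are strings, and the pointer_to_output and apply_patterns names are hypothetical. The exception for char_ptr is also an assumption, made so that the print() example keeps its string argument as an input.

```python
InMode, OutMode = "InMode", "OutMode"


def pointer_to_output(arg):
    """Hypothetical pattern: an input '<type>_ptr' becomes an output '<type>'."""
    type_name, name, mode = arg
    # Assumption: char_ptr is treated as a string input and left alone.
    if mode == InMode and type_name.endswith("_ptr") and type_name != "char_ptr":
        return (type_name[: -len("_ptr")], name, OutMode)
    return arg


def apply_patterns(records, patterns):
    """Apply each per-argument pattern in turn; input and output share one format."""
    result = []
    for returntype, name, args in records:
        for pattern in patterns:
            args = tuple(pattern(a) for a in args)
        result.append((returntype, name, args))
    return result


before = [
    ("void", "print", (("char_ptr", "string", InMode),)),
    ("void", "clear", (("int_ptr", "location", InMode),)),
]
after = apply_patterns(before, [pointer_to_output])
for record in after:
    print(record)
```

Since the pass consumes and produces the same record shape, several such passes can be chained, or skipped entirely by a generator that wants the untransformed declarations.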
The pattern substitution engine is really the power of bgen, because it can do much more than the simple transformation shown here. Patterns can trigger on multiple arguments, and they can also be told to look for "C-style" object-oriented code. So,
    int writestream(streamptr *sp, char *buf, int nbytes);

is turned into

    f = Method(int, 'writestream', (VarInputBufferSize, 'buf', InMode))
    methods_streamptr.append(f)
This is why I love bgen so much, because it means that the Python interface is the expected sp.writestream("hello") as opposed to the barebones writestream(sp, "hello", 5). But that's bgen-evangelism, so I'll stop here:-)
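The two effects in the writestream example (a pattern that triggers on more than one argument, and a first argument that turns a function into a method) can be sketched as follows. This is a rough illustration, not bgen's pattern engine: the tuple record layout and the merge_buffer_args and classify helpers are hypothetical, and matching any adjacent (char_ptr, int) input pair is a deliberately crude assumption.

```python
InMode = "InMode"


def merge_buffer_args(args):
    """Collapse an adjacent (char_ptr, int) input pair into one buffer argument."""
    merged = []
    i = 0
    while i < len(args):
        is_buffer_pair = (
            i + 1 < len(args)
            and args[i][0] == "char_ptr" and args[i][2] == InMode
            and args[i + 1][0] == "int" and args[i + 1][2] == InMode
        )
        if is_buffer_pair:
            # Keep the buffer's name; the length argument disappears.
            merged.append(("VarInputBufferSize", args[i][1], InMode))
            i += 2
        else:
            merged.append(args[i])
            i += 1
    return tuple(merged)


def classify(record, object_type="streamptr_ptr"):
    """Return (container name, new record) for one function record.

    A function whose first argument has the object type becomes a method
    of that object; anything else stays a plain function.
    """
    returntype, name, args = record
    if args and args[0][0] == object_type:
        container = "methods_" + object_type[: -len("_ptr")]
        return container, ("Method", returntype, name, merge_buffer_args(args[1:]))
    return "functions", ("Function", returntype, name, merge_buffer_args(args))


record = (
    "int",
    "writestream",
    (
        ("streamptr_ptr", "sp", InMode),
        ("char_ptr", "buf", InMode),
        ("int", "nbytes", InMode),
    ),
)
container, method = classify(record)
print(container, method)
```

A real pattern would need more context than adjacent type names (for instance, a char_ptr followed by an unrelated int), which is one reason the patterns are probably best kept configurable per wrapped library.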
-- Jack Jansen, <Jack.Jansen@cwi.nl>, http://www.cwi.nl/~jack If I can't dance I don't want to be part of your revolution -- Emma Goldman

On Sun, May 2, 2010 at 22:47, Jack Jansen <Jack.Jansen@cwi.nl> wrote:
You know, this sounds a lot like pybindgen. It reads gccxml output and generates an API description as a Python file. The syntax used in pybindgen is only slightly different from what you propose.
The pybindgen parser has some problems, but it works on a very large and complex C++ API (network simulator 3). Just trying to save people from reinventing the wheel... :-)
Here's the link: http://code.google.com/p/pybindgen/
-- Gustavo J. A. M. Carneiro INESC Porto, UTM, WiN, http://win.inescporto.pt/gjc "The universe is always one step beyond logic." -- Frank Herbert

On 2-May-2010, at 23:57 , Gustavo Carneiro wrote:
Can you elaborate (a bit)? I'm interested in the places where the wrapper generators have different requirements, because those are going to be the difficult areas for a common format.
An example of where things may be different is that the bgen format has absolutely no knowledge of C types. As you can see from my example, they are nothing more than Python-identifier representations of the C names.
Jack Jansen, <Jack.Jansen@cwi.nl>, http://www.cwi.nl/~jack If I can't dance I don't want to be part of your revolution -- Emma Goldman

participants (4)
- Gustavo Carneiro
- Jack Jansen
- Stefan Behnel
- Yingjie Lan