[C++-sig] Pyste: your opinion about some changes
Nicodemus
nicodemus at globalite.com.br
Sun Jul 13 00:06:21 CEST 2003
Hi everyone!
Prabhu and I have been discussing Pyste on IRC. We came up with some
ideas on how to fix The Order Bug and improve the workings of Pyste in
general, and we would like to know your opinion on them.
The Order problem is this: class hierarchies must be exported in order,
from the base class to the most-derived class. The naive approach (parse
all header files, look up all the bases in them, and order the classes)
uses too much memory, which makes it prohibitive when there are many
classes.
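To make the ordering requirement concrete, here is a rough sketch in
plain Python. It is not Pyste's code: the `bases` dictionary and the
`export_order` function are made up for illustration, and it assumes the
bases of every class are already known, which is exactly the part that
costs so much memory.

```python
# Hypothetical input: each class name mapped to its direct base classes.
bases = {
    "Derived": ["Base"],
    "Base": [],
    "MoreDerived": ["Derived"],
}

def export_order(bases):
    """Return class names so that every base precedes its derived classes."""
    ordered, seen = [], set()

    def visit(name):
        if name in seen:
            return
        seen.add(name)
        # Emit all bases first, then the class itself.
        for base in bases.get(name, []):
            visit(base)
        ordered.append(name)

    for name in bases:
        visit(name)
    return ordered

order = export_order(bases)
```

With the input above, `order` lists Base before Derived, and Derived
before MoreDerived, no matter how the dictionary was written.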
The first idea was based on David's suggestion of using pickle/shelve to
hold the declarations in a form that could be easily swapped in and out
of disk as needed, thus solving the memory problem. We implemented this
and it works correctly, but it is too slow, sometimes prohibitively so.
Plus, Prabhu noted a flaw in Pyste that was always there: you always
generate the *entire* wrapper code; there's no support for generating
only the wrapper code for a single pyste file. So he would change
something in a Pyste file, like excluding a function, and *all* the
wrapper code would be generated again, taking another 10 minutes on
his machine. Clearly, a more incremental approach is needed.
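For readers who have not used shelve, the idea looks roughly like the
sketch below. This is only an illustration, not what we implemented: the
declaration dictionaries, the `store` and `load` helpers, and the cache
file name are all assumptions. Only the standard `shelve`, `tempfile`
and `os` modules are real.

```python
import os
import shelve
import tempfile

# Hypothetical stand-in for parsed declarations; the real objects are richer.
declarations = {
    "Base": {"bases": [], "methods": ["name"]},
    "Derived": {"bases": ["Base"], "methods": ["name", "extra"]},
}

def store(path, decls):
    """Write every declaration to disk, keyed by class name."""
    with shelve.open(path) as db:
        for name, decl in decls.items():
            db[name] = decl

def load(path, name):
    """Read back a single declaration without loading the others."""
    with shelve.open(path) as db:
        return db[name]

workdir = tempfile.mkdtemp()
cache = os.path.join(workdir, "declarations")
store(cache, declarations)
derived = load(cache, "Derived")
```

Every lookup goes to disk and unpickles the object again, so memory use
stays low, but many lookups add up to a lot of time.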
We think a viable solution is to hand the responsibility of ordering
the classes over to the users. Ideally, one could do this in a Pyste
file:
Class('Derived', ...)
Class('Base', ...)
And Pyste would first instantiate Base, and then Derived. While this is
a nice feature, it creates some complications:
- Given Derived, we must know what its bases are, and we can only do
that by calling gccxml.
- Given Derived, we must know if Base was exported, so we can put
"bases<Base>" inside the class_. If the user just exported Derived,
"bases<Base>" should not be generated.
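The second point can be sketched as follows. This is an illustration
only, not Pyste's generator: `class_header` and its parameters are
hypothetical names, and the output is just the opening of a Boost.Python
class_ declaration.

```python
def class_header(name, direct_bases, exported):
    """Build the opening of a class_ declaration for `name`.

    Only bases that are themselves exported appear in bases<...>.
    """
    known = [b for b in direct_bases if b in exported]
    if known:
        return 'class_< %s, bases< %s > >("%s")' % (name, ", ".join(known), name)
    return 'class_< %s >("%s")' % (name, name)

# Base is exported too, so it is listed in bases<...>.
with_base = class_header("Derived", ["Base"], exported={"Base", "Derived"})

# Only Derived is exported, so bases<...> is left out.
without_base = class_header("Derived", ["Base"], exported={"Derived"})
```

The point is that the right output for Derived depends on the full set
of exported classes, not on Derived alone.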
We decided to drop this feature: while nice, it is not strictly
necessary, because when exporting hierarchies it is natural to export
the bases first. So the user must write:
Class('Base', ...)
Class('Derived', ...)
as he would if he were writing the wrapper by hand. What do you people
feel about this?
For Pyste to generate the correct code for a given class, it must know
all the other classes that are also being exported, as explained above.
So we must pass all pyste files to Pyste before it can produce code for
any class. Currently that is done on the command line:
pyste --module=foo foo1.pyste foo2.pyste
This generates a file named foo.cpp, with all the wrapper code in it.
While convenient, this is impractical, since compile times can get very
long for large libraries. That is why --multiple was created:
pyste --multiple --module=foo foo1.pyste foo2.pyste
That generates 3 files, _main.cpp, _foo1.cpp and _foo2.cpp. Compiling
and linking them together gives the same results as without the
--multiple flag.
And now, to solve the incremental generation problem, we came up with
two approaches, and would like to know the opinions of everyone on the
list who is interested. Both of them aim to improve --multiple, and
should be faster than the current system.
1. One approach is to add an option like --wrap-only <pyste file>, that
would generate just the code related to the pyste file, and --main-only,
that would generate just the _main.cpp file:
pyste --wrap-only foo1.pyste --multiple --module=foo foo1.pyste foo2.pyste
This would generate just _foo1.cpp. Notice that we still pass all pyste
files on the command line because, as explained before, Pyste must know
all classes that are being exported to be able to generate correct code.
pyste --main-only --multiple --module=foo foo1.pyste foo2.pyste
This would generate just _main.cpp.
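To summarize which files each invocation would write, here is a small
sketch. It only models the proposal: `generated_files` is a hypothetical
helper, `wrap_only` and `main_only` stand in for the proposed
--wrap-only and --main-only options, and the naming rule is inferred
from the _main.cpp, _foo1.cpp, _foo2.cpp example above.

```python
import os

def generated_files(pyste_files, wrap_only=None, main_only=False):
    """List the .cpp files one run would write in --multiple mode."""
    if main_only:
        return ["_main.cpp"]
    # With wrap_only, only that one pyste file gets a wrapper generated.
    targets = [wrap_only] if wrap_only else pyste_files
    wrappers = [
        "_%s.cpp" % os.path.splitext(os.path.basename(f))[0] for f in targets
    ]
    if wrap_only:
        return wrappers
    return ["_main.cpp"] + wrappers

files = ["foo1.pyste", "foo2.pyste"]
everything = generated_files(files)
just_foo1 = generated_files(files, wrap_only="foo1.pyste")
just_main = generated_files(files, main_only=True)
```

In all three calls the full list of pyste files is still passed in, even
though only the output list changes.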
The advantage of this approach is that it basically extends the
current workings of Pyste, i.e., users won't have to change anything to
keep using it. The disadvantage is that it looks weird, since you have
to pass all pyste files even though you may be interested in generating
code for just one.
2. Another approach is to make the dependencies explicit in the pyste
file by using another function, Import. This would make clear that for a
given pyste file to be exported, the Imported pyste file would have to
be taken into account. Going back to the Base/Derived example, we would
have either:
all.pyste:
Class('Base', ...)
Class('Derived', ...)
or:
base.pyste:
Class('Base', ...)
derived.pyste:
Import('base.pyste')
Class('Derived', ...)
That way, the dependencies between the files are explicit, and the user
is no longer required to pass all the files on the command line:
pyste --multiple --module=foo derived.pyste
would generate _derived.cpp, and:
pyste --multiple --module=foo base.pyste
would generate _base.cpp. To generate _main.cpp, the user would have to
call:
pyste --main-only --multiple --module=foo base.pyste derived.pyste
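The way Import would let Pyste find the files it needs can be sketched
like this. It is an illustration of the proposal only: the `imports`
dictionary is a hypothetical model of what each pyste file declares via
Import(...), and `files_to_load` is a made-up helper name.

```python
# Hypothetical model: each pyste file mapped to the files it Imports.
imports = {
    "base.pyste": [],
    "derived.pyste": ["base.pyste"],
}

def files_to_load(target, imports):
    """Return `target` plus everything it imports, directly or indirectly.

    Imported files come before the files that import them.
    """
    ordered, seen = [], set()

    def visit(name):
        if name in seen:
            return
        seen.add(name)
        for dep in imports.get(name, []):
            visit(dep)
        ordered.append(name)

    visit(target)
    return ordered

needed_for_derived = files_to_load("derived.pyste", imports)
needed_for_base = files_to_load("base.pyste", imports)
```

So processing derived.pyste pulls in base.pyste on its own, while
base.pyste needs nothing else.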
The advantages of this method are that the dependencies between the
files are explicit in the pyste files themselves, and that it feels more
natural than passing all pyste files on the command line in order to
generate code for only one of them. The disadvantages are that it
complicates the pyste files a little, and changes the way that Pyste
currently works.
Whew, that's all. Opinions, anyone?
Regards,
Nicodemus.