Pyste: your opinion about some changes
Hi everyone!

Prabhu and I have engaged in some discussions on IRC about Pyste, and came up with some ideas on how to fix The Order Bug, and the workings of Pyste in general, and would like to know your opinion on this.

The Order problem is this: class hierarchies must be exported in order, from the base class to the most-derived class. The naive approach (parse all header files, look up all the bases in the header files, and order the classes) uses too much memory, which makes it prohibitive when there are too many classes.

The first idea was based on David's suggestion of using pickle/shelve to hold the declarations in a form that could be easily swapped in and out of disk as needed, thereby solving the memory problem. We implemented this and it works correctly, but it is too slow, sometimes prohibitively so.

Plus, Prabhu noted a flaw in Pyste that was always there: you always generate the *entire* wrapper code; there's no support for generating only the wrapper code for a single pyste file. So, he would change something in a Pyste file, like excluding a function, and *all* the wrapper code would be generated again, taking another 10 minutes on his machine. Clearly, a more incremental approach is needed.

We think a viable solution is to hand the responsibility of ordering the classes to the users. Ideally, one could do this in a Pyste file:

    Class('Derived', ...)
    Class('Base', ...)

And Pyste would first instantiate Base, and then Derived. While this is a nice feature, it creates some complications:

- Given Derived, we must know what its bases are, and we can only do that by calling gccxml.
- Given Derived, we must know if Base was exported, so we can put "bases<Base>" inside the class_. If the user just exported Derived, "bases<Base>" should not be generated.

We decided to drop this feature: while nice, it is not strictly necessary, since when exporting hierarchies it is natural to export the bases first. So, the user must write:

    Class('Base', ...)
    Class('Derived', ...)

as he would if he were writing the wrapper by hand. What do you people feel about this?

For Pyste to generate the correct code for a given class, it must know all the other classes that are also being exported, as explained above. So we must pass all pyste files to Pyste somehow before being able to produce code for any class.

    pyste --module=foo foo1.pyste foo2.pyste

This generates a file named foo.cpp, with all the wrapper code in it. While convenient, this is impractical, since compile times can get very high for large libraries. That is why --multiple was created:

    pyste --multiple --module=foo foo1.pyste foo2.pyste

That generates 3 files: _main.cpp, _foo1.cpp and _foo2.cpp. Compiling and linking them together gives the same result as without the --multiple flag.

And now, to solve the incremental generation problem, we came up with 2 approaches, and would like to know the opinions of everyone on the list who is interested. Both of them aim to improve --multiple, and should be faster than the current system.

1. One approach is to add an option like --wrap-only <pyste file>, that would generate just the code related to that pyste file, and --main-only, that would generate just the _main.cpp file:

    pyste --wrap-only foo1.pyste --multiple --module=foo foo1.pyste foo2.pyste

Would generate just _foo1.cpp. Notice that we still pass all pyste files on the command line because, as explained before, Pyste must know all classes that are being exported to be able to generate correct code.

    pyste --main-only --multiple --module=foo foo1.pyste foo2.pyste

Would generate just _main.cpp.

The advantage of this approach is that it basically extends the current workings of Pyste, i.e., users won't have to change anything to keep using it. The disadvantage is that it looks weird, since you have to pass all pyste files even though you may be interested in generating code for just one.

2. Another approach is to make the dependencies explicit in the pyste file by using another function, Import. This would make clear that for a given pyste file to be exported, the Imported pyste file would have to be taken into account. Going back to the Base/Derived example, we would have either:

all.pyste:

    Class('Base', ...)
    Class('Derived', ...)

or:

base.pyste:

    Class('Base', ...)

derived.pyste:

    Import('base.pyste')
    Class('Derived', ...)

That way, the dependencies between the files are explicit, and the user is no longer required to pass all the files on the command line:

    pyste --multiple --module=foo derived.pyste

would generate _derived.cpp, and:

    pyste --multiple --module=foo base.pyste

would generate _base.cpp. To generate _main.cpp, the user would have to call:

    pyste --main-only --multiple --module=foo base.pyste derived.pyste

The advantages of this method are that the dependencies between the files are explicit in the pyste files themselves, plus it feels more natural than passing all pyste files on the command line in order to generate code for only one of them. The disadvantages are that it complicates the pyste files a little, and changes the way that Pyste currently works.

Whew, that's all. Opinions, anyone?

Regards,
Nicodemus.
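To make approach 2 concrete, here is a minimal sketch of how an Import function could let a single pyste file see the classes it depends on without wrapping them again. It is only an illustration under stated assumptions: the name collect_exports, the in-memory `sources` dictionary standing in for files on disk, and the simplified Class signature are all hypothetical and are not taken from Pyste's actual implementation.

```python
def collect_exports(sources, entry):
    """Return (exported, wrapped) for the pyste file named `entry`.

    `sources` maps a pyste filename to its text (hypothetical stand-in
    for reading files from disk). `exported` lists every class visible
    to `entry`, bases first; `wrapped` lists only the classes declared
    in `entry` itself, i.e. the ones code would be generated for.
    """
    exported = []   # all known classes, in declaration order
    wrapped = []    # classes declared directly in the entry file
    seen = set()    # files already processed (guards against cycles)

    def run(filename, is_entry):
        if filename in seen:
            return
        seen.add(filename)

        def Class(name, *args):
            # Record the class; only the entry file's classes get wrapped.
            exported.append(name)
            if is_entry:
                wrapped.append(name)

        def Import(other):
            # Imported files contribute knowledge, not generated code.
            run(other, False)

        # Pyste files are Python scripts, so execute the text with the
        # two functions available as globals.
        exec(sources[filename], {"Class": Class, "Import": Import})

    run(entry, True)
    return exported, wrapped


sources = {
    "base.pyste": "Class('Base', ...)",
    "derived.pyste": "Import('base.pyste')\nClass('Derived', ...)",
}
exported, wrapped = collect_exports(sources, "derived.pyste")
print(exported, wrapped)
```

With this shape, running the sketch on derived.pyste alone learns about Base through the Import, while only Derived is marked for code generation.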
Nicodemus <nicodemus@globalite.com.br> writes:
The advantages of this method are that the dependencies between the files are explicit in the pyste files themselves, plus it feels more natural than passing all pyste files on the command line in order to generate code for only one of them. The disadvantages are that it complicates the pyste files a little, and changes the way that Pyste currently works.
Sounds like all the advantages go to the users and the disadvantages go to the developer. I vote for this one ;-)

--
Dave Abrahams
Boost Consulting
www.boost-consulting.com
"DA" == David Abrahams <dave@boost-consulting.com> writes:
DA> Nicodemus <nicodemus@globalite.com.br> writes:
>> The advantages of this method are that the dependencies between
>> the files are explicit in the pyste files themselves, plus it
>> feels more natural than passing all pyste files on the command
>> line in order to generate code for only one of them. The
>> disadvantages are that it complicates the pyste files a little,
>> and changes the way that Pyste currently works.

DA> Sounds like all the advantages go to the users and the
DA> disadvantages go to the developer. I vote for this one ;-)

I'm very glad that you do! It does make life much easier when wrapping multiple Pyste files. I don't think this feature would be hard to implement either. I think what Nicodemus was referring to in,

"complicates the pyste file a little, and changes the way that Pyste currently works."

was that users now need to remember to add an Import('base.pyste') before they wrap Class('Derived') inside derived.pyste. If not, the bases will not be right. However, I think that's quite OK, since users will know the bases that they wrap; at least they will know what bases the currently wrapped classes are derived from. The scheme is also similar to what one would expect in C++.

The Import function is similar to the %import directive in SWIG, so SWIG users will also be comfortable with this function.

Best of all, the Import directive is optional, so current users of Pyste will not notice any difference, since the original behavior will be unchanged. It's only when one needs to wrap the library incrementally that this function helps heaps. It is also useful when you need to split the library into small sub-packages and have classes in one sub-package derived from bases in another sub-package.

The second issue is that it changes the internals of Pyste a little, but I think it will not be a big change at all. We've made much bigger ones in the past week. So I'm not sure there are serious disadvantages to the developer either. :)

Nicodemus, please do let us know if I am seriously mistaken about something here.

cheers,
prabhu
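The point about the bases "not being right" without the Import can be sketched as follows. This is a hypothetical helper, not Pyste's code: the name class_decl and its arguments are assumptions, with `declared_bases` standing for what gccxml reports and `exported` for the set of classes known to be wrapped (including those learned through Import).

```python
def class_decl(name, declared_bases, exported):
    """Build the class_<...> head for `name`.

    Only bases that are themselves exported may appear in bases<...>;
    a base that is not wrapped anywhere must be left out.
    """
    known = [base for base in declared_bases if base in exported]
    if known:
        # Space between the closing brackets keeps older C++ compilers happy.
        return "class_<%s, bases<%s> >" % (name, ", ".join(known))
    return "class_<%s>" % name


# With Import('base.pyste'), Base is known to be exported:
print(class_decl("Derived", ["Base"], {"Base", "Derived"}))
# Without it, Pyste would only know about Derived:
print(class_decl("Derived", ["Base"], {"Derived"}))
```

The second call shows the failure mode: if the Import is forgotten, the generated class_ silently loses its bases<Base> entry.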
Prabhu Ramachandran wrote:
"DA" == David Abrahams <dave@boost-consulting.com> writes:
DA> Sounds like all the advantages go to the users and the
DA> disadvantages go to the developer. I vote for this one ;-)
I'm very glad that you do! It does make life much easier when wrapping multiple Pyste files. I don't think this feature would be hard to implement either. I think what Nicodemus was referring to in,
"complicates the pyste file a little, and changes the way that Pyste currently works."
was that users now need to remember to add an Import('base.pyste') before they wrap Class('Derived') inside derived.pyste.
Exactly, the internals will change very little; what I meant is that the way users use Pyste would change. 8)

Regards,
Nicodemus.
participants (3)
- David Abrahams
- Nicodemus
- Prabhu Ramachandran