[C++-sig] Patches and complete pyste replacement prototype for pyplusplus

Mon Feb 27 03:13:48 CET 2006

On 2/24/06, Matthias Baas <baas at ira.uka.de> wrote:
> Allen Bierbaum wrote:
> > Roman and Matthias:  I have attempted to implement some solution or
> > method for every topic we have talked about so far on the e-mail list.
> >  If there was anything I missed please let me know.  This API has
> > evolved from my original proposal and thus differs in implementation
> > from the interface Matthias has proposed.  I read his proposal though
> > and tried to incorporate the ideas where possible.  If I haven't used
> > an idea it is either because I found another way to handle the issue
> > or I found issues with either the use or implementation.
>
> Well, as I have already seen your first version and even used that as a
> basis for my own version, it shouldn't come as a surprise that I
> actually do prefer some things as they are implemented in my proposal
> (otherwise I wouldn't have changed them in the first place). ;)

I have finally had some time to look over the proposals and add my two
cents about the differences and possible ideas for the future.

> But first, I'd like to note that the overall principle of our two
> versions is actually the same and because of some minor details it just
> appears to be different.

Agreed.  The concepts are similar but I think since we had slightly
differing goals we may have ended up with differing interfaces and
ideas.

My goal was to create something that was similar to Pyste in that it
was a domain specific language for explicitly exposing declarations
for boost.python bindings.  As such I didn't focus on too many of the
"power user" features yet but instead concentrated on making the easy
things easy and some of the previously impossible things easy as well.
:)

Also the libraries I am wrapping are fairly huge so I worked on some
issues related to splitting bindings across multiple generation phases
and other issues of scalability.

> Allen has a class called "Module" that is used to control the internals
> of pyplusplus. In my version, this class is called "Pipeline" (in my
> opinion this class actually represents the pyplusplus "core", so I find
> the name "Module" a bit misleading). But some might say this is
> nitpicking... ;)

I debated for quite a while what to call this object.  It corresponds
roughly to a builder for what pyplusplus calls a module, so I went
with that.  Another reason I chose this name is that from the user
perspective what they are trying to build is a python module and this
is the tool they are using to build it.  I am definitely willing to
rename this and thinking about it now it may be better to call it
"ModuleBuilder" or something along those lines.

> In Allen's version, the user always explicitly creates an instance of
> that class himself, in my version this instance is created internally
> and each method is also available as function which internally calls the
> corresponding method of the global instance (if desired the user could
> also create an instance himself).

This is definitely one area where our efforts diverged.  Early on I
settled on using an object-oriented API throughout because:
- I am assuming that people using the tool know python
- I want the ability to have binding generation scripts instantiate
multiple separate builders
- It seemed to be a good idea conceptually to deal with objects throughout

I could definitely add a similar interface of global methods that
automatically call through to a single global instance, but it
wouldn't work for my bindings so I didn't spend too much time on it.
If this is a required capability I could easily add it.
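
To make the two styles concrete, here is a minimal sketch of function-style
wrappers that forward to one lazily created global instance, next to plain
object-style use.  Every name in it (ModuleBuilder, default_builder, the
recorded header list) is hypothetical and chosen only for illustration; it
is not code from either prototype.

```python
class ModuleBuilder:
    """Hypothetical stand-in for the builder object discussed above."""

    def __init__(self, name):
        self.name = name
        self.headers = []

    def parse(self, header):
        # A real builder would run the C++ parser; this only records the call.
        self.headers.append(header)
        return self


_default = None  # the single global instance, created on first use


def default_builder():
    global _default
    if _default is None:
        _default = ModuleBuilder("default")
    return _default


def parse(header):
    """Module-level convenience that forwards to the global instance."""
    return default_builder().parse(header)


# Function style: one implicit builder.
parse("a.h")

# Object style: several independent builders in one script.
core = ModuleBuilder("core")
gui = ModuleBuilder("gui")
core.parse("core.h")
gui.parse("gui.h")
```

The function layer is a thin addition on top of the class, but a script
that needs more than one builder still has to use the objects directly.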

> In Allen's version there are three main "control methods": parse(),
> createCreators(), createModule(). In my version, I have the three
> methods parse(), codeCreators() and writeFiles() which serve the same
> purpose (as said above, these methods are also available as functions).
> In both versions, the second step (creating the code creators) is done
> internally if it wasn't done explicitly by the user (in my version I
> also applied that rule for the parse() step, but I admit that probably
> everyone has to do that step manually anyway; still, it's a nice feature
> for a "Hello World" example :) ).

That was an oversight on my part.  I have added code now to
automatically call parse() if needed.
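
The "run the earlier step if the user skipped it" rule can be sketched as
below.  The three method names come from the discussion above; the class
name, the flags, and the recorded call list are assumptions made for the
example, and the real steps would of course do actual work.

```python
class ModuleBuilder:
    """Hypothetical builder whose later steps trigger missing earlier ones."""

    def __init__(self):
        self.calls = []          # records the order in which steps ran
        self._parsed = False
        self._creators = False

    def parse(self):
        self.calls.append("parse")
        self._parsed = True

    def createCreators(self):
        if not self._parsed:     # parse() was skipped, so run it now
            self.parse()
        self.calls.append("createCreators")
        self._creators = True

    def createModule(self):
        if not self._creators:   # createCreators() was skipped, so run it now
            self.createCreators()
        self.calls.append("createModule")


builder = ModuleBuilder()
builder.createModule()           # the only explicit call
print(builder.calls)             # ['parse', 'createCreators', 'createModule']
```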

> In both versions, there are methods Class, Method, Function, etc. to
> select one or more particular declarations that can then be decorated to
> customize the final bindings. In Allen's version, these functions either
> return a DeclWrapper or MultiDeclWrapper object (depending on whether
> the selection contains one or more declarations). In my version, the
> return value is an IDecl object (that always acts like a MultiDeclWrapper).
> Decorating the declarations also looks almost the same in both versions.

I thought about doing this the way Matthias did, but I decided that I
wanted an easy way to detect user errors and give good warnings.
What I found was that by splitting this in two I could have a separate
interface for MultiDeclWrapper (the case where multiple declarations
are wrapped) and only allow methods that make sense for multiple
declarations.  Similarly this interface can modify the way the methods
operate to take into account that they are wrapping multiple
declarations.  If I made everything wrap multiple declarations then I
would have to add test/handling code in each method to check whether
the method was valid.

I am not too hung up on this though as it was more an implementation
detail than anything else.
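
A rough sketch of the split, assuming declarations are plain dicts and that
rename() stands in for "an operation that only makes sense on a single
declaration".  The class names match the ones used in this thread, but the
bodies, select(), and rename() are invented for the example.

```python
class DeclWrapper:
    """Wraps exactly one declaration (hypothetical, simplified)."""

    def __init__(self, decl):
        self.decl = decl

    def ignore(self):
        self.decl["ignored"] = True
        return self

    def rename(self, new_name):
        # Renaming only makes sense for a single declaration.
        self.decl["alias"] = new_name
        return self


class MultiDeclWrapper:
    """Wraps several declarations; offers only operations valid for all.

    There is deliberately no rename() here, so calling it raises
    AttributeError and the user error surfaces immediately.
    """

    def __init__(self, decls):
        self.wrappers = [DeclWrapper(d) for d in decls]

    def ignore(self):
        for wrapper in self.wrappers:
            wrapper.ignore()
        return self


def select(decls, name):
    """Return the wrapper type that fits the number of matches."""
    matches = [d for d in decls if d["name"] == name]
    if not matches:
        raise RuntimeError("no declaration named %r" % name)
    if len(matches) == 1:
        return DeclWrapper(matches[0])
    return MultiDeclWrapper(matches)


decls = [{"name": "f"}, {"name": "g"}, {"name": "g"}]
single = select(decls, "f").rename("f_renamed")   # fine: one declaration
many = select(decls, "g").ignore()                # fine: applies to both
```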

> So far, both versions are almost identical. However, at the moment, a
> big difference is the expressiveness of the declaration selection
> methods (Class, Method, Function,...) and the exact semantics of the
> declaration wrappers. And here I actually prefer my version where
> obtaining an IDecl object is like doing a database query to retrieve a
> set of declarations that meet certain requirements. The resulting IDecl
> object can reference an arbitrary number of declarations scattered all
> over the declaration tree. It can even be empty and still provide the
> decoration interface. Such a "database query" can be further refined by
> calling the Class, Method, Function,... methods again on an IDecl object
> (this is a feature that the MultiDeclWrapper in Allen's version
> currently does not allow). Each individual query can also be composed of
> several filters where different filter types are concatenated with AND
> and filters of the same type are concatenated with OR (see
> http://i31www.ira.uka.de/~baas/pypp/classpyppapi_1_1decl_1_1_i_decl.html#063f1880bb6bc164d2f0ecd5fc92a3c1).
> In Allen's version, all queries are based on the declaration type and
> name. Queries for methods can optionally use the arguments or return
> type as well (but the name is still mandatory). As this is a subset of
> the query filters in my version I think it wouldn't be a problem to
> elevate Allen's version to the same expressiveness as my version.

Agreed.  This is definitely the primary area where your API has more
capabilities than the version I wrote.  In general I am all for adding
the expressiveness that you have and I believe your implementation
based on building/extending a list of filters is really a nice way to
go about this in an extensible way. (I would love to use some custom
filters built up from type traits on member methods). I agree that I
could extend it with this and I would actually like to give it a shot
very soon.
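
As an illustration of the combination rule Matthias describes (AND across
different filter types, OR within one type), here is a small sketch over
dict-based declarations.  The query() function and the dict keys are
assumptions for the example and do not reproduce his implementation.

```python
def query(decls, **filters):
    """Return decls matching every filter type; a list means 'any of these'."""

    def matches(decl):
        for key, wanted in filters.items():
            options = wanted if isinstance(wanted, (list, tuple)) else [wanted]
            if decl.get(key) not in options:   # OR within one filter type
                return False                   # AND across filter types
        return True

    return [d for d in decls if matches(d)]


decls = [
    {"type": "method", "name": "operator()", "retval": "float &"},
    {"type": "method", "name": "operator()", "retval": "double &"},
    {"type": "method", "name": "operator()", "retval": "int"},
    {"type": "method", "name": "size", "retval": "float &"},
]

hits = query(decls, type="method", name="operator()",
             retval=["float &", "double &"])
empty = query(decls, name="no_such_name")   # an empty result is not an error
```

Here hits holds the first two declarations and empty is an empty list; a
custom filter would be one more key handled inside matches().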

There is one area here though where I am a little worried.  Namely I
find the way I query only the children of a declaration to be a little
more structured.

For example with my method the user would always build up their module
based on the name hierarchy of the module:

ns = mod.Namespace("test_ns")
class1 = ns.Class("class1")
class1_method1 = class1.Method("method1")
class2 = ns.Class("class2")
class2_method1 = class2.Method("method1")

In Matthias's API I believe you could ask for all methods named
"method1" across the entire decl tree.  I am not sure this is such a
good idea, or at least I would classify it as a power user
capability.  As such I am not opposed at all to adding it, but I
don't think it should be the default and I would recommend that
it be available through a different interface.

Once again though I could be convinced otherwise if people really like
this ability.

> I tried to convert my current project to Allen's API but as I have used
> my "multiple selection" feature quite often I didn't translate all of
> it. Here are some examples:
>
> In my version I'm ignoring all protected methods of all classes like this:
>
> Method(accesstype=PROTECTED).ignore()
>
> Here, I don't know what the corresponding code would look like in
> Allen's version (but I suppose that it would also be the Method() query
> that would provide a similar argument, right? But this shows already
> that the 'name' argument shouldn't be mandatory).

Agreed.  I definitely think this should be possible.

> Then I ignore all ()-operators that return a reference to a float or
> double by the following line:
>
> Method("operator()", retval=["float &", "double &"]).ignore()
>
> Again, this addresses several classes and several methods at once. There
> are four filters (and three filter types) involved in this query:

This is the one I am not so sure about.  I like the idea of being able
to do this but I am not convinced that it should be default behavior
to search across the entire declaration tree.

Maybe something like this instead:

ns = mod.Namespace("test_ns")
ns.Method("operator()", retval=["float &", "double &"], recursive=True).ignore()

(notice the explicit request to recursively search).
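
A minimal sketch of how such a flag could behave on a declaration tree:
children only by default, the whole subtree on request.  The Decl class and
its find() method are hypothetical; only the recursive keyword is taken
from the suggestion above.

```python
class Decl:
    """Hypothetical node in a declaration tree."""

    def __init__(self, kind, name, children=()):
        self.kind = kind
        self.name = name
        self.children = list(children)

    def find(self, kind, name, recursive=False):
        """Search direct children by default; descend only on request."""
        found = []
        for child in self.children:
            if child.kind == kind and child.name == name:
                found.append(child)
            if recursive:
                found.extend(child.find(kind, name, recursive=True))
        return found


ns = Decl("namespace", "test_ns", [
    Decl("class", "class1", [Decl("method", "method1")]),
    Decl("class", "class2", [Decl("method", "method1")]),
])

direct = ns.find("method", "method1")                 # no methods directly in ns
deep = ns.find("method", "method1", recursive=True)   # one match per class
```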

> - A "type" filter because I was using the Method() function
> (alternatively I could have used the generic Decl() function together
> with the type=METHOD filter (which is what happens internally))
> - A "name" filter (in my API I'm using the convention that the first
> argument is guaranteed to be the 'name' filter. All other filters must
> be specified by keyword arguments)
> - Two "return value" filters (which are concatenated with OR)
>
> When I translated my project to Allen's API the above line became:
>
> for cls in classes:
>      Cls = mod.Class(cls)
>      try:
>          op = Cls.Method("operator\(\)", retval="float &|double &")
>          op.ignore()
>      except RuntimeError:
>          pass
>
> (classes is a list of class names that should be exposed. I have that
> list anyway, so it was no problem to use that one here)
> I had to check for the RuntimeError exception because the Method() query
> could be empty which is actually ok in my case. This is an example that
> shows that it can be ok that a query produces an empty result.
> Another thing that was confusing is that the name is *always* treated as
> a regular expression. In my first attempt, I was just searching for
> "operator()" and didn't get the expected results. After looking at the
> API code I noticed that the string is treated as a regular expression
> which means the brackets already have a special meaning and have to be
> escaped. This is the reason why I suggested to mark regular expressions
> explicitly (in my version by enclosing it between two '/' (maybe this
> should be another character as the slash could actually be part of a
> path name)).

Agreed, it could be very useful to have a way to make explicit that a
regex is being used.
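
One way the explicit marking could work, following the /.../ convention
Matthias mentions: a name wrapped in slashes is compiled as a regular
expression, anything else is compared literally.  The helper name and the
choice of full-string matching are assumptions for this sketch.

```python
import re


def name_matcher(pattern):
    """Treat /.../ as a regular expression and anything else as a literal."""
    if len(pattern) >= 2 and pattern.startswith("/") and pattern.endswith("/"):
        regex = re.compile(pattern[1:-1])
        return lambda name: regex.fullmatch(name) is not None
    return lambda name: name == pattern


literal = name_matcher("operator()")     # brackets need no escaping
regex = name_matcher("/operator.*/")     # explicit regular expression

print(literal("operator()"))   # True
print(regex("operator[]"))     # True
print(regex("size"))           # False
```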

> I have more such examples, but I think they won't highlight any further
> issues, so I'll leave them out.
>
> I can't comment on the features where Allen is clearly ahead of my
> version (such as templates) as I was focusing on the stuff that I need
> for wrapping my SDK (which doesn't have any templates).

And I think it is clear that you were wrapping an API where you needed
more expressiveness in the queries. :)

In my personal opinion (and I am highly biased) I would summarize the
comparison by saying that the prototype I put together may be further
ahead on features in general but could definitely benefit from more
expressive queries.  If we could come to some agreement about how
queries should work across the decl tree I would like to extend my
API proposal with the expressiveness of yours.  I could build upon
many of the ideas from your implementation and I am already thinking
of places in my wrapper scripts where doing so would simplify my life
quite a bit. :)

Do you think it would be a good idea for me to refine my prototype
with your query system or should we start over with a new code base
merging the best ideas?

-Allen

>
>
> - Matthias -