[Python-Dev] Private header files (Was: Renaming Include/object.h)

Thu Jan 4 09:33:46 CET 2007

On 1/3/07, "Martin v. Löwis" <martin at v.loewis.de> wrote:
> Neal Norwitz schrieb:
> > By private, I mean internal only to python and don't need to prefix
> > their identifiers with Py and are subject to change without backwards
> > compatibility.  Include/graminit.h is one example of what I mean.
> > Some others are:  bitset.h, grammar.h, opcode.h, metagrammar.h,
> > errcode.h
>
> Ah. This seems to be a requirement completely different from the
> one I'm talking about. By this definition, object.h is *not* an
> internal header file, yet I want it to be renamed.

Agreed.  I was mixing two things that aren't necessarily related
because I see the same possible solution.  I'm also using this one
example as a good opportunity to clean up more things.  Let me try to
explain a bit more below.

> As for this issue: how about moving all such private header files
> out of Include entirely? The parser-related ones should go into
> Parser, for example (grammar.h, bitset.h, graminit.h, metagrammar.h,
> errcode.h). This would leave us with opcode.h only.
>
> > Others are kinda questionable (they have some things that are
> > definitely public, others I'm not so sure about):  code.h, parsetok.h,
> > pyarena.h, longintrepr.h, osdefs.h, pgen.h, node.h
>
> Thomas said that at least code.h must stay where it is.
>
> What is the reason that you want them to be renamed?

Sorry, I wasn't trying to imply that these should necessarily be
renamed, only that the internal portions be moved elsewhere.  I guess
I should explain my mental model first which might make things
clearer.  Then again, I'm tired, so who knows if it will explain
anything. :-)

I'm a Python embedder and I want to know what's available to me.  I
look in Include and see a ton of header files.  Do I need all these?
What do I *need* and what can I *use*?  I only want to see the public
stuff that is available to me.  Thus I want anything that has
internal/implementation details/etc out of my sight to reduce my
learning curve.  I don't ever want to learn about something I won't
need nor include anything I won't need.

That's one part.

Another part of my mental model is that I'm a Python developer and I'm
modifying a header file that is implementation specific.  I need to
share it among different subdirectories (say Python and Objects).  So I
really need to stick the header file in a common place, Include/ is
it.

I don't want to export anything, but I don't know if other third party
developers will use the header or not.  Or maybe I need to include
some implementation details in another public header.  I'll probably
be lazy and just make a single header which has some internal and some
public stuff.

I want clear rules on when identifiers need to be prefixed.  If it's
internal (e.g., in an internal directory or prefixed with _), it can
have any name and can't be included from any non-internal header.  I
can also change it in a point release.  If anyone uses anything from
here, they are on their own.  If I see any identifier in a
non-internal header, it must be public and therefore prefixed with Py
or _Py.
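
A rule like this could be checked mechanically.  Below is a minimal
sketch, not an existing tool: the function name unprefixed_names, the
two regular expressions and the sample header text are all hypothetical,
and the regexes only approximate C declarations (simple #defines and
one-line style prototypes), so treat it as an illustration of the rule
rather than a real parser.

```python
import re

# Hypothetical patterns: they only approximate C syntax.
# Matches "#define NAME ..." and captures NAME.
DEFINE_RE = re.compile(r"^\s*#\s*define\s+([A-Za-z_]\w*)", re.MULTILINE)
# Matches simple prototypes such as "PyObject *PyFoo(int x);" and
# captures the function name just before the opening parenthesis.
PROTO_RE = re.compile(
    r"^[A-Za-z_][\w \t\*]*?\b([A-Za-z_]\w*)\s*\([^;{]*\)\s*;",
    re.MULTILINE,
)

# The rule from the text: names in a non-internal header start with Py or _Py.
PUBLIC_PREFIXES = ("Py", "_Py")


def unprefixed_names(header_text):
    """Return names declared in header_text that lack a public prefix."""
    names = DEFINE_RE.findall(header_text) + PROTO_RE.findall(header_text)
    return [name for name in names if not name.startswith(PUBLIC_PREFIXES)]


# Made-up header contents, for illustration only.
SAMPLE = """\
#define Py_NODE_H
#define MAXSTACK 500
PyObject *PyNode_New(int type);
int node_count(void);
void _PyNode_Fini(void);
"""

if __name__ == "__main__":
    for name in unprefixed_names(SAMPLE):
        print("missing Py/_Py prefix:", name)
```

Run over every header outside the internal directory, a check along
these lines would flag names that break the rule before they ship.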

The Python headers are pretty good about prefixing most things.  But
they could be better.  I think it gets harder to maintain without the
rules though.

Finally, by putting everything in directories and always including the
directory in the #include, like:

  #include "python/python.h"
  #include "python/internal/foo.h"

there can never be an include file name collision like the one that
started this thread.  It also provides a simple way of demonstrating
what's public and what is not.  It addresses all my complaints.  There
are only a few rules and they are simple.  But I am addressing several
points that are only loosely related, which is what I think generated
some confusion.

Adding the directory also makes clear where the header file comes from.
 If you see:

  #include "node.h"

you don't know if that's a python node.h, from some other part of the
code or a third party library.

Not to try to confuse things even more, but I will point out something
Google does that is only indirectly related.  Google requires
importing modules.  You aren't supposed to import classes.  You
wouldn't do:

  from foo.bar import Message
  # ...
  msg = Message(...)

You would do:

  from foo import bar
  # ...
  msg = bar.Message(...)

This makes it clear where Message comes from, just like adding a
python prefix to all header file names makes it clear where the header
file lives.  Both are good for traceability, though in different ways.
 This technique makes it easier to manage larger code bases,
particularly when there are multiple libraries used.
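
The multiple-library case can be seen with two standard library
modules that happen to export the same name.  This is only an
illustrative sketch: json and pickle stand in for arbitrary libraries,
and the variable names are made up.

```python
import json
import pickle

data = {"spam": 1}

# Module-qualified calls: the origin of each dumps() is visible at the
# call site, and the two functions never collide.
as_text = json.dumps(data)
as_bytes = pickle.dumps(data)

# Importing the names directly: the second import silently rebinds
# dumps, so the reader has to scroll up to learn which one is in effect.
from json import dumps
from pickle import dumps

if __name__ == "__main__":
    print(type(as_text).__name__, type(as_bytes).__name__)
    print("dumps now refers to pickle.dumps:", dumps is pickle.dumps)
```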

n