On Sun, Apr 11, 2021 at 2:07 PM Oscar Benjamin <oscar.j.benjamin@gmail.com> wrote:
On Sun, 11 Apr 2021 at 10:25, Stéfane Fermigier <sf@fermigier.com> wrote:

> 2) What's a "public API"? A particular library can have an internal API (with regular public functions, methods, etc.) but only willingly expose a subset of this API to its clients. __all__ doesn't help much here, as the tools I know of don't have this distinction in mind.

I don't think that this distinction is so important in practice. 
What matters is the boundary between different codebases. Within a codebase
a team can agree whatever conventions they want because any time
something is changed it can be changed everywhere at once in a single
commit. What matters is communicating from project A to all downstream
developers and users what can be expected to remain compatible across
different versions of project A.

We agree that the most important API is the Public API, not the internal API (except for very large scale libraries where this can become important).

But Python mostly supports (via https://docs.python.org/3/tutorial/classes.html#tut-private or the _-prefix naming convention) the notion of internal APIs, not that of a public API.
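To illustrate what the language gives us today, here is a minimal sketch (the Engine class and its attributes are made up for the example). A single leading underscore is only a convention, and a double leading underscore triggers per-class name mangling. Neither says anything about which parts of a library form its public API.

```python
class Engine:
    """Hypothetical class, used only to illustrate the naming conventions."""

    def __init__(self):
        self.url = "sqlite://"      # no prefix: implicitly public
        self._pool = []             # single underscore: internal by convention only
        self.__secret = "s3cret"    # double underscore: stored as _Engine__secret


engine = Engine()

# Nothing stops a client from touching the "internal" attribute.
engine._pool.append("conn")

# Outside the class body the short name is not mangled, so lookup fails...
has_short_name = hasattr(engine, "__secret")

# ...but the attribute is still reachable under its mangled name.
mangled_value = engine._Engine__secret
```

Both mechanisms work at the level of a class or a module, which is why they describe "internal" names rather than a library-wide public API.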

> 3) A library author can decide that the public API of its library is the part exposed by its top-level namespace, e.g:
>
> from flask import Flask, request, ...
>
> where Flask, request, etc. are defined in subpackages of flask.

This approach really doesn't scale.

It scales as long as it's expected that when you import one object from the library, you will use most of the library.

In some cases (for instance: SQLAlchemy, where you may want to use the Engine but not the ORM) the library author(s) have already thought about the issue and made several top-level namespaces.

See for instance:

etc.

For one thing imports in Python
have a run-time performance cost and if everything is imported at
top-level like this then it implies that the top level __init__.py has
imported every module in the codebase. Also organising around
sub-packages and modules is much nicer for organising documentation,
module level docstrings etc.
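
One possible way to keep a flat top-level namespace without paying the import cost up front is a module-level __getattr__ (PEP 562, Python 3.7+) that imports each name on first use. This is a minimal sketch, simulated in a single file with types.ModuleType: the package name mylib and the export table are hypothetical, and standard-library modules stand in for the library's own subpackages.

```python
import importlib
import sys
import types

# Hypothetical mapping: public name -> (module that defines it, attribute name).
# Standard-library modules stand in for the library's own subpackages.
_LAZY_EXPORTS = {
    "OrderedDict": ("collections", "OrderedDict"),
    "dumps": ("json", "dumps"),
}

loaded = []  # records which public names were resolved, for demonstration


def _lazy_getattr(name):
    """Module-level __getattr__: called only when normal lookup fails."""
    try:
        module_name, attr = _LAZY_EXPORTS[name]
    except KeyError:
        raise AttributeError(f"module 'mylib' has no attribute {name!r}") from None
    value = getattr(importlib.import_module(module_name), attr)
    loaded.append(name)
    setattr(mylib, name, value)  # cache, so __getattr__ is not called again
    return value


# Stand-in for the package's top-level __init__.py.
mylib = types.ModuleType("mylib")
mylib.__getattr__ = _lazy_getattr
mylib.__all__ = sorted(_LAZY_EXPORTS)
sys.modules["mylib"] = mylib

from mylib import dumps  # resolved lazily; "OrderedDict" is not touched

result = dumps({"a": 1})
```

In a real package the same function would live in __init__.py under the name __getattr__. This only defers the cost, and it does not address the point about organising documentation around subpackages.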

> I guess this issue also comes from the lack of a proper way to define (either as a language-level construct or as a convention among developers and tools authors) a proper notion of a "public API", and, once again, I don't believe __all__ helps much with this issue.
>
> => For these use cases, a simple solution could be devised that doesn't involve language-level changes but needs a wide consensus among both library authors and tool authors:
>
> 1) Mark each package or module that serves as a public API namespace with a special marker (for instance: __public_api__ = True).
>
> 2) Static and runtime tools could easily be devised that raise a warning when:
>
> a) Such a marker is present in one or more modules of the package.
>
> b) and: one imports another module of the same package from another package.
>
> This is just a rough idea. Additional use cases could easily be added by adding other marker types.
>
> An alternative could be to use decorators (e.g. @public, as mentioned in another message), as long as we don't confuse "public = part of the public API of the library" with "public = not private to this particular module".
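
To make the marker idea a bit more concrete, here is a rough sketch of what a static check could look like, using only the standard ast module. The function names, the "flasklike" and "myapp" packages and the warning text are all hypothetical. No existing tool is known to recognise __public_api__, so this is an illustration of rules (a) and (b) above rather than a finished design.

```python
import ast


def is_marked_public(source):
    """Return True if the module source has a top-level `__public_api__ = True`."""
    for node in ast.parse(source).body:
        if isinstance(node, ast.Assign):
            for target in node.targets:
                if (
                    isinstance(target, ast.Name)
                    and target.id == "__public_api__"
                    and isinstance(node.value, ast.Constant)
                    and node.value.value is True
                ):
                    return True
    return False


def check_imports(client_source, client_package, library_sources):
    """List imports that bypass the marked modules of a package that uses the marker.

    `library_sources` maps dotted module names to their source code.
    """
    public = {name for name, src in library_sources.items() if is_marked_public(src)}
    # Rule (a): only packages with at least one marked module are checked.
    marked_packages = {name.split(".")[0] for name in public}

    problems = []
    for node in ast.walk(ast.parse(client_source)):
        if isinstance(node, ast.Import):
            imported = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.level == 0 and node.module:
            imported = [node.module]
        else:
            continue
        for name in imported:
            top = name.split(".")[0]
            # Rule (b): the import crosses a package boundary into an unmarked module.
            if top in marked_packages and top != client_package and name not in public:
                problems.append(
                    f"line {node.lineno}: {name} is not a public API module of {top}"
                )
    return problems


# Hypothetical library: only the top-level package carries the marker.
library_sources = {
    "flasklike": "__public_api__ = True\nfrom flasklike.app import App\n",
    "flasklike.app": "class App: pass\n",
}
client = (
    "from flasklike import App\n"
    "from flasklike.app import App as RawApp\n"
    "import json\n"
)
problems = check_imports(client, "myapp", library_sources)
```

Here only the second import of the client is reported. The sketch ignores relative imports and the case where "from package import submodule" pulls in an unmarked submodule, which a real tool would have to handle.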

I don't think __all__ helps either but for different reasons.

I think we agree for the same reason actually.
 
Python makes everything implicitly public. What is actually needed is a way
to clearly mark the internal code (which is most of the code in a
large project) as being obviously private.

We agree that there needs to be a way to mark what is part of the public API vs. what is the (library-)private implementation.

Being explicit about what is public vs. being explicit about what is private are two complementary approaches (as long as we agree that public vs. private is a two-state choice, in this context).
 
That's why I prefer the Jax approach of putting the implementation in
a jax._src package.
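
With that kind of layout, the rule a tool or a reviewer has to apply becomes very simple. A minimal sketch, assuming the convention is "any path component with a leading underscore is private" (the helper and the mylib module names are hypothetical):

```python
def is_private_module(dotted_name):
    """True if any component of the dotted path has a leading underscore.

    Dunder components such as __main__ are not treated as private.
    """
    for part in dotted_name.split("."):
        is_dunder = part.startswith("__") and part.endswith("__")
        if part.startswith("_") and not is_dunder:
            return True
    return False


# Hypothetical module names following the "_src" layout.
checks = {
    "mylib.core": is_private_module("mylib.core"),
    "mylib._src.core": is_private_module("mylib._src.core"),
    "__main__": is_private_module("__main__"),
}
```

Everything under mylib._src is flagged as private, and the public modules only need to re-export what they choose to expose.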

That's another approach, which reminds me of some Java projects I have worked on in the past.

This is the first time I've seen it in a Python project, though, whereas I have found the other approach in several projects I work with on a daily basis.

  S.

--
Stefane Fermigier - http://fermigier.com/ - http://twitter.com/sfermigier - http://linkedin.com/in/sfermigier
Founder & CEO, Abilian - Enterprise Social Software - http://www.abilian.com/
Chairman, National Council for Free & Open Source Software (CNLL) - http://cnll.fr/
Founder & Organiser, PyParis & PyData Paris - http://pyparis.org/ & http://pydata.fr/