
14.07.19 05:09, Raymond Hettinger wrote:
On Jul 13, 2019, at 1:56 PM, Serhiy Storchaka <storchaka@gmail.com> wrote:
Could we strictly define what is considered a public module interface in Python?
The RealDefinition™ is that whatever we include in the docs is public, otherwise not.
Beyond that, there is a question of how users can deduce what is public when they run "import somemodule; print(dir(somemodule))".
Run "help(somemodule)" or read the module documentation. dir() is not the proper tool for getting the public interface. From https://docs.python.org/3/library/functions.html#dir: "If the object is a module object, the list contains the names of the module’s attributes." It says nothing about those names being public.
In some modules, we've been careful to use both __all__ and to use an underscore prefix to indicate private variables and helper functions (collections and random for example). IMO, when a module has shown that care, future maintainers should stick with that practice.
Either we establish the rule that all non-public names must be underscored and do a mass renaming through the whole stdlib, or we allow non-underscored names for internal things and leave the sources in peace. Note also that underscored names can be part of the public interface (for example, namedtuple._replace).
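To illustrate the last point, here is a minimal sketch of a documented, public method whose name starts with an underscore. The Point type and its field names are made up for the example; only namedtuple and _replace come from the standard library.

```python
from collections import namedtuple

# Point is a hypothetical record type used only for illustration.
Point = namedtuple("Point", ["x", "y"])

p = Point(1, 2)

# _replace is documented and public despite the leading underscore;
# the underscore is there to avoid clashing with user-chosen field names.
moved = p._replace(x=10)

print(p)      # Point(x=1, y=2)
print(moved)  # Point(x=10, y=2)
```

So a leading underscore alone cannot be used as a mechanical test for "not public".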
The calendar module is an example of where that care was taken for many years and then a recent patch went against that practice. This came to my attention when an end-user questioned which functions were for internal use only and posted their question on Twitter. On the tracker, I then made a simple request to restore the module's convention but you seem steadfastly resistant to the suggestion.
There was never such a convention. Even before that change there were non-underscored non-public members in the module. In Python 3.6:
>>> sorted(set(dir(calendar)) - set(calendar.__all__))
['EPOCH', 'FRIDAY', 'February', 'January', 'MONDAY', 'SATURDAY', 'SUNDAY', 'THURSDAY', 'TUESDAY', 'WEDNESDAY', '_EPOCH_ORD', '__all__', '__builtins__', '__cached__', '__doc__', '__file__', '__loader__', '__name__', '__package__', '__spec__', '_colwidth', '_locale', '_localized_day', '_localized_month', '_spacing', 'c', 'datetime', 'different_locale', 'error', 'format', 'formatstring', 'main', 'mdays', 'prweek', 'repeat', 'sys', 'week']
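The listing above mixes dunder and underscored names with the ones that matter for this argument. A rough sketch that keeps only the names which are neither in __all__ nor underscored could look like this; the variable names are illustrative, and the exact result depends on the Python version, so no output is shown.

```python
import calendar

# Names that the module explicitly exports.
exported = set(calendar.__all__)

# Names visible through dir() that are not exported and do not
# start with an underscore, i.e. non-public but "unmarked" names.
unmarked = sorted(
    name
    for name in dir(calendar)
    if name not in exported and not name.startswith("_")
)

print(unmarked)
```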
When we do have evidence of user confusion (as in the case with the calendar module), we should just fix it.
The main source of user confusion is not reading the documentation. Recent examples: https://bugs.python.org/issue37620, https://bugs.python.org/issue37623.
IMO, it would be an undue burden on the user to have to check every method in dir() against the contents of __all__ to determine what is public (see below).
Just do not use dir() for this. It returns the list of attributes of the object. Use __all__ or help().
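As a sketch of what "use __all__" can mean in practice, the helper below prefers __all__ and, when a module does not define it, falls back to the rule that "from module import *" applies (every name not starting with an underscore). The name public_names is hypothetical, not a stdlib function, and the documentation remains the authoritative definition.

```python
import calendar
import types


def public_names(module):
    """Hypothetical helper: best-effort list of a module's exported names."""
    declared = getattr(module, "__all__", None)
    if declared is not None:
        return sorted(declared)
    # No __all__: mirror the star-import rule and skip underscored names.
    return sorted(name for name in vars(module) if not name.startswith("_"))


# A module with __all__: the result is exactly what the module declares.
print(public_names(calendar) == sorted(calendar.__all__))  # True

# A throwaway module without __all__, built only for this demonstration.
demo = types.ModuleType("demo")
demo.helper = 1
demo._secret = 2
print(public_names(demo))  # ['helper']
```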
Also, as a maintainer of the module, I would not have found it obvious whether the functions were public or not. The non-public functions look just like the public ones.
As you said, public names are explicitly documented.