[Python-ideas] Add a __cite__ method for scientific packages
Steven D'Aprano
steve at pearwood.info
Thu Jun 28 04:43:15 EDT 2018
On Wed, Jun 27, 2018 at 05:20:01PM -0400, Andrei Kucharavy wrote:
[...]
> To remediate to that situation, I suggest a __citation__ method associated
> to each package installation and import. Called from the __main__,
> __citation__() would scan __citation__ of all imported packages and return
> the list of all relevant top-level citations associated to the packages.
Why does this have to be a dunder method? In general, application code
shouldn't be calling dunders directly; they're reserved for Python.
I think your description of what this method should do is not
really coherent. On the one hand, you have __citation__() be a method
that you call (how?) but on the other hand you have it being a data
field __citation__ that you scan.
Which is it?
I do think you have identified an important feature, but I think this is
a *tool*, not a *language feature*. My spur of the moment thought is:
- we could have a script (a third-party script? or in the std lib?)
  which the user calls, giving the name of their module or package as
  argument,
  e.g. "python -m cite myapplication.py"
- this script knows how to analyse myapplication.py for a list of
  dependencies, perhaps filtering out standard library packages;
- it interrogates myapplication, and each dependency, for a citation;
  - this might involve reserving a standard __citation__ data field
    in each module, or a __citation__.xml file in the package, or
    some other protocol;
  - or perhaps the cite script knows how to generate the appropriate
    citation itself, from any of the standard formatted data fields
    found in many common modules, like __author__, __version__ etc.
- either way, the script would generate a list of packages and
  modules used by myapplication, plus citations for them.
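The dependency-analysis step above could be sketched roughly as follows. This is only an illustration, not a design: the function names are hypothetical, it only sees static "import" statements (nothing loaded dynamically), and the standard library filter relies on sys.stdlib_module_names, which exists only in Python 3.10 and later.

```python
import ast
import sys


def imported_top_level_modules(source):
    """Return the sorted top-level module names imported by `source`."""
    names = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            for alias in node.names:
                names.add(alias.name.split(".")[0])
        elif isinstance(node, ast.ImportFrom):
            # Relative imports (level > 0) point inside the application itself.
            if node.level == 0 and node.module:
                names.add(node.module.split(".")[0])
    return sorted(names)


def third_party_only(names):
    """Drop standard library modules where the interpreter can tell us."""
    # Assumption: on Python < 3.10 there is no such list, so nothing is filtered.
    stdlib = getattr(sys, "stdlib_module_names", frozenset())
    return [name for name in names if name not in stdlib]
```

A real tool would presumably read myapplication.py from disk and pass its text to imported_top_level_modules() before asking each remaining module for a citation.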
Presumably you would need to be able to specify which citation style to
use.
The point is, the *grunt work* of generating the citations is just a
script. It isn't a language feature. It might not even be in the std lib
(although perhaps we could ship it as a standard Python script, like the
compileall module and a few other tools, starting in version 3.8).
The protocol of how the script works out the citations can be
developed. Perhaps we could reserve a __citation__ dunder as a de facto
standard data field, like people already use __author__ and __version__
and similar. Or it could look for a separate XML or TXT file in the
package directory.
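One possible lookup order for such a protocol, as a sketch only: try a __citation__ data field first, then a __citation__.xml file next to the module, then fall back to __author__ and __version__. The function name find_citation, the ordering, and the returned (source, data) pair are all assumptions made for illustration; nothing here is an agreed standard.

```python
from pathlib import Path


def find_citation(module):
    """Return (source, data) for the first citation source found on `module`."""
    # 1. An explicit __citation__ data field, if the package author supplied one.
    citation = getattr(module, "__citation__", None)
    if citation is not None:
        return ("__citation__", citation)
    # 2. A citation file in the package directory (file name is hypothetical).
    module_file = getattr(module, "__file__", None)
    if module_file:
        candidate = Path(module_file).parent / "__citation__.xml"
        if candidate.is_file():
            return ("file", candidate.read_text(encoding="utf-8"))
    # 3. The informal metadata fields many modules already define.
    fallback = {
        field.strip("_"): getattr(module, field)
        for field in ("__author__", "__version__")
        if hasattr(module, field)
    }
    if fallback:
        return ("metadata", fallback)
    return ("none", None)
```

Returning where the data came from lets the calling script decide how much to trust it: a hand-written __citation__ is presumably more reliable than one pieced together from __author__ and __version__.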
> As a scientific package developer working in academia, the problem is quite
> serious, and the solution seems relatively straightforward.
>
> What does Python core team think about addition and long-term maintenance
> of such a feature to the import and setup mechanisms?
What does this have to do with either import or setup?
> What do other users
> and scientific package developers think of such a mechanism for citations
> retrieval?
A long time ago, I added a feature request for a page in the
documentation to show how to cite Python in various formats:
https://bugs.python.org/issue26597
I don't believe there has been any progress on this. (I certainly don't
know the right way to cite software.) Perhaps this can be merged with
your idea.
Should Python have a standard sys.__citation__ field that provides the
relevant detail in some format-independent, machine-readable object like
a named tuple? Then this hypothetical cite.py tool could read the tuple
and format it according to any citation style.
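To make the idea concrete, here is a minimal sketch of a format-independent record and two formatters. Everything in it is hypothetical: the Citation field names, the "plain" layout (which is not any official citation style), and the example values. No sys.__citation__ field exists today.

```python
from typing import NamedTuple


class Citation(NamedTuple):
    """Hypothetical machine-readable citation record."""
    authors: tuple
    title: str
    version: str
    year: int
    url: str


def format_plain(c):
    """Render an ad hoc one-line citation (not an official style)."""
    return f"{' and '.join(c.authors)} ({c.year}). {c.title}, version {c.version}. {c.url}"


def format_bibtex(c, key):
    """Render a BibTeX @misc entry, carrying the version in the note field."""
    fields = {
        "author": " and ".join(c.authors),
        "title": c.title,
        "year": str(c.year),
        "note": f"version {c.version}",
        "url": c.url,
    }
    body = ",\n".join(f"  {name} = {{{value}}}" for name, value in fields.items())
    return f"@misc{{{key},\n{body}\n}}"


# Placeholder data for an imaginary package, purely for illustration.
example = Citation(
    authors=("A. Author",),
    title="examplepkg",
    version="1.0",
    year=2018,
    url="https://example.org/examplepkg",
)
print(format_plain(example))
print(format_bibtex(example, "examplepkg"))
```

Keeping the record separate from the formatters is what would let one tool support several citation styles: adding a style means adding a function, not changing what packages publish.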
--
Steve