[Python-ideas] Add a __cite__ method for scientific packages

Steven D'Aprano steve at pearwood.info
Thu Jun 28 04:43:15 EDT 2018


On Wed, Jun 27, 2018 at 05:20:01PM -0400, Andrei Kucharavy wrote:
[...]

> To remediate to that situation, I suggest a __citation__ method associated
> to each package installation and import. Called from the __main__,
> __citation__() would scan __citation__ of all imported packages and return
> the list of all relevant top-level citations associated to the packages.

Why does this have to be a dunder method? In general, application code 
shouldn't call dunders directly; they're reserved for Python.

I think your description of what this method should do is not 
really coherent. On the one hand, __citation__() is a method that 
you call (how?), but on the other hand __citation__ is a data 
field that you scan.

Which is it?

I do think you have identified an important feature, but I think this is 
a *tool*, not a *language feature*. My spur of the moment thought is:

- we could have a script (a third party script? or in the std lib?) 
  which the user calls, giving the name of their module or package as
  argument

  e.g. "python -m cite myapplication.py"

- this script knows how to analyse myapplication.py for a list of
  dependencies, perhaps filtering out standard library packages;

- it interrogates myapplication, and each dependency, for a citation;

- this might involve reserving a standard __citation__ data field
  in each module, or a __citation__.xml file in the package, or
  some other protocol;

- or perhaps the cite script knows how to generate the appropriate
  citation itself, from any of the standard formatted data fields
  found in many common modules, like __author__, __version__ etc.

- either way, the script would generate a list of packages and
  modules used by myapplication, plus citations for them.
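
As a rough illustration of the steps above, here is a minimal sketch,
not a worked-out design. The function names (top_level_imports,
third_party, describe) are hypothetical, and __citation__ is only the
data field proposed in this thread, not an existing convention. The
standard library filter relies on sys.stdlib_module_names, which is
only available on Python 3.10 and later; on older versions this
sketch filters nothing.

```python
import ast
import sys
import types


def top_level_imports(source):
    """Return the sorted top-level package names imported by `source`."""
    names = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.level == 0 and node.module:
            # level == 0 skips relative imports such as "from . import x"
            names.add(node.module.split(".")[0])
    return sorted(names)


def third_party(names):
    """Drop standard library names (needs Python 3.10+, else a no-op)."""
    stdlib = getattr(sys, "stdlib_module_names", frozenset())
    return [name for name in names if name not in stdlib]


def describe(module):
    """Use the proposed __citation__ field, else fall back to metadata."""
    citation = getattr(module, "__citation__", None)
    if citation is not None:
        return str(citation)
    author = getattr(module, "__author__", "unknown author")
    version = getattr(module, "__version__", "unknown version")
    return f"{author}. {module.__name__}, version {version}."


# Stand-in for the contents of myapplication.py.
source = "import os\nimport numpy as np\nfrom scipy.stats import norm\n"
deps = third_party(top_level_imports(source))

# A fake module, so the fallback can be shown without installing anything.
fake = types.ModuleType("fakepkg")
fake.__author__ = "A. Author"
fake.__version__ = "1.0"
entry = describe(fake)
```

A real tool would import each name in deps (for example with
importlib.import_module) and pass the resulting module to describe().
Static analysis like this misses imports done dynamically at run time.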

Presumably you would need to be able to specify which citation style to 
use.

The point is, the *grunt work* of generating the citations is just a 
script. It isn't a language feature. It might not even be in the std lib 
(although perhaps we could ship it as a standard Python script, like the 
compileall module and a few other tools, starting in version 3.8).

The protocol of how the script works out the citations can be 
developed. Perhaps we could reserve a __citation__ dunder as a de facto 
standard data field, like people already use __author__ and __version__ 
and similar. Or it could look for a separate XML or TXT file in the 
package directory.
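
One possible lookup order for such a protocol, sketched under stated
assumptions: the file names CITATION.txt and CITATION.xml and the
function find_citation are invented here for illustration, since no
file name or format has been agreed on. The data field is tried
first, then a file next to the module.

```python
import tempfile
import types
from pathlib import Path


def find_citation(module, filenames=("CITATION.txt", "CITATION.xml")):
    """Return citation text for `module`, or None if nothing is found.

    The file names are hypothetical placeholders, not a standard.
    """
    citation = getattr(module, "__citation__", None)
    if citation is not None:
        return citation
    module_file = getattr(module, "__file__", None)
    if module_file is None:
        # Built-in and some namespace modules have no file on disk.
        return None
    for name in filenames:
        candidate = Path(module_file).parent / name
        if candidate.is_file():
            return candidate.read_text(encoding="utf-8")
    return None


# Demo with a throwaway package directory and fake module objects.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "CITATION.txt").write_text(
        "Please cite examplepkg 1.0.", encoding="utf-8"
    )
    pkg = types.ModuleType("examplepkg")
    pkg.__file__ = str(Path(tmp) / "__init__.py")
    from_file = find_citation(pkg)

pkg_with_field = types.ModuleType("otherpkg")
pkg_with_field.__citation__ = "Please cite otherpkg."
from_field = find_citation(pkg_with_field)
```

Whether the field or the file should take priority is a design
decision the protocol would have to settle; this sketch just picks one.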



> As a scientific package developer working in academia, the problem is quite
> serious, and the solution seems relatively straightforward.
> 
> What does Python core team think about addition and long-term maintenance
> of such a feature to the import and setup mechanisms?

What does this have to do with either import or setup?


> What do other users
> and scientific package developers think of such a mechanism for citations
> retrieval?

A long time ago, I added a feature request for a page in the 
documentation to show how to cite Python in various formats:

https://bugs.python.org/issue26597

I don't believe there has been any progress on this. (I certainly don't 
know the right way to cite software.) Perhaps this can be merged with 
your idea.

Should Python have a standard sys.__citation__ field that provides the 
relevant detail in some format-independent, machine-readable object like 
a named tuple? Then this hypothetical cite.py tool could read the tuple 
and format it according to any citation style.
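
To make that concrete, a sketch of what such a format-independent
object and its formatters might look like. The field names, the
Citation type and both formatter functions are assumptions made up
for this example, and the values are placeholders, not a real
citation for Python or any package. The two output styles are
illustrative only and do not follow any official style guide.

```python
from collections import namedtuple

# Hypothetical fields; the real set would need to be agreed on.
Citation = namedtuple("Citation", ["authors", "title", "version", "year", "url"])


def format_plain(c):
    """Render a citation as a single human-readable line."""
    authors = ", ".join(c.authors)
    return f"{authors} ({c.year}). {c.title}, version {c.version}. {c.url}"


def format_bibtex(c, key):
    """Render a citation as a BibTeX-like @misc entry."""
    authors = " and ".join(c.authors)
    return (
        f"@misc{{{key},\n"
        f"  author = {{{authors}}},\n"
        f"  title = {{{c.title}}},\n"
        f"  version = {{{c.version}}},\n"
        f"  year = {{{c.year}}},\n"
        f"  url = {{{c.url}}}\n"
        f"}}"
    )


# Placeholder data for illustration only.
example = Citation(
    authors=("A. Author", "B. Author"),
    title="examplepkg",
    version="1.0",
    year=2018,
    url="https://example.org/examplepkg",
)

plain = format_plain(example)
bibtex = format_bibtex(example, key="examplepkg")
```

The data stays in one place and each citation style is just another
function over the same tuple, which is the separation suggested above.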



-- 
Steve

