Should an sdist/MANIFEST.in include docs and tests?

What should really be included in an sdist via MANIFEST.in? I mean besides the obvious: the files required for a functioning package (anything the package needs that the default rules don't catch), LICENSE.txt, and similar. A package's source tree, more often than not, includes other files such as documentation, tests, examples, a random assortment of other text files, etc. Should docs & tests, in particular, be included in an sdist via MANIFEST.in? Should other files be added too? Or maybe the sdist should be kept to a minimum? This is not clearly discussed in the packaging guide: https://packaging.python.org/guides/distributing-packages-using-setuptools/#.... The sampleproject (https://github.com/pypa/sampleproject) does seem to include tests (Well a no-op test) and doesn't include them in the sdist.

Segev Finer <segev208@gmail.com> writes:
> What should really be included in an sdist via MANIFEST.in?
The ‘sdist’ name is derived from “source distribution (of this Python package/distribution)”. It is of practical benefit to treat it as “the state of the source needed for developing this Python distribution, as of this release”. The release of a Python distribution (e.g. ‘sampleproject’) marks a snapshot of the source code at a point in time. The source code as of that specific release is beneficial for some recipients, in order to be able to make changes and build a modified complete distribution from that.
> A package's source tree, more often than not, includes other files such as documentation, tests, examples, a random assortment of other text files, etc. Should docs & tests, in particular, be included in an sdist via MANIFEST.in? Should other files be added too?
Yes, all those should be in the source distribution. PyPI is a good, stable over time, and easily-discovered repository of these source distributions. The source distribution (‘sdist’) should reflect the state of the source code – all of it – that a recipient might need for making modifications, testing them, updating documentation, etc. based on that specific release, by version, sourced from the same location that has the wheels or other ‘bdists’.
> Or maybe the sdist should be kept to a minimum?
For a minimal installable, we already have the ‘wheel’ format. I don't think ‘sdist’ needs to be kept minimal, when it is more useful as a snapshot of the source for a distribution.
> This is not clearly discussed in the packaging guide: https://packaging.python.org/guides/distributing-packages-using-setuptools/#.... The sampleproject (https://github.com/pypa/sampleproject) does seem to include tests (Well a no-op test) and doesn't include them in the sdist.
Yes, I would like the packaging guide to reflect what I described above.
-- Ben Finney

It depends on whether or not you require docs or tests with your software. Most of the libraries that I've written require neither - nobody is going to be running the tests, and if they want the documentation then they're probably going to check out the library page on readthedocs. If you had some kind of reason to include them, though, you could. Typically though if that happens with tests then they're actually part of the library, e.g. yourmodule.tests
-- Distutils-SIG mailing list -- distutils-sig@python.org To unsubscribe send an email to distutils-sig-leave@python.org https://mail.python.org/mm3/mailman3/lists/distutils-sig.python.org/ Message archived at https://mail.python.org/mm3/archives/list/distutils-sig@python.org/message/B...

I include tests and docs in my sdists, but I don't have the sdist install them. My goal there is to make it easy for people to grab the sdist, unpack it, and run the tests/build the docs for their own purposes, without cluttering up the final environment they install into.

Wayne Werner <waynejwerner@gmail.com> writes:
> It depends on whether or not you require docs or tests with your software. Most of the libraries that I've written require neither - nobody is going to be running the tests
How can you know that? If someone wants to try some changes based on the source distribution (the ‘sdist’) they downloaded, why do you assume they won't run the test suite? Rather, I think you should assume that the source distribution is likely to be used by at least some recipients as a starting point to make and test and redistribute changes to your work (maybe many years later, when the VCS hosting has gone away). For that reason you must not be so certain no-one will want to have those files in the source distribution.
> and if they want the documentation then they're probably going to check out the library page on readthedocs.
Quite commonly I have wanted to access documentation for a package for which I have the ‘sdist’, but there's no longer any online hosting of the documentation. The distribution should have included the documentation as it existed, in the source distribution.
-- Ben Finney

Speaking as a maintainer of various different packages for the Pylons project, we include the following in our sdists:
- source code for the package
- tests for the package
- documentation for the package
and of course the license/history/changelog/everything you'd theoretically need to create a fork (minus .git). Our sdists are pretty big as a result.
In our wheels we ship:
- source code for the package/software
And nothing else, tests are not included in the wheel. Some people do ship tests with their wheel, but we try not to, to keep wheel sizes small. It comes down to personal preference, we tend to think that source dist means exactly that, a source distribution.
Bert
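The split described above (tests and docs in the sdist, only package code in the wheel) can be checked by comparing the file lists of the two archives. The following is a minimal sketch using only the standard library. The project name `demo`, the file names, and the helpers `make_sdist` and `make_wheel` are hypothetical and exist only to build tiny in-memory archives for illustration; with real artifacts you would read the bytes of the files under `dist/` instead.

```python
import io
import tarfile
import zipfile


def sdist_names(data):
    """Return file paths inside a .tar.gz sdist, relative to its 'name-version/' root."""
    with tarfile.open(fileobj=io.BytesIO(data), mode="r:gz") as tar:
        return {
            member.name.split("/", 1)[1]
            for member in tar.getmembers()
            if member.isfile() and "/" in member.name
        }


def wheel_names(data):
    """Return file paths inside a wheel (a zip archive), skipping directory entries."""
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        return {name for name in zf.namelist() if not name.endswith("/")}


def make_sdist(paths, root="demo-1.0"):
    """Hypothetical helper: build a small in-memory sdist with placeholder files."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz") as tar:
        for path in paths:
            payload = b"# placeholder\n"
            info = tarfile.TarInfo(name=f"{root}/{path}")
            info.size = len(payload)
            tar.addfile(info, io.BytesIO(payload))
    return buf.getvalue()


def make_wheel(paths):
    """Hypothetical helper: build a small in-memory wheel-like zip with placeholder files."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        for path in paths:
            zf.writestr(path, "# placeholder\n")
    return buf.getvalue()


sdist = make_sdist(
    ["setup.py", "LICENSE.txt", "demo/__init__.py", "tests/test_demo.py", "docs/index.rst"]
)
wheel = make_wheel(["demo/__init__.py", "demo-1.0.dist-info/METADATA"])

# Files that ship in the source distribution but are not installed from the wheel.
only_in_sdist = sorted(sdist_names(sdist) - wheel_names(wheel))
print(only_in_sdist)
```

If `tests/` or `docs/` entries are missing from the sdist list, the MANIFEST.in (or the build backend's equivalent setting) is the place to look.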

Bert JW Regeer <xistence@0x58.com> writes:
> Speaking as a maintainer of various different packages for the Pylons project, we include the following in our sdists:
> - source code for the package
> - tests for the package
> - documentation for the package
> and of course the license/history/changelog/everything you'd theoretically need to create a fork (minus .git). Our sdists are pretty big as a result.
> In our wheels we ship:
> - source code for the package/software
> And nothing else, tests are not included in the wheel.
That seems like an eminently sensible scheme: The ‘wheel’ is for installation and should be targeted only to that; the ‘sdist’ is the source distribution and should contain all the source.
-- Ben Finney

I strongly agree with this. I think a lot of the consumers of sdists are people packaging projects for distributions (Fedora, Debian, Arch, etc.), and they want to run the tests for their package. I don't install tests for various reasons, including the fact that they are not part of the public interface of the package.

On Mon, Sep 10, 2018, at 12:33 AM, Bert JW Regeer wrote:
> Speaking as a maintainer of various different packages for the Pylons project, we include the following in our sdists:
> - source code for the package
> - tests for the package
> - documentation for the package
> and of course the license/history/changelog/everything you'd theoretically need to create a fork (minus .git). Our sdists are pretty big as a result.
When I was working out how Flit could build sdists without a MANIFEST.in, I settled on the idea that an sdist should be more or less a static equivalent of checking out the git tag for that release. So if a project lost its VCS history, all the files from release versions would still be in sdists. That means all the source files are there (like rst docs), but not generated files (like Sphinx HTML output).
Now that 'pip install' gets wheels where possible, there's less pressure to make sdists as slim as possible, because far fewer people will need to wait for them to download.
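The "static checkout minus generated files" rule can be illustrated with a small filter over a list of tracked files. This is only a sketch of the idea, not Flit's actual implementation: the function names, the `GENERATED_DIRS` tuple, and the example paths are all assumptions made for illustration. In a real project the tracked list might come from something like `git ls-files`.

```python
# Hypothetical directories that hold generated output rather than source.
GENERATED_DIRS = ("docs/_build", "build", "dist")


def is_generated(name, generated_dirs=GENERATED_DIRS):
    """True if the path is one of the generated directories or lies beneath one."""
    return any(name == d or name.startswith(d + "/") for d in generated_dirs)


def select_sdist_files(tracked_files, generated_dirs=GENERATED_DIRS):
    """Keep every tracked source file; drop anything under a generated-output directory."""
    return sorted(f for f in tracked_files if not is_generated(f, generated_dirs))


# Example input: paths as they would appear in a checkout of the release tag.
tracked = [
    "pyproject.toml",
    "LICENSE",
    "demo/__init__.py",
    "tests/test_demo.py",
    "docs/index.rst",               # documentation source: kept
    "docs/_build/html/index.html",  # Sphinx HTML output: dropped
]

selected = select_sdist_files(tracked)
for name in selected:
    print(name)
```

Tests and reStructuredText sources pass through unchanged, while the rendered HTML is left out, which matches the distinction drawn between source files and generated files.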

As a Nixpkgs Python maintainer, I often ask projects to include their tests in the sdist so we can run them when packaging. I also prefer it when an sdist basically represents a snapshot of a repository.

And yet the reality is that many downstreams can't rely on an sdist to include that, because it is an installable format. Not everything does include those files. I tend to view an sdist as a format similar to wheels. I've been bullied into including the tests and docs on occasion, but that has never actually provided benefit to more than a handful of folks. It seems silly that we're not also considering the portions of the world with terrible internet when making this decision. Giant sdists make their lives orders of magnitude worse for the benefit of maybe 20-30 people who tend to use the tests.

On Mon, Sep 10, 2018, at 1:06 PM, Ian Stapleton Cordasco wrote:
> It seems silly that we're not also considering the portions of the world with terrible internet when making this decision. Giant sdists make their lives orders of magnitude worse for the benefit of maybe 20-30 people who tend to use the tests.
We should certainly consider internet speeds across the world in general, but I don't think it's that important to this particular discussion. In most cases, users who just want to install and use a package can get it as a wheel, so what we do with sdists doesn't matter to them. And there aren't many projects where adding tests and docs makes an order of magnitude difference to the archive size.
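Whether tests and docs change an sdist's size by anything like an order of magnitude can be measured per project rather than guessed. The sketch below sums file sizes in a `.tar.gz` sdist by top-level directory. It reports uncompressed sizes, so it only approximates the effect on download size, and the archive built here (project name `demo`, file names, byte counts) is entirely made up for illustration.

```python
import io
import tarfile
from collections import Counter


def size_by_top_level(data):
    """Sum uncompressed file sizes in a .tar.gz sdist, grouped by top-level directory."""
    totals = Counter()
    with tarfile.open(fileobj=io.BytesIO(data), mode="r:gz") as tar:
        for member in tar.getmembers():
            if not member.isfile():
                continue
            parts = member.name.split("/")
            # parts[0] is the 'name-version' root; files directly in it go under ".".
            key = parts[1] if len(parts) > 2 else "."
            totals[key] += member.size
    return totals


def make_sdist(files, root="demo-1.0"):
    """Hypothetical helper: build an in-memory sdist from a {path: size_in_bytes} mapping."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz") as tar:
        for path, size in files.items():
            info = tarfile.TarInfo(name=f"{root}/{path}")
            info.size = size
            tar.addfile(info, io.BytesIO(b"x" * size))
    return buf.getvalue()


# Made-up sizes, chosen only to make the arithmetic easy to follow.
sdist = make_sdist(
    {
        "setup.py": 100,
        "demo/__init__.py": 900,
        "tests/test_demo.py": 600,
        "docs/index.rst": 400,
    }
)

totals = size_by_top_level(sdist)
total = sum(totals.values())
without_extras = total - totals["tests"] - totals["docs"]
ratio = total / without_extras

for name, size in sorted(totals.items()):
    print(f"{name}: {size} bytes")
print(f"with tests and docs: {ratio:.1f}x the size without them")
```

Running the same breakdown on a real release gives a concrete number to bring to a discussion like this one.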

Quoting Bert, up-thread: "Our sdists are pretty big as a result." Some of my projects have very large test suites that would bloat the sdist, and I've worked on many more with the same issue. I don't think we can just wave our hands and say "sdists don't end up particularly big so we can ignore those people". Furthermore, we can't quite know how people are using sdists and wheels anywhere other than via PyPI download statistics (which, if I remember correctly, are lossy). It's also worth noting that there was previously a different distribution format, not named an "sdist", that was a specific repository archive. The problem occurs now that PyPI only allows that or sdist and that the other format is neither pip-installable nor intended to be a distribution (versus a version artifact).
participants (9)
- Ben Finney
- Bert JW Regeer
- Freddy Rietdijk
- Ian Stapleton Cordasco
- James Bennett
- Paul G
- Segev Finer
- Thomas Kluyver
- Wayne Werner