[Distutils] Data centric standard for distributing Python modules

anatoly techtonik techtonik at gmail.com
Thu Oct 22 05:46:04 EDT 2015


I see a lot of talk about setuptools/no setuptools/distutils
and let me (as usual) state my opinion that all of that is
wrong. Instead of discussing who will write the code (which
is essentially the root problem), how about just setting the
data processing requirements and covering them with tests
and stories?

Ship supporting infrastructure that allows people from
Fedora, Debian, Anaconda to see HOW DATA about packages
is managed and adapt their tools to support Python style.
This will save us all a lot of effort in maintaining the stuff
between different systems. As you may see, PEPs are
good, but not effective. A test suite that you can download
and test your tools against would be an awesome thing
both for tool development and for LEARNING Python.

Why is learning here? A simple data definition is not
enough: you need to understand why the data structure is
there in the first place, and what story it is made for. And
when you have the story and the data, you have everything
that is needed for a practical Python lesson about how to
process files and data formats.

Stories are scenarios. My usual scenario is:

- install files from a package into local Python

When you think in a data centric way, do you need to know
the files that you need to copy from a package into local
Python? Yes. But do you need a data format for that info?
If you're using wheels, no, because there are no
needless files. If you're using .rpm? Yes, you need to
parse and exclude .spec files etc. Every author may
choose their own input; we just define the common
output.

- uninstall files from local Python

Here you need the list of files. Where do you get it from?
From some custom specific format located in a directory
whose name comes from some complicated calculation that is
spread over separate PEPs and requires a separate
module to compute. You see, this is where everybody is
trapped into using setuptools, distutils etc. I just want to
query static data: "give me all files for this package" or
"give me all packages with this name so that I can
choose specific versions". What should the data
layout be to allow that?
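
One possible answer, as a minimal sketch only: a single static
JSON document that lists every installed package with its
version and files. The layout, the field names ("packages",
"name", "version", "files") and the helper functions are all
hypothetical and are not taken from any PEP or existing tool.

```python
import json

# Hypothetical static index: plain data, readable without any
# packaging library. Field names are assumptions for illustration.
INDEX_JSON = """
{
  "packages": [
    {"name": "example", "version": "1.0",
     "files": ["example/__init__.py", "example/core.py"]},
    {"name": "example", "version": "2.0",
     "files": ["example/__init__.py"]},
    {"name": "other", "version": "0.1",
     "files": ["other.py"]}
  ]
}
"""


def load_index(text):
    """Parse the index text into a list of package records."""
    return json.loads(text)["packages"]


def versions_of(index, name):
    """'Give me all packages with this name': return their versions."""
    return [p["version"] for p in index if p["name"] == name]


def files_of(index, name, version):
    """'Give me all files for this package' (one name/version pair)."""
    for p in index:
        if p["name"] == name and p["version"] == version:
            return list(p["files"])
    raise KeyError((name, version))


index = load_index(INDEX_JSON)
print(versions_of(index, "example"))
print(files_of(index, "example", "2.0"))
```

With data like this, uninstalling is just deleting the files
returned by files_of() and dropping the record from the index.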

- create isolated virtualenv from existing files

I am pretty sure it is not possible now and everybody
is redownloading the stuff again and again to create
a new environment.
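
If a static file list per package were available, this story
could reduce to a plain copy. The sketch below assumes such a
list exists; populate_env() and the directory layout are
hypothetical, and real environments would need more than this
(scripts, interpreter links and so on).

```python
import os
import shutil
import tempfile


def populate_env(source_root, target_root, files):
    """Copy each listed file from source_root into target_root.

    `files` holds paths relative to source_root. Returns the list
    of relative paths that were copied.
    """
    copied = []
    for rel in files:
        src = os.path.join(source_root, rel)
        dst = os.path.join(target_root, rel)
        # Create the parent directory of the destination if needed.
        os.makedirs(os.path.dirname(dst), exist_ok=True)
        shutil.copy2(src, dst)
        copied.append(rel)
    return copied


# Demo with throwaway directories standing in for an existing
# installation (source) and a new environment (target).
source = tempfile.mkdtemp()
target = tempfile.mkdtemp()
os.makedirs(os.path.join(source, "example"))
with open(os.path.join(source, "example", "__init__.py"), "w") as f:
    f.write("# demo\n")

copied = populate_env(source, target, ["example/__init__.py"])
print(copied)
```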

- check all my installed modules for security issues

- check all my virtualenvs for security issues

The last part requires a place for virtualenvs to register
their location, so that you can read their modules and
give security warnings about them too.
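
A minimal sketch of that idea, assuming a registry that maps
each environment location to its installed (name, version)
pairs and a set of known-bad pairs. REGISTRY, ADVISORIES and
audit() are hypothetical names, and the paths and versions are
made-up sample data, not real advisories.

```python
# Hypothetical registry: environment location -> installed packages.
REGISTRY = {
    "/home/user/envs/web": [("example", "1.0"), ("other", "0.1")],
    "/home/user/envs/tools": [("example", "2.0")],
}

# Hypothetical advisory data: (name, version) pairs considered vulnerable.
ADVISORIES = {("example", "1.0")}


def audit(registry, advisories):
    """Return (env_path, name, version) for every flagged install."""
    findings = []
    for env_path in sorted(registry):
        for name, version in registry[env_path]:
            if (name, version) in advisories:
                findings.append((env_path, name, version))
    return findings


for env_path, name, version in audit(REGISTRY, ADVISORIES):
    print("WARNING: %s %s in %s" % (name, version, env_path))
```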


With test coverage (and a beautiful, easy runner) this
will allow people to write final tools in a fun way. And if
people have fun, they usually create better things than
those who are forced to do their thing for a living. And it
will be more fun if they can easily extend the tests for new
stories.

-- 
anatoly t.