The closest thing we have right now is 2to3. It produces different installed code depending on which Python ran setup.py, and if you produce wheels, each one carries a tag that distinguishes it from the others by Python version.

It's not wrong to have a complicated build process in setup.py. The confusion comes from setup.py doing two jobs: it builds the package and it generates the package's metadata. It's common for setup.py to compute the list of dependencies at run time, so the metadata changes with the version of Python that ran it. Instead, we would prefer that a single package has the same metadata no matter how it is built.

The common case of different dependencies per Python version or OS is supported with "environment markers". These are simple expressions that turn individual dependencies on or off. The wheel project's own setup.py uses them.

So my strategy would be to use a single source package to generate several wheels, each tagged for a Python implementation or version. Later, the installer can pick the correct one based on its tags.

As ever, beware of trying to extend distutils. It's not very good at that.
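
To make the environment-marker point concrete, here is a minimal sketch of metadata that stays identical whichever interpreter runs setup.py. It assumes the setuptools convention of an `extras_require` key with an empty extra name followed by a marker; the package name `example_pkg` and the dependencies listed are hypothetical, chosen only for illustration.

```python
# Hypothetical metadata for illustration; "example_pkg" and its
# dependencies are assumptions, not taken from a real project.

def build_metadata():
    """Return setup() keyword arguments that never inspect the running interpreter."""
    return {
        "name": "example_pkg",
        "version": "0.1",
        # Needed everywhere.
        "install_requires": ["six"],
        "extras_require": {
            # Empty extra name + marker: the installer evaluates the
            # expression on the target machine, not at build time.
            ':python_version=="2.6"': ["argparse"],
            ':sys_platform=="win32"': ["pywin32"],
        },
    }


METADATA = build_metadata()

# In a real setup.py this would be handed to setuptools:
#     from setuptools import setup
#     setup(**METADATA)
#
# The pattern this replaces is a run-time branch such as
#     if sys.version_info < (2, 7): install_requires.append("argparse")
# which makes the metadata depend on who ran setup.py.
```

Because nothing in `build_metadata` looks at `sys.version_info` or `sys.platform`, every wheel built from the same source carries the same dependency list, and the decision about `argparse` or `pywin32` is left to the installer.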