Adding namespace support to PyPI (continuation from PyPA Summit/Sprint)
Good evening distutils-sig,
TL;DR: today during the Python Packaging Summit/Sprint I broached the topic of introducing namespaces on PyPI with several of the luminaries in the PyPA ecosystem. After much discussion I think we concluded that there's general support for the idea, but quite a number of devils in the details that would need to be worked out. I've tried to capture much of the relevant discussion in the associated thread on discuss.python.org <https://discuss.python.org/t/namespace-support-in-pypi/1609> (including a first pass at a draft PEP <https://gist.github.com/vadave/f1f2d07f5e355c6263fc111aae634ea5>). Please join in the conversation about whether/how namespaces should be implemented in PyPI.
Detail:
The namespaces discussion had actually come up before the transition to Warehouse [1], but at the time it was deferred due to the effort associated with that migration. Now that Warehouse is in a better place, it seems like it's time to reinvigorate this discussion. Most other programming languages have some type of namespace construct that can be used to help better manage supply chain integrity. As I see it, namespaces have the potential to provide significant value to *both* package maintainers and package consumers. For package maintainers, they potentially offer a higher level of control over the packages they are responsible for maintaining. For package consumers, they potentially offer a cleaner mechanism for "at a glance" recognition of the source of a particular package (e.g. whether the package is vendor-developed vs. community-developed, or has a formal relationship with some larger group like PyData).
With all that said, my personal goal is for this PEP to serve a kind of foundational role. I *don't want* to try to exhaustively define all the different scenarios that namespaces can/should be used to enable. I *do want* namespaces to become a clear security boundary that can be used in "high compliance" environments to make security-related risk decisions. And lastly, I *definitely don't want* to replace the current global namespace... I view this as a complement to the current situation, not a replacement for it.
Finally, as this is my first post to this list I'll offer the obligatory "new guy" introduction. I started using Python about 4 years ago, so I still consider myself a relative newbie. For my day job I'm an independent consultant working with what I call "high compliance" customers... folks that really, really care about security and have lots of process baked around how they manage security risk within their organization. I'm based in Fairfax, VA. Most of what I do from day to day *is not* writing Python code, but rather working issues around IT architecture and governance (with a general focus on enabling infrastructure such as cloud services and devops tooling).
Thanks to all the folks that offered their time and opinions on this topic today. I look forward to continued dialogue on this, and welcome any ideas or cautions folks may have about making this a reality. And remember: "Namespaces are one honking great idea - let's do more of those!"
-d
[1] Original feature request for adding namespaces support: https://github.com/pypa/warehouse/issues/2589
What do namespaces offer over forking, diffing, reviewing the latest commits, and installing from your GH fork URL commit hash?
When I try to install 'westurner/pip' and 'pip' is already installed, what should it do?
Should pypa/setuptools_scm include the namespace in the version tag?
If my concerns include integrity and availability in a risky environment, why rely upon PyPI for hosting (manylinux2010 binary) wheels at all? (You can run warehouse and/or devpi on-prem and keep that upgraded and baselined.)
Will pip throw an exception if my local warehouse or devpi package isn't signed with 'a key recognized by upstream'?
On Monday, May 6, 2019, Dave Ashby via Distutils-SIG <distutils-sig@python.org> wrote:
On Tue, May 7, 2019 at 10:01 AM Wes Turner <wes.turner@gmail.com> wrote:
What do namespaces offer over forking, diffing, reviewing the latest commits, and installing from your GH fork URL commit hash?
IIUC, one of the primary objectives for namespaces is to enable a user to store state like ("We've reviewed this and consider this version acceptable for our purposes"). PyPI already hosts 1 copy of each build of each version of each package for everyone.
When I try to install 'westurner/pip' and 'pip' is already installed, what should it do?
Should pypa/setuptools_scm include the namespace in the version tag?
If my concerns include integrity and availability in a risky environment, why rely upon PyPI for hosting (manylinux2010 binary) wheels at all? (You can run warehouse and/or devpi on-prem and keep that upgraded and baselined)
Or by privately hosting just a directory of packages and setting PIP_INDEX_URL.
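As a rough illustration of the "directory of packages plus PIP_INDEX_URL" idea, here is a minimal sketch that generates static index pages in the layout described by PEP 503 (the "simple" repository API) for a folder of wheels. The function name `build_simple_index` and the `packages/` and `simple/` directory layout are assumptions made for this example; nothing here is part of pip, warehouse, or devpi, and it ignores sdists, hashes, and URL-encoding of unusual filenames.

```python
import html
import re
from pathlib import Path


def normalize(name: str) -> str:
    """Normalize a project name the way PEP 503 describes."""
    return re.sub(r"[-_.]+", "-", name).lower()


def build_simple_index(root: Path) -> dict:
    """Write root/simple/<project>/index.html pages for wheels in root/packages.

    The layout (a "packages" directory next to a generated "simple" directory)
    is an assumption made for this sketch, not something pip requires.
    Returns a mapping of normalized project name to wheel filenames.
    """
    projects = {}
    for wheel in sorted((root / "packages").glob("*.whl")):
        # Wheel filenames begin with "<distribution>-<version>-".
        dist = wheel.name.split("-", 1)[0]
        projects.setdefault(normalize(dist), []).append(wheel.name)

    simple = root / "simple"
    simple.mkdir(parents=True, exist_ok=True)
    for project, files in projects.items():
        page_dir = simple / project
        page_dir.mkdir(exist_ok=True)
        anchors = []
        for name in files:
            # Relative link from /simple/<project>/ back to /packages/<file>.
            href = html.escape("../../packages/" + name)
            anchors.append('<a href="{}">{}</a><br>'.format(href, html.escape(name)))
        page = "<!DOCTYPE html>\n<html><body>\n{}\n</body></html>\n".format(
            "\n".join(anchors)
        )
        (page_dir / "index.html").write_text(page, encoding="utf-8")

    # Root page listing every project, served at /simple/.
    root_links = "\n".join(
        '<a href="{0}/">{0}</a><br>'.format(html.escape(p)) for p in sorted(projects)
    )
    (simple / "index.html").write_text(
        "<!DOCTYPE html>\n<html><body>\n{}\n</body></html>\n".format(root_links),
        encoding="utf-8",
    )
    return projects
```

Assuming the resulting directory is served by any static file server (for local experiments, `python -m http.server 8000` run from the root directory should do), pointing pip at it would look something like `PIP_INDEX_URL=http://localhost:8000/simple/ pip install <project>`. For a flat directory with no generated pages, pip's `--find-links` option is the simpler route.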
Will pip throw an exception if my local warehouse or devpi package isn't signed with 'a key recognized by upstream'?
Will the TUF implementation need any changes to support namespaces?
"Roadmap update for TUF support" https://github.com/pypa/warehouse/issues/5247
"PyPI security work: multifactor auth progress & help needed" https://discuss.python.org/t/pypi-security-work-multifactor-auth-progress-he...
(Docker Notary does TUF today)
participants (2)
- Dave Ashby
- Wes Turner