[Distutils] Malicious packages on PyPI

Wes Turner wes.turner at gmail.com
Sat Jun 3 13:01:23 EDT 2017


On Thu, Jun 1, 2017 at 10:46 PM, Pandu Poluan <pepoluan at gmail.com> wrote:

> +1 for transitive trust.
>
> At the base/simplest level, `pip` would trust any packages trusted by PyPI.
>
> More advanced users / more security-oriented installation can add
> additional "required trusts".
>
> Maybe another special "PyPI Curator" pseudo-user. All packages whose
> signing key is trusted by PyPI *and* PyPI Curator can be deemed trustworthy.
>
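
A rough sketch of the "required trusts" idea, assuming a package's signers are
available as a plain set of names. `is_trusted` and `DEFAULT_REQUIRED_TRUSTS`
are hypothetical names for illustration; this is not an existing pip or PyPI
API. In the simple boolean-AND case the check reduces to a subset test:

```python
# Hypothetical sketch: none of these names are part of pip or PyPI.
DEFAULT_REQUIRED_TRUSTS = frozenset({"PyPI"})


def is_trusted(signers, required_trusts=DEFAULT_REQUIRED_TRUSTS):
    """Return True if every required party has vouched for the package."""
    return set(required_trusts) <= set(signers)


# Base case: pip trusts whatever PyPI trusts.
base_ok = is_trusted({"PyPI"})

# A more security-oriented installation also requires the "PyPI Curator"
# pseudo-user, so a package vouched for by PyPI alone is rejected.
strict = {"PyPI", "PyPI Curator"}
strict_missing = is_trusted({"PyPI"}, strict)
strict_ok = is_trusted({"PyPI", "PyPI Curator"}, strict)
```

Adding internal curators is then just a matter of extending the required set.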

Ha!

"Implement "hook" support for package signature verification"
https://github.com/pypa/pip/issues/1035

- TUF (The Update Framework)

  - (How does this work with DevPi / mirroring?)

- https://github.com/pypa/pip/issues/1035#issuecomment-302766580
  - re: blockchain, **blockcerts** (because which keyserver is up right now?)

Note that:

- Projects define { install_requires, extras_require } in setup.py
  - setup.py is not declarative: you must execute setup.py to determine the
    value of install_requires for the current platform
    - so PyPI **can't** know which dependencies setup.py might list when
      run on a given machine
      - this essentially requires recursive eval of remote code (it must
        run each and every setup.py)
        - which is basically the exercise: download lots of executable
          source from hopefully trusted third parties over a hopefully
          secure channel

  - this is relevant to a trust framework because the whole objective would
    be to "maximally verify" the trust chain for the whole dependency graph,
    but the dependency graph isn't known until after things are unpacked and
    executed
    - ... which eventually brings us to: "we could/should teach devs to sign
      VCS commits and releases (e.g. before they're repackaged, patched, and
      then signed again by e.g. a Linux distro)"

- Retrieving the dependency metadata for a project which MAY define
  conditional dependencies in a setup.py does require executing the setup.py,
  which does indeed complicate efforts to solve a dependency graph (and a
  trust graph)
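
To make the "not declarative" point concrete, here is a minimal sketch of the
kind of logic a setup.py might run before calling setup(). The function name
and the particular package names are only illustrative assumptions, not taken
from any real project:

```python
import sys


def compute_install_requires(platform=sys.platform,
                             version_info=sys.version_info):
    """Mimic dependency logic that only resolves when setup.py is executed."""
    requires = ["requests"]  # illustrative unconditional dependency
    if platform == "win32":
        requires.append("pywin32")  # only listed when run on Windows
    if version_info < (3, 4):
        requires.append("enum34")  # only listed on older interpreters
    return requires


# The same source yields different dependency lists on different machines.
on_linux_py36 = compute_install_requires("linux", (3, 6))
on_windows_py27 = compute_install_requires("win32", (2, 7))
```

An index holding only the source cannot tell which list applies without
running the code on (or simulating) the target machine. Conditions that are
written as PEP 508 environment markers, e.g. `enum34; python_version < "3.4"`,
can be read without execution, but only for projects that express them that
way.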



>
> And if in a highly secure environment, probably internal curators. Which
> means that installation of packages will require three (or more) trusts:
> PyPI, PyPI Curator, curator-1 at example.com, curator-2 at example.com, etc.
>

So, whether or not this (curator approval / LikeAction / AssessAction)
occurs, where should the metadata regarding who and why be stored?

From
http://markmail.org/search/?q=list%3Aorg.python+LikeAction#query:list%3Aorg.python%20LikeAction+page:1+mid:e2xbsb2guxagtshc+state:results


> In terms of schema.org, a Django Packages resource has:
> * [ ] a unique URI
> * [ ] typed features (predicates with ranges)
> * [ ] http://schema.org/review
> * [ ] http://schema.org/VoteAction
> * [ ] http://schema.org/LikeAction


- https://schema.org/AssessAction
  - https://schema.org/ReviewAction

... "curated"

- "[Distutils] Maintaining a curated set of Python packages"
  http://markmail.org/search/?q=list%3Aorg.python+curated#query:list%3Aorg.python%20curated+page:1+mid:ibnoqnovjxp3gavi+state:results



>
> (The relationship need not be simple boolean AND, but can also be
> implemented as a score system. For example, PyPI has weight 0.5, PyPI
> Curator has weight 1.0, internal company curators have weights 2.0 (> PyPI
> + PyPI Curator), and minimum acceptable score is 5.5, meaning that the
> package must be trusted by PyPI, PyPI Curator, and at least 2 internal
> company curators.)
>
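
The score system described above can be sketched as follows. The weights and
the 5.5 threshold are the example numbers from the quoted message; the
function names and curator identifiers are hypothetical:

```python
# Weights and threshold come from the quoted example; names are hypothetical.
WEIGHTS = {"PyPI": 0.5, "PyPI Curator": 1.0}
INTERNAL_CURATOR_WEIGHT = 2.0
MINIMUM_SCORE = 5.5


def trust_score(signers, internal_curators):
    """Sum the weight of every distinct party that vouches for a package."""
    score = 0.0
    for signer in set(signers):
        if signer in internal_curators:
            score += INTERNAL_CURATOR_WEIGHT
        else:
            score += WEIGHTS.get(signer, 0.0)  # unknown signers add nothing
    return score


def is_acceptable(signers, internal_curators, minimum=MINIMUM_SCORE):
    """Return True if the combined score reaches the minimum."""
    return trust_score(signers, internal_curators) >= minimum


two_internal = {"curator-1", "curator-2"}
full = is_acceptable({"PyPI", "PyPI Curator", "curator-1", "curator-2"},
                     two_internal)
one_internal_only = is_acceptable({"PyPI", "PyPI Curator", "curator-1"},
                                  two_internal)
```

With exactly two internal curators, 5.5 is reachable only when all four
parties sign. With three internal curators, PyPI plus the three of them
scores 6.5 and passes without PyPI Curator, so a weighted sum alone does not
make any single party mandatory.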

> We can even create multiple levels of "PyPI Curator":
>
> * PyPI Trusted Authors -- automagically trust well-known 'authors'
> * PyPI Voted Trust -- packages voted by a committee (or by minimum N
> users) to be trustworthy
> * PyPI Audited Trust -- packages that have gone through a more thorough
> code audit / code review
>

This data would be most useful if it could be merged into one graph and
linked to a stable per-package URI.
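
One possible shape for that, sketched with schema.org terms (`AssessAction`,
`agent`, `object`, `description`) as JSON-LD-style records grouped by package
URI. The helper names and the example URI are assumptions for illustration,
not an existing PyPI data model:

```python
import json
from collections import defaultdict


def review_record(package_uri, agent, action_type="AssessAction", reason=None):
    """Build one JSON-LD-style record of who assessed which package, and why."""
    record = {
        "@context": "https://schema.org",
        "@type": action_type,
        "agent": {"@type": "Person", "name": agent},
        "object": {"@type": "SoftwareApplication", "@id": package_uri},
    }
    if reason is not None:
        record["description"] = reason
    return record


def merge_by_package(records):
    """Group records from any number of sources under each package URI."""
    graph = defaultdict(list)
    for record in records:
        graph[record["object"]["@id"]].append(record)
    return dict(graph)


# Hypothetical stable per-package URI.
uri = "https://pypi.org/project/example-package/"
merged = merge_by_package([
    review_record(uri, "curator-1", reason="code audit passed"),
    review_record(uri, "curator-2", action_type="LikeAction"),
])
as_json = json.dumps(merged, indent=2, sort_keys=True)
```

Because every record points at the same `@id`, assessments from PyPI, a
curator, or an internal reviewer can be combined without coordinating on
anything but the package URI.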


>
>
> Rgds,
> --
>
>
> FdS Pandu E Poluan
> ~ IT Optimizer ~
>
>  • LOPSA Member #15248
>  • Blog : http://pandu.poluan.info/blog/
>  • Linked-In : http://id.linkedin.com/in/pepoluan
>
> On Fri, Jun 2, 2017 at 9:33 AM, Matt Joyce <matt at nycresistor.com> wrote:
>
>> I was more pushing for the transitive trust element than signing.  That
>> being said, any signing at all would be progress.
>>
>> On Jun 1, 2017 9:07 PM, "Donald Stufft" <donald at stufft.io> wrote:
>>
>>
>> On Jun 1, 2017, at 8:15 PM, Matt Joyce <matt at nycresistor.com> wrote:
>>
>> Or start doing signed pgp for package maintainers and build a transitive
>> trust model.
>>
>>
>>
>> PGP is not useful for our use case except as a generic crypto primitive,
>> and there are better generic crypto primitives out there. See
>> https://caremad.io/posts/2013/07/packaging-signing-not-holy-grail/
>>
>>
>>>> Donald Stufft
>>
>>
>>
>>
>>
>> _______________________________________________
>> Distutils-SIG maillist  -  Distutils-SIG at python.org
>> https://mail.python.org/mailman/listinfo/distutils-sig
>>
>>