[Distutils] Google Auth is broken for PyPI

Robert Collins robertc at robertcollins.net
Sun Feb 15 23:25:22 CET 2015

I probably shouldn't, but I feel compelled to reply :).

On 11 February 2015 at 06:33, Donald Stufft <donald at stufft.io> wrote:
>> On Feb 10, 2015, at 11:23 AM, Martin v. Löwis <martin at v.loewis.de> wrote:
>> Am 10.02.15 um 15:36 schrieb Donald Stufft:
>>> Honestly, I’d rather have less federated login not more. I wish the current OpenID support had never been added.
>> Can you please elaborate on that position? Why is it useful to have
>> separate accounts on separate systems?
> Sure.
> So the basic premise behind federated auth is that you can get a single set
> of credentials on all (or most) of your sites and eliminate the need to have a
> password for each site you visit.
> My opinion is basically influenced by a number of factors:
> 1. I feel like the goal of federated auth has failed in general and is unlikely
>    to ever succeed. As a user of websites I have over 400 different entries in
>    my password manager, even if 50% of them implement federated auth (which I
>    feel like is a high number but that's not backed by math, just gut feeling)
>    that's still over 200 entries I need to maintain in my password manager. In
>    this case federated auth has not meaningfully reduced the burden of
>    maintaining passwords for me since maintaining 200 isn't any easier than 400
>    and instead it just complicates my login flow

So, what is success here? I'd call 200 fewer passwords to maintain and
rotate on a regular basis a GOOD THING. I very much doubt that you
would have 2FA set up on the other 200 things, so that would mean a
change from 400 sites with only a couple having 2FA to 200 with regular
rotations and 2FA, and 200 liabilities.

> 2. As a site operator I feel like authentication is a core part of the
>    experience of using my site and by allowing federated auth on my site I'm
>    giving up control over that user flow. A relevant example from PyPI is that
>    a number of users signed up using MyOpenID which is no longer being
>    maintained. This means that either PyPI has to tell those people
>    "tough shit" or PyPI needs to figure out a mitigation tactic against that.
>    Another example is that launchpad randomly starts failing for people, and
>    it'll fail consistently for the same person until it just stops failing for
>    them. I'm unable to actually reproduce this error so it's extremely hard
>    for me to do anything else but shrug and tell them not to use it.

I'm genuinely curious here. Why do you feel that authentication is a
core part of the experience? It's a necessary part, sure. But I find it
hard to imagine that many people say 'that bug tracking site, it's got
*awesome authentication*'! I see authentication as something that is
very very hard to get right, and incredibly easy to get wrong. I don't
trust folk that are experts in e.g. bugtracking, or code hosting, or
todo list management to necessarily understand all the intricacies of
password handling (e.g. *how many sites don't use PBKDF2*!). Worse,
some truncate the password you give them to 8 characters (yes, seriously).
It's not that the site operators aren't trustworthy in general, it's
that password handling is nasty:
 - it's hard to get right
 - you won't know if you got it wrong until you or your users are compromised
 - even sites with dedicated teams doing just the IdP aspect get it wrong

I consider it irresponsible for less well resourced sites to get into
credentials management unless they truly have no choice: they're
tackling something they're almost certain to get wrong.
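
To make the PBKDF2 and truncation points concrete, here is a minimal
sketch of salted password hashing using only the Python standard
library. The iteration count, salt size and function names are
assumptions chosen for illustration, not what PyPI or any other site
actually does.

```python
import hashlib
import hmac
import os
from typing import Tuple

# Assumed parameters for illustration only; choose values from current guidance.
ITERATIONS = 200_000
SALT_BYTES = 16


def hash_password(password: str, iterations: int = ITERATIONS) -> Tuple[bytes, int, bytes]:
    """Return (salt, iterations, derived_key), all of which must be stored."""
    salt = os.urandom(SALT_BYTES)  # fresh random salt per password
    # The full password is fed to the KDF; nothing is truncated.
    key = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    return salt, iterations, key


def verify_password(password: str, salt: bytes, iterations: int, expected: bytes) -> bool:
    """Recompute the key with the stored salt and count, compare in constant time."""
    key = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    return hmac.compare_digest(key, expected)
```

Storing the iteration count next to each hash is what makes it possible
to raise the cost later without invalidating existing passwords.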

> 3. I feel like unless you solely rely on federated auth, then federated auth is
>    always going to be a second class citizen for any particular website. For
>    instance Travis CI uses federated auth via Github only, but that's the only
>    thing they support for authentication so everything works well with that. On
>    the other hand a number of sites support federated auth on top of local
>    accounts and federated auth is almost always worse in some ways, sometimes
>    as simple as the username you get is kinda crappy (dstufft_<somehash>)
>    sometimes some features don't work (or don't work very well) at all like
>    on PyPI where we need to authenticate people outside of a web context so
>    if we don't have usernames/passwords then we end up needing to require the
>    user to register a secondary "api password" or API key.

Relying solely on federated auth is fine by me :). You don't need to
tie yourself to one provider. Yes, most users will use just one of
fb/github/google/lp/twitter in our community, but you can (and should)
do unification on email addresses to allow dealing with failed
providers [but only for trustworthy providers or by doing an email
verification step before unifying] and manage ACLs and privileged
operations locally.
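
A rough sketch of that unification rule, as an in-memory toy: an
identity from a trusted provider is attached to the account that
already owns the same email address, while anything else has to pass a
local email verification step first. The provider names, the
`AccountStore` class and the `email_verified_locally` flag are all
hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Set, Tuple

# Hypothetical list of providers whose email claims are accepted as verified.
TRUSTED_PROVIDERS = {"github", "google"}

Identity = Tuple[str, str]  # (provider name, provider-side user id)


@dataclass
class Account:
    email: str
    identities: Set[Identity] = field(default_factory=set)


class AccountStore:
    def __init__(self) -> None:
        self.by_identity: Dict[Identity, Account] = {}
        self.by_email: Dict[str, Account] = {}

    def login(self, provider: str, subject: str, email: str,
              email_verified_locally: bool = False) -> Optional[Account]:
        """Return the account for this identity, or None if verification is needed."""
        key = (provider, subject)
        if key in self.by_identity:  # identity already linked
            return self.by_identity[key]
        if provider not in TRUSTED_PROVIDERS and not email_verified_locally:
            return None  # caller should run an email verification step first
        email = email.strip().lower()  # simplification: treat addresses as case-insensitive
        account = self.by_email.get(email)
        if account is None:
            account = Account(email=email)
            self.by_email[email] = account
        account.identities.add(key)  # unify on the shared email address
        self.by_identity[key] = account
        return account
```

If one provider goes away, the user signs in through another provider
that asserts the same address and lands on the same account. ACLs and
privileged operations hang off the local `Account`, not off any single
provider.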

The fact that some sites do it crappily is in no way an indictment of
the basic tech - in fact some sites do it really well. It's gotten so
good that these days the only time I will sign into a site that
*doesn't* use federated auth is if there is something I really really
really want from it. E.g. I made an account with Elite:Dangerous.

> 4. I feel like none of the current solutions to federated auth are very good.
>    OpenID relies on using an URL as your "personal identifier" which I feel
>    like is a strange and foreign concept to most users. The way around this is
>    often to just hardcode a list of sites, but then as a site operator you're
>    implicitly recommending that users go sign up for one of those sites and
>    use them on your site to login. This is creating an explicit relationship
>    between your site and the other site, a relationship in which you often have
>    no power (for instance, Google <-> PyPI, we're powerless to do anything
>    about them deprecating OpenID other than just sucking it up and dealing with
>    it). Persona did offer a way around this, but persona had other failings
>    like relying on the domain that you happened to be using for your email to
>    implement a persona IdP or otherwise falling back to an implicit relationship
>    with the fallback provider, again one where you're more or less powerless to
>    the operators of that service.

I agree that they're not brilliant. OpenID is basically dead, long
live OpenID Connect :/. So the thing there AIUI is that OAuth worked
out a lot better (more flexible, consistent with both CLI / app
workflows and server side web interactions). And as such everyone is
just consolidating on the one toolchain to avoid lots of needless
redundancy. But as a user, it's fine. I don't judge a site as subordinate
to Google if they allow Google logins, for instance.

Yes, if you use federated auth you need to keep up. But hell, we need
to keep up if we do our own auth management. When was the last time
the hash count on PyPI's password database was increased to account
for hash rate growth? Managing credentials is an ongoing effort - at
Canonical we split that out into its own team, and they were busy just
keeping on top of it and changes in the fundamentals for years. See
above about hard to get right.
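
As an illustration of what keeping up looks like for local passwords,
here is a sketch of raising the iteration count transparently at login
time. The record layout, the `CURRENT_ITERATIONS` value and the
function names are assumptions; I don't know how PyPI actually stores
its hashes.

```python
import hashlib
import hmac
import os

# Assumed target; raised over time as attacker hash rates grow.
CURRENT_ITERATIONS = 200_000


def derive(password: str, salt: bytes, iterations: int) -> bytes:
    """PBKDF2-HMAC-SHA256 over the full password."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)


def check_and_upgrade(password: str, record: dict) -> bool:
    """Verify a login; if the stored hash is stale, re-hash it in place.

    `record` is a hypothetical row with "salt", "iterations" and "key".
    """
    key = derive(password, record["salt"], record["iterations"])
    if not hmac.compare_digest(key, record["key"]):
        return False
    if record["iterations"] < CURRENT_ITERATIONS:
        # We only hold the plaintext during a successful login, so this is
        # the one moment an old hash can be upgraded.
        salt = os.urandom(16)
        record.update(
            salt=salt,
            iterations=CURRENT_ITERATIONS,
            key=derive(password, salt, CURRENT_ITERATIONS),
        )
    return True
```

Accounts that never log in again keep their weak hashes, which is part
of why this is ongoing work and not a one-off change.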

> Overall I think that the use of federated auth, as a site operator, is really
> only worth it over the loss of control in two scenarios:
> A. When your site is already entwined with another site and relying on them for
>    authentication is simply increasing that. An example of this from above is
>    Travis CI where they only work with things hosted on GitHub so also relying
>    on GitHub for authentication isn't that big of a deal and actually makes
>    things better since they can then integrate with GitHub's permissions to
>    check if you have commit on a particular repository.
> B. When creating an account is likely to be enough of a burden to make people
>    decide not to interact with your site. This category is basically completely
>    comprised of sites that do not have long standing relationships with their
>    users. The only real example I can think of off the top of my head is
>    sites with comments enabled like blogs, news sites, etc. The commenters are
>    unlikely to have or want a long standing relationship with your site, they
>    just want to make a quick one off comment and then possibly never come
>    back. For sites like PyPI, on the other hand, the cost of creating an
>    account is small compared to the lifetime of the majority of our user
>    base's interaction with us.

I think you're underestimating the impact this has on users. It
definitely creates a high barrier to entry for me, and I don't think
I'm alone. For bugs.python.org I leapt on Federated auth, but for PyPI
I can't use it because it doesn't allow consolidating the accounts
(AFAICT). Is it a matter of tuits? E.g. do you need someone to provide
patches to both permit the new OpenID Connect, OAuth for console use,
and connecting OpenID Connect identities to local usercodes?

> A key thing to me, as a site operator, is keeping as much control over the
> experience of my users as I can. Obviously I have to outsource some things
> because it's not reasonable for me to make my own hardware, write my own
> drivers, my own kernel, my own OS, my own webserver etc. A good example of a
> major outsourcing that I was involved in was moving things behind Fastly.
> However a key difference between that outsourcing and this outsourcing is that
> if things go sour with Fastly or we need to migrate away from them for one
> reason or another we can do that without end users needing to change much or
> anything. However if something like Google dropping OpenID support happens
> then the users who relied on that are out of luck and our ability to shield
> them from the fallout of that is limited.

That's true, OTOH I think I've made a reasonable case above that our
ability to shield users from our own mistakes is limited, and dealing
with passwords really isn't as simple as all that... and updating to
OpenID Connect should be pretty straightforward, there are good
libraries for it all around.

> At this point we already have it enabled, so unless someone comes up with a
> really good migration strategy I doubt we'll be able to get rid of it. However
> for the reasons above I'm pretty much against adding *additional* federated
> auth things and I think that we should treat it more of a legacy thing and
> downplay the fact we have support for it. Bitbucket has downplayed support for
> random OpenID as well, when you go to their login pages it shows a login form
> that looks like http://d.stufft.io/image/1O2l2g073h0h, which still lets you
> login with OpenID but it's muted and downplayed.
> In a slightly hypocritical view point, I actually think that at some point we
> should get something like id.python.org which is an IdP and switch all of the
> *.python.org sites to authenticate against that instead of keeping local
> user accounts. This would reduce the number of passwords that Python inflicts
> on people but it still keeps authentication within our (PSF/Python/whatever)'s
> control. This is more along the lines of implementing SSO using a federated
> auth technology than actual federated auth though.

Counterpoint: why not get rid of local auth altogether (for web
service, not system administration)? What do we, a non-profit, do that
requires direct control over auth? At least for bugs.python.org and
pypi, both of which support OpenID today, we've clearly considered that
it's ok there.

If we didn't have local auth at all that would free up cycles to do
whatever (moderate) chasing of evolving federation standards is


Robert Collins <rbtcollins at hp.com>
Distinguished Technologist
HP Converged Cloud
