Support for OAuth2/OIDC in the standard distribution?

Hi!
Why am I bringing this up? Security is hard! Implementing a standard correctly is not easy. I know about the latter because for the last 2 years I’ve been involved in certifying OpenID Connect Provider instances. Lately I’ve been doing the same for OpenID Connect Relying Party libraries. All of what I’ve done is written in Python and is on GitHub. As for the former, it has been shown time and time again, so I won’t go into that here.
Now, voices have been raised within the OpenID Foundation suggesting that it should pick a number of implementations, one per language, and give them a stamp of approval. Those implementations would all be thoroughly tested for compliance and usability before being approved. My Python implementation (https://github.com/rohe/pyoidc) is probably the front-runner to be the chosen Python implementation. It’s been around for a number of years and it’s the basis for the test tools, which means it has been thoroughly tested by many independent parties.
My question to you is whether it would be possible to get an OAuth2/OIDC implementation like mine into the Python standard distribution. I realise that I would have to rewrite parts of pyoidc, because at present it uses modules (for instance pycryptodome and requests) that are not part of the standard distribution.
The bottom line is of course that it would benefit the community to have a high quality OAuth2/OIDC implementation within easy reach.
-- Roland

On 16 November 2016 at 22:50, Cory Benfield <cory@lukasa.co.uk> wrote:
I do think it could be useful for you to ask the requests developers if they'd be willing to explicitly recommend a particular approach to implementing OIDC atop requests and provide a pointer from their documentation. Searching on Google for "python oidc" indicates both "pip install oidc" and "pip install oic" are available (with the latter being the case discussed here), but of the two, only yours appears to provide API usage documentation. The protocol walkthrough at http://pyoidc.readthedocs.io/en/latest/howto/rp.html seems like it would be particularly useful to many folks as a hands-on introduction to the steps involved in OIDC based client authentication. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

Basically, I think it’s a matter of visibility. If someone tells you that you have to add OIDC RP capabilities to your service, what do you do? If the needed batteries are already included in Python it’s easy: they’re there. If they aren’t, then the best case scenario is that you find all the implementations there are and, based on objective evaluations (like the OIDF standards compliance tests), choose the ‘best’ one. Is that the most common scenario? I doubt it. The OIDF will publish information about their preferred set of libraries, but I still think there will be a substantial portion of coders we won’t reach. If you have any idea about how we could reach more coders I’m all ears. -- Roland

Ultimately I think that this is a bit misleading. Python’s included batteries are often not the best way to add certain bits of function to your program, and Python programmers are extremely accustomed to needing to get their batteries from elsewhere. Certainly web developers are: web developers cannot even really get started without pip installing *something* unless they plan to hand-roll WSGI using wsgiref or use SimpleHTTPServer, which they overwhelmingly do not.
Coders who need OIDC will go looking for it and will find their options. Ultimately, a huge number of projects haven’t suffered from being outside the standard library. Some of these are even replacements for Python’s included batteries, which means they’re competing with the “just there” options users already have. It should be noted that I believe that Python’s standard library is already too big, and has had a tendency in the past to expand into cases that were not warranted. I think that saying that you’re worried that users won’t find your module and so it should go into the standard library is solving the wrong problem. We shouldn’t just shove useful things into the standard library because they’re useful: that leads to a massive, bloated standard library that is an enormous maintenance burden for the Python core developers who frankly have more than enough to be doing already. Instead, we should aim to solve the actual problem: how do we provide tools to allow users to find the best-in-class solutions to their problems from the third-party Python ecosystem? That there is a much harder problem, unfortunately, but I think we should aim to provide a bit of impetus towards solving it by refusing to add things to the standard library that aren’t likely to be extremely broadly useful. Cory

I agree that is the real question. For instance, I remember someone at a PyCon US raising the concern about modules that no longer have a maintainer. That would be just one among several things you need to know about modules you consider using.
That there is a much harder problem, unfortunately, but I think we should aim to provide a bit of impetus towards solving it by refusing to add things to the standard library that aren’t likely to be extremely broadly useful.
Granted — Roland

On 16 November 2016 at 13:55, Cory Benfield <cory@lukasa.co.uk> wrote:
If you have any idea about how we could reach more coders I’m all ears.
Coders who need OIDC will go looking for it and will find their options. Ultimately, a huge number of projects haven’t suffered from being outside the standard library. Some of these are even replacements for Python’s included batteries, which means they’re competing with the “just there” options users already have.
I'm not a web developer as such, although I do write code that consumes web services on occasion. I don't know what OIDC is, but I do know, for example, that some services use OAuth. So I can imagine being in a situation of saying "I want to get data from a web API xxx, and it needs OAuth identification, how do I do that in Python?" Typically, the API docs are in terms of something like Javascript, with a browser UI, so don't help much for a command line script which is the sort of thing I'd be writing. In that situation, a well-known, easy to use module for OAuth in Python would be fantastic. Agreed that it could as easily be on PyPI as in the stdlib, but discoverability isn't as good with PyPI - I can scan the stdlib docs, but for PyPI I'd end up scanning Google, and what I found that way was oauthlib - I didn't see any mention of pyoidc. I can't comment on what that implies, though. In my brief search though I didn't find any sort of command line "Hello world" level example.
It should be noted that I believe that Python’s standard library is already too big, and has had a tendency in the past to expand into cases that were not warranted.
I should also note that I rely heavily on the stdlib, and for a non-trivial amount of the work I do (which is one-off scripts, not full-blown applications) having to go outside the stdlib, other than for a very few select modules, is a step change in complexity. So I'm a fan of the "batteries included" approach. I don't know whether OAuth is a sufficiently common requirement to warrant going into the stdlib. My instinct is that if you're integrating it into a web app, then there's no value in it being in the stdlib as you'll already need 3rd party modules. If it's practical to use OAuth from a simple Python script (say, to integrate with a web API like github) then being in the stdlib could be a benefit. But how many people write Python scripts that use/need OAuth? I've no feel for that. Paul

I think this is actually another great example of why we should resist attempts to add modules to the standard library without enormous caution. I think that fundamentally in most of these cases the audience on python-dev is not equipped to decide whether an implementation deserves to become a default battery. And this is a surprisingly good example case.
With all due respect to Roland, pyoidc is not the incumbent in the synchronous OAuth space: requests-oauthlib is. A quick Google search for “python oauth” turned up the following client libraries in my top 10 results: oauthlib (discarded because it’s a sans-IO implementation that recommends requests-oauthlib as a client), python-oauth2, requests-oauthlib, and Google’s OAuth 2 client library (discarded because it is bundled into the Google API client libraries module and so distorts my download counts below). A quick query of the PyPI download database for the last three months shows the following download counts for those modules:
- requests-oauthlib == 1,897,048
- oauth2 == 349,759
- pyoidc == 10,520
This is not intended to be chastening for Roland: all new modules start with low download counts.
As the current lead maintainer of requests-oauthlib, let me say publicly and loudly that I’d love to have pyoidc replace requests-oauthlib. I *hate* requests-oauthlib. I maintain it literally only to prevent it falling into disrepair because it is extremely widely used. I would love a better library to come along so that we can sunset requests-oauthlib. I am entirely prepared to believe that Roland’s module is better than requests-oauthlib: it would be hard for it not to be.
However, *right now*, pyoidc does not have anything like a majority (or even a plurality) of mindshare amongst Python developers writing OAuth clients. So why should pyoidc be added to the standard library over the competing implementations?
The only reason I can see to add it is if it is a better implementation than its competitors, and the python-dev community believe that developers using the competitor implementations would be better served using pyoidc instead. Is that the case? Do we have some objective measure of this? Paul, you mentioned that discovery on PyPI is a problem: I don’t contest that at all. But I don’t think the solution to that problem is to jam modules into the standard library, and I think even less of that idea when there is no formal process available for python-dev to consider the implementations available for the standard library. Instead, I think we need a way to be able to ask the question: “what does the wider Python development community consider to be the gold standard for solving problem X?”. I do not think that adding modules to the standard library is the way to answer that question. The TL;DR of this massive argument is: I think the community of people who actually use OAuth on a regular basis are better placed to judge what the best-in-class battery for OAuth is. What we need is a way to surface their collective opinions to people who don’t know what options are available, rather than to make commitments to long-term support of a module in the standard library.
I think this is the other problem that needs solving, and because I’m a full-time OSS developer with complete admin rights on my development and production targets I’m badly placed to solve it. What needs to be done to make it easier for people in your position to obtain non-included batteries? Can anything be done at all?
Yeah, OAuth from Python scripts isn’t entirely uncommon. It’s mostly used to interact with third-party APIs, which usually use OAuth to allow for revocable and granular permissions grants to specific scripts. Cory

On 17 November 2016 at 10:58, Cory Benfield <cory@lukasa.co.uk> wrote:
Paul, you mentioned that discovery on PyPI is a problem: I don’t contest that at all. But I don’t think the solution to that problem is to jam modules into the standard library, and I think even less of that idea when there is no formal process available for python-dev to consider the implementations available for the standard library.
Yeah, in the process of the discussion a certain amount of context was lost. I also don't think that the solution is to "jam" modules into the standard library. I *do* think that part of the solution should be to have good solutions to common programming problems in the standard library. What is a "common problem" changes over time, as well as by problem domain, and we need to take that into account. My feeling is that (client-level, web service consumer focused) OAuth is tending towards being one of those "common problems" (as the authentication side of the whole REST/JSON/etc web API toolset) and warrants consideration for inclusion in the stdlib. I have no experience to judge what is the current "best solution". I'm happy for the community to thrash that out, and settle on a standard. Roland proposed his library as a solution - I can comment on that (from the perspective of "what a non-expert would like to have from the stdlib") but I won't until the question of "is the proposed library really the standard" is resolved. And maybe it *won't* be resolved as the situation isn't sufficiently settled yet. And that's fine. But I don't agree with the principle that we should stop adding solutions to common problems to the stdlib "because PyPI is only a pip install away". There will always be users who can't, won't or simply don't use PyPI and judge Python on what you can do with a base install. And that's a valid judgement to make. One of the reasons I prefer Python over (say) Perl, is that if I go onto a Linux server that's isolated from the internet, both are available but on Python I can do things like compose a MIME email and send it via SMTP, work with dates and times, read and write CSV files, parse XML data from an external program, etc. On Perl I can't because the Perl standard library doesn't have those things available, so I end up having to write my own - which means I take time away from getting my *actual* job done.
Instead, I think we need a way to be able to ask the question: “what does the wider Python development community consider to be the gold standard for solving problem X?”.
Agreed, that's the key unsolved question for Python packaging.
I do not think that adding modules to the standard library is the way to answer that question.
Again agreed. But I *do* think that once that question is answered (on a case by case basis, not the overall "how do we do it for everything" question) then adding the module that has been identified as the gold standard to the stdlib *may* (depending on the problem domain) be important. Put it another way - being in the stdlib isn't a solution to the discoverability problem, but it is a solution to the access problem (which is a real problem for some people, despite pip and PyPI). Paul
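The base-install tasks Paul lists above (composing a MIME email, working with dates, writing CSV, parsing XML from an external program) can be sketched with nothing but the standard library. This is only an illustration: the XML sample, host names and email addresses are made up, and the SMTP send is left as a comment because it needs a reachable mail server.

```python
import csv
import io
import xml.etree.ElementTree as ET
from datetime import date, timedelta
from email.message import EmailMessage

# Parse XML as it might arrive from an external program (sample data is invented).
xml_report = "<report><host name='web1' status='ok'/><host name='db1' status='down'/></report>"
hosts = [(h.get("name"), h.get("status")) for h in ET.fromstring(xml_report).iter("host")]

# Write the parsed rows as CSV into an in-memory buffer.
buffer = io.StringIO()
writer = csv.writer(buffer, lineterminator="\n")
writer.writerow(["host", "status"])
writer.writerows(hosts)
csv_text = buffer.getvalue()

# Simple date arithmetic: schedule the next check one week later.
next_check = date(2016, 11, 17) + timedelta(days=7)

# Compose a MIME email with the CSV as an attachment (addresses are placeholders).
msg = EmailMessage()
msg["Subject"] = f"Host report, next check {next_check.isoformat()}"
msg["From"] = "ops@example.com"
msg["To"] = "team@example.com"
msg.set_content("Report attached.")
msg.add_attachment(csv_text, subtype="csv", filename="hosts.csv")

# Sending would also be stdlib-only, assuming a local SMTP server exists:
#   import smtplib
#   with smtplib.SMTP("localhost") as smtp:
#       smtp.send_message(msg)
```

Nothing here needs pip or network access, which is the property Paul is describing for isolated servers.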

Fair enough: “jam” was probably a more emotive term than I needed to use there, I’ll happily concede that.
So this argument seems reasonable to me, but my problem with it is that it seems to be fuzzy. It leads me to all kinds of follow-on questions, such as:
- What counts as a common problem? Is there an objective measure, or do we decide by gut feel?
- How do we scope the problem? Is the problem we’re solving in this specific case OAuth? OpenID Connect? HTTP authentication in general?
- How complex does that problem have to be before we decide that the solution doesn’t belong in the standard library? Alternatively, does being complex make it *more* important that we have a standard library solution?
- Should the standard library have a greenfield implementation or adopt the third-party one?
- If it adopts the third party one what happens to the third-party maintainers, are they expected to keep maintaining?
- If they object, does CPython do a hostile fork and take over maintenance itself, or pursue another implementation?
- How do we balance the desire to increase the scope of the stdlib with the increased maintenance burden that brings?
- Do we ever *remove* modules that are solutions to problems that are no longer common?
- Is it acceptable to solve only part of the problem? (For context, OAuth2 is a complex specification that leaves a lot of detail out: requests-oauthlib contains a lot of “compliance fixes” for specific oauth2 servers that deviate from the specification in unexpected ways. Is it acceptable to write exactly to the spec and to leave anyone who needs custom code to their own devices?)
- What about asyncio integration? Is that mandatory for new protocol code? Optional? How important? Can it be asyncio-only?
- As a follow-on, what about integration with other stdlib modules? Does the new OAuth module have to work with all stdlib HTTP clients? Only one? Is it its own client you use directly?
- What happens if/when the protocol is revised?
- What happens if/when the maintainers move on from the project?
- Do we also maintain an out-of-tree backport for users on older Pythons? If not, is it acceptable for those users to have older versions of the library unless they upgrade their whole Python distribution?
This isn’t me disagreeing with you, just me pointing out that the fuzziness around this makes me nervous. It has been my experience that a large number of protocol implementations in the standard library are already struggling to meet their maintenance goals, and I’d be pretty reluctant about wanting to add to that burden.
I can understand that. I definitely think you and I have disagreements on the best way to solve this problem (and that we’re unlikely to resolve them in this thread!), but I certainly acknowledge that this use case is real and important. I think my biggest disagreement on this use case is simply about scope: at what point does a use become too niche to be supported by the standard library?
Fair enough. Cory

On 17 November 2016 at 12:27, Cory Benfield <cory@lukasa.co.uk> wrote:
This isn’t me disagreeing with you, just me pointing out that the fuzziness around this makes me nervous. It has been my experience that a large number of protocol implementations in the standard library are already struggling to meet their maintenance goals, and I’d be pretty reluctant about wanting to add to that burden.
The fuzziness is a clear and definite issue here. My perspective is that thus far, Python (specifically the core dev team[1]) has done a good job of balancing that fuzziness. I don't expect this to ever be a simple decision to make. All I ask is that we avoid countering a complex decision process with an over-simple guideline (specifically, that "things don't need to go into the stdlib because of pip/PyPI"[2]). Regardless of how things go in the long term, I think it's good to keep the debates open in this area, and I appreciate your comments. I'll certainly be thinking about my personal answers to the questions you raised, and I expect I'll change at least some of my views as a result.
Paul
[1] Even though I'm a core dev, I view myself as a "normal user" in this context, and take no personal credit for the scope of the stdlib.
[2] Ironically, I'm also a pip developer, so if I seem confused, I claim the right to be :-)

Agreed. For what it’s worth, I’ll almost always find myself on the “let’s not add it to the stdlib” side of that argument, but I’m entirely willing to lose those arguments. I think we’re best served by having voices on both sides of the debate who believe themselves to be right in the *general* case but are willing to treat each case on the merits. There are certainly lots of requests for addition to the stdlib I have no objection to: for example, data structure and algorithm implementations in the standard library almost always seem like no-brainers to me. So I agree, I’m going to keep an eye on this space as we move forward, and for my part I promise to treat each case on the merits, despite my general belief of “small is beautiful”. Cory

On 17 November 2016 at 21:35, Paul Moore <p.f.moore@gmail.com> wrote:
Not just Python packaging - open source publishing in general, and one of the big metrics we have providing evidence of this is hosted package growth rates. Some registries try to present exponential growth in the number of hosted packages as a good thing, but they're often wrong to do so: in most cases, that kind of exponential growth is more likely to represent a failure of software discovery mechanisms (so folks are publishing their own custom solutions to previously solved problems rather than adopting existing tools as "good enough") than it is actual growth in the number of different problem domains with readily available published toolkits for tackling them. That's fine in a software-as-creativity-and-play context, but it's a problem in the software-as-a-means-to-an-end mindset that is applicable to most professional development activities. The one upside I see to the current state of affairs is that this problem isn't *new* - it's existed for as long as we've had software, it was just hidden away behind the walls of the institutions writing custom in-house software. Now that more software publication and consumption activities are instead starting to happen in the open, the problem can be quantified, and various automated techniques brought to bear on tackling it. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

No offence taken! :-) But you should distinguish between OAuth2 and OIDC. OIDC is a profile of OAuth2 for use in cases where you need not only authorization (and access tokens) but also authentication and/or user info.
As the current lead maintainer of requests-oauthlib, let me say publicly and loudly that I’d love to have pyoidc replace requests-oauthlib. I *hate* requests-oauthlib. I maintain it literally only to prevent it falling into disrepair because it is extremely widely used. I would love a better library to come along so that we can sunset requests-oauthlib. I am entirely prepared to believe that Roland’s module is better than requests-oauthlib: it would be hard for it not to be.
:-)
However, *right now*, pyoidc does not have anything like a majority (or even a plurality) of mindshare amongst Python developers writing OAuth clients. So why should pyoidc be added to the standard library over the competing implementations? The only reason I can see to add it is if it is a better implementation than its competitors, and the python-dev community believe that developers using the competitor implementations would be better served using pyoidc instead. Is that the case? Do we have some objective measure of this?
The only possible objective measurement I can see would be testing the implementation for standard compliance.
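The distinction Roland draws above between OAuth2 and OIDC can be made concrete with a small sketch. In OIDC, the token response carries an extra `id_token` (a JWT) alongside the OAuth2 access token, and its claims say who was authenticated. Everything below is illustrative: the issuer, client id and token values are invented, the helper names are hypothetical, and the payload is decoded without any signature check. A real Relying Party must verify the signature and claims such as `iss`, `aud` and `exp`, which is the kind of work a tested library exists to do.

```python
import base64
import json


def b64url_encode(data: bytes) -> str:
    """Base64url-encode without padding, as used for JWT segments."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def b64url_decode(segment: str) -> bytes:
    """Decode a base64url segment, restoring the stripped padding."""
    padding = "=" * (-len(segment) % 4)
    return base64.urlsafe_b64decode(segment + padding)


def peek_id_token_claims(id_token: str) -> dict:
    """Return the claims of a JWT WITHOUT verifying it (illustration only)."""
    _header, payload, _signature = id_token.split(".")
    return json.loads(b64url_decode(payload))


# Invented claims; the claim names are the standard ID token ones.
claims = {
    "iss": "https://op.example.com",
    "sub": "user-123",
    "aud": "my-client-id",
    "exp": 1479400000,
    "iat": 1479396400,
}
fake_id_token = ".".join([
    b64url_encode(json.dumps({"alg": "none"}).encode("utf-8")),
    b64url_encode(json.dumps(claims).encode("utf-8")),
    "",  # empty signature segment: this made-up token is unsigned
])

# Plain OAuth2: authorization only. OIDC: the same plus an identity assertion.
oauth2_response = {"access_token": "opaque-token", "token_type": "Bearer"}
oidc_response = dict(oauth2_response, id_token=fake_id_token)

print(peek_id_token_claims(oidc_response["id_token"])["sub"])
```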

On Thu, Nov 17, 2016 at 11:51 PM, Roland Hedberg <roland@catalogix.se> wrote:
When you're looking at oauth2, there are myriad uses for it, and thus large numbers of people looking for the module. But OIDC is something I had never heard of until this thread (turns out it's something built on top of OAuth2). Your module may well be best-in-show for OIDC (unproven, but assume it for the nonce), but unless it's also best-in-show for OAuth2, it's not going to have the broad draw/appeal that I would hope for in a new stdlib module.
Perhaps the best step forward is to publish blog posts demonstrating how your module compares to other OAuth2 libraries. That would put the module name alongside various keywords that people will search for, and thus improve its visibility. Consider this thought process, which I'd say is fairly typical:
1) I want to use Fred's Wonderful Spamination API.
2) FWSA's docs say that I need to use this thing called OAuth.
3) What's OAuth? How do I use it? Search the web.
4) Oh, there's OAuth1 and OAuth2. Which should I use? Ahh, FWSA's docs say OAuth2. Okay.
5) I need a Python module that does OAuth2.
6) Search the web, or search PyPI?
Personally, when I hit step 6, I search the web. PyPI search is exhaustive but not very usefully ranked (for this purpose). Searching for a keyword or protocol will give undue weight to a module whose name is simply that word, even if that module is terrible, unmaintained, etc, etc. Properly-ranked web search results are generally more useful in pointing me to the appropriate package, even if they're telling me to use something with a very different name. (Consider a search for "python http". You'll get httplib/http.client, but shortly after that, you get pointed to 'requests'.)
As another bonus, blog posts of that nature will help to explain to more experienced devs "why should this matter to me". People who've already used requests-oauthlib are unlikely to reach for a new and unproven package without a good reason. So give them that reason! :)
Also, as I mentioned earlier, the Python Wiki may well have an appropriate spot for this to be mentioned. It's worth a check. ChrisA

On 17 November 2016 at 14:45, Chris Angelico <rosuav@gmail.com> wrote:
Additionally, I look for simple usage examples. When I did search for OAuth, I got lots of hits for libraries, some even included "how to add OAuth to your Flask app" examples. But not one showed me how I should call a web service that uses OAuth from the Python interpreter prompt using that library. Contrast the first page of the requests documentation:
    >>> import requests
    >>> r = requests.get('https://api.github.com/user', auth=('user', 'pass'))
    >>> r.status_code
    200
    >>> r.headers['content-type']
    'application/json; charset=utf8'
    >>> r.encoding
    'utf-8'
    >>> r.text
    u'{"type":"User"...'
    >>> r.json()
    {u'private_gists': 419, u'total_private_repos': 77, ...}
With that, I immediately see how to use the code. Something similar with an OAuth using service would be what I'm looking for. I'd consider that sort of use case focused documentation as a minimum for any library that was looking to be included in the stdlib.
I know I've been arguing earlier in this thread that "OAuth may be a good candidate for the stdlib". Here, what I'm saying is "... but I don't see any library implementing it that's ready for stdlib inclusion". While oic may be highly standards-compliant, IMO it's not ready for stdlib inclusion without user-focused design and documentation. On the other hand, nor was any other OAuth package I found via a quick search. Paul
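As a rough idea of the "Hello world" level example Paul is asking for, here is a stdlib-only sketch of the last step of an OAuth2 interaction: calling an API once you already hold an access token. It is not taken from any of the libraries discussed. The function names and the example URL are hypothetical, and obtaining the token in the first place (the authorization flow itself) is deliberately left out, since that is the part the OAuth libraries handle.

```python
import json
import urllib.request


def build_api_request(url: str, access_token: str) -> urllib.request.Request:
    """Build a GET request that presents an OAuth2 bearer token."""
    return urllib.request.Request(
        url,
        headers={
            "Authorization": f"Bearer {access_token}",
            "Accept": "application/json",
        },
    )


def fetch_json(url: str, access_token: str):
    """Perform the request and decode the JSON body (needs network access)."""
    with urllib.request.urlopen(build_api_request(url, access_token)) as response:
        return json.load(response)


# Hypothetical usage from a script or the interpreter prompt:
#   user = fetch_json("https://api.example.com/user", "token-obtained-elsewhere")
#   print(user)
```

The request construction is split from the network call so the header handling can be inspected without contacting any server.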

Cory Benfield wrote:
Perhaps there could be a curated area on PyPI, maintained by core developers or people appointed by them, where the packages considered best-in-class would be placed. Instead of lobbying to get a package into the stdlib (a very high barrier to cross) people would then have the option of lobbying to get it stamped as "recommended" on PyPI. -- Greg

On Thu, Nov 17, 2016 at 8:06 AM, Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
Or on the Python wiki: https://wiki.python.org/moin/FrontPage A page there could highlight the best PyPI packages. ChrisA

Cory Benfield writes:
I think the core question you need to answer for this proposal is: why is “pip install oic” not easy-enough reach?
My first guess would be "some enterprises use OAuth internally for the same reason they have draconian approval policies". More straightforwardly, this is the kind of battery that enterprises which make it hard to use pip seem likely to value. Nick's response is formally correct, but if requests is so important, it's likely to already be on the approved list. I would assume that pyoidc also implements the client side, so it is useful even if requests is absent. But I am not a draconian security policy QA/security reviewer. I'd take anything Paul Moore says pretty seriously, as he operates in such an environment.

On 17 November 2016 at 12:42, Stephen J. Turnbull <turnbull.stephen.fw@u.tsukuba.ac.jp> wrote:
In that context, the problem is the old "batteries that leak acid everywhere can be worse than no batteries at all" one: we know from painful experience with the SSL module that the standard library's typical release and adoption cycle can be seriously problematic when it comes to network security related modules. However, when it comes to draconian security policies, *transitive recommendations have power*: if CPython is approved, and python-dev collectively says "we recommend pip, virtualenv, and requests", then folks in locked down environments can use that as evidence to suggest that those other modules are also trustworthy (particularly when there are commercial software vendors shipping them). In the case of something like OIDC, if the requests, Flask, Django and Pyramid developers were all inclined to recommend the same library for server and client implementations, then that would similarly be sufficient justification in many environments to bring the component in as a new dependency. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

Nick Coghlan writes:
In that context, SSL is a somewhat special case because of its dependence on OpenSSL and because of Apple's U+1F4A9-headed attitude toward fixing something they distribute. If pyoidc is pure Python, I would expect as many releases as we have branches supporting it within 6-12 weeks. Less time, if there is a truly nasty security bug. No? Sure, that's problematic for any sites that haven't yet approved a Python version with pyoidc in it, but they're problematic, period.
I understand that argument, but I can assure you that my employer does not. Its security policies tend to be both draconian and ineffective. Is the "python-dev and RHEL recommend it" evidence all that effective for "most" sites with "software must be approved before installing" policies? (This argument could easily go against me, if "draconian" sites tend to drag on approving Python point releases as well as on "new" PyPI modules.) Steve

On 17 November 2016 at 11:12, Stephen J. Turnbull <turnbull.stephen.fw@u.tsukuba.ac.jp> wrote:
In my experience, it's not so much "draconian" policies that cause the issues, as "not interested" policies. You get the standard build because that's what the default install gave, before the application-specific stack was added. If the application in question isn't built with Python (e.g., you use Python for automation, system management, or whatever) then you stand no chance of finding anyone to even *ask* for permission to go beyond the "out of the box" build. At best, on Windows systems, you get to say "we need you to run the following installers for our support tools" once, and likely once only. No internet access, no "please can we have X added", you get what you thought to ask for on day 1, and that's it. Paul

On 17 November 2016 at 02:42, Stephen J. Turnbull <turnbull.stephen.fw@u.tsukuba.ac.jp> wrote:
For context, my environment is one that doesn't formally use Python, but needs a lot of ad hoc automation and management solutions, for which Python is a great fit, as long as it doesn't need anything that isn't pure "out of the box" functionality (because once we need that, we get into formal requests for things to be added to supported software lists). There are certain possibilities for "under the radar" additions, but the costs get high pretty quickly and overwhelm the benefits. So in some ways things can be very flexible, but in others I need to think in worst-case "I'm lucky to have Python at all, let's not push my luck" terms. It's likely that this sort of environment is becoming less common as Python becomes more mainstream/popular (it's not that long ago that you were lucky to find Python in a default Unix installation at all, for example), but it is still something we should be considering when looking at what deserves to be in the stdlib (sure, requests is better than urllib, but if urllib disappeared, I wouldn't be able to do web requests at all in many of my environments). Paul

On 11/16/2016 03:51 AM, Roland Hedberg wrote:
My concern with a rewrite is that your code is no longer thoroughly tested. Of course, this depends in large part on the completeness of your unit tests. I agree with Cory that discoverability is the problem to solve. You can help with that somewhat by seeding sites like StackOverflow with the information.* -- ~Ethan~ * SO is not very tolerant of questions looking for opinion-based answers such as software recommendations, so make sure you phrase your question as "How do I do XXX using pyoidc?" and your answer can also include other links on the OIDC subject.

Basically, I think it’s a matter of visibility. If someone tells you that you have to add OIDC RP capabilities to your service, what do you do? If the needed batteries are already included in Python it’s easy, it’s there. If they aren’t, then the best-case scenario is that you find all the implementations there are and, based on objective evaluations (like the OIDF standards-compliance tests), choose the ‘best’ one. Is that the most common scenario? I doubt it. The OIDF will publish information about their preferred set of libraries but I still think there will be a substantial portion of coders we won’t reach. If you have any idea about how we could reach more coders I’m all ears. -- Roland

Ultimately I think that this is a bit misleading. Python’s included batteries are often not the best way to add certain bits of function to your program, and Python programmers are extremely accustomed to needing to get their batteries from elsewhere. Certainly web developers are: web developers cannot even really get started without pip installing *something* unless they plan to hand-roll WSGI using wsgiref or use SimpleHTTPServer, which they overwhelmingly do not.
Coders who need OIDC will go looking for it and will find their options. Ultimately, a huge number of projects haven’t suffered from being outside the standard library. Some of these are even replacements for Python’s included batteries, which means they’re competing with the “just there” options users already have. It should be noted that I believe that Python’s standard library is already too big, and has had a tendency in the past to expand into cases that were not warranted. I think that saying that you’re worried that users won’t find your module and so it should go into the standard library is solving the wrong problem. We shouldn’t just shove useful things into the standard library because they’re useful: that leads to a massive, bloated standard library that is an enormous maintenance burden for the Python core developers who frankly have more than enough to be doing already. Instead, we should aim to solve the actual problem: how do we provide tools to allow users to find the best-in-class solutions to their problems from the third-party Python ecosystem? That there is a much harder problem, unfortunately, but I think we should aim to provide a bit of impetus towards solving it by refusing to add things to the standard library that aren’t likely to be extremely broadly useful. Cory

I agree that is the real question. For instance, I remember someone raising at a PyCon US the concern about modules that no longer have a maintainer. That would be just one among several things you need to know about modules you consider using.
That there is a much harder problem, unfortunately, but I think we should aim to provide a bit of impetus towards solving it by refusing to add things to the standard library that aren’t likely to be extremely broadly useful.
Granted — Roland

On 16 November 2016 at 13:55, Cory Benfield <cory@lukasa.co.uk> wrote:
If you have any idea about how we could reach more coders I’m all ears.
Coders who need OIDC will go looking for it and will find their options. Ultimately, a huge number of projects haven’t suffered from being outside the standard library. Some of these are even replacements for Python’s included batteries, which means they’re competing with the “just there” options users already have.
I'm not a web developer as such, although I do write code that consumes web services on occasion. I don't know what OIDC is, but I do know, for example, that some services use OAuth. So I can imagine being in a situation of saying "I want to get data from a web API xxx, and it needs OAuth identification, how do I do that in Python?" Typically, the API docs are in terms of something like Javascript, with a browser UI, so don't help much for a command line script which is the sort of thing I'd be writing. In that situation, a well-known, easy to use module for OAuth in Python would be fantastic. Agreed that it could as easily be on PyPI as in the stdlib, but discoverability isn't as good with PyPI - I can scan the stdlib docs, but for PyPI I'd end up scanning Google, and what I found that way was oauthlib - I didn't see any mention of pyoidc. I can't comment on what that implies, though. In my brief search though I didn't find any sort of command line "Hello world" level example.
It should be noted that I believe that Python’s standard library is already too big, and has had a tendency in the past to expand into cases that were not warranted.
I should also note that I rely heavily on the stdlib, and for a non-trivial amount of the work I do (which is one-off scripts, not full-blown applications) having to go outside the stdlib, other than for a very few select modules, is a step change in complexity. So I'm a fan of the "batteries included" approach. I don't know whether OAuth is a sufficiently common requirement to warrant going into the stdlib. My instinct is that if you're integrating it into a web app, then there's no value in it being in the stdlib as you'll already need 3rd party modules. If it's practical to use OAuth from a simple Python script (say, to integrate with a web API like github) then being in the stdlib could be a benefit. But how many people write Python scripts that use/need OAuth? I've no feel for that. Paul

I think this is actually another great example of why we should resist attempts to add modules to the standard library without enormous caution. I think that fundamentally in most of these cases the audience on python-dev is not equipped to decide whether an implementation deserves to become a default battery. And this is a surprisingly good example case. With all due respect to Roland, pyoidc is not the incumbent in the synchronous OAuth space: requests-oauthlib is. A quick Google search for “python oauth” turned up the following client libraries in my top 10 results: oauthlib (discarded because it’s a sans-IO implementation that recommends requests-oauthlib as a client), python-oauth2, requests-oauthlib, and Google’s OAuth 2 client library (discarded because it is bundled into the Google API client libraries module and so distorts my download counts below). A quick query of the PyPI download database for the three months shows the following download counts for those modules:
- requests-oauthlib == 1,897,048
- oauth2 == 349,759
- pyoidc == 10,520
This is not intended to be chastening for Roland: all new modules start with low download counts. As the current lead maintainer of requests-oauthlib, let me say publicly and loudly that I’d love to have pyoidc replace requests-oauthlib. I *hate* requests-oauthlib. I maintain it literally only to prevent it falling into disrepair because it is extremely widely used. I would love a better library to come along so that we can sunset requests-oauthlib. I am entirely prepared to believe that Roland’s module is better than requests-oauthlib: it would be hard for it not to be. However, *right now*, pyoidc does not have anything like a majority (or even a plurality) of mindshare amongst Python developers writing OAuth clients. So why should pyoidc be added to the standard library over the competing implementations?
The only reason I can see to add it is if it is a better implementation than its competitors, and the python-dev community believe that developers using the competitor implementations would be better served using pyoidc instead. Is that the case? Do we have some objective measure of this? Paul, you mentioned that discovery on PyPI is a problem: I don’t contest that at all. But I don’t think the solution to that problem is to jam modules into the standard library, and I think even less of that idea when there is no formal process available for python-dev to consider the implementations available for the standard library. Instead, I think we need a way to be able to ask the question: “what does the wider Python development community consider to be the gold standard for solving problem X?”. I do not think that adding modules to the standard library is the way to answer that question. The TL;DR of this massive argument is: I think the community of people who actually use OAuth on a regular basis are better placed to judge what the best-in-class battery for OAuth is. What we need is a way to surface their collective opinions to people who don’t know what options are available, rather than to make commitments to long-term support of a module in the standard library.
I think this is the other problem that needs solving, and because I’m a full-time OSS developer with complete admin rights on my development and production targets I’m badly placed to solve it. What needs to be done to make it easier for people in your position to obtain non-included batteries? Can anything be done at all?
Yeah, OAuth from Python scripts isn’t entirely uncommon. It’s mostly used to interact with third-party APIs, which usually use OAuth to allow for revocable and granular permissions grants to specific scripts. Cory

On 17 November 2016 at 10:58, Cory Benfield <cory@lukasa.co.uk> wrote:
Paul, you mentioned that discovery on PyPI is a problem: I don’t contest that at all. But I don’t think the solution to that problem is to jam modules into the standard library, and I think even less of that idea when there is no formal process available for python-dev to consider the implementations available for the standard library.
Yeah, in the process of the discussion a certain amount of context was lost. I also don't think that the solution is to "jam" modules into the standard library. I *do* think that part of the solution should be to have good solutions to common programming problems in the standard library. What is a "common problem" changes over time, as well as by problem domain, and we need to take that into account. My feeling is that (client-level, web service consumer focused) OAuth is tending towards being one of those "common problems" (as the authentication side of the whole REST/JSON/etc web API toolset) and warrants consideration for inclusion in the stdlib. I have no experience to judge what is the current "best solution". I'm happy for the community to thrash that out, and settle on a standard. Roland proposed his library as a solution - I can comment on that (from the perspective of "what a non-expert would like to have from the stdlib") but I won't until the question of "is the proposed library really the standard" is resolved. And maybe it *won't* be resolved as the situation isn't sufficiently settled yet. And that's fine. But I don't agree with the principle that we should stop adding solutions to common problems to the stdlib "because PyPI is only a pip install away". There will always be users who can't, won't or simply don't use PyPI and judge Python on what you can do with a base install. And that's a valid judgement to make. One of the reasons I prefer Python over (say) Perl, is that if I go onto a Linux server that's isolated from the internet, both are available but on Python I can do things like compose a MIME email and send it via SMTP, work with dates and times, read and write CSV files, parse XML data from an external program, etc. On Perl I can't because the Perl standard library doesn't have those things available, so I end up having to write my own - which means I take time away from getting my *actual* job done.
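To make the base-install point concrete, here is a minimal sketch of those tasks using only standard library modules. The addresses, host names and report data are made up for illustration, and actually sending the message would additionally need a reachable SMTP server, which is assumed and not exercised here.

```python
import csv
import io
import xml.etree.ElementTree as ET
from datetime import date, timedelta
from email.message import EmailMessage

# Compose a MIME email. All addresses here are hypothetical.
msg = EmailMessage()
msg["From"] = "ops@example.com"
msg["To"] = "team@example.com"
msg["Subject"] = "Nightly report"
msg.set_content("All hosts checked.")
# Sending would be: smtplib.SMTP("localhost").send_message(msg)
# (assumes an SMTP server is listening; not executed in this sketch).

# Work with dates: when is the next review, 30 days out?
next_review = date(2016, 11, 17) + timedelta(days=30)

# Write CSV data to an in-memory buffer, then read it back.
buf = io.StringIO()
csv.writer(buf).writerows([["host", "status"], ["db1", "ok"]])
buf.seek(0)
rows = list(csv.reader(buf))

# Parse XML data from an external program (inlined here as a string).
root = ET.fromstring(
    '<hosts><host name="db1" up="true"/><host name="web1" up="false"/></hosts>'
)
hosts_up = [h.get("name") for h in root.iter("host") if h.get("up") == "true"]

print(msg["Subject"], next_review.isoformat(), rows, hosts_up)
```

None of this needs pip or network access, which is the property being argued for here.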
Instead, I think we need a way to be able to ask the question: “what does the wider Python development community consider to be the gold standard for solving problem X?”.
Agreed, that's the key unsolved question for Python packaging.
I do not think that adding modules to the standard library is the way to answer that question.
Again agreed. But I *do* think that once that question is answered (on a case by case basis, not the overall "how do we do it for everything" question) then adding the module that has been identified as the gold standard to the stdlib *may* (depending on the problem domain) be important. Put it another way - being in the stdlib isn't a solution to the discoverability problem, but it is a solution to the access problem (which is a real problem for some people, despite pip and PyPI). Paul

Fair enough: “jam” was probably a more emotive term than I needed to use there, I’ll happily concede that.
So this argument seems reasonable to me, but my problem with it is that it seems to be fuzzy. It leads me to all kinds of follow-on questions, such as:
- What counts as a common problem? Is there an objective measure, or do we decide by gut feel?
- How do we scope the problem? Is the problem we’re solving in this specific case OAuth? OpenID Connect? HTTP authentication in general?
- How complex does that problem have to be before we decide that the solution doesn’t belong in the standard library? Alternatively, does being complex make it *more* important that we have a standard library solution?
- Should the standard library have a greenfield implementation or adopt the third-party one?
- If it adopts the third party one what happens to the third-party maintainers, are they expected to keep maintaining?
- If they object, does CPython do a hostile fork and take over maintenance itself, or pursue another implementation?
- How do we balance the desire to increase the scope of the stdlib with the increased maintenance burden that brings?
- Do we ever *remove* modules that are solutions to problems that are no longer common?
- Is it acceptable to solve only part of the problem? (For context, OAuth2 is a complex specification that leaves a lot of detail out: requests-oauthlib contains a lot of “compliance fixes” for specific oauth2 servers that deviate from the specification in unexpected ways. Is it acceptable to write exactly to the spec and to leave anyone who needs custom code to their own devices?)
- What about asyncio integration? Is that mandatory for new protocol code? Optional? How important? Can it be asyncio-only?
- As a follow-on, what about integration with other stdlib modules? Does the new OAuth module have to work with all stdlib HTTP clients? Only one? Is it its own client you use directly?
- What happens if/when the protocol is revised?
- What happens if/when the maintainers move on from the project?
- Do we also maintain an out-of-tree backport for users on older Pythons? If not, is it acceptable for those users to have older versions of the library unless they upgrade their whole Python distribution?
This isn’t me disagreeing with you, just me pointing out that the fuzziness around this makes me nervous. It has been my experience that a large number of protocol implementations in the standard library are already struggling to meet their maintenance goals, and I’d be pretty reluctant about wanting to add to that burden.
I can understand that. I definitely think you and I have disagreements on the best way to solve this problem (and that we’re unlikely to resolve them in this thread!), but I certainly acknowledge that this use case is real and important. I think my biggest disagreement on this use case is simply about scope: at what point does a use become too niche to be supported by the standard library?
Fair enough. Cory

On 17 November 2016 at 12:27, Cory Benfield <cory@lukasa.co.uk> wrote:
This isn’t me disagreeing with you, just me pointing out that the fuzziness around this makes me nervous. It has been my experience that a large number of protocol implementations in the standard library are already struggling to meet their maintenance goals, and I’d be pretty reluctant about wanting to add to that burden.
The fuzziness is a clear and definite issue here. My perspective is that thus far, Python (specifically the core dev team[1]) has done a good job of balancing that fuzziness. I don't expect this to ever be a simple decision to make. All I ask is that we avoid countering a complex decision process with an over-simple guideline (specifically, that "things don't need to go into the stdlib because of pip/PyPI"[2]). Regardless of how things go in the long term, I think it's good to keep the debates open in this area, and I appreciate your comments. I'll certainly be thinking about my personal answers to the questions you raised, and I expect I'll change at least some of my views as a result. Paul
[1] Even though I'm a core dev, I view myself as a "normal user" in this context, and take no personal credit for the scope of the stdlib.
[2] Ironically, I'm also a pip developer, so if I seem confused, I claim the right to be :-)

Agreed. For what it’s worth, I’ll almost always find myself on the “let’s not add it to the stdlib” side of that argument, but I’m entirely willing to lose those arguments. I think we’re best served by having voices on both sides of the debate who believe themselves to be right in the *general* case but are willing to treat each case on the merits. There are certainly lots of requests for addition to the stdlib I have no objection to: for example, data structure and algorithm implementations in the standard library almost always seem like no-brainers to me. So I agree, I’m going to keep an eye on this space as we move forward, and for my part I promise to treat each case on the merits, despite my general belief of “small is beautiful”. Cory

On 17 November 2016 at 21:35, Paul Moore <p.f.moore@gmail.com> wrote:
Not just Python packaging - open source publishing in general, and one of the big metrics we have providing evidence of this is hosted package growth rates. Some registries try to present exponential growth in the number of hosted packages as a good thing, but they're often wrong to do so: in most cases, that kind of exponential growth is more likely to represent a failure of software discovery mechanisms (so folks are publishing their own custom solutions to previously solved problems rather than adopting existing tools as "good enough") than it is actual growth in the number of different problem domains with readily available published toolkits for tackling them. That's fine in a software-as-creativity-and-play context, but it's a problem in the software-as-a-means-to-an-end mindset that is applicable to most professional development activities. The one upside I see to the current state of affairs is that this problem isn't *new* - it's existed for as long as we've had software, it was just hidden away behind the walls of the institutions writing custom in-house software. Now that more software publication and consumption activities are instead starting to happen in the open, the problem can be quantified, and various automated techniques brought to bear on tackling it. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

No offence taken! :-) But you should distinguish between OAuth2 and OIDC. OIDC is a profile of OAuth2 for use in cases where you need not only authorization (and access tokens) but also authentication and/or user info.
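As a rough illustration of that distinction: an OAuth2 token response carries an access token, while an OIDC response additionally carries an ID token, a JWT whose payload holds claims about the authenticated user (such as `iss`, `sub` and `aud`). The sketch below uses only the standard library to read the claims of a made-up, unsigned token. The issuer, subject and client values are hypothetical, and the function deliberately performs no signature or claim validation, which any real relying party must do (that is the part a tested library such as pyoidc exists to get right).

```python
import base64
import json


def b64url_decode(segment: str) -> bytes:
    """Decode a base64url segment, restoring the padding that JWTs strip."""
    padding = "=" * (-len(segment) % 4)
    return base64.urlsafe_b64decode(segment + padding)


def b64url_encode(raw: bytes) -> str:
    """Encode bytes as unpadded base64url, as used in JWT segments."""
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def unverified_claims(id_token: str) -> dict:
    """Return the payload claims of a JWT WITHOUT verifying anything."""
    _header, payload, _signature = id_token.split(".")
    return json.loads(b64url_decode(payload))


# Build a made-up, unsigned token purely for demonstration purposes.
header = b64url_encode(json.dumps({"alg": "none"}).encode("utf-8"))
payload = b64url_encode(
    json.dumps(
        {"iss": "https://op.example.com", "sub": "user-123", "aud": "my-client"}
    ).encode("utf-8")
)
fake_id_token = f"{header}.{payload}."

claims = unverified_claims(fake_id_token)
print(claims["iss"], claims["sub"], claims["aud"])
```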
As the current lead maintainer of requests-oauthlib, let me say publicly and loudly that I’d love to have pyoidc replace requests-oauthlib. I *hate* requests-oauthlib. I maintain it literally only to prevent it falling into disrepair because it is extremely widely used. I would love a better library to come along so that we can sunset requests-oauthlib. I am entirely prepared to believe that Roland’s module is better than requests-oauthlib: it would be hard for it not to be.
:-)
However, *right now*, pyoidc does not have anything like a majority (or even a plurality) of mindshare amongst Python developers writing OAuth clients. So why should pyoidc be added to the standard library over the competing implementations? The only reason I can see to add it is if it is a better implementation than its competitors, and the python-dev community believe that developers using the competitor implementations would be better served using pyoidc instead. Is that the case? Do we have some objective measure of this?
The only possible objective measurement I can see would be testing the implementation for standard compliance.

On Thu, Nov 17, 2016 at 11:51 PM, Roland Hedberg <roland@catalogix.se> wrote:
When you're looking at oauth2, there are myriad uses for it, and thus large numbers of people looking for the module. But OIDC is something I had never heard of until this thread (turns out it's something built on top of OAuth2). Your module may well be best-in-show for OIDC (unproven, but assume it for the nonce), but unless it's also best-in-show for OAuth2, it's not going to have the broad draw/appeal that I would hope for in a new stdlib module. Perhaps the best step forward is to publish blog posts demonstrating how your module compares to other OAuth2 libraries. That would put the module name alongside various keywords that people will search for, and thus improve its visibility. Consider this thought process, which I'd say is fairly typical:
1) I want to use Fred's Wonderful Spamination API.
2) FWSA's docs say that I need to use this thing called OAuth.
3) What's OAuth? How do I use it? Search the web.
4) Oh, there's OAuth1 and OAuth2. Which should I use? Ahh, FWSA's docs say OAuth2. Okay.
5) I need a Python module that does OAuth2.
6) Search the web, or search PyPI?
Personally, when I hit step 6, I search the web. PyPI search is exhaustive but not very usefully ranked (for this purpose). Searching for a keyword or protocol will give undue weight to a module whose name is simply that word, even if that module is terrible, unmaintained, etc, etc. Properly-ranked web search results are generally more useful in pointing me to the appropriate package, even if they're telling me to use something with a very different name. (Consider a search for "python http". You'll get httplib/http.client, but shortly after that, you get pointed to 'requests'.) As another bonus, blog posts of that nature will help to explain to more experienced devs "why should this matter to me". People who've already used requests-oauthlib are unlikely to reach for a new and unproven package without a good reason. So give them that reason! :) Also, as I mentioned earlier, the Python Wiki may well have an appropriate spot for this to be mentioned. It's worth a check. ChrisA

On 17 November 2016 at 14:45, Chris Angelico <rosuav@gmail.com> wrote:
Additionally, I look for simple usage examples. When I did search for OAuth, I got lots of hits for libraries, some even included "how to add OAuth to your Flask app" examples. But not one showed me how I should call a web service that uses OAuth from the Python interpreter prompt using that library. Contrast the first page of the requests documentation:
    >>> import requests
    >>> r = requests.get('https://api.github.com/user', auth=('user', 'pass'))
    >>> r.status_code
    200
    >>> r.headers['content-type']
    'application/json; charset=utf8'
    >>> r.encoding
    'utf-8'
    >>> r.text
    u'{"type":"User"...'
    >>> r.json()
    {u'private_gists': 419, u'total_private_repos': 77, ...}
With that, I immediately see how to use the code. Something similar with an OAuth using service would be what I'm looking for. I'd consider that sort of use case focused documentation as a minimum for any library that was looking to be included in the stdlib. I know I've been arguing earlier in this thread that "OAuth may be a good candidate for the stdlib". Here, what I'm saying is "... but I don't see any library implementing it that's ready for stdlib inclusion". While oic may be highly standards-compliant, IMO it's not ready for stdlib inclusion without user-focused design and documentation. On the other hand, nor was any other OAuth package I found via a quick search. Paul
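For comparison, a script-level example of that kind might look roughly like the sketch below. It is not taken from pyoidc, oauthlib or any other library mentioned in this thread. It uses only the standard library and assumes the simplest OAuth2 flow, the client credentials grant, in which a script exchanges a client id and secret for an access token and then sends that token as a bearer token. The URLs, credentials and function names are hypothetical, and a real service may require a different grant type or extra parameters.

```python
import base64
import json
import urllib.parse
import urllib.request


def build_token_request(token_url, client_id, client_secret):
    """Build (but do not send) an OAuth2 client-credentials token request."""
    body = urllib.parse.urlencode({"grant_type": "client_credentials"}).encode("ascii")
    # The client authenticates with HTTP Basic auth. Simplification: assumes
    # the id and secret contain only characters that need no extra encoding.
    basic = base64.b64encode(f"{client_id}:{client_secret}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        token_url,
        data=body,  # supplying data makes urllib issue a POST
        headers={
            "Authorization": f"Basic {basic}",
            "Content-Type": "application/x-www-form-urlencoded",
        },
    )


def build_api_request(api_url, access_token):
    """Build a request that presents the access token as a bearer token."""
    return urllib.request.Request(
        api_url, headers={"Authorization": f"Bearer {access_token}"}
    )


def fetch_json(request):
    """Send a prepared request and decode the JSON response body."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)


# Usage against a hypothetical service (needs network access and real credentials):
# token = fetch_json(build_token_request("https://auth.example.com/token", "my-id", "my-secret"))
# data = fetch_json(build_api_request("https://api.example.com/v1/spam", token["access_token"]))
```

Flows that involve a user and a browser redirect, which is what OIDC authentication normally requires, are considerably more involved than this.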

Cory Benfield wrote:
Perhaps there could be a curated area on PyPI, maintained by core developers or people appointed by them, where the packages considered best-in-class would be placed. Instead of lobbying to get a package into the stdlib (a very high barrier to cross) people would then have the option of lobbying to get it stamped as "recommended" on PyPI. -- Greg

On Thu, Nov 17, 2016 at 8:06 AM, Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
Or on the Python wiki: https://wiki.python.org/moin/FrontPage A page there could highlight the best PyPI packages. ChrisA

Cory Benfield writes:
I think the core question you need to answer for this proposal is: why is “pip install oic” not easy-enough reach?
My first guess would be "some enterprises use OAuth internally for the same reason they have draconian approval policies". More straightforwardly, this is the kind of battery that enterprises which make it hard to use pip seem likely to value. Nick's response is formally correct, but if requests is so important, it's likely to already be on the approved list. I would assume that pyoidc also implements the client side, so it is useful even if requests is absent. But I am not a draconian security policy QA/security reviewer. I'd take anything Paul Moore says pretty seriously, as he operates in such an environment.

On 17 November 2016 at 12:42, Stephen J. Turnbull <turnbull.stephen.fw@u.tsukuba.ac.jp> wrote:
In that context, the problem is the old "batteries that leak acid everywhere can be worse than no batteries at all" one: we know from painful experience with the SSL module that the standard library's typical release and adoption cycle can be seriously problematic when it comes to network security related modules. However, when it comes to draconian security policies, *transitive recommendations have power*: if CPython is approved, and python-dev collectively says "we recommend pip, virtualenv, and requests", then folks in locked down environments can use that as evidence to suggest that those other modules are also trustworthy (particularly when there are commercial software vendors shipping them). In the case of something like OIDC, if the requests, Flask, Django and Pyramid developers were all inclined to recommend the same library for server and client implementations, then that would similarly be sufficient justification in many environments to bring the component in as a new dependency. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

Nick Coghlan writes:
In that context, SSL is a somewhat special case because of its dependence on OpenSSL and because of Apple's U+1F4A9-headed attitude toward fixing something they distribute. If pyoidc is pure Python, I would expect as many releases as we have branches supporting it within 6-12 weeks. Less time, if a truly nasty security bug. No? Sure, that's problematic for any sites that haven't yet approved a Python version with pyoidc in it, but they're problematic period.
I understand that argument, but I can assure you that my employer does not. Its security policies tend to be both draconian and ineffective. Is the "python-dev and RHEL recommend it" evidence all that effective for "most" sites with "software must be approved before installing" policies? (This argument could easily go against me, if "draconian" sites tend to drag on approving Python point releases as well as on "new" PyPI modules.) Steve

participants (8)
- Chris Angelico
- Cory Benfield
- Ethan Furman
- Greg Ewing
- Nick Coghlan
- Paul Moore
- Roland Hedberg
- Stephen J. Turnbull