[Python-Dev] PEP 476: Enabling certificate validation by default!

Nick Coghlan ncoghlan at gmail.com
Sun Aug 31 08:09:26 CEST 2014


On 31 August 2014 12:21, R. David Murray <rdmurray at bitdance.com> wrote:
> On Sun, 31 Aug 2014 03:25:25 +0200, Antoine Pitrou <solipsis at pitrou.net> wrote:
>> On Sun, 31 Aug 2014 09:26:30 +1000
>> Nick Coghlan <ncoghlan at gmail.com> wrote:
>> > In relation to changing the Python CLI API to offer some of the wget/curl
>> > style command line options, I like the idea of providing recipes in the
>> > docs for implementing them at the application layer, but postponing making
>> > the *default* behaviour configurable that way.
>>
>> I'm against any additional environment variables and command-line
>> options. It will only complicate and obscure the security parameters of
>> certificate validation.

As Antoine says here, I'm also opposed to adding more Python specific
configuration options. However, I think there may be something
worthwhile we can do that's closer to the way browsers work, and has
the significant benefit of being implementable as a PyPI module first
(more on that in a separate reply).

>> The existing knobs have already been mentioned in this thread, I won't
>> mention them here again.
>
> Do those knobs allow one to instruct urllib to accept an invalid
> certificate without changing the program code?

Only if you add the specific certificate concerned to the certificate
store that Python is using (which PEP 476 currently suggests will be
the platform wide certificate store). Whether or not that is an
adequate solution is the point currently in dispute.
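
To make that concrete, here's a minimal sketch for recent Python 3
(the certificate path and URL are hypothetical examples) of the
application-level alternative: trusting a specific internal CA or
self-signed certificate while keeping full validation enabled, rather
than disabling verification outright:

    import ssl
    import urllib.request

    # Build a context that still performs full certificate
    # validation, but additionally trusts an internal CA (or a
    # specific self-signed certificate) loaded from a PEM file.
    context = ssl.create_default_context()
    context.load_verify_locations(cafile="/etc/pki/internal-ca.pem")

    with urllib.request.urlopen("https://intranet.example.com/",
                                context=context) as response:
        print(response.status)

On most non-Windows builds a similar effect is available without code
changes by pointing the SSL_CERT_FILE environment variable at a
suitable certificate bundle (noting that it replaces, rather than
extends, the default trust store).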

My view is that the core problem/concern we need to address here is
how we manage the migration away from a network communication model
that trusts the network by default. That transition will happen
regardless of whether or not we adapt Python as a platform - the
challenge for us is how we can address it in a way that minimises the
impact on existing users, while still ensuring future users are
protected by default.

This would be relatively easy if we only had to worry about the public
internet (since we're followers rather than leaders in that
environment), but we don't. Python made the leap into enterprise
environments long ago, so we not only need to cope with corporate
intranets, we need to cope with corporate intranets that aren't
necessarily being well managed. That's what makes this a harder
problem for us than it is for a new language like Go that was created
by a public internet utility, specifically for use over the public
internet - they didn't *have* an installed base to manage, they could
just build a language specifically tailored for the task of running
network services on Linux, without needing to account for any other
use cases.

The reason our existing installed base creates a problem is because
corporate network security has historically focused on "perimeter
defence": carving out a trusted island behind the corporate firewall
where users and other computer systems could be "safely" assumed not
to be malicious.

As an industry, we have learned through harsh experience that *this
model doesn't work*. You can't trust the network, period. A corporate
intranet is *less* dangerous than the public internet, but you still
can't trust it. This "don't trust the network" ethos is also
reinforced by the broad shift to "utility computing", where more and
more companies are running distributed networks in which some of their
systems are actually running on vendor-provided servers. The "network
perimeter" is evaporating, as corporate "intranets" start to look a
lot more like recreations of the internet in miniature, with the only
difference being the existence of more formal contractual
relationships than typically exist between internet peers.

Unfortunately, far too many organisations (especially those outside
the tech industry) still trust in perimeter defence for their internal
network security, and hence tolerate the use of unsecured connections,
or skipping certificate validation internally. This is actually a
really terrible idea, but it's still incredibly common due to the
general failure of the technology industry to take usability issues
seriously when we design security systems - doing the wrong "unsafe"
thing is genuinely easier than doing things right.

We have enough evidence now to be able to say (as Alex does in PEP
476) that it has been comprehensively demonstrated that "opt-in
security" really just means "security failures are common and silent
by default". We've seen it with C buffer overflow vulnerabilities,
we've seen it with plain text communication links, we've seen it with
SSL certificate validation - the vast majority of users and developers
will just run with the default behaviour of the platform or
application they're using, even if those defaults have serious
problems. As the saying goes, "you can't document your way out of a
usability problem" - uncovered connections, or connections that are
vulnerable to a man-in-the-middle attack, appear to work for all
functional purposes; they're just vulnerable to monitoring and
subversion.

It turns out "opt-out security with a global off switch" isn't
actually much better when it comes to changing *existing* behaviours,
as people just turn the new security features off and continue on as
they were, rather than figuring out what dangers the new security
system is trying to warn them about, and pre-emptively addressing
them. Offering that kind of flag may sometimes
be a necessary transition phase (or we wouldn't have things like
"setenforce 0" for SELinux) but it should be considered an absolute
last resort.
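
For concreteness, the kind of last-resort switch under discussion
looks like the opt-out that PEP 476 documents for legacy applications
that genuinely cannot migrate yet - a single process-wide monkeypatch
of the ssl module:

    import ssl

    # PEP 476's documented escape hatch: revert the stdlib HTTPS
    # clients to the old unverified behaviour for the whole process.
    # A blunt, last-resort instrument - fixing the certificates is
    # the real fix.
    ssl._create_default_https_context = ssl._create_unverified_context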

In the specific case of network security, we need to take
responsibility as an industry for the endemic failure of the
networking infrastructure to provide robust end user security and
privacy, and figure out how to get to a state where encrypted and
authenticated network connections are as easy to use as uncovered
ones. I see Alex's PEP (along with the preceding work on the SSL
module that makes it feasible) as a significant step in that
direction.

At the same time, as noted above, we need to account for the fact
that most existing organisations still trust in perimeter defence for
their internal network security, and hence tolerate (or even actively
encourage) the use of unsecured connections, or skipping certificate
validation, internally.

We can, and should, tackle this as a design problem, and ensure PEP
476 covers this scenario adequately. We also need to make sure we do
it in a way that avoids placing any significant additional burdens on
teams that may already be trying to explain what "long term
maintenance" means, and why the flow of free feature releases for the
Python 2 series stopped.

This message is already rather long, however, so I'll go into more
technical details in a separate reply to David's question.

Regards,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

