Re: [Python-Dev] PEP 476: Enabling certificate validation by default!

4 Sep 2014

On 4 September 2014 10:00, Ethan Furman wrote:
> On 09/03/2014 04:36 PM, Antoine Pitrou wrote:
>> On Thu, 4 Sep 2014 09:19:56 +1000
>> Nick Coghlan wrote:
>>>> Python is routinely updated to bugfix releases by Linux distributions
>>>> and other distribution channels, you usually have no say over what's
>>>> shipped in those updates. This is not like changing the major version
>>>> used for executing the script, which is normally a manual change.
>>>
>>> We can potentially deal with the more conservative part of the user base
>>> on the redistributor side - so long as the PEP says it's OK for us to not
>>> apply this particular change if we deem it appropriate to do so.
>>
>> So people would believe python.org that they would get HTTPS cert
>> validation by default, but their upstream distributor would have
>> disabled it for them? That's even worse...
>
> I agree.  If the vendors don't want to have validation by default, they
> should stick with 2.7.8.

Yes, that's the way it would work in practice - we'd call it 2.7.8,
and backport fixes from upstream 2.7.x as needed (as someone put it to
me recently, a useful way to think of component version numbers in
RHEL is as referring to the *oldest* pieces of code in that
component).

I've spent quite a bit more time considering the proposal, and I'm now
satisfied that making this change upstream isn't likely to cause any
major problems, because the folks most likely to suffer from it
aren't likely to even be on 2.7 yet.

While we managed not to be last this time, the RHEL/CentOS ecosystem
is still a useful benchmark for the roll out of Python versions into
conservative enterprise organisations, and the more conservative users
within *that* ecosystem are likely to wait for the x.1 release (at the
earliest) rather than upgrading as soon as x.0 is out. RHEL 7.0 only
came out in June, so most of those conservative environments are still
going to be on Python 2.6 in RHEL 6. While we shipped 2.7 support well
before the release of RHEL 7 as part of Software Collections and
OpenShift, the kinds of environments where properly validating SSL by
default may cause problems aren't likely to be on the leading edge of
adopting new approaches to software deployment like SCL and PaaS.

Fixing the HTTPS validation defaults would have several significant
positive consequences:

- lowers a major barrier to Python adoption (and continued usage) for
public internet focused services
- fixes a latent security defect for Python applications operating
over the public internet
- fixes a latent security defect for Python applications operating in
a properly configured intranet environment
- reveals a latent security defect for Python applications
communicating with improperly configured public internet services
- reveals a latent security defect for Python applications operating
in improperly configured intranet environments

The debate is thus solely over how to manage the consequences of the
last two, since going from "silent failure" to "noisy failure" looks
like going from "working" to "broken" to someone who isn't already
intimately familiar with the underlying issues.

That question needs to be considered separately for 3 different versions:

- 3.5
- 3.4
- 2.7

3.5 is not controversial: the answer is that the standard library's
handling of HTTPS URLs should change to verify certificates properly.
No ifs, buts, or maybes - Python 3.5 should automatically verify all
HTTPS connections, with explicit developer action required to skip (or
otherwise relax) the validation check.
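
As a rough illustration of what "verify by default, explicit action to
opt out" can look like with the ssl module as it already exists in 3.4
(a sketch only, not the PEP's implementation; the describe() helper is a
hypothetical name used just for this example):

```python
import ssl

# Verified by default: create_default_context() requires a valid
# certificate chain and a matching hostname.
verified = ssl.create_default_context()

# Explicit opt-out: the developer has to switch off both checks by hand.
# check_hostname must be disabled before verify_mode can become CERT_NONE.
unverified = ssl.create_default_context()
unverified.check_hostname = False
unverified.verify_mode = ssl.CERT_NONE


def describe(context):
    """Return (verifies_certificates, checks_hostname) for an SSLContext."""
    return (context.verify_mode == ssl.CERT_REQUIRED, context.check_hostname)
```

Either context can then be handed to an HTTPS client explicitly, for
example through the context argument of http.client.HTTPSConnection, so
skipping validation is a visible choice in the calling code rather than
a silent default.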

So far, we have assumed that 3.4 will get at most a warning. However,
I have changed my mind on that, because Python 3 is still largely an
early-adopter-driven technology (it's making inroads into more
conservative environments, but it's still at least a few years away
from catching up to Python 2 on that front). As a result, the kinds of
environments RDM and I are worried about will generally *not* be using
Python 3, or if they are, it will be for custom scripts that they can
fix. I wouldn't suggest actually making that change without getting an
explicit +1 from the Canonical folks (since 3.4 is in Ubuntu LTS
14.04), but I would now personally be +1 on just *fixing it* in 3.4.2,
rather than doing a bunch of additional development work purely so we
can make folks wait another year for the Python 3 standard library to
correctly handle HTTPS.

That leaves Python 2.7, and I have to say I'm now persuaded that a
backport (including any required httplib and urllib features) is the
right way to go. One of the tasks I'd been dreading as a follow-on
from PEP 466 was organising the code audit to make sure our existing
Python 2 applications are properly configuring SSL. If we instead
change Python 2.7.9 to validate certificates by default, then the need
to do that audit *goes away*, replaced by the far more mundane task
of doing integration testing on 2.7.9, which we'd have to do *anyway*.
Systematically solving the Python 2 HTTPS problem ceases to be
something special, and instead just becomes a regular upstream bug fix
that will be covered by our normal regression testing processes.
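
To give a feel for why that audit is unappealing, here is a deliberately
crude sketch of the first pass it might start with: scanning source text
for SSL setups that skip validation. The pattern list and the
find_suspect_lines() name are assumptions made up for this example, and
a real audit would need far more than a line-by-line regex scan.

```python
import re

# Hypothetical patterns for this sketch; a real audit would need many more.
SUSPECT_PATTERNS = {
    # Validation explicitly switched off.
    "CERT_NONE": re.compile(r"\bCERT_NONE\b"),
    # wrap_socket() called on one line without a cert_reqs argument,
    # which leaves certificate checking at its permissive default.
    "wrap_socket without cert_reqs": re.compile(
        r"\bwrap_socket\((?![^)]*cert_reqs)"
    ),
}


def find_suspect_lines(source):
    """Return (line_number, label) pairs for lines matching a pattern."""
    hits = []
    for number, line in enumerate(source.splitlines(), start=1):
        for label, pattern in SUSPECT_PATTERNS.items():
            if pattern.search(line):
                hits.append((number, label))
    return hits
```

Even this toy version misses multi-line calls, aliased imports and
third party wrappers, which is the point: changing the default upstream
covers all of those cases without anyone having to find them first.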

There's also the fact that Python 2.7.9 is becoming, in effect, the
2.8 several folks have been asking for (from an HTTPS perspective,
anyway), but done in such a way that it feeds more cleanly into the
redistributor channels, rather than having the multi-year lead time
(and massive additional overhead) that a 2.8 release would suffer.

On the redistributor side, we (as in Red Hat) *specifically* offer
paid services to help users manage the risks associated with this kind
of change (for example
https://access.redhat.com/support/policy/updates/errata#Extended_Update_Supp...),
and we charge them extra for it because the extra complexity it
introduces *is* a pain to support.

If, as a vendor, we're not willing to do something like that as part
of our base subscription, then I don't think upstream should feel any
obligation to do it for free - the whole *point* of redistributors
from a community perspective is for us to handle the boring & annoying
stuff whenever possible, so that upstream don't need to worry about it
so much. Seriously - we *don't want* the extremely high-touch users
that need lots of reassurance as direct upstream consumers, as meeting
their expectations requires such a high level of responsiveness and
stress that it simply isn't ethical to ask people to do it for free.
When someone genuinely *wants* that kind of customer-vendor
relationship, trying to employ a colleague-colleague community style
relationship instead ends up being incredibly unpleasant for both
sides.

I freely admit that I'm heavily biased on this point - the fact that
self-organising community projects are understandably less interested
in the boring & annoying bits that some users want or need is
ultimately what pays my salary. But this kind of thing is genuinely
difficult for a collaborative community-driven project to provide, and
I think it's reasonable to expect that people who want a slower pace
and/or greater selectivity in their upgrades pay for the privilege (or
at least rely on a slower moving, freely available, filtered and
curated platform environment like CentOS or Ubuntu LTS).

Regards,
Nick.

P.S. I actually believe this is also why we see so much open source
development being done on Mac OS X these days, and even a growing
presence of Windows - at the platform level, many open source devs
*don't want* a collegial relationship with a community, they want a
commercial relationship with a vendor so their computer "just works".
Take the mindset many of us have towards our OS, move up the stack a
few more layers, and we can see how many task-oriented application
developers might feel about programming language runtimes, web
frameworks, etc - for many folks, they're just tools that help to get
a job done, not communities to participate in (and that's OK - it just
means we need to fully consider the benefits of relying on commercial
rather than community focused approaches to meeting their needs).

-- 
Nick Coghlan   |   ncoghlan@gmail.com   |   Brisbane, Australia