[Python-ideas] Python's Source of Randomness and the random.py module Redux

Mon Sep 14 21:14:05 CEST 2015

On 14 September 2015 at 16:32, Ian Cordasco <graffatcolmingov at gmail.com> wrote:
>> I fully expect the response to this to be "just because it'll take
>> time, doesn't mean we should do nothing". Or "even if it just fixes it
>> for one or two people, it's still worth it". But *that's* the argument
>> I don't find compelling - not that a fix won't help some situations,
>> but that because it's security, (a) all the usual trade-off
>> calculations are irrelevant, and (b) other proposed solutions (such as
>> education, adding specialised modules like a "shared secret" library,
>> etc) are off the table.
>
> They're not irrelevant. I personally think they're of a lower impact
> to the discussion, but the reality is that the people who are
> educating others are few and far between. If there are public domain
> works, free tutorials, etc. that all advocate using a module in the
> standard library and no one can update those, they still exist and are
> still recommendations. People prefer free to correct when possible
> because there's nothing free to correct them (until they get hacked or
> worse). Do we have a team in the Python community that goes out to
> educate people for free on security-related best practices? I haven't
> seen them. The best we have is a few people on crufty mailing lists
> like this one trying to make an impact because education is a much
> larger and harder to solve problem than making something secure by
> default.
>
> Perhaps instead of bickering like fools on a mailing list, we could
> all be spending our time better educating others.

You may well be right. Personally, I'm pretty sick of the way all of
these debates degenerate into content-free reiteration of the same old
points, and unwillingness to hear other people's views.

Here's a point - it seems likely that the people arguing for this
change are of the opinion that I'm not appreciating their position.
(For the record, I'm not being deliberately obstructive, in case anyone
thought otherwise. In my view at least, I don't understand the
security guys' position.) Assuming that's the case, then I'm probably
one of the people who needs educating. But I don't feel like anyone's
trying to educate me, just that I'm being browbeaten until I give in.

Education != indoctrination.

> That said, I can't
> make that decision for you just like you can't make that for me.

Indeed. Personally, I spend quite a lot of time in my day job (closed
source corporate environment) trying to educate people in sane
security practices, usually ones I have learned from people in
communities like this one. One of the biggest challenges I have is
stopping people from viewing security as "an annoying set of rules
that get in the way of what I'm trying to do". But you would not
believe the sorts of things I see routinely - I'm not willing to give
examples or even outlines on a public mailing list because I can't
assess whether such information could be turned into an exploit. I can
say, though, that crypto-safe RNGs are *not* a relevant factor :-)

At its best, good security practice should *help* people write
reliable, easy to use systems. Or at a minimum, not get in the way.
But the PR message needs always to be "I understand the constraints
you're dealing with", not "you must do this for your own good".
Otherwise the "follow the rules until the auditors go away" attitude
just gets reinforced. Hence my focus on seeing proof that breakages
are justified *in the context of the target audience I am responsible
for*.

Conversely, you're right that I can't force anyone else to try to
educate people in good security practices, however much better than me
at it I might think they are. In actual fact, though, I think a lot of
people do a lot of good work educating others - as I say, most of what
I've learned has been from lists like these.

>> Honestly, this type of debate doesn't do the security community much
>> good - there's too little willingness to compromise, and as a result
>> the more neutral participants (which, frankly, is pretty much anyone
>> who doesn't have a security agenda to promote) end up pushed into a
>> "reject everything" stance simply as a reaction to the black and white
>> argument style.
>
> Except you seem to have missed much of the compromises being discussed
> and conceded by the security minded folks.

OK, you have a point - there have been changes to the proposals. But
there are fundamental points that have (as far as I can see) never
been acknowledged. As a result, the changes feel less like compromises
based on understanding each other's viewpoints, and more like repeated
attempts to push something through, even if it's not what was
originally proposed. (I *know* this is an emotional position - please
understand I'm fed up and not always managing to word things
objectively).

Specifically, I have been told that I can't argue my "convenience"
over the weight of all the other people who could fall into security
traps with the current API. Let's review that, shall we?

* My argument is that breaking backward compatibility needs to be
justified. People have different priorities. "Security risks should be
fixed" isn't (IMO) a free pass. Why should it be? "Windows
compatibility issues should be fixed" isn't a free pass. "PyPy/Jython
compatibility issues should be fixed" isn't a free pass. Forcing me to
adjust my priorities so that I care about security when I don't want
(or IMO need) to isn't acceptable.
* The security arguments seem to be largely in the context of web
application development (cookies, passwords, shared secrets, ...)
That's not the only context that matters.
* As I said above, in my experience, a compatibility break "to make
things more secure" is seen as equating security with inconvenience,
and can actually harm attempts to educate users in better security
practices.
* In many environments, reproducibility of random streams is
important. I'm not an expert on those fields, although I've hit some
situations where seeding is a requirement. As far as I am aware, most
of those situations have no security implications. So for them, the
PEP is all cost, no benefit. Sure the cost is small, but it's
non-zero.
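
A minimal sketch of the reproducibility point in the last bullet, using
only the standard library as it exists today (not the API of any
proposal under discussion). The helper name `simulate` and the seed
value are hypothetical, chosen purely for illustration.

```python
import random


def simulate(seed, n=5):
    """Roll n six-sided dice using a dedicated, explicitly seeded generator."""
    # random.Random is the deterministic Mersenne Twister generator:
    # the same seed always yields the same stream.
    rng = random.Random(seed)
    return [rng.randint(1, 6) for _ in range(n)]


# Two runs with the same (hypothetical) seed produce identical results,
# which is what simulations and repeatable tests rely on.
run_a = simulate(12345)
run_b = simulate(12345)
assert run_a == run_b

# random.SystemRandom draws from the OS entropy source instead.
# Its seed() method is accepted but has no effect, so the stream
# cannot be replayed.
sys_rng = random.SystemRandom()
sys_rng.seed(12345)
first = [sys_rng.random() for _ in range(5)]
sys_rng.seed(12345)
second = [sys_rng.random() for _ in range(5)]
streams_differ = first != second  # equal only with negligible probability
```

Code that needs a replayable stream depends on the first behaviour;
code that generates secrets wants the second. Which of the two the
module-level functions should default to is the trade-off at issue.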

How come the web application development community is the only one
whose voice gets heard? Is it because the fact that they *are*
public-facing, and frequently open-source, means that data is
available? So "back it up with facts or we won't believe you" becomes
a debating stance? I'm not arguing that everyone should be allowed to
climb up on their soapbox and rant - but I would like to think that
bringing a different perspective to the table could be treated with
respect and genuine attempts to understand, and that "in my experience"
could be viewed as an offer of information, not as an attempt to bluff
on a worthless hand.

Just to be clear, I think the current proposal (Nick's pre-PEP) is
relatively unobtrusive, and unlikely to cause serious compatibility
issues. I'm uncomfortable with the fact that it feels like yet another
"imposition in the name of security", and while I'm only one person I
feel that I'm not alone. I'm concerned that the people pushing
security seem unable to recognise that people becoming sick of such
changes is a PR problem they need to address, but that's their issue
not mine. So I'm unlikely to vote against the proposal, but I'll feel
sad if it's accepted without a more balanced discussion than we've
currently had.

On the meta-issue of how debates like this are conducted, I think
people probably need to listen more than they talk. I'm as guilty as
anyone else here. But in particular, when multiple people all end up
responding to rebut *every* counter-argument, essentially with the
same response, maybe it's time to think "we're in the majority here,
let's stop talking so much and see if we're missing anything from what
the people with other views are saying". He who shouts loudest isn't
always right. Not necessarily wrong, either, but sometimes it's bloody
hard to tell one way or the other, if they won't shut up long enough
to analyze the objections.

> Personally, names that
> describe the outputs of the algorithms make much more sense to me than
> "Seedless" and "Seeded" but no one has really bothered to shave that
> yak further out of a desire to compromise and make things better as a
> whole.

I'm frankly long past caring. I think we'll end up with whatever was
on the table when people got too tired to argue any more.

> Much of the lack of gradation has come from the opponents to
> this change who seem to think of security as a step function where a
> subjective measurement of "good enough for me" counts as secure.

Wait, what? It's *me* that's claiming that security is a yes/no
thing??? When all I'm hearing is "education isn't sufficient",
"dedicated libraries aren't sufficient", "keeping a deterministic RNG
as default isn't an option"? And when I'm suggesting that fixing the
PRNG use in code that misuses a PRNG may not be the only security
issue with that code? I knew the two sides weren't communicating, but
this statement staggers me. We have clearly misunderstood each other
even more fundamentally than I had thought possible :-(

Thinking hard about the implications of what you said there, I start
to see why you might have misinterpreted my stance as the black and
white one. But I have absolutely no idea how to explain to you that I
find your stance equally (and before I took the time to think through
what your statement implied, even more) so.

There's little more I can say. I'm going to take my own advice now,
and stop talking. I'll keep listening, in the hope that either this
post or something else will somehow break the logjam, but right now
I'm not sure I have much hope of that.

Paul