[Python-Dev] BDFL ruling request: should we block forever waiting for high-quality random bits?
cory at lukasa.co.uk
Sun Jun 12 06:40:58 EDT 2016
> On 12 Jun 2016, at 07:11, Theodore Ts'o <tytso at mit.edu> wrote:
> On Sat, Jun 11, 2016 at 05:46:29PM -0400, Donald Stufft wrote:
>> It was a RaspberryPI that ran a shell script on boot that called
>> ssh-keygen. That shell script could have just as easily been a
>> Python script that called os.urandom via
>> https://github.com/sybrenstuvel/python-rsa instead of a shell script
>> that called ssh-keygen.
> So I'm going to argue that the primary bug was in how the systemd
> init scripts were configured. In general, creating keypairs at boot
> time is just a bad idea. They should be created lazily, in a
> just-in-time paradigm.
Agreed. I hope that if there is only one thing every participant has learned from this (extremely painful for all concerned) discussion, it’s that doing anything that requires really good random numbers should be delayed as long as possible on all systems, and should absolutely not be done during the boot process on Linux. Don’t generate key pairs, don’t make TLS connections, just don’t perform any action that requires really good randomness at all.
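As a rough illustration of the just-in-time approach, here is a minimal sketch in which secret material is generated on first use instead of at startup. The class name LazySecret is hypothetical, and it assumes a Python that ships the secrets module (3.6 or later); it is not taken from any of the programs discussed in this thread.

```python
import secrets


class LazySecret:
    """Hypothetical helper: create secret bytes on first use, not at startup."""

    def __init__(self, nbytes=32):
        self._nbytes = nbytes
        self._value = None  # nothing is drawn from the CSPRNG yet

    def get(self):
        # The CSPRNG is only touched here, the first time the secret is
        # actually needed, by which point the system has usually finished
        # booting and had a chance to gather entropy.
        if self._value is None:
            self._value = secrets.token_bytes(self._nbytes)
        return self._value
```

Constructing LazySecret() during boot costs nothing; only the first call to get() requires randomness, and later calls return the same bytes.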
> So some people will freak out when the keygen systemd unit hangs,
> blocking the boot --- and other people will freak out if the systemd
> unit doesn't hang, and you get predictable SSH keys --- and some wiser
> folks will be asking the question, why the *heck* is it not
> openssh/systemd's fault for trying to generate keys this early,
> instead of after the first time sshd needs host ssh keys? If you wait
> until the first time the host ssh keys are needed, then the system is
> fully booted, so it's likely that the entropy will be collected -- and
> even if it isn't, networking will already be brought up, and the
> system will be in multi-user mode, so entropy will be collected very
As far as I know we still only have three programs that encountered this problem: Debian’s autopkgtest (which worked around it by setting PYTHONHASHSEED=0), systemd-cron (which is moving from Python to Rust anyway), and cloud-init (not formally reported, but mentioned to me by a third party). It remains unclear to me why the systemd-cron service files can’t simply request to be delayed until the kernel CSPRNG is seeded: I guess systemd doesn’t have any way to express that constraint? Perhaps it should.
Of this set, only cloud-init worries me, and it worries me for the *opposite* reason to the one that worries Guido and Larry. Guido and Larry are worried that programs like cloud-init will be delayed by two minutes while they wait for entropy: that’s an understandable concern. I’m much more worried that programs like cloud-init may attempt to establish TLS connections or create keys during this two-minute window, leaving them staring down the possibility of performing “secure” actions with insecure keys.
This is why I advocate, like Donald does, for having *some* tool in Python that allows Python programs to crash if they attempt to generate cryptographically secure random bytes on a system that is incapable of providing them (which, in practice, can only happen on Linux systems). I don’t care how it’s spelled, I just care that programs that want to use a properly-seeded CSPRNG can error out effectively when one is not available. That allows us to ensure that Python programs that want to do TLS or build key pairs correctly refuse to do so when used in this state, *and* that they provide a clearly debuggable reason for why they refused. That allows the savvy application developers that Ted talked about to make their own decisions about whether their rapid startup is sufficiently important to take the risk.
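To make that concrete, here is a minimal sketch of one possible spelling, not a proposal for the actual API. It assumes a Python that exposes os.getrandom (3.6 or later, on Linux kernels that provide the getrandom() system call); the names strict_random_bytes and CSPRNGNotReadyError are hypothetical. With the GRND_NONBLOCK flag, os.getrandom raises BlockingIOError instead of blocking when the kernel CSPRNG has not been seeded yet, and the sketch turns that into a clearly named error.

```python
import os


class CSPRNGNotReadyError(RuntimeError):
    """Hypothetical error: the kernel CSPRNG has not been seeded yet."""


def strict_random_bytes(n):
    """Return n bytes from the kernel CSPRNG, or raise if it is unseeded."""
    getrandom = getattr(os, "getrandom", None)
    if getrandom is None:
        # Assumption: on platforms without os.getrandom, os.urandom is
        # taken as-is, since the unseeded state cannot be detected here.
        return os.urandom(n)

    chunks = []
    remaining = n
    while remaining > 0:
        try:
            # GRND_NONBLOCK: fail immediately rather than wait for seeding.
            chunk = getrandom(remaining, os.GRND_NONBLOCK)
        except BlockingIOError as exc:
            raise CSPRNGNotReadyError(
                "kernel CSPRNG is not seeded; refusing to generate "
                "cryptographic material"
            ) from exc
        chunks.append(chunk)
        remaining -= len(chunk)  # getrandom may return fewer bytes
    return b"".join(chunks)
```

A program that builds key pairs or sets up TLS could call strict_random_bytes(32) and let the exception propagate, which gives exactly the debuggable failure described above; a program that values rapid startup more could catch CSPRNGNotReadyError and decide for itself what to do.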