On Sat, Jun 11, 2016 at 05:46:29PM -0400, Donald Stufft wrote:
> It was a RaspberryPI that ran a shell script on boot that called ssh-keygen. That shell script could have just as easily been a Python script that called os.urandom via https://github.com/sybrenstuvel/python-rsa instead of a shell script that called ssh-keygen.
So I'm going to argue that the primary bug was in how the systemd init scripts were configured. In general, creating keypairs at boot time is just a bad idea. They should be created lazily, in a just-in-time paradigm.

Consider that if you assume that os.urandom can block, this isn't necessarily going to do the right thing either --- if you use getrandom and it blocks, and it's part of a systemd unit which is blocking further boot progress, then the system will hang for 90 seconds, and while it's hanging, there won't be any interrupts, so the system will be dead in the water, just like the original bug report complaining that Python was hanging when it was using getrandom() to initialize its SipHash.

At which point there will be another bug complaining about how Python was causing systemd to hang for 90 seconds, and there will be demand to make os.urandom no longer block. (Since by definition, systemd can do no wrong; it's always other programs that have to change to accommodate systemd. :-)

So some people will freak out when the keygen systemd unit hangs, blocking the boot --- and other people will freak out if the systemd unit doesn't hang, and you get predictable SSH keys --- and some wiser folks will be asking the question, why the *heck* is it not openssh/systemd's fault for trying to generate keys this early, instead of after the first time sshd needs host ssh keys?

If you wait until the first time the host ssh keys are needed, then the system is fully booted, so it's likely that the entropy will have been collected -- and even if it hasn't, networking will already be brought up, and the system will be in multi-user mode, so entropy will be collected very quickly.

Sometimes, we can't solve the problem at the Python level or at the kernel level. It will require security-savvy userspace/application programmers as well.

Cheers,

- Ted
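
To illustrate the just-in-time approach described above, here is a minimal Python sketch. It is not how openssh or systemd behave; the names LazyHostKey and seed_bytes are hypothetical, and seed_bytes merely stands in for a real keypair generator. The one real API it relies on is os.getrandom() with os.GRND_NONBLOCK (Linux, Python 3.6+), which raises BlockingIOError instead of hanging when the kernel pool is not yet initialized.

```python
import os
import threading


def seed_bytes(nbytes=32):
    """Hypothetical stand-in for key generation: fetch random bytes without blocking.

    On Linux this raises BlockingIOError if the kernel entropy pool is not
    yet initialized, so the caller can retry later instead of stalling boot.
    """
    if hasattr(os, "getrandom"):
        return os.getrandom(nbytes, os.GRND_NONBLOCK)
    # Assumption: on platforms without getrandom(), fall back to os.urandom.
    return os.urandom(nbytes)


class LazyHostKey:
    """Hypothetical helper: create key material on first use, not at boot."""

    def __init__(self, generate=seed_bytes):
        self._generate = generate
        self._lock = threading.Lock()
        self._key = None

    def get(self):
        # Nothing is generated until the first caller actually needs the key.
        with self._lock:
            if self._key is None:
                self._key = self._generate()
            return self._key


host_key = LazyHostKey()  # constructing this object does no random-number work
```

The first call to host_key.get() would typically happen when a connection arrives, by which point the system has booted and had a chance to collect entropy; later calls return the same cached value.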