> On 26 Jan 2017, at 01:33, Steve Dower <steve.dower(a)python.org> wrote:
> Looks good to me, but I wonder why we need to define all the algorithms in the PEP (by reference)? Could they also use a similar model to certificates, where the implementation provides a constructor that takes a string (in a format defined here, e.g. "OpenSSL style") and does the best it can to return something it will know how to use later? It involves some trust, but I honestly don't see a world where we end up with implementations deliberately trying to be incompatible with each other (which would seem to be the only reason to define the full enum ahead of time).
There’s a thread running through this PEP which I’d call a rejection of “stringly-typed” APIs. I have never liked OpenSSL’s cipher string format: it’s a constant source of footguns, because a given cipher string can produce an extremely unpredictable result. Because OpenSSL needs to tolerate the possibility that you are referring to ciphers that are only present in older or newer versions of OpenSSL, there is little error checking done on a cipher string. You can see this by running the openssl ciphers command on two very similar cipher strings:
    openssl ciphers "ECDHE+AESGCM:RSA+AESGCM"
    openssl ciphers "ECDHE+AESGCM:RSA+AESGMC"
Note that the typo in the second string (AESGMC instead of AESGCM) wipes out a whole bunch of cipher suites, but OpenSSL doesn’t flag this as a problem because it doesn’t know whether AESGMC is a thing that might appear in the future or have appeared in the past. The only way to actually error check this is to compare the result against a hard-coded list of the cipher suites you expect, except of course at that point you may as well have just written the full list of cipher suites out or, more aptly, used a typed API. OpenSSL only errors out on a cipher string if it leads to no ciphers being supported, which is an extremely unlikely outcome for any project taking rigorous control of its cipher string.
The other problem here is that we need to define a grammar of how to translate the cipher string into a list of actual ciphers. This grammar needs to be forward-looking, which is a problem that OpenSSL is running directly into as we speak.
For this reason I’m inclined to lean towards the more verbose approach of just writing down what all of the cipher suites are in an enum. That way, it gets much easier to validate what’s going on. There’s still no requirement to actually support them all: an implementation is allowed to quietly ignore any cipher suites it doesn’t support. But that can no longer happen due to typos, because typos now cause AttributeErrors at runtime in a way that is very obvious and clear.
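To make the typo behaviour concrete, here is a minimal sketch of the enum approach. The class name CipherSuite, the BACKEND_SUPPORTED set and the usable_suites helper are illustrative assumptions rather than the PEP's actual definitions, and only three suites are listed; the values are the IANA-assigned code points for those suites.

```python
from enum import IntEnum


class CipherSuite(IntEnum):
    # Illustrative subset only; values are the IANA code points.
    TLS_RSA_WITH_AES_128_GCM_SHA256 = 0x009C
    TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 = 0xC02F
    TLS_AES_128_GCM_SHA256 = 0x1301


# Hypothetical: the suites a particular backend happens to support.
BACKEND_SUPPORTED = {
    CipherSuite.TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256,
    CipherSuite.TLS_AES_128_GCM_SHA256,
}


def usable_suites(requested):
    """Quietly drop suites the backend lacks, preserving the caller's order."""
    return [suite for suite in requested if suite in BACKEND_SUPPORTED]


# A misspelled member name (GMC instead of GCM) fails loudly on attribute access.
typo_error = None
try:
    CipherSuite.TLS_ECDHE_RSA_WITH_AES_128_GMC_SHA256
except AttributeError as exc:
    typo_error = exc
```

An unsupported but correctly spelled suite is filtered out silently, while the misspelled one never gets as far as the filter.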
Does that make sense?
Footnote: This is a digression, but worth harping on about. OpenSSL’s cipher suite names have traditionally been of the form “key exchange-signing algo-stream cipher-mode-MAC”. However, OpenSSL has allowed some of those to be unspecified and to mean their defaults (e.g. AES128-GCM-SHA256 actually means RSA-RSA-AES128-GCM-SHA256). This has caused them problems with TLSv1.3, which has defined all-new cipher suites that no longer include their key exchange mechanism (e.g. TLS_AES_128_GCM_SHA256 is now a complete cipher string). Because of OpenSSL’s weird defaulty grammar, right now their cipher string assumes that this new cipher uses RSA key exchange, even though it absolutely does not.
All of this is a very long form way of saying that defining a stable string-to-cipher-suite parser is very hard unless we define it as saying “just use the actual IANA names for the cipher suites”, and the second we’ve done that we may as well have just defined an enum and called it good. ;)
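As a rough illustration of why the IANA-names parser collapses into an enum: once the only accepted strings are the exact IANA names, the "parser" is just a lookup by member name. The names CipherSuite and parse_cipher_suites below are hypothetical and the enum lists only two suites for brevity.

```python
from enum import IntEnum


class CipherSuite(IntEnum):
    # Illustrative subset only; values are the IANA code points.
    TLS_RSA_WITH_AES_128_GCM_SHA256 = 0x009C
    TLS_AES_128_GCM_SHA256 = 0x1301


def parse_cipher_suites(names):
    """Map exact IANA names to enum members, rejecting anything unknown."""
    suites = []
    for name in names:
        try:
            # Enum lookup by name: there is no grammar and there are no defaults.
            suites.append(CipherSuite[name])
        except KeyError:
            raise ValueError(f"unknown cipher suite name: {name!r}") from None
    return suites
```

With this shape, parse_cipher_suites(["TLS_AES_128_GMC_SHA256"]) raises ValueError instead of silently selecting nothing.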