adding digital signature and encryption "hashes" to hashlib?
A couple of weeks ago, there was a discussion on python-dev about adding the ability to load modules from encrypted zip files. I'm not sure the discussion went anywhere, and I was on vacation when it took place.
However, it reminded me of an idea from a couple of years ago: extend the hashlib module to produce two additional kinds of hashes: a digital signature for some sequence of bytes, and an encrypted/decrypted version of a sequence of bytes. Basically, this would bring more of the OpenSSL EVP API out to Python (hashlib already uses OpenSSL EVP for various hash formations).
http://www.openssl.org/docs/crypto/evp.html
With this, it would be fairly trivial to implement strong encryption of zip files (or anything else), and this could then be used to do the import feature.
I'd envision adding new constructors to hashlib:
sig = hashlib.signature([data] [, keyfile=...] [, signature_algorithm=...])
This would have the regular update() method, and digest() and hexdigest(), but would also support the method
sig.verify(existing_sig)
which would return a boolean saying the "existing sig" is a verified signature for that data.
Similarly, encryption/decryption would be
enc = hashlib.encryptor([plaintext] [, keyfile=...] [, ciphers=...])
enc.digest() would give the ciphertext.
And
dec = hashlib.decryptor([ciphertext] [, keyfile=...] [, ciphers=...])
and dec.digest() would yield the plaintext.
The encryptor and decryptor constructors could take either "key" or "keyfile" parameters. Using "key" would support symmetric encryption, while using "keyfile" would produce EVP envelope encryption/decryption.
Bill
On Sep 17, 8:24 pm, Bill Janssen <jans...@parc.com> wrote:
A couple of weeks ago, there was a discussion on python-dev about adding the ability to load modules from encrypted zip files. I'm not sure the discussion went anywhere, and I was on vacation when it took place.
However, it reminded me of an idea from a couple of years ago: extend the hashlib module to produce two additional kinds of hashes: a digital signature for some sequence of bytes, and an encrypted/decrypted version of a sequence of bytes. Basically, this would bring more of the OpenSSL EVP API out to Python (hashlib already uses OpenSSL EVP for various hash formations).
Besides the fact that hashes and encryption are pretty much totally different, I like the idea of putting more cryptographic power in the standard library.
With this, it would be fairly trivial to implement strong encryption of zip files (or anything else), and this could then be used to do the import feature.
I'd envision adding new constructors to hashlib:
sig = hashlib.signature([data] [, keyfile=...] [, signature_algorithm=...])
This would have the regular update() method, and digest() and hexdigest(), but would also support the method
sig.verify(existing_sig)
which would return a boolean saying the "existing sig" is a verified signature for that data.
Similarly, encryption/decryption would be
enc = hashlib.encryptor([plaintext] [, keyfile=...] [, ciphers=...])
enc.digest() would give the ciphertext.
And
dec = hashlib.decryptor([ciphertext] [, keyfile=...] [, ciphers=...])
and dec.digest would yield the plaintext.
The encryptor and decryptor constructors could take either "key" or "keyfile" parameters. Using "key" would support symmetric encryption, while using "keyfile" would produce EVP envelope encryption/decryption.
Bill
My only real concern about this approach is that it glosses over the complexity of selecting and using a cryptosystem. How about passing on having a default for the 'cipher' argument, at least? Geremy Condra
CTO <debatem1@gmail.com> wrote:
However, it reminded me of an idea from a couple of years ago: extend the hashlib module to produce two additional kinds of hashes: a digital signature for some sequence of bytes, and an encrypted/decrypted version of a sequence of bytes. Basically, this would bring more of the OpenSSL EVP API out to Python (hashlib already uses OpenSSL EVP for various hash formations).
Besides the fact that hashes and encryption are pretty much totally different
I know it seems that way at first glance, but in fact they are strongly related. There's a reason all three (and nothing else) are exported through OpenSSL's EVP API. Bill
On Sep 20, 1:07 pm, Bill Janssen <jans...@parc.com> wrote:
CTO <debat...@gmail.com> wrote:
However, it reminded me of an idea from a couple of years ago: extend the hashlib module to produce two additional kinds of hashes: a digital signature for some sequence of bytes, and an encrypted/decrypted version of a sequence of bytes. Basically, this would bring more of the OpenSSL EVP API out to Python (hashlib already uses OpenSSL EVP for various hash formations).
Besides the fact that hashes and encryption are pretty much totally different
I know it seems that way at first glance, but in fact they are strongly related. There's a reason all three (and nothing else) are exported through OpenSSL's EVP API.
Bill
Don't get me wrong, I like the basic idea you're advancing, and in use hashes and crypto are frequently seen together, but they solve different problems in *very* different ways. The fact that you- who obviously have some interest in crypto and presumably a good deal of knowledge about it- seem to be under the impression that RSA and SHA have some kinship, well, it doesn't reassure me about the ability of non-experts to figure out what's what. IMO, adding public key crypto routines to hashlib seems almost guaranteed to increase that confusion. Others will doubtless disagree. Geremy Condra
On Sun, Sep 20, 2009 at 11:52 AM, CTO <debatem1@gmail.com> wrote:
On Sep 20, 1:07 pm, Bill Janssen <jans...@parc.com> wrote:
CTO <debat...@gmail.com> wrote:
However, it reminded me of an idea from a couple of years ago: extend the hashlib module to produce two additional kinds of hashes: a digital signature for some sequence of bytes, and an encrypted/decrypted version of a sequence of bytes. Basically, this would bring more of the OpenSSL EVP API out to Python (hashlib already uses OpenSSL EVP for various hash formations).
Besides the fact that hashes and encryption are pretty much totally different
I know it seems that way at first glance, but in fact they are strongly related. There's a reason all three (and nothing else) are exported through OpenSSL's EVP API.
Bill
Don't get me wrong, I like the basic idea you're advancing, and in use hashes and crypto are frequently seen together, but they solve different problems in *very* different ways. The fact that you- who obviously have some interest in crypto and presumably a good deal of knowledge about it- seem to be under the impression that RSA and SHA have some kinship, well, it doesn't reassure me about the ability of non-experts to figure out what's what. IMO, adding public key crypto routines to hashlib seems almost guaranteed to increase that confusion. Others will doubtless disagree.
Geremy Condra
I don't like the attempt to overload the hash function API. Encryption and decryption should not be done using a digest() method. That makes no sense. They are stream APIs with a constant mapping of bytes in to bytes out rather than a hash function that always outputs a constant number of bytes. I wouldn't put signing functions in hashlib itself but any common EVP wrapping code underneath could be shared. Before doing that I really suggest someone fleshes out the API and limits its scope to avoid feature creep. http://pycrypto.org/ is the API that most Python code wanting crypto services uses today.
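As a small illustration of the fixed-output property described above, here is a minimal sketch using only the existing hashlib API. The sample inputs are arbitrary; the point is only that a digest has the same length however much data is fed in, which is why digest() is an awkward name for something that returns ciphertext whose length follows the input.

```python
import hashlib

# Inputs of very different lengths (arbitrary sample data).
inputs = [b"", b"short", b"x" * 10_000]

# A SHA-256 digest is always 32 bytes, whatever the input length.
digest_lengths = [len(hashlib.sha256(data).digest()) for data in inputs]
print(digest_lengths)
```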
CTO <debatem1@gmail.com> wrote:
I know it seems that way at first glance, but in fact they are strongly related. There's a reason all three (and nothing else) are exported through OpenSSL's EVP API.
Bill
Don't get me wrong, I like the basic idea you're advancing, and in use hashes and crypto are frequently seen together,
Yes, that's the relationship I was thinking of. But from a broader philosophical view, a ciphertext can be thought of as a hash of a plaintext, too. A reversible hash.
IMO, adding public key crypto routines to hashlib seems almost guaranteed to increase that confusion.
Well, that could be. Perhaps the packaging "insight" I had wasn't inspired :-). I was thinking that from the crypto-ignorant point of view, they seem quite similar. A SHA256 hash can be seen as a digital "signature" (or I've heard it called a "fingerprint") of a sequence of bytes, just as with a public-key signature. Sure, what's going on is different, but from a utility point of view, it's much the same. This is why people post md5 checksums of downloadable packages -- it's a signature. Bill
Gregory P. Smith <greg@krypto.org> wrote:
I don't like the attempt to overload the hash function API. Encryption and decryption should not be done using a digest() method. That makes no sense. They are stream APIs with a constant mapping of bytes in to bytes out rather than a hash function that always outputs a constant number of bytes.
Sure, I could see the stream API, as well, but I think the hashlib methods actually work pretty well for this, too. Certainly for the digital signature portion.
I wouldn't put signing functions in hashlib itself but any common EVP wrapping code underneath could be shared. Before doing that I really suggest someone fleshes out the API and limits its scope to avoid feature creep.
Yes, the right thing to do is to generate a separate module and put it up in PyPI. See how it goes. Further consolidation could be left to the future. Bill
On Sep 21, 11:43 am, Bill Janssen <jans...@parc.com> wrote:
CTO <debat...@gmail.com> wrote:
I know it seems that way at first glance, but in fact they are strongly related. There's a reason all three (and nothing else) are exported through OpenSSL's EVP API.
Bill
Don't get me wrong, I like the basic idea you're advancing, and in use hashes and crypto are frequently seen together,
Yes, that's the relationship I was thinking of. But from a broader philosophical view, a ciphertext can be thought of as a hash of a plaintext, too. A reversible hash.
You really shouldn't conflate these things. The point of a hash is to ensure message integrity, while the point of encryption is to preserve secrecy. As an example, ElGamal is a common cryptosystem that nevertheless preserves the multiplicative homomorphism, i.e., E(m1) * E(m2) = E(m1*m2). Others, including unpadded RSA, will demonstrate similar properties. Under certain conditions, that can be desirable, but under many others it is very, very bad. Think of encrypting the value for a debit purchase: $100000 is just a public-key operation away from $10, but would be financially crippling to most people.
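The multiplicative property mentioned above can be checked with a few lines of arithmetic. This is a toy sketch of unpadded ("textbook") RSA with deliberately tiny, hypothetical parameters; it is not ElGamal, and nothing here is suitable for real use. It only shows that multiplying two ciphertexts yields a valid ciphertext of the product of the plaintexts.

```python
# Toy textbook-RSA parameters (far too small for real use; illustration only).
p, q = 61, 53
n = p * q            # modulus: 3233
e = 17               # public exponent
d = 2753             # private exponent: (e * d) % ((p - 1) * (q - 1)) == 1


def encrypt(m):
    """Unpadded ("textbook") RSA encryption with the public key."""
    return pow(m, e, n)


def decrypt(c):
    """Unpadded RSA decryption with the private key."""
    return pow(c, d, n)


m1, m2 = 10, 7
c1, c2 = encrypt(m1), encrypt(m2)

# Anyone holding only the ciphertexts can multiply them together...
forged = (c1 * c2) % n

# ...and the result is a valid encryption of m1 * m2 (mod n).
assert forged == encrypt((m1 * m2) % n)
assert decrypt(forged) == m1 * m2
```

Padding schemes exist largely to destroy this kind of structure, which is one reason the choice of padding is not a cosmetic detail.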
IMO, adding public key crypto routines to hashlib seems almost guaranteed to increase that confusion.
Well, that could be. Perhaps the packaging "insight" I had wasn't inspired :-). I was thinking that from the crypto-ignorant point of view, they seem quite similar. A SHA256 hash can be seen as a digital "signature" (or I've heard it called a "fingerprint") of a sequence of bytes, just as with a public-key signature. Sure, what's going on is different, but from a utility point of view, it's much the same. This is why people post md5 checksums of downloadable packages -- it's a signature.
Also a very bad idea. Hashes ensure data integrity, not that it came from the person that you think it came from. As an example, if I took a message, MD5'd it (a bad idea anyway), and appended it to the end, an adversary could just man-in-the-middle the process and wind up changing both message and hash. To you, this would remain undetectable, and in your example would result in the adversary installing arbitrary code on your machine. A good public key signature system can help to prevent that, although even that has some nontrivial difficulties associated with it. My point here is not to scare you away from crypto- it's to point out that crypto is a big field, and the consequences for getting it wrong are sometimes very high. Geremy Condra
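The man-in-the-middle scenario above can be sketched with the standard library. The first half shows that a bare hash shipped alongside a message is accepted even after both have been replaced. The second half uses HMAC, a keyed message authentication code, as a stand-in for authentication; this is an assumption made for illustration, since the standard library has no public-key signatures, and HMAC requires a shared secret rather than a key pair. The messages and key are hypothetical.

```python
import hashlib
import hmac


def package_with_hash(message):
    """Pair a message with a bare SHA-256 hex digest (no key involved)."""
    return message, hashlib.sha256(message).hexdigest()


def check_hash(message, digest):
    """Accept the message if the digest matches; says nothing about origin."""
    return hmac.compare_digest(hashlib.sha256(message).hexdigest(), digest)


# An attacker who can rewrite the message can simply recompute the hash.
original, digest = package_with_hash(b"install good-package")
tampered, forged_digest = package_with_hash(b"install evil-package")
unkeyed_accepts_forgery = check_hash(tampered, forged_digest)

# With a secret key, the attacker cannot compute a tag the receiver accepts.
key = b"shared secret known only to sender and receiver"  # hypothetical key


def tag(message):
    """Keyed tag over the message (HMAC-SHA256)."""
    return hmac.new(key, message, hashlib.sha256).digest()


attacker_tag = hmac.new(b"attacker guess", tampered, hashlib.sha256).digest()
keyed_accepts_forgery = hmac.compare_digest(tag(tampered), attacker_tag)
```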
CTO <debatem1@gmail.com> wrote:
On Sep 21, 11:43 am, Bill Janssen <jans...@parc.com> wrote:
CTO <debat...@gmail.com> wrote:
I know it seems that way at first glance, but in fact they are strongly related. There's a reason all three (and nothing else) are exported through OpenSSL's EVP API.
Bill
Don't get me wrong, I like the basic idea you're advancing, and in use hashes and crypto are frequently seen together,
Yes, that's the relationship I was thinking of. But from a broader philosophical view, a ciphertext can be thought of as a hash of a plaintext, too. A reversible hash.
You really shouldn't conflate these things. The point of a hash is to ensure message integrity, while the point of encryption is to preserve secrecy. As an example, ElGamal is a common cryptosystem that nevertheless preserves the multiplicative homomorphism..., ie,
I know lots of non-crypto users -- the people the "batteries included" aspect of Python is for -- who don't understand the fine points of this, and don't want to. They just want to encrypt some text with a "good" cipher scheme, and they depend on the library implementor to know how to do that. They want a function "encrypt(plaintext, key)", and don't really want to know more than that. And, by the by, hashes are often used for purposes other than message integrity, outside the sphere of crypto.
IMO, adding public key crypto routines to hashlib seems almost guaranteed to increase that confusion.
Well, that could be. Perhaps the packaging "insight" I had wasn't inspired :-). I was thinking that from the crypto-ignorant point of view, they seem quite similar. A SHA256 hash can be seen as a digital "signature" (or I've heard it called a "fingerprint") of a sequence of bytes, just as with a public-key signature. Sure, what's going on is different, but from a utility point of view, it's much the same. This is why people post md5 checksums of downloadable packages -- it's a signature.
Also a very bad idea. Hashes ensure data integrity, not that it came from the person that you think it came from. As an example, if I took
Sure. And lots of people use digital signatures in that way, too. Again, I wasn't proposing to replace M2Crypto or pycrypto or anything else; I was suggesting that providing easy-to-use APIs to a couple of commonly-requested crypto features, for use by non-cryptographers, wouldn't be a bad idea. Bill
On 2009-09-21 16:37 PM, Bill Janssen wrote:
Again, I wasn't proposing to replace M2Crypto or pycrypto or anything else; I was suggesting that providing easy-to-use APIs to a couple of commonly-requested crypto features, for use by non-cryptographers, wouldn't be a bad idea.
Going back to CTO's original reply, I would say that he agrees with you. Where he (and I, for that matter) diverge is that we don't think they should go into hashlib. The name is inappropriate and misleading. The unifying concept among the functionality you want to include is not "hashing" but "cryptography", and the module that ties together that functionality should be named appropriately. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco
Bill Janssen wrote:
Again, I wasn't proposing to replace m2cryto or pycrypto or anything else; I was suggesting that providing easy-to-use APIs to a couple of commonly-requested crypto features, for use by non-cryptographers, wouldn't be a bad idea.
Actually, it could be a really bad idea that leads to people thinking they have secured something when they have in fact done nothing of the sort. Having to go find a crypto library at least means a developer has put in a minimal amount of thought. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
Nick Coghlan <ncoghlan@gmail.com> wrote:
Bill Janssen wrote:
Again, I wasn't proposing to replace m2cryto or pycrypto or anything else; I was suggesting that providing easy-to-use APIs to a couple of commonly-requested crypto features, for use by non-cryptographers, wouldn't be a bad idea.
Actually, it could be a really bad idea that leads to people thinking they have secured something when they have in fact done nothing of the sort.
I don't think so. People make mistakes continually. Part of life. Making it easier to do things leads to that sort of misunderstanding. That being said, are there ways to make things more foolproof?
Having to go find a crypto library at least means a developer has put in a minimal amount of thought.
Too much, IMO. Bill
Robert Kern <robert.kern@gmail.com> wrote:
On 2009-09-21 16:37 PM, Bill Janssen wrote:
Again, I wasn't proposing to replace m2cryto or pycrypto or anything else; I was suggesting that providing easy-to-use APIs to a couple of commonly-requested crypto features, for use by non-cryptographers, wouldn't be a bad idea.
Going back to CTO's original reply, I would say that he agrees with you. Where he (and I, for that matter) diverge is that we don't think they should go into hashlib. The name is inappropriate and misleading.
OK, so let's not do that, then. Greg commented that the underlying C implementation of access to EVP can be shared, which eliminates the only real functional justification for adding it to hashlib, which is to avoid duplicating code.
Suppose we added, then, a simple-minded API to the EVP functions:
EVP_Seal... and EVP_Open... provide public key encryption
EVP_Sign... and EVP_Verify... provide digital signatures
EVP_Encrypt and EVP_Decrypt provide symmetric key encryption
(The EVP_Digest... API is already brought out by hashlib.)
Let's call this new module "evp". (Or perhaps there should be a "crypto" package, with "hashes", "encryption", and "signature" submodules.)
Let's look at symmetric encryption. You'd want to be able to create a new encryptor:
import evp
e = evp.encryptor(key, cipher="AES256", padding=True)
"cipher" defaults to AES256, the constructor raises an exception if that isn't available (or the specified cipher isn't available), or for a bad key (wrong length).
e.update(plaintext) # repeat as needed
ciphertext = e.result()
Very similar for decryption.
What can be done to make something like this foolproof?
Bill
On Sep 21, 5:37 pm, Bill Janssen <jans...@parc.com> wrote:
CTO <debat...@gmail.com> wrote:
On Sep 21, 11:43 am, Bill Janssen <jans...@parc.com> wrote:
CTO <debat...@gmail.com> wrote:
I know it seems that way at first glance, but in fact they are strongly related. There's a reason all three (and nothing else) are exported through OpenSSL's EVP API.
Bill
Don't get me wrong, I like the basic idea you're advancing, and in use hashes and crypto are frequently seen together,
Yes, that's the relationship I was thinking of. But from a broader philosophical view, a ciphertext can be thought of as a hash of a plaintext, too. A reversible hash.
You really shouldn't conflate these things. The point of a hash is to ensure message integrity, while the point of encryption is to preserve secrecy. As an example, ElGamal is a common cryptosystem that nevertheless preserves the multiplicative homomorphism..., ie,
I know lots of non-crypto users -- the people the "batteries included" aspect of Python is for -- who don't understand the fine points of this, and don't want to. They just want to encrypt some text with a "good" cipher scheme, and they depend on the library implementor to know how to do that. They want a function "encrypt(plaintext, key)", and don't really want to know more than that.
Crypto is complex for a reason. I didn't point those examples out because they were exceptions to the rule, I pointed them out because in crypto the rule is that your ignorance will burn you in the end. Trying to make it easier to do security badly is not a goal I'd set for the standard library.
And, by the by, hashes are often used for purposes other than message integrity, outside the sphere of crypto.
Cryptographic hashes were designed to be used in cryptography. Any other use should be well-researched before being deployed. Bad common practices aren't less bad for being more common.
IMO, adding public key crypto routines to hashlib seems almost guaranteed to increase that confusion.
Well, that could be. Perhaps the packaging "insight" I had wasn't inspired :-). I was thinking that from the crypto-ignorant point of view, they seem quite similar. A SHA256 hash can be seen as a digital "signature" (or I've heard it called a "fingerprint") of a sequence of bytes, just as with a public-key signature. Sure, what's going on is different, but from a utility point of view, it's much the same. This is why people post md5 checksums of downloadable packages -- it's a signature.
Also a very bad idea. Hashes ensure data integrity, not that it came from the person that you think it came from. As an example, if I took
Sure. And lots of people use digital signatures in that way, too.
See the above about "bad practice".
Again, I wasn't proposing to replace m2cryto or pycrypto or anything else; I was suggesting that providing easy-to-use APIs to a couple of commonly-requested crypto features, for use by non-cryptographers, wouldn't be a bad idea.
Bill
On Sep 21, 9:37 pm, Bill Janssen <jans...@parc.com> wrote:
Robert Kern <robert.k...@gmail.com> wrote:
On 2009-09-21 16:37 PM, Bill Janssen wrote:
Again, I wasn't proposing to replace m2cryto or pycrypto or anything else; I was suggesting that providing easy-to-use APIs to a couple of commonly-requested crypto features, for use by non-cryptographers, wouldn't be a bad idea.
Going back to CTO's original reply, I would say that he agrees with you. Where he (and I, for that matter) diverge is that we don't think they should go into hashlib. The name is inappropriate and misleading.
OK, so let's not do that, then. Greg commented that the underlying C implementation of access to EVP can be shared, which eliminates the only real functional justification for adding it to hashlib, which is to avoid duplicating code.
Seems reasonable.
Suppose we added, then, a simple-minded API to the EVP functions:
EVP_Seal... and EVP_Open... provide public key encryption
EVP_Sign... and EVP_Verify... provide digital signatures
EVP_Encrypt and EVP_Decrypt provide symmetric key encryption
(The EVP_Digest... API is already brought out by hashlib.)
Since we're doing the rest of it, we might as well do the EVP_PKEY stuff too.
Let's call this new module "evp".
Not a big fan of that, although as long as everything in it is clearly marked as to what its purpose and limitations are, I don't really care about the name...
(Or perhaps there should be a "crypto" package, with "hashes", "encryption", and "signature" submodules.)
...except that it probably shouldn't conflict with existing third-party modules, and Crypto is already in use. I'll send an email and see if there are any plans to change the capitalization on that.
Let's look at symmetric encryption. You'd want to be able to create a new encryptor:
import evp
e = evp.encryptor(key, cipher="AES256", padding=True)
"cipher" defaults to AES256, the constructor raises an exception if that isn't available (or the specified cipher isn't available), or for a bad key (wrong length).
I'm personally against the idea of default ciphers, etc. Since different ciphers, keylengths, and padding choices have immediate consequences for what kinds of security you're going to have, I would rather be explicit than implicit here.
e.update(plaintext) # repeat as needed
ciphertext = e.result()
Very similar for decryption.
Most ciphers are not stream ciphers, so it doesn't make a lot of sense in the case of, say, RSA or AES, but again- bikeshedding.
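Since AES works on fixed 16-byte blocks, input that is not a multiple of the block size has to be padded before encryption, which is presumably what the "padding=True" argument is about. The thread does not say which scheme is meant; the sketch below assumes PKCS#7-style padding purely for illustration, and the helper names are hypothetical.

```python
BLOCK_SIZE = 16  # AES block size in bytes


def pkcs7_pad(data, block_size=BLOCK_SIZE):
    """Append N bytes of value N so the length is a multiple of block_size."""
    pad_len = block_size - (len(data) % block_size)
    return data + bytes([pad_len]) * pad_len


def pkcs7_unpad(padded, block_size=BLOCK_SIZE):
    """Strip PKCS#7 padding, rejecting malformed input."""
    if not padded or len(padded) % block_size:
        raise ValueError("input is not a whole number of blocks")
    pad_len = padded[-1]
    if not 1 <= pad_len <= block_size:
        raise ValueError("invalid padding length")
    if padded[-pad_len:] != bytes([pad_len]) * pad_len:
        raise ValueError("invalid padding bytes")
    return padded[:-pad_len]


padded = pkcs7_pad(b"hello")
recovered = pkcs7_unpad(padded)
```

Note that input which is already block-aligned still gains a full block of padding, so the unpadding step is never ambiguous.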
What can be done to make something like this foolproof?
Not a whole lot, but, that's kind of the way security works. Geremy Condra
CTO <debatem1@gmail.com> wrote:
Let's look at symmetric encryption. You'd want to be able to create a new encryptor:
import evp
e = evp.encryptor(key, cipher="AES256", padding=True)
"cipher" defaults to AES256, the constructor raises an exception if that isn't available (or the specified cipher isn't available), or for a bad key (wrong length).
I'm personally against the idea of default ciphers, etc. Since different ciphers, keylengths, and padding choices have immediate consequences for what kinds of security you're going to have, I would rather be explicit than implicit here.
Sure, your privilege. Unfortunately, most users won't be able to make those choices sanely, and will rely on some sort of external advice about it. So I think it makes sense to try to build some such advice into the code, by adding a reasonably strong encryption standard as a default, and by adding some code to do sanity/compatibility checks on the user-selected keys, if possible.
e.update(plaintext) # repeat as needed
ciphertext = e.result()
Very similar for decryption.
Most ciphers are not stream ciphers, so it doesn't make a lot of sense in the case of, say, RSA or AES, but again- bikeshedding.
Still, good point. Multiple calls to update() should raise an exception if the chosen cipher is not a stream cipher. Or, allow multiple calls, and buffer the input until result() is called. Bill
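The second option above (allow multiple update() calls and buffer until result()) might look roughly like this. It is only a sketch of the calling convention: the class name is hypothetical, and the "cipher" is a placeholder callable that merely reverses bytes so the data flow is visible. It performs no encryption at all.

```python
class BufferingEncryptor:
    """Hypothetical wrapper: collect input via update(), transform in result()."""

    def __init__(self, encrypt_once):
        # encrypt_once: callable taking all plaintext bytes and returning
        # ciphertext bytes (a stand-in for a real one-shot cipher call).
        self._encrypt_once = encrypt_once
        self._chunks = []
        self._finished = False

    def update(self, plaintext):
        """Buffer another chunk; nothing is transformed yet."""
        if self._finished:
            raise ValueError("update() called after result()")
        self._chunks.append(bytes(plaintext))

    def result(self):
        """Run the one-shot transform over everything buffered so far."""
        self._finished = True
        return self._encrypt_once(b"".join(self._chunks))


def fake_cipher(data):
    """Placeholder that is NOT encryption; it just reverses the bytes."""
    return data[::-1]


enc = BufferingEncryptor(fake_cipher)
enc.update(b"abc")
enc.update(b"def")
ciphertext = enc.result()
```

The cost of this design is memory: the whole message sits in the buffer until result() is called, which is the trade-off against a true streaming interface.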
On Sep 22, 1:00 pm, Bill Janssen <jans...@parc.com> wrote:
CTO <debat...@gmail.com> wrote:
Let's look at symmetric encryption. You'd want to be able to create a new encryptor:
import evp
e = evp.encryptor(key, cipher="AES256", padding=True)
"cipher" defaults to AES256, the constructor raises an exception if that isn't available (or the specified cipher isn't available), or for a bad key (wrong length).
I'm personally against the idea of default ciphers, etc. Since different ciphers, keylengths, and padding choices have immediate consequences for what kinds of security you're going to have, I would rather be explicit than implicit here.
Sure, your privilege.
Unfortunately, most users won't be able to make those choices sanely, and will rely on some sort of external advice about it. So I think it makes sense to try to build some such advice into the code, by adding a reasonably strong encryption standard as a default, and by adding some code to do sanity/compatibility checks on the user-selected keys, if possible.
If you don't know what the application is, you don't know what's secure and what isn't. We have no way of knowing, and so should resist the temptation to guess. It's also worth pointing out that hashlib as currently implemented makes users do exactly this: you have to specify the hash you want, with no default provided. I've never heard anybody describe that as an onerous level of difficulty.
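The hashlib behaviour referred to above is easy to confirm: the algorithm is always named by the caller, either through a named constructor or through hashlib.new(), and leaving it out is an error rather than a silent default. A minimal sketch, with arbitrary sample data:

```python
import hashlib

data = b"some bytes"

# The algorithm is always named explicitly, either by constructor...
by_constructor = hashlib.sha256(data).hexdigest()
# ...or by name through hashlib.new().
by_name = hashlib.new("sha256", data).hexdigest()

# Omitting the algorithm raises TypeError instead of picking a default.
try:
    hashlib.new()
    has_default = True
except TypeError:
    has_default = False
```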
e.update(plaintext) # repeat as needed
ciphertext = e.result()
Very similar for decryption.
Most ciphers are not stream ciphers, so it doesn't make a lot of sense in the case of, say, RSA or AES, but again- bikeshedding.
Still, good point. Multiple calls to update() should raise an exception if the chosen cipher is not a stream cipher. Or, allow multiple calls, and buffer the input until result() is called.
AFAIK, AES and RSA are the most commonly used algorithms in EVP. Maybe it would make more sense to take the more traditional keygen-encrypt-decrypt approach? Geremy Condra
CTO <debatem1@gmail.com> wrote:
If you don't know what the application is, you don't know what's secure and what isn't. We have no way of knowing, and so should resist the temptation to guess.
"secure" is not the same as "strongly encrypted". I'm looking at providing a simple way to do encryption here, not security. Let's just focus on that, first. I think defaulting to Blowfish or AES256 would be a reasonable tack to take there. I suggested AES256 because it seems to me more likely to be widely available.
AFAIK, AES and RSA are the most commonly used algorithms in EVP. Maybe it would make more sense to take the more traditional keygen-encrypt-decrypt approach?
Sure, maybe so. What would a proposed interface look like, then? Bill
Bill Janssen <janssen@...> writes:
Again, I wasn't proposing to replace m2cryto or pycrypto or anything else; I was suggesting that providing easy-to-use APIs to a couple of commonly-requested crypto features, for use by non-cryptographers, wouldn't be a bad idea.
I think it would be good indeed. Since we are already wrapping OpenSSL, let's give access to more of its features instead of having people find additional binary packages (of varying quality) for their platform.
As for some of the points which have been raised here:
- Putting non-hash functions in "hashlib" would look strange.
- Please don't call the package "evp", it's cryptic (;-)) and tied to a specific implementation. "crypto" would be fine and obvious.
- I don't think there should be a default argument. People shouldn't try to do any crypto at all if they aren't able to choose an algorithm. Documenting (perhaps recommending) a couple of them (AES, RSA) would be more helpful than supporting a silent default.
- The API needn't (and shouldn't) be the same as for hashing objects. A digest() method doesn't make sense. Ideally some of the {en,de}crypting objects could be file-like objects, but it's not really necessary IMO (and it can always be done in pure Python on top of a lower-level C extension, assuming the extension does provide a streaming interface).
Regards
Antoine.
2009/9/25 Antoine Pitrou <solipsis@pitrou.net>:
Bill Janssen <janssen@...> writes:
Again, I wasn't proposing to replace m2cryto or pycrypto or anything else; I was suggesting that providing easy-to-use APIs to a couple of commonly-requested crypto features, for use by non-cryptographers, wouldn't be a bad idea.
I think it would be good indeed. Since we are already wrapping OpenSSL, let's give access to more of its features instead of having people find additional binary packages (of varying quality) for their platform.
+1.
As for some of the points which have been raised here:
- Putting non-hash functions in "hashlib" would look strange.
- Please don't call the package "evp", it's cryptic (;-)) and tied to a specific implementation. "crypto" would be fine and obvious.
- I don't think there should be a default argument. People shouldn't try to do any crypto at all if they aren't able to choose an algorithm. Documenting (perhaps recommending) a couple of them (AES, RSA) would be more helpful than supporting a silent default.
- The API needn't (and shouldn't) be the same as for hashing objects. A digest() method doesn't make sense. Ideally some of the {en,de}crypting objects could be file-like objects, but it's not really necessary IMO (and it can always be done in pure Python on top of a lower-level C extension, assuming the extension does provide a streaming interface).
Again, +1. Paul
On Sep 25, 7:44 am, Antoine Pitrou <solip...@pitrou.net> wrote:
Bill Janssen <janssen@...> writes:
Again, I wasn't proposing to replace m2cryto or pycrypto or anything else; I was suggesting that providing easy-to-use APIs to a couple of commonly-requested crypto features, for use by non-cryptographers, wouldn't be a bad idea.
I think it would be good indeed. Since we are already wrapping OpenSSL, let's give access to more of its features instead of having people find additional binary packages (of varying quality) for their platform.
I've started putting the code together. It's very rough, but it does vaguely what it's supposed to do. It only supports AES192 right now, but it shouldn't be any harder to expand on that than it was to write the existing code, and I'd appreciate the contributions of anyone interested in seeing this in the standard library- it needs the help. You can find my code over at http://gitorious.org/cryptography-py/cryptography-py.
As for some of the points which have been raised here: ...snip... - Please don't call the package "evp", it's cryptic (;-)) and tied to a specific implementation. "crypto" would be fine and obvious.
I know of at least one implementation that uses Crypto as its name, and forgot to send them an email earlier this week. I'll do that today, and if they don't seem bent on changing the capitalization, I don't see any problem there. A side note: the code I've posted is a little schizophrenic since we haven't figured a name or toplevel structure out. That can change whenever a consensus emerges.
- I don't think there should be a default argument. People shouldn't try to do any crypto at all if they aren't able to choose an algorithm. Documenting (perhaps recommending) a couple of them (AES, RSA) would be more helpful than supporting a silent default.
Right now it just puts each cryptosystem in its own module. For example, to use AES192, you can simply say:

    import aes
    ciphertext = aes.encrypt("this is a message", "this is my password")
    plaintext = aes.decrypt(ciphertext, "this is my password")
    assert plaintext == "this is a message"

and it handles salt generation, IV generation, bitlength selection, and key strengthening for you. Eventually, if crypto becomes the name of the package, it will probably turn into "from crypto import aes".
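To make "salt generation, IV generation, bitlength selection, and key strengthening" concrete, here is a minimal sketch of that preparation step using only the standard library. It is not the code from the repository: the thread does not say which key-derivation function aes.c uses, so PBKDF2-HMAC-SHA256, the iteration count, the salt length, and the name prepare_key_material are all assumptions made for illustration. No actual AES encryption is performed here.

```python
import hashlib
import os

AES192_KEY_BYTES = 24   # 192-bit key, matching the AES192 case in the thread
AES_BLOCK_BYTES = 16    # AES block size; also the IV length for CBC-style modes
SALT_BYTES = 16         # assumed salt length
ITERATIONS = 100_000    # assumed work factor for key strengthening


def prepare_key_material(password, salt=None, iv=None):
    """Hypothetical helper: derive a key from a password, plus fresh salt and IV.

    Passing in an existing salt (as a decryptor would) reproduces the same key.
    """
    if salt is None:
        salt = os.urandom(SALT_BYTES)
    if iv is None:
        iv = os.urandom(AES_BLOCK_BYTES)
    # Key strengthening: many iterations of a keyed hash make password
    # guessing slower. PBKDF2 is an assumption, not the thread's choice.
    key = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS,
        dklen=AES192_KEY_BYTES,
    )
    return key, salt, iv
```

The salt and IV are not secret, so an encrypt() built on this would typically store them alongside the ciphertext so that decrypt() can derive the same key from the same password.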
- The API needn't (and shouldn't) be the same as for hashing objects. A digest() method doesn't make sense. Ideally some of the {en,de}crypting objects could be file-like objects, but it's not really necessary IMO (and it can always be done in pure Python on top of a lower-level C extension, assuming the extension does provide a streaming interface).
I can provide the streaming interface if it's desired. It would in some cases be more efficient, but for most I think it's needless complexity. Again, others may disagree, and I'm sure there's fertile ground for discussion there. Geremy Condra
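For reference, the "file-like object in pure Python on top of a streaming interface" idea can be sketched roughly as follows. Everything here is hypothetical: the update()/finalize() method names are assumed rather than taken from any posted code, and BlockBufferingTransform is only a stand-in that buffers data into 16-byte blocks the way a block cipher would. It performs no encryption at all.

```python
import io


class BlockBufferingTransform:
    """Stand-in for a low-level streaming cipher object. NOT encryption.

    It only mimics the buffering behaviour: complete 16-byte blocks come out
    of update(), and whatever is left over comes out of finalize().
    """

    BLOCK = 16

    def __init__(self):
        self._pending = b""

    def update(self, chunk):
        self._pending += bytes(chunk)
        cut = len(self._pending) - len(self._pending) % self.BLOCK
        out, self._pending = self._pending[:cut], self._pending[cut:]
        return out

    def finalize(self):
        # A real cipher would pad the final block here.
        out, self._pending = self._pending, b""
        return out


class TransformWriter(io.RawIOBase):
    """Write-only file-like wrapper around any update()/finalize() object."""

    def __init__(self, transform, sink):
        super().__init__()
        self._transform = transform
        self._sink = sink

    def writable(self):
        return True

    def write(self, data):
        self._sink.write(self._transform.update(data))
        return len(data)

    def close(self):
        # Flush the trailing partial block exactly once; the sink stays open.
        if not self.closed:
            self._sink.write(self._transform.finalize())
        super().close()
```

The wrapper itself never needs to know which algorithm sits underneath, which is why it can live in pure Python as long as the C extension exposes some incremental interface.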
On 2009-09-25 10:40 AM, CTO wrote:
On Sep 25, 7:44 am, Antoine Pitrou<solip...@pitrou.net> wrote:
As for some of the points which have been raised here: ...snip... - Please don't call the package "evp", it's cryptic (;-)) and tied to a specific implementation. "crypto" would be fine and obvious.
I know of at least one implementation that uses Crypto as its name, and forgot to send them an email earlier this week. I'll do that today, and if they don't seem bent on changing the capitalization, I don't see any problem there.
I wouldn't rely on capitalization to differentiate between packages. I think that Python will be able to do so in most circumstances, even on Windows, but it makes it hard to talk about the different packages and ask questions about them on the mailing lists. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco
2009/9/25 Robert Kern <robert.kern@gmail.com>:
On 2009-09-25 10:40 AM, CTO wrote:
On Sep 25, 7:44 am, Antoine Pitrou<solip...@pitrou.net> wrote:
- Please don't call the package "evp", it's cryptic (;-)) and tied to a specific implementation. "crypto" would be fine and obvious.
I know of at least one implementation that uses Crypto as its name, and forgot to send them an email earlier this week. I'll do that today, and if they don't seem bent on changing the capitalization, I don't see any problem there.
I wouldn't rely on capitalization to differentiate between packages. I think that Python will be able to do so in most circumstances, even on Windows, but it makes it hard to talk about the different packages and ask questions about them on the mailing lists.
Agreed. How about "encryption"? Paul.
On Sep 25, 1:45 pm, Paul Moore <p.f.mo...@gmail.com> wrote:
2009/9/25 Robert Kern <robert.k...@gmail.com>:
On 2009-09-25 10:40 AM, CTO wrote:
On Sep 25, 7:44 am, Antoine Pitrou<solip...@pitrou.net> wrote:
- Please don't call the package "evp", it's cryptic (;-)) and tied to a specific implementation. "crypto" would be fine and obvious.
I know of at least one implementation that uses Crypto as its name, and forgot to send them an email earlier this week. I'll do that today, and if they don't seem bent on changing the capitalization, I don't see any problem there.
I wouldn't rely on capitalization to differentiate between packages. I think that Python will be able to do so in most circumstances, even on Windows, but it makes it hard to talk about the different packages and ask questions about them on the mailing lists.
Agreed. How about "encryption"?
EVP covers hashing, signatures, and encryption/decryption. If we're going to go for a longer name, maybe "cryptography" would be more appropriate? Geremy Condra
CTO wrote:
EVP covers hashing, signatures, and encryption/decryption. If we're going to go for a longer name, maybe "cryptography" would be more appropriate?
Something to keep in mind while working on this is your threat model for the library. If you aren't going to do anything to guard against side-channel attacks (which are rather hard to avoid in a cross platform algorithm on a general purpose PC) or against attacks which grab unencrypted messages and keys from released-but-not-overwritten computer memory or (worse) the swap file, then this should be mentioned in the documentation. That way application developers that are looking for that extra level of security will know they need to look elsewhere. Regards, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
On Fri, Sep 25, 2009 at 9:49 PM, Nick Coghlan <ncoghlan@gmail.com> wrote:
CTO wrote:
EVP covers hashing, signatures, and encryption/decryption. If we're going to go for a longer name, maybe "cryptography" would be more appropriate?
Something to keep in mind while working on this is your threat model for the library. If you aren't going to do anything to guard against side-channel attacks (which are rather hard to avoid in a cross platform algorithm on a general purpose PC) or against attacks which grab unencrypted messages and keys from released-but-not-overwritten computer memory or (worse) the swap file, then this should be mentioned in the documentation.
That way application developers that are looking for that extra level of security will know they need to look elsewhere.
Regards, Nick.
I can make a note of it, although I'm unsure what concrete steps I could take to prevent such attacks from succeeding. Any ideas? Geremy Condra
geremy condra wrote:
On Fri, Sep 25, 2009 at 9:49 PM, Nick Coghlan <ncoghlan@gmail.com> wrote:
CTO wrote:
EVP covers hashing, signatures, and encryption/decryption. If we're going to go for a longer name, maybe "cryptography" would be more appropriate?
Something to keep in mind while working on this is your threat model for the library. If you aren't going to do anything to guard against side-channel attacks (which are rather hard to avoid in a cross platform algorithm on a general purpose PC) or against attacks which grab unencrypted messages and keys from released-but-not-overwritten computer memory or (worse) the swap file, then this should be mentioned in the documentation.
That way application developers that are looking for that extra level of security will know they need to look elsewhere.
Regards, Nick.
I can make a note of it, although I'm unsure what concrete steps I could take to prevent such attacks from succeeding. Any ideas?
OpenSSL may actually guard against the first part already. I'm unsure about the second part though. And I don't know enough about the problems to know how to fix them either - I just know when I'm theoretically leaving these attack vectors open and make sure to defend them by other means (such as physically securing the affected networks). But it's this kind of stuff that people are talking about when they point out that practical crypto is harder than just using good algorithms. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
On Fri, Sep 25, 2009 at 10:35 PM, Nick Coghlan <ncoghlan@gmail.com> wrote:
geremy condra wrote:
On Fri, Sep 25, 2009 at 9:49 PM, Nick Coghlan <ncoghlan@gmail.com> wrote:
CTO wrote:
EVP covers hashing, signatures, and encryption/decryption. If we're going to go for a longer name, maybe "cryptography" would be more appropriate?
Something to keep in mind while working on this is your threat model for the library. If you aren't going to do anything to guard against side-channel attacks (which are rather hard to avoid in a cross platform algorithm on a general purpose PC) or against attacks which grab unencrypted messages and keys from released-but-not-overwritten computer memory or (worse) the swap file, then this should be mentioned in the documentation.
That way application developers that are looking for that extra level of security will know they need to look elsewhere.
Regards, Nick.
I can make a note of it, although I'm unsure what concrete steps I could take to prevent such attacks from succeeding. Any ideas?
OpenSSL may actually guard against the first part already. I'm unsure about the second part though. And I don't know enough about the problems to know how to fix them either - I just know when I'm theoretically leaving these attack vectors open and make sure to defend them by other means (such as physically securing the affected networks).
But it's this kind of stuff that people are talking about when they point out that practical crypto is harder than just using good algorithms.
Cheers, Nick.
It seems to me that most timing attacks should already be out, so I'm not *too* worried about that- but I have literally no idea how to stop the secrets from being dropped into swap in this context. I think what I'm going to do is just make a note of it in the docs for the module and make sure that there's enough contact info in there to ensure that anybody reviewing the code can let me know if we're doing something stupid. And on that note: since most of my knowledge of crypto is theoretical in nature, I'm going to be leaning a lot on those of you with more practical experience as we go forward with this. Therefore, for everybody reading this list- please, *review the code*! If you think there's a problem, assume you're right and let me know- we really, really, really do not want half-baked crypto in the standard lib. Assuming that's even where this is headed. Geremy Condra
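As a rough illustration of the two concerns discussed above (timing side channels and secrets lingering in memory), here is a small standard-library sketch. The helper names tags_match and wipe are hypothetical, and this is only an assumption about what application-level mitigation could look like, not something the posted code does. hmac.compare_digest also arrived in Python well after this thread. Neither helper addresses secrets being written to swap, which would need page locking (for example POSIX mlock) at the C level.

```python
import hmac


def tags_match(expected, received):
    """Compare two authentication tags without an early exit on mismatch.

    A plain == can return as soon as the first differing byte is found,
    which may leak how much of a guess was correct.
    """
    return hmac.compare_digest(expected, received)


def wipe(buf):
    """Best-effort overwrite of a mutable buffer that held key material.

    This only works for bytearray-like objects. Immutable str/bytes copies
    of the secret, or copies made by the interpreter, are not affected.
    """
    for i in range(len(buf)):
        buf[i] = 0
```

Keeping keys in a bytearray from the start is what makes the wipe possible at all; once a secret has passed through an immutable string there is no reliable way to erase it from Python code.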
On Sat, Sep 26, 2009 at 12:01 AM, geremy condra <debatem1@gmail.com> wrote:
On Fri, Sep 25, 2009 at 10:35 PM, Nick Coghlan <ncoghlan@gmail.com> wrote:
geremy condra wrote:
On Fri, Sep 25, 2009 at 9:49 PM, Nick Coghlan <ncoghlan@gmail.com> wrote:
CTO wrote:
EVP covers hashing, signatures, and encryption/decryption. If we're going to go for a longer name, maybe "cryptography" would be more appropriate?
Something to keep in mind while working on this is your threat model for the library. If you aren't going to do anything to guard against side-channel attacks (which are rather hard to avoid in a cross platform algorithm on a general purpose PC) or against attacks which grab unencrypted messages and keys from released-but-not-overwritten computer memory or (worse) the swap file, then this should be mentioned in the documentation.
That way application developers that are looking for that extra level of security will know they need to look elsewhere.
Regards, Nick.
I can make a note of it, although I'm unsure what concrete steps I could take to prevent such attacks from succeeding. Any ideas?
OpenSSL may actually guard against the first part already. I'm unsure about the second part though. And I don't know enough about the problems to know how to fix them either - I just know when I'm theoretically leaving these attack vectors open and make sure to defend them by other means (such as physically securing the affected networks).
But it's this kind of stuff that people are talking about when they point out that practical crypto is harder than just using good algorithms.
Cheers, Nick.
It seems to me that most timing attacks should already be out, so I'm not *too* worried about that- but I have literally no idea how to stop the secrets from being dropped into swap in this context. I think what I'm going to do is just make a note of it in the docs for the module and make sure that there's enough contact info in there to ensure that anybody reviewing the code can let me know if we're doing something stupid.
And on that note: since most of my knowledge of crypto is theoretical in nature, I'm going to be leaning a lot on those of you with more practical experience as we go forward with this. Therefore, for everybody reading this list- please, *review the code*! If you think there's a problem, assume you're right and let me know- we really, really, really do not want half-baked crypto in the standard lib. Assuming that's even where this is headed.
Geremy Condra
Added the notice to the top of aes.c. Geremy Condra
participants (8)
- Antoine Pitrou
- Bill Janssen
- CTO
- geremy condra
- Gregory P. Smith
- Nick Coghlan
- Paul Moore
- Robert Kern