Specification of procedures to store cryptographic secrets

Some hours ago I sent an email to python-crypto asking how to securely wipe cryptographic secrets from memory: http://mail.python.org/pipermail/python-crypto/2013-February/001170.html
Antoine said that cryptographic secret wiping could be achieved if one uses bytearrays carefully and then overwrites their contents after use. I agree that this sounds reasonable, but I think it would be even better if that was a documented property of bytearrays.
If that property of bytearrays was specified in the Python standards, it would be easier for people who write cryptographic applications and libraries to use bytearrays correctly, and it would also guarantee that this property won't change in future versions of Python. Furthermore, it would help authors of cryptographic libraries to design their APIs and internal functions in a way that would allow the secure erasure of sensitive data.
Would this make sense or am I asking too much from Python?
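For readers following along, the bytearray approach described above could look roughly like the sketch below. The helper name `wiped_after_use` is made up for illustration, and nothing here is a documented guarantee: it only overwrites the one buffer it is given, in CPython, and says nothing about other copies.

```python
from contextlib import contextmanager


@contextmanager
def wiped_after_use(buf):
    """Yield a bytearray, then overwrite every byte with zero on exit.

    Hypothetical helper: it wipes only this buffer, not any copies that
    other code may have made of its contents.
    """
    try:
        yield buf
    finally:
        # Assign byte by byte so the length never changes and the
        # bytearray is not resized.
        for i in range(len(buf)):
            buf[i] = 0


# The literal below is itself an immutable bytes constant that stays in
# memory; it is used only to keep the sketch self-contained.
key = bytearray(b"correct horse")
with wiped_after_use(key) as k:
    length_seen = len(k)  # stand-in for real use of the secret
```

After the `with` block, `key` still has its original length but holds only zero bytes.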

On Sun, Feb 3, 2013 at 7:18 PM, <desnacked@riseup.net> wrote:
It would similarly be helpful to add low-level support for "pinning" such memory so that it is not written to backing store. While that can be done with the mmap module, the details are tricky.
I don't think that this belongs in the Python core, though. Rather, I think that this should be implemented in a module which can be used in conjunction with bytearrays, mmap, and any other necessary pieces of the core and stdlib. In fact, such a thing might already exist - I haven't looked (it's really not within my area of interest).
Putting such a thing in the stdlib might achieve the guarantee you suggest, but it might not. It really just shifts responsibility for ensuring good cryptographic programming onto people who spend their time implementing programming languages.
Dustin

That might work if you never ever resize a bytearray during its life cycle. A resize op calls realloc(), which may copy the data to a new memory region. The old region isn't zeroed.
The approach only takes care of the object itself on the heap. Some function may store data on the stack or make a temporary copy to another memory location on the heap. You have to compensate for that. libtomcrypt has a function burn_stack() that allocates and overwrites memory on the stack with a recursive function call.
Christian
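One way to sidestep the realloc() concern is to allocate the bytearray once at its final size and only ever write into it in place. The following is a minimal sketch, assuming a 32-byte key read from a file-like object. `KEY_SIZE`, `wipe`, and `read_key_into` are illustrative names, not part of any library, and the source stream (here an `io.BytesIO` stand-in) may well keep its own copy of the data.

```python
import io

KEY_SIZE = 32  # assumed key length, for illustration only


def wipe(buf):
    """Overwrite a bytearray in place without changing its length."""
    for i in range(len(buf)):
        buf[i] = 0


def read_key_into(stream, size=KEY_SIZE):
    """Fill a fixed-size bytearray from a stream without ever resizing it."""
    buf = bytearray(size)      # allocated once, at its final size
    view = memoryview(buf)     # slicing a memoryview does not copy
    filled = 0
    while filled < size:
        n = stream.readinto(view[filled:])
        if not n:              # EOF (or no data available)
            break
        filled += n
    if filled != size:
        wipe(buf)              # do not leave a partial key behind
        raise ValueError("stream ended before a full key was read")
    return buf


# io.BytesIO is only a stand-in source; it holds its own copy of the bytes.
key = read_key_into(io.BytesIO(bytes(range(KEY_SIZE))))
first_byte, last_byte = key[0], key[-1]
wipe(key)
```

Operations such as `+=`, `extend()`, or `append()` are deliberately avoided here because they can trigger exactly the resize described above.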

On Sun, Feb 3, 2013 at 5:11 PM, Christian Heimes <christian@python.org> wrote:
Correct. This isn't something that belongs in the core Python language and types. Something needing memory-pinning and secure wiping should be implemented as a special type (C extension module) for use with the C extension libraries that need those properties.
As soon as anything enters Python's own types, or values ever make it into Python code in any way, no guarantees can ever be made as to how many copies were made and scattered around the process's own address space. Assume "many". Python doesn't implement any sort of chain of custody for data internally.

On 04.02.2013 02:54, Gregory P. Smith wrote:
I agree! A custom type came into my mind, too. Data wiping is merely a small part of the general issue. A confident and secure container for secrets must do more. For example, it has to prevent the memory page from getting swapped to disk with mlock(2). Lots of bad things can happen when you look at L1/L2/L3 CPU cache, hyper-threading and virtualization. All that stuff makes it hard to conceal secrets.
On the bright side, attacks rarely crack cryptography. In most cases it's easier, faster and less costly to do social engineering. Humans are lazy, ignorant and bribable.
Christian
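To make the mlock(2) point concrete, here is a rough sketch of pinning an anonymous mmap region through ctypes on a POSIX system. It is an assumption-laden illustration, not a vetted implementation: `try_mlock` is a made-up helper, the call can legitimately fail (for example when RLIMIT_MEMLOCK is too low or the platform has no mlock), and it does nothing about CPU caches, hibernation, or virtual machine snapshots.

```python
import ctypes
import ctypes.util
import mmap


def try_mlock(region):
    """Ask the OS to keep an anonymous mmap region out of swap.

    Hypothetical helper. Returns True if mlock() succeeded and False if
    it is unavailable or was refused by the OS.
    """
    libc_name = ctypes.util.find_library("c")
    if libc_name is None:
        return False
    try:
        libc = ctypes.CDLL(libc_name, use_errno=True)
        mlock = libc.mlock
    except (OSError, AttributeError):
        return False
    mlock.argtypes = [ctypes.c_void_p, ctypes.c_size_t]
    mlock.restype = ctypes.c_int
    # Borrow the region's address. The ctypes view must be dropped again
    # before the mmap can be closed.
    view = (ctypes.c_char * len(region)).from_buffer(region)
    ok = mlock(ctypes.addressof(view), len(region)) == 0
    del view
    return ok


region = mmap.mmap(-1, mmap.PAGESIZE)  # page-aligned and never resized
locked = try_mlock(region)             # may be False; check before relying on it
region[:6] = b"s3cret"                 # stand-in for real key material
region[:] = bytes(len(region))         # overwrite before releasing the pages
wiped = region[:6] == bytes(6)
region.close()
```

A module along the lines suggested in this thread would presumably refuse to hand out the buffer when `locked` is False rather than silently continuing.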

On 2/3/2013 7:18 PM, desnacked@riseup.net wrote:
I presume he meant with CPython with its non-compacting gc on current major OSes. Perhaps the system also needs to be unloaded enough that the memory is not written to disk. Or the secret is written and erased before that would happen.
> I agree that this sounds reasonable, but I think it would be even better if that was a documented property of bytearrays.
I do not think such a low-level special-case property would be appropriate. Python is a high-level language for manipulating fairly abstract objects defined by interface and behavior. The reference manual defining the language intentionally says almost nothing about the hardware and memory of an implementation. This is partly why Python is relatively easy to read and mentally execute in a human brain. One usually does not need to mentally simulate a linear byte memory.
This would mean that Python could not run on hardware that made the guarantee impossible. What if a future OS ran directly off an SSD, either dispensing with current DRAM, or using it as the outer cache layer? My understanding is that SSDs run independently with their own OS and that external access is to logical rather than physical memory. What if a farther future system had a write-once, read-many, never-erase petabyte or exabyte 3D cube memory, with the SSD only serving as an index? Of course, it is possible that security concerns will figure into future designs.
The statement 'del x' only means "break the association between the name 'x' and the object currently associated with 'x'". If that is the last link to the object, it becomes inaccessible from Python and *eligible* to be physically deleted. What happens in concrete hardware is explicitly not Python's concern. From 3.1. Objects, values and types: "Objects are never explicitly destroyed; however, when they become unreachable they may be garbage-collected. An implementation is allowed to postpone garbage collection or omit it altogether ..."
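The point about 'del x' can be seen in a few lines. This is only an illustration of name unbinding with made-up variable names. It shows that 'del' removes a name and leaves the object and its contents alone, so any wiping has to be done explicitly on the object itself.

```python
secret = bytearray(b"s3cret")
alias = secret            # a second name bound to the same object

del secret                # unbinds the name only; the object lives on

try:
    secret
    name_gone = False
except NameError:
    name_gone = True      # the name is gone...

still_there = bytes(alias)  # ...but the contents are untouched

# Wiping has to act on the object, through whatever name still reaches it.
for i in range(len(alias)):
    alias[i] = 0
```

Even once the last reference is dropped, the language only says the object may be collected; it does not say the memory is cleared.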
I agree with Dustin that you need a 3rd-party cryptobytes module. It could be specific to OS and hardware, keep up with changes, and refuse to run if the required guarantees cannot be met.
> Would this make sense or am I asking too much from Python?
To me, it makes perfect sense for you to want a cryptobytes class that does exactly what you want it to do. And, again to me, you are asking too much for such to be part of the stdlib. Whether you are asking too much of any particular OS is beyond my knowledge. If the OS can provide the guarantees, a 3rd party Python wrapping should be possible.
-- Terry Jan Reedy

desnacked@riseup.net wrote:
I think to fully guarantee that you would need a promise from the OS that overwriting a particular piece of your virtual address space removes all evidence of that data from swap space, etc. I don't know whether any current OSes provide that kind of guarantee.
-- Greg

On 04.02.2013 01:18, desnacked@riseup.net wrote:
I don't think there's any safe way to store crypto information in memory. You'd have to use a dedicated hardware crypto device to avoid leaking the keys (think memory reallocation, the OS swapping memory to disk, your code running on a VM, etc.). See e.g. http://c0decstuff.blogspot.de/2011/01/in-memory-extraction-of-ssl-private.ht... for an example of how to do this intentionally.
Not even OpenSSL tries to address this, so I think it's asking a bit much from Python ;-)
That said, adding a little more security to a custom blob type would certainly not hurt :-) Here's some inspiration for locking and cleaning memory: http://c0decstuff.blogspot.de/2011/01/in-memory-extraction-of-ssl-private.ht... (pages 36ff)
-- Marc-Andre Lemburg, eGenix.com

participants (7)
- Christian Heimes
- desnacked@riseup.net
- Dustin J. Mitchell
- Greg Ewing
- Gregory P. Smith
- M.-A. Lemburg
- Terry Reedy