[Python-ideas] Secure string disposal (maybe other inmutable seq types too?)

Sat Jun 23 03:31:19 EDT 2018

On 23/06/2018 06:21, Nathaniel Smith wrote:
> On Fri, Jun 22, 2018 at 6:45 PM, Steven D'Aprano <steve at pearwood.info> wrote:
>> On Sat, Jun 23, 2018 at 01:33:59PM +1200, Greg Ewing wrote:
>>> Chris Angelico wrote:
>>>> Downside:
>>>> You can't say "I'm done with this string, destroy it immediately".
>>>
>>> Also it would be hard to be sure there wasn't another
>>> copy of the data somewhere from a time before you
>>> got around to marking the string as sensitive, e.g.
>>> in a file buffer.
>>
>> Don't let the perfect be the enemy of the good.
> 
> That's true, but for security features it's important to have a proper
> analysis of the threat and when the mitigation will and won't work;
> otherwise, you don't know whether it's even "good", and you don't know
> how to educate people on what they need to do to make effective use of
> it (or where it's not worth bothering).
> 
> Another issue: I believe it'd be impossible for this proposal to work
> correctly on implementations with a compacting GC (e.g., PyPy),
> because with a compacting GC strings might get copied around in memory
> during their lifetime. And crucially, this might have already happened
> before the interpreter was told that a particular string object
> contained sensitive data. I'm guessing this is part of why Java and C#
> use a separate type.
> 
> There's a lot of prior art on this in other languages/environments,
> and a lot of experts who've thought hard about it. Python-{ideas,dev}
> doesn't have a lot of security experts, so I'd very much want to see
> some review of that work before we go running off designing something
> ad hoc.
> 
> The PyCA cryptography library has some discussion in their docs:
> https://cryptography.io/en/latest/limitations/
> 
> One possible way to move the discussion forward would be to ask the
> pyca devs what kind of API they'd like to see in the interpreter, if
> any.
> 
> -n
> 
All good points - I would think that for this to be effective the
string, or secure string, would need to be marked at creation time, and
all operations on it would have to honour the wipe-before-free flag and
carry it forward into any copies made.

This needs to be implemented at a very low level. For example, adding
to a string (which makes a copy if the string grows beyond the current
allocation) would have to check for the flag, add it to the new string,
copy the expanded contents and then wipe the old buffer before freeing
it. Any normal string which is being added to a secure string should
probably get the flag added automatically as well. Of course, adding or
assigning a secure string to a normal string should automatically make
the target string secure as well.
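
To make the propagation rule concrete, here is a rough pure-Python
model of it. FlaggedBuffer and its methods are names made up for this
illustration, not a proposed API, and a real implementation would live
in the interpreter's C code; pure Python cannot stop temporaries such
as the result of "old + other._buf" from lingering unwiped.

```python
class FlaggedBuffer:
    """Toy model of a string buffer with a wipe-before-free flag (hypothetical)."""

    def __init__(self, data, secure=False):
        self._buf = bytearray(data)   # private, mutable storage
        self.secure = secure

    def _release(self, buf):
        # Overwrite retired storage before dropping it, but only if flagged.
        if self.secure:
            for i in range(len(buf)):
                buf[i] = 0

    def append(self, other):
        # Models growth past the current allocation: new storage is built,
        # then the old storage is retired.
        old = self._buf
        self.secure = self.secure or other.secure   # the flag is sticky
        self._buf = bytearray(old + other._buf)
        self._release(old)

    def __add__(self, other):
        # A copy inherits the flag if either operand carries it.
        return FlaggedBuffer(self._buf + other._buf,
                             self.secure or other.secure)


plain = FlaggedBuffer(b"user=")
secret = FlaggedBuffer(b"hunter2", secure=True)
plain.append(secret)          # plain is now flagged; its old storage was zeroed
```

Note that every operation pays for the flag check, flagged or not.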

This sounds like a lot of overhead to be adding to every string
operation, secure or not.

That being the case, it probably makes a lot of sense to use a separate
base class. While this will result in a certain amount of bloat in
software that makes use of it, it will avoid the overhead of checking the
flag in the vast majority of software, which does not use it.
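
As a rough idea of what a separate, opt-in type might look like from
Python code, here is a minimal sketch built on bytearray. SecretBytes
is a hypothetical name, not an existing stdlib or PyCA class, and this
only wipes the one buffer it owns; it can do nothing about copies made
elsewhere (file buffers, a moving GC, etc.).

```python
class SecretBytes:
    """Hypothetical opt-in container for sensitive data (illustration only)."""

    def __init__(self, data):
        # Take ownership of a mutable buffer; bytes/str could not be wiped.
        if not isinstance(data, bytearray):
            raise TypeError("SecretBytes needs a bytearray it can overwrite")
        self._buf = data

    def reveal(self):
        # Hand out a view rather than a copy, so no extra copy is made here.
        return memoryview(self._buf)

    def wipe(self):
        # Overwrite in place; the length is unchanged, so views stay valid.
        for i in range(len(self._buf)):
            self._buf[i] = 0

    def __enter__(self):
        return self

    def __exit__(self, *exc_info):
        self.wipe()

    def __repr__(self):
        # Keep the contents out of logs and tracebacks.
        return "SecretBytes(<hidden>)"


raw = bytearray(b"hunter2")
with SecretBytes(raw) as secret:
    first = secret.reveal()[0]    # use the secret while it is live
# on leaving the block, raw has been overwritten with zero bytes
```

Ordinary str and bytes code paths are untouched, so only programs that
use the type pay for it.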

I do know that I have heard in the past of security breaches involving
both C and Pascal strings where the problem was tracked down to the
"delete" mechanism just setting the first byte to 0x00. In both cases
that makes the length of the string 0, but the contents are left
untouched in a Pascal string and only the first character is lost in a
C string.
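
A small simulation of that failure mode, using bytearray to stand in
for raw memory. The layouts are the textbook ones (a length byte first
for Pascal, a NUL terminator for C); the helper names are made up for
this example.

```python
def pascal_value(buf):
    # Pascal-style: the first byte is the length, the characters follow.
    return bytes(buf[1:1 + buf[0]])


def c_value(buf):
    # C-style: the characters run up to the first NUL byte.
    return bytes(buf[:buf.index(0)])


secret = b"hunter2"
pascal = bytearray([len(secret)]) + secret
c_str = bytearray(secret) + b"\x00"

# The naive "delete": zero only the first byte.
pascal[0] = 0
c_str[0] = 0

# Both now read back as empty strings...
assert pascal_value(pascal) == b""
assert c_value(c_str) == b""

# ...but the memory still holds the data.
pascal_leftover = bytes(pascal[1:])    # b"hunter2", fully intact
c_leftover = bytes(c_str[1:-1])        # b"unter2", first character lost
```
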
-- 
Steve (Gadget) Barnes
Any opinions in this message are my personal opinions and do not reflect 
those of my employer.
