Tragic! Pickle is relatively (?) fast, and it could be made more secure while keeping any performance cost of the additional security opt-in.

Perhaps it is the objectives of pickle that are desirable:

- serialize/deserialize arbitrary objects
- binary representation

Or perhaps the docs could include clarifications from this thread regarding the unsuitability of pickle for anything, and the module should be renamed with a leading underscore: _pickle


On Sat, Jul 18, 2020, 11:28 PM Random832 <random832@fastmail.com> wrote:
On Sat, Jul 18, 2020, at 12:54, Stephen J. Turnbull wrote:
>  > I think I got all of them, but if you think there may be others
>  > feel free to be an extra pair of eyes. But these overrides are not
>  > available for the C version,
>
> That's going to be a sticking point, as many pickle use cases want to
> be as fast as possible.  Additional overhead is likely to be
> unwelcome, although I guess the default would be minimal (I guess
> checking for the default of None and only calling if non-None would be
> fastest and do the job).

The *default* would be to just pass the call through as-is, e.g. def do_call(self, f, *a, **k): return f(*a, **k), or whatever the equivalent is in C. My proposal is only about having an internally called method that *can* be overridden, not about defining any special behavior for it by default.
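To make the idea concrete, here is a rough sketch of what such a hook could look like on top of the pure-Python unpickler. It is only an illustration: do_call is the hypothetical method from this proposal, HookedUnpickler and LoggingUnpickler are made-up names, and the code leans on pickle._Unpickler, its dispatch table, and its stack attribute, all of which are CPython implementation details rather than documented API. Only the REDUCE opcode is routed through the hook here; a real change would have to cover the other opcodes that call things as well.

```python
import io
import pickle


class HookedUnpickler(pickle._Unpickler):
    """Pure-Python unpickler with a hypothetical overridable call hook."""

    def do_call(self, f, *a, **k):
        # Default behavior: pass the call through unchanged.
        return f(*a, **k)

    def load_reduce(self):
        # Same stack handling as the stock REDUCE handler, but the
        # call itself goes through do_call so subclasses can intercept it.
        stack = self.stack
        args = stack.pop()
        func = stack[-1]
        stack[-1] = self.do_call(func, *args)

    # Copy the parent's dispatch table (implementation detail) and
    # replace only the REDUCE entry.
    dispatch = dict(pickle._Unpickler.dispatch)
    dispatch[pickle.REDUCE[0]] = load_reduce


class LoggingUnpickler(HookedUnpickler):
    """Example override: record every callable invoked via REDUCE."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.calls = []

    def do_call(self, f, *a, **k):
        self.calls.append(f)
        return super().do_call(f, *a, **k)


# complex objects are pickled as a REDUCE call to complex(real, imag).
data = pickle.dumps(complex(1, 2))
unpickler = LoggingUnpickler(io.BytesIO(data))
value = unpickler.load()
print(value, unpickler.calls)
```

A filtering subclass would raise pickle.UnpicklingError from do_call instead of just logging; the point is that the default stays a plain pass-through.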

I guess part of where I'm not sure I'm on solid ground is... is the pure-python version guaranteed to always exist and always be available under the name _Unpickler, or is that an implementation detail? I've been assuming that there was no such guarantee and any change would have to be clearly defined and ultimately available in both versions.

> It certainly is *not* sufficient to be safe, if the threat model
> includes, say, a zero-day in the code being called, or an extended
> attack in which one allowed call sets the stage for another allowed
> call to blow up.  And it may be sufficient to be useless, depending on
> the use case.  You *are* going to die on *that* hill, you know.

Well, sure, anything can have bugs. I meant it's sufficient for it not to have any special vulnerabilities vs anything else you might ever do with python. I'm basically just trying to push back against "this is, like eval, the keys to the kingdom and thus not worth hardening in any way at all".

>  > On a mostly unrelated note I also have to admit I am baffled why
>  > the NEWOBJ opcodes are defined to call __new__ instead of
>  > __newobj__, when the latter is expected to exist and be a valid
>  > reduce function.
>
> A lot of these decisions have implications for backward compatibility
> of pickles.  If you add code to check versions and decide whether to
> call __new__ or __newobj__, that has performance and complexity
> implications that may have been judged not worth the marginal[1]
> improvement in security.  Again, a request to change this seems likely
> to get pushback.

Sure - it'd have to be a new opcode at this point, and almost certainly isn't worth it... I just think the wrong decision was made in the first place, and we'd have more solid ground to design a version that doesn't require every application to provide its own specific filters if the decision had gone the other way. It doesn't matter at this point, I was just mentioning it as an aside.
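For reference, the kind of per-application filter I mean is roughly the find_class override pattern that the pickle documentation describes for restricting globals. The allowlist contents and the names RestrictedUnpickler and restricted_loads below are just choices for this sketch, and, as discussed earlier in the thread, an allowlist like this narrows what a pickle can reach but does not by itself make unpickling untrusted data safe.

```python
import builtins
import io
import pickle

# Illustrative allowlist; every application has to pick its own.
SAFE_BUILTINS = {"range", "complex", "set", "frozenset", "slice"}


class RestrictedUnpickler(pickle.Unpickler):
    """Unpickler that only resolves an explicit allowlist of builtins."""

    def find_class(self, module, name):
        # Called whenever the pickle stream asks for a global by name.
        if module == "builtins" and name in SAFE_BUILTINS:
            return getattr(builtins, name)
        raise pickle.UnpicklingError(
            f"global '{module}.{name}' is forbidden"
        )


def restricted_loads(data):
    """Analogue of pickle.loads() that uses the restricted unpickler."""
    return RestrictedUnpickler(io.BytesIO(data)).load()


# An allowed global round-trips; anything else is rejected.
print(restricted_loads(pickle.dumps(complex(1, 2))))
try:
    restricted_loads(pickle.dumps(print))
except pickle.UnpicklingError as exc:
    print("rejected:", exc)
```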
_______________________________________________
Python-ideas mailing list -- python-ideas@python.org
Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/OHSE4CJHMGIR5NEA7GBRHWPZAPTJADWE/