[Python-Dev] bpo-36829: Add sys.unraisablehook()

Thu May 16 06:12:20 EDT 2019

Le jeu. 16 mai 2019 à 08:39, Serhiy Storchaka <storchaka at gmail.com> a écrit :
>
> 16.05.19 04:23, Victor Stinner пише:
> > The first implementation of my API used sys.unraisablehook(exc_type,
> > exc_value, exc_tb, obj). The problem is that Serhiy Storchaka asked me
> > to add a new error message field which breaks the API: the API is not
> > future-proof.
> >
> > I modified my API to create an object to pack arguments. The new API
> > becomes sys.unraisablehook(unraisable) where unraisable has 4 fields:
> > exc_type, exc_value, exc_tb, obj. This API is now future-proof: adding
> > a new field will not break existing custom hooks!
>
> I prefer the former design, when the hook takes 5 arguments: exc_type,
> exc_value, exc_tb, obj and msg. Any additional human readable
> information can be encoded in msg, and machine readable information can
> be encoded in msg or obj. Currently we have no plans for adding more
> details, and I do not think that we will need to do this in future.
> Packing arguments into a single extendable object just complicates the
> code and increases the chance of raising an exception or crashing.
>
> Even obj and msg could be merged into a single object, and the hook
> could check whether it is a string or callable. But passing them as
> separate arguments is fine to me, I just think that no more complication
> is necessary.

I came to the UnraisableHookArgs structseq idea because of my bad
experience with extension warnings "hooks". Let me tell you this story
:-)

To enhance ResourceWarning messages, I modified Python 3.6 to add a
new "source" parameter to the warnings module. It's the object which
caused the ResourceWarning. This object is mostly used to display the
traceback where the object has been allocated, traceback read using
tracemalloc.get_object_traceback() and so required tracemalloc to be
enabled.

   https://bugs.python.org/issue26604

The problem is that the 2 warnings "hooks" use a fixed number of
positional (and positional-or-keyword) parameters:

def showwarning(message, category, filename, lineno, file=None, line=None): ...
def formatwarning(message, category, filename, lineno, line=None): ...

Adding a new parameter would simply break *all* custom hooks in the wild...

At the same time, internally, the warnings module uses a
WarningMessage object to "pack" all parameters into a single object. I
was able to easily add a new "source" attribute to WarningMessage
without breaking the code using WarningMessage.

I added new hooks which accept WarningMessage:

def _showwarnmsg(msg): ...
def _formatwarnmsg(msg): ...

The problem is to keep the backward compatibility: decide which hook
should be called... How to detect that showwarning or _showwarnmsg has
been overriden? If both are overriden, which one should be modified?

The implementation is tricky, and it caused a few minor regressions:

https://github.com/python/cpython/commit/be7c460fb50efe3b88a00281025d76acc62ad2fd

All these problems are the consequence of the initial design choice of
warnings hooks: using a fixed number of parameters. This API simply
cannot be extended. IMHO it's a bad design,  and we can do better:
WarningMessage is a better design.

--

I would prefer to not reproduce the same API design mistake with
sys.unraisablehook.

I modified my PR to only use 4 fields: (exc_type, exc_value, exc_tb, obj).

I plan to add a 5th field "err_msg" ("error_message"?) later to
enhance the hook (allow to pass a custom error message to give more
context where the exception comes from). Passing a single
UnraisableHookArgs parameter to sys.unraisablehook allows to add a 5th
field without breaking the backward compatibility.

> Any additional human readable
> information can be encoded in msg, and machine readable information can
> be encoded in msg or obj.

IMHO it's very risky to declare the current API as "complete". In my
experience, we always want to enhance an API at some point to pass
more information. IMHO the warnings module with my new "source"
parameter is a good example.

For unraisable hook, it's not hard to imagine other useful parameters
that can be passed to the hook to provide more context about the
exception. For example, right now we only pass one object. But there
are cases where a second object would totally makes sense.

--

> Packing arguments into a single extendable object just complicates the
> code and increases the chance of raising an exception or crashing.

Technically, UnraisableHookArgs is basically a tuple of 4 items. I
consider that there is a low risk that creating the object can fail.

UnraisableHookArgs creation failure *is* covered by my implementation!
The default hook is called directly in C without using a temporary
UnraisableHookArgs object. The exception which caused the failure is
logged as well.

IMHO the very little risk of UnraisableHookArgs creation failure is
worth it, compared to the pain to extend an API based on a fixed
number of parameters.

Victor
-- 
Night gathers, and now my watch begins. It shall not end until my death.