Socket module bug on OpenVMS

Tue Oct 24 01:29:50 EDT 2006

Irmen de Jong schrieb:
> Although I'm not yet convinced about the all-or-nothing failure
> model you were talking about. How can I know if recv() fails
> with a MemoryError (or socket error) that it actually didn't
> receive anything? Is that even an assumption I can rely on?

Yes, you can. It can't produce a MemoryError (AFAICT, there is
no code path which possibly could (*)). If there is a socket
error, then the operating system guarantees that no data
were written into the buffer, and that subsequent calls
will yield any data that might have been available.

(*) It can, of course, produce a MemoryError if the result buffer
can't be allocated. If that happens, no recv(2) call has been
made, so the MemoryError indicates that nothing has happened -
you need to provide the result buffer before calling recv; that's
why you need to provide the size of the message you want to
receive.

> If it isn't, splitting up the recv() internally (into smaller
> blocks) wouldn't be any different, right?

Right. However, directly calling operating system calls and directly
exposing their results allows for such all-or-nothing error model,
as the operating system API itself is designed with the goal of
atomicity for each operation (or, if there are partial results,
has a way of reporting them, e.g. for partial writes).

> Nope, I was indeed thinking about splitting up the recv() into
> smaller blocks internally.

Look at shutil.copytree for an example of what this leads to.
This routine originally traversed the input tree, copied all
files it could, and skipped over those it couldn't. It ignored
the basic principle "errors should never pass silently".

I then tried fixing it, and the result is really ugly: It
keeps a list of all exceptions that occurred, and then
raises a single exception that contains them all. The
caller is supposed to know this, and perhaps print out
multiple error messages.

shutil.rmtree takes a different approach, which could be
considered just as ugly: it provides an onerror call-back
function which is invoked whenever some remove operation
fails.

I also think there is some API that returns the partial
result inside the exception if there is an exception.
The caller has to know this, and look at any data that
are already in the exception. Of course, changing an
existing API to switch to such a convention should
be considered as an incompatible change.

> Maybe a more elaborate check in Python's build system (to remove
> the MSG_WAITALL on VMS) would qualify as some sort of work-around
> as well.

Before that is done, somebody should confirm that MSG_WAITALL
is really not behaving correctly on VMS. Then, the check doesn't
have to be all that elaborate: a simple #ifdef would do.

Regards,
Martin