[Python-bugs-list] [ python-Bugs-780354 ] socket.makefile() isn't compatible with marshal/cPickle/etc

SourceForge.net noreply@sourceforge.net
Mon, 04 Aug 2003 05:37:11 -0700


Bugs item #780354, was opened at 2003-07-30 12:02
Message generated for change (Comment added) made by dmgrime
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=780354&group_id=5470

Category: Python Library
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: David M. Grimes (dmgrime)
Assigned to: Skip Montanaro (montanaro)
Summary: socket.makefile() isn't compatible with marshal/cPickle/etc

Initial Comment:
Even on platforms where the underlying C libraries and associated 
code in
socketmodule.c support it, socket object's .makefile() method no 
longer
return "real" file objects.  This breaks support for cPickle, marshal, 
and
any other C modules which expect a file object as an argument.

The change initially appears in socket.py v1.36 - the changelog 
entry
states:

"""

The socket module now always uses the _socketobject wrapper 
class, even on
platforms which have dup(2).  The makefile() method is built directly 
on top
of the socket without duplicating the file descriptor, allowing timeouts 
to
work properly.  Includes a new test case (urllibnet) which requires 
the
network resource.

Closes bug 707074.
"""

I'm not sure of the implication WRT timeouts, but the patch is very 
small -
quite obviously removing the platform check which prevented the 
redefinition
of the symbol socket in the exports of socket.py.  It now is always 
the
_socketobject class, and not the underlying _socket.socket type 
(when it is
known the platform supports all functionality in _socket).

For now, I can workaround (since I don't need the timeout 
semantics) by
using _socket.socket() directly (I'm on a Linux platform), but I 
wondered if
this is something which can/should be addressed differently?



----------------------------------------------------------------------

>Comment By: David M. Grimes (dmgrime)
Date: 2003-08-04 07:37

Message:
Logged In: YES 
user_id=699265

I know the wrapper has been there for a long time - some 
platforms don't support the "native" underlying FILE 
implementation, but many (most) platforms did.

What was the problem that necessitated the change?  I'd be 
more than willing to spend some time looking into alternative 
solutions which keep the optimized implementation where 
possible.  Can you provide me with some history of the 
motivations for the current solution?

Thanks!

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-08-01 15:52

Message:
Logged In: YES 
user_id=21627

I don't know of any other approach to solve the problem.

Also notice that the wrapper object has existed for a long
time already, on some systems.

----------------------------------------------------------------------

Comment By: David M. Grimes (dmgrime)
Date: 2003-08-01 14:31

Message:
Logged In: YES 
user_id=699265

It is clear that something which used to work no longer 
works, and I'm just wondering if the breakage is necessary.  
I'm not sure what all was involved with respect to timeouts 
and why the makefile() implementation had to change.  I 
guess I see 3 problems:

1) Previously functional code breaks.  While marshal is 
not "general purpose" in the sense it can't deal with 
instances, etc. it is still very useful and for data which 
doesn't contain many redundant references (where cPickle 
reduces them) it is the most efficient way to serialize and 
unserialize "native" python types.

2) The performance of everything which uses the results of 
socket.makefile() now suffers the cost of a pure-python 
implementation where it had perviously been optimized as a C 
implementation on platforms which supported it (most).

3) Any other 3rd party C extensions which use the 
PyFile_Check (and would previously have worked) break.

Is there another way to solve the timeout problem?  Breaking 
something which has worked since 1.5 somehow feels wrong 
to me.

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-08-01 14:09

Message:
Logged In: YES 
user_id=21627

I don't think anymore that marshal has a problem. There is
nothing wrong with it breaking when operating on a socket;
marshal is not for general application use, anyway. Anybody
who wants to transmit marshalled data over a socket can
still dump them into a string first.

So I propose to fix this aspect by documenting this intended
limitation.

----------------------------------------------------------------------

Comment By: Skip Montanaro (montanaro)
Date: 2003-08-01 09:52

Message:
Logged In: YES 
user_id=44345

Agreed, marshal has problems.  It's problems are a bit
deeper than the technique in cPickle can solve, however.  It
seems to me that the WFILE struct is a bad hack.  It should
contain a union with three elements, one is the regular FILE *
stuff, one is the current write-to-string stuff, and a third is
a "generic object which has a write method".

I'll see if I can whip something up and pass along to Martin
for review.


----------------------------------------------------------------------

Comment By: David M. Grimes (dmgrime)
Date: 2003-08-01 09:27

Message:
Logged In: YES 
user_id=699265

Skip --

I stand corrected (for cPickle, anyway) - but I imagine there is a 
performance degredation over the "old" way of doing things - if I get a 
chance, I'll test that.  

The problem does still exist at least in marshal (.load and .dump only 
accept a "real" file object).  I did a grep for PyFile_Check in the source tree 
prior to posting the bug report, and was a little over zealous to include 
cPickle in the subject.  There are other places in the source ( and possibly 
in 3rd party extensions) which use PyFile_Check, and I haven't had a 
change to analyze those yet.

In any case, marshal doesn't work - in your test, if you change the cPickle 
to marshal, you'll get:

Traceback (most recent call last):
  File "x.py", line 10, in ?
    marshal.dump("Hello, world", f)
TypeError: marshal.dump() 2nd arg must be file

Again, I'm not sure of the impact other places where PyFile_Check is 
used, and I don't know the performance loss in cPickle (as compared to 
the "real" file implementation in pre-2.3 socket.makefile())

Let me know if there is anything I can do to help debug/fix!



----------------------------------------------------------------------

Comment By: Skip Montanaro (montanaro)
Date: 2003-08-01 09:18

Message:
Logged In: YES 
user_id=44345

I modified the Demo/sockets/unixclient.py script to the following:

# Echo client demo using Unix sockets
# Piet van Oostrum
from socket import *
FILE = 'blabla'
s = socket(AF_UNIX, SOCK_STREAM)
s.connect(FILE)
import cPickle
f = s.makefile("wb")
cPickle.dump("Hello, world", f)
f.close()
#s.send('Hello, world')
data = s.recv(1024)
s.close()
print 'Received', `data`

It seemed to work fine when connected to the corresponding
unixserver.py script.


----------------------------------------------------------------------

Comment By: Skip Montanaro (montanaro)
Date: 2003-08-01 09:11

Message:
Logged In: YES 
user_id=44345

Looking at the code in cPickle.c it appears that as long as the
object being written to has a 'write' method it should do the
right thing.  Can you provide a simple test case which fails?


----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-08-01 02:14

Message:
Logged In: YES 
user_id=21627

See the bug that this patch closes; I doubt there is an
alternative approach.

Instead, I think the C modules that expect file objects need
to be fixed.

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=780354&group_id=5470