[Python-bugs-list] [ python-Bugs-780354 ] socket.makefile() isn't compatible with marshal/cPickle/etc
SourceForge.net
noreply@sourceforge.net
Mon, 04 Aug 2003 05:37:11 -0700
Bugs item #780354, was opened at 2003-07-30 12:02
Message generated for change (Comment added) made by dmgrime
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=780354&group_id=5470
Category: Python Library
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: David M. Grimes (dmgrime)
Assigned to: Skip Montanaro (montanaro)
Summary: socket.makefile() isn't compatible with marshal/cPickle/etc
Initial Comment:
Even on platforms where the underlying C libraries and associated
code in
socketmodule.c support it, socket object's .makefile() method no
longer
return "real" file objects. This breaks support for cPickle, marshal,
and
any other C modules which expect a file object as an argument.
The change initially appears in socket.py v1.36 - the changelog
entry
states:
"""
The socket module now always uses the _socketobject wrapper
class, even on
platforms which have dup(2). The makefile() method is built directly
on top
of the socket without duplicating the file descriptor, allowing timeouts
to
work properly. Includes a new test case (urllibnet) which requires
the
network resource.
Closes bug 707074.
"""
I'm not sure of the implication WRT timeouts, but the patch is very
small -
quite obviously removing the platform check which prevented the
redefinition
of the symbol socket in the exports of socket.py. It now is always
the
_socketobject class, and not the underlying _socket.socket type
(when it is
known the platform supports all functionality in _socket).
For now, I can workaround (since I don't need the timeout
semantics) by
using _socket.socket() directly (I'm on a Linux platform), but I
wondered if
this is something which can/should be addressed differently?
----------------------------------------------------------------------
>Comment By: David M. Grimes (dmgrime)
Date: 2003-08-04 07:37
Message:
Logged In: YES
user_id=699265
I know the wrapper has been there for a long time - some
platforms don't support the "native" underlying FILE
implementation, but many (most) platforms did.
What was the problem that necessitated the change? I'd be
more than willing to spend some time looking into alternative
solutions which keep the optimized implementation where
possible. Can you provide me with some history of the
motivations for the current solution?
Thanks!
----------------------------------------------------------------------
Comment By: Martin v. Löwis (loewis)
Date: 2003-08-01 15:52
Message:
Logged In: YES
user_id=21627
I don't know of any other approach to solve the problem.
Also notice that the wrapper object has existed for a long
time already, on some systems.
----------------------------------------------------------------------
Comment By: David M. Grimes (dmgrime)
Date: 2003-08-01 14:31
Message:
Logged In: YES
user_id=699265
It is clear that something which used to work no longer
works, and I'm just wondering if the breakage is necessary.
I'm not sure what all was involved with respect to timeouts
and why the makefile() implementation had to change. I
guess I see 3 problems:
1) Previously functional code breaks. While marshal is
not "general purpose" in the sense it can't deal with
instances, etc. it is still very useful and for data which
doesn't contain many redundant references (where cPickle
reduces them) it is the most efficient way to serialize and
unserialize "native" python types.
2) The performance of everything which uses the results of
socket.makefile() now suffers the cost of a pure-python
implementation where it had perviously been optimized as a C
implementation on platforms which supported it (most).
3) Any other 3rd party C extensions which use the
PyFile_Check (and would previously have worked) break.
Is there another way to solve the timeout problem? Breaking
something which has worked since 1.5 somehow feels wrong
to me.
----------------------------------------------------------------------
Comment By: Martin v. Löwis (loewis)
Date: 2003-08-01 14:09
Message:
Logged In: YES
user_id=21627
I don't think anymore that marshal has a problem. There is
nothing wrong with it breaking when operating on a socket;
marshal is not for general application use, anyway. Anybody
who wants to transmit marshalled data over a socket can
still dump them into a string first.
So I propose to fix this aspect by documenting this intended
limitation.
----------------------------------------------------------------------
Comment By: Skip Montanaro (montanaro)
Date: 2003-08-01 09:52
Message:
Logged In: YES
user_id=44345
Agreed, marshal has problems. It's problems are a bit
deeper than the technique in cPickle can solve, however. It
seems to me that the WFILE struct is a bad hack. It should
contain a union with three elements, one is the regular FILE *
stuff, one is the current write-to-string stuff, and a third is
a "generic object which has a write method".
I'll see if I can whip something up and pass along to Martin
for review.
----------------------------------------------------------------------
Comment By: David M. Grimes (dmgrime)
Date: 2003-08-01 09:27
Message:
Logged In: YES
user_id=699265
Skip --
I stand corrected (for cPickle, anyway) - but I imagine there is a
performance degredation over the "old" way of doing things - if I get a
chance, I'll test that.
The problem does still exist at least in marshal (.load and .dump only
accept a "real" file object). I did a grep for PyFile_Check in the source tree
prior to posting the bug report, and was a little over zealous to include
cPickle in the subject. There are other places in the source ( and possibly
in 3rd party extensions) which use PyFile_Check, and I haven't had a
change to analyze those yet.
In any case, marshal doesn't work - in your test, if you change the cPickle
to marshal, you'll get:
Traceback (most recent call last):
File "x.py", line 10, in ?
marshal.dump("Hello, world", f)
TypeError: marshal.dump() 2nd arg must be file
Again, I'm not sure of the impact other places where PyFile_Check is
used, and I don't know the performance loss in cPickle (as compared to
the "real" file implementation in pre-2.3 socket.makefile())
Let me know if there is anything I can do to help debug/fix!
----------------------------------------------------------------------
Comment By: Skip Montanaro (montanaro)
Date: 2003-08-01 09:18
Message:
Logged In: YES
user_id=44345
I modified the Demo/sockets/unixclient.py script to the following:
# Echo client demo using Unix sockets
# Piet van Oostrum
from socket import *
FILE = 'blabla'
s = socket(AF_UNIX, SOCK_STREAM)
s.connect(FILE)
import cPickle
f = s.makefile("wb")
cPickle.dump("Hello, world", f)
f.close()
#s.send('Hello, world')
data = s.recv(1024)
s.close()
print 'Received', `data`
It seemed to work fine when connected to the corresponding
unixserver.py script.
----------------------------------------------------------------------
Comment By: Skip Montanaro (montanaro)
Date: 2003-08-01 09:11
Message:
Logged In: YES
user_id=44345
Looking at the code in cPickle.c it appears that as long as the
object being written to has a 'write' method it should do the
right thing. Can you provide a simple test case which fails?
----------------------------------------------------------------------
Comment By: Martin v. Löwis (loewis)
Date: 2003-08-01 02:14
Message:
Logged In: YES
user_id=21627
See the bug that this patch closes; I doubt there is an
alternative approach.
Instead, I think the C modules that expect file objects need
to be fixed.
----------------------------------------------------------------------
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=780354&group_id=5470