type conversion in python

Alan Kennedy alanmk at hotmail.com
Thu May 29 07:03:07 EDT 2003


Winnie Poon wrote:

> I’m writing a socket server in python and I’m having some trouble
> converting the data received.

> Another socket client, which is written in C, sends data to a
> particular ipaddress and port number and my job is to parse the
> data received. Now the problem is, the data sent from the client
> is an array of uint2. If the server is written also in C/C++,
> then I can cast the char* to uint2 * and I would be able to obtain
> the data.

In the distributed computing world, e.g. CORBA, what you are doing is
called "marshalling": you serialise a representation of some data into
a sequence of bytes for sending down a wire. The data are then
"unmarshalled" (deserialised) on the other end.

As long as you always send and receive the data in precisely the same
format, it will all work fine. However, there are complications:

1. Byte order. On Intel x86 systems (Pentium class included), multi-byte
values are stored in memory in "little endian" fashion, i.e. the least
significant byte of a 4-byte integer comes first: byte 4, byte 3,
byte 2, byte 1, where byte 1 is the most significant. On "big endian"
systems, the reverse is true: the sequence of storage is 1234. Now if
you send a network packet containing, say, an integer from a big-endian
to a little-endian machine, the bytes arrive intact but the receiver
reads them in the wrong order, so it sees the wrong value. For this
reason, all data should be translated to "network byte order" (i.e.
big-endian) before being sent. More on this here:

http://www.cs.rpi.edu/courses/sysprog/sockets/byteorder.html
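
For the array in the original question, Python's standard struct module
can do the unpacking. What follows is only a sketch: it assumes that
"uint2" means a 2-byte unsigned integer and that the C client converts
each value with htons() before sending. The function name
unpack_uint16_array is made up for illustration.

```python
import struct

def unpack_uint16_array(payload, byte_order="!"):
    """Decode raw bytes into a tuple of 16-bit unsigned integers.

    byte_order is a struct prefix: "!" for network (big-endian) order,
    "<" for little-endian. Which one is right depends on what the
    sender actually does (assumed here: htons() on every value).
    """
    if len(payload) % 2:
        raise ValueError("payload length must be a multiple of 2 bytes")
    count = len(payload) // 2
    return struct.unpack("%s%dH" % (byte_order, count), payload)

# Stand-in for bytes read from the socket: three values in network order.
payload = struct.pack("!3H", 1, 258, 65535)

values = unpack_uint16_array(payload)          # correct interpretation
swapped = unpack_uint16_array(payload, "<")    # wrong byte order assumed
```

If the client sends its native little-endian values without converting
them, pass "<" instead. Decoding with the wrong prefix raises no error;
it just yields different numbers, as the swapped tuple shows.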

2. Data representation. Different languages and platforms represent data
in different ways. For example, C represents a string as a pointer to a
null-terminated series of bytes. Python doesn't do that: instead, it
keeps track of the length of the string itself, hides that
implementation detail from you, and just gives you the characters in
the string when you ask for them. Now, if you send a string from a
python program to a C program, across a network, and without taking
these differences into account, the C program will likely break,
because python did not send the null-terminating zero-byte that C needs.
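
One way to bridge that particular gap is to add and strip the terminator
explicitly on the python side. This is a minimal sketch, assuming the C
program expects ASCII text followed by a single zero byte; the helper
names encode_c_string and decode_c_string are hypothetical.

```python
def encode_c_string(text, encoding="ascii"):
    """Return the bytes a C receiver expects: the text plus a zero byte."""
    data = text.encode(encoding)
    if b"\x00" in data:
        # An embedded zero byte would cut the string short on the C side.
        raise ValueError("text must not contain a zero byte")
    return data + b"\x00"

def decode_c_string(buf, encoding="ascii"):
    """Extract the text before the first zero byte in a received buffer."""
    end = buf.find(b"\x00")
    if end < 0:
        raise ValueError("no terminating zero byte in buffer")
    return buf[:end].decode(encoding)

wire = encode_c_string("hello")
text = decode_c_string(b"hello\x00left-over bytes")
```

Sending an explicit length before the characters is another common
choice; either works as long as both ends agree on it.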

There are oodles of examples of these kinds of incompatibilities between
languages and hardware platforms, and oodles of papers on how to solve
the problems that they give rise to.

The solution that the CORBA people took, in the Internet Inter-ORB
Protocol or IIOP (actually GIOP, but I'm not going
there), was to specify a common "wire representation" of all data types,
to which all compliant implementations must adhere. The CORBA solution
is a sophisticated one, involving the use of numbers to identify types
(typecodes), etc.
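
To give a flavour of the idea, here is a toy tagged format. It is NOT
the CORBA encoding, just an illustration of "a number identifies the
type, and the layout for each type is fixed". The type codes and the
function names are invented for this sketch.

```python
import struct

# Invented type codes; a real protocol defines these in its specification.
TYPE_UINT16 = 1
TYPE_STRING = 2

def encode_value(value):
    """Encode an int (0..65535) or a str as: 1-byte tag, then the data."""
    if isinstance(value, int):
        return struct.pack("!BH", TYPE_UINT16, value)
    if isinstance(value, str):
        data = value.encode("utf-8")
        # Strings carry an explicit 2-byte length instead of a terminator.
        return struct.pack("!BH", TYPE_STRING, len(data)) + data
    raise TypeError("unsupported type: %r" % type(value))

def decode_value(buf, offset=0):
    """Decode one value at offset; return (value, offset of next value)."""
    tag, field = struct.unpack_from("!BH", buf, offset)
    offset += 3
    if tag == TYPE_UINT16:
        return field, offset
    if tag == TYPE_STRING:
        end = offset + field
        if end > len(buf):
            raise ValueError("truncated string")
        return buf[offset:end].decode("utf-8"), end
    raise ValueError("unknown type code: %d" % tag)

message = encode_value(513) + encode_value("hi")
first, pos = decode_value(message)
second, pos = decode_value(message, pos)
```

Because every field is written in network byte order with a fixed
layout, a reader in any language can decode the message without knowing
anything about the machine that produced it.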

So, if you're not thinking in these platform and language independent
terms, I think you'll find yourself continually falling over these kinds
of incompatibilities.

So, as other posters have asked: what is the problem that you are trying
to solve? If you are simply trying to transfer data from one machine to
another, or to call object methods on one machine from another machine,
you might make your life a *whole* lot simpler by using a framework
designed for the purpose, such as Pyro or CORBA, which hide all of this
nastiness from you, and do all the hard work so you don't have to.

http://pyro.sourceforge.net/

Pyro might not be useful for you if you need inter-language calling
between python and C. However, there are multiple CORBA implementations
for both python and C.

HTH,

-- 
alan kennedy
-----------------------------------------------------
check http headers here: http://xhaus.com/headers
email alan:              http://xhaus.com/mailto/alan



