2.6, 3.0, and truly independent interpreters

Andy O'Meara andy55 at gmail.com
Sat Oct 25 16:43:58 EDT 2008


On Oct 24, 10:24 pm, Glenn Linderman <v+pyt... at g.nevcal.com> wrote:
>
> > And in the case of hundreds of megs of data
>
> ... and I would be surprised at someone that would embed hundreds of
> megs of data into an object such that it had to be serialized... seems
> like the proper design is to point at the data, or a subset of it, in a
> big buffer.  Then data transfers would just transfer the offset/length
> and the reference to the buffer.
>
> > and/or thousands of data structure instances,
>
> ... and this is another surprise!  You have thousands of objects (data
> structure instances) to move from one thread to another?

Heh, no, we're actually in agreement here.  I'm saying that in the
case where the data sets are large and/or intricate, a single top-
level pointer changing hands is *always* the way to go rather than
serialization.  For example, suppose you had some nifty python code
and C procs that were doing lots of image analysis, outputting tons of
intricate and rich data structures.  Once the thread is done with that
job, all that output is trivially transferred back to the appropriate
thread by a pointer changing hands.
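
To make that concrete, here's a rough sketch of what I mean, using
stock CPython threads and a Queue (the worker and its output structure
are just hypothetical stand-ins for the image-analysis job):

    import threading
    import queue

    results = queue.Queue()

    def analysis_worker():
        # Hypothetical stand-in for the analysis job: builds a large,
        # intricate result structure.
        output = {
            "regions": [[i, i * 2] for i in range(100000)],
            "paths": {"p%d" % i: list(range(10)) for i in range(1000)},
        }
        # Hand the whole thing to the consumer thread -- only the
        # reference (a single pointer, in effect) changes hands.
        results.put(output)

    t = threading.Thread(target=analysis_worker)
    t.start()
    data = results.get()   # same object: no copy, no serialization
    t.join()
    print(len(data["regions"]))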

>
> Of course, I know that data get large, but typical multimedia streams
> are large, binary blobs.  I was under the impression that processing
> them usually proceeds along the lines of keeping offsets into the blobs,
> and interpreting, etc.  Editing is usually done by making a copy of a
> blob, transforming it or a subset in some manner during the copy
> process, resulting in a new, possibly different-sized blob.

No, you're definitely right-on, with the additional point that the
representation of multimedia usually employs intricate and diverse
data structures (imagine the data structure representation of a movie
encoded in a modern codec, such as H.264, complete with paths, regions,
pixel flow, geometry, transformations, and textures).  As we both
agree, that's something you *definitely* want to move around via
a single pointer (and not in serialized form).  Hence my position
that apps that use Python can't be forced to go through IPC, because
otherwise: (a) there's a performance/resource cost to serialize and
unserialize large or intricate data sets, and (b) they're required to
write and maintain serialization code that serves no other purpose.
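
To put a rough shape on point (a): if that same kind of structure has
to cross a process boundary instead, it gets pickled on one side and
unpickled on the other.  A minimal sketch (again, the structure here is
just a hypothetical stand-in; actual numbers will vary by machine):

    import pickle
    import time

    # Stand-in for a large, intricate analysis result.
    output = {
        "regions": [[i, i * 2] for i in range(100000)],
        "paths": {"p%d" % i: list(range(10)) for i in range(1000)},
    }

    t0 = time.time()
    blob = pickle.dumps(output)    # serialize to cross the boundary
    clone = pickle.loads(blob)     # unserialize on the other side
    print("round-trip: %.3f s, %d bytes" % (time.time() - t0, len(blob)))

None of that work buys the app anything; with a shared address space
the handoff is free.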

Andy
