[Python-Dev] Iterating over marshal/pickle

Tim Lesher tlesher at gmail.com
Mon Oct 9 16:52:44 CEST 2006


Both marshal and pickle allow multiple objects to be serialized to the
same file-like object.

The pattern for deserializing an unknown number of serialized objects
looks like this:

objs = []
while True:
  try:
    objs.append(marshal.load(fobj)) # or objs.append(unpickler.load())
  except EOFError:
    break

This seems like a good use case for a generator:

def some_name(fobj):
  while True:
    try:
      yield marshal.load(fobj) # or yield unpickler.load()
    except EOFError:
      return  # end of input: finish the generator normally

1. Does this seem like a reasonable addition to the standard library?
2. Where should it go, and what should it be called?
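
To make the idea concrete, here is a minimal, self-contained sketch of the generator, assuming Python 3. The name iter_load and its (load, fobj) signature are hypothetical, chosen only for illustration; nothing by that name exists in the standard library. It relies only on the fact that pickle.load and marshal.load raise EOFError once the input is exhausted.

```python
import io
import marshal
import pickle


def iter_load(load, fobj):
    """Yield objects from fobj until load() reports end of input.

    iter_load is a hypothetical helper name. `load` is any callable,
    such as pickle.load or marshal.load, that raises EOFError at EOF.
    """
    while True:
        try:
            obj = load(fobj)
        except EOFError:
            return  # no more data: end the generator normally
        yield obj


# Serialize several objects back to back into one in-memory stream,
# then read them all back through the generator.
values = [1, "two", [3.0, None]]

pickled = io.BytesIO()
for v in values:
    pickle.dump(v, pickled)
pickled.seek(0)
from_pickle = list(iter_load(pickle.load, pickled))

marshalled = io.BytesIO()
for v in values:
    marshal.dump(v, marshalled)
marshalled.seek(0)
from_marshal = list(iter_load(marshal.load, marshalled))
```

Keeping only the load call inside the try block means an EOFError raised by the consumer's own code is not swallowed. A truncated final record may also surface as EOFError, so this sketch cannot tell a clean end of file from a cut-off one.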

From an end-user point of view, this "feels" right:

import pickle
u = pickle.Unpickler(open('picklefile', 'rb'))
for x in u:
  print x

import marshal
for x in marshal.unmarshalled(open('marshalfile', 'rb')):
  print x

But I'm not hung up on the actual names or the use of sequence
semantics in the Unpickler case.
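
As a rough sketch of how the Unpickler spelling could behave, assuming Python 3, the iteration can be layered on with a subclass. IterableUnpickler is a hypothetical name used only for illustration; pickle.Unpickler itself does not support iteration.

```python
import io
import pickle


class IterableUnpickler(pickle.Unpickler):
    """Hypothetical subclass: iterating yields each pickled object in turn."""

    def __iter__(self):
        return self

    def __next__(self):
        try:
            return self.load()
        except EOFError:
            # End of the underlying stream ends the iteration.
            raise StopIteration from None


# Write a few objects to one in-memory stream, then iterate over them.
buf = io.BytesIO()
for item in ("a", 2, {"c": 3}):
    pickle.dump(item, buf)
buf.seek(0)

loaded = list(IterableUnpickler(buf))
```

Reusing a single Unpickler for the whole stream matches the "for x in u" usage above, at the cost of making the unpickler a one-shot iterator rather than a reusable sequence.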

Incidentally, I know that pickle is preferred over marshal, but some
third-party tools (like the Perforce client) still use the marshal
library for serialization, so I've included it in the discussion.
-- 
Tim Lesher <tlesher at gmail.com>

