[Python-Dev] urllib, multipart/form-data encoding and file uploads
janssen at parc.com
Sat Jun 28 01:21:03 CEST 2008
All sounds reasonable to me.
> On Fri, Jun 27, 2008 at 11:40 AM, Bill Janssen <janssen at parc.com> wrote:
> >> I notice that there is some work being done on urllib / urllib2 for
> >> python 2.6/3.0. One thing I've always missed in urllib/urllib2 is the
> >> facility to encode POST data as multipart/form-data. I think it would
> >> also be useful to be able to stream a POST request to the remote
> >> server rather than requiring the user to create the entire POST
> >> body in memory before starting the request. This would be extremely
> >> useful when writing any kind of code that does file uploads.
> >> I didn't see any recent discussion about this so I thought I'd ask
> >> here: do you think this would make a good addition to the new urllib
> >> package?
> > I think it would be very helpful. I'd separate the two things,
> > though; you want to be able to format a set of values as
> > "multipart/form-data", and do various things with that resulting
> > "document", and you want to be able to stream a POST (or PUT) request.
> How about if the function that encoded the values as "multipart/form-data"
> was able to stream data to a POST (or PUT) request via an iterator that
> yielded chunks of data?
> def multipart_encode(params, boundary=None):
>     """Encode ``params`` as multipart/form-data.
>
>     ``params`` should be a dictionary where the keys represent parameter names,
>     and the values are either parameter values, or file-like objects to
>     use as the parameter value. The file-like object must support the .read(),
>     .seek(), and .tell() methods.
>
>     If ``boundary`` is set, then it is used as the MIME boundary. Otherwise
>     a randomly generated boundary will be used. In either case, if the
>     boundary string appears in the parameter values a ValueError will be
>     raised.
>
>     Returns an iterable object that will yield blocks of data representing
>     the encoded parameters."""
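
For illustration only, here is a rough sketch of how a generator with that signature might behave. It is written for current Python 3 rather than 2.6, and several details are assumptions not stated in the proposal: the ``block_size`` argument, the use of ``uuid`` for the random boundary, reusing the field name as the filename, the fixed ``application/octet-stream`` part type, and files opened in binary mode. The boundary check is per block, so a boundary that straddles two blocks would go unnoticed.

```python
import io
import uuid


def multipart_encode(params, boundary=None, block_size=8192):
    """Yield blocks of bytes encoding ``params`` as multipart/form-data."""
    if boundary is None:
        boundary = uuid.uuid4().hex  # assumption: a random hex string
    marker = boundary.encode("ascii")

    def checked(block):
        # Per-block check only: a boundary split across two blocks is missed.
        if marker in block:
            raise ValueError("boundary appears in parameter data")
        return block

    for name, value in params.items():
        is_file = hasattr(value, "read")
        disposition = 'Content-Disposition: form-data; name="%s"' % name
        if is_file:
            # Reusing the field name as the filename is only a placeholder.
            disposition += '; filename="%s"' % name
            disposition += "\r\nContent-Type: application/octet-stream"
        # Part delimiter, part headers, then the blank line before the data.
        yield b"--" + marker + b"\r\n" + disposition.encode("utf-8") + b"\r\n\r\n"
        if is_file:
            # Stream the file in blocks instead of reading it all at once.
            while True:
                block = value.read(block_size)
                if not block:
                    break
                yield checked(block)
        else:
            yield checked(str(value).encode("utf-8"))
        yield b"\r\n"
    # Closing delimiter that ends the whole body.
    yield b"--" + marker + b"--\r\n"


# Joining the blocks here is only for demonstration; a real upload
# would send each block as it is produced.
body = b"".join(multipart_encode(
    {"title": "hello", "upload": io.BytesIO(b"file contents")},
    boundary="XYZ"))
```

Because the function is a generator, nothing is read from the file until the consumer asks for the next block, which is the property the streaming idea depends on.
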
> The file objects need to support .seek() and .tell() so we can determine
> how large they are before including them in the output. I've been trying
> to come up with a good way to specify the size separately so you could use
> unseekable objects, but no good ideas have come to mind. Maybe it could
> look for a 'size' attribute or callable on the object? That seems a bit
> A couple helper functions would be necessary as well, one to generate
> random boundary strings that are guaranteed not to collide with file data,
> and another function to calculate the total size of the encoding to be used
> in the 'Content-Length' header in the main HTTP request.
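
A minimal sketch of what those two helpers could look like, again in current Python 3. All names here (``gen_boundary``, ``file_size``, ``part_header``, ``encoded_size``) are hypothetical, as is the part-header layout. Note that a random boundary only makes a collision very unlikely; a real guarantee would require scanning the data. The size is computed with .seek() and .tell(), so no file contents are read.

```python
import io
import os
import uuid


def gen_boundary():
    """Return a random boundary; collisions are unlikely, not impossible."""
    return uuid.uuid4().hex


def file_size(fileobj):
    """Measure the bytes left in a seekable file without reading them."""
    start = fileobj.tell()
    fileobj.seek(0, os.SEEK_END)
    end = fileobj.tell()
    fileobj.seek(start)  # restore the position for the later upload
    return end - start


def part_header(name, boundary, is_file):
    """Build the header block for one part (hypothetical layout)."""
    lines = ["--" + boundary,
             'Content-Disposition: form-data; name="%s"' % name]
    if is_file:
        lines[1] += '; filename="%s"' % name
        lines.append("Content-Type: application/octet-stream")
    return ("\r\n".join(lines) + "\r\n\r\n").encode("utf-8")


def encoded_size(params, boundary):
    """Total byte length of the encoding, for the Content-Length header."""
    total = 0
    for name, value in params.items():
        is_file = hasattr(value, "read")
        total += len(part_header(name, boundary, is_file))
        if is_file:
            total += file_size(value)
        else:
            total += len(str(value).encode("utf-8"))
        total += 2  # CRLF that ends the part
    # Closing delimiter: "--" + boundary + "--" + CRLF
    return total + len(boundary) + 6


upload = io.BytesIO(b"file contents")
size = encoded_size({"title": "hello", "upload": upload}, "XYZ")
```

The computed size is only valid if the encoder emits exactly the same headers, so in practice both would share ``part_header`` or an equivalent.
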
> Then we'd need to change either urllib or httplib to support iterable
> objects in addition to the regular strings that it currently uses.
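
For comparison, ``http.client`` in current Python 3 (the successor to httplib) accepts an iterable of bytes as the request body. A small sketch of how the pieces might fit together follows; ``post_iterable`` is a hypothetical helper name, and the host and path in the usage comment are placeholders.

```python
import http.client


def post_iterable(conn, path, chunks, total_size, content_type):
    """POST ``chunks`` (an iterable of bytes) over an open connection."""
    headers = {
        "Content-Type": content_type,
        # Set explicitly, since the length of an iterator cannot be inferred.
        "Content-Length": str(total_size),
    }
    conn.request("POST", path, body=chunks, headers=headers)
    return conn.getresponse()


# Usage (hypothetical host and path, not executed here):
# conn = http.client.HTTPConnection("upload.example")
# response = post_iterable(conn, "/upload", blocks, size,
#                          "multipart/form-data; boundary=" + boundary)
```

Passing the connection in keeps the helper independent of how the connection was created, so the same function works for HTTP and HTTPS connections.
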