Re: [Python-Dev] urllib, multipart/form-data encoding and file uploads
On Fri, Jun 27, 2008 at 11:40 AM, Bill Janssen wrote:
I notice that there is some work being done on urllib / urllib2 for Python 2.6/3.0. One thing I've always missed in urllib/urllib2 is the facility to encode POST data as multipart/form-data. I think it would also be useful to be able to stream a POST request to the remote server rather than requiring the user to create the entire POST body in memory before starting the request. This would be extremely useful when writing any kind of code that does file uploads.
I didn't see any recent discussion about this so I thought I'd ask here: do you think this would make a good addition to the new urllib package?
I think it would be very helpful. I'd separate the two things, though; you want to be able to format a set of values as "multipart/form-data", and do various things with that resulting "document", and you want to be able to stream a POST (or PUT) request.
How about if the function that encoded the values as "multipart/form-data" was able to stream data to a POST (or PUT) request via an iterator that yielded chunks of data?

    def multipart_encode(params, boundary=None):
        """Encode ``params`` as multipart/form-data.

        ``params`` should be a dictionary where the keys represent
        parameter names, and the values are either parameter values, or
        file-like objects to use as the parameter value. The file-like
        object must support the .read(), .seek(), and .tell() methods.

        If ``boundary`` is set, then it is used as the MIME boundary.
        Otherwise a randomly generated boundary will be used. In either
        case, if the boundary string appears in the parameter values a
        ValueError will be raised.

        Returns an iterable object that will yield blocks of data
        representing the encoded parameters."""

The file objects need to support .seek() and .tell() so we can determine how large they are before including them in the output. I've been trying to come up with a good way to specify the size separately so you could use unseekable objects, but no good ideas have come to mind. Maybe it could look for a 'size' attribute or callable on the object? That seems a bit hacky...

A couple of helper functions would be necessary as well: one to generate random boundary strings that are guaranteed not to collide with file data, and another to calculate the total size of the encoding, to be used in the 'Content-Length' header in the main HTTP request.

Then we'd need to change either urllib or httplib to support iterable objects in addition to the regular strings that it currently uses.

Cheers, Chris
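For anyone reading this thread later, here is a rough sketch of how the proposal above might look in Python 3. It is only one possible reading of the docstring, not code from the thread: the names make_boundary, encoded_length, and blocksize are made up for illustration, file objects are assumed to be opened in binary mode, and the boundary check only covers plain parameter values (file contents are not scanned, so a collision there is treated as improbable rather than impossible).

```python
import os
import uuid


def make_boundary():
    """Return a random boundary string (hypothetical helper)."""
    return uuid.uuid4().hex


def _file_size(fp):
    """Bytes left in a seekable file object, measured with seek()/tell()."""
    pos = fp.tell()
    fp.seek(0, os.SEEK_END)
    size = fp.tell() - pos
    fp.seek(pos)  # restore the position so the data can still be read
    return size


def _part_header(name, value, boundary):
    """Build the header bytes that precede one part's data."""
    disposition = 'form-data; name="%s"' % name
    lines = ["--" + boundary]
    if hasattr(value, "read"):
        filename = os.path.basename(str(getattr(value, "name", name)))
        lines.append('Content-Disposition: %s; filename="%s"'
                     % (disposition, filename))
        lines.append("Content-Type: application/octet-stream")
    else:
        lines.append("Content-Disposition: " + disposition)
    lines.extend(["", ""])  # blank line separates headers from data
    return "\r\n".join(lines).encode("utf-8")


def _iter_parts(params, boundary, blocksize):
    for name, value in params.items():
        yield _part_header(name, value, boundary)
        if hasattr(value, "read"):
            # Stream the file in blocks instead of loading it whole.
            while True:
                block = value.read(blocksize)
                if not block:
                    break
                yield block
        else:
            yield str(value).encode("utf-8")
        yield b"\r\n"
    yield ("--%s--\r\n" % boundary).encode("ascii")


def multipart_encode(params, boundary=None, blocksize=8192):
    """Validate eagerly, then return an iterator over encoded blocks."""
    if boundary is None:
        boundary = make_boundary()
    for value in params.values():
        # Assumption: only plain values are checked, not file contents.
        if not hasattr(value, "read") and boundary in str(value):
            raise ValueError("boundary appears in a parameter value")
    return _iter_parts(params, boundary, blocksize)


def encoded_length(params, boundary):
    """Total body size, computed without reading any file data."""
    total = len(boundary) + 6  # closing "--" + boundary + "--\r\n"
    for name, value in params.items():
        total += len(_part_header(name, value, boundary)) + 2  # + CRLF
        if hasattr(value, "read"):
            total += _file_size(value)
        else:
            total += len(str(value).encode("utf-8"))
    return total
```

In this sketch the caller would generate the boundary first, since the same string is needed for the Content-Type header, then call encoded_length() before iterating, because reading the blocks moves the file position.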
All sounds reasonable to me. Bill
Chris AtLee wrote:
Then we'd need to change either urllib or httplib to support iterable objects in addition to the regular strings that it currently uses.
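As a note for later readers: in current Python 3, http.client (the renamed httplib) does accept an iterable of bytes as the request body, provided the caller supplies Content-Length itself. The sketch below shows the idea under some assumptions. The helper names are hypothetical, and assigning a recording object to conn.sock is only a trick to make the example run without a network; real code would let the connection open its own socket and then call getresponse().

```python
import http.client


class _RecordingSocket:
    """Stand-in for a real socket: records what would go on the wire."""

    def __init__(self):
        self.sent = bytearray()

    def sendall(self, data):
        self.sent += data

    def close(self):
        pass


def send_iterable_body(conn, path, chunks, length, content_type):
    """POST an iterable of bytes blocks with an explicit Content-Length."""
    conn.request(
        "POST",
        path,
        body=chunks,
        headers={
            "Content-Type": content_type,
            # Supplied by the caller; an iterable has no len() to inspect.
            "Content-Length": str(length),
        },
    )


def record_request():
    """Return the raw bytes http.client would send for a chunked upload."""
    conn = http.client.HTTPConnection("example.invalid")
    sock = _RecordingSocket()
    conn.sock = sock  # demo-only: avoids opening a real connection
    chunks = (b"x" * 1000 for _ in range(5))  # generator, never one string
    send_iterable_body(conn, "/upload", chunks, 5000,
                       "application/octet-stream")
    return bytes(sock.sent)
```

The blocks are written to the socket one at a time, so the whole body never has to exist in memory as a single string.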
Chris,

To avoid losing these ideas, could you add them to the issue tracker as feature requests? It's too late to get them into 2.6/3.0, but they may make good additions for the next release cycle.

Cheers,
Nick.

--
Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
---------------------------------------------------------------
http://www.boredomandlaziness.org
On Fri, Jun 27, 2008 at 9:06 PM, Nick Coghlan wrote:
Chris,
To avoid losing these ideas, could you add them to the issue tracker as feature requests? It's too late to get them into 2.6/3.0 but they may make good additions for the next release cycle.
Cheers, Nick.
Issues #3243 and #3244 created. Cheers, Chris
participants (3)
- Bill Janssen
- Chris AtLee
- Nick Coghlan