[Twisted-Python] Doing HTTP file uploads (multipart forms)
![](https://secure.gravatar.com/avatar/415203f2727ceaf56d8f7f5e6d5d508b.jpg?s=120&d=mm&r=g)
Hi,

I have some code that takes file uploads from browsers. I'm trying to write a test for it, so now I need to get Twisted to do file uploads like browsers do. I think my code (not test code) essentially works, judging by manually trying it with a browser. I can't get the functional test part to work.

I have reduced the problem to what I think is an SSCCE. render_POST drops into a debugger to easily inspect the received request.

Here's the code: https://gist.github.com/3058974

When debugging this with Wireshark I found an obvious culprit: there's some random junk in front of the body (3 ASCII hex digits and a CRLF) and some junk at the end (CRLF and an ASCII "0", although I'm not sure if that CRLF is junk). Wireshark reports some broken TCP packets (PCAP attached). I have no idea why that happens. Packets were captured with:

    tcpdump -i lo0 -nn -s0 -w sample.pcap port 8080

and analyzed with a recent version of Wireshark (1.6.2, SVN rev 38931). The analyzed TCP stream is also attached.

If I look at the request in the debugger (render_POST *DOES* get called…):

- it has an empty request.args, instead of having the expected keys "a", "b", "f"
- request.content.getvalue() has the data you see in tcpdata.txt: it starts with "'1e7\r\n------------" even though I obviously would like it to start with just the dashes

cheers
lvh
![](https://secure.gravatar.com/avatar/415203f2727ceaf56d8f7f5e6d5d508b.jpg?s=120&d=mm&r=g)
As an extra reference, see the behavior Firefox shows when talking to the sample server (with some value for the sample field and the Axiom README as the file).

cheers
lvh

On 06 Jul 2012, at 10:47, Laurens Van Houtven wrote:
> I have some code that takes file uploads from browsers. [...]
![](https://secure.gravatar.com/avatar/426d6dbf6554a9b3fca1fd04e6b75f38.jpg?s=120&d=mm&r=g)
> some random junk in front of it (3 ASCII hex digits and a CRLF) and some junk at the end (CRLF and an ASCII "0", although I'm not sure if that CRLF is junk). Wireshark reports some broken TCP packets (PCAP

The hex/CRLF is the HTTP chunked transfer format.

--
Sent from my phone. Please excuse brevity and typos.
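To make the framing concrete, here is a minimal sketch of chunked transfer coding in plain Python. It is not Twisted's implementation; `encode_chunked` and `decode_chunked` are hypothetical helper names, and the payload is made up to match the 0x1e7 (487 byte) size seen in the capture. Each chunk is its size in hex, a CRLF, the data, and another CRLF; a zero-size chunk ends the body, which accounts for the trailing CRLF and "0".

```python
def encode_chunked(payload: bytes) -> bytes:
    """Frame payload as one chunk followed by the terminating zero-size chunk."""
    chunk = b"%x\r\n%s\r\n" % (len(payload), payload) if payload else b""
    return chunk + b"0\r\n\r\n"


def decode_chunked(raw: bytes) -> bytes:
    """Decode a chunked body; chunk extensions and trailers are ignored."""
    body = bytearray()
    pos = 0
    while True:
        line_end = raw.index(b"\r\n", pos)
        # The size line is hex digits, optionally followed by ";extension".
        size_field = raw[pos:line_end].split(b";", 1)[0].strip()
        size = int(size_field.decode("ascii"), 16)
        pos = line_end + 2
        if size == 0:
            break  # last chunk; trailer headers are not handled in this sketch
        body += raw[pos:pos + size]
        if raw[pos + size:pos + size + 2] != b"\r\n":
            raise ValueError("chunk data not followed by CRLF")
        pos += size + 2
    return bytes(body)


# Made-up 487-byte payload that starts with dashes, like a multipart boundary.
payload = b"-" * 12 + b"x" * 475
wire = encode_chunked(payload)
assert wire.startswith(b"1e7\r\n------------")
assert decode_chunked(wire) == payload
```

Seen this way, the leading "1e7\r\n" is the chunk-size line rather than corruption: a receiver that decodes the transfer coding would see a body starting with the dashes.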
![](https://secure.gravatar.com/avatar/415203f2727ceaf56d8f7f5e6d5d508b.jpg?s=120&d=mm&r=g)
Aha, okay, so that's that possible culprit off the table. Thanks!

FWIW: Chunked transfer encoding *should* work with twisted.web.server, right?

cheers
lvh

On 06 Jul 2012, at 11:39, Phil Mayers wrote:
> The hex/cr-lf is http chunked transfer format.
_______________________________________________ Twisted-Python mailing list Twisted-Python@twistedmatrix.com http://twistedmatrix.com/cgi-bin/mailman/listinfo/twisted-python
![](https://secure.gravatar.com/avatar/426d6dbf6554a9b3fca1fd04e6b75f38.jpg?s=120&d=mm&r=g)
On 06/07/12 10:52, Laurens Van Houtven wrote:
> Aha, okay, so that's that possible culprit off the table. Thanks!
>
> FWIW: Chunked transfer encoding *should* work with twisted.web.server, right?

Good question. Unfortunately the Trac implementation is running slowly so I can't browse the HEAD code, but in my local copy I don't see any sign of chunked encoding handling in the client->server direction, i.e. for the request body. So, I think not.

I'm not even sure chunked encoding is *legal* in HTTP request bodies; which client is generating this format?
![](https://secure.gravatar.com/avatar/d7875f8cfd8ba9262bfff2bf6f6f9b35.jpg?s=120&d=mm&r=g)
On 07/06/2012 07:42 AM, Phil Mayers wrote:
> Good question. Unfortunately the Trac implementation is running slowly so I can't browse the HEAD code, but in my local copy I don't see any sign of chunked encoding handling in the client->server direction i.e. for request body.

The server definitely supports chunked encoding; see twisted/web/http.py:1585.

> I'm not even sure chunked encoding is *legal* in HTTP request bodies; which client is generating this format?

It is legal (RFC 2616, section 3.6.1): "All HTTP/1.1 applications MUST be able to receive and decode the "chunked" transfer-coding, and MUST ignore chunk-extension extensions they do not understand."
![](https://secure.gravatar.com/avatar/426d6dbf6554a9b3fca1fd04e6b75f38.jpg?s=120&d=mm&r=g)
On 06/07/12 14:05, Itamar Turner-Trauring wrote:
> The server definitely supports chunked encoding; see twisted/web/http.py:1585.

Ah, didn't spot that.
![](https://secure.gravatar.com/avatar/415203f2727ceaf56d8f7f5e6d5d508b.jpg?s=120&d=mm&r=g)
Attached is an updated version of the client and server, including a 1x1 white JPEG pixel (that Paintbrush decides to make an entire 2.6 kB, mostly due to i18n…). This way you can actually make semantically identical requests using both your browser (don't change the input fields other than the file field) and the client.

TIA,
lvh
![](https://secure.gravatar.com/avatar/426d6dbf6554a9b3fca1fd04e6b75f38.jpg?s=120&d=mm&r=g)
On 06/07/12 09:47, Laurens Van Houtven wrote:
> Hi,
> I have some code that takes file uploads from browsers. I'm trying to

Out of curiosity, I'm seeing no "List-Id" header in the posts that lvh makes; is anyone else seeing this? The other mailman headers are there, but no List-Id. I occasionally see this on other mailman lists, and don't know why it happens.
![](https://secure.gravatar.com/avatar/415203f2727ceaf56d8f7f5e6d5d508b.jpg?s=120&d=mm&r=g)
Sorry, I hope this isn't causing you any issues! FWIW my client is Mail.app on a fully updated Lion machine.

cheers
lvh

On Fri, Jul 6, 2012 at 12:52 PM, Phil Mayers <p.mayers@imperial.ac.uk> wrote:
> Out of curiosity, I'm seeing no "List-Id" header in the posts that lvh makes; is anyone else seeing this?
> The other mailman headers are there, but no List-Id.
> I occasionally see this on other mailman lists, and don't know why it happens.
![](https://secure.gravatar.com/avatar/426d6dbf6554a9b3fca1fd04e6b75f38.jpg?s=120&d=mm&r=g)
On 06/07/12 15:26, Laurens Van Houtven wrote:
> Sorry, I hope this isn't causing you any issues!

No biggie. I *think* it must be messages with attachments; only your first two had that issue, the rest of the thread has List-Id. It's probably mailman being dumb.
![](https://secure.gravatar.com/avatar/01aa7d6d4db83982a2f6dd363d0ee0f3.jpg?s=120&d=mm&r=g)
On Jul 06, 2012, at 05:04 PM, Phil Mayers wrote:
> No biggie.
> I *think* it must be messages with attachments; only your first two had that issue, the rest of the thread has List-Id.
> It's probably mailman being dumb.

Inconceivable! <vizzini wink>

Reading the list via gmane, and looking at lvh's Message-IDs

<32FC267C-F4F6-4FBB-B553-15812751A7E2@lvh.cc>
<3B814A92-5FED-45C5-A6AA-B77314E25899@lvh.cc>

both of which contain attachments, I see the expected List-IDs.

-Barry
![](https://secure.gravatar.com/avatar/426d6dbf6554a9b3fca1fd04e6b75f38.jpg?s=120&d=mm&r=g)
> Reading the list via gmane, and looking at lvh's Message-IDs
> <32FC267C-F4F6-4FBB-B553-15812751A7E2@lvh.cc> <3B814A92-5FED-45C5-A6AA-B77314E25899@lvh.cc>
> both of which contain attachments, I see the expected List-IDs.

Weird. Maybe it is at my end; I wonder if our exim config or spam filter is selectively mangling...
![](https://secure.gravatar.com/avatar/415203f2727ceaf56d8f7f5e6d5d508b.jpg?s=120&d=mm&r=g)
With a lot of help from idnar, the issue is resolved!

There were a few red herrings that ended up cleaning up my code a lot, but eventually it boiled down to my MIME-generating code simply being busted. Fixed version attached.

Apparently this also highlighted a minor issue in twisted.web with the wrong header being preferred, but I'll leave that up to the expert :)

cheers
lvh
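The fixed attachment is not reproduced in this archive, so the following is only a sketch of what a well-formed multipart/form-data body looks like, not lvh's actual fix. `encode_multipart` is a hypothetical helper; the field names "a", "b" and "f" come from the original post, while the values, the file name and the fixed boundary are made up for illustration. It assumes field names and file names contain no quotes or line breaks.

```python
import uuid


def encode_multipart(fields, files, boundary=None):
    """Build (headers, body) for a multipart/form-data request.

    fields: {name: text value}
    files:  {name: (filename, content_type, data bytes)}
    """
    boundary = boundary or uuid.uuid4().hex
    delimiter = b"--" + boundary.encode("ascii")
    lines = []
    for name, value in fields.items():
        lines += [
            delimiter,
            b'Content-Disposition: form-data; name="%s"' % name.encode("ascii"),
            b"",  # blank line separates part headers from part content
            value.encode("utf-8"),
        ]
    for name, (filename, content_type, data) in files.items():
        lines += [
            delimiter,
            b'Content-Disposition: form-data; name="%s"; filename="%s"'
            % (name.encode("ascii"), filename.encode("ascii")),
            b"Content-Type: " + content_type.encode("ascii"),
            b"",
            data,
        ]
    # The closing delimiter carries two extra dashes and ends with CRLF.
    lines += [delimiter + b"--", b""]
    body = b"\r\n".join(lines)
    headers = {
        # The boundary in the header has no leading dashes; the body adds them.
        "Content-Type": "multipart/form-data; boundary=" + boundary,
        "Content-Length": str(len(body)),
    }
    return headers, body


headers, body = encode_multipart(
    {"a": "1", "b": "2"},
    {"f": ("pixel.jpg", "image/jpeg", b"\xff\xd8 not a real JPEG")},
    boundary="sketchboundary",
)
```

Common ways for hand-rolled MIME code to go wrong include using bare LF instead of CRLF, omitting the blank line after the part headers, or forgetting the trailing "--" on the final delimiter. Which of these, if any, applied here is not stated in the thread.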
![](https://secure.gravatar.com/avatar/415203f2727ceaf56d8f7f5e6d5d508b.jpg?s=120&d=mm&r=g)
Forgot the attachment (as itamar pointed out).

cheers
lvh

On 07 Jul 2012, at 16:57, Laurens Van Houtven wrote:
> With a lot of help from idnar, the issue is resolved! [...]
![](https://secure.gravatar.com/avatar/415203f2727ceaf56d8f7f5e6d5d508b.jpg?s=120&d=mm&r=g)
Forgot the attachment… twice. Yay for new email clients I'm not used to.

cheers
lvh

On 07 Jul 2012, at 16:57, Laurens Van Houtven wrote:
> With a lot of help from idnar, the issue is resolved! [...]
participants (4)
- Barry Warsaw
- Itamar Turner-Trauring
- Laurens Van Houtven
- Phil Mayers