Hi,
This is my first post to this list.
I've been using Twisted for a few (like three) days. My wife had a baby two days ago, no, really. I just got the other two kids to bed and thought I'd ask...sorry if this has been answered eleven-thousand times before but I did search the archives first.
I'm writing a client app that needs to get the contents of a URL, usually just a web page.
I'm trying to use client.py's downloadPage(url, file, contextFactory=None, *args, **kwargs):
It claims to use a 'file' which, according to the docstring, can be a file or file-like object. It uses the HTTPDownloader class to perform the actual download.
In the pageEnd(): method of HTTPDownloader, the 'file' is closed.
Unfortunately, for 'file-like' objects, like StringIOs, this trashes the 'file'; not so useful since I need to work with the result.
I realize I can subclass and override and such but it seems odd that a function that is documented to use a file-like object uses a class that will destroy something that's not a 'file.'
I'm sure I'm missing something...or maybe nobody uses this 'cause there's something way better that I haven't discovered yet.
Clues?
Thanks,
S, AKA: Steve Steiner
s s wrote:
> Hi,
> This is my first post to this list. I've been using Twisted for a few (like three) days. My wife had a baby two days ago, no, really. I just got the other two kids to bed and thought I'd ask...sorry if this has been answered eleven-thousand times before but I did search the archives first.
> I'm writing a client app that needs to get the contents of a URL, usually just a web page.
> I'm trying to use client.py's downloadPage(url, file, contextFactory=None, *args, **kwargs):
> It claims to use a 'file' which, according to the docstring, can be a file or file-like object.
> It uses the HTTPDownloader class to perform the actual download. In the pageEnd(): method of HTTPDownloader, the 'file' is closed. Unfortunately, for 'file-like' objects, like StringIOs, this trashes the 'file'; not so useful since I need to work with the result.
Huh, so it does. That's a pretty rubbish feature of StringIO
> I realize I can subclass and override and such but it seems odd that a function that is documented to use a file-like object uses a class that will destroy something that's not a 'file.'
My guess is that calling "close" is to ensure that data is actually written to the file before returning, but I agree it's not helpful.
Since you obviously want the page as a string, I recommend you use t.w.client.getPage which returns the result as a deferred(string):
    from twisted.internet import reactor
    from twisted.web import client

    def got(text):
        print "page is", len(text), "bytes"

    client.getPage('http://www.google.com').addCallback(got)
    reactor.run()
> I'm sure I'm missing something...or maybe nobody uses this 'cause there's something way better that I haven't discovered yet.
> Clues?
> Thanks,
> S, AKA: Steve Steiner
Twisted-web mailing list Twisted-web@twistedmatrix.com http://twistedmatrix.com/cgi-bin/mailman/listinfo/twisted-web
* Phil Mayers p.mayers@imperial.ac.uk [2008-09-20 11:44:30 +0100]:
>> Unfortunately, for 'file-like' objects, like StringIOs, this trashes the 'file'; not so useful since I need to work with the result.
> Huh, so it does. That's a pretty rubbish feature of StringIO
Perhaps you could subclass StringIO to "neutralize" the data-destroying functionality of close()?
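A minimal sketch of that idea, not tested against Twisted itself. It assumes io.BytesIO as the in-memory buffer (the thread talks about StringIO, and the same override should apply there); the class name KeepOpenBytesIO and the really_close() helper are made up for illustration. The close() call below stands in for what pageEnd() is described as doing above.

```python
import io


class KeepOpenBytesIO(io.BytesIO):
    """Hypothetical in-memory buffer whose close() keeps the data readable."""

    def close(self):
        # Deliberately a no-op: the downloader may call close() when the
        # page ends, but the caller still wants getvalue() afterwards.
        pass

    def really_close(self):
        # Hypothetical helper: release the buffer once the contents
        # are no longer needed.
        io.BytesIO.close(self)


buf = KeepOpenBytesIO()
buf.write(b"<html>example</html>")
buf.close()                  # stands in for the close() done in pageEnd()
contents = buf.getvalue()    # still readable, because close() did nothing
buf.really_close()           # now the buffer is actually released
```

An instance of this class could then be passed wherever a file-like object is accepted, and read with getvalue() once the download's Deferred has fired. With a plain io.BytesIO the same getvalue() call after close() raises ValueError.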