[lxml-dev] lxml forms and cookies...
data:image/s3,"s3://crabby-images/f2fe8/f2fe8f9233648137a1145cf70e7eca268d4a4971" alt=""
Hey everyone, I've been trying to use the lxml forms with client cookies to handle html logins. Using cookielib, I'm able to manually login to a page using either urllib or urllib2 and have it work. The moment I try to use lxml.html.submit_form(), however, it fails. I've tried sniffing the packets, and it turns out that lxml is never sending cookies, which makes me guess that lxml is using neither urllib nor urllib2. How do I use cookes with lxml? Doug
data:image/s3,"s3://crabby-images/92634/926347002eca9a0112ac1e407c763398e4a17a21" alt=""
On Mon, 22 Dec 2008, Douglas Mayle wrote:
lxml.html.submit_form() has an open_http parameter: import urllib import urllib2 import urlparse import lxml.html def url_with_query(url, values): parts = urlparse.urlparse(url) rest, (query, frag) = parts[:-2], parts[-2:] return urlparse.urlunparse(rest + (urllib.urlencode(values), None)) def make_open_http(): opener = urllib2.build_opener(urllib2.HTTPCookieProcessor()) opener.addheaders = [] # pretend we're a human -- don't do this def open_http(method, url, values={}): if method == "POST": return opener.open(url, urllib.urlencode(values)) else: return opener.open(url_with_query(url, values)) return open_http open_http = make_open_http() tree = lxml.html.fromstring(open_http("GET", "http://python.org").read()) form = tree.forms[0] form.fields["q"] = "lxml" submit_values = {"submit": form.fields["submit"]} response = lxml.html.submit_form(form, extra_values=submit_values, open_http=open_http) html = response.read() doc = lxml.html.fromstring(html) lxml.html.open_in_browser(doc) John
data:image/s3,"s3://crabby-images/92634/926347002eca9a0112ac1e407c763398e4a17a21" alt=""
On Mon, 22 Dec 2008, Douglas Mayle wrote:
lxml.html.submit_form() has an open_http parameter: import urllib import urllib2 import urlparse import lxml.html def url_with_query(url, values): parts = urlparse.urlparse(url) rest, (query, frag) = parts[:-2], parts[-2:] return urlparse.urlunparse(rest + (urllib.urlencode(values), None)) def make_open_http(): opener = urllib2.build_opener(urllib2.HTTPCookieProcessor()) opener.addheaders = [] # pretend we're a human -- don't do this def open_http(method, url, values={}): if method == "POST": return opener.open(url, urllib.urlencode(values)) else: return opener.open(url_with_query(url, values)) return open_http open_http = make_open_http() tree = lxml.html.fromstring(open_http("GET", "http://python.org").read()) form = tree.forms[0] form.fields["q"] = "lxml" submit_values = {"submit": form.fields["submit"]} response = lxml.html.submit_form(form, extra_values=submit_values, open_http=open_http) html = response.read() doc = lxml.html.fromstring(html) lxml.html.open_in_browser(doc) John
participants (2)
-
Douglas Mayle
-
John J Lee