[wwwsearch-general] ClientForm request re ParseErrors
bruce
bedouglas at earthlink.net
Sun Jul 9 21:34:21 EDT 2006
update:
out of curiosity, i fetched the latest mechanize from svn. i get the same
parse error.
i've also tried to do:
br.select_form(nr = 1)
br.select_form(name="foo")
br.select_form(name=foo)
br.select_form(name="foo")
etc... the same error occurs each time.
-bruce
hi john...
not sure exactly who i should talk to about this.. but here goes...
i have the following piece of code... i'm trying to do a select form, and my
test throws an error...
i have the actual form "main" in the html, so it should find it... as far as
i can tell, i've followed the docs.. but i could be wrong. any thoughts?
the code, output, and partial html is below...
thoughts/comments/ideas/etc...
thanks
-bruce
test code
------------------------
#get the semester page
#get the 2nd semester/frame src url page
br.open(url)
response = br.response() # this is a copy of response
s = response.read()
print s # the copy is now at EOF, so a second response.read() would return ''
#we now have the semester page...
d = libxml2dom.parseString(s, html=1)
ff = d.xpath(fnamepath)
fname = ff[0].nodeValue
print "fname = ",fname
br.select_form(name="main")<<<<<<<<<<<<<<< error happens....
output
------------------------
fname = main
Traceback (most recent call last):
File "./stest.py", line 156, in ?
br.select_form(name="main")
File "build/bdist.linux-i686/egg/mechanize/_mechanize.py", line 352, in select_form
File "build/bdist.linux-i686/egg/mechanize/_mechanize.py", line 296, in forms
File "build/bdist.linux-i686/egg/mechanize/_html.py", line 510, in forms
File "build/bdist.linux-i686/egg/mechanize/_html.py", line 226, in forms
File "build/bdist.linux-i686/egg/ClientForm.py", line 922, in ParseResponse
File "build/bdist.linux-i686/egg/ClientForm.py", line 952, in ParseFile
File "/usr/lib/python2.4/sgmllib.py", line 95, in feed
self.goahead(0)
File "/usr/lib/python2.4/sgmllib.py", line 165, in goahead
k = self.parse_declaration(i)
File "/usr/lib/python2.4/markupbase.py", line 89, in parse_declaration
decltype, j = self._scan_name(j, i)
File "/usr/lib/python2.4/markupbase.py", line 378, in _scan_name
self.error("expected name token at %r"
File "/usr/lib/python2.4/sgmllib.py", line 102, in error
raise SGMLParseError(message)
sgmllib.SGMLParseError: expected name token at '<! Others/0/WIN; Too'
partial html
-----------------------------------
</table>
<br />
<FORM NAME='main' METHOD=POST
Action="/servlets/iclientservlet/a2k_prd/?ICType=Panel&Menu=SA_LEARNER_SERVICES&Market=GBL&PanelGroupName=CLASS_SEARCH" autocomplete=off>
<INPUT TYPE=hidden NAME=ICType VALUE=Panel>
<INPUT TYPE=hidden NAME=ICElementNum VALUE="0">
<INPUT TYPE=hidden NAME=ICStateNum VALUE="1">
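The traceback above stops in sgmllib at the text '<! Others/0/WIN; Too', which suggests the page contains a "<!" that does not start a valid declaration, well before the parser ever reaches the 'main' form. One possible workaround, sketched below, is to escape such openers in the page text before the forms are parsed. The function name, the regular expression, and the sample string are all assumptions made for illustration; none of this is part of ClientForm or mechanize, and the rest of the offending declaration is unknown.

```python
import re

# "<!" is taken to start a valid declaration only when followed by a letter
# (<!DOCTYPE ...), "--" (a comment), "[" (a marked section) or ">".
# This rule is an assumption about what the parser rejects; it is not
# copied from ClientForm or sgmllib.
_BAD_DECL_START = re.compile(r"<!(?!--|[A-Za-z\[>])")


def escape_bad_declarations(html):
    """Return html with malformed "<!" openers escaped as "&lt;!"."""
    return _BAD_DECL_START.sub("&lt;!", html)


# Hypothetical sample built around the fragment quoted in the traceback.
sample = "<!DOCTYPE html><!-- ok --><! Others/0/WIN; Too<FORM NAME='main'>"
cleaned = escape_bad_declarations(sample)
print(cleaned)
```

Escaping only the opener, rather than deleting up to the next ">", avoids swallowing a real tag when the bogus declaration is never closed. If the installed mechanize provides response.set_data() and br.set_response(), the cleaned text could presumably be handed back to the browser before calling br.select_form(name="main"); that depends on the mechanize version and is untested here.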
-----Original Message-----
From: wwwsearch-general-bounces at lists.sourceforge.net
[mailto:wwwsearch-general-bounces at lists.sourceforge.net]On Behalf Of
John J Lee
Sent: Sunday, July 09, 2006 9:51 AM
To: wwwsearch-general at lists.sf.net
Subject: Re: [wwwsearch-general] ClientForm request re ParseErrors
On Sun, 9 Jul 2006, Titus Brown wrote:
[...]
> Define "better patch"...? The code I sent out before lets ClientForm
> parse otherwise unparseable HTML, and it works fine. I suppose it's
> less elegant than having two separate while loops; is that what you
> mean?
No, I just hate going one char at a time in Python. Surely this should be
fixed somewhere else? (I'm not sure where; I haven't looked recently)
If you've determined that fixing it elsewhere pulls in too much code or
requires a fix to stdlib code (if so, why?), maybe I should do as you
suggest anyway, but I don't like it.
> -> > The problem I have is that there's literally no way to pass
> -> > configuration parameters like 'ignore_errors' down from the
> -> > mechanize.Factory.forms() call.
> ->
> -> You can reimplement FormsFactory. It's a trivial (if slightly verbose)
> -> class, right?
>
> I could do that, yes. But I'd also need to redefine Factory.forms(),
> too, which calls FormsFactory.
Why? You can supply your own FormsFactory, as DefaultFactory does.
[...]
> -> > Separately, it'd be nice if ignore_errors wasn't hardcoded as False in
> -> > ParseFile ;).
> ->
> -> I'm not sure what you want here. Could you send a patch?
>
> Line 914 of ClientForm.py should be changed to 'ignore_errors,'
Oh. Sure, if I apply a patch to enable ignore_errors, I'll of course do
that too.
John