From flahertyk1 at hotmail.com Thu Feb 4 05:04:47 2010
From: flahertyk1 at hotmail.com (kimmyaf)
Date: Wed, 3 Feb 2010 20:04:47 -0800 (PST)
Subject: [XML-SIG] parsing XML with minidom
Message-ID: <27447458.post@talk.nabble.com>
Hello, I am not really sure whether my question belongs here, but this is
the best place I could find.
I am a Python beginner trying to teach myself how to parse some XML with
minidom.
This is the code excerpt I am struggling with....
********************************************************
dom = minidom.parseString(xml_response)
handler.close()
route_list = []
tag = ['route']
tmp_route=[]
for route in dom.getElementsByTagName('body'):
print 'in'
tmp_route[route] = dom.getElementsByTagName(tag)[0].getAttribute('tag')
route_list.append(tmp_route)
*******************************************************************
Here is the XML I am getting back when I call...
' \r\n
\r\n\r\n\r\n\r\n\r\n\r\n\r\n'
See this formatted better by pasting this URL = >
http://webservices.nextbus.com/service/publicXMLFeed?command=routeList&a=mbta
I am getting the following error:
  File "C:/Users/Kim/Grad School/Python/bus python.py", line 54, in <module>
    get_available_routes()
  File "C:/Users/Kim/Grad School/Python/bus python.py", line 43, in get_available_routes
    tmp_route[route] = dom.getElementsByTagName(tag)[0].getAttribute('tag')
IndexError: list index out of range
I'm sure there is something obvious that I am doing wrong. All I want to do
is grab all of the values and put them into a list. Kind of new
to parsing XML! I'm working off an example but the XML in the example code
is a lot more in depth so can't really relate it to mine. I also would like
any reference anyone has about how to parse with minidom!!
Help! Thank you! %-|
--
View this message in context: http://old.nabble.com/parsing-XML-with-minidom-tp27447458p27447458.html
Sent from the Python - xml-sig mailing list archive at Nabble.com.
From rajanikanth at gmail.com Thu Feb 4 07:11:25 2010
From: rajanikanth at gmail.com (Rajanikanth Jammalamadaka)
Date: Wed, 3 Feb 2010 22:11:25 -0800
Subject: [XML-SIG] parsing XML with minidom
In-Reply-To: <27447458.post@talk.nabble.com>
References: <27447458.post@talk.nabble.com>
Message-ID: <84bdef3c1002032211l23fe60bi4681dca06f18bc04@mail.gmail.com>
Try this:
from xml.etree.ElementTree import ElementTree

# Parse the file, then collect the 'tag' attribute of every <route> element.
doc = ElementTree(file = "t.xml")
listOfTags = []
for item in doc.findall(".//route"):
    listOfTags.append(item.get('tag'))
print listOfTags
where t.xml is your xml file.
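If the XML is already in memory as a string, as with xml_response in the original post, the same approach can probably be applied without a file. The following is only a minimal sketch in Python 3 syntax. SAMPLE_XML and route_tags are hypothetical names, and the sample document is an assumption built from the element and attribute names mentioned in this thread (body, route, tag, title), not the actual feed output.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for the real response; only the element and
# attribute names are taken from the thread.
SAMPLE_XML = """<body>
  <route tag="111" title="111"/>
  <route tag="116" title="116"/>
</body>"""


def route_tags(xml_text):
    """Return the 'tag' attribute of every <route> element in xml_text."""
    root = ET.fromstring(xml_text)
    return [item.get("tag") for item in root.findall(".//route")]


print(route_tags(SAMPLE_XML))
```

Here fromstring returns the root element directly, so findall is called on it rather than on an ElementTree object.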
Thanks,
Raj
--
Rajanikanth
From bigotp at acm.org Thu Feb 4 13:43:16 2010
From: bigotp at acm.org (Peter A. Bigot)
Date: Thu, 04 Feb 2010 05:43:16 -0700
Subject: [XML-SIG] parsing XML with minidom
In-Reply-To: <27447458.post@talk.nabble.com>
References: <27447458.post@talk.nabble.com>
Message-ID: <4B6AC0E4.2010601@acm.org>
The variable tag is a list of strings. The method getElementsByTagName
takes a single string as its first parameter. Since a list cannot
appear as a tag name, the second call to getElementsByTagName returns an
empty list.
# Pass the tag name as a plain string, then loop over the matching elements.
body = dom.getElementsByTagName('body')[0]
for route in body.getElementsByTagName('route'):
    print route.getAttribute('tag')
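To put the values into a list rather than print them, something along these lines should work. It is a minimal sketch in Python 3 syntax. SAMPLE_XML and route_tags are hypothetical names, and the sample document is an assumption built from the element and attribute names in the original post, not the real feed. The first part reproduces the empty result described above when a list is passed as the tag name.

```python
from xml.dom import minidom

# Hypothetical stand-in for xml_response; only the element and attribute
# names (body, route, tag, title) come from the original post.
SAMPLE_XML = """<body>
  <route tag="111" title="111"/>
  <route tag="116" title="116"/>
</body>"""

dom = minidom.parseString(SAMPLE_XML)

# A list never compares equal to an element's tag name, so nothing matches.
# Indexing this empty result with [0] is what raises the IndexError.
empty = dom.getElementsByTagName(["route"])
print(len(empty))


def route_tags(document):
    """Collect the 'tag' attribute of each <route> under the first <body>."""
    body = document.getElementsByTagName("body")[0]
    return [route.getAttribute("tag")
            for route in body.getElementsByTagName("route")]


print(route_tags(dom))
```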
Peter
From james.johnston at lifeway.com Thu Feb 18 21:42:10 2010
From: james.johnston at lifeway.com (James Johnston)
Date: Thu, 18 Feb 2010 14:42:10 -0600
Subject: [XML-SIG] Python working with XML
Message-ID: <710c80bb1002181242g5c0df17aid5710a1574761871@mail.gmail.com>
I want to develop some tools for working with XML document files. This
might require the validation of XML rules and converting various formats
into XML. What is new and available? Some of the things I am reading are
from 2000 - 2001.
Thanks
--
James Johnston
Retail Technologies
(615) 251-2792
james.johnston at lifeway.com
From nimmyliji at gmail.com Tue Feb 9 10:18:05 2010
From: nimmyliji at gmail.com (nimmyliji)
Date: Tue, 9 Feb 2010 01:18:05 -0800 (PST)
Subject: [XML-SIG] Send an xml file
Message-ID: <27512403.post@talk.nabble.com>
Hi,
How can I send an XML file from Python to Flex 3? Can anyone help me,
ideally with example code?
Thanks in advance
nimyliji
--
View this message in context: http://old.nabble.com/Send-an-xml-file-tp27512403p27512403.html
Sent from the Python - xml-sig mailing list archive at Nabble.com.
From vnocciolini at mbigroup.it Wed Feb 17 11:47:40 2010
From: vnocciolini at mbigroup.it (Vinicio Nocciolini)
Date: Wed, 17 Feb 2010 11:47:40 +0100
Subject: [XML-SIG] PyXML-0.8.4 error
Message-ID: <4B7BC94C.9020608@mbigroup.it>
Hi
I am using Ubuntu 9.10
This is the error output
regards Vinicio
PyXML-0.8.4$ python2.5 setup.py build
running build
running build_py
running build_ext
building '_xmlplus.parsers.pyexpat' extension
gcc -pthread -fno-strict-aliasing -DNDEBUG -g -fwrapv -O2 -Wall
-Wstrict-prototypes -fPIC -DXML_NS=1 -DXML_DTD=1 -DBYTEORDER=1234
-DXML_CONTEXT_BYTES=1024 -DHAVE_MEMMOVE=1 -Iextensions/expat/lib
-I/usr/include/python2.5 -c extensions/pyexpat.c -o
build/temp.linux-i686-2.5/extensions/pyexpat.o
extensions/pyexpat.c:5:20: error: Python.h: No such file or directory
extensions/pyexpat.c:8:21: error: compile.h: No such file or directory
extensions/pyexpat.c:9:25: error: frameobject.h: No such file or directory
extensions/pyexpat.c:63: error: expected ?=?, ?,?, ?;?, ?asm? or
?__attribute__? before ?*? token
extensions/pyexpat.c:70: error: expected specifier-qualifier-list before
?PyObject_HEAD?
extensions/pyexpat.c:89: error: expected ?=?, ?,?, ?;?, ?asm? or
?__attribute__? before ?Xmlparsetype?
extensions/pyexpat.c:98: error: expected specifier-qualifier-list before
?PyCodeObject?
extensions/pyexpat.c:108: error: expected ?)? before ?*? token
extensions/pyexpat.c:123: error: expected ?=?, ?,?, ?;?, ?asm? or
?__attribute__? before ?*? token
extensions/pyexpat.c: In function ?have_handler?:
extensions/pyexpat.c:150: error: ?PyObject? undeclared (first use in
this function)
extensions/pyexpat.c:150: error: (Each undeclared identifier is reported
only once
extensions/pyexpat.c:150: error: for each function it appears in.)
extensions/pyexpat.c:150: error: ?handler? undeclared (first use in this
function)
extensions/pyexpat.c:150: error: ?xmlparseobject? has no member named
?handlers?
extensions/pyexpat.c: At top level:
extensions/pyexpat.c:154: error: expected ?=?, ?,?, ?;?, ?asm? or
?__attribute__? before ?*? token
extensions/pyexpat.c:201: error: expected ?=?, ?,?, ?;?, ?asm? or
?__attribute__? before ?*? token
extensions/pyexpat.c:214: error: expected ?=?, ?,?, ?;?, ?asm? or
?__attribute__? before ?*? token
extensions/pyexpat.c: In function ?flag_error?:
extensions/pyexpat.c:248: error: ?xmlparseobject? has no member named
?itself?
extensions/pyexpat.c: At top level:
extensions/pyexpat.c:252: error: expected ?=?, ?,?, ?;?, ?asm? or
?__attribute__? before ?*? token
extensions/pyexpat.c:305: error: expected ?)? before ?*? token
extensions/pyexpat.c:332: error: expected ?)? before ?*? token
extensions/pyexpat.c:367: error: expected ?=?, ?,?, ?;?, ?asm? or
?__attribute__? before ?*? token
extensions/pyexpat.c:419: error: expected ?=?, ?,?, ?;?, ?asm? or
?__attribute__? before ?*? token
extensions/pyexpat.c: In function ?call_character_handler?:
extensions/pyexpat.c:444: error: ?PyObject? undeclared (first use in
this function)
extensions/pyexpat.c:444: error: ?args? undeclared (first use in this
function)
extensions/pyexpat.c:445: error: ?temp? undeclared (first use in this
function)
extensions/pyexpat.c:447: warning: implicit declaration of function
?PyTuple_New?
extensions/pyexpat.c:455: warning: implicit declaration of function
?conv_string_len_to_utf8?
extensions/pyexpat.c:458: warning: implicit declaration of function
?Py_DECREF?
extensions/pyexpat.c:462: warning: implicit declaration of function
?PyTuple_SET_ITEM?
extensions/pyexpat.c:464: error: ?xmlparseobject? has no member named
?in_callback?
extensions/pyexpat.c:465: warning: implicit declaration of function
?call_with_frame?
extensions/pyexpat.c:465: warning: implicit declaration of function
?getcode?
extensions/pyexpat.c:466: error: ?xmlparseobject? has no member named
?handlers?
extensions/pyexpat.c:468: error: ?xmlparseobject? has no member named
?in_callback?
extensions/pyexpat.c: In function ?flush_character_buffer?:
extensions/pyexpat.c:482: error: ?xmlparseobject? has no member named
?buffer?
extensions/pyexpat.c:482: error: ?xmlparseobject? has no member named
?buffer_used?
extensions/pyexpat.c:484: error: ?xmlparseobject? has no member named
?buffer?
extensions/pyexpat.c:484: error: ?xmlparseobject? has no member named
?buffer_used?
extensions/pyexpat.c:485: error: ?xmlparseobject? has no member named
?buffer_used?
extensions/pyexpat.c: In function ?my_CharacterDataHandler?:
extensions/pyexpat.c:493: error: ?xmlparseobject? has no member named
?buffer?
extensions/pyexpat.c:496: error: ?xmlparseobject? has no member named
?buffer_used?
extensions/pyexpat.c:496: error: ?xmlparseobject? has no member named
?buffer_size?
extensions/pyexpat.c:505: error: ?xmlparseobject? has no member named
?buffer_size?
extensions/pyexpat.c:507: error: ?xmlparseobject? has no member named
?buffer_used?
extensions/pyexpat.c:510: warning: implicit declaration of function ?memcpy?
extensions/pyexpat.c:510: warning: incompatible implicit declaration of
built-in function ?memcpy?
extensions/pyexpat.c:510: error: ?xmlparseobject? has no member named
?buffer?
extensions/pyexpat.c:510: error: ?xmlparseobject? has no member named
?buffer_used?
extensions/pyexpat.c:512: error: ?xmlparseobject? has no member named
?buffer_used?
extensions/pyexpat.c: In function ?my_StartElementHandler?:
extensions/pyexpat.c:524: error: ?PyObject? undeclared (first use in
this function)
extensions/pyexpat.c:524: error: ?container? undeclared (first use in
this function)
extensions/pyexpat.c:524: error: ?rv? undeclared (first use in this
function)
extensions/pyexpat.c:524: warning: left-hand operand of comma expression
has no effect
extensions/pyexpat.c:524: error: ?args? undeclared (first use in this
function)
extensions/pyexpat.c:524: warning: left-hand operand of comma expression
has no effect
extensions/pyexpat.c:532: error: ?xmlparseobject? has no member named
?specified_attributes?
extensions/pyexpat.c:533: error: ?xmlparseobject? has no member named
?itself?
extensions/pyexpat.c:541: error: ?xmlparseobject? has no member named
?ordered_attributes?
extensions/pyexpat.c:542: warning: implicit declaration of function
?PyList_New?
extensions/pyexpat.c:544: warning: implicit declaration of function
?PyDict_New?
extensions/pyexpat.c:550: error: ?n? undeclared (first use in this function)
extensions/pyexpat.c:550: warning: implicit declaration of function
?string_intern?
extensions/pyexpat.c:551: error: ?v? undeclared (first use in this function)
extensions/pyexpat.c:557: warning: implicit declaration of function
?conv_string_to_utf8?
extensions/pyexpat.c:564: error: ?xmlparseobject? has no member named
?ordered_attributes?
extensions/pyexpat.c:565: warning: implicit declaration of function
?PyList_SET_ITEM?
extensions/pyexpat.c:568: warning: implicit declaration of function
?PyDict_SetItem?
extensions/pyexpat.c:579: warning: implicit declaration of function
?Py_BuildValue?
extensions/pyexpat.c:585: error: ?xmlparseobject? has no member named
?in_callback?
extensions/pyexpat.c:587: error: ?xmlparseobject? has no member named
?handlers?
extensions/pyexpat.c:588: error: ?xmlparseobject? has no member named
?in_callback?
extensions/pyexpat.c: In function ?my_EndElementHandler?:
extensions/pyexpat.c:636: error: ?PyObject? undeclared (first use in
this function)
extensions/pyexpat.c:636: error: ?args? undeclared (first use in this
function)
extensions/pyexpat.c:636: error: ?rv? undeclared (first use in this
function)
extensions/pyexpat.c:636: error: ?xmlparseobject? has no member named
?in_callback?
extensions/pyexpat.c:636: error: ?xmlparseobject? has no member named
?handlers?
extensions/pyexpat.c:636: error: ?xmlparseobject? has no member named
?in_callback?
extensions/pyexpat.c: In function ?my_ProcessingInstructionHandler?:
extensions/pyexpat.c:640: error: ?PyObject? undeclared (first use in
this function)
extensions/pyexpat.c:640: error: ?args? undeclared (first use in this
function)
extensions/pyexpat.c:640: error: ?rv? undeclared (first use in this
function)
extensions/pyexpat.c:640: error: ?conv_string_to_utf8? undeclared (first
use in this function)
extensions/pyexpat.c:640: error: ?xmlparseobject? has no member named
?in_callback?
extensions/pyexpat.c:640: error: ?xmlparseobject? has no member named
?handlers?
extensions/pyexpat.c:640: error: ?xmlparseobject? has no member named
?in_callback?
extensions/pyexpat.c: In function ?my_UnparsedEntityDeclHandler?:
extensions/pyexpat.c:646: error: ?PyObject? undeclared (first use in
this function)
extensions/pyexpat.c:646: error: ?args? undeclared (first use in this
function)
extensions/pyexpat.c:646: error: ?rv? undeclared (first use in this
function)
extensions/pyexpat.c:646: error: ?xmlparseobject? has no member named
?in_callback?
extensions/pyexpat.c:646: error: ?xmlparseobject? has no member named
?handlers?
extensions/pyexpat.c:646: error: ?xmlparseobject? has no member named
?in_callback?
extensions/pyexpat.c: In function ?my_EntityDeclHandler?:
extensions/pyexpat.c:659: error: ?PyObject? undeclared (first use in
this function)
extensions/pyexpat.c:659: error: ?args? undeclared (first use in this
function)
extensions/pyexpat.c:659: error: ?rv? undeclared (first use in this
function)
extensions/pyexpat.c:659: error: ?xmlparseobject? has no member named
?in_callback?
extensions/pyexpat.c:659: error: ?xmlparseobject? has no member named
?handlers?
extensions/pyexpat.c:659: error: ?xmlparseobject? has no member named
?in_callback?
extensions/pyexpat.c: In function ?my_XmlDeclHandler?:
extensions/pyexpat.c:696: error: ?PyObject? undeclared (first use in
this function)
extensions/pyexpat.c:696: error: ?args? undeclared (first use in this
function)
extensions/pyexpat.c:696: error: ?rv? undeclared (first use in this
function)
extensions/pyexpat.c:696: error: ?conv_string_to_utf8? undeclared (first
use in this function)
extensions/pyexpat.c:696: error: ?xmlparseobject? has no member named
?in_callback?
extensions/pyexpat.c:696: error: ?xmlparseobject? has no member named
?handlers?
extensions/pyexpat.c:696: error: ?xmlparseobject? has no member named
?in_callback?
extensions/pyexpat.c: At top level:
extensions/pyexpat.c:705: error: expected ?=?, ?,?, ?;?, ?asm? or
?__attribute__? before ?*? token
extensions/pyexpat.c: In function ?my_ElementDeclHandler?:
extensions/pyexpat.c:737: error: ?PyObject? undeclared (first use in
this function)
extensions/pyexpat.c:737: error: ?args? undeclared (first use in this
function)
extensions/pyexpat.c:740: error: ?rv? undeclared (first use in this
function)
extensions/pyexpat.c:741: error: ?modelobj? undeclared (first use in
this function)
extensions/pyexpat.c:741: error: ?nameobj? undeclared (first use in this
function)
extensions/pyexpat.c:741: warning: left-hand operand of comma expression
has no effect
extensions/pyexpat.c:751: warning: implicit declaration of function
?conv_content_model?
extensions/pyexpat.c:751: error: ?conv_string_to_utf8? undeclared (first
use in this function)
extensions/pyexpat.c:769: error: ?xmlparseobject? has no member named
?in_callback?
extensions/pyexpat.c:771: error: ?xmlparseobject? has no member named
?handlers?
extensions/pyexpat.c:772: error: ?xmlparseobject? has no member named
?in_callback?
extensions/pyexpat.c:780: warning: implicit declaration of function
?Py_XDECREF?
extensions/pyexpat.c:781: error: ?xmlparseobject? has no member named
?itself?
extensions/pyexpat.c: In function ?my_AttlistDeclHandler?:
extensions/pyexpat.c:785: error: ?PyObject? undeclared (first use in
this function)
extensions/pyexpat.c:785: error: ?args? undeclared (first use in this
function)
extensions/pyexpat.c:785: error: ?rv? undeclared (first use in this
function)
extensions/pyexpat.c:785: error: ?conv_string_to_utf8? undeclared (first
use in this function)
extensions/pyexpat.c:785: error: ?xmlparseobject? has no member named
?in_callback?
extensions/pyexpat.c:785: error: ?xmlparseobject? has no member named
?handlers?
extensions/pyexpat.c:785: error: ?xmlparseobject? has no member named
?in_callback?
extensions/pyexpat.c: In function ?my_SkippedEntityHandler?:
extensions/pyexpat.c:798: error: ?PyObject? undeclared (first use in
this function)
extensions/pyexpat.c:798: error: ?args? undeclared (first use in this
function)
extensions/pyexpat.c:798: error: ?rv? undeclared (first use in this
function)
extensions/pyexpat.c:798: error: ?xmlparseobject? has no member named
?in_callback?
extensions/pyexpat.c:798: error: ?xmlparseobject? has no member named
?handlers?
extensions/pyexpat.c:798: error: ?xmlparseobject? has no member named
?in_callback?
extensions/pyexpat.c: In function ?my_NotationDeclHandler?:
extensions/pyexpat.c:806: error: ?PyObject? undeclared (first use in
this function)
extensions/pyexpat.c:806: error: ?args? undeclared (first use in this
function)
extensions/pyexpat.c:806: error: ?rv? undeclared (first use in this
function)
extensions/pyexpat.c:806: error: ?xmlparseobject? has no member named
?in_callback?
extensions/pyexpat.c:806: error: ?xmlparseobject? has no member named
?handlers?
extensions/pyexpat.c:806: error: ?xmlparseobject? has no member named
?in_callback?
extensions/pyexpat.c: In function ?my_StartNamespaceDeclHandler?:
extensions/pyexpat.c:816: error: ?PyObject? undeclared (first use in
this function)
extensions/pyexpat.c:816: error: ?args? undeclared (first use in this
function)
extensions/pyexpat.c:816: error: ?rv? undeclared (first use in this
function)
extensions/pyexpat.c:816: error: ?xmlparseobject? has no member named
?in_callback?
extensions/pyexpat.c:816: error: ?xmlparseobject? has no member named
?handlers?
extensions/pyexpat.c:816: error: ?xmlparseobject? has no member named
?in_callback?
extensions/pyexpat.c: In function ?my_EndNamespaceDeclHandler?:
extensions/pyexpat.c:823: error: ?PyObject? undeclared (first use in
this function)
extensions/pyexpat.c:823: error: ?args? undeclared (first use in this
function)
extensions/pyexpat.c:823: error: ?rv? undeclared (first use in this
function)
extensions/pyexpat.c:823: error: ?xmlparseobject? has no member named
?in_callback?
extensions/pyexpat.c:823: error: ?xmlparseobject? has no member named
?handlers?
extensions/pyexpat.c:823: error: ?xmlparseobject? has no member named
?in_callback?
extensions/pyexpat.c: In function ?my_CommentHandler?:
extensions/pyexpat.c:828: error: ?PyObject? undeclared (first use in
this function)
extensions/pyexpat.c:828: error: ?args? undeclared (first use in this
function)
extensions/pyexpat.c:828: error: ?rv? undeclared (first use in this
function)
extensions/pyexpat.c:828: error: ?conv_string_to_utf8? undeclared (first
use in this function)
extensions/pyexpat.c:828: error: ?xmlparseobject? has no member named
?in_callback?
extensions/pyexpat.c:828: error: ?xmlparseobject? has no member named
?handlers?
extensions/pyexpat.c:828: error: ?xmlparseobject? has no member named
?in_callback?
extensions/pyexpat.c: In function ?my_StartCdataSectionHandler?:
extensions/pyexpat.c:832: error: ?PyObject? undeclared (first use in
this function)
extensions/pyexpat.c:832: error: ?args? undeclared (first use in this
function)
extensions/pyexpat.c:832: error: ?rv? undeclared (first use in this
function)
extensions/pyexpat.c:832: error: ?xmlparseobject? has no member named
?in_callback?
extensions/pyexpat.c:832: error: ?xmlparseobject? has no member named
?handlers?
extensions/pyexpat.c:832: error: ?xmlparseobject? has no member named
?in_callback?
extensions/pyexpat.c: In function ?my_EndCdataSectionHandler?:
extensions/pyexpat.c:836: error: ?PyObject? undeclared (first use in
this function)
extensions/pyexpat.c:836: error: ?args? undeclared (first use in this
function)
extensions/pyexpat.c:836: error: ?rv? undeclared (first use in this
function)
extensions/pyexpat.c:836: error: ?xmlparseobject? has no member named
?in_callback?
extensions/pyexpat.c:836: error: ?xmlparseobject? has no member named
?handlers?
extensions/pyexpat.c:836: error: ?xmlparseobject? has no member named
?in_callback?
extensions/pyexpat.c: In function ?my_DefaultHandler?:
extensions/pyexpat.c:841: error: ?PyObject? undeclared (first use in
this function)
extensions/pyexpat.c:841: error: ?args? undeclared (first use in this
function)
extensions/pyexpat.c:841: error: ?rv? undeclared (first use in this
function)
extensions/pyexpat.c:841: error: ?xmlparseobject? has no member named
?in_callback?
extensions/pyexpat.c:841: error: ?xmlparseobject? has no member named
?handlers?
extensions/pyexpat.c:841: error: ?xmlparseobject? has no member named
?in_callback?
extensions/pyexpat.c: In function ?my_DefaultHandlerExpandHandler?:
extensions/pyexpat.c:845: error: ?PyObject? undeclared (first use in
this function)
extensions/pyexpat.c:845: error: ?args? undeclared (first use in this
function)
extensions/pyexpat.c:845: error: ?rv? undeclared (first use in this
function)
extensions/pyexpat.c:845: error: ?xmlparseobject? has no member named
?in_callback?
extensions/pyexpat.c:845: error: ?xmlparseobject? has no member named
?handlers?
extensions/pyexpat.c:845: error: ?xmlparseobject? has no member named
?in_callback?
extensions/pyexpat.c: In function ?my_NotStandaloneHandler?:
extensions/pyexpat.c:862: error: ?PyObject? undeclared (first use in
this function)
extensions/pyexpat.c:862: error: ?args? undeclared (first use in this
function)
extensions/pyexpat.c:862: error: ?rv? undeclared (first use in this
function)
extensions/pyexpat.c:862: error: ?xmlparseobject? has no member named
?in_callback?
extensions/pyexpat.c:862: error: ?xmlparseobject? has no member named
?handlers?
extensions/pyexpat.c:862: error: ?xmlparseobject? has no member named
?in_callback?
extensions/pyexpat.c:862: warning: implicit declaration of function
?PyInt_AsLong?
extensions/pyexpat.c: In function ?my_ExternalEntityRefHandler?:
extensions/pyexpat.c:866: error: ?PyObject? undeclared (first use in
this function)
extensions/pyexpat.c:866: error: ?args? undeclared (first use in this
function)
extensions/pyexpat.c:866: error: ?rv? undeclared (first use in this
function)
extensions/pyexpat.c:866: error: ?conv_string_to_utf8? undeclared (first
use in this function)
extensions/pyexpat.c:866: error: ?xmlparseobject? has no member named
?in_callback?
extensions/pyexpat.c:866: error: ?xmlparseobject? has no member named
?handlers?
extensions/pyexpat.c:866: error: ?xmlparseobject? has no member named
?in_callback?
extensions/pyexpat.c: In function ?my_StartDoctypeDeclHandler?:
extensions/pyexpat.c:881: error: ?PyObject? undeclared (first use in
this function)
extensions/pyexpat.c:881: error: ?args? undeclared (first use in this
function)
extensions/pyexpat.c:881: error: ?rv? undeclared (first use in this
function)
extensions/pyexpat.c:881: error: ?xmlparseobject? has no member named
?in_callback?
extensions/pyexpat.c:881: error: ?xmlparseobject? has no member named
?handlers?
extensions/pyexpat.c:881: error: ?xmlparseobject? has no member named
?in_callback?
extensions/pyexpat.c: In function ?my_EndDoctypeDeclHandler?:
extensions/pyexpat.c:889: error: ?PyObject? undeclared (first use in
this function)
extensions/pyexpat.c:889: error: ?args? undeclared (first use in this
function)
extensions/pyexpat.c:889: error: ?rv? undeclared (first use in this
function)
extensions/pyexpat.c:889: error: ?xmlparseobject? has no member named
?in_callback?
extensions/pyexpat.c:889: error: ?xmlparseobject? has no member named
?handlers?
extensions/pyexpat.c:889: error: ?xmlparseobject? has no member named
?in_callback?
extensions/pyexpat.c: At top level:
extensions/pyexpat.c:893: error: expected ?=?, ?,?, ?;?, ?asm? or
?__attribute__? before ?*? token
extensions/pyexpat.c:912: error: expected ?=?, ?,?, ?;?, ?asm? or
?__attribute__? before ?*? token
extensions/pyexpat.c:930: error: expected declaration specifiers or
?...? before ?PyObject?
extensions/pyexpat.c: In function ?readinst?:
extensions/pyexpat.c:932: error: ?PyObject? undeclared (first use in
this function)
extensions/pyexpat.c:932: error: ?arg? undeclared (first use in this
function)
extensions/pyexpat.c:933: error: ?bytes? undeclared (first use in this
function)
extensions/pyexpat.c:934: error: ?str? undeclared (first use in this
function)
extensions/pyexpat.c:937: warning: implicit declaration of function
?PyInt_FromLong?
extensions/pyexpat.c:948: warning: implicit declaration of function
?PyObject_CallObject?
extensions/pyexpat.c:948: error: ?meth? undeclared (first use in this
function)
extensions/pyexpat.c:956: warning: implicit declaration of function
?PyString_Check?
extensions/pyexpat.c:957: warning: implicit declaration of function
?PyErr_Format?
extensions/pyexpat.c:957: error: ?PyExc_TypeError? undeclared (first use
in this function)
extensions/pyexpat.c:962: warning: implicit declaration of function
?PyString_GET_SIZE?
extensions/pyexpat.c:964: error: ?PyExc_ValueError? undeclared (first
use in this function)
extensions/pyexpat.c:970: warning: incompatible implicit declaration of
built-in function ?memcpy?
extensions/pyexpat.c:970: warning: implicit declaration of function
?PyString_AsString?
extensions/pyexpat.c:970: warning: passing argument 2 of ?memcpy? makes
pointer from integer without a cast
extensions/pyexpat.c:970: note: expected ?const void *? but argument is
of type ?int?
extensions/pyexpat.c: At top level:
extensions/pyexpat.c:981: error: expected ?=?, ?,?, ?;?, ?asm? or
?__attribute__? before ?*? token
extensions/pyexpat.c:1044: error: expected ?=?, ?,?, ?;?, ?asm? or
?__attribute__? before ?*? token
extensions/pyexpat.c:1062: error: expected ?=?, ?,?, ?;?, ?asm? or
?__attribute__? before ?*? token
extensions/pyexpat.c:1077: error: expected ?=?, ?,?, ?;?, ?asm? or
?__attribute__? before ?*? token
extensions/pyexpat.c:1108: error: expected ?=?, ?,?, ?;?, ?asm? or
?__attribute__? before ?*? token
extensions/pyexpat.c:1203: error: expected ?=?, ?,?, ?;?, ?asm? or
?__attribute__? before ?*? token
extensions/pyexpat.c:1223: error: expected ?=?, ?,?, ?;?, ?asm? or
?__attribute__? before ?*? token
extensions/pyexpat.c:1242: error: array type has incomplete element type
extensions/pyexpat.c:1243: error: ?PyCFunction? undeclared here (not in
a function)
extensions/pyexpat.c:1243: error: expected ?}? before ?xmlparse_Parse?
extensions/pyexpat.c:1245: error: expected ?}? before ?xmlparse_ParseFile?
extensions/pyexpat.c:1247: error: expected ?}? before ?xmlparse_SetBase?
extensions/pyexpat.c:1249: error: expected ?}? before ?xmlparse_GetBase?
extensions/pyexpat.c:1251: error: expected ?}? before
?xmlparse_ExternalEntityParserCreate?
extensions/pyexpat.c:1253: error: expected ?}? before
?xmlparse_SetParamEntityParsing?
extensions/pyexpat.c:1255: error: expected ?}? before
?xmlparse_GetInputContext?
extensions/pyexpat.c:1258: error: expected ?}? before
?xmlparse_UseForeignDTD?
extensions/pyexpat.c:1320: error: expected ?=?, ?,?, ?;?, ?asm? or
?__attribute__? before ?*? token
extensions/pyexpat.c: In function ?xmlparse_dealloc?:
extensions/pyexpat.c:1395: warning: implicit declaration of function
?PyObject_GC_Fini?
extensions/pyexpat.c:1397: error: ?xmlparseobject? has no member named
?itself?
extensions/pyexpat.c:1398: error: ?xmlparseobject? has no member named
?itself?
extensions/pyexpat.c:1399: error: ?xmlparseobject? has no member named
?itself?
extensions/pyexpat.c:1401: error: ?xmlparseobject? has no member named
?handlers?
extensions/pyexpat.c:1402: error: ?PyObject? undeclared (first use in
this function)
extensions/pyexpat.c:1402: error: ?temp? undeclared (first use in this
function)
extensions/pyexpat.c:1404: error: ?xmlparseobject? has no member named
?handlers?
extensions/pyexpat.c:1405: error: ?xmlparseobject? has no member named
?handlers?
extensions/pyexpat.c:1408: error: ?xmlparseobject? has no member named
?handlers?
extensions/pyexpat.c:1409: error: ?xmlparseobject? has no member named
?handlers?
extensions/pyexpat.c:1411: error: ?xmlparseobject? has no member named
?buffer?
extensions/pyexpat.c:1412: error: ?xmlparseobject? has no member named
?buffer?
extensions/pyexpat.c:1413: error: ?xmlparseobject? has no member named
?buffer?
extensions/pyexpat.c:1415: error: ?xmlparseobject? has no member named
?intern?
extensions/pyexpat.c:1418: warning: implicit declaration of function
?PyObject_Del?
extensions/pyexpat.c: In function ?handlername2int?:
extensions/pyexpat.c:1430: warning: implicit declaration of function
?strcmp?
extensions/pyexpat.c: At top level:
extensions/pyexpat.c:1437: error: expected ?=?, ?,?, ?;?, ?asm? or
?__attribute__? before ?*? token
extensions/pyexpat.c:1445: error: expected ?=?, ?,?, ?;?, ?asm? or
?__attribute__? before ?*? token
extensions/pyexpat.c:1549: error: expected declaration specifiers or
?...? before ?PyObject?
extensions/pyexpat.c: In function ?sethandler?:
extensions/pyexpat.c:1554: error: ?PyObject? undeclared (first use in
this function)
extensions/pyexpat.c:1554: error: ?temp? undeclared (first use in this
function)
extensions/pyexpat.c:1554: error: ?xmlparseobject? has no member named
?handlers?
extensions/pyexpat.c:1556: error: ?v? undeclared (first use in this
function)
extensions/pyexpat.c:1556: error: ?Py_None? undeclared (first use in
this function)
extensions/pyexpat.c:1559: warning: implicit declaration of function
?Py_INCREF?
extensions/pyexpat.c:1562: error: ?xmlparseobject? has no member named
?handlers?
extensions/pyexpat.c:1564: error: ?xmlparseobject? has no member named
?itself?
extensions/pyexpat.c: At top level:
extensions/pyexpat.c:1571: error: expected declaration specifiers or
?...? before ?PyObject?
extensions/pyexpat.c: In function ?xmlparse_setattr?:
extensions/pyexpat.c:1574: error: ?v? undeclared (first use in this
function)
extensions/pyexpat.c:1575: warning: implicit declaration of function
?PyErr_SetString?
extensions/pyexpat.c:1575: error: ?PyExc_RuntimeError? undeclared (first
use in this function)
extensions/pyexpat.c:1579: warning: implicit declaration of function
?PyObject_IsTrue?
extensions/pyexpat.c:1580: error: ?xmlparseobject? has no member named
?buffer?
extensions/pyexpat.c:1581: error: ?xmlparseobject? has no member named
?buffer?
extensions/pyexpat.c:1581: error: ?xmlparseobject? has no member named
?buffer_size?
extensions/pyexpat.c:1582: error: ?xmlparseobject? has no member named
?buffer?
extensions/pyexpat.c:1583: warning: implicit declaration of function
?PyErr_NoMemory?
extensions/pyexpat.c:1586: error: ?xmlparseobject? has no member named
?buffer_used?
extensions/pyexpat.c:1589: error: ?xmlparseobject? has no member named
?buffer?
extensions/pyexpat.c:1592: error: ?xmlparseobject? has no member named
?buffer?
extensions/pyexpat.c:1593: error: ?xmlparseobject? has no member named
?buffer?
extensions/pyexpat.c:1599: error: ?xmlparseobject? has no member named
?ns_prefixes?
extensions/pyexpat.c:1601: error: ?xmlparseobject? has no member named
?ns_prefixes?
extensions/pyexpat.c:1602: error: ?xmlparseobject? has no member named
?itself?
extensions/pyexpat.c:1602: error: ?xmlparseobject? has no member named
?ns_prefixes?
extensions/pyexpat.c:1607: error: ?xmlparseobject? has no member named
?ordered_attributes?
extensions/pyexpat.c:1609: error: ?xmlparseobject? has no member named
?ordered_attributes?
extensions/pyexpat.c:1615: error: ?PyExc_ValueError? undeclared (first
use in this function)
extensions/pyexpat.c:1623: error: ?xmlparseobject? has no member named
?returns_unicode?
extensions/pyexpat.c:1628: error: ?xmlparseobject? has no member named
?specified_attributes?
extensions/pyexpat.c:1630: error: ?xmlparseobject? has no member named
?specified_attributes?
extensions/pyexpat.c:1642: error: too many arguments to function
?sethandler?
extensions/pyexpat.c:1645: error: ?PyExc_AttributeError? undeclared
(first use in this function)
extensions/pyexpat.c: At top level:
extensions/pyexpat.c:1676: error: expected ?=?, ?,?, ?;?, ?asm? or
?__attribute__? before ?Xmlparsetype?
extensions/pyexpat.c:1719: error: expected ?=?, ?,?, ?;?, ?asm? or
?__attribute__? before ?*? token
extensions/pyexpat.c:1766: error: expected ?=?, ?,?, ?;?, ?asm? or
?__attribute__? before ?*? token
extensions/pyexpat.c:1778: error: array type has incomplete element type
extensions/pyexpat.c:1779: error: expected ?}? before ?pyexpat_ParserCreate?
extensions/pyexpat.c:1781: error: expected ?}? before ?pyexpat_ErrorString?
extensions/pyexpat.c:1797: error: expected ?=?, ?,?, ?;?, ?asm? or
?__attribute__? before ?*? token
extensions/pyexpat.c: In function ?initpyexpat?:
extensions/pyexpat.c:1835: error: ?PyObject? undeclared (first use in
this function)
extensions/pyexpat.c:1835: error: ?m? undeclared (first use in this
function)
extensions/pyexpat.c:1835: error: ?d? undeclared (first use in this
function)
extensions/pyexpat.c:1835: warning: left-hand operand of comma
expression has no effect
extensions/pyexpat.c:1836: error: ?errmod_name? undeclared (first use in
this function)
extensions/pyexpat.c:1836: warning: implicit declaration of function
?PyString_FromString?
extensions/pyexpat.c:1837: error: ?errors_module? undeclared (first use
in this function)
extensions/pyexpat.c:1838: error: ?modelmod_name? undeclared (first use
in this function)
extensions/pyexpat.c:1839: error: ?model_module? undeclared (first use
in this function)
extensions/pyexpat.c:1840: error: ?sys_modules? undeclared (first use in
this function)
extensions/pyexpat.c:1848: error: ?Xmlparsetype? undeclared (first use
in this function)
extensions/pyexpat.c:1848: error: ?PyType_Type? undeclared (first use in
this function)
extensions/pyexpat.c:1851: warning: implicit declaration of function
?Py_InitModule3?
extensions/pyexpat.c:1855: error: ?ErrorObject? undeclared (first use in
this function)
extensions/pyexpat.c:1856: warning: implicit declaration of function
?PyErr_NewException?
extensions/pyexpat.c:1862: warning: implicit declaration of function
?PyModule_AddObject?
extensions/pyexpat.c:1866: error: expected expression before ?)? token
extensions/pyexpat.c:1868: warning: implicit declaration of function
?get_version_string?
extensions/pyexpat.c:1869: warning: implicit declaration of function
?PyModule_AddStringConstant?
extensions/pyexpat.c:1889: warning: implicit declaration of function
?PySys_GetObject?
extensions/pyexpat.c:1890: warning: implicit declaration of function
?PyModule_GetDict?
extensions/pyexpat.c:1891: warning: implicit declaration of function
?PyDict_GetItem?
extensions/pyexpat.c:1893: warning: implicit declaration of function
?PyModule_New?
extensions/pyexpat.c:1918: error: ?list? undeclared (first use in this
function)
extensions/pyexpat.c:1921: warning: implicit declaration of function
?PyErr_Clear?
extensions/pyexpat.c:1926: error: ?item? undeclared (first use in this
function)
extensions/pyexpat.c:1933: warning: implicit declaration of function
?PyList_Append?
extensions/pyexpat.c:1996: warning: implicit declaration of function
?PyModule_AddIntConstant?
extensions/pyexpat.c: In function ?clear_handlers?:
extensions/pyexpat.c:2023: error: ?PyObject? undeclared (first use in
this function)
extensions/pyexpat.c:2023: error: ?temp? undeclared (first use in this
function)
extensions/pyexpat.c:2027: error: ?xmlparseobject? has no member named
?handlers?
extensions/pyexpat.c:2029: error: ?xmlparseobject? has no member named
?handlers?
extensions/pyexpat.c:2030: error: ?xmlparseobject? has no member named
?handlers?
extensions/pyexpat.c:2032: error: ?xmlparseobject? has no member named
?itself?
error: command 'gcc' failed with exit status 1
From dieter at handshake.de Sun Feb 21 07:30:45 2010
From: dieter at handshake.de (Dieter Maurer)
Date: Sun, 21 Feb 2010 07:30:45 +0100
Subject: [XML-SIG] PyXML-0.8.4 error
In-Reply-To: <4B7BC94C.9020608@mbigroup.it>
References: <4B7BC94C.9020608@mbigroup.it>
Message-ID: <19328.54037.34726.70853@gargle.gargle.HOWL>
Vinicio Nocciolini wrote at 2010-2-17 11:47 +0100:
>I am using Ubuntu 9.10
>This is the error output
>regards Vinicio
>
>
>PyXML-0.8.4$ python2.5 setup.py build
>running build
>running build_py
>running build_ext
>building '_xmlplus.parsers.pyexpat' extension
>gcc -pthread -fno-strict-aliasing -DNDEBUG -g -fwrapv -O2 -Wall
>-Wstrict-prototypes -fPIC -DXML_NS=1 -DXML_DTD=1 -DBYTEORDER=1234
>-DXML_CONTEXT_BYTES=1024 -DHAVE_MEMMOVE=1 -Iextensions/expat/lib
>-I/usr/include/python2.5 -c extensions/pyexpat.c -o
>build/temp.linux-i686-2.5/extensions/pyexpat.o
>extensions/pyexpat.c:5:20: error: Python.h: No such file or directory
You need to install the development package for Python
(something like "python-dev") on your system.
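A quick way to confirm whether the headers are present before rebuilding is
to ask the interpreter where it expects Python.h. This is only a sketch: the
helper names are made up for illustration, and it assumes Python 2.7 or newer,
where the sysconfig module exists (on Python 2.5,
distutils.sysconfig.get_python_inc() plays the same role).

```python
import os
import sysconfig


def python_header_path():
    """Return the path where this interpreter expects to find Python.h."""
    include_dir = sysconfig.get_paths()["include"]
    return os.path.join(include_dir, "Python.h")


def have_python_headers():
    """True if Python.h exists, i.e. the development package is installed."""
    return os.path.isfile(python_header_path())


if __name__ == "__main__":
    if have_python_headers():
        print("Found " + python_header_path())
    else:
        print("Missing " + python_header_path()
              + "; install the Python development package")
```

If the file is reported missing, the long list of errors from pyexpat.c is
just the fallout of that one failed include.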
--
Dieter
From stefan_ml at behnel.de Sun Feb 21 12:14:11 2010
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Sun, 21 Feb 2010 12:14:11 +0100
Subject: [XML-SIG] Python working with XML
In-Reply-To: <710c80bb1002181242g5c0df17aid5710a1574761871@mail.gmail.com>
References: <710c80bb1002181242g5c0df17aid5710a1574761871@mail.gmail.com>
Message-ID: <4B811583.4020604@behnel.de>
James Johnston, 18.02.2010 21:42:
> I want to develop some tools for working with XML document files. This
> might require the validation of XML rules and converting various formats
> into XML. What is new and available? Some of the things I am reading are
> from 2000 - 2001.
Data conversion tends to be rather easy in Python. If you want to output
XML, there are multiple options, but you might want to start with the
xml.etree package in Python's standard library. If you need validation, use
lxml instead.
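As a rough illustration of the standard-library route, here is a minimal
sketch that turns a list of flat dicts into XML with xml.etree.ElementTree.
The function name, the tag names and the sample record are invented for the
example, it assumes the dict keys are valid XML element names, and it is
written for Python 3 (encoding="unicode" is not available in Python 2).

```python
import xml.etree.ElementTree as ET


def records_to_xml(records, root_tag="records", item_tag="record"):
    """Convert a list of flat dicts into an XML string."""
    root = ET.Element(root_tag)
    for record in records:
        item = ET.SubElement(root, item_tag)
        # Sort the keys so the output order is predictable.
        for key in sorted(record):
            child = ET.SubElement(item, key)
            child.text = str(record[key])
    # encoding="unicode" returns text instead of bytes.
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(records_to_xml([{"name": "report", "pages": 12}]))
```

lxml offers a largely compatible ElementTree-style API, so code like this
usually ports over with few changes if validation is needed later.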
Stefan
From kulthum91 at gmail.com Mon Feb 22 14:24:12 2010
From: kulthum91 at gmail.com (sharifah ummu kulthum)
Date: Mon, 22 Feb 2010 21:24:12 +0800
Subject: [XML-SIG] HTML parse error
Message-ID: <437a31571002220524g5a51facfibbdbe8ab64530c0@mail.gmail.com>
Hi guys
I am new to Python. I just installed Python yesterday for my MythTV
project. I found a site here
<https://sayap.com/blog/2008/12/30/mythtv-s-xmltv-grabber-for-malaysia-channels>
with a channel listing grabber for Malaysian channels for my
MythTV box, but I get the error below and I don't know what it means.
Any insight is very much appreciated as I am very new to Python.
bitto at bitto:~$ python grabmy.py -f my.xml
Traceback (most recent call last):
  File "grabmy.py", line 236, in <module>
    main()
  File "grabmy.py", line 225, in main
    for elem in grabber.grab(date + timedelta(i), **params_dict):
  File "grabmy.py", line 102, in grab
    html = self.get_html(date, **kwargs)
  File "grabmy.py", line 63, in get_html
    return BeautifulSoup(content)
  File "build/bdist.linux-i686/egg/BeautifulSoup.py", line 1499, in __init__
  File "build/bdist.linux-i686/egg/BeautifulSoup.py", line 1230, in __init__
  File "build/bdist.linux-i686/egg/BeautifulSoup.py", line 1263, in _feed
  File "/usr/lib/python2.6/HTMLParser.py", line 108, in feed
    self.goahead(0)
  File "/usr/lib/python2.6/HTMLParser.py", line 148, in goahead
    k = self.parse_starttag(i)
  File "/usr/lib/python2.6/HTMLParser.py", line 226, in parse_starttag
    endpos = self.check_for_whole_start_tag(i)
  File "/usr/lib/python2.6/HTMLParser.py", line 301, in check_for_whole_start_tag
    self.error("malformed start tag")
  File "/usr/lib/python2.6/HTMLParser.py", line 115, in error
    raise HTMLParseError(message, self.getpos())
HTMLParser.HTMLParseError: malformed start tag, at line 830, column 36
Bitto
From stefan_ml at behnel.de Mon Feb 22 15:06:24 2010
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Mon, 22 Feb 2010 15:06:24 +0100
Subject: [XML-SIG] HTML parse error
In-Reply-To: <437a31571002220524g5a51facfibbdbe8ab64530c0@mail.gmail.com>
References: <437a31571002220524g5a51facfibbdbe8ab64530c0@mail.gmail.com>
Message-ID: <4B828F60.9070209@behnel.de>
sharifah ummu kulthum, 22.02.2010 14:24:
> I am new to python. I have just installed python yesterday for my mythtv
> project. I found a site here for
> getting channel listing grabber to get channel for Malaysia for my
> mythtv box. but I get these. I don't know what it means
> [...]
> HTMLParser.HTMLParseError: malformed start tag, at line 830, column 36
It means that what you want to parse here is not valid HTML, i.e. the web
page is broken. The HTMLParser package in the standard library is not made
for parsing broken HTML. Use another tool like html5lib or lxml.html.
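To give an idea of what the switch looks like, here is a minimal sketch that
feeds deliberately broken markup to lxml.html and lets it repair the tree.
The sample markup and the cell_texts name are made up for the illustration,
and lxml is a third-party package that has to be installed separately.

```python
try:
    from lxml import html as lxml_html  # third-party package "lxml"
except ImportError:
    lxml_html = None  # lets this file load where lxml is not installed

# Hypothetical sample: the <td> and <tr> elements are never closed.
BROKEN = ("<table><tr><td>9.00pm<td><b>News</b>"
          "<tr><td>9.30pm<td>Weather</table>")


def cell_texts(markup):
    """Return the text of every <td>, letting lxml repair the markup."""
    doc = lxml_html.fromstring(markup)
    return [td.text_content().strip() for td in doc.iter("td")]


if __name__ == "__main__":
    if lxml_html is None:
        print("lxml is not installed")
    else:
        print(cell_texts(BROKEN))
```

The parser closes the dangling cells and rows on its own instead of raising
an exception, which is the behaviour you want for real-world web pages.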
Stefan
From stefan_ml at behnel.de Mon Feb 22 15:12:51 2010
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Mon, 22 Feb 2010 15:12:51 +0100
Subject: [XML-SIG] HTML parse error
In-Reply-To: <437a31571002220608x7d7abfcj2dc3622b2ad8474d@mail.gmail.com>
References: <437a31571002220524g5a51facfibbdbe8ab64530c0@mail.gmail.com>
<4B828F60.9070209@behnel.de>
<437a31571002220608x7d7abfcj2dc3622b2ad8474d@mail.gmail.com>
Message-ID: <4B8290E3.2070406@behnel.de>
sharifah ummu kulthum, 22.02.2010 15:08:
> On Mon, Feb 22, 2010 at 10:06 PM, Stefan Behnel wrote:
>
>> sharifah ummu kulthum, 22.02.2010 14:24:
>>> I am new to python. I have just installed python yesterday for my mythtv
>>> project. I found a site here
>>> <https://sayap.com/blog/2008/12/30/mythtv-s-xmltv-grabber-for-malaysia-channels>
>>> for
>>> getting channel listing grabber to get channel for Malaysia for my
>>> mythtv box. but I get these. I don't know what it means
>>> [...]
>>> HTMLParser.HTMLParseError: malformed start tag, at line 830, column 36
>> It means that what you want to parse here is not valid HTML, i.e. the web
>> page is broken. The HTMLParser package in the standard library is not made
>> for parsing broken HTML. Use another tool like html5lib or lxml.html.
>>
>> Stefan
>>
> Does it mean that I have to install the tool?
Yes. That's pretty easy, though. They should be readily packaged for your
platform (Linux), so you can just install them like any other software
package. Look out for "python-html5lib" or "python-lxml".
Stefan
From stefan_ml at behnel.de Mon Feb 22 15:46:27 2010
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Mon, 22 Feb 2010 15:46:27 +0100
Subject: [XML-SIG] HTML parse error
In-Reply-To: <437a31571002220524g5a51facfibbdbe8ab64530c0@mail.gmail.com>
References: <437a31571002220524g5a51facfibbdbe8ab64530c0@mail.gmail.com>
Message-ID: <4B8298C3.5040701@behnel.de>
sharifah ummu kulthum, 22.02.2010 14:24:
> File "grabmy.py", line 63, in get_html
> return BeautifulSoup(content)
> File "build/bdist.linux-i686/egg/BeautifulSoup.py", line 1499, in __init__
> File "build/bdist.linux-i686/egg/BeautifulSoup.py", line 1230, in __init__
> File "build/bdist.linux-i686/egg/BeautifulSoup.py", line 1263, in _feed
> File "/usr/lib/python2.6/HTMLParser.py", line 108, in feed
> self.goahead(0)
> File "/usr/lib/python2.6/HTMLParser.py", line 148, in goahead
> k = self.parse_starttag(i)
> File "/usr/lib/python2.6/HTMLParser.py", line 226, in parse_starttag
> endpos = self.check_for_whole_start_tag(i)
> File "/usr/lib/python2.6/HTMLParser.py", line 301, in
> check_for_whole_start_tag
> self.error("malformed start tag")
> File "/usr/lib/python2.6/HTMLParser.py", line 115, in error
> raise HTMLParseError(message, self.getpos())
> HTMLParser.HTMLParseError: malformed start tag, at line 830, column 36
Just noticed this now - you seem to be using BeautifulSoup, likely version
3.1. That version does not handle broken HTML well, so use
version 3.0.8 instead, or switch to the tools I indicated.
Note that switching tools means that you need to change your code to use
them. Just installing them is not enough.
Stefan
From kulthum91 at gmail.com Tue Feb 23 04:45:36 2010
From: kulthum91 at gmail.com (sharifah ummu kulthum)
Date: Tue, 23 Feb 2010 11:45:36 +0800
Subject: [XML-SIG] HTML parse error
In-Reply-To: <4B8298C3.5040701@behnel.de>
References: <437a31571002220524g5a51facfibbdbe8ab64530c0@mail.gmail.com>
<4B8298C3.5040701@behnel.de>
Message-ID: <437a31571002221945h1c0079d5i33d641b98f0fabfa@mail.gmail.com>
On Mon, Feb 22, 2010 at 10:46 PM, Stefan Behnel wrote:
> sharifah ummu kulthum, 22.02.2010 14:24:
> > File "grabmy.py", line 63, in get_html
> > return BeautifulSoup(content)
> > File "build/bdist.linux-i686/egg/BeautifulSoup.py", line 1499, in
> __init__
> > File "build/bdist.linux-i686/egg/BeautifulSoup.py", line 1230, in
> __init__
> > File "build/bdist.linux-i686/egg/BeautifulSoup.py", line 1263, in _feed
> > File "/usr/lib/python2.6/HTMLParser.py", line 108, in feed
> > self.goahead(0)
> > File "/usr/lib/python2.6/HTMLParser.py", line 148, in goahead
> > k = self.parse_starttag(i)
> > File "/usr/lib/python2.6/HTMLParser.py", line 226, in parse_starttag
> > endpos = self.check_for_whole_start_tag(i)
> > File "/usr/lib/python2.6/HTMLParser.py", line 301, in
> > check_for_whole_start_tag
> > self.error("malformed start tag")
> > File "/usr/lib/python2.6/HTMLParser.py", line 115, in error
> > raise HTMLParseError(message, self.getpos())
> > HTMLParser.HTMLParseError: malformed start tag, at line 830, column 36
>
> Just noticed this now - you seem to be using BeautifulSoup, likely version
> 3.1. This version does not support parsing broken HTML any well, so use
> version 3.0.8 instead, or switch to the tools I indicated.
>
> Note that switching tools means that you need to change your code to use
> them. Just installing them is not enough.
>
> Stefan
>
>
I am so sorry, but I really don't know how to change the code, as I have
only just started learning Python. How do I switch the version or change
the code? I don't really understand the code.
Here is the code:
'''
Copyright (c) 2008 Yap Sok Ann

This module contains xmltv grabbers for Malaysia channels.
'''

__author__ = 'Yap Sok Ann '
__license__ = 'PSF License'

import logging

from datetime import date as dt
from datetime import datetime, time, timedelta
from dateutil.tz import tzlocal
from httplib2 import Http
from lxml import etree
from urllib import urlencode
from BeautifulSoup import BeautifulSoup

channels = ['rtm1', 'rtm2', 'tv3', 'ntv7', '8tv', 'tv9']
datetime_format = '%Y%m%d%H%M%S %z'

h = Http()
h.force_exception_to_status_code = True
#h.timeout = 15

logging.basicConfig(
    level=logging.DEBUG,
    format='%(asctime)s %(levelname)-8s %(process)d %(message)s',
)
log = logging.getLogger(__name__)


def strclean(s):
    s = s.strip().replace('‘', '\'').replace('’', '\'')
    if s != ' ':
        return s


class Grabber(object):

    base_url = None

    def __init__(self, channel):
        self.channel = channel
        self.url = self.base_url

    def qs_params(self, date, **kwargs):
        '''Returns a dict of params to form the url's query string
        '''
        raise NotImplementedError

    def _parse_html(self, date, html):
        '''Returns a list of dicts with the following keys:
        - mandatory: title, start
        - optional: stop, sub_title, desc, episode_number, episode_system
        '''
        raise NotImplementedError

    def get_html(self, date, **kwargs):
        params = self.qs_params(date, **kwargs)
        response, content = h.request(self.url + '?' + urlencode(params))
        if response.status == 200:
            return BeautifulSoup(content)
        else:
            log.error('Status: %s\nContent: %s' % (response.status, content))

    def parse_html(self, date, html):
        prev_schedule = None
        try:
            for schedule in self._parse_html(date, html):
                if 'stop' in schedule:
                    yield schedule
                elif prev_schedule:
                    prev_schedule['stop'] = schedule['start']
                    yield prev_schedule
                prev_schedule = schedule
        except:
            log.exception('Cannot parse html for date %s' % date)

    def to_xml(self, schedules):
        for schedule in schedules:
            program = etree.Element('programme', channel=self.channel,
                start=schedule['start'].strftime(datetime_format),
                stop=schedule['stop'].strftime(datetime_format))
            title = etree.SubElement(program, 'title')
            title.text = schedule['title']
            if schedule.get('episode_num'):
                episode_num = etree.SubElement(program, 'episode-num')
                episode_num.set('system', schedule.get('episode_system'))
                episode_num.text = schedule['episode_num']
            for field in ['sub_title', 'desc']:
                if schedule.get(field):
                    elem = etree.SubElement(program, field.replace('_', '-'))
                    elem.text = schedule[field]
            yield program

    def grab(self, date, **kwargs):
        html = self.get_html(date, **kwargs)
        if html:
            return self.to_xml(self.parse_html(date, html))


class Astro(Grabber):

    base_url = 'http://www.astro.com.my/channels/%(channel)s/Default.asp'
    params_dicts = [dict(batch=1),
                    dict(batch=2)]
    ignores = ['No Transmission', 'Transmission Ends']

    def __init__(self, channel):
        self.channel = channel
        self.url = self.base_url % dict(channel=channel)

    def qs_params(self, date, **kwargs):
        kwargs['sDate'] = date.strftime('%d-%b-%Y')
        return kwargs

    def _parse_html(self, date, html):
        header_row = html.find('tr', bgcolor='#29487F')
        for tr in header_row.fetchNextSiblings('tr'):
            tds = tr.findChildren('td')
            title = strclean(tds[1].find('a').string)
            if title in self.ignores:
                continue
            # start time, '21:00' -> 9 PM
            hour, minute = [int(x) for x in tds[0].string.split(':')]
            start = datetime.combine(date,
                time(hour, minute, tzinfo=tzlocal()))
            # duration, '00:30' -> 30 minutes
            hours, minutes = [int(x) for x in tds[2].string.split(':')]
            stop = start + timedelta(hours=hours, minutes=minutes)
            yield dict(title=title, start=start, stop=stop)


class TheStar(Grabber):

    base_url = 'http://star-ecentral.com/tvnradio/tvguide/guide.asp'
    params_dicts = [dict(db='live')]

    def qs_params(self, date, **kwargs):
        kwargs['pdate'] = date.strftime('%m/%d/%Y')
        kwargs['chn'] = self.channel.replace('rtm', 'tv')
        return kwargs

    def _parse_html(self, date, html):
        last_ampm = None
        header_row = html.find('tr', bgcolor='#5e789c')
        for tr in header_row.fetchNextSiblings('tr'):
            tds = tr.findChildren('td')
            schedule = {}
            schedule['title'] = strclean(tds[1].find('b').find('font').string)
            schedule['desc'] = strclean(tds[2].find('font').string)
            episode_num = strclean(tds[3].find('font').string)
            if episode_num:
                try:
                    episode_num = int(episode_num) - 1
                    episode_num = '.' + str(episode_num) + '.'
                    episode_system = 'xmltv_ns'
                except ValueError:
                    episode_system = 'onscreen'
                schedule['episode_num'] = episode_num
                schedule['episode_system'] = episode_system
            # start time, '9.00pm' -> 9 PM
            time_str = tds[0].find('font').string
            ampm = time_str[-2:]
            hour, minute = [int(x) for x in time_str[:-2].split('.')]
            if ampm == 'pm' and hour < 12:
                hour += 12
            elif ampm =='am' and hour == 12:
                hour = 0
            if last_ampm == 'pm' and ampm == 'am':
                date = date + timedelta(1)
            schedule['start'] = datetime.combine(
                date, time(hour, minute, tzinfo=tzlocal()))
            last_ampm = ampm
            yield schedule


def main():
    from optparse import OptionParser
    parser = OptionParser()
    parser.add_option('-s', '--source', dest='source',
        help='SOURCE to grab from: Astro, TheStar. Default: TheStar')
    parser.add_option('-d', '--date', dest='date',
        help='Start DATE to grab schedules for (YYYY-MM-DD). Default: today')
    parser.add_option('-n', '--days', dest='days',
        help='Number of DAYS to grab schedules for. Default: 1')
    parser.add_option('-f', '--file', dest='filename', metavar='FILE',
        help='Output FILE to write to. Default: stdout')
    options, args = parser.parse_args()
    if options.source is None:
        cls = TheStar
    else:
        cls = globals()[options.source]
    if options.date is None:
        date = dt.today()
    else:
        date = dt(*[int(x) for x in options.date.split('-')])
    if options.days is None:
        days = 1
    else:
        days = int(options.days)
    root = etree.Element('tv')
    for channel in channels:
        grabber = cls(channel)
        for i in range(days):
            for params_dict in cls.params_dicts:
                for elem in grabber.grab(date + timedelta(i), **params_dict):
                    root.append(elem)
    xml = etree.tostring(root, encoding='UTF-8', xml_declaration=True,
        pretty_print=True)
    if options.filename is None:
        print xml
    else:
        open(options.filename, 'w').write(xml)


if __name__ == '__main__':
    main()
From stefan_ml at behnel.de Tue Feb 23 10:46:33 2010
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Tue, 23 Feb 2010 10:46:33 +0100
Subject: [XML-SIG] HTML parse error
In-Reply-To: <437a31571002221945h1c0079d5i33d641b98f0fabfa@mail.gmail.com>
References: <437a31571002220524g5a51facfibbdbe8ab64530c0@mail.gmail.com>
<4B8298C3.5040701@behnel.de>
<437a31571002221945h1c0079d5i33d641b98f0fabfa@mail.gmail.com>
Message-ID: <4B83A3F9.3050202@behnel.de>
sharifah ummu kulthum, 23.02.2010 04:45:
> I am so sorry but I really don't know how to change the code as I have just
> learn python. How am I going to switch the version or to change the code?
> Because I don't really understand the code.
>
> Here is the code:
> [...]
That's some funny code - it uses BeautifulSoup to parse HTML, and then uses
lxml to build an XML tree from it - instead of using just lxml in the first
place...
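For what it's worth, an lxml-only version could look roughly like the sketch
below: lxml.html parses the page and lxml.etree builds the output tree. The
sample markup, the listing_to_xml name and the two-column layout are
assumptions made up for the illustration, not the layout of the real listing
pages, and the start value is copied verbatim rather than converted to the
XMLTV timestamp format.

```python
try:
    from lxml import etree, html  # third-party package "lxml"
except ImportError:
    etree = html = None  # lets this file load where lxml is not installed

# Hypothetical listing page: a header row, then time/title rows.
SAMPLE = ("<table><tr><th>Time<th>Title"
          "<tr><td>9.00pm<td>News<tr><td>9.30pm<td>Weather</table>")


def listing_to_xml(markup, channel):
    """Parse an HTML listing and build a <tv> tree, using lxml for both."""
    root = etree.Element("tv")
    for tr in html.fromstring(markup).iter("tr"):
        cells = [td.text_content().strip() for td in tr.iter("td")]
        if len(cells) < 2:
            continue  # header row or incomplete row
        programme = etree.SubElement(root, "programme",
                                     channel=channel, start=cells[0])
        etree.SubElement(programme, "title").text = cells[1]
    return root


if __name__ == "__main__":
    if etree is None:
        print("lxml is not installed")
    else:
        tree = listing_to_xml(SAMPLE, "tv3")
        print(etree.tostring(tree, pretty_print=True).decode("utf-8"))
```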
Please send an e-mail to the original author of the tool to tell him/her
about the problem. Use the project mailing list for this (if there is one).
If that doesn't help, I'd suggest installing BeautifulSoup 3.0.8 to see if
that helps.
Stefan
From kulthum91 at gmail.com Tue Feb 23 11:28:39 2010
From: kulthum91 at gmail.com (sharifah ummu kulthum)
Date: Tue, 23 Feb 2010 18:28:39 +0800
Subject: [XML-SIG] HTML parse error
In-Reply-To: <4B83A3F9.3050202@behnel.de>
References: <437a31571002220524g5a51facfibbdbe8ab64530c0@mail.gmail.com>
<4B8298C3.5040701@behnel.de>
<437a31571002221945h1c0079d5i33d641b98f0fabfa@mail.gmail.com>
<4B83A3F9.3050202@behnel.de>
Message-ID: <437a31571002230228t6058e426j2f5aa4eaac193a9@mail.gmail.com>
On Tue, Feb 23, 2010 at 5:46 PM, Stefan Behnel wrote:
> sharifah ummu kulthum, 23.02.2010 04:45:
> > I am so sorry but I really don't know how to change the code as I have
> just
> > learn python. How am I going to switch the version or to change the code?
> > Because I don't really understand the code.
> >
> > Here is the code:
> > [...]
>
> That's some funny code - it uses BeautifulSoup to parse HTML, and then uses
> lxml to build an XML tree from it - instead of using just lxml in the first
> place...
>
> Please send an e-mail to the original author of the tool to tell him/her
> about the problem. Use the project mailing list for this (if there is one).
> If that doesn't help, I'd suggest installing BeautifulSoup 3.0.8 to see if
> that helps.
>
> Stefan
>
I have sent an email to the author, but I doubt there will be a quick
response. This project does not have a mailing list. It is just an
individual class project that I have to complete, and the deadline is
close now. How can I install BeautifulSoup 3.0.8?
# sudo easy_install BeautifulSoup 3.0.8
Like this?
From stefan_ml at behnel.de Tue Feb 23 11:51:29 2010
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Tue, 23 Feb 2010 11:51:29 +0100
Subject: [XML-SIG] HTML parse error
In-Reply-To: <437a31571002230228t6058e426j2f5aa4eaac193a9@mail.gmail.com>
References: <437a31571002220524g5a51facfibbdbe8ab64530c0@mail.gmail.com>
<4B8298C3.5040701@behnel.de>
<437a31571002221945h1c0079d5i33d641b98f0fabfa@mail.gmail.com>
<4B83A3F9.3050202@behnel.de>
<437a31571002230228t6058e426j2f5aa4eaac193a9@mail.gmail.com>
Message-ID: <4B83B331.5020703@behnel.de>
sharifah ummu kulthum, 23.02.2010 11:28:
> On Tue, Feb 23, 2010 at 5:46 PM, Stefan Behnel wrote:
>> sharifah ummu kulthum, 23.02.2010 04:45:
>>> I am so sorry but I really don't know how to change the code as I have
>> just
>>> learn python. How am I going to switch the version or to change the code?
>>> Because I don't really understand the code.
>>>
>>> Here is the code:
>>> [...]
>> That's some funny code - it uses BeautifulSoup to parse HTML, and then uses
>> lxml to build an XML tree from it - instead of using just lxml in the first
>> place...
>>
>> Please send an e-mail to the original author of the tool to tell him/her
>> about the problem. Use the project mailing list for this (if there is one).
>> If that doesn't help, I'd suggest installing BeautifulSoup 3.0.8 to see if
>> that helps.
>>
>> I have sent an email to the author and I doubt that it will be a quick
> respond. And this project does not have a mailing list. This is just an
> individual class project that I have to complete which the deadline is so
> close now. How can I install BeautifulSoup 3.0.8?
>
> # sudo easy_install BeautifulSoup 3.0.8
>
> like this?
You should consider reading the documentation of easy_install. That would
have told you that you can use
# sudo easy_install BeautifulSoup==3.0.8
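After pinning a version it can help to confirm what actually got installed.
A small sketch, assuming a Python new enough to have importlib.metadata
(3.8 or later, so not the Python 2.6 used in this thread); installed_version
is just an illustrative helper name.

```python
from importlib import metadata  # standard library since Python 3.8


def installed_version(dist_name):
    """Return the installed version string of a distribution, or None."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    print(installed_version("BeautifulSoup"))
```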
Note that this (and most of the previous thread) is rather off-topic to
this list. The comp.lang.python newsgroup would have been a better choice.
Stefan