[XML-SIG] Thread safe XML parser
Tom Kirkpatrick
tom at settopsolutions.com
Thu Sep 28 10:47:14 CEST 2006
I'm having issues using pyExpat from within a thread... I'm getting
the following error:
python: Modules/gcmodule.c:379: move_unreachable: Assertion `gc->gc.gc_refs > 0' failed.
The code is like so:
    def _handle_success( self ):
        """ called once the fetcher succeeds """
        self.log.debug( "XMLFetcher succeeded fetching %s", self.uri )
        callback = MainThreadCallback( self.signals[ "success" ].emit )
        callback()

    def _handle_error( self ):
        """ called if the fetch attempt fails """
        self.log.debug( "XMLFetcher reached retry limit" )
        callback = MainThreadCallback( self.signals[ "failure" ].emit )
        callback()

    def _do_fetch( self ):
        """ does the work of fetching and processing the xml file
        from the source url """
        reader = PyExpat.Reader()
        for i in range( 0, self.retry_limit ):
            self.try_count += 1
            self.log.debug( "Attempting fetch %s: %s of %s",
                            self.uri, self.try_count, self.retry_limit )
            try:
                # this is the call that triggers the assertion
                self.xml = reader.fromUri( self.uri ).documentElement
                self._handle_success()
                return
            except ExpatError, e:
                self.log.error( "Could not parse XML file" )
            except HTTPError, e:
                self.log.warning( "HTTP-Error whilst attempting to fetch %s: %s"
                                  % (self.uri, e.code) )
            except URLError, e:
                self.log.warning( "URL-Error whilst attempting to fetch %s: %s"
                                  % (self.uri, e.reason) )
            # only reached when the attempt failed; wait before retrying
            time.sleep( self.retry_interval )
        self._handle_error()

    def fetch( self ):
        """ spawns a new thread to fetch the xml file asynchronously """
        thread = Thread( self._do_fetch )
        thread.start()
        return None
------------------------
The offending line is:
    self.xml = reader.fromUri( self.uri ).documentElement
Comment that out and it runs OK (although I get no XML back!). I
have also tried a slightly different method, fetching the file with
urlopen and then using reader.fromStream to do the parsing, but I
still get the same error:
    ...
    try:
        config_file = urllib.urlopen( self.uri )
        self.xml = reader.fromStream( config_file ).documentElement
        self._handle_success()
        return
    ...
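To check whether the failure depends on the network fetch at all, the parse step can be run on its own in a worker thread against an in-memory document. The sketch below is only a stand-in under stated assumptions: it uses Python 3's standard threading and xml.dom.minidom modules (minidom is also built on expat) rather than PyXML's PyExpat.Reader and the Thread class used in the code above, and SAMPLE_XML and parse_in_thread are made-up names.

```python
import threading
import xml.dom.minidom

# Made-up test document; any small well-formed XML would do.
SAMPLE_XML = b"<config><item name='a'/><item name='b'/></config>"


def parse_in_thread(data):
    """Parse `data` in a worker thread; return (root_tag_name, error)."""
    result = {}

    def worker():
        try:
            doc = xml.dom.minidom.parseString(data)
            result["root"] = doc.documentElement.tagName
        except Exception as exc:
            # Hand the exception back instead of losing it in the thread.
            result["error"] = exc

    thread = threading.Thread(target=worker)
    thread.start()
    thread.join()  # wait so the result can be inspected here
    return result.get("root"), result.get("error")


if __name__ == "__main__":
    print(parse_in_thread(SAMPLE_XML))
```

If a script like this runs cleanly while the real code still aborts, that would point away from threaded parsing in general and towards the particular reader or the way the result is shared with the main thread.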
If I move the XML parsing out of the thread it runs fine, although
that's the bit that takes the time and so the bit that needs
threading the most. I have searched the net trying to find
information about Python XML parsing and thread safety, but am not
having much luck...
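For reference, the arrangement that does run (download in the worker thread, parse in the calling thread) can be sketched roughly as follows. This is an illustration under assumptions, not the actual application code: it uses Python 3's standard queue, threading and xml.dom.minidom modules instead of PyExpat.Reader, the names fetch_worker and fetch_then_parse are hypothetical, and `fetch` is any callable that returns the document bytes for a URI (for example a thin wrapper around urllib.request.urlopen).

```python
import queue
import threading
import xml.dom.minidom


def fetch_worker(fetch, uri, results):
    """Runs in the worker thread: download only, no XML parsing."""
    try:
        results.put((fetch(uri), None))
    except Exception as exc:
        # Pass the failure back to the caller's thread.
        results.put((None, exc))


def fetch_then_parse(fetch, uri):
    """Download in a worker thread, then parse in the calling thread."""
    results = queue.Queue()
    thread = threading.Thread(target=fetch_worker, args=(fetch, uri, results))
    thread.start()
    # Blocks for simplicity; a GUI program would instead have the worker
    # notify its main loop when the data is ready.
    data, error = results.get()
    thread.join()
    if error is not None:
        raise error
    return xml.dom.minidom.parseString(data).documentElement
```

The drawback is the one already mentioned: the parse, which is the slow part, still runs outside the worker thread.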
Does anyone know of an XML parsing module with XPath support that is
thread safe? Or can anyone suggest a way round this problem, or even
give some pointers as to what is actually causing it?
Many thanks,
Tom