
Itamar Shtull-Trauring wrote:
> [...] What you mean by "traditional" is actually a pull parser. Parsing APIs can be pull or push (i.e. asynchronous). Well-designed parsers are always push, because push parsers can be trivially converted to blocking pull parsers, but not vice-versa. Some examples of push/asynch parsers: twisted's Protocol class, or the SAX API.

Sorry, I think my example was somewhat misleading, and it has also become clear to me that I didn't use the word "asynchronous" correctly. I hadn't considered that one can also register callbacks with a parser, for example, and call this style of programming asynchronous. (The principle "Don't call us, we'll call you" would apply here too, of course.)
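To make the push-to-pull conversion from the quoted text concrete for myself, here is a minimal sketch using the incremental SAX parser from Python's standard library. The names TagCollector and pull_tags are made up for this illustration; the handler only records start tags, which is an assumption chosen to keep the example short.

```python
from xml.sax import make_parser
from xml.sax.handler import ContentHandler


class TagCollector(ContentHandler):
    """Push side: the parser calls startElement() for every start tag."""

    def __init__(self):
        super().__init__()
        self.events = []

    def startElement(self, name, attrs):
        self.events.append(name)


def pull_tags(chunks):
    """Pull side: a blocking generator layered on top of the push parser.

    `chunks` is any iterable of pieces of one XML document.
    """
    handler = TagCollector()
    parser = make_parser()
    parser.setContentHandler(handler)
    for chunk in chunks:
        parser.feed(chunk)            # push data into the parser ...
        while handler.events:         # ... then hand out what it produced
            yield handler.events.pop(0)
    parser.close()                    # flush anything the parser held back
    while handler.events:
        yield handler.events.pop(0)
```

The caller simply iterates over pull_tags(...) and never sees the callbacks. Going the other way, from a pull parser that blocks on read() to a push interface, has no equally simple wrapper, which I take to be the point of the quoted remark.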
No, what I really meant by "traditional" was writing parsers and generators that traverse the whole document in one large step, without giving the Twisted reactor a chance to process any other events.

Let's assume I got a DOM tree from Python's XML parser. First, I'd traverse that tree and build up another tree consisting of element objects. Each element object is an instance of a class corresponding to a tag; for example, for the tag <chapter> I'd create a class "chapter". This is necessary because there isn't always a one-to-one correspondence between tags and my document elements, and because I want to associate some additional attributes with such elements later, for example automatically generated chapter numbers.

I'd then use the generator to traverse that element tree, calling "render_element" methods on my way. For the element chapter with the attribute title I'd call "render_chapter( node )", which then generates "<h1>chapter_title</h1>".

Let's assume I had some element with child elements. Without knowing about Twisted at all, I'd have written a loop to process each child, like this:

    for child_node in root_elem.children:
        if child_node.type == chapter:
            processChapter( child_node )

My idea now is that, depending on the number of child elements, looping could take some time. So instead I'd use Twisted's reactor, specifically its callLater method, like this (it's only pseudo code!):

    class Generator:
        def generate_html( self ):
            self.d = defer.Deferred()
            self.startProcessing()
            return self.d

        def startProcessing( self ):
            self.current_element = root_elem
            self.processNextElement()

        def processNextElement( self ):
            if more elements to process:
                if self.current_element.type == chapter:
                    reactor.callLater( 0, self.processChapter, self.current_element )
                .....
            else:
                self.d.callback( "finished" )

In this way any Twisted user could get a Deferred from the generate_html method and be called back when the Generator has generated all the HTML.
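For the first half of this (DOM tree to element objects to "render_..." methods), a minimal sketch could look as follows. It assumes xml.dom.minidom as the parser; the names Element, Chapter, ELEMENT_CLASSES, build_tree and HtmlRenderer are invented for the illustration and are not taken from any existing code.

```python
import html
from xml.dom import minidom


class Element:
    """Base class for document elements built from DOM nodes."""

    def __init__(self, node):
        self.attrs = dict(node.attributes.items())
        self.children = []


class Chapter(Element):
    def __init__(self, node):
        super().__init__(node)
        self.number = None  # to be filled in later, e.g. automatic numbering


# Hypothetical tag-to-class table; unknown tags fall back to Element.
ELEMENT_CLASSES = {"chapter": Chapter}


def build_tree(node):
    """Recursively convert a DOM element node into a tree of Element objects."""
    elem = ELEMENT_CLASSES.get(node.tagName, Element)(node)
    for child in node.childNodes:
        if child.nodeType == child.ELEMENT_NODE:
            elem.children.append(build_tree(child))
    return elem


class HtmlRenderer:
    def render(self, elem):
        # Dispatch on the class name: Chapter -> render_chapter, and so on.
        name = "render_" + type(elem).__name__.lower()
        return getattr(self, name, self.render_default)(elem)

    def render_default(self, elem):
        # Elements without their own method just render their children.
        return "".join(self.render(child) for child in elem.children)

    def render_chapter(self, elem):
        title = html.escape(elem.attrs.get("title", ""))
        return "<h1>%s</h1>" % title + self.render_default(elem)
```

Used like this, the whole document is still rendered in one blocking call, which is exactly the "traditional" style I meant:

    doc = minidom.parseString('<doc><chapter title="Intro"/></doc>')
    print(HtmlRenderer().render(build_tree(doc.documentElement)))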
The problem with this is, of course, that I could never use such code without also installing Twisted. It's more or less clear to me how to divide the traversal of such a DOM tree into discrete steps, but it's not so clear how to call processNextElement with reactor.callLater from the outside.

Although, after reading the other answers, it seems to me I'm not far from a solution. I think I could also create two classes: the Generator class, which would provide a processNextElement method and wouldn't need to depend on the Twisted framework, and a TwistedGenerator class, which would do exactly the same as the code above and repeatedly call processNextElement with reactor.callLater.

But the internal housekeeping of which element to process next could be more difficult than with the solution above, couldn't it? (Because instead of separate methods like "processChapter", "processList", etc. I'd only have one method to call from outside, "processNextElement" (and something like "moreElementsToProcess"). The TwistedGenerator wrapper shouldn't know about the internal state of the Generator, I think.)

Many greetings,
Jürgen
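
P.S. A rough sketch of the two-class idea, to show that the housekeeping may not be too bad: the Generator keeps an explicit stack of pending elements and dispatches to the per-type methods itself, so the wrapper only ever sees two methods. Everything here is an assumption made for illustration: elements are plain dicts, and run_in_steps takes the scheduler as a parameter so the sketch runs without Twisted installed.

```python
class Generator:
    """Framework-independent: handles one element per call."""

    def __init__(self, root):
        self._pending = [root]   # explicit stack instead of recursion
        self.output = []

    def more_elements_to_process(self):
        return bool(self._pending)

    def process_next_element(self):
        elem = self._pending.pop()
        # Internal dispatch: "chapter" -> process_chapter, and so on.
        handler = getattr(self, "process_" + elem["type"], None)
        if handler is not None:
            handler(elem)
        # Push children in reverse so they are popped in document order.
        self._pending.extend(reversed(elem.get("children", [])))

    def process_chapter(self, elem):
        self.output.append("<h1>%s</h1>" % elem["title"])


def run_in_steps(gen, call_later, on_done):
    """Drive `gen` one element at a time through a scheduler callable.

    `call_later(delay, fn)` schedules fn; `on_done(html)` receives the result.
    """
    def step():
        if gen.more_elements_to_process():
            gen.process_next_element()
            call_later(0, step)   # yield to the event loop between elements
        else:
            on_done("".join(gen.output))

    call_later(0, step)
```

With Twisted, the wrapper would pass reactor.callLater as call_later and the callback method of a Deferred as on_done. Without Twisted, a plain "while gen.more_elements_to_process(): gen.process_next_element()" loop does the same work in one go.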