Regular Expressions: large amount of or's

Daniel Yoo dyoo at
Mon Mar 14 23:26:31 CET 2005

Scott David Daniels <Scott.Daniels at> wrote:

: I have a (very high speed) modified Aho-Corasick machine that I sell.
: The calling model that I found works well is:

:      def chases(self, sourcestream, ...):
:           '''A generator taking a generator of source blocks,
:           yielding (matches, position) pairs where position is an
:           offset within the "current" block.
:           '''

: You might consider taking a look at providing that form.

Hi Scott,

No problem, I'll be happy to do this.

I need some clarification on the calling model though.  Would this be
an accurate test case?

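    def testChasesInterface(self):
        # Assumption: self.tree already holds the keywords "python" and "is".
        sourceBlocks = ("python programming is fun",
                        "how much is that python in the window")
        sourceStream = iter(sourceBlocks)
        self.assertEqual([
                           (sourceBlocks[0], (0, 6)),
                           (sourceBlocks[0], (19, 21)),
                           (sourceBlocks[1], (9, 11)),
                           (sourceBlocks[1], (17, 23)),
                         ],
                         list(self.tree.chases(sourceStream)))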

Here, I'm assuming that chases() takes in a 'sourceStream', which is
an iterator of text blocks, and that the return value is itself an
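
For illustration only, here is a minimal sketch of that calling model: a
generator that consumes an iterator of text blocks and yields
(block, (start, end)) pairs, with offsets relative to the current block.
It is not the Aho-Corasick machine discussed in this thread; a naive
str.find() scan stands in for the automaton, and the function signature,
the keyword set ("python", "is"), and the half-open (start, end) offsets
are assumptions read off the expected results in the test case.

```python
def chases(keywords, source_stream):
    """Yield (block, (start, end)) for each keyword hit in each block.

    Hypothetical stand-in: a naive str.find() scan per keyword replaces
    the automaton. Offsets are relative to the current block, and matches
    that straddle two blocks are not detected.
    """
    for block in source_stream:
        hits = []
        for word in keywords:
            start = block.find(word)
            while start != -1:
                hits.append((start, start + len(word)))
                start = block.find(word, start + 1)
        # Report hits in order of position within the block.
        for span in sorted(hits):
            yield block, span


source_blocks = ("python programming is fun",
                 "how much is that python in the window")
results = list(chases(("python", "is"), iter(source_blocks)))
```

Because chases() is a generator, the caller can feed it blocks lazily
(for example, chunks read from a file) and handle matches as they arrive
rather than loading the whole text first.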

Best of wishes!

