Parsing text

Shawn Milochik Shawn at Milochik.com
Wed May 6 14:49:49 EDT 2009


On Wed, May 6, 2009 at 2:32 PM, iainemsley <iainemsley at googlemail.com> wrote:
> Hi,
> I'm trying to write a fairly basic text parser to split up scenes and
> acts in plays to put them into XML. I've managed to get the text split
> into the blocks of scenes and acts and returned correctly but I'm
> trying to refine this and get the relevant scene number when the split
> is made but I keep getting a NoneType error trying to read the block
> inside the for loop and nothing is being returned. I'd be grateful for
> some suggestions as to how to get this working.
>
> for scene in text.split('Scene'):
>    num = re.compile("^\s\[0-9, i{1,4}, v]", re.I)
>    textNum = num.match(scene)
>    if textNum:
>        print textNum
>    else:
>        print "No scene number"
>    m = '<div type="scene>'
>    m += scene
>    m += '<\div>'
>    print m
>
> Thanks, Iain


Can you provide some sample input so we can recreate the problem?

Also, consider something like this instead of the concatenation:

m = '<div type="scene">%s</div>' % (scene,)
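
For the scene number itself, here is a minimal sketch. Without sample input I'm assuming each heading looks like "Scene 1" or "Scene iv". Note that the pattern in your post escapes the opening bracket (\[), so it looks for a literal "[" instead of starting a character class, and "print textNum" prints the match object rather than the number (use .group() for that). The function name wrap_scenes, the "n" attribute and the sample string are made up for illustration.

```python
import re
from xml.sax.saxutils import escape

# Assumption: the scene number is either digits or a roman numeral built
# from i, v and x, and it comes right after the word 'Scene'.
SCENE_NUM = re.compile(r"^\s*(\d+|[ivx]+)\b", re.I)


def wrap_scenes(text):
    """Return a list with one <div> string per scene found in text."""
    divs = []
    # Everything before the first 'Scene' is not a scene, so skip it.
    for scene in text.split('Scene')[1:]:
        match = SCENE_NUM.match(scene)
        if match:
            number = match.group(1)      # the matched text, not the match object
            body = scene[match.end():]   # the rest of the block after the number
        else:
            number = ''
            body = scene
        # escape() keeps stray &, < and > in the play text from breaking the XML.
        divs.append('<div type="scene" n="%s">%s</div>'
                    % (number, escape(body.strip())))
    return divs


# Made-up sample input, only to show the shape of the result.
sample = "Act I Scene 1 A heath. Scene ii A camp."
result = wrap_scenes(sample)
```

Compiling the pattern once outside the loop also saves rebuilding it for every scene. If your headings are formatted differently, the pattern will need adjusting.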


