Parsing text
MRAB
google at mrabarnett.plus.com
Wed May 6 14:55:21 EDT 2009
iainemsley wrote:
> Hi,
> I'm trying to write a fairly basic text parser to split up scenes and
> acts in plays to put them into XML. I've managed to get the text split
> into blocks of scenes and acts and returned correctly. Now I'm trying
> to refine this and get the relevant scene number when the split is
> made, but I keep getting a NoneType error when trying to read the block
> inside the for loop, and nothing is being returned. I'd be grateful for
> some suggestions as to how to get this working.
>
> for scene in text.split('Scene'):
>     num = re.compile("^\s\[0-9, i{1,4}, v]", re.I)
>     textNum = num.match(scene)
>     if textNum:
>         print textNum
>     else:
>         print "No scene number"
>     m = '<div type="scene>'
>     m += scene
>     m += '<\div>'
>     print m
>
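The problem is with your regular expression. The backslash in "\[" makes
the bracket a literal character instead of opening a character class, so
the pattern looks for the literal text "[0-9, " followed by one to four
i's and ", v]", and it won't match an ordinary scene heading.
Unfortunately, I can't tell what you're trying to match. Could you
provide some examples of the scene numbers?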
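
In the meantime, here is a rough sketch of one way it could work. It is
only a guess at the format: it assumes each block left by
text.split('Scene') begins with optional whitespace followed by either
digits or a Roman numeral made of i, v and x. The names SCENE_NUM and
scene_number and the sample text are made up for illustration, and the
pattern would need adjusting once the real headings are known.

```python
import re

# Assumed heading format (not confirmed by the original post): optional
# whitespace, then digits or a Roman numeral built from i, v and x.
# The trailing \b stops words such as "vixen" from matching.
SCENE_NUM = re.compile(r"\s*(\d+|[ivx]+)\b", re.I)


def scene_number(block):
    """Return the scene number at the start of a block, or None."""
    found = SCENE_NUM.match(block)
    return found.group(1) if found else None


# Hypothetical sample text; the first piece of the split precedes the
# first 'Scene' and is skipped.
text = "Act I Scene I. A street. Scene 2. A room. Scene iv. A wood."
for block in text.split("Scene")[1:]:
    print(scene_number(block), repr(block.strip()))
```

Note that match() returns None when nothing matches, so the result has
to be tested before calling group() on it, as the helper does.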