[Tutor] Regular expression re.search() object . Please help

Alan Gauld alan.gauld at freenet.co.uk
Fri Jan 14 01:50:14 CET 2005


> I have looked into many books including my favs(
> Learning Python and Alan Gauld's Learn to program using

Yes, this is pushing regex a bit further than I show in my book.

> What I want to extract:
> I want to extract 164:623:
> Which always comes after _at: and ends with ;

You should be able to use the group() method to extract 
the matching string out of the match object.
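
As a minimal sketch of what group() gives you: the header line below is
invented for illustration (I haven't seen your real data), and the pattern
simply follows your description of "after _at: and ends with ;".

```python
import re

# Hypothetical header line, made up for this example; real data may differ.
line = ">probe:1000_at:164:623; Probe_Position=6649; Antisense;"

# Anchor on the literal "_at:" and capture digits:digits up to the ";".
match = re.search("_at:([0-9]+:[0-9]+);", line)

if match:
    whole = match.group()     # everything the pattern matched
    coords = match.group(1)   # only the part inside the parentheses
else:
    whole = coords = None     # search() returns None when nothing matches

print(whole)    # _at:164:623;
print(coords)   # 164:623
```

group() with no argument returns the whole match; group(1) returns just
the first parenthesised sub-pattern, which is usually the bit you are after.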

> 2. The second pattern/number I want to extract is
> 6649:
> This always comes after position=.
> 
> How I want to put to desired[]:
> 
> >>> desired
> ['>164:623|6649', 'TCATGGCTGACAACCCATCTTGGGA']
> 
> I write a pattern:
> 
> 
> pat = '[0-9]*[:][0-9]*'
> pat1 = '[_Position][=][0-9]*'
> 
> >>> for line in seq:
> pat = '[0-9]*[:][0-9]*'
> pat1 = '[_Position][=][0-9]*'

pat1 starts with [_Position], which is a character class: it will match 
any *one* of the characters in _Position. Is that really what you want?

I suspect it's:

'_Position=[0-9]*'

Which is the fixed string followed by any number (including zero) 
of digits.
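
To see the difference between the two patterns, here is a small comparison.
The sample line is an assumption on my part, not your actual data, and the
third pattern (with parentheses) is my own addition to pull out only the
digits.

```python
import re

# Hypothetical header line, made up for this example.
line = ">probe:1000_at:164:623; Probe_Position=6649; Antisense;"

# Character class: ONE character out of _, P, o, s, i, t, n, then "=".
loose = re.search("[_Position][=][0-9]*", line)

# Fixed string "_Position=" followed by zero or more digits.
fixed = re.search("_Position=[0-9]*", line)

# Same fixed string, but capture only the digits (one or more).
digits = re.search("_Position=([0-9]+)", line)

print(loose.group())     # n=6649
print(fixed.group())     # _Position=6649
print(digits.group(1))   # 6649
```

The character-class version still finds the number here, but only because
the character just before "=" happens to be one of the letters in the class.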

> print (re.search(pat,line) and re.search(pat1,line))

This asks print to display the value of the whole "and" expression. 
If the first search fails, that value is the failed result (None); if 
it succeeds, the value is the result of the second search. Check 
the section on Functional Programming in my tutor to see why.
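
A short sketch of that behaviour, using invented sample lines and slightly
tightened patterns (both are my assumptions, not your data):

```python
import re

pat = "[0-9]+:[0-9]+"
pat1 = "_Position=[0-9]+"

# Hypothetical lines, made up for this example.
good = ">probe:1000_at:164:623; Probe_Position=6649;"
no_position = ">probe:1000_at:164:623;"
no_coords = "Probe_Position=6649;"

# "a and b" evaluates to a if a is false (e.g. None), otherwise to b.
both = re.search(pat, good) and re.search(pat1, good)
second_missing = re.search(pat, no_position) and re.search(pat1, no_position)
first_missing = re.search(pat, no_coords) and re.search(pat1, no_coords)

print(both.group())      # _Position=6649 (the first match is thrown away)
print(second_missing)    # None
print(first_missing)     # None
```

So even when both searches succeed you only ever get the second match
object back, and a None tells you nothing about which search failed.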

> <_sre.SRE_Match object at 0x163CAF00>
> None

Looks like your searches worked, but of course you don't have 
the match objects stored so you can't use them for anything.
But I'm suspicious of where the None is coming from...
Again, without any indentation showing it's not totally clear 
what your code looks like.

> What kind of operations can I do on this to get those
> two matches: 
> 164:623 and 6649. 

I think that if you keep the match objects you can use group() 
to extract the text that matched, which should be close to 
what you want.
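
Something along these lines might build your desired list. It is only a
sketch: the header line is invented, and I am assuming the sequence text
is available separately from the header.

```python
import re

# Hypothetical header line, made up for this example; the sequence
# string is the one from your desired output.
header = ">probe:1000_at:164:623; Probe_Position=6649; Antisense;"
sequence = "TCATGGCTGACAACCCATCTTGGGA"

# Store each match object in its own name so both can be used later.
coords = re.search("_at:([0-9]+:[0-9]+);", header)
position = re.search("_Position=([0-9]+)", header)

desired = []
if coords and position:   # only build the entry if both searches matched
    desired.append(">%s|%s" % (coords.group(1), position.group(1)))
    desired.append(sequence)

print(desired)   # ['>164:623|6649', 'TCATGGCTGACAACCCATCTTGGGA']
```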

> In Alan Gauld's book, most of the explanation stopped
> at 
> <_sre.SRE_Match object at 0x163CAF00> this level.

Yep, personally I use regex to find the line then extract 
the data I need from the line "manually". Messing around with 
match objects is something I try to avoid and so left it 
out of the tutor.

:-)
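
For comparison, the "manual" approach could look roughly like this, again
on an invented line. The regex only decides whether the line is interesting;
plain string splitting does the extraction.

```python
import re

# Hypothetical header line, made up for this example.
line = ">probe:1000_at:164:623; Probe_Position=6649; Antisense;"

coords = position = None

# Use the regex just to check that both markers are present, in order.
if re.search("_at:.*_Position=", line):
    # Take the text after "_at:" and cut it at the next ";".
    coords = line.split("_at:", 1)[1].split(";", 1)[0]
    # Same idea for the text after "_Position=".
    position = line.split("_Position=", 1)[1].split(";", 1)[0]

print(coords)     # 164:623
print(position)   # 6649
```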

Alan G.

