[Tutor] Regular expression re.search() object . Please help

Danny Yoo dyoo at hkn.eecs.berkeley.edu
Thu Jan 13 20:54:16 CET 2005



On Thu, 13 Jan 2005, kumar s wrote:

> My list looks like this: List name = probe_pairs
> Name=AFFX-BioB-5_at
> Cell1=96	369	N	control	AFFX-BioB-5_at
> Cell2=96	370	N	control	AFFX-BioB-5_at
> Cell3=441	3	N	control	AFFX-BioB-5_at
> Cell4=441	4	N	control	AFFX-BioB-5_at
> Name=223473_at
> Cell1=307	87	N	control	223473_at
> Cell2=307	88	N	control	223473_at
> Cell3=367	84	N	control	223473_at
>
> My Script:
> >>> name1 = '[N][a][m][e][=]'


Hi Kumar,

The regular expression above can be simplified to:

    'Name='

The character-class operator that you're using, with the brackets '[]', is
useful when we want to allow different kinds of characters at one position.
Since the code appears to be looking for one particular string, the regex
can be greatly simplified by not using character classes.
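
As a quick check, here is a small sketch showing that the two patterns
behave the same way on lines like yours.  The sample lines are copied from
your message; the variable names are only for illustration, and I've
written print with parentheses so it runs on any Python version.

```python
import re

# Sample lines taken from the probe_pairs data quoted above.
sample = [
    "Name=AFFX-BioB-5_at",
    "Cell1=96\t369\tN\tcontrol\tAFFX-BioB-5_at",
    "Name=223473_at",
]

verbose_pattern = '[N][a][m][e][=]'   # one single-character class per letter
simple_pattern = 'Name='              # the same text as a plain literal

# re.match() only looks at the start of the string; it returns a match
# object on success and None otherwise.
verbose_hits = [re.match(verbose_pattern, line) is not None for line in sample]
simple_hits = [re.match(simple_pattern, line) is not None for line in sample]

print(verbose_hits == simple_hits)
```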



> >>> for i in range(len(probe_pairs)):
> 	key = re.match(name1,probe_pairs[i])
> 	key
>
>
> <_sre.SRE_Match object at 0x00E37A68>
> <_sre.SRE_Match object at 0x00E37AD8>
> <_sre.SRE_Match object at 0x00E37A68>
> <_sre.SRE_Match object at 0x00E37AD8>
> <_sre.SRE_Match object at 0x00E37A68>
> ..................................... (cont. 10K
> lines)
>
> Here it prints a bunch of reg.match objects. However when I say group()
> it prints only one object why?


Is it possible that the edited code may have done something like this?

###
for i in range(len(probe_pairs)):
    key = re.match(name1, probe_pairs[i])
print key
###

Without seeing what the literal code looks like, we're doomed to use our
imaginations and make up a reasonable story.  *grin*




> >>> for i in range(len(probe_pairs)):
> 	key = re.match(name1,probe_pairs[i])
> 	key.group()


Ok, I think I see what you're trying to do.  You're using the interactive
interpreter, which tries to be nice when we use it as a calculator.  The
interactive interpreter has a special feature that prints out the result
of expressions, even though we have not explicitly put in a "print"
statement.


For example, take a loop like:

###
>>> for i in range(10):
...     i, i*2, i*3
...
(0, 0, 0)
(1, 2, 3)
(2, 4, 6)
(3, 6, 9)
(4, 8, 12)
(5, 10, 15)
(6, 12, 18)
(7, 14, 21)
(8, 16, 24)
(9, 18, 27)
###

If the body of the loop contains a single expression, then Python's
interactive interpreter will try to be nice and print the value of that
expression on each iteration.  (It stays silent when the value is None.)


The automatic expression-printing feature of the interactive interpreter
is only for our convenience.  If we're not running in interactive mode,
Python will not automatically print out the values of expressions!


So in a real program, it is much better to explicitly write out a 'print'
statement to show the expression on screen, if that's what you want:

###
>>> for i in range(10):
...     print (i, i*2, i*3)
...
(0, 0, 0)
(1, 2, 3)
(2, 4, 6)
(3, 6, 9)
(4, 8, 12)
(5, 10, 15)
(6, 12, 18)
(7, 14, 21)
(8, 16, 24)
(9, 18, 27)
###
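
Applied to your loop, that might look roughly like the sketch below.  The
short probe_pairs list is a stand-in for your real data, and 'matched' is
a name I made up to collect the results.  One thing to watch for:
re.match() returns None for lines that do not start with 'Name=', so the
sketch checks for that before calling group().

```python
import re

# A few lines in the shape of the probe_pairs list from the question.
probe_pairs = [
    "Name=AFFX-BioB-5_at",
    "Cell1=96\t369\tN\tcontrol\tAFFX-BioB-5_at",
    "Cell2=96\t370\tN\tcontrol\tAFFX-BioB-5_at",
    "Name=223473_at",
]

matched = []
for line in probe_pairs:
    key = re.match('Name=', line)
    # Calling .group() on None would raise AttributeError, so only use
    # the match object when there actually was a match.
    if key is not None:
        matched.append(key.group())
        print(key.group())
```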




> After I get the reg.match object, I tried to remove
> that match object like this:
> >>> for i in range(len(probe_pairs)):
> 	key = re.match(name1,probe_pairs[i])
> 	del key
> 	print probe_pairs[i]


The match object has a separate existence from the string
'probe_pairs[i]'.  Your code does drop the 'match' object, but this has no
effect on the string stored in probe_pairs[i].

The code above, once we remove those two lines that play with the 'key',
reduces to:

###
for i in range(len(probe_pairs)):
    print probe_pairs[i]
###

which is why you're not seeing any particular change in the output.
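
To make that concrete, here is a tiny sketch using a single line from your
data.  The names 'line', 'key' and 'matched_text' are just for
illustration.

```python
import re

line = "Name=AFFX-BioB-5_at"
key = re.match('Name=', line)

# The match object only describes where the pattern matched.
matched_text = key.group()

# Deleting the name 'key' discards our reference to the match object...
del key

# ...but the original string is untouched.
print(matched_text)
print(line)
```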

I'm not sure you really need regular expressions here.
Would the following work for you?

###
for probe_pair in probe_pairs:
    if not probe_pair.startswith('Name='):
        print probe_pair
###






> Name=AFFX-BioB-5_at
> Cell1=96	369	N	control	AFFX-BioB-5_at
> Cell2=96	370	N	control	AFFX-BioB-5_at
> Cell3=441	3	N	control	AFFX-BioB-5_at
>
> Result shows that that Name** line has not been deleted.


What do you want to see?  Do you want to see:

###
AFFX-BioB-5_at
Cell1=96	369	N	control	AFFX-BioB-5_at
Cell2=96	370	N	control	AFFX-BioB-5_at
Cell3=441	3	N	control	AFFX-BioB-5_at
###


or do you want to see this instead?

###
Cell1=96	369	N	control	AFFX-BioB-5_at
Cell2=96	370	N	control	AFFX-BioB-5_at
Cell3=441	3	N	control	AFFX-BioB-5_at
###
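
Either result is only a short loop away.  Here is a rough sketch of both,
assuming probe_pairs is a list of strings like the ones you posted.  The
names 'prefix', 'without_prefix' and 'without_name_lines' are made up for
this example.

```python
# Stand-in for the real probe_pairs list.
probe_pairs = [
    "Name=AFFX-BioB-5_at",
    "Cell1=96\t369\tN\tcontrol\tAFFX-BioB-5_at",
    "Cell2=96\t370\tN\tcontrol\tAFFX-BioB-5_at",
    "Cell3=441\t3\tN\tcontrol\tAFFX-BioB-5_at",
]

prefix = 'Name='

# First possibility: keep every line, but strip the 'Name=' prefix.
without_prefix = []
for probe_pair in probe_pairs:
    if probe_pair.startswith(prefix):
        probe_pair = probe_pair[len(prefix):]
    without_prefix.append(probe_pair)

# Second possibility: drop the 'Name=' lines entirely.
without_name_lines = [p for p in probe_pairs if not p.startswith(prefix)]

for probe_pair in without_prefix:
    print(probe_pair)
```

Both versions build a new list and leave probe_pairs itself alone.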


Good luck to you!


