[Tutor] Regular expression on python
Alan Gauld
alan.gauld at btinternet.com
Mon Apr 13 20:42:03 CEST 2015
On 13/04/15 13:29, jarod_v6 at libero.it wrote:
> Input Read Pairs: 2127436 Both Surviving: 1795091 (84.38%) Forward Only Surviving: 17315 (0.81%) Reverse Only Surviving: 6413 (0.30%) Dropped: 308617 (14.51%)
It's not clear where the tabs are in this line.
But if they are after the numbers, like so:
Input Read Pairs: 2127436 \t
Both Surviving: 1795091 (84.38%) \t
Forward Only Surviving: 17315 (0.81%) \t
Reverse Only Surviving: 6413 (0.30%) \t
Dropped: 308617 (14.51%)
Then you may not need to use regular expressions.
Simply split the line by tab, then split each field by ':'.
And if the 'number' part contains parens, split again by space
to separate the count from the percentage.
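
A minimal sketch of that split-only approach, assuming the tabs really do sit between the fields as laid out above. The function name parse_trim_line and the sample string are made up for illustration, and the code is written so that it runs under Python 3:

```python
def parse_trim_line(line):
    """Split a tab-separated summary line into a {label: count} dict."""
    counts = {}
    for field in line.strip().split("\t"):
        if not field.strip():
            continue  # ignore empty fields, e.g. from a doubled tab
        # "Both Surviving: 1795091 (84.38%) " -> label, ":", rest
        label, _, value = field.partition(":")
        # The first space-separated token is the count; any
        # "(84.38%)" percentage that follows is left out.
        counts[label.strip()] = int(value.split()[0])
    return counts

# Assumed layout: one tab after each field except the last.
sample = ("Input Read Pairs: 2127436 \t"
          "Both Surviving: 1795091 (84.38%) \t"
          "Forward Only Surviving: 17315 (0.81%) \t"
          "Reverse Only Surviving: 6413 (0.30%) \t"
          "Dropped: 308617 (14.51%)")

print(parse_trim_line(sample))
```

If the real file separates the fields with spaces rather than tabs, the first split will not work and this sketch would need a different separator.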
> with open("255.trim.log","r") as p:
>     for i in p:
>         lines= i.strip("\t")
'lines' is a bad name here since it's only a single line. In fact I'd
lose the 'i' variable and just use

    for line in p:
>         if lines.startswith("Input"):
>             tp = lines.split("\t")
>             print re.findall("Input\d",str(tp))
In your data "Input" is not followed by a digit (it is followed by
" Read Pairs: "), so "Input\d" finds nothing. You need a more powerful
pattern, which is why I recommend trying to solve it as far as possible
without using regex.
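
If you do want a regex, here is a rough sketch of the difference. The shortened sample line and both replacement patterns are my own assumptions about what you are after, not the only way to write them:

```python
import re

# Shortened sample, assuming a tab between the fields.
line = "Input Read Pairs: 2127436 \tBoth Surviving: 1795091 (84.38%)"

# "Input\d" demands a digit immediately after "Input",
# but in the data a space comes next, so nothing is found.
no_match = re.findall(r"Input\d", line)

# Spell out the rest of the label and the colon, then capture the digits.
first = re.findall(r"Input Read Pairs:\s*(\d+)", line)

# More general: any "label: digits" pair on the line.
pairs = re.findall(r"([A-Za-z ]+):\s*(\d+)", line)

print(no_match)
print(first)
print(pairs)
```

Note the raw strings (r"..."), which stop Python from interpreting the backslash before the regex engine sees it.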
> So I started to find ":" from the row:
> with open("255.trim.log","r") as p:
>     for i in p:
>         lines= i.strip("\t")
>         if lines.startswith("Input"):
>             tp = lines.split("\t")
>             print re.findall(":",str(tp[0]))
Does finding the colons really help much?
Or at least, does it help any more than splitting by colon would?
> And I'm able to find, but when I try to take the number using \d not work.
> Someone can explain why?
Because your pattern doesn't match the string.
HTH
--
Alan G
Author of the Learn to Program web site
http://www.alan-g.me.uk/
http://www.amazon.com/author/alan_gauld
Follow my photo-blog on Flickr at:
http://www.flickr.com/photos/alangauldphotos