[Tutor] regexp: a bit lost

Alex Hall mehgcap at gmail.com
Fri Oct 1 04:45:38 CEST 2010


Hi, once again...
I have a regexp that I am trying to use to make sure a line matches the format:
[c*]n [c*]n n
where c* is an optional prefix of zero or more non-numeric characters and n
is a number of one or more digits. The amount of whitespace between the
fields should not matter. These should pass:
v1 v2   5
2 someword7 3

while these should not:
word 2  3
1 2

Here is my test:
s=re.search(r"[\d+\s+\d+\s+\d]", l)
if s: #do stuff

However:
1. this seems to pass with *any* string, even when l is a single
character. This causes many problems and cannot happen since I have to
ignore any strings not formatted as described above. So if I have
for a in b:
  s=re.search(r"[\d+\s+\d+\s+\d]", a)
  if s: c.append(a)

then c will have every string in b, even if the string being examined
looks nothing like the pattern I am after.

2. How would I make my regexp able to match 0-n characters? I know to
use \D*, but I am not sure about brackets or parentheses for putting
the \D* into the parent expression (the \d\s one).

3. Once I get the above working, I will need a way of pulling the
characters out of the string and sticking them somewhere. For example,
if the string were
v9 v10 15
I would want an array:
n=[9, 10, 15]
but the array would be created from a regexp. This has to be possible,
but none of the manuals or tutorials on regexp say just how this is
done. Mentions are made of groups, but nothing explicit (to me at
least).
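
For reference, here is a minimal sketch of how the three points above might
fit together. It is not a definitive answer. The names LOOSE, LINE and
parse_line are hypothetical, and the sketch assumes that a c* prefix never
contains whitespace, which is why it uses [^\d\s]* instead of \D*. The square
brackets in the pattern above form a character class, so that pattern matches
any single digit, whitespace character or literal "+". That would explain
almost every line passing, for example any line that still carries its
trailing newline.

```python
import re

# The pattern from the question. Inside [...] the contents form a character
# class, so this matches ONE character: a digit, whitespace, or a literal "+".
LOOSE = re.compile(r"[\d+\s+\d+\s+\d]")

# Hypothetical replacement pattern (an assumption, not a given):
#   [^\d\s]*  optional prefix with no digits and no whitespace (the c* part)
#   (\d+)     one or more digits, captured as a group
#   \s+       one or more whitespace characters between fields
# The ^ and $ anchors make the whole line take part in the match.
LINE = re.compile(r"^\s*[^\d\s]*(\d+)\s+[^\d\s]*(\d+)\s+(\d+)\s*$")


def parse_line(line):
    """Return the three captured numbers as a list of ints, or None."""
    m = LINE.match(line)
    if m is None:
        return None
    # m.groups() holds the text captured by each pair of parentheses.
    return [int(g) for g in m.groups()]


lines = ["v1 v2   5", "2 someword7 3", "word 2  3", "1 2", "v9 v10 15"]
for text in lines:
    print(repr(text), "->", parse_line(text))
```

With this sketch, parse_line("v9 v10 15") gives [9, 10, 15], and the two
lines that should be rejected give None. The parentheses are what create the
groups: each (\d+) captures one number, and int() turns the captured text
into an integer.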

-- 
Have a great day,
Alex (msg sent from GMail website)
mehgcap at gmail.com; http://www.facebook.com/mehgcap

