[Tutor] Question on re.findall usage

Mitya Sirenef msirenef at lightbird.net
Mon Jan 28 21:41:51 CET 2013


On 01/28/2013 03:19 PM, Dave Wilder wrote:
>
 > On 28 January 2013 2:44, Oscar Benjamin [mailto:oscar.j.benjamin at gmail.com] wrote:
 >
 > Please post in plain text (not html) as otherwise the code gets screwed up.
 > ...
 >
 > Some people like to use regexes for everything. I prefer to try string
 > methods first as I find them easier to understand.
 > Here's my attempt:
 > >>> junk_list = 'tmsh list net interface 1.3 media-ca \rpabilities\r\nnet interface 1.3 {\r\n media-capabilities {\r\n none\r\n auto\r\n 40000SR4-FD\r\n 10T-HD\r\n 100TX-FD\r\n 100TX-HD\r\n 1000T-FD\r\n 40000LR4-FD\r\n 1000T-HD\r\n }\r\n}\r\n'
 > >>> junk_list = [s.strip() for s in junk_list.splitlines()]
 > >>> junk_list = [s for s in junk_list if s == 'auto' or s[:2] in ('10', '40')]
 > >>> junk_list
 > ['auto', '40000SR4-FD', '10T-HD', '100TX-FD', '100TX-HD', '1000T-FD', '40000LR4-FD', '1000T-HD']
 >
 > Does that do what you want?
 >
 >
 > Oscar
 >
 >
 > *****************************
 >
 > Got it Oscar. Thank you for your respectful corrections and your solution.
 > I used "Rich Text" which is what I thought was recommended by the list
 > gurus at one point. Plain Text it is then.
 >
 > Your response definitely does the trick and I can use that as a base for
 > the future.
 >
 > As per Joel's comment that it is a variation of questions I asked in the
 > past, right you are. I had to put this away for a while and am picking it
 > up again now.
 > I will get string manipulation / RegEx educated.
 >
 > Thank You,
 >
 > Dave

I would like to emphasize that regex is an entirely wrong approach for
this task: it's brittle, hard to read, and hard to debug or update if
the format of the text changes. The first step should be to simplify
the task; in this case, split the string into lines and strip each line.
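
As a minimal sketch of that first step (the raw string here is a
shortened, hypothetical stand-in for the real command output, not the
full text from the thread):

```python
# Hypothetical, shortened sample of the command output.
raw = 'net interface 1.3 {\r\n media-capabilities {\r\n none\r\n auto\r\n 10T-HD\r\n }\r\n}\r\n'

# splitlines() breaks on \r\n (and on a lone \r or \n);
# strip() then removes the leading indentation from each line.
lines = [s.strip() for s in raw.splitlines()]

print(lines)
# ['net interface 1.3 {', 'media-capabilities {', 'none', 'auto', '10T-HD', '}', '}']
```

Once every line is a clean, separate string, the filtering step only has
to look at one short line at a time.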

List comprehensions, as shown by Oscar, are the best approach, but if
you don't feel comfortable with them, you can use a loop instead:

lst = []
for line in junk_list:
     if line == "auto"          : lst.append(line)
     elif line.startswith("10") : lst.append(line)
     elif line.startswith("40") : lst.append(line)
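
Since all three branches do the same thing, they can probably be merged
into one test. A rough sketch, assuming junk_list already holds the
stripped lines (the sample list below is made up for illustration);
str.startswith accepts a tuple of prefixes:

```python
# Hypothetical sample: lines that are already split and stripped.
junk_list = ['none', 'auto', '40000SR4-FD', '10T-HD', '1000T-FD', '}']

lst = []
for line in junk_list:
    # startswith() returns True if the line begins with any prefix in the tuple.
    if line == 'auto' or line.startswith(('10', '40')):
        lst.append(line)

print(lst)
# ['auto', '40000SR4-FD', '10T-HD', '1000T-FD']
```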

You could even use a regex as part of the loop:

import re

lst = []
for line in junk_list:
     if re.match("^(auto|10|40)", line):
         lst.append(line)

It's still much better than using a humongous regex.
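
One caveat worth checking: the regex in the loop is a prefix match, so
it is slightly looser than the line == "auto" test in the string-method
version. A small sketch of the difference (the 'autoneg' entry is
invented purely to show it, it does not come from the original output):

```python
import re

# Hypothetical sample; 'autoneg' is made up to expose the difference.
lines = ['none', 'auto', 'autoneg', '10T-HD', '40000SR4-FD', '}']

# Prefix match, as in the loop above: also accepts 'autoneg'.
prefix = [s for s in lines if re.match(r'(auto|10|40)', s)]

# Adding $ after 'auto' requires the whole line to be exactly 'auto'.
strict = [s for s in lines if re.match(r'(auto$|10|40)', s)]

print(prefix)  # ['auto', 'autoneg', '10T-HD', '40000SR4-FD']
print(strict)  # ['auto', '10T-HD', '40000SR4-FD']
```

Note that re.match already anchors at the start of the string, so the
leading ^ in the pattern is harmless but not required.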

  - m


-- 
Lark's Tongue Guide to Python: http://lightbird.net/larks/

When a friend succeeds, I die a little.  Gore Vidal


