[Tutor] Question on re.findall usage

Joel Goldstick joel.goldstick at gmail.com
Mon Jan 28 21:02:50 CET 2013


On Mon, Jan 28, 2013 at 2:15 PM, Dave Wilder <D.Wilder at f5.com> wrote:

>  Hello,
>
> I am trying to use re.findall to parse the string below and then create a
> list from the results.
> junk_list = 'tmsh list net interface 1.3 media-ca \rpabilities\r\nnet
> interface 1.3 {\r\n    media-capabilities {\r\n        none\r\n
> auto\r\n     40000SR4-FD\r\n  10T-HD\r\n        100TX-FD\r\n
> 100TX-HD\r\n        1000T-FD\r\n        40000LR4-FD\r\n     1000T-HD\r\n
> }\r\n}\r\n'
>

This looks like a variation on the questions you asked over the last couple
of months.  Printing junk_list I get this:

>>> print junk_list
pabilitiesnet interface 1.3 media-ca
net interface 1.3 {
    media-capabilities {
        none
        auto
     40000SR4-FD
  10T-HD
        100TX-FD
        100TX-HD
        1000T-FD
        40000LR4-FD
     1000T-HD
    }
}

How do you get junk_list?  Read from a file?  Is there more in the file
besides what is in junk_list?  Do you have this exact file format every
time?

You might do better to use readline() instead of read(), and strip() each
line so that the \r\n line endings and the uneven indentation go away.  I'm
guessing, but it looks like you can toss every line until you get past
media-capabilities, and toss every line that contains }.  But maybe I am
reading more into the data format than is appropriate.

If my guesses are correct, you don't need regex stuff at all, because each
line (that you don't toss) contains something you want, and you can build
your list directly from those lines.
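
Here is a rough sketch of that idea, assuming the data always has the shape
shown above.  The names sample and media_capabilities are made up for
illustration, and the text is split on '\n' in place of reading a file line by
line, since I don't know where your data comes from.

```python
# Sample text modeled on the junk_list string quoted above (an assumption:
# the real data may differ).
sample = (
    'tmsh list net interface 1.3 media-ca \rpabilities\r\n'
    'net interface 1.3 {\r\n'
    '    media-capabilities {\r\n'
    '        none\r\n'
    '        auto\r\n'
    '     40000SR4-FD\r\n'
    '  10T-HD\r\n'
    '        100TX-FD\r\n'
    '        100TX-HD\r\n'
    '        1000T-FD\r\n'
    '        40000LR4-FD\r\n'
    '     1000T-HD\r\n'
    '    }\r\n'
    '}\r\n'
)


def media_capabilities(text):
    """Collect the entries listed inside the media-capabilities block."""
    items = []
    inside = False
    for line in text.split('\n'):
        line = line.strip()  # drops the trailing \r and the indentation
        if not inside:
            # Toss every line up to and including the opening line.
            if line.startswith('media-capabilities'):
                inside = True
            continue
        if '}' in line:
            # The first closing brace ends the block.
            break
        if line:
            items.append(line)
    return items


print(media_capabilities(sample))
```

Note that this keeps 'none' as well.  If you don't want it, filter it out
afterwards, e.g. [m for m in media_capabilities(sample) if m != 'none'].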




> What I am doing now is obviously quite ugly, and I have not yet been able to
> make it work the way I want in a more efficient and modular way.
> I did some research on re.findall but am still confused as to how to do
> character repetition searches, which  I guess is what I need to do here.
> >> junk_list =
> re.findall(r'(auto|[1|4]0+[A-Z]-[HF]D|[1|4]0+[A-Z][A-Z]-[HF]D|[1|4]0+[A-Z][A-Z][0-9])',
> junk_list)
> >> junk_list
> ['auto', '40000SR4', '10T-HD', '100TX-FD', '100TX-HD', '40000LR4',
> '1000T-FD', '1000T-HD']
> >>>
>
> Basically, all I need to search on is:
>
>    - auto
>    - anything that starts w/ ‘1’ or ‘4’ and then any number of subsequent
>    zeroes   e.g. 10T-HD, 40000LR4-FD, 100TX-FD
>
>
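
If you do want to stay with a regex, two things in the pattern above are worth
knowing.  Inside a character class, | is a literal character, so [1|4] also
matches '|'; [14] is enough.  And the last alternative has no -[HF]D part,
which is why '40000SR4' lost its suffix.  A minimal sketch of a single pattern
for the two cases you describe follows.  PATTERN and sample are illustrative
names, and the sample is a shortened stand-in for your data.

```python
import re

# Shortened stand-in for junk_list (an assumption, not the full original data).
sample = (
    'net interface 1.3 {\r\n'
    '    media-capabilities {\r\n'
    '        none\r\n'
    '        auto\r\n'
    '     40000SR4-FD\r\n'
    '  10T-HD\r\n'
    '        100TX-FD\r\n'
    '    }\r\n'
    '}\r\n'
)

# [14]      one leading '1' or '4' (no '|' needed inside a character class)
# 0+        one or more zeroes
# [A-Z]+    one or more capital letters, e.g. T, TX, SR
# [0-9]*    optional trailing digits, e.g. the 4 in SR4
# -[HF]D    the duplex suffix, -HD or -FD
PATTERN = re.compile(r'\b(?:auto|[14]0+[A-Z]+[0-9]*-[HF]D)\b')

matches = PATTERN.findall(sample)
print(matches)  # ['auto', '40000SR4-FD', '10T-HD', '100TX-FD']
```

The + and * after a character class are the "character repetition" you were
asking about: + means one or more of the preceding item, * means zero or more.
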
> My environment:
> [root at f5ite ~/tests]$ uname -a
> Linux VM-QA-ITE-03 2.6.32-220.17.1.el6.i686 #1 SMP Tue May 15 22:09:39 BST
> 2012 i686 i686 i386 GNU/Linux
> [root at f5ite ~/tests]$
> [root at f5ite ~/tests]$ /usr/bin/python
> Python 2.7 (r27:82500, Jul  6 2010, 02:54:50)
> [GCC 4.1.2 20080704 (Red Hat 4.1.2-48)] on linux2
> Type "help", "copyright", "credits" or "license" for more information.
>
> Any ideas?
>
> Thanks,
>
> Dave


-- 
Joel Goldstick
http://joelgoldstick.com