[Tutor] Processing Linux command line output

Thu Apr 25 01:11:05 CEST 2013

Hi Gareth,

On Thu, Apr 25, 2013 at 8:03 AM, Gareth Allen <gallen at openworld.co.za> wrote:
> Hi all,
>
> I'm trying to get the output of a command and split it into a list that I
> can process.  What is the best way to go about doing this? In bash I would
> use tools like grep, sed awk etc.
>
> Here's an example:
>
> ifconfig
>
> lo        Link encap:Local Loopback
>           inet addr:127.0.0.1  Mask:255.0.0.0
>           inet6 addr: ::1/128 Scope:Host
>           UP LOOPBACK RUNNING  MTU:16436  Metric:1
>           RX packets:84253 errors:0 dropped:0 overruns:0 frame:0
>           TX packets:84253 errors:0 dropped:0 overruns:0 carrier:0
>           collisions:0 txqueuelen:0
>           RX bytes:11763964 (11.2 MiB)  TX bytes:11763964 (11.2 MiB)
>
> I would like to end up with something like this in a file:
>
> <unix timestamp>,lo,rx_errors=0,rx_dropped=0,rx_overruns=0,rx_frame=0

I think there are two parts to your question:

1. How to execute a command and get its output?
2. How to process that output with text-processing tools to extract
the information you want?

Let me start with (1). The 'subprocess' module has what you need
here. For example, the subprocess.check_output() function [1]
executes your command and returns its output as a byte string.
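
As a rough sketch of (1), here is a small wrapper around
subprocess.check_output(). The name run_command() is my own invention
for illustration, and I am assuming the command prints UTF-8 text. The
decode step matters on Python 3, where check_output() returns bytes
rather than a plain string:

```python
import subprocess
import sys

def run_command(args):
    """Run a command (given as a list of arguments) and return its output as text."""
    raw = subprocess.check_output(args)
    # Assumption: the command prints UTF-8; undecodable bytes are replaced
    return raw.decode('utf-8', 'replace')

# The running Python interpreter stands in for a command that exists everywhere
text = run_command([sys.executable, '-c', 'print("hello")'])
print(text.strip())
```

Passing the command as a list, rather than as one string, avoids going
through the shell, so you do not have to worry about quoting.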

Coming to your second question: once you have the string, you can
use string operations on it to extract the information you want.
Indexing, slicing, stripping and splitting all become useful
here.

Let's see a simple example. The 'who -b' command prints the system boot time:

$ who -b
         system boot  2013-04-13 09:00

Now, let's see how we can extract the date + time only from the above output.

First, we import the module, execute the command and store the
result in 'output':

>>> import subprocess
>>> output = subprocess.check_output(['who','-b'])

The result you get will be something like this:

>>> output
'         system boot  2013-04-13 09:00\n'

Note the leading whitespace and the trailing newline. We remove both with strip():

>>> output = output.strip()
>>> output
'system boot  2013-04-13 09:00'

Now we need to extract the date and time, leaving out the
'system boot' text. Let's 'tokenize' it:

>>> output.split()
['system', 'boot', '2013-04-13', '09:00']

So now you have a list whose 3rd and 4th items (indices 2 and 3) are
the date and time respectively:

>>> output.split()[2:]
['2013-04-13', '09:00']

You can combine all the operations into:

>>> subprocess.check_output(['who','-b']).strip().split()[2:]
['2013-04-13', '09:00']
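
If you want to reuse those steps, one way is to wrap them in a
function that takes the command output as its argument. This is only a
sketch: boot_time_fields() is a name I made up, and it assumes the
output always looks like the 'who -b' line shown above. It also
decodes bytes first, since that is what check_output() returns on
Python 3:

```python
def boot_time_fields(output):
    """Return [date, time] from the text that 'who -b' prints."""
    if isinstance(output, bytes):
        # check_output() gives bytes on Python 3; decode before splitting
        output = output.decode('utf-8', 'replace')
    fields = output.strip().split()
    if len(fields) < 4:
        # Assumption: 'system boot <date> <time>'; anything shorter is unexpected
        raise ValueError('unexpected output: %r' % output)
    return fields[2:4]

# Sample text copied from the 'who -b' output above
sample = '         system boot  2013-04-13 09:00\n'
print(' '.join(boot_time_fields(sample)))
```

Working on a string argument rather than calling the command inside
the function makes it easy to try out on pasted output.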

That was a very simple example, but much of what you end up doing
will involve operations like the ones above. You can also use regular
expressions from the 're' module to pick out information more
flexibly [2].
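
To give an idea of the 're' approach applied to your ifconfig example,
here is a hedged sketch. The pattern only matches the RX line in the
format you pasted; ifconfig output differs between versions, so treat
it as an assumption. rx_summary() and the fixed timestamp are made up
for illustration, and the sample text is a shortened copy of your
output:

```python
import re
import time

# Shortened copy of the ifconfig output from the question
SAMPLE = """\
lo        Link encap:Local Loopback
          inet addr:127.0.0.1  Mask:255.0.0.0
          RX packets:84253 errors:0 dropped:0 overruns:0 frame:0
          TX packets:84253 errors:0 dropped:0 overruns:0 carrier:0
"""

# Assumption: the RX line looks exactly like the one in the sample
RX_LINE = re.compile(
    r'RX packets:\d+ errors:(\d+) dropped:(\d+) overruns:(\d+) frame:(\d+)')

def rx_summary(text, timestamp=None):
    """Build '<timestamp>,<iface>,rx_errors=...' from one interface block."""
    if timestamp is None:
        timestamp = int(time.time())
    iface = text.split()[0]  # the first word of the block is the interface name
    match = RX_LINE.search(text)
    if match is None:
        raise ValueError('no RX line found')
    errors, dropped, overruns, frame = match.groups()
    return '%d,%s,rx_errors=%s,rx_dropped=%s,rx_overruns=%s,rx_frame=%s' % (
        timestamp, iface, errors, dropped, overruns, frame)

# Arbitrary fixed timestamp so the result is repeatable
print(rx_summary(SAMPLE, timestamp=1366844465))
# prints: 1366844465,lo,rx_errors=0,rx_dropped=0,rx_overruns=0,rx_frame=0
```

With several interfaces you would first split the full output into one
block per interface (they are separated by blank lines) and call the
function on each block.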

If you are interested in some related examples, I would like to point
you to a couple of Python modules I have been putting together [3]. In
readproc.py, I extract information from /proc, and in pylinux.py
you will see a number of examples that extract information from the
output of subprocess.check_output() calls.

For an article of mine, I also put together a list of resources for
anyone interested in tasks like yours [4].

[1] http://docs.python.org/2/library/subprocess.html#subprocess.check_output
[2] http://docs.python.org/2/library/re.html
[3] https://github.com/amitsaha/pylinux/tree/master/pylinux
[4] https://gist.github.com/amitsaha/4964491

Hope that helps.

Best,
Amit.

--
http://amitsaha.github.com/