Elegant solution needed: Data manipulation

Steve Holden sholden at holdenweb.com
Wed Jan 30 20:57:14 EST 2002


"Mark" <Aristotle_00 at yahoo.com> wrote in message
news:12e3779b.0201301635.2dd3c5b4 at posting.google.com...
> I have the following setup:
>
> I have some data:
>
>     height    weight    speed
> 1     3        6          12
> 2     5        9          20
> 3     4        10         15
>
> I also have an equation:
> add( speed , mul (weight, height) )
>
> These both live in their own files.
>
> I want to read in the equation and perform the corresponding
> operations on the data.
> So, for example, in this case, I'd have a new column "new":
>     new
> 1    30
> 2    65
> 3    55
>
> (hope the math is right).
>
> I can certainly have the data set up as lists in a dictionary so that
> data["height"] will give me [3, 5, 4].  I just need an elegant way to
> parse the equation string and have it apply to the results.  It seems
> pretty easy to convert from add ( x , y) to map( add, x , y) but 1) I
> don't know python's regular expression stuff and 2) it seems there
> should be an even easier way.  I definitely don't want to go as far as
> formally parsing the equation to do this (unless python has some
> killer built in tools).
>
Fortunately, Python has some killer built-in tools ...

Do the formulae have to be in that form? Is there some reason why you aren't
using

    speed + height * weight

perhaps? If that is acceptable, you could read the data file one line at a
time, associate each value with its column name in a dictionary, and then use
eval() to evaluate the expression with that dictionary as the namespace. Here
is a simple example, where your data and expression are stored in the files
"ardata" and "arexpr".

# Read the column names from the header line of the data file.
f = open("ardata", "r")
names = f.readline().split()
print("Cols are:", ", ".join(names))

# Read the expression and compile it once so it can be evaluated per row.
e = open("arexpr", "r")
expr = e.read()
print("Expression is:", expr)
expr = compile(expr, "Expression", "eval")
e.close()

namespace = {}
while True:
    data = f.readline().split()
    if not data:
        break
    # data[0] is the row label; the remaining fields line up with names.
    for n, v in zip(names, data[1:]):
        namespace[n] = eval(v)  # eval() to convert from string
    result = eval(expr, namespace)
    print("Data", data, "result", result)

f.close()

The results from running the program are as follows:

D:\Steve\Projects\Python>python artest.py
Cols are: height, weight, speed
Expression is: speed + height * weight

Data ['1', '3', '6', '12'] result 30
Data ['2', '5', '9', '20'] result 65
Data ['3', '4', '10', '15'] result 55

Hopefully this will give you enough of a starting point. You can probably
work out how to write out the results in an appropriate format.
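
For what it's worth, here is one rough way the output step might look. It is
only a sketch: the helper names add_column and format_table, the column name
"new" and the fixed column width are all my own assumptions, and it assumes
every data value is an integer. The sample lines are copied from your table.

```python
def add_column(lines, expression, new_name="new"):
    """Return the table as lists of strings, with one computed column added."""
    code = compile(expression.strip(), "Expression", "eval")
    names = lines[0].split()
    table = [["", *names, new_name]]  # empty cell sits above the row labels
    for line in lines[1:]:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        # fields[0] is the row label; the rest are assumed to be integers.
        namespace = dict(zip(names, map(int, fields[1:])))
        result = eval(code, namespace)
        table.append([*fields, str(result)])
    return table


def format_table(table, width=8):
    """Left-justify every cell to a fixed width (an arbitrary choice)."""
    return ["".join(cell.ljust(width) for cell in row).rstrip() for row in table]


# Sample data copied from the question.
sample = [
    "    height    weight    speed",
    "1     3        6          12",
    "2     5        9          20",
    "3     4        10         15",
]

output = format_table(add_column(sample, "speed + height * weight"))
print("\n".join(output))
```

Keeping the computation (add_column) apart from the layout (format_table)
means you can change the output format without touching the evaluation.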

If you really do need the expressions to be in the form you gave, look into
the operator module: it has functions corresponding to all the standard
operators.
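
As a minimal sketch of that idea: since add(...) and mul(...) are already
valid Python call syntax, you can put operator.add and operator.mul into the
namespace under those names and let eval() do the parsing. The rows below are
copied from your table; the assumption that the expression only ever uses add
and mul is mine, so extend the functions dictionary as needed.

```python
import operator

# Sample data copied from the question, one dictionary per row.
rows = [
    {"height": 3, "weight": 6, "speed": 12},
    {"height": 5, "weight": 9, "speed": 20},
    {"height": 4, "weight": 10, "speed": 15},
]

# The expression in its original prefix form, compiled once.
expr_text = "add( speed , mul (weight, height) )"
code = compile(expr_text, "Expression", "eval")

# Function names the expression may use (assumed: just add and mul).
functions = {"add": operator.add, "mul": operator.mul}

results = []
for row in rows:
    namespace = dict(functions)  # fresh copy so rows cannot interfere
    namespace.update(row)        # column names become variables
    results.append(eval(code, namespace))

print(results)
```

No regular expressions or hand-written parser are needed this way, though
bear in mind that eval() will run whatever the expression file contains.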

regards
 Steve
--
Consulting, training, speaking: http://www.holdenweb.com/
Python Web Programming: http://pydish.holdenweb.com/pwp/