Elegant solution needed: Data manipulation

Jason Orendorff jason at jorendorff.com
Sun Feb 3 16:43:01 EST 2002


Mark wrote:
> No, the only real problem with Steve's code was a need
> to have infix expressions.

It certainly doesn't *require* you to use infix.
Is infix notation bad?

Anyway, text-level transformation of code is error-prone.
But other than that the code looks just fine.  I've included
another stab at the problem below.
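
To illustrate the point about text-level transformation, here is a small hypothetical sketch (written for Python 3, and not taken from Steve's code; the names `naive_substitute`, `values`, and `formula` are made up for this example). Plain string replacement of names by values can corrupt a longer name that happens to contain a shorter one, whereas evaluating the formula in a namespace never touches the source text.

```python
# Hypothetical example: substituting values into formula text
# versus looking names up in a namespace.
values = {"h": 2, "height": 10}
formula = "h + height"

def naive_substitute(text, values):
    # Replace each name with its value using plain string replacement.
    for name, value in values.items():
        text = text.replace(name, str(value))
    return text

# "h" also matches inside "height", so the result is no longer
# the expression we meant to build.
broken = naive_substitute(formula, values)

# Evaluating with the values as the local namespace leaves the
# formula text alone and resolves each name as a whole identifier.
safe = eval(formula, {}, values)

print(repr(broken))
print(safe)
```

A real rewriter could guard against this with word-boundary matching, but then string literals, comments, and attribute names need the same care, which is where the errors tend to creep in.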

## Jason Orendorff    http://www.jorendorff.com/

------------------------------------
# formula.py - Code for applying one or more formulas to tabular data

# Grab the formulas from the "formula_code.txt" file.
# Compile them.
e = open("formula_code.txt", "r")
source_code = e.read()
e.close()
# The second argument is the filename reported in tracebacks.
my_code = compile(source_code, "formula_code.txt", "exec")

# Open the file that has all the data.
# Read the first line, which has the names on it.
f = open("formula_data.txt", "r")
names = f.readline().split()

# Make a global namespace and import a bunch of stuff into it.
my_globals = {}
exec "from math import *" in my_globals
exec "from operator import *" in my_globals

# Now perform the calculations.
results = []
for line in f.xreadlines():
    # Grab the data and put it into the locals dictionary.
    my_locals = {}
    data = line.split()
    if not data:
        continue    # skip blank lines, e.g. at the end of the file
    for i in range(len(names)):
        name = names[i]
        value = eval(data[i], my_globals, my_locals)
        my_locals[name] = value

    # Execute all the formulas.
    exec my_code in my_globals, my_locals

    # Stick the results in the results list.
    results.append(my_locals)
f.close()

# Figure out what the column names are.
first_row = results[0]
if first_row.has_key('output_columns'):
    column_names = first_row['output_columns'].split()
else:
    column_names = first_row.keys()

# Lastly, print rows of output.
print str.join('\t\t', column_names)
for row in results:
    outrow = []
    for name in column_names:
        outrow.append(str(row[name]))
    print str.join('\t\t', outrow)
------------------------------------

formula_code.txt:
------------------------------------
# formula_code.txt - formulas for formula.py

output_columns = 'height weight shoesize X Y Z'

X = add(mul(height, weight), shoesize)
Y = log(height) + log(weight) + 0.44 * log(shoesize)
# ...and in case you want to avoid infix notation...
Z = add(add(log(height), log(weight)), mul(0.44, log(shoesize)))
------------------------------------

formula_data.txt:
------------------------------------
height  weight  shoesize
1   2   3
4   8   12
9   6   3
8   7   3
6   5   4
3   2   1
1   7   8
------------------------------------
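
For anyone reading this with a newer interpreter: the listing above uses the `exec ... in ...` statement, `xreadlines()`, and `has_key()`, which are Python 2 only. A minimal sketch of the same idea for Python 3 might look like the following. It rests on some assumptions that are not in the original: the formulas and data are held in strings (`FORMULAS`, `DATA`) rather than files, the helper name `apply_formulas` is invented, the data fields are converted with `float()` instead of `eval()`, and the shared namespace is filled by copying the public names from `math` and `operator`, which only approximates `from ... import *`.

```python
import math
import operator

# Assumed stand-ins for formula_code.txt and formula_data.txt.
FORMULAS = """
X = add(mul(height, weight), shoesize)
Y = log(height) + log(weight) + 0.44 * log(shoesize)
"""

DATA = """height  weight  shoesize
1   2   3
4   8   12
"""

def apply_formulas(formula_source, data_text):
    # Compile the formulas once; they are re-executed for every row.
    code = compile(formula_source, "<formulas>", "exec")

    # Shared global namespace: public names from math, then operator
    # (roughly what the two "from ... import *" lines provide).
    shared = {}
    for module in (math, operator):
        for name in dir(module):
            if not name.startswith("_"):
                shared[name] = getattr(module, name)

    lines = data_text.splitlines()
    names = lines[0].split()
    rows = []
    for line in lines[1:]:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        # Each row gets its own local namespace, seeded with the data.
        row = {name: float(field) for name, field in zip(names, fields)}
        # Assignments made by the formulas land in the row dictionary.
        exec(code, shared, row)
        rows.append(row)
    return rows

rows = apply_formulas(FORMULAS, DATA)
for row in rows:
    print(row["height"], row["weight"], row["shoesize"], row["X"], row["Y"])
```

As with the original, `exec` runs whatever is in the formula source, so this is only appropriate when the formulas come from someone you trust.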



