Elegant solution needed: Data manipulation
Jason Orendorff
jason at jorendorff.com
Sun Feb 3 16:43:01 EST 2002
Mark wrote:
> No, the only real problem with Steve's code was a need
> to have infix expressions.
It certainly doesn't *require* you to use infix.
Is infix notation bad?
Anyway, text-level transformation of code is error-prone.
But other than that the code looks just fine. I've included
another stab at the problem below.
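
To make the point about text-level transformation concrete, here is a minimal Python 3 sketch. It assumes the transformation in question is plain string substitution of column values into the formula text; the helpers `naive_substitute` and `evaluate_in_namespace` and the sample formula are hypothetical and come from neither Steve's nor Mark's code.

```python
def naive_substitute(formula, values):
    """Hypothetical text-level approach: paste each value into the formula string."""
    for name, value in values.items():
        formula = formula.replace(name, str(value))
    return formula


def evaluate_in_namespace(formula, values):
    """Leave the formula text alone and supply the names through a dictionary."""
    return eval(formula, {}, dict(values))


values = {"size": 3, "shoesize": 12}
formula = "size + shoesize"

# "size" is also a substring of "shoesize", so the second name gets mangled.
rewritten = naive_substitute(formula, values)
print(rewritten)

# The namespace version never edits the source text, so the names stay intact.
print(evaluate_in_namespace(formula, values))
```

The substituted string is no longer a valid formula for this data, while the namespace version resolves both names correctly.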
## Jason Orendorff http://www.jorendorff.com/
------------------------------------
# formula.py - Code for applying one or more formulas to tabular data
# Grab the formulas from the "formula_code.txt" file.
# Compile them.
e = open("formula_code.txt", "r")
source_code = e.read()
e.close()
my_code = compile(source_code, "arexpr", "exec")
# Open the file that has all the data.
# Read the first line, which has the names on it.
f = open("formula_data.txt", "r")
names = f.readline().split()
# Make a global namespace and import a bunch of stuff into it.
my_globals = {}
exec "from math import *" in my_globals
exec "from operator import *" in my_globals
# Now perform the calculations.
results = []
for line in f.xreadlines():
    # Grab the data and put it into the locals dictionary.
    my_locals = {}
    data = line.split()
    for i in range(len(names)):
        name = names[i]
        value = eval(data[i], my_globals, my_locals)
        my_locals[name] = value
    # Execute all the formulas.
    exec my_code in my_globals, my_locals
    # Stick the results in the results list.
    results.append(my_locals)
f.close()
# Figure out what the column names are.
first_row = results[0]
if first_row.has_key('output_columns'):
    column_names = first_row['output_columns'].split()
else:
    column_names = first_row.keys()
# Lastly, print rows of output.
print str.join('\t\t', column_names)
for row in results:
    outrow = []
    for name in column_names:
        outrow.append(str(row[name]))
    print str.join('\t\t', outrow)
------------------------------------
formula_code.txt:
------------------------------------
# formula_code.txt - formulas for formula.py
output_columns = 'height weight shoesize X Y Z'
X = add(mul(height, weight), shoesize)
Y = log(height) + log(weight) + 0.44 * log(shoesize)
# ...and in case you want to avoid infix notation...
Z = add(add(log(height), log(weight)), mul(0.44, log(shoesize)))
------------------------------------
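
As a quick check that the infix and function-call spellings above say the same thing, here is a small Python 3 sketch. The sample values are taken from the second data row of formula_data.txt; nothing else is assumed beyond the standard `math` and `operator` modules.

```python
from math import log
from operator import add, mul

# Second data row from formula_data.txt.
height, weight, shoesize = 4, 8, 12

# Infix spelling.
Y = log(height) + log(weight) + 0.44 * log(shoesize)

# The same expression written with operator functions only.
Z = add(add(log(height), log(weight)), mul(0.44, log(shoesize)))

print(Y, Z)
```

Both forms perform the same additions and multiplications in the same order, so the choice between them is a matter of taste.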
formula_data.txt:
------------------------------------
height weight shoesize
1 2 3
4 8 12
9 6 3
8 7 3
6 5 4
3 2 1
1 7 8
------------------------------------
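
For anyone reading this on a newer interpreter, the following is a rough Python 3 sketch of the same approach, not a drop-in replacement for formula.py. It assumes the two files are inlined as strings (a shortened copy of the data is used), and it swaps the per-field `eval` for `ast.literal_eval`, which is a substitution on my part. The names `public_names`, `apply_formulas` and `format_table` are hypothetical.

```python
import ast
import math
import operator

# Inlined stand-ins for formula_code.txt and formula_data.txt (assumption).
FORMULA_SOURCE = """
output_columns = 'height weight shoesize X Y Z'
X = add(mul(height, weight), shoesize)
Y = log(height) + log(weight) + 0.44 * log(shoesize)
Z = add(add(log(height), log(weight)), mul(0.44, log(shoesize)))
"""

DATA_TEXT = """\
height weight shoesize
1 2 3
4 8 12
"""


def public_names(module):
    """Collect a module's public names, roughly what 'from module import *' gives."""
    return {n: getattr(module, n) for n in dir(module) if not n.startswith("_")}


def apply_formulas(formula_source, data_text):
    """Run the compiled formulas once per data row; return one dict per row."""
    code = compile(formula_source, "<formulas>", "exec")

    # Shared globals: math first, then operator, matching the original order.
    shared = {}
    shared.update(public_names(math))
    shared.update(public_names(operator))

    lines = data_text.splitlines()
    names = lines[0].split()
    results = []
    for line in lines[1:]:
        if not line.strip():
            continue
        # literal_eval only accepts literals, unlike the original eval().
        row = {name: ast.literal_eval(field)
               for name, field in zip(names, line.split())}
        # Assignments made by the formulas land in the row dictionary.
        exec(code, shared, row)
        results.append(row)
    return results


def format_table(results):
    """Build the output table, using output_columns if the formulas set it."""
    first = results[0]
    if "output_columns" in first:
        columns = first["output_columns"].split()
    else:
        columns = list(first)
    out = ["\t\t".join(columns)]
    for row in results:
        out.append("\t\t".join(str(row[c]) for c in columns))
    return "\n".join(out)


if __name__ == "__main__":
    print(format_table(apply_formulas(FORMULA_SOURCE, DATA_TEXT)))
```

As in the original, the formulas still run through `exec`, so this is only appropriate when the formula file comes from a trusted source.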