[Tutor] Extract several arrays from a large 2D array
Peter Otten
__peter__ at web.de
Fri Jan 22 05:03:43 EST 2016
Ek Esawi wrote:
> Thank you all for your help. I am a decent programmer in another language
> but new to Python and I have some issues with a project I am working on.
> Some suggested using pandas but I am barely starting on Numpy. The
> suggestions were very helpful, however, I decided to replace the 2D array
> with several single arrays b/c the arrays are of different data types. I
> ran into another problem and am almost done but got stuck.
>
>
>
> Part of my code is below. The question is how to put the variables for
> each j into a 14 by 6 array by a statement at the end of this code. I was
> hoping to get an array like this one below:
>
> [2013 TT1 TT2 TT3 TT4 TT5 TT6]
>
> [2012 TT1 TT2 TT3 TT4 TT5 TT6]
>
> .
>
> .
>
> [1999 TT1 TT2 TT3 TT4 TT5 TT6]
>
> for j in range(14)
>     for i in range(200):

As a rule of thumb, if you are iterating over individual array entries in
numpy you are doing something wrong ;)

>         if TYear[i]==2013-j:
>             if TTrea[i]=='T1':
>                 TT1+=TTemp[i]
>             elif TTrea[i]=='T2':
>                 TT2+=TTemp[i]
>             elif TTrea[i]=='T3':
>                 TT3+=TTemp[i]
>             elif TTrea[i]=='T4':
>                 TT4+=TTemp[i]
>             elif TTrea[i]=='T5':
>                 TT5+=TTemp[i]
>             elif TTrea[i]=='T6':
>                 TT6+=TTemp[i]

This looks like you are manually building a pivot table. If you don't want
to use a spreadsheet (like Excel or Libre/OpenOffice Calc) you can do it
with pandas, too:

import pandas

df = pandas.DataFrame(
    [
        [2013, "T1", 42],
        [2013, "T2", 12],
        [2012, "T1", 1],
        [2012, "T1", 2],
        [2012, "T2", 10],
        [2012, "T3", 11],
        [2012, "T4", 12],
    ],
    columns=["year", "t", "v"])

# Older pandas releases called these keyword arguments rows= and cols=.
print(
    pandas.pivot_table(df, index="year", columns="t", values="v", aggfunc="sum")
)

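If you would rather stay with numpy for now, here is a minimal sketch of the
same table without a Python loop over individual entries. It assumes your
three parallel arrays are named TYear, TTrea and TTemp as in your code; the
values are made-up sample data, and combining numpy.unique() with
numpy.add.at() is just one possible approach, not the only one.

```python
import numpy as np

# Hypothetical sample values standing in for the arrays from the question.
TYear = np.array([2013, 2013, 2012, 2012, 2012, 2012, 2012])
TTrea = np.array(["T1", "T2", "T1", "T1", "T2", "T3", "T4"])
TTemp = np.array([42, 12, 1, 2, 10, 11, 12])

# Sorted unique labels plus, for every entry, the index of its label.
years, row_idx = np.unique(TYear, return_inverse=True)
labels, col_idx = np.unique(TTrea, return_inverse=True)

# One cell per (year, label) pair; np.add.at accumulates repeated index pairs,
# which a plain "table[row_idx, col_idx] += TTemp" would not do.
table = np.zeros((len(years), len(labels)), dtype=TTemp.dtype)
np.add.at(table, (row_idx, col_idx), TTemp)

# np.unique sorts ascending; reverse the rows to list the newest year first.
years = years[::-1]
table = table[::-1]

print(labels)
for year, row in zip(years, table):
    print(year, row)
```

With the real data, table should come out with one row per year and one
column per distinct entry in TTrea; combinations that never occur stay 0.
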
Here's a generic Python solution for educational purposes. It uses nested
dicts to model the table. The outer dict is used for the rows, the inner
dicts for the cells in a row. An extra set keeps track of the column labels.

data = [
    [2013, "T1", 42],
    [2013, "T2", 12],
    [2012, "T1", 1],
    [2012, "T1", 2],
    [2012, "T2", 10],
    [2012, "T3", 11],
    [2012, "T4", 12],
]

table = {}       # year -> {column label -> running total}
columns = set()  # every column label seen so far
for year, t, v in data:
    columns.add(t)
    row = table.setdefault(year, {})
    row[t] = row.get(t, 0) + v

columns = sorted(columns)

# Header line: four spaces stand in for the year column.
print("    ", *["{:>5}".format(c) for c in columns])
for year, row in sorted(table.items(), reverse=True):
    print(year, *["{:5}".format(row.get(c, 0)) for c in columns])

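If you want the result as rows like the ones you sketched, rather than as
printed text, the nested dicts can be flattened into a list of lists. This is
only a sketch on the same sample data; the name "rows" is my own and not part
of any library.

```python
data = [
    [2013, "T1", 42],
    [2013, "T2", 12],
    [2012, "T1", 1],
    [2012, "T1", 2],
    [2012, "T2", 10],
    [2012, "T3", 11],
    [2012, "T4", 12],
]

# Build the nested dicts: year -> {column label -> running total}.
table = {}
columns = set()
for year, t, v in data:
    columns.add(t)
    row = table.setdefault(year, {})
    row[t] = row.get(t, 0) + v
columns = sorted(columns)

# One list per year, newest first: [year, total for T1, total for T2, ...].
# Cells that never occurred in the data are filled with 0.
rows = [
    [year] + [cells.get(c, 0) for c in columns]
    for year, cells in sorted(table.items(), reverse=True)
]

for r in rows:
    print(r)
```

Passing rows to numpy.array() should then give you a 2D array, since every
row has the same length.
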