[Tutor] Change datatype for specific columns in a 2D array & computing the mean

Ek Esawi esawiek at gmail.com
Sat Jan 23 12:57:50 EST 2016


Hi All---



Sorry for posting again, but I have a problem that I have tried to solve
several different ways without success. I approached it from one angle and
asked about it here; I got some good suggestions using pandas and
structured arrays, but I am new to Python and not familiar enough with
either to use them at the moment. I decided to go in a different direction,
and I am hoping for a simpler solution using NumPy.


I have a csv file with 4 columns and 2000 rows. There are 10 distinct values
in column 1 and 4 distinct values in each of columns 2 and 3. I read the csv
file and converted it to an array. The problem I ran into and could not
resolve is two-fold: (1) change the datatype of columns 1 and 4 to float,
and then (2) use NumPy, or a simpler method, to calculate the mean of the
data points in column 4 for each value in column 1 and in column 2. Below
is my code and a sample of the data file.



Here is part of my code:



import numpy as np
import csv

TMatrix = []
np.set_printoptions(precision=2)

# Converting csv to lists
with open('c:/Users/My Documents/AAA/temp1.csv') as temp:
    reader = csv.reader(temp, delimiter=',', quoting=csv.QUOTE_NONE)
    for row in reader:
        TMatrix.append(row)

# Converting lists to arrays
TMatrix = np.array(TMatrix)
TMatrix = np.array(4, TMatrix[1:, ::], dtype='float,int,int,float')  # this statement is not working
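
For context on the failing statement: a plain 2D NumPy array holds a single
dtype, and np.array expects the data as its first argument, so a per-column
spec such as 'float,int,int,float' does not apply to it directly. The
following is only a minimal sketch of one alternative, converting individual
columns with astype. The header names and the inline TMatrix are assumptions
standing in for the real csv contents; the data rows are taken from the
sample in this post.

```python
import numpy as np

# Hypothetical stand-in for the array built from the csv file:
# an assumed header row followed by rows of strings from the sample.
TMatrix = np.array([
    ['c1', 'c2', 'c3', 'c4'],   # header names are made up for illustration
    ['19', 'A4', 'B2', '2'],
    ['19', 'A5', 'B1', '12'],
    ['18', 'A5', 'B2', '121'],
])

data = TMatrix[1:, :]              # skip the (assumed) header row
col1 = data[:, 0].astype(float)    # column 1 converted to floats
col4 = data[:, 3].astype(float)    # column 4 converted to floats
col2 = data[:, 1]                  # columns 2 and 3 stay as strings
col3 = data[:, 2]

print(col1, col4)
```

Keeping the numeric columns as separate 1D float arrays avoids the need for
a structured array, at the cost of tracking the columns by hand.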



+++++++++++++++ This is a sample of my file +++++++++++++



[['19' 'A4' 'B2' '2']
 ['19' 'A5' 'B1' '12']
 ['18' 'A5' 'B2' '121']]
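
Using the three sample rows above, the per-group mean could be sketched as
follows. This assumes that "based on each variable" means one mean of
column 4 per distinct value of column 1, and separately one per distinct
value of column 2; group_means is a hypothetical helper written for this
example, not a NumPy function.

```python
import numpy as np

# Sample rows from the post, kept as strings as they come out of csv.reader.
data = np.array([
    ['19', 'A4', 'B2', '2'],
    ['19', 'A5', 'B1', '12'],
    ['18', 'A5', 'B2', '121'],
])

col1 = data[:, 0].astype(float)   # grouping column 1, as floats
col2 = data[:, 1]                 # grouping column 2, left as strings
col4 = data[:, 3].astype(float)   # values to average

def group_means(keys, values):
    """Hypothetical helper: map each distinct key to the mean of its values."""
    return {
        k.item(): values[keys == k].mean().item()   # boolean mask selects one group
        for k in np.unique(keys)
    }

by_col1 = group_means(col1, col4)
by_col2 = group_means(col2, col4)
print(by_col1)
print(by_col2)
```

The boolean mask keys == k picks out the rows belonging to one group, so no
explicit loop over the 2000 rows is needed.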



Thanks in advance

EK Esawi

