[Tutor] New Item

Peter Otten __peter__ at web.de
Thu Sep 28 06:46:03 EDT 2017


LARRYSTALEY07 at comcast.net wrote:

> I am very new to Python and appreciate the input as I was able to fully
> install Python with all needed libraries (i.e., numpy, pandas, etc.).
> However, I now have an application question: I need to construct a 2D
> Histogram.
> 
> Basically, I have an Excel file that includes three columns:
> Column 1 - Gender (Male or Female)
> Column 2 - Height (in inches)
> Column 3 - Hand Span (in inches)

I have yet to grok your code samples, but my feeling is that your approach 
is too low-level. Do you mean something like 

http://matplotlib.org/examples/pylab_examples/hist2d_demo.html

by "2d histograms"? That would require very little code of your own:

import pandas as pd
from matplotlib import pyplot

filename = "size.xls"
sheetname = "first"

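# Read the worksheet into a DataFrame. The column names used below
# ("Gender", "Height", "Hand Span") must match the sheet's header row.
data = pd.read_excel(filename, sheetname)

# One figure per gender; the labels must be spelled as in the Gender column.
for index, sex in enumerate(["Female", "Male"], 1):
    pyplot.figure(index)
    subset = data[data["Gender"] == sex]
    pyplot.hist2d(subset["Height"].values, subset["Hand Span"].values)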

pyplot.show()
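
If you need the counts themselves rather than a picture, numpy.histogram2d() returns them as an array. The following is only a sketch: the sample values are made up, and the choice of seven bins per axis over one common range (so that both genders would share the same bin edges) is my assumption about what you want.

```python
import numpy as np

# Made-up sample values in inches, for illustration only.
height = np.array([60.0, 62.5, 65.0, 70.0, 72.0])
span = np.array([6.5, 7.0, 7.5, 8.5, 9.0])

bins = 7
# One common range, so histograms of different subsets share bin edges.
common_range = [[height.min(), height.max()], [span.min(), span.max()]]

counts, height_edges, span_edges = np.histogram2d(
    height, span, bins=bins, range=common_range
)
print(counts.shape)       # (7, 7)
print(int(counts.sum()))  # 5, every sample lands in exactly one bin
```

Note that these are equal-width bins, which is not quite the same binning as a formula that rounds to the nearest index.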


> 
> There are 168 entries in the Excel file.
> 
> I need to construct two separate 2D histograms, one for each of the two
> classes. Each entry gets a row index from its height and a column index
> from its hand span. These formulas would appear as:
> 
> r = ROUND( (B-1) * (h(i) - h(min)) / (h(max) - h(min)) )
> ***this is for the height and requires the min and max height from the
> 168 entries.
> 
> c = ROUND( (B-1) * (s(i) - s(min)) / (s(max) - s(min)) )
> ***this is for the hand span and requires the min and max hand span from
> the 168 entries.
> 
> B is the number of bins; I will use 7. i runs over the entries, from 1
> through 168.
> 
> Finally, if gender = Female, I update H(f) as H(f)[r,c] = H(f)[r,c] + 1.
> Otherwise (Male) I update H(m) as H(m)[r,c] = H(m)[r,c] + 1.
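
The index formula is easy to try out on a few numbers before touching the real data. A minimal sketch, with a hypothetical helper name (bin_index) and made-up heights:

```python
import numpy as np

def bin_index(values, vmin, vmax, bins):
    """Map each value to a bin index in 0..bins-1 by scaling and rounding."""
    values = np.asarray(values, dtype=float)
    return np.round((bins - 1) * (values - vmin) / (vmax - vmin)).astype(int)

heights = [60.0, 66.0, 72.0]  # made-up values, in inches
print(bin_index(heights, 60.0, 72.0, 7))  # [0 3 6]
```

Two things to be aware of: np.round() rounds exact halves to the nearest even number, which can differ from a spreadsheet's ROUND, and rounding to the nearest index makes the first and last bins half as wide as the others.
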
> 
> It appears I first need separate arrays for gender, height and hand
> span, which would look like the following:
> 
> 
> 
> data=readExcel(excelfile)
> X=np.array(data[:,1],dtype=float);
> S=np.array(data[:,2],dtype=float);
> T=np.array(data[:,0],dtype=str);
> 
> 
> 
> 
> Finally, here is my intended code for the actual 2D histogram. I will get
> min and max from the height and hand span arrays. Note I am still learning
> Python and looping is new to me:
> 
> 
> 
> 
> import numpy as np
>
> # Define histogram classifier to build histogram using two variables
> def Build1DHistogramClassifier(X, S, smin, smax, T, B, xmin, xmax):
>     # One B x B table of counts per gender
>     HF = np.zeros((B, B), dtype='int32')
>     HM = np.zeros((B, B), dtype='int32')
>     binindices1 = np.round((B-1)*(X-xmin)/(xmax-xmin)).astype('int32')
>     binindices2 = np.round((B-1)*(S-smin)/(smax-smin)).astype('int32')
>     # Visit each entry once: b is its height bin, c its hand span bin
>     for i, (b, c) in enumerate(zip(binindices1, binindices2)):
>         if T[i] == 'Female':
>             HF[b, c] += 1
>         else:
>             HM[b, c] += 1
>     return [HF, HM]
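
If you do want to compute the counts by hand, the counting step can also be written without an explicit loop. This is only a sketch under my own assumptions: the function name is made up, the gender labels are taken to be "Female" and "Male", and the min and max are taken over all entries rather than per gender.

```python
import numpy as np

def build_2d_histograms(height, span, gender, bins=7):
    """Return (female_counts, male_counts), each a bins x bins integer array."""
    height = np.asarray(height, dtype=float)
    span = np.asarray(span, dtype=float)
    is_female = np.asarray(gender) == "Female"

    def to_index(values):
        # Scale to 0..bins-1 using the min and max of all entries, then round.
        scaled = (bins - 1) * (values - values.min()) / (values.max() - values.min())
        return np.round(scaled).astype(int)

    rows, cols = to_index(height), to_index(span)
    female = np.zeros((bins, bins), dtype=int)
    male = np.zeros((bins, bins), dtype=int)
    # np.add.at counts repeated (row, col) pairs correctly,
    # which a plain fancy-indexed "+= 1" would not.
    np.add.at(female, (rows[is_female], cols[is_female]), 1)
    np.add.at(male, (rows[~is_female], cols[~is_female]), 1)
    return female, male

# Made-up entries, for illustration only.
hf, hm = build_2d_histograms(
    [60.0, 60.0, 66.0, 72.0],
    [6.0, 6.0, 7.5, 9.0],
    ["Female", "Female", "Male", "Male"],
)
print(hf[0, 0], hm[3, 3], hm[6, 6])  # 2 1 1
```

The sketch does not guard against all heights (or all hand spans) being equal, in which case the scaling would divide by zero.
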
> 
> 
> 
> 
> I would appreciate any input on this approach and at least if I am going
> in the right direction.
> 
> Thanks again.
> 
> 
> 
> 
> Larry Staley
> 
> larrystaley07 at comcast.net
> 
> 650.274.6794



