[Tutor] New Item

Sun Oct 15 09:37:50 EDT 2017

Sydney Shall wrote:

> On 28/09/2017 11:46, Peter Otten wrote:
>> LARRYSTALEY07 at comcast.net wrote:
>> 
>>> I am very new to Python and appreciate the input as I was able to fully
>>> install Python with all needed libraries (i.e., numpy, pandas, etc.).
>>> However, I now have an application question: I need to construct a 2D
>>> histogram.
>>>
>>> Basically, I have an Excel file that includes three columns:
>>> Column 1 - Gender (Male or Female)
>>> Column 2 - Height (in inches)
>>> Column 3 - Hand Span (in inches)
>> 
>> I have yet to grok your code samples, but my feeling is that your
>> approach is too low-level. Do you mean something like
>> 
>> http://matplotlib.org/examples/pylab_examples/hist2d_demo.html
>> 
>> by "2d histograms"? That would require very little code written by
>> yourself:
>> 
>> import pandas as pd
>> from matplotlib import pyplot
>> 
>> filename = "size.xls"
>> sheetname = "first"
>> 
>> data = pd.read_excel(filename, sheetname)
>> 
>> for index, sex in enumerate(["female", "male"], 1):
>>      pyplot.figure(index)
>>      subset = data[data["Gender"] == sex]
>>      pyplot.hist2d(subset["Height"].values, subset["Hand Span"].values)
>> 
>> pyplot.show()

> I have a similar problem, but my data is not in Excel; it is in an
> OpenOffice "Spreadsheet", not in a "Database".
> 
> My question is: can I use a simple procedure similar to the one given by
> Peter Otten?

There doesn't seem to be direct support for the ods file format in pandas. 
Your easiest option is to open the file in OpenOffice and save as xls or 
csv.
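
For the csv route, a minimal sketch of the loading step might look like the
following. The sample rows and the name csv_text are made up for
illustration (they stand in for a file exported from OpenOffice), and the
column names are assumed to match those in the original question.

```python
import io

import pandas as pd

# Hypothetical stand-in for a sheet saved from OpenOffice as csv;
# the values are invented, only the column names come from the question.
csv_text = """Gender,Height,Hand Span
female,64.0,7.5
male,70.5,9.0
female,62.5,7.0
male,68.0,8.5
"""

# With a real export you would pass the path instead,
# e.g. pd.read_csv("size.csv") (file name assumed).
data = pd.read_csv(io.StringIO(csv_text))

# The same selection the plotting loop performs for each gender.
for sex in ["female", "male"]:
    subset = data[data["Gender"] == sex]
    print(sex, len(subset), "rows")
```

Once data is loaded this way, the plotting loop from the quoted example
should work unchanged, since it only relies on the column names.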

If you don't want to go that route you can install a library that can read 
ods files. With

https://pypi.python.org/pypi/pyexcel-ods/0.3.1

the above example should work after the following modifications:

import pandas as pd
from matplotlib import pyplot
import pyexcel_ods

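def read_ods(filename, sheetname):
    # read_data() gives a mapping of sheet name -> list of rows.
    table = pyexcel_ods.read_data(filename)[sheetname]
    # The first row holds the column names, the remaining rows the data.
    return pd.DataFrame(table[1:], columns=table[0])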

filename = "size.ods"
sheetname = "first"

data = read_ods(filename, sheetname)

for index, sex in enumerate(["female", "male"], 1):
    pyplot.figure(index)
    subset = data[data["Gender"] == sex]
    pyplot.hist2d(subset["Height"].values, subset["Hand Span"].values)

pyplot.show()
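
To see what read_ods() does with the rows it gets back, here is a small
sketch that uses a hand-written list of rows instead of a real ods file.
The values are invented, and the numeric conversion at the end is only a
precaution I am assuming might be useful if some cells arrive as text; I
have not checked whether pyexcel-ods ever delivers them that way.

```python
import pandas as pd

# Hypothetical sheet contents, shaped like the list of rows that
# read_ods() indexes into: header row first, then the data rows.
table = [
    ["Gender", "Height", "Hand Span"],
    ["female", 64.0, 7.5],
    ["male", 70.5, 9.0],
    ["female", 62.5, 7.0],
]

# Same construction as in read_ods(): row 0 names the columns.
data = pd.DataFrame(table[1:], columns=table[0])

# Precaution (an assumption, not a documented requirement): coerce the
# measurement columns to numbers; cells that cannot be parsed become NaN.
for column in ["Height", "Hand Span"]:
    data[column] = pd.to_numeric(data[column], errors="coerce")

# Drop rows whose measurements could not be converted.
data = data.dropna(subset=["Height", "Hand Span"])

print(data.dtypes)
```

If the dtypes printed for "Height" and "Hand Span" are numeric, the columns
can be handed to pyplot.hist2d() as in the loop above.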