[Tutor] Issues Inserting Graphical Overlay Using Matplotlib Patches
Stephen Malcolm
stephen_malcolm at hotmail.com
Tue Sep 29 14:28:13 EDT 2020
Hi Alan,
I have that sorted now, and my data points are plotted. Thanks for the advice.
Whilst I'm here, I have one more question on this thread.
I'm having some trouble adding a graphical overlay, i.e. an ellipse, onto my plot.
I wish to do this because I need to show the mean, standard deviation and outliers, and hence evaluate the suitability of the dataset.
Could you please let me know what code I'm missing, or need to add, in order to insert this ellipse?
I have no trouble plotting the data points and the mean using this code; however, the ellipse (whose width and height come from the standard deviation) doesn't appear.
I get no errors; instead, I'm getting a separate graph (without data points or ellipse) below the plotted one.
Please find my code again:
#pandas used to read dataset and return the data
#numpy and matplotlib to represent and visualize the data
#sklearn to implement kmeans algorithm
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from sklearn.cluster import KMeans
#import the data
data = pd.read_csv('banknotes.csv')
#extract values
x=data['V1']
y=data['V2']
#print range to determine normalization
print ("X_max : ",x.max())
print ("X_min : ",x.min())
print ("Y_max : ",y.max())
print ("Y_min : ",y.min())
#normalize values
mean_x=x.mean()
mean_y=y.mean()
max_x=x.max()
max_y=y.max()
min_x=x.min()
min_y=y.min()
for i in range(0,x.size):
    x[i] = (x[i] - mean_x) / (max_x - min_x)
for i in range(0,y.size):
    y[i] = (y[i] - mean_y) / (max_y - min_y)
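The two loops above apply mean normalization one element at a time. As a rough sketch, the same formula can be written as a single vectorized expression on each pandas column. The small DataFrame and the helper name mean_normalize are made up for illustration and stand in for the real banknotes.csv data.

```python
import pandas as pd

# Hypothetical stand-in for the V1 and V2 columns of banknotes.csv
data = pd.DataFrame({'V1': [1.0, 2.0, 3.0, 6.0],
                     'V2': [10.0, 20.0, 40.0, 50.0]})

def mean_normalize(column):
    """Centre a column on its mean and scale it by its range (max - min)."""
    return (column - column.mean()) / (column.max() - column.min())

# Each call returns a new Series; the original DataFrame is left unchanged
x = mean_normalize(data['V1'])
y = mean_normalize(data['V2'])

print(x.tolist())
print(y.tolist())
```

Unlike the loops, this version does not write back into the original columns, so the raw values remain available for comparison.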
#statistical analysis using mean and standard deviation
import matplotlib.patches as patches
mean = np.mean(data, 0)
std_dev = np.std(data, 0)
ellipse = patches.Ellipse([mean[0], mean[1]], std_dev[0]*2, std_dev[1]*2, alpha=0.25)
plt.xlabel('V1')
plt.ylabel('V2')
plt.title('Visualization of raw data');
plt.scatter(data.iloc[:, 0], data.iloc[:, 1])
plt.scatter(mean[0],mean[1])
plt.figure(figsize=(6, 6))
fig,graph = plt.subplots()
graph.add_patch(ellipse)
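A likely reason for the empty second graph, sketched here under assumptions: the plt.scatter calls draw on the current figure, and the later plt.figure and plt.subplots calls each create a new figure, so the ellipse is added to a fresh set of axes that holds no data points. The sketch below creates the axes once and draws both the points and the ellipse on them. The synthetic points array stands in for the V1/V2 columns, and the Agg backend is chosen only so the code runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, so no window is needed
import matplotlib.pyplot as plt
import matplotlib.patches as patches
import numpy as np

# Synthetic stand-in for the normalized V1/V2 columns (not the real banknotes data)
rng = np.random.default_rng(0)
points = rng.normal(loc=[0.0, 0.0], scale=[0.2, 0.1], size=(200, 2))

mean = points.mean(axis=0)
std_dev = points.std(axis=0)

# Create the figure and its axes once, before anything is plotted
fig, graph = plt.subplots(figsize=(6, 6))

# Draw the data points and the mean on those axes
graph.scatter(points[:, 0], points[:, 1])
graph.scatter(mean[0], mean[1])

# Ellipse takes the full width and height, so 2 * std_dev spans
# one standard deviation on each side of the mean
ellipse = patches.Ellipse((mean[0], mean[1]), std_dev[0] * 2, std_dev[1] * 2, alpha=0.25)
graph.add_patch(ellipse)

graph.set_xlabel('V1')
graph.set_ylabel('V2')
graph.set_title('Visualization of raw data')
```

In an interactive session or notebook, the default backend can be left in place and plt.show() called at the end to display the single combined plot.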
________________________________
From: Tutor <tutor-bounces+stephen_malcolm=hotmail.com at python.org> on behalf of Alan Gauld via Tutor <tutor at python.org>
Sent: 29 September 2020 07:48
To: tutor at python.org <tutor at python.org>
Subject: Re: [Tutor] Issues Inserting Graphical Overlay Using Matplotlib Patches
On 29/09/2020 06:53, Stephen Malcolm wrote:
> I’ve studied those two lines, and I’m struggling to come up with a solution.
> I think the word ‘data’ should be replaced with something else.
It's not the word data that matters but that closing parenthesis ')'.
It should be a comma. That's what Python is telling you by saying
there is a syntax error. It's not a name error; it is a mistake
in the grammar of your code.
>>> graph.scatter(data[:,0])data[:,1])
>>> graph.scatter(mean[0], mean[1])
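As a small illustration of the point about grammar, the sketch below uses the standard ast module to check whether a line parses at all, without running it. The helper name parses is made up for this example; the first string is the quoted line, and the second has the stray parenthesis replaced by a comma.

```python
import ast

# The line as quoted above, with ')' where a comma belongs
broken = "graph.scatter(data[:,0])data[:,1])"
# The same line with the stray ')' replaced by a comma
fixed = "graph.scatter(data[:,0], data[:,1])"

def parses(source):
    """Return True if the source is grammatically valid Python, else False."""
    try:
        ast.parse(source)
        return True
    except SyntaxError:
        return False

print("broken parses:", parses(broken))
print("fixed parses:", parses(fixed))
```

Parsing only checks the grammar, so names such as graph and data do not need to exist for this check; a NameError would only arise later, when the code is actually run.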
--
Alan G
Author of the Learn to Program web site
http://www.alan-g.me.uk/
http://www.amazon.com/author/alan_gauld
Follow my photo-blog on Flickr at:
http://www.flickr.com/photos/alangauldphotos