[Tutor] toy program to find standard deviation of 2 columns of a sqlite3 database
Manprit Singh
manpritsinghece at gmail.com
Mon Jul 4 13:14:29 EDT 2022
Dear Sir,
Finally I came up with a solution which seems better to me than the
previous approach. In this solution I have used the shortcut method
for calculating the standard deviation.
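import sqlite3

class StdDev:
    """Aggregate: population standard deviation via the shortcut formula."""

    def __init__(self):
        self.cnt = 0        # number of values seen
        self.sumx = 0       # running sum of x
        self.sumsqrx = 0    # running sum of x squared

    def step(self, x):
        # Called once per row with the column value
        self.cnt += 1
        self.sumx += x
        self.sumsqrx += x**2

    def finalize(self):
        # sqrt((sum(x^2) - (sum(x))^2 / n) / n)
        return ((self.sumsqrx - self.sumx**2/self.cnt)/self.cnt)**0.5

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("create table table1(X1 int, X2 int)")
ls = [(2, 5),
      (3, 7),
      (4, 2),
      (5, 1),
      (8, 6)]
cur.executemany("insert into table1 values(?, ?)", ls)
conn.commit()

# Register the class as a one-argument SQL aggregate named "stdev"
conn.create_aggregate("stdev", 1, StdDev)

# The query returns a single row; unpack it into std_dev
std_dev, = cur.execute("select stdev(X1), stdev(X2) from table1")
print(std_dev)
cur.close()
conn.close()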
This gives the output:
(2.0591260281974, 2.315167380558045)
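
As a sanity check, the aggregate above divides by n rather than n - 1, so it should agree with the population standard deviation from the standard library's statistics.pstdev. A minimal sketch of that comparison follows; the names rows, from_sql, expected and matches are only for illustration and are not part of the original post.

```python
import math
import sqlite3
import statistics

class StdDev:
    """Same aggregate as in the post: shortcut formula, divides by n."""

    def __init__(self):
        self.cnt = 0
        self.sumx = 0
        self.sumsqrx = 0

    def step(self, x):
        self.cnt += 1
        self.sumx += x
        self.sumsqrx += x**2

    def finalize(self):
        return ((self.sumsqrx - self.sumx**2 / self.cnt) / self.cnt) ** 0.5

rows = [(2, 5), (3, 7), (4, 2), (5, 1), (8, 6)]

conn = sqlite3.connect(":memory:")
conn.execute("create table table1(X1 int, X2 int)")
conn.executemany("insert into table1 values(?, ?)", rows)
conn.create_aggregate("stdev", 1, StdDev)
from_sql = conn.execute("select stdev(X1), stdev(X2) from table1").fetchone()
conn.close()

# Population standard deviation of each column, computed outside SQLite
expected = (
    statistics.pstdev([r[0] for r in rows]),
    statistics.pstdev([r[1] for r in rows]),
)

# Compare with a tolerance, since the two routes round differently
matches = all(math.isclose(a, b, rel_tol=1e-9) for a, b in zip(from_sql, expected))
print(from_sql, expected, matches)
```

If the sample standard deviation (dividing by n - 1) is wanted instead, statistics.stdev is the corresponding reference function, and finalize would need to change accordingly.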
That's all. This is what I was looking for. So which is the better
solution to this problem: this one, or the previous one I posted?
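
One thing that may be worth weighing when comparing approaches is numerical behaviour: the shortcut formula subtracts two sums that can be large and nearly equal when the mean is big relative to the spread, which can lose precision in floating point. The sketch below is an alternative that was not proposed in this thread: it assumes Welford's online update instead of the shortcut formula. The class name WelfordStdDev and the SQL name stdev_w are hypothetical. It also skips NULL values (which arrive in step as None) and returns None when there is nothing to aggregate; both are design choices of this sketch, not behaviour of the original code.

```python
import sqlite3

class WelfordStdDev:
    """Population standard deviation using Welford's online update (sketch)."""

    def __init__(self):
        self.cnt = 0      # number of non-NULL values seen
        self.mean = 0.0   # running mean
        self.m2 = 0.0     # running sum of squared deviations from the mean

    def step(self, x):
        if x is None:     # ignore NULLs (a choice made in this sketch)
            return
        self.cnt += 1
        delta = x - self.mean
        self.mean += delta / self.cnt
        self.m2 += delta * (x - self.mean)

    def finalize(self):
        if self.cnt == 0:  # nothing to aggregate: report NULL
            return None
        return (self.m2 / self.cnt) ** 0.5

conn = sqlite3.connect(":memory:")
conn.execute("create table table1(X1 int, X2 int)")
conn.executemany(
    "insert into table1 values(?, ?)",
    [(2, 5), (3, 7), (4, 2), (5, 1), (8, 6), (None, None)],  # last row is all NULL
)
conn.create_aggregate("stdev_w", 1, WelfordStdDev)
result = conn.execute("select stdev_w(X1), stdev_w(X2) from table1").fetchone()
conn.close()
print(result)
```

On the small table from the post, both approaches should agree to within rounding; the difference would only be expected to show up on data with a large offset and a small spread.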
The whole credit goes to Dennis Lee Bieber & avi.e.gross at gmail.com
Regards
Manprit Singh
On Mon, Jul 4, 2022 at 4:17 PM Alan Gauld via Tutor <tutor at python.org>
wrote:
> On 03/07/2022 14:01, Manprit Singh wrote:
> > Sir,
> > I am just going through all the functionalities available in the sqlite3
> > module, just to see if I can use sqlite3 as a good data analysis tool or not.
>
> SQLite is a good storage and retrieval system. It's not aimed at data
> analysis; that's where tools like Pandas and R come into play.
>
> SQLite will do a better job in pulling out specific subsets of
> data and of organising your data with relationships etc. But it
> makes no attempt to be a fully featured application environment
> (unlike the bigger client/server databases like Oracle or DB2)
>
> > Up to this point I have figured out that an sqlite database file can be an
> > excellent replacement for data stored in files.
> >
> > You can preserve data in a structured form, email it to someone who needs it,
> > etc.
>
> Yes, that is its strong point. Everything is stored in a single file
> that can be easily shared by email or by storing it on a cloud server.
>
> > But for good data analysis, I found pandas is superior. I use pandas
> > for data analysis and visualization.
>
> And that's good because that is what Pandas (and SciPy in general)
> is designed for.
> >
> > Btw, this is true. You should use the right tool for your task.
>
> Absolutely. One of the key skills of a software engineer is
> recognising which tools are best suited to which part of the
> task and how to glue them together.
> There is no universally best tool.
>
> --
> Alan G
> Author of the Learn to Program web site
> http://www.alan-g.me.uk/
> http://www.amazon.com/author/alan_gauld
> Follow my photo-blog on Flickr at:
> http://www.flickr.com/photos/alangauldphotos