[Tutor] toy program to find standard deviation of 2 columns of a sqlite3 database

Manprit Singh manpritsinghece at gmail.com
Mon Jul 4 13:14:29 EDT 2022


Dear Sir,

Finally I have come up with a solution that seems better to me than the
previous approach. In this solution I have used the shortcut method
for calculating the standard deviation.

import sqlite3

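class StdDev:
    """Aggregate computing the population standard deviation of a column."""

    def __init__(self):
        self.cnt = 0        # number of non-NULL values seen
        self.sumx = 0       # running sum of x
        self.sumsqrx = 0    # running sum of x**2

    def step(self, x):
        if x is None:       # skip NULLs so they do not raise a TypeError
            return
        self.cnt += 1
        self.sumx += x
        self.sumsqrx += x**2

    def finalize(self):
        if self.cnt == 0:   # no values: return NULL instead of dividing by zero
            return None
        # shortcut formula: sqrt((sum(x**2) - sum(x)**2 / n) / n)
        return ((self.sumsqrx - self.sumx**2/self.cnt)/self.cnt)**0.5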

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("create table table1(X1 int, X2 int)")
ls = [(2, 5),
      (3, 7),
      (4, 2),
      (5, 1),
      (8, 6)]
cur.executemany("insert into table1 values(?, ?)", ls)
conn.commit()

conn.create_aggregate("stdev", 1, StdDev)
std_dev, = cur.execute("select stdev(X1), stdev(X2) from table1")
print(std_dev)
cur.close()
conn.close()


gives the output

(2.0591260281974, 2.315167380558045)
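
As a cross-check (a minimal sketch, not part of the program above): the
shortcut formula sqrt((sum(x**2) - sum(x)**2/n)/n) is the population
standard deviation, so statistics.pstdev from the standard library should
agree on the same data. The helper name shortcut_pstdev is only
illustrative.

```python
import math
import statistics


def shortcut_pstdev(values):
    """Population standard deviation via the sum / sum-of-squares shortcut."""
    n = len(values)
    sumx = sum(values)
    sumsqrx = sum(v * v for v in values)
    return math.sqrt((sumsqrx - sumx ** 2 / n) / n)


# Same two columns as in table1
x1 = [2, 3, 4, 5, 8]
x2 = [5, 7, 2, 1, 6]

for name, col in (("X1", x1), ("X2", x2)):
    # Print the shortcut result next to the library result for comparison
    print(name, shortcut_pstdev(col), statistics.pstdev(col))
```

Note that this is the population figure (divide by n). The sample
standard deviation (divide by n - 1, statistics.stdev) would give
different numbers.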

That's all. This is what I was looking for. So which is the better
solution to this problem: this one, or the previous one I posted?
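
One point that may matter when comparing: the shortcut formula subtracts
two large, nearly equal numbers when the values are big and close together,
which can lose precision in floating point. Below is a hedged sketch of the
same aggregate written with Welford's running-mean update instead. It is a
swapped-in technique, not the method used above, and the names
StdDevWelford and stdev_w are made up for illustration.

```python
import sqlite3


class StdDevWelford:
    """Population standard deviation using Welford's online update."""

    def __init__(self):
        self.cnt = 0      # number of non-NULL values seen
        self.mean = 0.0   # running mean
        self.m2 = 0.0     # running sum of squared deviations from the mean

    def step(self, x):
        if x is None:     # ignore NULLs
            return
        self.cnt += 1
        delta = x - self.mean
        self.mean += delta / self.cnt
        self.m2 += delta * (x - self.mean)

    def finalize(self):
        if self.cnt == 0:  # no values: return NULL
            return None
        return (self.m2 / self.cnt) ** 0.5


conn = sqlite3.connect(":memory:")
conn.execute("create table table1(X1 int, X2 int)")
conn.executemany("insert into table1 values(?, ?)",
                 [(2, 5), (3, 7), (4, 2), (5, 1), (8, 6)])
conn.create_aggregate("stdev_w", 1, StdDevWelford)
row = conn.execute("select stdev_w(X1), stdev_w(X2) from table1").fetchone()
print(row)
conn.close()
```

For small integer data like this table the two approaches should agree to
within rounding, so the choice mostly matters for large or closely spaced
values.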

The whole credit goes to Dennis Lee Bieber & avi.e.gross at gmail.com




Regards
Manprit Singh


On Mon, Jul 4, 2022 at 4:17 PM Alan Gauld via Tutor <tutor at python.org>
wrote:

> On 03/07/2022 14:01, Manprit Singh wrote:
> > Sir,
> > I am just going through all the functionality available in the sqlite3
> > module, just to see whether I can use sqlite3 as a good data analysis tool.
>
> SQLite is a good storage and retrieval system. It's not aimed at data
> analysis; that's where tools like Pandas and R come into play.
>
> SQLite will do a better job in pulling out specific subsets of
> data and of organising your data with relationships etc. But it
> makes no attempt to be a fully featured application environment
> (unlike the bigger client/server databases like Oracle or DB2)
>
> > Up to this point I have figured out that an sqlite database file can be an
> > excellent replacement for data stored in files.
> >
> > You can preserve data in a structured form, email it to someone who needs
> > it, etc.
>
> Yes, that is its strong point. Everything is stored in a single file
> that can be easily shared by email or by storing it on a cloud server.
>
> > But for good data analysis I found pandas is superior. I use pandas
> > for data analysis and visualization.
>
> And that's good because that is what Pandas (and SciPy in general)
> is designed for.
> >
> > Btw, this is true: you should use the right tool for your task.
>
> Absolutely. One of the key skills of a software engineer is
> recognising which tools are best suited to which part of the
> task and how to glue them together.
> There is no universally best tool.
>
> --
> Alan G
> Author of the Learn to Program web site
> http://www.alan-g.me.uk/
> http://www.amazon.com/author/alan_gauld
> Follow my photo-blog on Flickr at:
> http://www.flickr.com/photos/alangauldphotos