[Tutor] Memory Error problem solved

Karthik Swaminathan rs.karthick at gmail.com
Tue Jul 29 16:35:58 CEST 2008


I was amazed by the response of the Python community. Hats off to all. I
stopped using the lists and got the issue resolved using just the
variables. Nevertheless, i learned a lot by starting this thread.

Thanks a million to Alan, Chris, John for spending your quality time in
helping this newbie.

Hope i havent spoiled Alan's sleep much.


On 7/29/08, tutor-request at python.org <tutor-request at python.org> wrote:
>
> Send Tutor mailing list submissions to
>        tutor at python.org
>
> To subscribe or unsubscribe via the World Wide Web, visit
>        http://mail.python.org/mailman/listinfo/tutor
> or, via email, send a message with subject or body 'help' to
>        tutor-request at python.org
>
> You can reach the person managing the list at
>        tutor-owner at python.org
>
> When replying, please edit your Subject line so it is more specific
> than "Re: Contents of Tutor digest..."
>
>
> Today's Topics:
>
>   1. Re: Memory error - how to manage large data sets? (John Fouhy)
>   2. Re: Memory error - how to manage large data sets? (Chris Fuller)
>   3. Re: Memory error - how to manage large data sets? (Alan Gauld)
>   4. Re: Turtle problem: how to exit the .exe? (Alan Gauld)
>   5. Re: Turtle problem: how to exit the .exe? (Dick Moores)
>   6. Re: Memory error - how to manage large data sets?
>      (Daniel Sarmiento)
>   7. Re: Tutor Digest, Vol 53, Issue 99 (kinuthiA muchanE)
>
>
> ----------------------------------------------------------------------
>
> Message: 1
> Date: Tue, 29 Jul 2008 11:48:11 +1200
> From: "John Fouhy" <john at fouhy.net>
> Subject: Re: [Tutor] Memory error - how to manage large data sets?
> To: "Daniel Sarmiento" <dsarmientos at gmail.com>
> Cc: tutor at python.org
> Message-ID:
>        <5e58f2e40807281648g4621d37doda5db90436a78039 at mail.gmail.com>
> Content-Type: text/plain; charset=ISO-8859-1
>
> On 29/07/2008, Daniel Sarmiento <dsarmientos at gmail.com> wrote:
> >  I tried to run your code and checked (with top) the memory ussage and
> >  it uses more than 2 Gb of memory.
> >
> >  I tried to modify the code a little bit to use less memory and came up
> >  with this:
> >
> >  fib = {0:0,1:1}
> >
> > even = []
> >
> >  def fibonacci(x,y):
> >    return x+y
> >
> > for j in xrange (2,1000000):
> >     i = fib[j-1] + fib[j-2]
> >     if i % 2 == 0:
> >         even.append(i)
> >     fib = {j-1:fib[j-1], j:i}
> >
> >
> >  total = reduce(fibonacci,even)
> >  print total
> >
> >  It looks like the progam still hangs and I did not notice any memory
> >  imrovements when running it with 1 000 000
>
> Well, let's see.  You're still storing all the even fibonacci numbers
> that you compute.  By the looks of things, one third of fibonacci
> numbers are even, so that's about 333,333 numbers that we're storing.
>
> How big do these numbers get?  There's a closed-form expression for
> Fibonacci numbers; it's: fib(n) = (phi**n - (1-phi)**n)/sqrt(5), where
> phi is the golden ratio (about 1.6).  1-phi is -0.6, so when n is
> large, (1-phi)**n is practically zero.  So fib(n) is roughly
> phi**n/sqrt(5).  These numbers will quickly get beyond the size of
> normal integers, and into long integers.  I don't know how many bytes
> a long integer takes up, but I guess we can estimate it by looking at
> the log to base 2.
>
> So let's consider the millionth Fibonacci number.  fib(1000000) ~=
> phi**1000000/sqrt(5).
> So log(fib(1000000)) ~= log(phi**1000000/sqrt(5)) = 1000000*log(phi) -
> log(sqrt(5)).
>
> Taking logs to base 2, we get:
>
> >>> 1000000*log(phi, 2) - log(sqrt(5), 2)
> 694240.75266657001
>
> In other words, the best possible representation of the millionth
> Fibonacci number will take almost 700,000 bits, or around 85
> kilobytes.  I don't know how Python long integers actually work; it
> may take up more space than that.
>
> Of course, that's just the biggest one we're working out.  Let's try
> to work out how much space the first million will take up.  This is:
>
> sum_{i=1}^{1000000} i*log(phi, 2) - log(sqrt(5), 2)
>
> == -1000000*log(sqrt(5), 2) + log(phi, 2)*sum_{i=1}^{1000000} i
>
> Remembering the formula for summing integers (n(n+1)/2), this is:
>
> >>> -1000000*log(sqrt(5), 2) + log(phi, 2)*(1000000*1000001/2)
> 347120142972.21808
>
> This is a number of bits; divide by 8 to get bytes; divide by 8*1024
> to get kilobytes, etc:
>
> >>> _/(8*1024*1024*1024)
> 40.410103156722393
>
> So ... that's about 40 gigabytes worth of numbers in this list you're
> trying to build.  Well, actually you're only storing a third of the
> Fibonacci numbers (the even ones), so we can cut that down to thirteen
> gigabytes.  Assuming my maths is correct and there's not too much
> overhead in Python long integers :-)
>
> (the solution, of course, is to avoid storing all those numbers in the
> first place)
>
> --
> John.
>
>
> ------------------------------
>
> Message: 2
> Date: Mon, 28 Jul 2008 19:18:48 -0500
> From: Chris Fuller <cfuller084 at thinkingplanet.net>
> Subject: Re: [Tutor] Memory error - how to manage large data sets?
> To: tutor at python.org
> Message-ID: <200807281918.48230.cfuller084 at thinkingplanet.net>
> Content-Type: text/plain;  charset="utf-8"
>
>
> There's no need to keep any lists.  The sum can be done on the fly, which
> is
> perhaps a bit slower, but takes a constant amount of ram.  Even storing
> every
> other element (or every third, which is what he's trying to do: the
> elements
> that are even numbers, not every other element.. See his example code, or
> Project Euler, problem two, which seems to be the original source) will
> still
> take a lot of ram.
>
> Something like this:
>
> a = 0
> b = 1
>
> total = 0
>
> while True:
>   c = a+b
>   a = b
>   b = c
>
> if c>1e6:
>    break
>
> if c%2 == 0:
>    total += c
>
> print total
>
> By the way,  working through those problems will really exercise your
> programming and math skills.  It's a great way to get started, if you enjoy
> those kind of puzzles.
>
>
> Can you see why every third element must be even?
> Cheers
>
>
> ------------------------------
>
> Message: 3
> Date: Tue, 29 Jul 2008 01:18:53 +0100
> From: "Alan Gauld" <alan.gauld at btinternet.com>
> Subject: Re: [Tutor] Memory error - how to manage large data sets?
> To: tutor at python.org
> Message-ID: <g6lnhk$dmm$1 at ger.gmane.org>
> Content-Type: text/plain; charset="iso-8859-1"
>
> "Alan Gauld" <alan.gauld at btinternet.com> wrote
>
> > were infinite using floats! So you need to calculate the
> > total as you go without saving the values
> >
>
> I got curious so wrote the following function:
>
> >>> def fibtot(N):
> ...   f0,f1,tot = 0,1,1
> ...   for n in range(N):
> ...     f = f0 + f1
> ...     f0,f1 = f1,f
> ...     if n % 2 == 0: tot += f
> ...   return tot
>
> and here are the results...
>
> >>> len(str(fibtot(10000)))
> 2090
> >>> len(str(fibtot(100000)))
> 20899
> >>> len(str(fibtot(200000)))
> 41798
> >>> len(str(fibtot(400000)))
> 83595
> >>> len(str(fibtot(500000)))
> 104494
> >>> len(str(fibtot(600000)))
> 125393
> >>> len(str(fibtot(800000)))
> 167190
> >>> len(str(fibtot(1000000)))
> 208988
> >>>
>
> So the final total contains nearly 209 thousand digits!
> Thats a very, very big number... and it took 5 minutes
> to compute on my PC (Dual core 2.8GHz with 2G RAM)
>
> Now I really must go to bed :-)
>
> --
> Alan Gauld
> Author of the Learn to Program web site
> http://www.freenetpages.co.uk/hp/alan.gauld
> -------------- next part --------------
> An HTML attachment was scrubbed...
> URL: <
> http://mail.python.org/pipermail/tutor/attachments/20080729/e87468ad/attachment-0001.htm
> >
>
> ------------------------------
>
> Message: 4
> Date: Tue, 29 Jul 2008 01:30:44 +0100
> From: "Alan Gauld" <alan.gauld at btinternet.com>
> Subject: Re: [Tutor] Turtle problem: how to exit the .exe?
> To: tutor at python.org
> Message-ID: <g6lo7r$f9k$1 at ger.gmane.org>
> Content-Type: text/plain; format=flowed; charset="iso-8859-1";
>        reply-type=response
>
>
> "Dick Moores" <rdm at rcblue.com> wrote
>
> >>Tkinter is involved, turtle usesw Tkinter.
> >
> > Yes, I allowed for that when I made the .exe:
> > python Makespec.py -FKc
> > E:\PyInstaller\MyScripts\randomTriangles_wo_named_colorsV13.exe
>
> Means nothing to me, I've never found a need to make an exe
> from a python program... :-)
>
> >>But to exit I would expect the standard Windows exit key
> >>combination (Alt F4) to work fine?
> >
> > Yes, but no. Try it yourself. Please.
> > <
> http://www.rcblue.com/Misc/randomTriangles_wo_named_colorsV13_no_console.exe
> >
>
> Very odd, I assume the python script when executed normally
> doesn't exhibit such odd behaviour? The only way I could stop
> it was to kill the process from Task Manager. Are you sure you
> aren't generating some kind of autostart option in your exe
> building tool?
>
> I have no further ideas...
>
> Alan G.
>
>
>
>
> ------------------------------
>
> Message: 5
> Date: Mon, 28 Jul 2008 19:27:50 -0700
> From: Dick Moores <rdm at rcblue.com>
> Subject: Re: [Tutor] Turtle problem: how to exit the .exe?
> To: tutor at python.org
> Message-ID: <20080729022803.22D811E4004 at bag.python.org>
> Content-Type: text/plain; charset="us-ascii"; format=flowed
>
> At 05:30 PM 7/28/2008, Alan Gauld wrote:
>
> >"Dick Moores" <rdm at rcblue.com> wrote
> >
> >>>Tkinter is involved, turtle usesw Tkinter.
> >>
> >>Yes, I allowed for that when I made the .exe:
> >>python Makespec.py -FKc
> >>E:\PyInstaller\MyScripts\randomTriangles_wo_named_colorsV13.exe
> >
> >Means nothing to me, I've never found a need to make an exe
> >from a python program... :-)
> >
> >>>But to exit I would expect the standard Windows exit key
> >>>combination (Alt F4) to work fine?
> >>
> >>Yes, but no. Try it yourself. Please.
> >><
> http://www.rcblue.com/Misc/randomTriangles_wo_named_colorsV13_no_console.exe
> >
> >
> >Very odd, I assume the python script when executed normally
> >doesn't exhibit such odd behaviour?
>
> When I try to stop the script from running not by a Ctrl+Q on the
> console window, but on the Turtle window, the only way is to use the
> Task Manager.
>
> >The only way I could stop
> >it was to kill the process from Task Manager.
>
> Yes.
>
> >  Are you sure you
> >aren't generating some kind of autostart option in your exe
> >building tool?
>
> Well, since the behavior of the .exe is exactly the same as that of
> the running script, I'd say that isn't happening.
>
> >I have no further ideas...
>
> Well, thanks for trying, Alan.
>
> Dick
>
>
>
>
> ------------------------------
>
> Message: 6
> Date: Mon, 28 Jul 2008 22:48:35 -0500
> From: "Daniel Sarmiento" <dsarmientos at gmail.com>
> Subject: Re: [Tutor] Memory error - how to manage large data sets?
> To: tutor at python.org
> Message-ID:
>        <d356a5240807282048y67164aa8p2398abd4705f3aba at mail.gmail.com>
> Content-Type: text/plain; charset=ISO-8859-1
>
> >(the solution, of course, is to avoid storing all those numbers in the
> >first place)
>
> I tried this:
>
> fib = {0:0,1:1}
> sum = 0
>
> for j in xrange (2,1000000):
>    i = fib[j-1] + fib[j-2]
>    if i % 2 == 0:
>        sum += i
>    fib = {j-1:fib[j-1], j:i}
>
> print sum
>
> I guess it should come up with the right answer (i compared sum to the
> total value for small values in the range and they were the same)
>
> Just storing the value of the sum doesn't use that much memory and
> prints the sum for (2, 1 million) but it takes a long time to compute.
> I know there has to be a way better solution.
>
>
> ------------------------------
>
> Message: 7
> Date: Tue, 29 Jul 2008 09:31:52 +0300
> From: kinuthiA muchanE <muchanek at gmail.com>
> Subject: Re: [Tutor] Tutor Digest, Vol 53, Issue 99
> To: tutor at python.org
> Message-ID: <1217313112.5761.7.camel at www.kinuthia.com>
> Content-Type: text/plain
>
>
> On Tue, 2008-07-29 at 01:12 +0200, tutor-request at python.org wrote:
> > Message: 3
> > Date: Mon, 28 Jul 2008 13:26:13 -0500
> > From: "Daniel Sarmiento" <dsarmientos at gmail.com>
> > Subject: Re: [Tutor] Memory error - how to manage large data sets?
> > To: tutor at python.org
> > Message-ID:
> >         <d356a5240807281126t6bd27dc5y6ef411305a9f5f5a at mail.gmail.com>
> > Content-Type: text/plain; charset=ISO-8859-1
> >
> > Hi
> >
> > I tried to run your code and checked (with top) the memory ussage and
> > it uses more than 2 Gb of memory.
> >
> > I tried to modify the code a little bit to use less memory and came up
> > with this:
> >
> > fib = {0:0,1:1}
> > even = []
> >
> > def fibonacci(x,y):
> >    return x+y
> >
> > for j in xrange (2,1000000):
> >     i = fib[j-1] + fib[j-2]
> >     if i % 2 == 0:
> >         even.append(i)
> >     fib = {j-1:fib[j-1], j:i}
> >
> > total = reduce(fibonacci,even)
> > print total
> >
> > First, I replaced range with xrange.
> > I figured that you only need the last two values in the fibonnaci
> > series to calculate the next value, so I replaced the fib list with a
> > dictionary to only store the last two values instead of the whole
> > series.
> >
> > It looks like the progam still hangs and I did not notice any memory
> > imrovements when running it with 1 000 000
> >
> > Am I wrong thinking that the modifications I made help use less
> > memory?
> >
> I have realised that when you need to work with large numbers, lists are
> slow and tend to gobble up bytes. Another thing try to be as simple as
> possible :). This is how I would approach the problem, without using
> lists. It produces the result instantly, even for large numbers like a
> 100, 000, 000.
>
> def fibonacci()
>    a = 0
>    b = 1
>    evenTotal = 0
>    while a < 100000000:
>        a,b = b,a+b
>        if a%2 == 0:
>            evenTotal += a
>    print evenTotal
>
> Does this help?
> Kinuthia...
>
>
>
> ------------------------------
>
> _______________________________________________
> Tutor maillist  -  Tutor at python.org
> http://mail.python.org/mailman/listinfo/tutor
>
>
> End of Tutor Digest, Vol 53, Issue 100
> **************************************
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/tutor/attachments/20080729/1f77c0a8/attachment-0001.htm>


More information about the Tutor mailing list