mysql -> record array

Tim Hochberg tim.hochberg at ieee.org
Tue Nov 14 19:44:48 EST 2006


John Hunter wrote:
>>>>>> "Erin" == Erin Sheldon <erin.sheldon at gmail.com> writes:
>>>>>>             
>
>     Erin> The question I have been asking myself is "what is the
>     Erin> advantage of such an approach?".  It would be faster, but by
>
> In the use case that prompted this message, the pull from mysql took
> almost 3 seconds, and the conversion from lists to numpy arrays took
> more than 4 seconds.  We have a list of about 500000 2-tuples of
> floats.
>   
I'm no database user, but a glance at the docs seems to indicate 
that you can get your data via an iterator (by iterating over the cursor 
or some such db mumbo jumbo) rather than slurping up the whole list 
at once. If so, then you'll save a lot of memory by passing the iterator 
straight to fromiter. It may even be faster, who knows.

Accessing the db via the iterator could be a performance killer, but 
it's almost certainly worth trying as it could save a few megabytes of 
storage and that in turn might speed things up.
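
Roughly what I have in mind is sketched below. It is only a sketch under 
some assumptions: sqlite3 stands in for the MySQL connection so the 
example runs on its own, and the table name "samples" and its columns 
are made up for illustration. The only thing it relies on is that 
iterating over the cursor yields one row tuple at a time, which I have 
not checked against the MySQL driver.

```python
import itertools
import sqlite3

import numpy

# Assumption: sqlite3 stands in for the MySQL connection. Any cursor that
# yields row tuples when iterated should be usable the same way.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE samples (a REAL, b REAL)")  # hypothetical table
conn.executemany(
    "INSERT INTO samples VALUES (?, ?)",
    [(float(i), i * 0.5) for i in range(1000)],
)

cursor = conn.execute("SELECT a, b FROM samples ORDER BY a")

# Flatten the stream of 2-tuples into a single stream of floats. fromiter
# consumes it one value at a time, so the full list of tuples is never
# built in memory.
flat = itertools.chain.from_iterable(cursor)
y = numpy.fromiter(flat, dtype=float).reshape(-1, 2)

print(y.shape)
```

If the number of rows is known ahead of time, passing count=2*nrows to 
fromiter lets it allocate the result once instead of growing it as it 
goes.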

-tim

> Digging in a little bit, we found that numpy is about 3x slower than
> Numeric here
>
>   peds-pc311:~> python test.py
>   with dtype: 4.25 elapsed seconds
>   w/o dtype 5.79 elapsed seconds
>   Numeric  1.58 elapsed seconds
>   24.0b2
>   1.0.1.dev3432
>
> Hmm... So maybe the question is -- is there some low hanging fruit
> here to get numpy speeds up?
>
> import time
> import numpy
> import numpy.random
> rand = numpy.random.rand
>
> x = [(rand(), rand()) for i in xrange(500000)]
> tnow = time.time()
> y = numpy.array(x, dtype=numpy.float_)
> tdone = time.time()
> print 'with dtype: %1.2f elapsed seconds'%(tdone - tnow)
>
> tnow = time.time()
> y = numpy.array(x)
> tdone = time.time()
> print 'w/o dtype %1.2f elapsed seconds'%(tdone - tnow)
>
> import Numeric
> tnow = time.time()
> y = Numeric.array(x, Numeric.Float)
> tdone = time.time()
> print 'Numeric  %1.2f elapsed seconds'%(tdone - tnow)
>
> print Numeric.__version__
> print numpy.__version__
>
> -------------------------------------------------------------------------
> Take Surveys. Earn Cash. Influence the Future of IT
> Join SourceForge.net's Techsay panel and you'll get the chance to share your
> opinions on IT & business topics through brief surveys - and earn cash
> http://www.techsay.com/default.php?page=join.php&p=sourceforge&CID=DEVDEV
> _______________________________________________
> Numpy-discussion mailing list
> Numpy-discussion at lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/numpy-discussion
>
>
>   







More information about the NumPy-Discussion mailing list