[Tutor] outputting long lines

Magnus Lycka magnus@thinkware.se
Tue Mar 11 04:42:02 2003


Paul Tremblay wrote:
>It seems like Gregor's solution is the best. (See what he wrote in his
>ps.) Would the solution in this email really save a lot of time in
>comparison to Gregor's solution?

It probably won't matter in this case. But don't ask *us* about
*your* performance. Just measure!

The performance gain comes mainly when many strings are
involved, as when you replace:

y = ''
for x in longSequence:
     y += x

with

y = []
for x in longSequence:
     y.append(x)
y = "".join(y)

If it's just a handful of strings, don't bother.
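
For a self-contained version of the two loops above, here is a minimal
sketch. The names concat_loop and join_loop and the sample data are made
up for illustration. Both functions return the same string; they only
differ in how it gets built.

```python
def concat_loop(long_sequence):
    # Build the result by repeated +=; each step may copy the string so far.
    y = ''
    for x in long_sequence:
        y += x
    return y


def join_loop(long_sequence):
    # Collect the pieces in a list and join them once at the end.
    parts = []
    for x in long_sequence:
        parts.append(x)
    return "".join(parts)


sample = [str(i) for i in range(5)]   # hypothetical input
assert concat_loop(sample) == join_loop(sample) == '01234'
```

Since the results are identical, you can pick whichever reads best and
only switch once a measurement says it matters.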

>On Sat, Mar 08, 2003 at 12:11:39PM -0500, Erik Price wrote:
> >
> > One way to do it that doesn't sacrifice performance is to create a list
> > of strings and then join them all together when they need to be treated
> > as a single string.
> >
> > >>> text = [
> > ...  'Here is a ',
> > ...  'really long string ',
> > ...  'that is broken up ',
> > ...  'over multiple lines.'
> > ...        ]
> > >>> text
> > ['Here is a ', 'really long string ', 'that is broken up ', 'over
> > multiple lines.']
> > >>> output = ''.join(text)
> > >>> output
> > 'Here is a really long string that is broken up over multiple lines.'

To measure something like this, wrap what you want to measure in
a function. It also helps to have a small driver function that takes
a number and repeats the test that many times; otherwise a single run
may be too short to register, and you just measure 0.000 every time.
For instance:

 >>> import profile
 >>> def f1():
...     a = ('asd'
...     'asdasd'
...     'asdasdadsad'
...     'asdadasdasdasdasd'
...     'sdfsfsdfsdfsdf'
...     'sadasdasdasd')
...
 >>> def f2():
...     a = ['asd',
...     'asdasd',
...     'asdasdadsad',
...     'asdadasdasdasdasd',
...     'sdfsfsdfsdfsdf',
...     'sadasdasdasd']
...     a = "".join(a)
...
 >>> def m(f,n):
...     for i in xrange(n):
...         f()
...
 >>> profile.run('m(f1,1000)')
          1003 function calls in 0.414 CPU seconds

    Ordered by: standard name

    ncalls  tottime  percall  cumtime  percall filename:lineno(function)
      1000    0.009    0.000    0.009    0.000 <stdin>:1(f1)
         1    0.013    0.013    0.023    0.023 <stdin>:1(m)
         1    0.000    0.000    0.023    0.023 <string>:1(?)
         1    0.391    0.391    0.414    0.414 profile:0(m(f1,1000))
         0    0.000             0.000          profile:0(profiler)

We can bump up n a bit to get more precision.

 >>> profile.run('m(f1,100000)')
          100003 function calls in 2.338 CPU seconds

    Ordered by: standard name

    ncalls  tottime  percall  cumtime  percall filename:lineno(function)
    100000    1.008    0.000    1.008    0.000 <stdin>:1(f1)
         1    1.330    1.330    2.338    2.338 <stdin>:1(m)
         1    0.000    0.000    2.338    2.338 <string>:1(?)
         1    0.001    0.001    2.338    2.338 profile:0(m(f1,100000))
         0    0.000             0.000          profile:0(profiler)


 >>> profile.run('m(f2,100000)')
          100003 function calls in 2.945 CPU seconds

    Ordered by: standard name

    ncalls  tottime  percall  cumtime  percall filename:lineno(function)
    100000    1.560    0.000    1.560    0.000 <stdin>:1(f2)
         1    1.384    1.384    2.945    2.945 <stdin>:1(m)
         1    0.000    0.000    2.945    2.945 <string>:1(?)
         1    0.001    0.001    2.945    2.945 profile:0(m(f2,100000))
         0    0.000             0.000          profile:0(profiler)

In this case, going through a list made f2 take roughly 50% longer than
f1 (1.560 vs. 1.008 seconds of tottime over 100000 calls).
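
If you only want elapsed time rather than a per-function breakdown, the
same repeat-n-times idea works with a plain timer. This is a rough
sketch, assuming Python 3 (time.perf_counter does not exist in the Python
used in the session above); timed is a hypothetical helper, and f1 and f2
are shortened versions of the functions above.

```python
import time


def timed(f, n):
    # Call f() n times and return the total elapsed seconds.
    start = time.perf_counter()
    for _ in range(n):
        f()
    return time.perf_counter() - start


def f1():
    a = ('asd'
         'asdasd')       # adjacent literals, joined by the compiler


def f2():
    a = ['asd',
         'asdasd']
    a = "".join(a)       # joined at run time


for func in (f1, f2):
    print(func.__name__, timed(func, 100000))
```

The numbers will differ from machine to machine, so compare the two
results with each other, not with the figures in this mail.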

But look at a case where we have plenty of strings:

 >>> def f3():
...     a = ""
...     for i in xrange(10000):
...         a += str(i)
...
 >>> profile.run('m(f3,1)')
          4 function calls in 0.218 CPU seconds

    Ordered by: standard name

    ncalls  tottime  percall  cumtime  percall filename:lineno(function)
         1    0.217    0.217    0.217    0.217 <stdin>:1(f3)
         1    0.000    0.000    0.217    0.217 <stdin>:1(m)
         1    0.000    0.000    0.217    0.217 <string>:1(?)
         1    0.001    0.001    0.218    0.218 profile:0(m(f3,1))
         0    0.000             0.000          profile:0(profiler)

 >>> def f4():
...     a = []
...     for i in xrange(10000):
...         a.append(str(i))
...     a = "".join(a)
...
 >>> profile.run('m(f4,1)')
          4 function calls in 0.107 CPU seconds

    Ordered by: standard name

    ncalls  tottime  percall  cumtime  percall filename:lineno(function)
         1    0.106    0.106    0.106    0.106 <stdin>:1(f4)
         1    0.000    0.000    0.106    0.106 <stdin>:1(m)
         1    0.000    0.000    0.106    0.106 <string>:1(?)
         1    0.001    0.001    0.107    0.107 profile:0(m(f4,1))
         0    0.000             0.000          profile:0(profiler)

Here, the list strategy cut the execution time roughly in half
(0.106 vs. 0.217 seconds).
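
Later Python versions ship a timeit module made for this kind of
comparison. The sketch below assumes Python 3; build_by_concat and
build_by_join are illustrative names that mirror f3 and f4. Expect
different numbers on your machine, and note that some interpreters
optimize += on strings, so the gap may turn out smaller than it was here.

```python
import timeit


def build_by_concat(n=10000):
    # Same idea as f3: grow one string with +=.
    a = ""
    for i in range(n):
        a += str(i)
    return a


def build_by_join(n=10000):
    # Same idea as f4: collect pieces, join once.
    parts = []
    for i in range(n):
        parts.append(str(i))
    return "".join(parts)


# Same result either way; only the cost differs.
assert build_by_concat(100) == build_by_join(100)

for func in (build_by_concat, build_by_join):
    # Each measurement calls func 20 times; keep the best of 3 measurements.
    best = min(timeit.repeat(func, number=20, repeat=3))
    print(func.__name__, best)
```

Taking the minimum of several repeats is a common way to reduce noise
from whatever else the machine is doing at the time.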


-- 
Magnus Lycka, Thinkware AB
Alvans vag 99, SE-907 50 UMEA, SWEDEN
phone: int+46 70 582 80 65, fax: int+46 70 612 80 65
http://www.thinkware.se/  mailto:magnus@thinkware.se