[Tutor] How to find optimisations for code
cs at cskk.id.au
Fri Oct 19 21:26:16 EDT 2018
On 19Oct2018 23:07, Alan Gauld <alan.gauld at yahoo.co.uk> wrote:
>On 19/10/18 17:12, Pat Martin wrote:
>> TLDR; How do you figure out if code is inefficient (if it isn't
>> necessarily obvious) and how do you find a more efficient solution?
>Others have addressed most of the issues but I'd just add the caveat
>that you can spend forever trying to make your code "more efficient".
>Usually, in the real world, you don't need to.
>So how do you figure out if your code is inefficient?
>Run it and if it doesn't work fast enough for your purpose then
>try to speed it up. If it does then move onto the next project.
>> how do I go about finding a more efficient solution?
>Google, a good reference book and experience.
Particularly, get some knowledge of the cost of various algorithms and
data structures. There are outright inefficient things to do, but many
others have costs and benefits: things fast in time but costly in space,
and vice versa, and things efficient for some kinds of data but known to
be bad for others.
Have a quick read of the Wikipedia article on big O notation, which talks
about a common notation for costs. Skip the formal section with the math
and go to the "Orders of Common Functions" section.
You'll find people talk about big O notation a lot with algorithms and
their efficiency. For example, a dict _lookup_ is O(1): constant time
per lookup. But that doesn't mean always use a dict, because there's a
cost to creating one in the first place. What you use it for matters.
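To make that concrete, here is a small sketch (variable names are mine)
comparing an O(n) membership test on a list against an O(1) lookup in a
set built from the same data. Note the set has a one-off O(n) build cost,
which only pays for itself if you do many lookups afterwards:

    import timeit

    # Membership in a list is O(n): Python scans the elements one by one.
    # Membership in a set (or dict) is O(1) on average: a hash lookup.
    items = list(range(100_000))
    items_set = set(items)  # one-off O(n) build cost

    # Look for an element near the end of the list, the worst case.
    list_time = timeit.timeit(lambda: 99_999 in items, number=100)
    set_time = timeit.timeit(lambda: 99_999 in items_set, number=100)

    print(f"list membership: {list_time:.6f}s")
    print(f"set membership:  {set_time:.6f}s")

On my understanding the set lookup should be dramatically faster per
test, but for a single lookup the build cost would swamp the saving -
what you use the structure for matters.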
>But always, always, measure because there has been more
>time wasted on unnecessary optimisations than on any other
>feature of programming.
I just want to add a little nuance to all of this, because to me it
reads a little like (a) if it works and you get things done in a timely
manner it is efficient and (b) if it isn't fast enough you can always
make it faster. Alan hasn't actually said that, but one could get that
impression.
To my mind:
Alan's point about "fast enough" is perfectly valid: in the real world,
if your problem has been solved then further work on efficiency might
cost more to implement than you gain after doing it.
His other point about measurement is also very important. It is possible
to guess incorrectly about where performance problems lie, particularly
with higher level languages because their internals have costs not
apparent to the eye (because you can't see the internals). So from an
engineering point of view, it is important to measure where a
problematic programme spends its time and devote effort to those parts
consuming the most time.
The standard example is the tight loop:
    for a in sequence1:
        for b in sequence2:
            do thing A
        do thing B
The cost of "do thing A" is more important than the cost of "do thing B"
because A likely runs many, many times more often than B. So even if B
has some known inefficiency you can see, which will take some work to
improve, effort spent on A will probably be more useful (if that's
possible).
With a complex programme this isn't always obvious; in the example above
the two pieces of code are side by side.
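You can see the imbalance just by counting executions. A small sketch
(the 1000-element sequences are my own invented stand-ins):

    # Count how often each "thing" runs in the nested loop above.
    sequence1 = range(1000)
    sequence2 = range(1000)

    count_a = count_b = 0
    for a in sequence1:
        for b in sequence2:
            count_a += 1  # "do thing A": once per (a, b) pair
        count_b += 1      # "do thing B": once per a

    print(count_a)  # 1000000
    print(count_b)  # 1000

A runs a thousand times for every run of B, so shaving even a tiny cost
off A beats a large saving in B.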
The point about measurement is that profiling tools are useful here:
when you _don't_ have an obvious primary target to improve but do need
to improve efficiency, a profiling tool can measure where a programme
spends its time in aggregate. That can point you at the code most worth
inspecting.
It at least gets you objective information about where your programme
spends its time. It is limited by the data you give your programme: toy
example input data are not as good as real world data.
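For Python, the standard library's cProfile module does exactly this
aggregate measurement. A minimal sketch - the slow_part/fast_part
functions are invented stand-ins for real workload code:

    import cProfile
    import io
    import pstats

    def slow_part():
        # A deliberately wasteful computation standing in for a hotspot.
        return sum(i * i for i in range(50_000))

    def fast_part():
        return 42

    def programme():
        total = 0
        for _ in range(20):
            total += slow_part()
            total += fast_part()
        return total

    profiler = cProfile.Profile()
    profiler.enable()
    programme()
    profiler.disable()

    # Report the functions sorted by cumulative time; slow_part should
    # dominate, showing where optimisation effort would actually pay off.
    stream = io.StringIO()
    pstats.Stats(profiler, stream=stream).sort_stats("cumulative").print_stats(5)
    print(stream.getvalue())

You can also run a whole script under the profiler with
"python -m cProfile -s cumulative yourscript.py" without touching the
code, which is often the easier first step.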
Finally, some things are as efficient as they get. You _can't_ always
make things even more efficient.
Cameron Simpson <cs at cskk.id.au>