[Tutor] [OT] Best Practices for Scientific Computing

Albert-Jan Roskam fomcl at yahoo.com
Tue Dec 30 15:10:40 CET 2014

On Tue, Dec 30, 2014 12:34 PM CET Alan Gauld wrote:

>On 29/12/14 23:50, Patti Scott wrote:
>> Could someone clarify "Modularize code rather than copying and pasting?"
>Joel and Danny have given the basic answer in terms of putting
>loose code into functions. But it goes a little further too.
>You can be tempted to copy/paste functions because they almost
>do what you want but not quite (maybe the take a list of values
>and you want to use a dictionary). Its easy to copy the function
>and change all the numerical indexes into keys and use the
>dictionary. But it would be better to change the original function
>so that it can work with both, if possible. This usually involves
>representing the variations in behaviour as parameters in the
>function (usually with defaults to retain the original semantics).

But isn't there a point where you break the 'law' that a function should have one and only one goal? At  a certain point you end up with a big fat juicy function that not only takes dicts and lists, but also scratches your back, does the laundry,...

>Another case might be where you have a bunch of functions that operate on a global data structure. Then you want to have a second similar data structure but you can't use the functions on it because they access
>the existing global. You could copy/paste the functions and change the data reference. But it would be better to either add the data structure reference to the functions as a parameter, or you create a class that has the data as an internal instance attribute.
>Finally, you might have some code in a program that you want to use in another program. But you can't just import the first program because
>it runs all sorts of other stuff you don't need. So you just copy
>out the functions you want. Or you could put the loose code into a function (main() usually) and call that within a

Say you have a file 'helpers.py' that contains functions that you might use across projects. How do you only display the functions that you actually used in your (Sphinx) documentation of a particular project?

>if __name__ === "__main__": main()
>line in your original program, turning it into a reusable module
>which you can now safely import.
>So modularize can have several meanings, it applies to any
>mechanism used to bundle up code for reuse. The point being that reusing code, whether as a function, module or class, is more
>flexible and reliable than copy/pasting it.
>Note that reuse is not universally a good thing, it does
>have negative aspects too, in test and maintenance costs
>and, often, in performance impact. 

I often use python for data analysis. I would be afraid that a modification to a generic function for project B would affect the results of project A (which may be long-running analysis). Hmmm, maybe 'git submodule' would be useful here.

But in the vast majority
>of cases the positives vastly outweigh the negatives.
>-- Alan G
>Author of the Learn to Program web site
>Follow my photo-blog on Flickr at:
>Tutor maillist  -  Tutor at python.org
>To unsubscribe or change subscription options:

More information about the Tutor mailing list