[Numpy-discussion] Style guide for numpy code?

Fri May 10 03:30:46 EDT 2019

Hi Joe,

Thanks for sharing!

I'm going to use your handout as a base for my numerical computing classes
(with an appropriate citation, of course :-)).

Thu, May 9, 2019, 21:19 Joe Harrington <jh at physics.ucf.edu>:

> I have a handout for my PHZ 3150 Introduction to Numerical Computing
> course that includes some rules:
>
> (a) All integer-valued floating-point numbers should have decimal points
> after them. For example, if you have a time of 10 sec, do not use
>
> y = np.e**10 # sec
>
> use
>
> y = np.e**10. # sec
>
> instead.  For example, an item count is always an integer, but a distance
> is always a float.  A decimal in the range (-1,1) must always have a zero
> before the decimal point, for readability:
>
> x = 0.23 # Right!
>
> x = .23 # WRONG
>
> The purpose of this one is simply to build the decimal-point habit.  In
> Python it's less of an issue now, but sometimes code is translated, and
> integer division is still out there.  For that reason, in other languages,
> it may be desirable to use a decimal point even for counts, unless integer
> division is wanted.  Make a comment whenever you intend integer division
> and the language uses the same symbol (/) for both kinds of division.
>
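
For anyone skimming, a minimal sketch of why the habit in (a) matters, assuming Python 3 semantics. The names n_items and n_bins are made up for illustration:

```python
# Hypothetical values, chosen only to show the two kinds of division.
n_items = 7
n_bins = 2

ratio = n_items / n_bins       # true division in Python 3, gives a float
per_bin = n_items // n_bins    # integer (floor) division, intended here

# Floor division rounds toward negative infinity, not toward zero.
neg = -7 // 2

print(ratio, per_bin, neg)
```

In Python 2, or in C or Fortran, the "/" line would silently truncate when both operands are integers, which is the translation hazard the rule guards against.
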
> (b) Use spaces around binary operations and relations (=<>+-*/). Put a
> space after “,”. Do not put space around “=” in keyword arguments, or
> around “ ** ”.
>
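
A short sketch of rule (b) in practice. The function scaled_area and its arguments are hypothetical, invented only for this illustration:

```python
import math

def scaled_area(r, scale=1.):
    """Return the area of a circle of radius r, multiplied by scale."""
    # Spaces around "*", none around "**".
    return scale * math.pi * r**2

# Space after the comma, no spaces around "=" in the keyword argument.
area = scaled_area(2., scale=3.)
print(area)
```
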
> (c) Do not put plt.show() in your homework file! You may put it in a
> comment if you like, but it is not necessary. Just save the plot. If you
> say
>
> plt.ion()
>
> plots will automatically show while you are working.
>
> (d) Use:
>
> import matplotlib.pyplot as plt
>
> NOT:
>
> import matplotlib.pylab as plt
>
> (e) Keep lines to 80 characters, max, except in rare cases that are well
> justified, such as very long strings. If you make comments on the same
> line as code, keep them short or break them over more than a line:
>
> code = code2   # set code equal to code2
>
> # Longer comment requiring much more space because
> # I'm explaining something complicated.
> code = code2
>
> code = code2   # Another way to do a very long comment,
>                # like this one, which runs over more than
>                # one line.
>
> (f) Keep blocks of similar lines internally lined up on decimals,
> comments, and = signs.  This makes them easier to read and verify.  There
> will be some cases when this is impractical.  Use your judgment (you're not
> a computer, you control the computer!):
>
> x    =   1.      # this is a comment
> y    = 378.2345  # here's another
> fred = chuck     # note how the decimals, = signs, and
>                  # comments line up nicely...
> alacazamshmazooboloid = 2721 # but not always!
>
> (g) Put the units and sources of all values in comments:
>
> t_planet = 523.     # K, Smith and Jones (2016, ApJ 234, 22)
>
> (h) I don't mean to start a religious war, but I emphasize the alignment
> of similar adjacent code lines to make differences pop out and reduce the
> likelihood of bugs.  For example, it is much easier to verify the
> correctness of:
>
> a     = 3 * x + 3 * 8. *     short    - 5. * np.exp(np.pi * omega * t)
> a_alt = 3 * x + 3 * 8. * anotshortvar - 5. * np.exp(np.pi * omega * t)
>
> than:
>
> a = 3 * x + 3 * 8. * short - 5. * np.exp(np.pi * omega * t)
> a_altvarname = 3 * x + 3*9*anotshortvar - 5. * np.exp(np.pi * omega * i)
>
> (i) Assign values to meaningful variables, and use them in formulae and
> functions:
>
> ny = 512
> nx = 512
> image = np.zeros((ny, nx))
> expr1 = ny * 3
> expr2 = nx * 4
>
> Otherwise, later on when you upgrade to 2560x1440 arrays, you won't know
> which of the 512s are in the x direction and which are in the y direction.
> Or, the student you (now a senior researcher) assign to code the upgrade
> won't!  Also, it reduces bugs arising from the order of arguments to
> functions if the args have meaningful names.  This is not to say that you
> should assign all numbers to variables.  This is fine:
>
> circ = 2 * np.pi * r
>
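
To make the point in (i) concrete, here is a small sketch with deliberately unequal sizes. The sizes 3 and 5 are hypothetical, picked so that a swapped axis would be visible:

```python
import numpy as np

ny = 3    # rows (y direction); small hypothetical size
nx = 5    # columns (x direction); unequal to ny on purpose

image = np.zeros((ny, nx))

# numpy shapes are (rows, columns), so the y size comes first.
print(image.shape)

# Named dimensions keep later expressions unambiguous.
row_sums = image.sum(axis=1)    # one value per row, length ny
col_sums = image.sum(axis=0)    # one value per column, length nx
```

With ny and nx both written as a literal 512, none of these lines would tell you which axis is which.
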
> (j) All functions assigned for grading must have full docstrings in
> numpy's format, as well as internal comments.  Utility functions not
> requested in the assignment and that the user will never see can have
> reduced docstrings if the functions are simple and obvious, but at least
> give the one-line summary.
>
> (k) If you modify an existing function, you must either make a Git entry
> or, if it is not under revision control, include a Revision History section
> in your docstring and record your name, the date, the version number, your
> email, and the nature of the change you made.
>
> (l) Choose variable names that are meaningful and consistent in style.
> Document your style either at the head of a module or in a separate text
> file for the project.  For example, if you use CamelCaps with initial
> capital, say that.  If you reserve initial capitals for classes, say that.
> If you use underscores for variable subscripts and camelCaps for the base
> variables, say that.  If you accept some other style and build on that, say
> that.  There are too many good reasons to have such styles for only one to
> be the community standard.  If certain kinds of values should get the same
> variable or base variable, such as fundamental constants or things like
> amplitudes, say that.
>
> (m) It's best if variables that will appear in formulae are short, so more
> terms can fit in one 80-character line.
>
> Overall, having and following a style makes code easier to read.  And, as
> an added bonus, if you take care to be consistent, you will write slower,
> view your code more times, and catch more bugs as you write them.  Thus,
> for codes of any significant size, writing pedantically commented and
> aligned code is almost always faster than blast coding, if you include
> debugging time.
>
> Did you catch both bugs in item h?
>
> --jh--
>
> On 5/9/19 11:25 AM, Chris Barker - NOAA Federal <chris.barker at noaa.gov>
> wrote:
>
> Do any of you know of a style guide for computational / numpy code?
>
> I don't mean code that will go into numpy itself, but rather, users' code
> that uses numpy (and scipy, and...)
>
> I know about (am a proponent of) PEP8, but it doesn’t address the unique
> needs of scientific programming.
>
> This is mostly about variable names. In scientific code, we often want:
>
> - variable names that match the math notation, so single-character names,
> maybe upper or lower case to mean different things (in ocean wave
> mechanics, often “h” is the water depth, and “H” is the wave height)
>
> - to distinguish between scalar, vector, and matrix values; often
> UpperCase means an array or matrix, for instance.
>
> But despite (or because of) these unique needs, a style guide would be
> really helpful.
>
> Anyone have one? Or even any notes on what you do yourself?
>
> Thanks,
> -CHB
>
>
>
>
> --
>
> Christopher Barker, Ph.D.
> Oceanographer
>
> Emergency Response Division
> NOAA/NOS/OR&R            (206) 526-6959   voice
> 7600 Sand Point Way NE   (206) 526-6329   fax
> Seattle, WA  98115       (206) 526-6317   main reception
>
> Chris.Barker at noaa.gov
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion
>