adopting Python Style Guide for classes

Hello,

NumPy and SciPy should conform with Guido's style guide as closely as possible:
http://www.python.org/dev/peps/pep-0008/

The only serious divergence that I am aware of between the NumPy and SciPy codebases and the Python recommended standards is in class naming. According to Guido, class names should use the CapWords convention. Most Python projects (e.g., ETS, matplotlib) adhere to the Python naming conventions, and it is confusing that NumPy and SciPy don't.

Currently, both NumPy and SciPy use either lower_underscore_separated or CapWords for class names. By my rough count, NumPy has about 752 classes using CapWords and 440 using lower_underscore_separated, while SciPy has about 256 classes using CapWords and 815 using lower_underscore_separated. Moreover, a fair number of the lower_underscore_separated classes are tests. That a number of classes currently use CapWords despite the NumPy convention to use lower_underscore_separated names may indicate it would make more sense to switch to the Python convention.

I am hoping that most of you agree with the general principle of bringing NumPy and SciPy into compliance with the standard naming conventions. I have already talked to a few people, including Travis, about this switch, and so far everyone supports the change in general. Please let me know if you have any major objections to adopting the Python class naming convention.

Once we have agreed to use CapWords for classes, we will need to decide what to do about our existing class names. Obviously, it is important that we are careful not to break a lot of code just to bring our class names up to standards. So at the very least, we probably won't be able to just change every class name until NumPy 1.1.0 is released.

Here is what I propose for NumPy:

1. Change the names of the TestCase classes to use CapWords. I just checked in a change to allow TestCase classes to be prefixed with either 'test' or 'Test':
   http://projects.scipy.org/scipy/numpy/changeset/4144
   If no one objects, I plan to go through all the NumPy unit tests and change their names to CapWords. Ideally, I would like to get at least this change in before NumPy 1.0.4.
2. Anyone adding a new class to NumPy would use CapWords.
3. When we release NumPy 1.1, we will convert all (or almost all) class names to CapWords.

There is no reason to worry about the exact details of this conversion at this time. I would just like to get a sense of whether, in general, this seems like a good direction to move in. If so, then after we get steps 1 and 2 completed we can start discussing how to handle step 3.

Cheers,

--
Jarrod Millman
Computational Infrastructure for Research Labs
10 Giannini Hall, UC Berkeley
phone: 510.643.4014
http://cirl.berkeley.edu/
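
A rough count like the one in the message above could be reproduced with a short script along these lines. This is only a sketch: `classify` and `count_class_styles` are hypothetical helper names, and the rule used to separate the two styles (a leading upper-case letter and no underscores counts as CapWords) is an assumption, not necessarily how the numbers quoted above were obtained.

```python
import ast
from collections import Counter


def classify(name):
    """Label a class name as 'CapWords', 'lower_underscore', or 'other'."""
    stripped = name.lstrip("_")  # ignore leading underscores on private classes
    if stripped and stripped[0].isupper() and "_" not in stripped:
        return "CapWords"
    if stripped and stripped == stripped.lower():
        return "lower_underscore"
    return "other"


def count_class_styles(source):
    """Count class definitions in a string of Python source by naming style."""
    tree = ast.parse(source)
    names = [node.name for node in ast.walk(tree) if isinstance(node, ast.ClassDef)]
    return Counter(classify(name) for name in names)


# Tiny made-up sample; a real count would read each .py file in the package.
example = '''
class test_matrix: pass
class TestMatrix: pass
class system_info: pass
class Mixed_Case: pass
'''
counts = count_class_styles(example)
print(counts)
```

Running the same function over every source file and summing the counters would give per-package totals, including names that fit neither convention.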

Jarrod Millman wrote:
Hello,
..
Please let me know if you have any major objections to adopting the Python class naming convention.
I don't object.
Once we have agreed to using CapWords for classes, we will need to decide what to do about our existing class names. Obviously, it is important that we are careful to not break a lot of code just to bring our class names up to standards. So at the very least, we probably won't be able to just change every class name until NumPy 1.1.0 is released.
Here is what I propose for NumPy: 1. Change the names of the TestCase class names to use CapWords. I just checked in a change to allow TestCase classes to be prefixed with either 'test' or 'Test': http://projects.scipy.org/scipy/numpy/changeset/4144 If no one objects, I plan to go through all the NumPy unit tests and change their names to CapWords. Ideally, I would like to get at least this change in before NumPy 1.0.4.
It is safe to change all classes in tests to CamelCase.
2. Any one adding a new class to NumPy would use CapWords. 3. When we release NumPy 1.1, we will convert all (or almost all) class names to CapWords. There is no reason to worry about the exact details of this conversion at this time. I just would like to get a sense whether, in general, this seems like a good direction to move in. If so, then after we get steps 1 and 2 completed we can start discussing how to handle step 3.
After fixing the class names in tests, how many classes use camelcase style in numpy/distutils? How many of them are implementation specific and how many of them are exposed to users? I think having these statistics would be essential for making any decisions. E.g., would we need to introduce warnings for the next few releases of numpy/scipy when a camelcase class is used by user code, or not?

Pearu

On Tue, Oct 02, 2007 at 09:12:43AM +0200, Pearu Peterson wrote:
Jarrod Millman wrote:
Hello,
..
Please let me know if you have any major objections to adopting the Python class naming convention.
I don't object.
Me either.
2. Any one adding a new class to NumPy would use CapWords. 3. When we release NumPy 1.1, we will convert all (or almost all) class names to CapWords. There is no reason to worry about the exact details of this conversion at this time. I just would like to get a sense whether, in general, this seems like a good direction to move in. If so, then after we get steps 1 and 2 completed we can start discussing how to handle step 3.
After fixing the class names in tests then how many classes use camelcase style in numpy/distutils? How many of them are implementation specific and how many of them are exposed to users? I think having this statistics would be essential to make any decisions. Eg would we need to introduce warnings for the few following releases of numpy/scipy when camelcase class is used by user code, or not?
In numpy/distutils, there are the classes in the command/* modules (but note that distutils uses the same lower_case convention, so I'd say keep them), cpu_info (none of which are user accessible; I'm working in there now), and system_info (which are documented as user accessible). Poking through the rest, it looks like only the system_info classes are ones that we would expect users to subclass. We could document the lower_case names as deprecated, and alias them to CamelCase versions.

--
David M. Cooke  http://arbutus.physics.mcmaster.ca/dmc/
cookedm@physics.mcmaster.ca
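
The aliasing idea can be sketched as follows. `SystemInfo` and `my_blas_info` are stand-ins defined only for illustration, not the real numpy.distutils code; the point is just that a plain alias keeps existing user subclasses working under the old lower_case name.

```python
class SystemInfo:
    """Hypothetical stand-in for a user-subclassable class with a CapWords name."""

    section = "DEFAULT"

    def get_info(self):
        return {"section": self.section}


# Deprecated spelling, kept as an alias so existing user code does not break.
system_info = SystemInfo


class my_blas_info(system_info):  # old-style user subclass still works
    section = "blas"


info = my_blas_info().get_info()
print(info)
```

A plain alias is silent, so the deprecation would have to be communicated through the documentation, as suggested above.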

Jarrod Millman wrote:
I am hoping that most of you agree with the general principle of bringing NumPy and SciPy into compliance with the standard naming conventions.
+1
3. When we release NumPy 1.1, we will convert all (or almost all) class names to CapWords.
What's the backwards-compatible plan?

- keep the old names as aliases?
- raise deprecation warnings?

What about factory functions that kind of look like they might be classes -- numpy.array() comes to mind. Though maybe using CamelCase for the real classes will help folks understand the difference.

What is a "class" in this case -- with new-style classes, there is no distinction between types and classes, so I guess they are all classes, which means lots of things like:

numpy.float32

etc. etc. etc. are classes. Should they be CamelCase too?

NOTE:

    for i in dir(numpy):
        if type(getattr(numpy, i)) == type(numpy.ndarray):
            print i

Yields 86 type objects.

-Chris

--
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker@noaa.gov

On 10/2/07, Christopher Barker <Chris.Barker@noaa.gov> wrote:
Jarrod Millman wrote:
I am hoping that most of you agree with the general principle of bringing NumPy and SciPy into compliance with the standard naming conventions.
Excellent plan - and I think it will make the code considerably more readable (and writeable).
3. When we release NumPy 1.1, we will convert all (or almost all) class names to CapWords.
What's the backwards-compatible plan?
- keep the old names as aliases?
- raise deprecation warnings?
Both seem good. How about implementing both for the next minor release, with the ability to turn the deprecation warnings off?
What about factory functions that kind of look like they might be classes -- numpy.array() comes to mind. Though maybe using CamelCase for the real classes will help folks understand the difference.
Sounds right to me - factory function as function, class as class.
What is a "class" in this case -- with new-style classes, there is no distinction between types and classes, so I guess they are all classes, which means lots of things like:
numpy.float32
etc. etc. etc. are classes. should they be CamelCase too?
I would vote for CamelCase in this case too. Matthew
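
Combining aliases with deprecation warnings that users can turn off might look roughly like this. It is a sketch built on Python's standard `warnings` module; `FooBar`, `foo_bar`, and `deprecated_alias` are hypothetical names invented for the example, not anything in NumPy.

```python
import warnings


class FooBar:
    """Hypothetical class that has been renamed to CapWords."""

    def __init__(self, data):
        self.data = list(data)


def deprecated_alias(new_cls, old_name):
    """Return a subclass of new_cls that warns when created under old_name."""

    class Alias(new_cls):
        def __init__(self, *args, **kwargs):
            warnings.warn(
                f"{old_name} is deprecated; use {new_cls.__name__} instead",
                DeprecationWarning,
                stacklevel=2,
            )
            super().__init__(*args, **kwargs)

    Alias.__name__ = old_name
    return Alias


foo_bar = deprecated_alias(FooBar, "foo_bar")

# Record warnings to show that the old name warns on use.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    old = foo_bar([1, 2, 3])

# Users who do not want the noise can switch the warnings off.
with warnings.catch_warnings(record=True) as silenced:
    warnings.simplefilter("ignore", DeprecationWarning)
    foo_bar([4, 5, 6])

print(len(caught), len(silenced))
```

Because the alias is a subclass, instances created through the old name still pass `isinstance` checks against the new class, and user subclasses of the old name also trigger the warning when instantiated.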

Matthew Brett wrote:
On 10/2/07, Christopher Barker <Chris.Barker@noaa.gov> wrote:
What is a "class" in this case -- with new-style classes, there is no distinction between types and classes, so I guess they are all classes, which means lots of things like:
numpy.float32
etc. etc. etc. are classes. should they be CamelCase too?
I would vote for CamelCase in this case too.
I would prefer to leave them as they are. While they are implemented as classes, they're usually used as data. Also, they are more similar to builtin types than classes one might write, and Python itself has a convention of leaving these lowercase (e.g. int, float, etc.). -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco

On Tuesday 02 October 2007, Robert Kern wrote:
Matthew Brett wrote:
On 10/2/07, Christopher Barker <Chris.Barker@noaa.gov> wrote:
What is a "class" in this case -- with new-style classes, there is no distinction between types and classes, so I guess they are all classes, which means lots of things like:
numpy.float32
etc. etc. etc. are classes. should they be CamelCase too?
I would vote for CamelCase in this case too.
I would prefer to leave them as they are. While they are implemented as classes, they're usually used as data. Also, they are more similar to builtin types than classes one might write, and Python itself has a convention of leaving these lowercase (e.g. int, float, etc.).
I would prefer to leave them as they are now too. In addition to what Robert is saying, they are very heavily used in regular NumPy programs, and changing everything (both in code and docs) would be rather messy.

Cheers,

--
Francesc Altet    http://www.carabos.com/
Cárabos Coop. V.  Enjoy Data

On 10/2/07, Christopher Barker <Chris.Barker@noaa.gov> wrote:
Jarrod Millman wrote:
I am hoping that most of you agree with the general principle of bringing NumPy and SciPy into compliance with the standard naming conventions.
+1
3. When we release NumPy 1.1, we will convert all (or almost all) class names to CapWords.
What's the backwards-compatible plan?
- keep the old names as aliases?
- raise deprecation warnings?
What about factory functions that kind of look like they might be classes -- numpy.array() comes to mind. Though maybe using CamelCase for the real classes will help folks understand the difference.
I'm not a big fan of this kind of distinction between factory functions and "real" classes based on the concrete types of the objects. In some cases whether an object is a class or a factory function is simply an implementation detail. The real distinction, as I see it, is whether the object in question is meant to be subclassed. Thus an object is conceptually a class if it can be called to create an instance and it can be usefully subclassed. A factory function, on the other hand, is only meant to be called to get an instance, regardless of whether it is implemented as a class or a function.

Of course, in the case of ndarray/array the distinction is clear since array cannot be subclassed, so the minor rant above doesn't apply.

What is a "class" in this case -- with new-style classes, there is no distinction between types and classes, so I guess they are all classes, which means lots of things like:
numpy.float32
etc. etc. etc. are classes. should they be CamelCase too?
Core Python makes an additional distinction between types built into the core (float, int, etc...), which are all lower case, and library classes, which generally use CapWords. So I guess there are two questions, IMO:

1. Can float32 and friends be usefully subclassed? I suspect the answer is no. And in my head at least, these are conceptually mostly marker objects that can also be used as coercion functions, not classes. FWIW.
2. Are these enough like builtin types to leave them alone in any case?

One approach would be to CapWords the superclasses of these that are subclassable, but leave the leaf types alone. For example, looking at float32 and its bases:

- numpy.generic -> numpy.Generic
- numpy.number -> numpy.Number
- numpy.inexact -> numpy.Inexact
- numpy.floating -> numpy.Floating
- numpy.float32 stays the same

This is probably a lot less painful in terms of backwards compatibility.

My $0.02.

NOTE:
    for i in dir(numpy):
        if type(getattr(numpy, i)) == type(numpy.ndarray):
            print i
Yields 86 type objects.
-Chris
--
tim.hochberg@ieee.org
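
The "CapWords the bases, leave the leaf types alone" proposal can be illustrated with stand-in classes. Nothing below is NumPy code: the class bodies are empty placeholders, and the aliases at the end are an assumption about how the old lower-case base names might be kept for backwards compatibility.

```python
class Generic:
    """Hypothetical stand-in for the proposed numpy.Generic."""


class Number(Generic):
    pass


class Inexact(Number):
    pass


class Floating(Inexact):
    pass


class float32(Floating):  # leaf type keeps its lower-case, builtin-like name
    def __init__(self, value=0.0):
        self.value = float(value)


# Old lower-case base names kept as aliases for backwards compatibility.
generic, number, inexact, floating = Generic, Number, Inexact, Floating

# Walk the method resolution order, skipping float32 itself and object.
bases = [cls.__name__ for cls in float32.__mro__[1:-1]]
print(bases)
```

Under this scheme, user code that checks `isinstance(x, floating)` or subclasses `generic` keeps working, while the heavily used leaf names such as float32 never change.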

On Tuesday 02 October 2007, Timothy Hochberg wrote:
One approach would be CapWords the superclasses of these that are subclassable, but leave the leaf types alone. For example, looking at float32 and its bases :
- numpy.generic -> numpy.Generic
- numpy.number -> numpy.Number
- numpy.inexact -> numpy.Inexact
- numpy.floating -> numpy.Floating
- numpy.float32 stays the same
This is probably a lot less painful in terms of backwards compatibility.
Yeah. I second this also.

--
Francesc Altet    http://www.carabos.com/
Cárabos Coop. V.  Enjoy Data
participants (8)
- Christopher Barker
- David M. Cooke
- Francesc Altet
- Jarrod Millman
- Matthew Brett
- Pearu Peterson
- Robert Kern
- Timothy Hochberg