On Tue, Jan 3, 2012 at 12:46 PM, Wes McKinney <wesmckinn@gmail.com> wrote:
On Tue, Jan 3, 2012 at 1:06 PM, Jim Vickroy <jim.vickroy@noaa.gov> wrote:
On 1/3/2012 10:46 AM, Ognen Duzlevski wrote:
Hello,
I am playing with adding an enum dtype to numpy (to get my feet wet in numpy, really). I have looked at https://github.com/martinling/numpy_quaternion and I feel comfortable with my understanding of adding a simple type to numpy in technical terms.
I am mostly a C programmer and have programmed in Python, but not at the level where my code would be considered "pretty" or maybe even "pythonic". I know enums from C and have browsed around a few Python enum implementations online. Most of them use hash tables or lists to associate names with numbers - these approaches just feel "heavy" to me.
What would be a proper "numpy approach" to this? I am looking mostly for direction and advice as I would like to do the work myself :-)
Any input appreciated :-)
Ognen
Does "enumerate" (http://docs.python.org/library/functions.html#enumerate) work for you?
That's not exactly what he means. The R lingo for this concept is "factor" or, a bit more commonly, "categorical variable":
http://stat.ethz.ch/R-manual/R-patched/library/base/html/factor.html
FWIW R's factor type is implemented using hash tables. I do the same in pandas.
- Wes
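For readers following along, here is a minimal sketch of the hash-table approach to a categorical variable described above: each distinct label gets a small integer code, and the array stores only the codes. The function name `factorize` and the sample labels are illustrative assumptions; this is not the actual R or pandas implementation, only the general idea.

```python
import numpy as np

def factorize(values):
    """Map each distinct label to an integer code, in order of first appearance.

    Hypothetical helper for illustration; not the pandas or R implementation.
    """
    table = {}    # hash table: label -> integer code
    levels = []   # reverse mapping: integer code -> label
    codes = np.empty(len(values), dtype=np.intp)
    for i, label in enumerate(values):
        code = table.get(label)
        if code is None:
            # First time this label is seen: assign the next free code.
            code = len(levels)
            table[label] = code
            levels.append(label)
        codes[i] = code
    return codes, levels

labels = ["low", "high", "low", "medium", "high"]
codes, levels = factorize(labels)
print(codes)   # [0 1 0 2 1]
print(levels)  # ['low', 'high', 'medium']
```

The original labels can be recovered with `[levels[c] for c in codes]`, so the integer array plus the short list of levels carries the same information as the full list of strings.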
Wes,
You are right, "categorical variable" is what I am after. Thanks for the pointer, I will go the klib route you suggested and see what comes out. I may be a bit "old-fashioned" in the sense that adding dependencies on external libraries is something I am reluctant to do - this is why I said using hashes may have felt a bit "heavy". But that may be my shortcoming :-)
Ognen