[Numpy-discussion] Parameterised dtypes

Nathaniel Smith njs at pobox.com
Tue May 28 16:06:18 EDT 2013


On Fri, May 24, 2013 at 3:12 PM, Richard Hattersley
<rhattersley at gmail.com> wrote:
> Hi all,
>
> I'm in the process of defining some new dtypes to handle non-physical
> calendars (such as the 360-day calendar used in the climate modelling
> world). This is all going fine[*] so far, but I'd like to know a little bit
> more about how much is ultimately possible.
>
> The PyArray_Descr members `metadata` and `c_metadata` allow run-time
> parametrisation, but is it possible to hook into the dtype('...') parsing
> mechanism to supply those parameters? Or is there some other dtype mechanism
> for supplying parameters?
>
> As an example, would it be possible to supply month lengths?
>>>> a = np.zeros(n, dtype='my_date[34,33,31,30,30,29,29,30,31,32,34,35]')
>
> Or is the intended use of parametrisation more like:
>>>> weird = my_stuff.make_dtype([34,33,31,30,30,29,29,30,31,32,34,35])
>>>> a = np.zeros(n, dtype=weird)

I don't think there's any "intended use" really. AFAICT numpy was
originally designed with the assumption that there was a fixed set of
stateless dtypes, and then the ability to add new dtypes, the
ability to add state (parametrize them), etc., have gradually been
kluged in as needed to solve whatever immediate problem someone faced.
E.g., dtype callbacks and ufuncs don't in general get access to the
dtype object, so they can't access whatever parameters exist, and the
builtin dtypes that *are* parametrized (strings, structs, etc.) all
have special case code scattered all around numpy.
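
To make "builtin dtypes that are parametrized" concrete, here is a
minimal sketch of how those parameters show up on the Python side.
The string and struct cases are the ones named above; datetime64 is
added here only as one more illustration, and the field names are
made up for the example.

```python
import numpy as np

# Fixed-width byte strings: the width is a parameter of the dtype.
s10 = np.dtype("S10")
print(s10.itemsize)

# datetime64 carries a unit parameter (an extra example, not one named above).
d = np.dtype("datetime64[D]")
print(np.datetime_data(d))

# Structs are parametrized by their field list (field names are illustrative).
rec = np.dtype([("year", "i4"), ("day", "i2")])
print(rec.names, rec.itemsize)
```

Each of these exposes its parameters through a different attribute or
helper, which is one visible symptom of the special-casing.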

You don't even need 'metadata' or 'c_metadata' -- this is Python, we
already have a totally standard way to add new fields, just subclass
the dumb thing. Instead we have this baffling system of 'NpyAuxData'
which invents its own tiny little refcounted object system from
scratch, and 'issubdtype' which invents its own way of representing
inheritance hierarchies independent of Python's object system, and so
forth.

Anyway!

1) No, you can't hook into the dtype string parser. Though, are you
sure you really want to? Surely it's nicer to use Python syntax
instead of inventing a new syntax and then having to write a parser
for it from scratch?

2) I have some vague plans worked out to fix all this so dtypes are
just ordinary Python objects, but haven't written them down yet due to a
combination of lack of time to do so, and lack of anyone with time to
actually implement the plan even if it were written down. I mention
this just in case someone wants to volunteer, which would move it up
my stack.
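
Going back to point 1, the "use Python syntax" route could be sketched
roughly as below. This is only an illustration under assumptions:
make_calendar_dtype is a hypothetical helper (standing in for the
my_stuff.make_dtype from the question), int64 is an arbitrary choice of
storage type, and the 'month_lengths' key is made up. It attaches the
parameters through the existing 'metadata' argument of np.dtype, and
does nothing to teach numpy calendar arithmetic.

```python
import numpy as np


def make_calendar_dtype(month_lengths):
    """Hypothetical factory: an int64 dtype tagged with month lengths."""
    lengths = tuple(int(n) for n in month_lengths)
    if len(lengths) != 12:
        raise ValueError("expected 12 month lengths")
    # 'metadata' is an existing np.dtype keyword; the key name is an assumption.
    return np.dtype(np.int64, metadata={"month_lengths": lengths})


weird = make_calendar_dtype([34, 33, 31, 30, 30, 29, 29, 30, 31, 32, 34, 35])
a = np.zeros(4, dtype=weird)

# The parameters are readable from the dtype object itself.
print(weird.metadata["month_lengths"])
print(sum(weird.metadata["month_lengths"]))  # days per year: 378
```

As noted above, ufuncs and dtype callbacks generally don't get the
dtype object, so metadata like this carries the parameters around but
doesn't give them any behaviour.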

-n


