[Numpy-discussion] Parameterised dtypes

David Cournapeau cournape at gmail.com
Wed May 29 07:01:34 EDT 2013


On Tue, May 28, 2013 at 9:06 PM, Nathaniel Smith <njs at pobox.com> wrote:
> On Fri, May 24, 2013 at 3:12 PM, Richard Hattersley
> <rhattersley at gmail.com> wrote:
>> Hi all,
>>
>> I'm in the process of defining some new dtypes to handle non-physical
>> calendars (such as the 360-day calendar used in the climate modelling
>> world). This is all going fine[*] so far, but I'd like to know a little bit
>> more about how much is ultimately possible.
>>
>> The PyArray_Descr members `metadata` and `c_metadata` allow run-time
>> parametrisation, but is it possible to hook into the dtype('...') parsing
>> mechanism to supply those parameters? Or is there some other dtype mechanism
>> for supplying parameters?
>>
>> As an example, would it be possible to supply month lengths?
>> >>> a = np.zeros(n, dtype='my_date[34,33,31,30,30,29,29,30,31,32,34,35]')
>>
>> Or is the intended use of parametrisation more like:
>> >>> weird = my_stuff.make_dtype([34,33,31,30,30,29,29,30,31,32,34,35])
>> >>> a = np.zeros(n, dtype=weird)
>
> I don't think there's any "intended use" really. AFAICT numpy was
> originally designed with the assumption that there was a fixed set of
> stateless dtypes, and then the ability to add new dtypes, the
> ability to add state (parametrize them), etc., have gradually been
> kluged in as needed to solve whatever immediate problem someone faced.
> E.g., dtype callbacks and ufuncs don't in general get access to the
> dtype object, so they can't access whatever parameters exist, and the
> builtin dtypes that *are* parametrized (strings, structs, etc.) all
> have special case code scattered all around numpy.
>
> You don't even need 'metadata' or 'c_metadata' -- this is Python, we
> already have a totally standard way to add new fields, just subclass
> the dumb thing. Instead we have this baffling system of 'NpyAuxData'
> which invents its own tiny little refcounted object system from
> scratch, and 'issubdtype' which invents its own way of representing
> inheritance hierarchies independent of Python's object system, and so
> forth.
>
> Anyway!
>
> 1) No, you can't hook into the dtype string parser. Though, are you
> sure you really want to? Surely it's nicer to use Python syntax
> instead of inventing a new syntax and then having to write a parser
> for it from scratch?
>
> 2) I have some vague plans worked out to fix all this so dtypes are
> just ordinary python objects, but haven't written it down yet due to a
> combination of lack of time to do so, and lack of anyone with time to
> actually implement the plan even if it were written down. I mention
> this just in case someone wants to volunteer, which would move it up
> my stack.
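
For reference, the second style in Richard's question (build the
parametrised dtype with a Python call rather than a dtype string) can be
roughly sketched with the Python-level `metadata` argument of `np.dtype`.
This is only an illustration under assumptions: `make_dtype` and the
`month_lengths` key are hypothetical names, the storage type (int64) is
arbitrary, and this is not the C-level custom dtype Richard is writing.

```python
import numpy as np

def make_dtype(month_lengths):
    """Hypothetical factory: an int64 dtype tagged with month lengths."""
    if len(month_lengths) != 12:
        raise ValueError("expected 12 month lengths")
    # The metadata dict is attached to the dtype object itself; NumPy does
    # not interpret it, and it is not guaranteed to survive operations.
    return np.dtype(np.int64, metadata={"month_lengths": tuple(month_lengths)})

weird = make_dtype([34, 33, 31, 30, 30, 29, 29, 30, 31, 32, 34, 35])
a = np.zeros(3, dtype=weird)

# Read the parameters back from the dtype object.
year_length = sum(weird.metadata["month_lengths"])
```

Nothing here gives ufuncs or dtype callbacks access to the parameters,
which is the limitation Nathaniel describes above.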

Nathan, will you be at the SciPy conference this year? That's something
I'd like to improve/refactor myself as well.

I have not thought much about making dtypes ordinary objects yet, but is
what you had in mind 'mostly' backward incompatible?

David


