[Numpy-discussion] Datetime branch

Travis Oliphant oliphant at enthought.com
Thu Jun 11 15:07:12 EDT 2009


On Jun 11, 2009, at 1:44 PM, Charles R Harris wrote:

>
> The implementation of  PyArray_CanCastSafely illustrates two other  
> points that bother me.
>
> 1) The rules are encoded in the program logic. This makes them  
> difficult to find or to see what they are and requires editing the  
> code to make changes.

I agree that this is all sub-optimal. I didn't do much to fix what
was there with Numeric except add a semi-orthogonal user-defined
approach.

I quite like the generic function concept that was added to the
ufuncs. I'm wondering if most of the functions currently in the *f
member of the data-type structure couldn't be implemented under that
notion instead.

Also, should we attach coercion information to each data-type directly
and add an API to extend the coercion information? I agree that the
"implicit" ordering of the data-types for coercion is wonky, but it
allowed the code from Numeric to be used to dispatch in the ufunc
instead of designing a new approach. Do you have other ideas about
how this might work?

>
> 2) Some of the rules are maintained by the types. That is even more  
> obscure and reminiscent of the "friend" functions in c++ that encode  
> the same sort of thing when the operators are overloaded. I never  
> did like that as a general system ;)

Are you referring to the user-defined data-types? I agree it's
pretty kludgy. Are you envisioning a "global" coercion table? It
seems like this may need to be operation-specific and extensible, so
that new data-types can be added fairly easily.
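
To make the idea concrete, here is a rough sketch of an
operation-specific, extensible coercion table. Everything in it is
hypothetical: the names register_safe_cast and can_cast_safely, the
string type names, and the "rational" type are made up for
illustration and are not existing NumPy API. The point is only that
the rules live in data rather than in program logic.

```python
# Hypothetical registry: operation name -> set of (from_type, to_type)
# pairs that are considered safe for that operation.
_safe_casts = {}


def register_safe_cast(op, from_type, to_type):
    """Record that from_type may be safely cast to to_type for op."""
    _safe_casts.setdefault(op, set()).add((from_type, to_type))


def can_cast_safely(op, from_type, to_type):
    """Look the rule up in the table instead of hard-coding it."""
    if from_type == to_type:
        return True
    return (from_type, to_type) in _safe_casts.get(op, set())


# Rules for built-in types are just entries in the table.
register_safe_cast("add", "int32", "int64")
register_safe_cast("add", "int32", "float64")

# A new (made-up) data-type plugs in without editing the lookup code.
register_safe_cast("add", "int64", "rational")

print(can_cast_safely("add", "int32", "float64"))
print(can_cast_safely("multiply", "int32", "float64"))
```

Keying the table on the operation means "add" and "multiply" can
carry different rules, and the whole rule set can be listed by
printing one dictionary.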

>
> BTW, what is the metadata that is going to be added to the types?  
> What purpose does it serve?

In the date-time case, it holds what frequency the integer in the
data-type represents. There will only be 2 new static data-types,
"Datetime" and "Timedelta", each using 8 bytes.

What those 8 bytes represent will be determined by the metadata  
(years, months, seconds, etc...).
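
As a small illustration of the same 8 bytes meaning different things
depending on the metadata, here is a pure-Python sketch. It is not
the branch's implementation: the decode function, the {"unit": ...}
layout, the unit codes "s" and "D", and the 1970-01-01 epoch are all
assumptions chosen for the example.

```python
import struct
from datetime import datetime, timedelta, timezone

# Assumed epoch for this sketch only.
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def decode(raw, metadata):
    """Interpret 8 raw bytes as a date-time using the unit in metadata."""
    (count,) = struct.unpack("<q", raw)  # one signed 64-bit integer
    unit = metadata["unit"]
    if unit == "s":
        return EPOCH + timedelta(seconds=count)
    if unit == "D":
        return EPOCH + timedelta(days=count)
    raise ValueError("unit not handled in this sketch: %r" % (unit,))


raw = struct.pack("<q", 86400)  # the same 8 bytes in both calls below

print(len(raw))
print(decode(raw, {"unit": "s"}))  # 86400 seconds after the epoch
print(decode(raw, {"unit": "D"}))  # 86400 days after the epoch
```

The stored integer never changes; only the metadata attached to the
data-type decides how it is read.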

But, generally, it will be an extra dictionary that can store anything  
you want (anybody want to define a "float" data-type that uses IBM  
format bits?).  The ufunc machinery needs to change to handle passing  
that information in somehow.   The approaches we take to doing that  
will also hopefully allow us to define ufuncs for string, unicode, and  
void * arrays as well.

Thanks,

-Travis






