[Numpy-discussion] Datetime again

Nathaniel Smith njs at pobox.com
Thu Jan 22 15:58:52 EST 2015


On Thu, Jan 22, 2015 at 3:18 PM, Charles R Harris
<charlesr.harris at gmail.com> wrote:
>
>
> On Thu, Jan 22, 2015 at 8:08 AM, Charles R Harris
> <charlesr.harris at gmail.com> wrote:
>>
>>
>>
>> On Thu, Jan 22, 2015 at 7:54 AM, Nathaniel Smith <njs at pobox.com> wrote:
>>>
>>> On Thu, Jan 22, 2015 at 2:51 PM, Charles R Harris
>>> <charlesr.harris at gmail.com> wrote:
>>> > Hi All,
>>> >
>>> > I'm playing with the idea of building a simplified datetime class on
>>> > top of
>>> > the current numpy implementation. I believe Pandas does something like
>>> > this,
>>> > and blaze will (does?) have a simplified version. The reason for the
>>> > new
>>> > class would be to have an easier, and hopefully more portable, API that
>>> > can
>>> > be implemented in Python, and maybe pushed down into C when things
>>> > settle.
>>>
>>> When you say "datetime class" what do you mean? A dtype? An ndarray
>>> subclass? A python class representing a scalar datetime that you can
>>> put in an object array? ...?
>>
>>
>> I was thinking an ndarray subclass that is based on a single datetime
>> type, but part of the reason for this post is to elicit ideas. I'm
>> influenced by Mark's discussion apropos blaze.  I thought it easier to
>> start such a project in python, as it is far easier for people interested in
>> the problem to work with.
>
>
> And if I had my druthers, it would use quad precision floating point at its
> heart. The 64 bits of long long really isn't enough and leads to all sorts
> of compromises. But that is probably a pipe dream.

I guess there are lots of options -- e.g. 32-bit day + 64-bit
time-of-day (I think that would give 11.8 million years at
10-femtosecond precision?). Figuring out which clock this is on
matters a lot more though (e.g. how to handle leap-seconds in absolute
and relative times -- is adding 1 day always the same as adding 24 *
60 * 60 seconds?).
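
A quick back-of-the-envelope check of those figures, plus the
day-vs-seconds question as numpy's current datetime64 answers it. This
is only a sketch: the constant and variable names are made up for
illustration, a year is assumed to be 365.25 days, and the 32-bit day
count is treated as unsigned. The sample date is one whose UTC day ends
with an inserted leap second.

```python
import numpy as np

DAYS_PER_YEAR = 365.25           # assumption: Julian year
SECONDS_PER_DAY = 24 * 60 * 60   # 86400, i.e. no leap second

# Range covered by a 32-bit day counter (treated as unsigned here).
range_years = 2**32 / DAYS_PER_YEAR

# Finest tick if one day is split evenly over a 64-bit counter.
tick_seconds = SECONDS_PER_DAY / 2**64

print(f"range: {range_years / 1e6:.2f} million years")
print(f"tick:  {tick_seconds * 1e15:.2f} femtoseconds")

# numpy's datetime64 has no notion of leap seconds, so "1 day" and
# "86400 seconds" land on the same instant even across 2015-06-30,
# whose UTC day is 86401 seconds long.
start = np.datetime64("2015-06-30T00:00:00", "s")
plus_day = start + np.timedelta64(1, "D")
plus_seconds = start + np.timedelta64(SECONDS_PER_DAY, "s")
same = bool(plus_day == plus_seconds)
print("1 day == 86400 s in numpy:", same)
```

So the range works out to a bit under 12 million years, and the tick
comes out somewhat finer than 10 femtoseconds. Whether the last
comparison *should* be true is exactly the which-clock question.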

At a very general level, I feel like numpy-qua-numpy's role here
shouldn't be to try and add special code to handle any one specific
datetime implementation: that hasn't worked out terribly well
historically, and as referenced above there's a *ton* of plausible
ways of approaching datetime handling that people might want, so we
don't want to be in the position of having to pick the-one-and-only
implementation. Telling people who want to tweak datetime handling
that they have to start mucking around in umath.so is terrible.

Instead, we should be trying to evolve numpy to add generic
functionality, so that it's prepared to handle multiple third-party
approaches to date-time handling (among other things).

Implementing prototypes built on top of numpy could be an excellent
way to generate ideas for appropriate changes to the numpy core.

As far as this specific prototype, I should say that I'm dubious that
subclassing ndarray is actually a *good* long-term solution. I really
think that the *right* way to solve this would be to improve the dtype
system so we could define useful date/time types that worked with
plain vanilla ndarrays. But that approach requires a lot more up-front
C coding; it's harder to throw together a quick prototype. OTOH if
your goal is the moon then you don't want to waste time investing in
ladder technology... so I dunno.
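
For concreteness, here is roughly what the quick-prototype route could
look like. This is a hypothetical sketch, not anyone's actual proposal:
the class name DatetimeArray, the choice of microsecond resolution, and
the year property are all invented for illustration.

```python
import numpy as np


class DatetimeArray(np.ndarray):
    """Hypothetical prototype: an ndarray subclass pinned to one unit."""

    def __new__(cls, values):
        # Store everything as datetime64[us]; the unit is an arbitrary
        # choice made for this sketch.
        data = np.asarray(values, dtype="datetime64[us]")
        return data.view(cls)

    @property
    def year(self):
        # Truncate to whole years since 1970, then shift to calendar years.
        base = self.view(np.ndarray)
        return base.astype("datetime64[Y]").astype(np.int64) + 1970


stamps = DatetimeArray(["2015-01-22T15:58:52", "2014-12-31T23:59:59"])
print(stamps.year)

# The subclass follows results around even when they are no longer
# datetimes: the difference holds timedelta64 values but is still
# labelled a DatetimeArray.
diff = stamps - stamps
print(type(diff).__name__, diff.dtype)
```

The last two lines show one reason to be wary of subclassing: the
array type, rather than the dtype, is carrying the date/time meaning,
so it ends up attached to results where it doesn't belong.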

-n

-- 
Nathaniel J. Smith
Postdoctoral researcher - Informatics - University of Edinburgh
http://vorpus.org
