[Pandas-dev] tslibs 2.0 and non-nanosecond datetime64/timedelta64

Tom Augspurger tom.augspurger88 at gmail.com
Fri May 29 15:03:27 EDT 2020


Thanks for the update.

On Fri, May 29, 2020 at 11:37 AM Brock Mendel <jbrockmendel at gmail.com>
wrote:

> This is a discussion of what it would take to support non-nanosecond
> datetime64/timedelta64 dtypes and what decisions would need to be made
> along the way.
>
> The implementation would probably consist of:
> - add a NPY_DATETIMEUNIT attribute to Timestamp and DatetimeTZDtype
> - for timezone-related methods:
>     - short-term: cast to nanosecond, use existing code, cast back to
>       other unit
>

Will this cause issues if the original datetime isn't in the bounds of a
ns-precision timestamp?
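
To make the concern concrete, here is a minimal sketch of the bounds involved. It
uses only the standard library rather than the pandas internals, and the names
`NS_MIN`, `NS_MAX` and `fits_in_ns` are made up for illustration. The idea is
that an int64 count of nanoseconds since the epoch only covers roughly 1677 to
2262, so a second-resolution value outside that window cannot survive the
"cast to nanosecond" step.

```python
from datetime import datetime, timedelta

INT64_MAX = 2**63 - 1  # largest nanosecond offset an int64 can hold
EPOCH = datetime(1970, 1, 1)

# datetime only resolves microseconds, so the sub-microsecond remainder is
# dropped; both bounds are rounded toward the epoch (slightly conservative).
NS_MAX = EPOCH + timedelta(microseconds=INT64_MAX // 1000)
NS_MIN = EPOCH - timedelta(microseconds=INT64_MAX // 1000)


def fits_in_ns(ts: datetime) -> bool:
    """Return True if a naive timestamp lies inside the int64-nanosecond range."""
    return NS_MIN <= ts <= NS_MAX


print(NS_MIN, NS_MAX)
print(fits_in_ns(datetime(2000, 1, 1)))  # inside the window
print(fits_in_ns(datetime(3000, 1, 1)))  # representable in seconds, not in ns
```

A short-term implementation would presumably need a check along these lines
before casting, and would have to raise or fall back for out-of-bounds values.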


>     - longer-term: update existing code to support non-nano units directly
> - comb through the code for all the places where we implicitly assume nano
> units and update
> - tests, so, so many tests
>
> We could then consider de-duplication. Tick is already redundant with
> Timedelta, and Timestamp[H] would render Period[H] redundant.  With an
> appropriate deprecation cycle, we could rip out a bunch of code.
>

What would be the user-facing changes that warrant deprecation? For me,
`Period` represents a span of time. It would make sense to implement
something like `pd.Timestamp("2000-01-01") in pd.Period("2000-01-01",
freq="H")`. But something checking whether that timestamp is in a
`Timestamp[H]` doesn't seem natural, since it represents a point in time
rather than a span.
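
A rough sketch of the distinction, in plain Python rather than pandas.
`HourSpan` is a hypothetical stand-in for an hourly `Period`, not an existing
class, and the half-open interval is an assumption made for the example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


def floor_to_hour(ts: datetime) -> datetime:
    """Truncate a timestamp to hour resolution (still a single point in time)."""
    return ts.replace(minute=0, second=0, microsecond=0)


@dataclass(frozen=True)
class HourSpan:
    """Hypothetical hourly span: the half-open interval [start, start + 1h)."""

    start: datetime

    @property
    def end(self) -> datetime:
        return self.start + timedelta(hours=1)

    def __contains__(self, ts: datetime) -> bool:
        # Containment is the natural question to ask of a span.
        return self.start <= ts < self.end


span = HourSpan(datetime(2000, 1, 1))
half_past = datetime(2000, 1, 1, 0, 30)

print(half_past in span)                      # a span can contain a point
print(floor_to_hour(half_past) == half_past)  # a point can only be compared
```

An hour-resolution timestamp only supports the second kind of question
(equality or ordering), which is why treating it as a replacement for
`Period` would change what users can express.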


> Another possibility is to try to upstream some code to numpy, which they
> have recently been receptive to (#16266
> <https://github.com/numpy/numpy/pull/16266>, #16363
> <https://github.com/numpy/numpy/pull/16363>, #16364
> <https://github.com/numpy/numpy/pull/16364>, #16352
> <https://github.com/numpy/numpy/issues/16352>, #16195
> <https://github.com/numpy/numpy/issues/16195>).  @rgommers tells me that
> trying to implement a tz-aware datetime64 dtype in numpy would be "folly,
> that way madness lies", but that it might be more feasible once @seberg's
> dtype refactor lands.  More realistically short-term, if we convinced numpy
> to update NPY_DATETIMEUNIT to include the anchored quarter/year/week units
> we use for Period, we could condense a lot of confusing enum-like code.
>

Great to see this being pushed upstream!


> Tangentially related: with zoneinfo (PEP 615) we should consider making
> those our canonical tzinfos and converting any dateutil/pytz tzinfos we
> encounter to those.  They are implemented in C, so I'm _hopeful_ we can
> make some of our vectorized tzconversion code unnecessary.  @pganssle has
> suggested we implement our own tzinfos, but I'm holding out hope we can
> keep that upstream.
>

I'd be happy to see this as well, though implementing it in a way that's
compatible with older Pythons seems a bit tricky. Perhaps we get the
building blocks in place and then require it once we require Python 3.10+?
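
One possible shape for those building blocks, as a sketch only. `to_zoneinfo`
is a hypothetical helper, not an existing pandas function. It assumes the
third-party `backports.zoneinfo` package as the fallback on Pythons where
`zoneinfo` is not in the standard library (it was added in 3.9), and it only
handles pytz-style objects that expose their IANA name as `.zone`. dateutil
zones have no comparable public attribute, so they are passed through unchanged.

```python
from datetime import timezone, tzinfo

try:
    import zoneinfo  # standard library from Python 3.9
except ImportError:  # older Pythons: third-party backport, if installed
    from backports import zoneinfo


def to_zoneinfo(tz: tzinfo) -> tzinfo:
    """Hypothetical helper: map a tzinfo to a ZoneInfo when it exposes an IANA key."""
    if isinstance(tz, zoneinfo.ZoneInfo):
        return tz
    # pytz zones expose the IANA name as `.zone`; anything without such a
    # name (fixed offsets, custom tzinfos) is returned unchanged.
    key = getattr(tz, "zone", None)
    if not isinstance(key, str):
        return tz
    try:
        return zoneinfo.ZoneInfo(key)
    except (zoneinfo.ZoneInfoNotFoundError, ValueError):
        return tz  # no matching entry in the local tz database


print(to_zoneinfo(timezone.utc))  # no IANA key, so returned as-is
```

Keeping the conversion behind one function like this would let the
canonical type be switched on later without touching callers.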

