[Pandas-dev] Datetime (with timezone?) as extension array?

Joris Van den Bossche jorisvandenbossche at gmail.com
Tue Sep 25 13:41:00 EDT 2018


2018-08-14 17:32 GMT+02:00 Brock Mendel <jbrockmendel at gmail.com>:

> `DatetimeArray` is close to ready if you want to bring it over the finish
> line.  Pretty much all that has to be done is having `DatetimeArrayMixin`
> subclass `ExtensionArray` (and, uh, implement the relevant EA methods).  If
> no one else picks this up, my current plan is to do this _after_ updating
> all of the relevant arithmetic tests to test DatetimeArrayMixin.
>

What's the status of this?  Asking because I think having a working EA
DatetimeArray implementation is important for a 0.24.0 release. I can
imagine it will still take quite some discussion, and it would be good to
have it in master for a while.

It's hard to really steer this since it is volunteer-based (and certainly
because I currently don't have the time to do it myself), but to the extent
possible, it would be good if we could try to prioritize it a bit.

Joris



> > The unclear part is what `Series[datetime_with_tz].values` should be.
>
> I thought the conclusion was that `.values` should be non-lossy, in which
> case it would have to be the EA.  My preference would be for the EA to be
> returned for non-tz datetime64[ns] Series too.
>
> For that matter, I'd like it if `Series.values` _always_ returned an EA,
> but we're not there yet.
>
>
> On Tue, Aug 14, 2018 at 4:13 AM, Tom Augspurger <
> tom.augspurger88 at gmail.com> wrote:
>
>> The discussion on datetime with timezone has been a bit scattered. I
>> don't think there's a single issue with everyone's thoughts.
>>
>> There will be a DatetimeWithTZ array that implements the EA interface.
>> Anywhere we're internally using a DatetimeIndex as a
>> container for datetimes with timezones will use the new EA.
>>
>> The unclear part is what `Series[datetime_with_tz].values` should be.
>> Currently, we convert to UTC, strip the timezone, and return
>> a datetime64[ns] ndarray. Changing that would be disruptive, jarringly
>> different from `Series[datetime].values` (no tz) and of little
>> value I think.
>>
>> Tom
>>
>> On Tue, Aug 14, 2018 at 4:07 AM Pietro Battiston <me at pietrobattiston.it>
>> wrote:
>>
>>> Hi all,
>>>
>>> I assumed that Datetime (with timezone, or maybe in general?) was also
>>> planned to follow the extension array interface, which is related to
>>> issue https://github.com/pandas-dev/pandas/issues/19041 , to the
>>> annoying fact that datetimeindexwithtz._values returns the index
>>> itself, and also to the fact that
>>> https://pandas.pydata.org/pandas-docs/stable/extending.html
>>> currently states "Pandas itself uses the extension system for some
>>> types that aren’t built into NumPy (categorical, period, interval,
>>> datetime with timezone).", which is false.
>>>
>>> ... but I didn't find an issue for this? Did I miss it? Should I create
>>> it? Or was there a decision to leave datetimeindextz as it is, maybe
>>> for better compatibility with numpy?
>>>
>>> Pietro
>>> _______________________________________________
>>> Pandas-dev mailing list
>>> Pandas-dev at python.org
>>> https://mail.python.org/mailman/listinfo/pandas-dev
>>>