[Pandas-dev] What could a pandas 2.0 look like?

Tue Feb 11 12:13:35 EST 2020

Joris:

Another aspirational goal for pandas 2.0 would be to clean up the API so
that index names and column names are treated equivalently throughout.  I
created a meta-issue for this 6+ months ago:
https://github.com/pandas-dev/pandas/issues/27652
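
A minimal sketch of the kind of asymmetry in question, assuming a pandas 1.x install. The frame and the names "key" and "val" are made up for illustration and are not taken from the issue. groupby accepts a name whether it refers to a column or to the index, while [] lookup only finds columns:

```python
import pandas as pd

# Hypothetical data: "key" is first a column, then the index name.
df = pd.DataFrame({"key": ["a", "b", "a"], "val": [1, 2, 3]})
indexed = df.set_index("key")

# groupby resolves "key" as a column name or as an index level name.
by_column = df.groupby("key")["val"].sum()
by_index = indexed.groupby("key")["val"].sum()
same_groupby = by_column.equals(by_index)

# Plain [] lookup only searches the columns, so the index name is not found.
try:
    indexed["key"]
    lookup_works = True
except KeyError:
    lookup_works = False

print(same_groupby, lookup_works)
```

Which specific methods the meta-issue covers is not shown here; the snippet only illustrates the general problem.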

Dr-Irv

> ----------------------------------------------------------------------
>
> Message: 1
> Date: Mon, 10 Feb 2020 18:43:21 +0100
> From: Joris Van den Bossche <jorisvandenbossche at gmail.com>
> To: pandas-dev <pandas-dev at python.org>
> Subject: [Pandas-dev] What could a pandas 2.0 look like?
> Message-ID: <CALQtMBZdrFD7iiNwOQeXs94tYxLqLb-otoJcSfvTX9meQnPcyw at mail.gmail.com>
>
> pandas 1.0 is out, so time to start thinking about 2.0 ;)
>
> In principle, pandas 2.0 will just be one of the next releases when we
> decide we want to clean up the deprecations / make a few changes that are
> hard to deprecate (following our new versioning policy).
> Nonetheless, I think it is still worth considering whether it could be
> something more than that, with more specific goals in mind*.
>
> Last year I made the pd.NA proposal, which resulted in using that for the
> nullable integer, boolean and string dtypes. In the proposal, pd.NA was
> described as "can be used consistently across all data types". And for me,
> the aspirational end goal of this proposal is to *actually* have this for
> *all* dtypes, but we never really discussed this aspect explicitly.
>
> So, for me, a possible future pandas 2.0:
>
>    - Uses all "nullable dtypes" by default (i.e. dtypes that use pd.NA as
>    missing value indicator). That means we add a nullable version of all
>    other dtypes (as we now already did for int, boolean, string). End goal:
>    a single missing value indicator with the same behavior for all dtypes.
>    - If we add such nullable dtypes using the extension dtypes/array
>    mechanism (so it can first be opt-in in 1.X), this could "automatically"
>    lead to a simplification of the internals / Block Manager (another
>    aspirational goal that has been discussed before, but never became
>    concrete). Because in such a case (all extension dtypes), we would only
>    be using 1D blocks (simplifying the 1D / 2D thorny cases in internals).
>    This simplifies the memory model, consolidation, etc
>
> Do you think this is a desirable goal? And realistic? Other aspirational
> goals?
>
> Best,
> Joris
>
> *Agreeing on goals doesn't mean it will happen, that's open source (or at
> least community-based open source). But I think it can still be useful to
> guide some efforts where possible or in trying to get traction for certain
> issues from contributors. And then we can still see if it gets done in 2.0,
> 3.0, 4.0 or never ;)
>
>
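
For readers who have not used the nullable dtypes mentioned in the quoted message, here is a minimal sketch of the current situation, assuming pandas 1.0 or later. The variable names are illustrative only. It shows the three existing nullable dtypes sharing pd.NA, the default NumPy-backed dtypes each using their own marker, the existing opt-in conversion via convert_dtypes(), and the fact that extension arrays are one-dimensional:

```python
import numpy as np
import pandas as pd

# Nullable extension dtypes available as of pandas 1.0: all of them use pd.NA.
nullable = {
    "Int64": pd.Series([1, None, 3], dtype="Int64"),
    "boolean": pd.Series([True, None, False], dtype="boolean"),
    "string": pd.Series(["a", None, "c"], dtype="string"),
}
uses_na = {name: s.iloc[1] is pd.NA for name, s in nullable.items()}

# Default NumPy-backed dtypes each have their own missing-value marker.
default_float = pd.Series([1.0, None, 3.0])                  # float64 -> NaN
default_dt = pd.Series([pd.Timestamp("2020-02-10"), None])   # datetime64 -> NaT
float_is_nan = bool(np.isnan(default_float.iloc[1]))
dt_is_nat = default_dt.iloc[1] is pd.NaT

# Opt-in today: convert_dtypes() moves a column to a nullable dtype
# where one exists (here float64 holding whole numbers -> Int64).
converted = pd.DataFrame({"a": [1, None, 3]}).convert_dtypes()
converted_dtype = str(converted["a"].dtype)

# Extension arrays are always one-dimensional.
arr_ndim = pd.array([1, None, 3], dtype="Int64").ndim

print(uses_na, float_is_nan, dt_is_nat, converted_dtype, arr_ndim)
```

This only illustrates the public behavior. How the Block Manager would change if every column were an extension array is not something the snippet can show.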