[Pandas-dev] pandas-vet

Marc Garcia garcia.marc at gmail.com
Thu Feb 2 11:59:16 EST 2023


Thanks for starting this discussion. I think each of the points needs an
independent discussion. In general, I think the solution would be to
deprecate things in pandas.

For the inplace keyword, there is consensus to deprecate it. There was
consensus even before pandas 1.0, along with plans to remove it everywhere
before that release, and we almost removed it for pandas 2.0 (in the end
that did not happen), but there are still a few details to discuss. I guess
a linter can help before we start raising FutureWarnings.
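
To make the linter idea concrete, here is a minimal sketch of the kind of
check such a tool could run. It is not how pandas-vet or ruff implement
their rule; the function name find_inplace_true and the sample source are
made up for illustration, and it only uses the standard library ast module.

```python
import ast


def find_inplace_true(source: str) -> list[int]:
    """Return the line numbers of calls that pass inplace=True.

    Hypothetical helper for illustration only; it flags any call with a
    literal inplace=True keyword, whether or not the receiver is pandas.
    """
    hits = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        for kw in node.keywords:
            if (
                kw.arg == "inplace"
                and isinstance(kw.value, ast.Constant)
                and kw.value.value is True
            ):
                hits.append(node.lineno)
    return sorted(hits)


# The sample is only parsed, never executed, so pandas is not required here.
sample = (
    "import pandas\n"
    "df = pandas.DataFrame({'a': [2, 1]})\n"
    "df.sort_values('a', inplace=True)\n"  # flagged
    "df = df.sort_values('a')\n"           # the reassignment style, not flagged
)
flagged = find_inplace_true(sample)
print(flagged)  # [3]
```

A purely syntactic check like this cannot tell whether the call is really a
pandas method, which is one reason a FutureWarning raised by pandas itself
would be more precise.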

For isna/isnull, the initial plan and obvious solution was to also
deprecate isnull, but it was decided
<https://github.com/pandas-dev/pandas/pull/16972#issuecomment-317216086>
that it was too common. Deprecating it seems like a better option than a
linter, but I guess that if the linter is popular enough it could help. I'd
personally just deprecate things we don't want users to use (instead of
encouraging a linter), but if there is no consensus to deprecate, maybe
there will be in the future, and the linter can help in the meantime. Some
things are trickier, but I guess in general we could end up deprecating
things like Series.values in favor of .array or .to_numpy()...
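
For readers less familiar with these pairs, a small sketch of what is being
compared. The variable names are just for illustration, and it assumes
pandas and numpy are installed.

```python
import numpy as np
import pandas

s = pandas.Series([1.0, None, 3.0])

# isnull is an alias of isna, so both produce the same boolean mask.
mask_isna = s.isna()
mask_isnull = s.isnull()
same_mask = mask_isna.equals(mask_isnull)

# .to_numpy() always returns a NumPy ndarray, while .array returns a
# pandas ExtensionArray wrapping the data. What .values returns depends
# on the dtype, which is why the two explicit accessors exist.
as_numpy = s.to_numpy()
as_array = s.array

print(same_mask)
print(type(as_numpy).__name__)
```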

Personally, -1 on `import pandas as pd`. If we had to rewrite things, I
guess the numpy module would simply be named np, so no aliasing would be
needed. The pandas module namespace is much smaller and not used as
frequently, so shortening it to pd has almost no impact on code verbosity. I
never alias the pandas module name, and while consistency across projects
can be nice, it seems odd to have a linter recommend something that is more
a tradition than a good practice. At least that's my opinion.

On Thu, Feb 2, 2023 at 4:09 PM Roman Yurchak <rth.yurchak at gmail.com> wrote:

> Hi,
>
> There was interesting work done in https://github.com/deppen8/pandas-vet
> for enforcing automated checks on pandas code.
>
> I was wondering if the core team had some opinions on the enforced rules
> and could comment on the extent to which there is consensus on them, and
> whether they are consistent with what's recommended in the pandas docs,
> particularly on things like pivot_table vs unstack, .array vs .values, and
> melt vs stack.
>
> I'm currently working on a largish legacy codebase with lots of pandas
> code, so IMO something like pyupgrade for pandas could be really great.
> Also, now that pandas-vet is implemented in ruff, I feel it has the
> potential to become mainstream in a few years. Just checking whether there
> is some consensus on what could / should be enforced for pandas linting.
>
> For the rule "'inplace = True' should be avoided; it has inconsistent
> behavior": if there is an issue, this could be fixed in some future major
> release, right?
>
> Thanks,
>
> Roman