[scikit-learn] Is there a model for truncated regression in sklearn?
Gael Varoquaux
gael.varoquaux at normalesup.org
Tue Jun 8 03:31:03 EDT 2021
Hi,
scikit-learn does not cover this problem.
I think that it is related to survival analysis, which deals with
censored observations. There is a survival analysis package in Python,
lifelines:
https://lifelines.readthedocs.io/en/latest/
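If a plain parametric model is enough, a truncated regression with Gaussian
noise can also be fitted directly by maximum likelihood. Below is a minimal
sketch, assuming numpy and scipy are available, that the lower threshold
`lower` is known, and that samples below it are missing entirely. The data are
synthetic and the function names are made up for illustration; this is not a
scikit-learn or lifelines API.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm


def truncated_nll(params, X, y, lower):
    """Negative log-likelihood of a linear model with Gaussian noise,
    when only samples with y > lower were recorded."""
    beta, log_sigma = params[:-1], params[-1]
    sigma = np.exp(log_sigma)  # optimise log(sigma) so sigma stays positive
    mu = X @ beta
    # Gaussian log-density of y, renormalised by log P(y > lower | x)
    log_density = norm.logpdf(y, loc=mu, scale=sigma)
    log_keep = norm.logsf(lower, loc=mu, scale=sigma)
    return -np.sum(log_density - log_keep)


def fit_truncated_regression(X, y, lower):
    """Return (coefficients with intercept first, sigma)."""
    X1 = np.column_stack([np.ones(len(y)), X])  # add an intercept column
    beta0, *_ = np.linalg.lstsq(X1, y, rcond=None)  # start from the OLS fit
    resid = y - X1 @ beta0
    start = np.append(beta0, np.log(resid.std()))
    res = minimize(truncated_nll, start, args=(X1, y, lower), method="BFGS")
    return res.x[:-1], np.exp(res.x[-1])


# Synthetic example: y = 1 + 2 x + noise, recorded only when y > 0
rng = np.random.default_rng(0)
x = rng.uniform(-2, 2, size=5000)
y = 1.0 + 2.0 * x + rng.normal(scale=1.0, size=x.size)
lower = 0.0
keep = y > lower
X_obs, y_obs = x[keep].reshape(-1, 1), y[keep]

beta_trunc, sigma_trunc = fit_truncated_regression(X_obs, y_obs, lower)
print("truncated ML coefficients:", beta_trunc, "sigma:", sigma_trunc)
```

The only difference from ordinary least squares is the `log_keep` term, which
accounts for the fact that each sample could only be observed above the
threshold. For an upper detection limit, the same idea applies with
`norm.logcdf` in place of `norm.logsf`.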
Best,
Gaël
On Tue, Jun 08, 2021 at 04:22:14PM +0900, Francois Berenger wrote:
> Hello,
> https://en.wikipedia.org/wiki/Truncated_regression_model
> Sometimes, data have missing samples when the target variable
> is above or below a threshold value.
> This is very often the case for biochemical data (e.g. target
> variable outside detection range of some lab equipment).
> I highly suspect some specific models could handle such datasets
> better than generic methods (i.e. train better models).
> Some points of entry, if that might help:
> - R has a truncreg package:
>   https://cran.r-project.org/web/packages/truncreg/index.html
> - a related paper from the Wikipedia page:
>   "Local likelihood estimation of truncated regression and
>   its partial derivatives: Theory and application"
>   https://hal.archives-ouvertes.fr/hal-00520650/file/PEER_stage2_10.1016%252Fj.jeconom.2008.08.007.pdf
> I can provide a cleaned public regression dataset, if someone is
> interested, for tests (there are many such datasets in ChEMBL and
> PubChem by the way, but you need to know how to "featurize"/encode
> molecules).
> Regards,
> F.
--
Gael Varoquaux
Research Director, INRIA Visiting professor, McGill
http://gael-varoquaux.info http://twitter.com/GaelVaroquaux