[scikit-learn] Announcing modAL: a modular active learning framework

Brown J.B. jbbrown at kuhp.kyoto-u.ac.jp
Mon Feb 19 09:58:27 EST 2018


Dear Dr. Danka,

This is a very nice generalization you have built.

My group and I have published multiple papers on using active learning for
drug discovery model creation, built on top of scikit-learn.
(2017) Future Med Chem : https://dx.doi.org/10.4155/fmc-2016-0197 (*Most
downloaded paper of the year) (Open Access)
(2017) J Comput-Aided Chem : https://dx.doi.org/10.2751/jcac.18.124  (Open
Access)
(2018) ChemMedChem : https://dx.doi.org/10.1002/cmdc.201700677

In our work, we built a framework similar to modAL, though in our framework
the iterative model building is done on a fully labeled (Y) set of
examples, and we are more interested in knowing:
  (1) How fast learning converges within some convergence criteria (e.g.,
how many drugs must be in a model, given an evaluation metric),
  (2) Which examples are picked across repeated executions of AL (e.g.,
which drugs appear to be the most informative for model construction),
  (3) How much diversity there is in the examples picked (e.g., how
different are the drugs selected by AL - visualized in the 2017
Future Med Chem paper), and
  (4) How dependent actively learned models are on descriptors (e.g., do
different representations affect the speed of performance convergence?).

I think some, if not all, of these questions are also answerable in your
framework.
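
As a rough illustration of questions (1) and (2), here is a minimal sketch
of retrospective active learning on a fully labeled pool, written with plain
scikit-learn. It is not the code from the papers above and it does not use
modAL. The synthetic dataset, the logistic regression model, the
uncertainty-sampling rule, the Matthews correlation coefficient as the
metric, and the 0.6 threshold are all assumptions chosen for illustration.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import matthews_corrcoef
from sklearn.model_selection import train_test_split


def run_active_learning(X_pool, y_pool, X_test, y_test, n_queries, seed):
    """One retrospective AL run: returns (picked indices, score per round)."""
    rng = np.random.default_rng(seed)
    # Start from one randomly chosen example of each class.
    labeled = [int(rng.choice(np.flatnonzero(y_pool == c))) for c in (0, 1)]
    scores = []
    for _ in range(n_queries):
        model = LogisticRegression(max_iter=1000)
        model.fit(X_pool[labeled], y_pool[labeled])
        # scores[i] is the test score of the model built on 2 + i examples.
        scores.append(matthews_corrcoef(y_test, model.predict(X_test)))
        # Uncertainty sampling: pick the not-yet-used pool example whose
        # predicted probability of class 1 is closest to 0.5.
        unlabeled = np.setdiff1d(np.arange(len(y_pool)), labeled)
        proba = model.predict_proba(X_pool[unlabeled])[:, 1]
        labeled.append(int(unlabeled[np.argmin(np.abs(proba - 0.5))]))
    return labeled, scores


# Synthetic stand-in for a fully labeled compound set (assumption).
X, y = make_classification(n_samples=300, n_features=10, n_informative=5,
                           random_state=0)
X_pool, X_test, y_pool, y_test = train_test_split(
    X, y, test_size=0.5, stratify=y, random_state=0)

n_runs, n_queries, threshold = 5, 30, 0.6  # illustrative values only
pick_counts = np.zeros(len(y_pool), dtype=int)
sizes_at_convergence = []
for seed in range(n_runs):
    picked, scores = run_active_learning(
        X_pool, y_pool, X_test, y_test, n_queries, seed)
    # (2) Count how often each pool example is picked across runs.
    pick_counts[picked] += 1
    # (1) Size of the first model that reaches the threshold, or None
    # if the threshold is never reached within n_queries rounds.
    size = next((i + 2 for i, s in enumerate(scores) if s >= threshold), None)
    sizes_at_convergence.append(size)

print("model size at convergence per run:", sizes_at_convergence)
print("most frequently picked pool indices:", np.argsort(-pick_counts)[:5])
```

Because every label in the pool is already known, the "oracle" is just an
array lookup, so many repeated runs are cheap. Swapping the feature matrix
for a different descriptor set and rerunning the loop would be one way to
probe question (4).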

Also, with regard to point (1) and evaluation metrics, I recently came up
with an idea to generically analyze the nature of 2-class prediction
performance metrics independent of the model methodology used:
(2018) Molecular Informatics : https://dx.doi.org/10.1002/minf.201700127
(Open Access)
You can find the philosophy of this article embedded in the active learning
experiments performed in the 2018 ChemMedChem article.

If you or anyone else on this list is interested in active learning and
chemistry, please drop me a line.

Again - very nice job, and best wishes for continued development.

Sincerely,
J.B. Brown
Kyoto University Graduate School of Medicine


2018-02-19 16:45 GMT+09:00 Tivadar Danka <theodore.danka at gmail.com>:

> Dear scikit-learn community!
>
> It is my pleasure to announce modAL, a modular active learning framework
> for Python3, built on top of scikit-learn. Designed with modularity,
> flexibility and extensibility in mind, it allows the rapid development of
> active learning workflows with nearly complete freedom. It is aimed at
> researchers and practitioners, for whom fast prototyping is essential for
> testing and developing active learning pipelines.
>
> modAL is quite young and under constant improvement. Any feedback, feature
> requests or contributions are very welcome!
>
> The package can be installed via pip:
> pip3 install modAL
>
> The repository, tutorials and documentation are available at
>    - GitHub: https://github.com/cosmic-cortex/modAL
>    - Webpage: https://cosmic-cortex.github.io/modAL
>
> Cheers,
> Tivadar
>
> --------------------------------------
> Tivadar Danka
> postdoctoral researcher
> BIOMAG group, MTA-BRC
> http://www.tivadardanka.com
> twitter: @TivadarDanka <https://twitter.com/TivadarDanka>