[scikit-learn] [ANN] Scikit-learn 0.20.0
Javier López
jlopez at ende.cc
Fri Sep 28 05:47:52 EDT 2018
On Fri, Sep 28, 2018 at 1:03 AM Sebastian Raschka <mail at sebastianraschka.com>
wrote:
> Chris Emmery, Chris Wagner and I toyed around with JSON a while back (
> https://cmry.github.io/notes/serialize), and it could be feasible
I came across your notes a while back; they were really useful!
I hacked a variation of it that didn't need to know the model class in
advance:
https://gist.github.com/jlopezpena/2cdd09c56afda5964990d5cf278bfd31
but it is VERY hackish, and it doesn't work with complex models with nested
components. (At work we use a further variation of this that also works on
pipelines and some specific nested stuff, like `mlxtend`'s
`SequentialFeatureSelector`)
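To give a rough idea of what "not knowing the model class in advance" can look like, here is a minimal sketch. It is not the code from the gist: `TinyScaler`, `to_json` and `from_json` are hypothetical names made up for illustration, and the toy class only mimics the `get_params` and trailing-underscore conventions. The idea is to store the class location in the JSON payload and import it again when loading. Real fitted estimators hold NumPy arrays, which would need converting to lists first, and nested components are not handled at all.

```python
import importlib
import json


class TinyScaler:
    """Hypothetical stand-in for an estimator (not a scikit-learn class)."""

    def __init__(self, with_mean=True):
        self.with_mean = with_mean

    def get_params(self):
        return {"with_mean": self.with_mean}

    def fit(self, values):
        # Learned state follows the trailing-underscore naming convention.
        self.mean_ = sum(values) / len(values) if self.with_mean else 0.0
        return self

    def transform(self, values):
        return [v - self.mean_ for v in values]


def to_json(estimator):
    """Dump class location, constructor params and fitted attributes."""
    cls = type(estimator)
    fitted = {k: v for k, v in vars(estimator).items() if k.endswith("_")}
    return json.dumps({
        "module": cls.__module__,
        "class": cls.__name__,
        "params": estimator.get_params(),
        "fitted": fitted,
    })


def from_json(text):
    """Rebuild the estimator; the class is looked up from the payload."""
    payload = json.loads(text)
    module = importlib.import_module(payload["module"])
    cls = getattr(module, payload["class"])
    estimator = cls(**payload["params"])
    for name, value in payload["fitted"].items():
        setattr(estimator, name, value)
    return estimator


original = TinyScaler().fit([1.0, 2.0, 3.0])
restored = from_json(to_json(original))
print(restored.transform([4.0]))
```

Since the loader imports whatever module the payload names, something like this should only ever be fed JSON from a trusted source.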
> but yeah, it will involve some work, especially with testing things
> thoroughly for all kinds of estimators. Maybe this could somehow be
> automated though in a grid-search kind of way with a build matrix for
> estimators and parameters once a general framework has been developed.
>
I considered making this serialization into an external project, but I
think this would be much easier if estimators provided a dunder method
`__serialize__` (or whatever) that would handle the idiosyncrasies of each
particular family. I don't believe there will be a "one-size-fits-all"
solution for this problem. This approach would also make it possible to
work on it incrementally, raising a default `NotImplementedError` for
estimators that haven't been addressed yet.
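As a rough sketch of that incremental approach (all names here are hypothetical; scikit-learn defines no `__serialize__` hook, and `SerializableMixin`, `LinearModel` and `TreeModel` are toy classes invented for the example), a shared base class could raise by default while each family overrides the method once somebody has worked out its quirks:

```python
class SerializableMixin:
    """Hypothetical base class; scikit-learn has no such hook."""

    def __serialize__(self):
        # Default for every estimator family that has not been addressed yet.
        raise NotImplementedError(
            f"{type(self).__name__} does not support serialization yet"
        )


class LinearModel(SerializableMixin):
    """Toy estimator family that already has a serializer."""

    def __init__(self, coef, intercept):
        self.coef_ = coef
        self.intercept_ = intercept

    def __serialize__(self):
        # Family-specific logic lives next to the estimator itself.
        return {
            "type": "LinearModel",
            "coef": list(self.coef_),
            "intercept": self.intercept_,
        }


class TreeModel(SerializableMixin):
    """Toy estimator family that has not been addressed yet."""


def serialize(estimator):
    """Single entry point that defers to the estimator's own hook."""
    return estimator.__serialize__()


print(serialize(LinearModel([0.5, -1.0], 2.0)))
try:
    serialize(TreeModel())
except NotImplementedError as exc:
    print(exc)
```

Callers get a clear error for unsupported estimators, and coverage can grow one family at a time without touching the entry point.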
In the long run, I also believe that the "proper" way to do this is to
allow dumping entire processes into PFA: http://dmg.org/pfa/docs/motivation/