[scikit-learn] [ANN] Scikit-learn 0.20.0
Andreas Mueller
t3kcit at gmail.com
Fri Sep 28 13:41:13 EDT 2018
On 09/28/2018 01:38 PM, Andreas Mueller wrote:
>
>
> On 09/28/2018 12:10 PM, Sebastian Raschka wrote:
>>>> I think model serialization should be a priority.
>>> There is also the ONNX specification that is gaining industrial
>>> adoption and that already includes open source exporters for several
>>> families of scikit-learn models:
>>>
>>> https://github.com/onnx/onnxmltools
>>
>> Didn't know about that. This is really nice! What do you think about
>> referring to it under
>> http://scikit-learn.org/stable/modules/model_persistence.html to make
>> people aware that this option exists?
>> Would be happy to add a PR.
>>
>>
> I don't think an open source runtime has been announced yet (or they
> didn't email me like they promised lol).
> I'm quite excited about this as well.
>
> Javier:
> The problem is not so much storing the "model" but storing how to make
> predictions. Different versions could act differently
> on the same data structure - and the data structure could change. Both
> happen in scikit-learn.
> So if you want to make sure the right thing happens across versions,
> you either need to provide serialization and deserialization for
> every version and conversion between those or you need to provide a
> way to store the prediction function,
> which basically means you need a Turing-complete language (that's what
> ONNX does).
>
> We basically said doing the first is not feasible within scikit-learn
> given our current amount of resources, and no one
> has even tried doing it outside of scikit-learn (which would be
> possible).
> Implementing a complete prediction serialization language (the second
> option) is definitely outside the scope of sklearn.
>
>
Maybe we should add to the FAQ why serialization is hard?
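
A minimal sketch of the point quoted above, in case it helps for such a FAQ
entry. It is not scikit-learn code: the version strings, the toy "threshold"
state and every function name here are hypothetical, chosen only to show that
stored state does not fix the prediction rule, and that a saved model can at
least carry the version that wrote it.

```python
import json

# Hypothetical version label, standing in for something like a library version.
CURRENT_VERSION = "0.20.0"


def predict_old(state, xs):
    # Hypothetical older rule: strictly greater than the threshold.
    return [int(x > state["threshold"]) for x in xs]


def predict_new(state, xs):
    # Hypothetical newer rule: greater than or equal to the threshold.
    return [int(x >= state["threshold"]) for x in xs]


def dump_state(state, version=CURRENT_VERSION):
    """Serialize the learned state together with the version that wrote it."""
    return json.dumps({"version": version, "state": state})


def load_state(text, current_version=CURRENT_VERSION):
    """Deserialize, refusing state written by a different version."""
    payload = json.loads(text)
    if payload["version"] != current_version:
        raise ValueError(
            "state saved with version %s, but running version %s"
            % (payload["version"], current_version)
        )
    return payload["state"]


state = load_state(dump_state({"threshold": 0.5}))
xs = [0.2, 0.5, 0.7]
# Same stored state, different answers for the boundary value 0.5.
print(predict_old(state, xs))
print(predict_new(state, xs))
```

The version check only detects a mismatch. Making the old state produce the
old predictions under new code would need either per-version converters or a
stored prediction function, which are the two options described above.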