<html>

  <head>

    <meta http-equiv="Content-Type" content="text/html; charset=utf-8">

  </head>

  <body text="#000000" bgcolor="#FFFFFF">

    <br>

    <br>

    <div class="moz-cite-prefix">On 09/28/2018 04:45 PM, Javier López

      wrote:<br>

    </div>

    <blockquote type="cite"

cite="mid:CAJn5T5Wy2d4=Gf-Fzb8DtcP2rPCA2cfNjQ44j8VkhK2Y7Y2_tw@mail.gmail.com">


      <div dir="ltr"><br>

        <br>

        <div class="gmail_quote">

          <div dir="ltr">On Fri, Sep 28, 2018 at 8:46 PM Andreas Mueller

            &lt;<a href="mailto:t3kcit@gmail.com" moz-do-not-send="true">t3kcit@gmail.com</a>&gt;

            wrote:</div>

          <blockquote class="gmail_quote" style="margin:0 0 0

            .8ex;border-left:1px #ccc solid;padding-left:1ex">

            Basically what you're saying is that you're fine with

            versioning the <br>

            models and having the model break loudly if anything

            changes.<br>

            That's not actually what most people want. They want to be

            able to make <br>

            predictions with a given model for ever into the future.<br>

          </blockquote>

          <div><br>

          </div>

          <div>Are we talking about "(the new version of) the old model

            can still make predictions" or "the old model makes exactly

            the same predictions as before"? I'd like the first to hold;
            I don't care that much about the second.</div>

          <div> </div>

        </div>

      </div>

    </blockquote>

    The second.<br>

    <blockquote type="cite"

cite="mid:CAJn5T5Wy2d4=Gf-Fzb8DtcP2rPCA2cfNjQ44j8VkhK2Y7Y2_tw@mail.gmail.com">

      <div dir="ltr">

        <div class="gmail_quote"><br>

          <blockquote class="gmail_quote" style="margin:0 0 0

            .8ex;border-left:1px #ccc solid;padding-left:1ex">

            We're now storing the version of scikit-learn that was used

            in the <br>

            pickle and warn if you're trying to load with a different

            version.</blockquote>

          <div><br>

          </div>

          <div>This is not the whole truth. Yes, you store the sklearn
            version in the pickle and raise a warning, and I am mostly
            OK with that. But pickles are brittle, and they often stop
            loading when the versions of other packages change. I am
            not talking about "Warning: wrong version", but rather
            "Unpickling error: expected bytes, found tuple", which
            prevents the file from loading entirely.</div>

        </div>

      </div>

    </blockquote>

    Can you give examples of that? That shouldn't really happen afaik.<br>
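    <br>
    For reference, the idea behind the version stamp can be sketched as
    below. This is an illustration only, not scikit-learn's actual
    code: the helper names dump_with_version and load_with_version_check
    and the version strings are made up. The point is that a version
    mismatch produces a warning and the load still succeeds.<br>
<pre>
```python
import pickle
import warnings


def dump_with_version(state, library_version):
    """Pickle `state` together with the library version that produced it."""
    return pickle.dumps({"library_version": library_version, "state": state})


def load_with_version_check(payload, library_version):
    """Unpickle and warn if the stored version differs from the current one."""
    record = pickle.loads(payload)
    stored = record["library_version"]
    if stored != library_version:
        warnings.warn(
            "state was pickled with version {} but is being loaded with "
            "version {}; results may differ".format(stored, library_version),
            UserWarning,
        )
    return record["state"]


# Made-up parameters and version strings, for illustration only.
payload = dump_with_version({"coef": [0.5, -1.25], "intercept": 2.0}, "0.19.2")

# Load under a different "current" version and collect any warnings.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    state = load_with_version_check(payload, "0.20.0")
```
</pre>
    Here the state comes back intact and exactly one warning is
    recorded, which is the "Warning: wrong version" case rather than a
    hard unpickling failure.<br>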

    <blockquote type="cite"

cite="mid:CAJn5T5Wy2d4=Gf-Fzb8DtcP2rPCA2cfNjQ44j8VkhK2Y7Y2_tw@mail.gmail.com">

      <div dir="ltr">

        <div class="gmail_quote">

          <div> </div>

          <blockquote class="gmail_quote" style="margin:0 0 0

            .8ex;border-left:1px #ccc solid;padding-left:1ex">

            That's basically a stricter test than what you wanted. Yes,

            there are <br>

            false positives, but given that this release took a year,<br>

            this doesn't seem that big an issue?<br>

          </blockquote>

          <div><br>

          </div>

          <div>1. In the current state, things break when something
            else changes, not only when sklearn does.</div>
          <div>2. Sharing pickles is bad practice for a number of
            reasons.</div>
          <div>3. We might want to explore model parameters without
            having to load the entire runtime.</div>

          <div><br>

          </div>

          <br>

        </div>

      </div>

    </blockquote>

    I agree, it would be great to have something other than pickle, but

    as I said, the usual request is "I want a way for a model to make

    the same predictions in the future".<br>

    If you have a way to do that with a text-based format that doesn't
    require writing lots of version converters, I'd be very happy.<br>

    <br>
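    To make the "version converters" concern concrete, here is a rough
    sketch of what a versioned text format tends to need. Everything in
    it is hypothetical: the JSON layout, the format_version field, and
    the two schema changes are invented for illustration and do not
    describe any existing scikit-learn format.<br>
<pre>
```python
import json

CURRENT_FORMAT = 3


def _v1_to_v2(doc):
    # Hypothetical change: "weights" was renamed to "coef".
    doc = dict(doc)
    doc["coef"] = doc.pop("weights")
    doc["format_version"] = 2
    return doc


def _v2_to_v3(doc):
    # Hypothetical change: a new "intercept" field with a default of 0.0.
    doc = dict(doc)
    doc.setdefault("intercept", 0.0)
    doc["format_version"] = 3
    return doc


# One converter per past format version; this table grows with every change.
CONVERTERS = {1: _v1_to_v2, 2: _v2_to_v3}


def load_model_doc(text):
    """Parse a JSON model document and upgrade it to CURRENT_FORMAT."""
    doc = json.loads(text)
    while doc["format_version"] != CURRENT_FORMAT:
        doc = CONVERTERS[doc["format_version"]](doc)
    return doc


# A document written in the oldest format is upgraded step by step.
old_text = json.dumps({"format_version": 1, "weights": [0.5, -1.25]})
doc = load_model_doc(old_text)
```
</pre>
    Each schema change adds another converter to maintain, and
    upgrading the file this way only fixes its layout. It says nothing
    about whether newer code computes the same predictions from it.<br>
    <br>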

    Generally, what you want is not to store the model but to store the

    prediction function, and have separate runtimes for training and

    prediction.<br>

    It might not be possible to represent a model from a previous

    version of scikit-learn in a newer version.<br>
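    <br>
    A minimal sketch of that split, assuming a plain linear model: the
    training side exports only what prediction needs, and the
    prediction side rebuilds a predict function from that text using
    nothing but the standard library. The function names and the JSON
    layout are assumptions for illustration, and the numbers stand in
    for a fitted model's coef_ and intercept_.<br>
<pre>
```python
import json


def export_linear_model(coef, intercept):
    """Training side: write out only what prediction needs, as JSON text."""
    return json.dumps({"kind": "linear", "coef": list(coef), "intercept": intercept})


def load_predictor(text):
    """Prediction side: rebuild a predict function from the exported text."""
    spec = json.loads(text)
    if spec["kind"] != "linear":
        raise ValueError("unsupported model kind: {}".format(spec["kind"]))
    coef, intercept = spec["coef"], spec["intercept"]

    def predict(rows):
        # Dot product of each row with the coefficients, plus the intercept.
        return [sum(c * x for c, x in zip(coef, row)) + intercept for row in rows]

    return predict


# Made-up numbers standing in for a fitted model's coef_ and intercept_.
exported = export_linear_model([0.5, -1.25], 2.0)
predict = load_predictor(exported)
predictions = predict([[2.0, 0.0], [0.0, 4.0]])
```
</pre>
    The exported text can also be read with any JSON tool, so the
    parameters can be inspected without loading the training
    runtime. Models whose prediction step is more involved would need
    a correspondingly richer export.<br>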

  </body>

</html>