[scikit-learn] Query Regarding Model Scoring using scikit learn's joblib library

Debabrata Ghosh mailfordebu at gmail.com
Mon Dec 26 13:28:54 EST 2016


Dear All,

Greetings!

I need some urgent guidance and help from you all on model scoring. By
model scoring I mean the following steps:



   1. I have trained a random forest classifier using scikit-learn
   (the RandomForestClassifier class).
   2. I then generated the true positive and false positive predictions
   on my test data set using the predict_proba method (I split my data
   into training and test samples in an 80:20 ratio).
   3. Finally, I dumped the model into a .pkl file.
   4. Next, in another instance, I loaded the .pkl file.
   5. I called the predict_proba method of the model loaded through
   joblib to predict the true positives and false positives on a
   different sample. I call this step scoring, since I am predicting
   without retraining the model.
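
The steps above can be sketched roughly as follows. This is only a
minimal illustration, not the actual code behind this post: the data is
synthetic (make_classification), the file name model.pkl and all
hyperparameters are assumptions, and joblib is imported as the standalone
package (at the time of this post it was also bundled as
sklearn.externals.joblib).

```python
import os
import tempfile

import joblib
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Synthetic stand-in data (assumption: the real data set is not shown).
X, y = make_classification(n_samples=500, n_features=10, random_state=0)

# 80:20 split into training and test samples, as in step 2.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Step 1: train the random forest.
model = RandomForestClassifier(n_estimators=50, random_state=0)
model.fit(X_train, y_train)

# Step 2: class probabilities on the test set.
proba_before = model.predict_proba(X_test)

# Step 3: dump the fitted model to a .pkl file (hypothetical path).
model_path = os.path.join(tempfile.mkdtemp(), "model.pkl")
joblib.dump(model, model_path)

# Steps 4 and 5: load the model and predict without retraining.
loaded = joblib.load(model_path)
proba_after = loaded.predict_proba(X_test)

# On the same input, the loaded model reproduces the original probabilities.
same = np.allclose(proba_before, proba_after)
print(same)
```

Note that predict_proba is a method of the loaded estimator, not of
joblib itself; joblib only handles writing and reading the file.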

My question: when I compute the true positive rate (TPR) on the test data
set (as part of the model training approach), I get 10 to 12%. But when I
do the scoring (using the steps mentioned above), my TPR shoots up to
80%. I am happy to get a very high TPR, but is such a high TPR during the
scoring phase an expected outcome? In other words, is a high TPR obtained
from a model loaded through joblib acceptable compared with the TPR on
the training / test data set?
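
To compare the two numbers on an equal footing, the TPR can be computed
the same way on both samples. The sketch below is one possible way to do
that, assuming binary labels coded 0/1 and a probability threshold of
0.5; the helper name true_positive_rate, the threshold values and the toy
arrays are illustrative assumptions, not taken from my actual data.

```python
import numpy as np
from sklearn.metrics import confusion_matrix


def true_positive_rate(y_true, proba_positive, threshold=0.5):
    """TPR = TP / (TP + FN) for the positive class at a given threshold."""
    # Turn positive-class probabilities into hard 0/1 predictions.
    y_pred = (np.asarray(proba_positive) >= threshold).astype(int)
    tn, fp, fn, tp = confusion_matrix(y_true, y_pred, labels=[0, 1]).ravel()
    # Guard against a sample that contains no positives at all.
    return tp / (tp + fn) if (tp + fn) else float("nan")


# Toy labels and positive-class probabilities (made up for illustration).
y_true = np.array([1, 1, 1, 1, 0, 0, 0, 0])
proba = np.array([0.9, 0.6, 0.4, 0.2, 0.7, 0.3, 0.1, 0.05])

tpr_default = true_positive_rate(y_true, proba)                  # threshold 0.5
tpr_lenient = true_positive_rate(y_true, proba, threshold=0.15)  # lower threshold
print(tpr_default, tpr_lenient)
```

With predict_proba output, proba_positive would be the column for the
positive class (for 0/1 labels, proba[:, 1]). The value depends on the
labels, the threshold and the sample being scored, so those are the
things to keep identical when comparing the test-set TPR with the
scoring TPR.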

Your views on the above would be really helpful, as I am very confused
about whether to score the model using joblib. Otherwise, is there any
alternative to joblib that would let me do scoring without retraining
the model? Please let me know at your earliest convenience, as I am a
bit pressed for time.
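
For reference, one alternative I am aware of is the standard-library
pickle module, which can also store a fitted estimator. The sketch below
is a minimal illustration on synthetic data (all names and parameters are
assumptions); as with joblib, a pickled model should only be loaded from
a trusted source and with a compatible scikit-learn version.

```python
import pickle

from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic stand-in data (assumption: not the real data set).
X, y = make_classification(n_samples=200, n_features=8, random_state=1)
model = RandomForestClassifier(n_estimators=20, random_state=1).fit(X, y)

# Serialize the fitted model to bytes and restore it without retraining.
blob = pickle.dumps(model)
restored = pickle.loads(blob)

# The restored model gives the same predictions as the original.
match = bool((model.predict(X) == restored.predict(X)).all())
print(match)
```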



Thanks for your help in advance!



Cheers,

Debu

