[scikit-learn] Adding a function that Calculates Weight of Evidence and Information Value

urvesh patel urvesh.patel11 at gmail.com
Wed Oct 5 11:41:35 EDT 2016


Hi Andreas,

You are correct about weight of evidence. Information Value is a fancy term,
but it is very similar to mutual information. This method is most widely
used with uplift random forests and other incremental modeling problems,
where the goal is to find the subset of the population that will contribute
to the ROI goal, as opposed to the users who would have purchased anyway
and the users on whom the promotion has a negative effect.
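
To make the two quantities concrete, here is a minimal sketch, not the
function I mentioned. It assumes a binary target (1 = event), a categorical
or already-binned feature, and the common definitions
WoE(x) = ln(P(x | y=1) / P(x | y=0)) and
IV = sum over x of (P(x | y=1) - P(x | y=0)) * WoE(x). The name `woe_iv`
and the smoothing parameter `eps` are made up for illustration, and some
references flip the sign of WoE.

```python
import numpy as np
import pandas as pd


def woe_iv(feature, target, eps=0.5):
    """Return (WoE per category, total IV) for a binary target.

    Hypothetical helper for illustration only. `eps` is additive
    smoothing so that empty cells do not produce log(0).
    """
    df = pd.DataFrame({"x": feature, "y": target})
    counts = df.groupby("x")["y"].agg(events="sum", total="count")
    counts["non_events"] = counts["total"] - counts["events"]

    # Distribution of the feature within events and within non-events.
    p_event = (counts["events"] + eps) / (counts["events"] + eps).sum()
    p_non = (counts["non_events"] + eps) / (counts["non_events"] + eps).sum()

    woe = np.log(p_event / p_non)
    iv = float(((p_event - p_non) * woe).sum())
    return woe, iv


# Toy data: category "a" is mostly events, "b" mostly non-events.
x = ["a"] * 4 + ["b"] * 4
y = [1, 1, 1, 0, 0, 0, 0, 1]
woe, iv = woe_iv(x, y, eps=0.0)
# Here woe["a"] = ln(3), woe["b"] = -ln(3), and iv = ln(3).
```

Written this way, IV is the symmetrised KL divergence between the feature's
distribution among events and among non-events, which is why it behaves so
much like mutual information as a ranking score.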

Citations for Information Value that I found -
http://www.mwsug.org/proceedings/2013/AA/MWSUG-2013-AA14.pdf
http://documentation.statsoft.com/STATISTICAHelp.aspx?path=WeightofEvidence/WeightofEvidenceWoEIntroductoryOverview

More on Uplift Random Forest or Incremental Modeling -
https://www.linkedin.com/pulse/need-more-lift-try-uplift-models-jeffrey-strickland-ph-d-cmsp

PS - The function I have has a special flag for uplift modeling. When this
flag is set, Information Value and Weight of Evidence are calculated for the
uplift setting.
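
For reference, one formulation used in the uplift literature is "net"
weight of evidence: the WoE in the treatment group minus the WoE in the
control group, with a net information value built from it. The sketch below
illustrates that formulation under my own assumptions. It is not
necessarily what the flag in my function does, and the names `net_woe_iv`,
`_cond_dist`, and `eps` are hypothetical.

```python
import numpy as np
import pandas as pd


def _cond_dist(x, y, categories, eps):
    """P(x | y=1) and P(x | y=0) over a fixed set of categories."""
    x = pd.Series(x)
    events = x[y == 1].value_counts().reindex(categories, fill_value=0)
    non_events = x[y == 0].value_counts().reindex(categories, fill_value=0)
    p1 = (events + eps) / (events + eps).sum()
    p0 = (non_events + eps) / (non_events + eps).sum()
    return p1, p0


def net_woe_iv(feature, target, treated, eps=0.5):
    """Return (net WoE per category, net IV). Illustrative sketch only."""
    feature = np.asarray(feature)
    target = np.asarray(target)
    mask = np.asarray(treated).astype(bool)
    cats = np.unique(feature)

    p1_t, p0_t = _cond_dist(feature[mask], target[mask], cats, eps)
    p1_c, p0_c = _cond_dist(feature[~mask], target[~mask], cats, eps)

    # Net WoE: treatment-group WoE minus control-group WoE.
    net_woe = np.log(p1_t / p0_t) - np.log(p1_c / p0_c)
    # The weight has the same sign as net_woe, so the sum is non-negative.
    net_iv = float(((p1_t * p0_c - p0_t * p1_c) * net_woe).sum())
    return net_woe, net_iv


# Toy data: "a" responds under treatment, "b" responds without it.
x = ["a", "a", "a", "a", "b", "b", "b", "b"] * 2
y = [1, 1, 1, 0, 0, 0, 0, 1] + [0, 0, 0, 1, 1, 1, 1, 0]
t = [1] * 8 + [0] * 8
net_woe, net_iv = net_woe_iv(x, y, t, eps=0.0)
```

If the response pattern is identical in the treatment and control groups,
the net WoE is zero for every category and so is the net IV.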


On Wed, Oct 5, 2016 at 8:19 AM, Andreas Mueller <t3kcit at gmail.com> wrote:

> Hey Urvesh.
> That looks interesting. We recently added mutual information based feature
> selection.
> To add this to scikit-learn, we would like to see that this is an
> established method, for example via citations
> or forks or some other way.
> If it's only a year old (the date of the blog post) that might be a bit
> fresh for us, and you
> can add it to scikit-learn contrib.
>
> We would also like to see that there are cases when it works better than
> what is already established
> and what we have, like mutual info based selection.
>
> It looks like WOE is just the coefficient vector of Naive Bayes, right?
> I don't quite understand the information value at a glance, though.
>
> Andy
>
>
> On 10/04/2016 05:39 PM, urvesh patel wrote:
>
>
>> I used R extensively until a few months ago, when I started using
>> Python. I noticed that Python doesn't have a function to compute
>> Information Value and Weight of Evidence. Detailed explanation -
>> http://multithreaded.stitchfix.com/blog/2015/08/13/weight-of-evidence/
>>
>> I have version 0 of this concept ready and I would like to contribute it
>> to scikit-learn so that more people can use it. What steps do I need to
>> follow to do so?
>>
>> --
>> Thanking You,
>>
>> Urvesh Patel
>> Data Ninja
>> Udacity
>>
>
>
>
>
> _______________________________________________
> scikit-learn mailing list
> scikit-learn at python.org
> https://mail.python.org/mailman/listinfo/scikit-learn
>
>


-- 
Thanking You,

Urvesh Patel
Columbia University
*Masters in Operations Research*