<html xmlns:v="urn:schemas-microsoft-com:vml" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:w="urn:schemas-microsoft-com:office:word" xmlns:m="http://schemas.microsoft.com/office/2004/12/omml" xmlns="http://www.w3.org/TR/REC-html40">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=us-ascii">
<meta name="Generator" content="Microsoft Word 14 (filtered medium)">
<!--[if !mso]><style>v\:* {behavior:url(#default#VML);}
o\:* {behavior:url(#default#VML);}
w\:* {behavior:url(#default#VML);}
.shape {behavior:url(#default#VML);}
</style><![endif]--><style><!--
/* Font Definitions */
@font-face
        {font-family:Calibri;
        panose-1:2 15 5 2 2 2 4 3 2 4;}
@font-face
        {font-family:Tahoma;
        panose-1:2 11 6 4 3 5 4 4 2 4;}
@font-face
        {font-family:"Arial Narrow";
        panose-1:2 11 6 6 2 2 2 3 2 4;}
@font-face
        {font-family:Webdings;
        panose-1:5 3 1 2 1 5 9 6 7 3;}
@font-face
        {font-family:"Monotype Corsiva";
        panose-1:3 1 1 1 1 2 1 1 1 1;}
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
        {margin:0cm;
        margin-bottom:.0001pt;
        font-size:11.0pt;
        font-family:"Calibri","sans-serif";
        mso-fareast-language:EN-US;}
a:link, span.MsoHyperlink
        {mso-style-priority:99;
        color:blue;
        text-decoration:underline;}
a:visited, span.MsoHyperlinkFollowed
        {mso-style-priority:99;
        color:purple;
        text-decoration:underline;}
pre
        {mso-style-priority:99;
        mso-style-link:"Pr\00E9format\00E9 HTML Car";
        margin:0cm;
        margin-bottom:.0001pt;
        font-size:10.0pt;
        font-family:"Courier New";}
p.MsoAcetate, li.MsoAcetate, div.MsoAcetate
        {mso-style-priority:99;
        mso-style-link:"Texte de bulles Car";
        margin:0cm;
        margin-bottom:.0001pt;
        font-size:8.0pt;
        font-family:"Tahoma","sans-serif";
        mso-fareast-language:EN-US;}
span.EmailStyle17
        {mso-style-type:personal;
        font-family:"Calibri","sans-serif";
        color:windowtext;}
span.TextedebullesCar
        {mso-style-name:"Texte de bulles Car";
        mso-style-priority:99;
        mso-style-link:"Texte de bulles";
        font-family:"Tahoma","sans-serif";}
span.PrformatHTMLCar
        {mso-style-name:"Pr\00E9format\00E9 HTML Car";
        mso-style-priority:99;
        mso-style-link:"Pr\00E9format\00E9 HTML";
        font-family:"Courier New";
        mso-fareast-language:FR;}
.MsoChpDefault
        {mso-style-type:export-only;
        font-family:"Calibri","sans-serif";
        mso-fareast-language:EN-US;}
@page WordSection1
        {size:612.0pt 792.0pt;
        margin:70.85pt 70.85pt 70.85pt 70.85pt;}
div.WordSection1
        {page:WordSection1;}
--></style><!--[if gte mso 9]><xml>
<o:shapedefaults v:ext="edit" spidmax="1026" />
</xml><![endif]--><!--[if gte mso 9]><xml>
<o:shapelayout v:ext="edit">
<o:idmap v:ext="edit" data="1" />
</o:shapelayout></xml><![endif]-->
</head>
<body lang="FR" link="blue" vlink="purple">
<div class="WordSection1">
<p class="MsoNormal">Hi all,<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal"><span lang="EN-US">I am experimenting with the pipeline features of sklearn, and it seems that I cannot use a prediction algorithm as an intermediate step.<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US">For instance, in the example below I use the output of a Lasso as an additional feature to feed a random forest, so that the Lasso does some preliminary feature selection.<o:p></o:p></span></p>
<p class="MsoNormal" style="background:white"><span lang="EN-US" style="font-size:10.0pt;font-family:"Courier New";color:black;background:#FFE4FF;mso-fareast-language:FR"><o:p> </o:p></span></p>
<p class="MsoNormal"><span lang="EN-US">But I get the following error: “TypeError: All estimators should implement fit and transform.”<o:p></o:p></span></p>
<p class="MsoNormal" style="background:white"><span lang="EN-US" style="font-size:10.0pt;font-family:"Courier New";color:black;background:#FFE4FF;mso-fareast-language:FR"><o:p> </o:p></span></p>
<p class="MsoNormal"><span lang="EN-US">So I would like to add a transform method to the Lasso estimator so that it can be used in a FeatureUnion. Is that possible?<o:p></o:p></span></p>
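<p class="MsoNormal"><span lang="EN-US">To make the question concrete, here is a minimal sketch of the kind of wrapper I have in mind. The class name PredictionTransformer and the toy data are hypothetical, and I am not sure this is the recommended way to do it:<o:p></o:p></span></p>
<pre>
```python
import numpy as np
from sklearn.base import BaseEstimator, TransformerMixin, clone
from sklearn.linear_model import Lasso
from sklearn.pipeline import FeatureUnion
from sklearn.preprocessing import FunctionTransformer


class PredictionTransformer(TransformerMixin, BaseEstimator):
    """Hypothetical wrapper: exposes a regressor's predictions as one feature column."""

    def __init__(self, estimator):
        self.estimator = estimator

    def fit(self, X, y=None):
        # Fit a clone so the estimator passed to __init__ is left untouched.
        self.estimator_ = clone(self.estimator).fit(X, y)
        return self

    def transform(self, X):
        # predict() returns shape (n_samples,); FeatureUnion needs a 2-D array.
        return self.estimator_.predict(X).reshape(-1, 1)


# Toy data (assumption): 50 samples, 4 features, target driven by the first column.
rng = np.random.RandomState(0)
X = rng.randn(50, 4)
y = X[:, 0] + 0.1 * rng.randn(50)

union = FeatureUnion([
    ("raw", FunctionTransformer(lambda d: d[:, :2])),
    ("lasso", PredictionTransformer(Lasso(alpha=0.1))),
])
features = union.fit_transform(X, y)
print(features.shape)  # two raw columns plus one prediction column
```
</pre>
<p class="MsoNormal"><span lang="EN-US">With such a wrapper the Lasso predictions are computed on the same samples used for fitting, so the downstream model sees in-sample predictions; whether that is acceptable depends on the use case.<o:p></o:p></span></p>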
<p class="MsoNormal"><span lang="EN-US"><o:p> </o:p></span></p>
<p class="MsoNormal"><span lang="EN-US">Best regards<o:p></o:p></span></p>
<p class="MsoNormal" style="background:white"><span lang="EN-US" style="font-size:10.0pt;font-family:"Courier New";color:black;background:#FFE4FF;mso-fareast-language:FR"><o:p> </o:p></span></p>
<p class="MsoNormal" style="background:white"><span lang="EN-US" style="font-size:10.0pt;font-family:"Courier New";color:black;background:#FFE4FF;mso-fareast-language:FR"><o:p> </o:p></span></p>
<p class="MsoNormal" style="background:#EBF5FF"><span lang="EN-US" style="font-size:10.0pt;font-family:'Courier New';color:black;mso-fareast-language:FR">import numpy as np<br>
from sklearn.ensemble import RandomForestRegressor<br>
from sklearn.linear_model import Lasso<br>
from sklearn.pipeline import FeatureUnion, Pipeline<br>
from sklearn.preprocessing import FunctionTransformer, PolynomialFeatures, StandardScaler<o:p></o:p></span></p>
<p class="MsoNormal" style="background:#EBF5FF"><span lang="EN-US" style="font-size:10.0pt;font-family:"Courier New";color:black;background:#FFE4FF;mso-fareast-language:FR">X</span><span lang="EN-US" style="font-size:10.0pt;font-family:"Courier New";color:black;mso-fareast-language:FR">=np.hstack((np.random.randn(</span><span lang="EN-US" style="font-size:10.0pt;font-family:"Courier New";color:blue;mso-fareast-language:FR">500</span><span lang="EN-US" style="font-size:10.0pt;font-family:"Courier New";color:black;mso-fareast-language:FR">,</span><span lang="EN-US" style="font-size:10.0pt;font-family:"Courier New";color:blue;mso-fareast-language:FR">10</span><span lang="EN-US" style="font-size:10.0pt;font-family:"Courier New";color:black;mso-fareast-language:FR">),np.random.randint(</span><span lang="EN-US" style="font-size:10.0pt;font-family:"Courier New";color:blue;mso-fareast-language:FR">0</span><span lang="EN-US" style="font-size:10.0pt;font-family:"Courier New";color:black;mso-fareast-language:FR">,</span><span lang="EN-US" style="font-size:10.0pt;font-family:"Courier New";color:blue;mso-fareast-language:FR">10</span><span lang="EN-US" style="font-size:10.0pt;font-family:"Courier New";color:black;mso-fareast-language:FR">,(</span><span lang="EN-US" style="font-size:10.0pt;font-family:"Courier New";color:blue;mso-fareast-language:FR">500</span><span lang="EN-US" style="font-size:10.0pt;font-family:"Courier New";color:black;mso-fareast-language:FR">,</span><span lang="EN-US" style="font-size:10.0pt;font-family:"Courier New";color:blue;mso-fareast-language:FR">10</span><span lang="EN-US" style="font-size:10.0pt;font-family:"Courier New";color:black;mso-fareast-language:FR">))))
</span><i><span lang="EN-US" style="font-size:10.0pt;font-family:"Courier New";color:gray;mso-fareast-language:FR"># regressor variables<br>
</span></i><span lang="EN-US" style="font-size:10.0pt;font-family:"Courier New";color:black;mso-fareast-language:FR">y=np.random.randn(</span><span lang="EN-US" style="font-size:10.0pt;font-family:"Courier New";color:blue;mso-fareast-language:FR">500</span><span lang="EN-US" style="font-size:10.0pt;font-family:"Courier New";color:black;mso-fareast-language:FR">)
</span><i><span lang="EN-US" style="font-size:10.0pt;font-family:"Courier New";color:gray;mso-fareast-language:FR"># target variable<br>
</span></i><span lang="EN-US" style="font-size:10.0pt;font-family:"Courier New";color:black;mso-fareast-language:FR">ct_get = FunctionTransformer(</span><b><span lang="EN-US" style="font-size:10.0pt;font-family:"Courier New";color:navy;mso-fareast-language:FR">lambda
</span></b><span lang="EN-US" style="font-size:10.0pt;font-family:"Courier New";color:black;mso-fareast-language:FR">d: d[:, </span><span lang="EN-US" style="font-size:10.0pt;font-family:"Courier New";color:blue;mso-fareast-language:FR">0</span><span lang="EN-US" style="font-size:10.0pt;font-family:"Courier New";color:black;mso-fareast-language:FR">:</span><span lang="EN-US" style="font-size:10.0pt;font-family:"Courier New";color:blue;mso-fareast-language:FR">10</span><span lang="EN-US" style="font-size:10.0pt;font-family:"Courier New";color:black;mso-fareast-language:FR">])
</span><i><span lang="EN-US" style="font-size:10.0pt;font-family:"Courier New";color:gray;mso-fareast-language:FR"># transformer to extract continuous variables<br>
</span></i><span lang="EN-US" style="font-size:10.0pt;font-family:"Courier New";color:black;mso-fareast-language:FR">dt_get = FunctionTransformer(</span><b><span lang="EN-US" style="font-size:10.0pt;font-family:"Courier New";color:navy;mso-fareast-language:FR">lambda
</span></b><span lang="EN-US" style="font-size:10.0pt;font-family:"Courier New";color:black;mso-fareast-language:FR">d: d[:, </span><span lang="EN-US" style="font-size:10.0pt;font-family:"Courier New";color:blue;mso-fareast-language:FR">10</span><span lang="EN-US" style="font-size:10.0pt;font-family:"Courier New";color:black;mso-fareast-language:FR">:</span><span lang="EN-US" style="font-size:10.0pt;font-family:"Courier New";color:blue;mso-fareast-language:FR">20</span><span lang="EN-US" style="font-size:10.0pt;font-family:"Courier New";color:black;mso-fareast-language:FR">])
</span><i><span lang="EN-US" style="font-size:10.0pt;font-family:"Courier New";color:gray;mso-fareast-language:FR"># transformer to extract discrete variables<br>
<br>
# first step is a regression pipeline<br>
</span></i><span lang="EN-US" style="font-size:10.0pt;font-family:"Courier New";color:black;mso-fareast-language:FR">reg = Pipeline([(</span><b><span lang="EN-US" style="font-size:10.0pt;font-family:"Courier New";color:green;mso-fareast-language:FR">'ct_vars'</span></b><span lang="EN-US" style="font-size:10.0pt;font-family:"Courier New";color:black;mso-fareast-language:FR">,ct_get),(</span><b><span lang="EN-US" style="font-size:10.0pt;font-family:"Courier New";color:green;mso-fareast-language:FR">'scaler'</span></b><span lang="EN-US" style="font-size:10.0pt;font-family:"Courier New";color:black;mso-fareast-language:FR">,StandardScaler()),(</span><b><span lang="EN-US" style="font-size:10.0pt;font-family:"Courier New";color:green;mso-fareast-language:FR">'poly'</span></b><span lang="EN-US" style="font-size:10.0pt;font-family:"Courier New";color:black;mso-fareast-language:FR">,PolynomialFeatures(</span><span lang="EN-US" style="font-size:10.0pt;font-family:"Courier New";color:#660099;mso-fareast-language:FR">degree</span><span lang="EN-US" style="font-size:10.0pt;font-family:"Courier New";color:black;mso-fareast-language:FR">=</span><span lang="EN-US" style="font-size:10.0pt;font-family:"Courier New";color:blue;mso-fareast-language:FR">3</span><span lang="EN-US" style="font-size:10.0pt;font-family:"Courier New";color:black;mso-fareast-language:FR">)),(</span><b><span lang="EN-US" style="font-size:10.0pt;font-family:"Courier New";color:green;mso-fareast-language:FR">'lasso'</span></b><span lang="EN-US" style="font-size:10.0pt;font-family:"Courier New";color:black;mso-fareast-language:FR">,Lasso())])<br>
</span><i><span lang="EN-US" style="font-size:10.0pt;font-family:"Courier New";color:gray;mso-fareast-language:FR"># A random forest feeds on the discrete part of the data + one continuous variable<br>
</span></i><span lang="EN-US" style="font-size:10.0pt;font-family:"Courier New";color:black;mso-fareast-language:FR">estimator = Pipeline([(</span><b><span lang="EN-US" style="font-size:10.0pt;font-family:"Courier New";color:green;mso-fareast-language:FR">'level1'</span></b><span lang="EN-US" style="font-size:10.0pt;font-family:"Courier New";color:black;mso-fareast-language:FR">,FeatureUnion([(</span><b><span lang="EN-US" style="font-size:10.0pt;font-family:"Courier New";color:green;mso-fareast-language:FR">'dt_vars'</span></b><span lang="EN-US" style="font-size:10.0pt;font-family:"Courier New";color:black;mso-fareast-language:FR">,dt_get),(</span><b><span lang="EN-US" style="font-size:10.0pt;font-family:"Courier New";color:green;mso-fareast-language:FR">'reg'</span></b><span lang="EN-US" style="font-size:10.0pt;font-family:"Courier New";color:black;mso-fareast-language:FR">,reg)])),(</span><b><span lang="EN-US" style="font-size:10.0pt;font-family:"Courier New";color:green;mso-fareast-language:FR">'rf'</span></b><span lang="EN-US" style="font-size:10.0pt;font-family:"Courier New";color:black;mso-fareast-language:FR">,RandomForestRegressor())])<br>
<br>
estimator.fit(<span style="background:#E4E4FF">X</span>,y)<br>
</span><span lang="EN-US" style="font-size:10.0pt;font-family:"Courier New";color:black;mso-fareast-language:FR">print(</span><b><span lang="EN-US" style="font-size:10.0pt;font-family:"Courier New";color:green;mso-fareast-language:FR">"R^2 score is:"</span></b><span lang="EN-US" style="font-size:10.0pt;font-family:"Courier New";color:black;mso-fareast-language:FR">, estimator.score(<span style="background:#E4E4FF">X</span>, y))<o:p></o:p></span></p>
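<p class="MsoNormal"><span lang="EN-US">For reference, the computation that the pipeline above is meant to perform can be sketched by hand, without FeatureUnion. This is only an illustration under assumptions: the random_state and n_estimators values are arbitrary choices, and the score is computed in-sample:<o:p></o:p></span></p>
<pre>
```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import Lasso
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler

# Same layout as the example: 10 continuous columns followed by 10 discrete ones.
rng = np.random.RandomState(0)
X = np.hstack((rng.randn(500, 10), rng.randint(0, 10, (500, 10))))
y = rng.randn(500)

X_cont, X_disc = X[:, :10], X[:, 10:]

# Step 1: the regression pipeline is fitted on the continuous columns only.
reg = make_pipeline(StandardScaler(), PolynomialFeatures(degree=3), Lasso())
reg.fit(X_cont, y)

# Step 2: the Lasso prediction becomes one extra column next to the discrete ones.
stacked = np.column_stack((X_disc, reg.predict(X_cont)))

# Step 3: the random forest is fitted on the stacked features.
rf = RandomForestRegressor(n_estimators=20, random_state=0).fit(stacked, y)
r2 = rf.score(stacked, y)
print("in-sample R^2:", r2)
```
</pre>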
<p class="MsoNormal"><span lang="EN-US"><o:p> </o:p></span></p>
<p class="MsoNormal"><span lang="EN-US"><o:p> </o:p></span></p>
<p class="MsoNormal"><span lang="EN-US" style="color:#1F497D;mso-fareast-language:FR"><o:p> </o:p></span></p>
<table class="MsoNormalTable" border="0" cellspacing="0" cellpadding="0" style="border-collapse:collapse">
<tbody>
<tr>
<td style="border:none;border-top:solid #BB9A4B 1.0pt;padding:.75pt .75pt .75pt .75pt">
<p class="MsoNormal"><b><span lang="EN-US" style="font-size:10.0pt;color:black;mso-fareast-language:FR">Augustin LEFEVRE</span></b><span lang="EN-US" style="font-size:12.0pt;color:black;mso-fareast-language:FR">|<b>
</b></span><span lang="EN-US" style="font-size:10.0pt;color:black;mso-fareast-language:FR">Consultant Senior |</span><span lang="EN-US" style="font-size:10.0pt;color:#BB9A4B;mso-fareast-language:FR"> Ykems</span><span lang="EN-US" style="font-size:10.0pt;color:black;mso-fareast-language:FR">
 | - </span><span lang="EN-US" style="font-size:10.0pt;color:black;mso-fareast-language:FR"><o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US" style="font-size:10.0pt;color:black;mso-fareast-language:FR"> M : +33 7 77 97 94 89 |
</span><span style="font-size:10.0pt;color:black;mso-fareast-language:FR"><a href="mailto:alefevre@ykems.com"><span lang="EN-US" style="color:black;text-decoration:none">alefevre@ykems.com</span></a></span><span lang="EN-US" style="font-size:10.0pt;color:black;mso-fareast-language:FR">
 | </span><span style="font-size:10.0pt;color:black;mso-fareast-language:FR"><a href="http://www.ykems.com/"><span lang="EN-US" style="color:black;text-decoration:none">www.ykems.com</span></a></span><span lang="EN-US" style="font-size:10.0pt;color:black;mso-fareast-language:FR"><o:p></o:p></span></p>
</td>
</tr>
</tbody>
</table>
</div>
</body>
</html>