<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
</head>
<body text="#000000" bgcolor="#FFFFFF">
Splitting the data into training and test sets is needed with any
machine learning model, not just linear regression (whether or not it
is fit by least squares).
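<p>The idea is that you want to evaluate the performance of your
model (prediction + scoring) on a portion of the data that was not
used for training. Least squares only minimizes the error on the data
it is fit to, so a score computed on that same data is an optimistic
estimate of how the model will do on new data.</p>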
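<p>As a rough illustration, here is a minimal sketch of that workflow.
The data is synthetic and made up for this example (the coefficients,
noise level and 25% test size are all assumptions, not anything from
your dataset); only <code>train_test_split</code>,
<code>LinearRegression</code> and its <code>score</code> method (which
returns R^2) come from scikit-learn.</p>
<pre>
```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Synthetic data, invented for illustration: 200 samples, 3 features,
# a linear target with a small amount of Gaussian noise.
rng = np.random.RandomState(0)
X = rng.normal(size=(200, 3))
y = X @ np.array([1.5, -2.0, 0.5]) + rng.normal(scale=0.1, size=200)

# Hold out 25% of the samples; the model never sees them during fit.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Ordinary least squares, fit on the training portion only.
model = LinearRegression().fit(X_train, y_train)

# score() returns R^2. The test value is the one that estimates
# performance on unseen data.
train_r2 = model.score(X_train, y_train)
test_r2 = model.score(X_test, y_test)
print("train R^2:", train_r2)
print("test R^2:", test_r2)
```
</pre>
<p>On clean synthetic data like this the two scores will likely be
close; the gap tends to grow with more features, fewer samples, or
noisier targets, which is where the held-out score matters most.</p>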
<p>You'll find more details in the user guide:
<a class="moz-txt-link-freetext" href="https://scikit-learn.org/stable/modules/cross_validation.html">https://scikit-learn.org/stable/modules/cross_validation.html</a><br>
</p>
<p>Nicolas</p>
<p><br>
</p>
<div class="moz-cite-prefix">On 5/31/19 8:54 PM, C W wrote:<br>
</div>
<blockquote type="cite"
cite="mid:CAE2FW2=OT8jsXRE598WH9rGNhBmXYtgb0ZdH3LgcM+wihhd-+A@mail.gmail.com">
<meta http-equiv="content-type" content="text/html; charset=UTF-8">
<div dir="ltr">Hello everyone,
<div><br>
</div>
<div>I'm new to scikit-learn. I see that many tutorials in
scikit-learn follow a workflow along the lines of</div>
<div>1) transform the data</div>
<div>2) split the data: train, test</div>
<div>3) instantiate the sklearn object and fit</div>
<div>4) predict and tune parameters</div>
<div><br>
</div>
<div>But linear regression is fit by least squares, so I don't
think a train/test split is necessary. So I guess I can just
use the entire dataset?</div>
<div><br>
</div>
<div>Thanks in advance!</div>
</div>
<br>
</blockquote>
</body>
</html>