<div dir="ltr">CW,<div><br></div><div>You might want to read <a href="http://greenteapress.com/wp/think-python/">http://greenteapress.com/wp/think-python/</a> (available as a free PDF)</div><div>(for the basics of programming and Python)</div><div>and </div><div>Python for Data Analysis<br></div><div> Data Wrangling with Pandas, NumPy, and IPython, O'Reilly</div><div><br></div><div>(for the data analysis libraries: pandas, NumPy, IPython, ...)</div><div><br></div></div><div class="gmail_extra"><br><div class="gmail_quote">On Sun, Jun 18, 2017 at 10:18 PM, C W <span dir="ltr"><<a href="mailto:tmrsg11@gmail.com" target="_blank">tmrsg11@gmail.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr">Hi Sebastian,<div><br></div><div>I looked through your book. I think it is great if you already know Python and are looking to learn machine learning.</div><div><br></div><div>For me, I have some sense of machine learning, but none of Python.</div><div><br></div><div>Unlike R, which is specifically for statistical analysis, Python is broad!</div><div><br></div><div>Maybe some expert here who knows R can tell me how to go about this. :)</div></div><div class="gmail_extra"><br><div class="gmail_quote">On Sun, Jun 18, 2017 at 12:53 PM, Sebastian Raschka <span dir="ltr"><<a href="mailto:se.raschka@gmail.com" target="_blank">se.raschka@gmail.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Hi,<br>
<span><br>
> I am extremely frustrated using this thing. Everything comes after a dot! Why would you type the same thing at the beginning of every line? It's not efficient.<br>
><br>
> code 1:<br>
> y_sin = np.sin(x)<br>
> y_cos = np.cos(x)<br>
><br>
> I know you can import the entire package without the "as np", but I see np.something as the standard. Why?<br>
<br>
</span>Because it makes it clear where this function is coming from. Sure, you could do<br>
<br>
from numpy import *<br>
<br>
but this is NOT recommended, because it clutters up your main namespace. For instance, NumPy has its own `sum` function. If you do `from numpy import *`, Python's built-in `sum` is shadowed in your main namespace by NumPy's `sum`, which behaves differently. This is confusing and should be avoided.<br>
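<br>
A minimal sketch of why the name clash matters, assuming NumPy is installed. It calls the two functions by their full names instead of doing the star import, so nothing is actually overwritten; the variable names are only for illustration.<br>
<pre>
```python
import builtins

import numpy as np

values = [0, 1, 2, 3, 4]

# Built-in sum(iterable, start): the second argument is a starting value
# that is added to the total.
builtin_result = builtins.sum(values, -1)

# numpy.sum(a, axis): the second argument is the axis to sum along,
# so -1 means "the last axis" and nothing is subtracted.
numpy_result = np.sum(values, -1)

print("built-in sum:", builtin_result)
print("np.sum:      ", numpy_result)
```
</pre>
The same call, sum(values, -1), therefore gives two different answers depending on which `sum` the name currently points to. With the np. prefix it is always visible which one is meant.<br>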
<span><br>
> In the code above, sklearn > linear_model > Ridge, one lives inside the other. It feels like there are multiple layers. How deep do I have to dig?<br>
><br>
> Can someone explain the mentality behind this setup?<br>
<br>
</span>This is one way to organize code in a package. Sklearn contains many things, and organizing it into subpackages (linear_model, svm, ...) only makes sense; otherwise, you would end up with code files of more than 100,000 lines or so, which would make life really hard for package developers.<br>
<br>
Here, scikit-learn tries to follow the core principles of good object-oriented program design, for instance abstraction, encapsulation, modularity, hierarchy, ...<br>
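<br>
As for how deep you have to dig: in practice, usually only one level below sklearn. A small sketch, assuming scikit-learn is installed, that reuses the toy data from Code 3 (the names X, y and same_class are only for illustration):<br>
<pre>
```python
from sklearn import linear_model        # import the subpackage
from sklearn.linear_model import Ridge  # import the class directly

# Both spellings name the very same class object.
same_class = linear_model.Ridge is Ridge

# The toy data from Code 3 in the original question.
X = [[0, 0], [0, 0], [1, 1]]
y = [0, 0.1, 1]

reg = Ridge(alpha=0.5)
reg.fit(X, y)

print(same_class, reg.coef_.shape)
```
</pre>
Which spelling to use is a matter of taste: linear_model.Ridge keeps it visible where Ridge comes from, while importing Ridge directly saves typing.<br>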
<span><br>
> What are some good ways and resources to learn Python for data analysis?<br>
<br>
</span>Based on your questions, I think a good resource would be an introductory programming book or course. In particular, sections on object-oriented programming would make the rationale, design, and API of scikit-learn, and Python classes as a whole, more accessible and address your concerns and questions.<br>
<br>
Best,<br>
Sebastian<br>
<div><div class="m_-4000930632618580597h5"><br>
> On Jun 18, 2017, at 12:02 PM, C W <<a href="mailto:tmrsg11@gmail.com" target="_blank">tmrsg11@gmail.com</a>> wrote:<br>
><br>
> Dear Scikit-learn,<br>
><br>
> What are some good ways and resources to learn Python for data analysis?<br>
><br>
> I am extremely frustrated using this thing. Everything comes after a dot! Why would you type the same thing at the beginning of every line? It's not efficient.<br>
><br>
> code 1:<br>
> y_sin = np.sin(x)<br>
> y_cos = np.cos(x)<br>
><br>
> I know you can import the entire package without the "as np", but I see np.something as the standard. Why?<br>
><br>
> Code 2:<br>
> model = LogisticRegression()<br>
> model.fit(X_train, y_train)<br>
> model.score(X_test, y_test)<br>
><br>
> In R, everything is saved to a variable. In the code above, if I accidentally ran model.fit(), I would not know.<br>
><br>
> Code 3:<br>
> from sklearn import linear_model<br>
> reg = linear_model.Ridge(alpha=0.5)<br>
> reg.fit([[0, 0], [0, 0], [1, 1]], [0, 0.1, 1])<br>
><br>
> In the code above, sklearn > linear_model > Ridge, one lives inside the other. It feels like there are multiple layers. How deep do I have to dig?<br>
><br>
> Can someone explain the mentality behind this setup?<br>
><br>
> Thank you very much!<br>
><br>
> M<br>
</div></div>> ______________________________<wbr>_________________<br>
> scikit-learn mailing list<br>
> <a href="mailto:scikit-learn@python.org" target="_blank">scikit-learn@python.org</a><br>
> <a href="https://mail.python.org/mailman/listinfo/scikit-learn" rel="noreferrer" target="_blank">https://mail.python.org/mailma<wbr>n/listinfo/scikit-learn</a><br>
<br>
</blockquote></div><br></div>
<br></blockquote></div><br></div>