[scikit-learn] R user trying to learn Python
Nelle Varoquaux
nelle.varoquaux at gmail.com
Sun Jun 18 16:37:30 EDT 2017
Hello,
The concepts behind R and Python are quite different. Python is
meant to be as explicit as possible and relies heavily on
namespaces, which typical R code does not spell out.
While Python code can seem more verbose, it is very clear
when reading it which functions come from which module and
submodule (this relates to your code 1 and code 3 examples).
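To make the namespace point concrete, here is a minimal sketch, assuming NumPy is installed. The variable names (values, builtin_total, numpy_total) are only illustrative. It shows that the built-in sum and NumPy's sum are different functions that happen to share a name, which is why keeping the "np." prefix helps.

```python
import builtins

import numpy as np

# The "np." prefix shows at a glance that sin and cos come from NumPy.
x = np.linspace(0.0, np.pi, 5)
y_sin = np.sin(x)
y_cos = np.cos(x)

# Two different functions share the name "sum"; they disagree on what a
# second positional argument means.
values = [0, 1, 2, 3, 4]
builtin_total = builtins.sum(values, -1)  # -1 is the start value
numpy_total = np.sum(values, -1)          # -1 is the axis to sum over

print(builtin_total, numpy_total)
```

With these inputs the built-in version gives 9 and the NumPy version gives 10, so silently swapping one for the other via a star import can change results without any error.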
For example 2, R indeed saves everything to a variable, while Python
does not: calling fit updates the model object in place. The advantage is
that this is typically more time and memory efficient, since no copy of
the model is made. The tradeoff is that you do not keep intermediate
results.
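As a rough illustration of the in-place behaviour, here is a small sketch, assuming scikit-learn is installed and reusing the data from your code 3. The names returned, fitted_before, same_object and snapshot are just for illustration, and copying with deepcopy is one possible way to keep an intermediate result, not the only one.

```python
from copy import deepcopy

from sklearn.linear_model import Ridge

X = [[0, 0], [0, 0], [1, 1]]
y = [0, 0.1, 1]

reg = Ridge(alpha=0.5)

# Learned attributes end with an underscore and only exist after fit,
# so you can check whether a model has been fitted.
fitted_before = hasattr(reg, "coef_")

# fit modifies reg in place and returns the same object.
returned = reg.fit(X, y)
same_object = returned is reg

# Keep an explicit copy if you want to hold on to this fitted state.
snapshot = deepcopy(reg)

# Refitting overwrites the learned coefficients on reg only.
reg.fit(X, [1, 0.9, 0])

print(fitted_before, same_object)
print(snapshot.coef_, reg.coef_)
```

After the second fit, snapshot still holds the coefficients from the first fit, while reg holds the new ones.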
Hope that helps,
N
On 18 June 2017 at 13:18, C W <tmrsg11 at gmail.com> wrote:
> Hi Sebastian,
>
> I looked through your book. I think it is great if you already know Python
> and are looking to learn machine learning.
>
> For me, I have some sense of machine learning, but none of Python.
>
> Unlike R, which is specifically for statistical analysis, Python is broad!
>
> Maybe some expert here with R can tell me how to go about this. :)
>
> On Sun, Jun 18, 2017 at 12:53 PM, Sebastian Raschka <se.raschka at gmail.com>
> wrote:
>>
>> Hi,
>>
>> > I am extremely frustrated using this thing. Everything comes after a
>> > dot! Why would you type the same thing at the beginning of every line? It's
>> > not efficient.
>> >
>> > code 1:
>> > y_sin = np.sin(x)
>> > y_cos = np.cos(x)
>> >
>> > I know you can import the entire package without the "as np", but I see
>> > np.something as the standard. Why?
>>
>> Because it makes it clear where this function is coming from. Sure, you
>> could do
>>
>> from numpy import *
>>
>> but this is NOT!!! recommended. The reason why this is not recommended is
>> that it would clutter up your main namespace. For instance, NumPy has its
>> own sum function. If you do from numpy import *, Python's built-in `sum`
>> will be shadowed in your main namespace by NumPy's sum. This is
>> confusing and should be avoided.
>>
>> > In the code above, sklearn > linear_model > Ridge, one lives inside the
>> > other. It feels like there are multiple layers; how deep do I have to dig in?
>> >
>> > Can someone explain the mentality behind this setup?
>>
>> This is one way to organize your code and package. Sklearn contains many
>> things, and organizing it by subpackages (linear_model, svm, ...) only makes
>> sense; otherwise, you would end up with code files of more than 100,000
>> lines or so, which would make life really hard for package developers.
>>
>> Here, scikit-learn tries to follow the core principles of good
>> object-oriented program design, for instance, abstraction, encapsulation,
>> modularity, hierarchy, ...
>>
>> > What are some good ways and resources to learn Python for data analysis?
>>
>> I think, based on your questions, a good resource would be an introduction
>> to programming book or course. I think that sections on object-oriented
>> programming would make the rationale/design/API of scikit-learn and Python
>> classes as a whole more accessible and address your concerns and questions.
>>
>> Best,
>> Sebastian
>>
>> > On Jun 18, 2017, at 12:02 PM, C W <tmrsg11 at gmail.com> wrote:
>> >
>> > Dear Scikit-learn,
>> >
>> > What are some good ways and resources to learn Python for data analysis?
>> >
>> > I am extremely frustrated using this thing. Everything comes after a
>> > dot! Why would you type the same thing at the beginning of every line? It's
>> > not efficient.
>> >
>> > code 1:
>> > y_sin = np.sin(x)
>> > y_cos = np.cos(x)
>> >
>> > I know you can import the entire package without the "as np", but I see
>> > np.something as the standard. Why?
>> >
>> > Code 2:
>> > model = LogisticRegression()
>> > model.fit(X_train, y_train)
>> > model.score(X_test, y_test)
>> >
>> > In R, everything is saved to a variable. In the code above, if I
>> > accidentally ran model.fit(), I would not know.
>> >
>> > Code 3:
>> > from sklearn import linear_model
>> > reg = linear_model.Ridge(alpha=0.5)
>> > reg.fit([[0, 0], [0, 0], [1, 1]], [0, 0.1, 1])
>> >
>> > In the code above, sklearn > linear_model > Ridge, one lives inside the
>> > other. It feels like there are multiple layers; how deep do I have to dig in?
>> >
>> > Can someone explain the mentality behind this setup?
>> >
>> > Thank you very much!
>> >
>> > M
>> > _______________________________________________
>> > scikit-learn mailing list
>> > scikit-learn at python.org
>> > https://mail.python.org/mailman/listinfo/scikit-learn