[scikit-learn] R user trying to learn Python

Sun Jun 18 19:07:37 EDT 2017

Hi M,  

I think what you describe can be summarized as the difference between a domain-specific language (R) and a general-purpose language (Python). Most of what you describe is related to namespaces, "one honking great" feature of Python. Namespaces are less needed in R because R is domain-specific. But if you write your webserver's frontend, database access, prediction engine, user authentication, and whatnot all in Python (or at least a large part of it), then namespaces help a lot in keeping those domains apart.

I also added a couple of more specific answers to your points below, but I somehow can't make them appear unquoted, so they show up as part of the quoted text. I hope you can all find them.  

Hope that helps, Ingo  

>  
>  
>  
> I am extremely frustrated using this thing. Everything comes after a dot! Why would you type the same thing at the beginning of every line? It's not efficient.
>  
>  
>
>  
> This is mostly the Python way to do namespaces. Although it may not be efficient when you type, it is efficient when you debug: you always see both the function/method *and* the namespace it comes from.  
>  
>
>  
> code 1:
>  
> y_sin = np.sin(x)
>  
> y_cos = np.cos(x)
>  
>
>  
> I know you can import the entire package without the "as np", but I see np.something as the standard. Why?
>  
>
>  
> Imagine you were doing an analysis for the Catholic church. Obviously sins would play an important role, so there might be a function called "sin" somewhere that does something entirely different from a trigonometric function. OK, maybe this is a bad example, but you get the idea. In this case it might even be a real issue, because math.sin and numpy.sin do similar but different things. That could be difficult to debug, and it's handy to mark which one you are using where.  
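
To make the math.sin versus numpy.sin point concrete, here is a minimal sketch. The input list x is made up for illustration; the only claim is that the two functions share a name but treat their argument differently.

```python
import math
import numpy as np

# Hypothetical input, chosen only for illustration.
x = [0.0, math.pi / 2]

# numpy.sin works element-wise on a sequence and returns an array.
y_np = np.sin(x)

# math.sin accepts a single number and returns a plain float.
y_math = math.sin(x[1])

# Passing the whole list to math.sin raises an error instead of vectorizing.
try:
    math.sin(x)
    math_accepts_list = True
except TypeError:
    math_accepts_list = False

print(type(y_np).__name__, type(y_math).__name__, math_accepts_list)
```

With the prefix in place, a reader can tell at a glance which of the two behaviours a given line relies on.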
>  
>
>  
>  
>  
> Code 2:
>  
> model = LogisticRegression()
>  
> model.fit(X_train, y_train)
>  
>  
> model.score(X_test, y_test)
>    
>
>  
> In R, everything is saved to a variable. In the code above, what if I accidentally ran model.fit()? I would not know.
>  
>
>  
> - That's right. You would not know. This is a design decision in sklearn, with advantages and disadvantages. Sklearn uses stateful objects here, and you would expect such objects to change when you call their methods. Note, though, that the methods you call on your model also return values that are likely what you expect: fit returns the fitted model itself, and score returns the score.  
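
The stateful behaviour can be checked directly. This is a small sketch on a made-up toy data set (the numbers carry no meaning); it only illustrates that fit changes the object in place and hands the same object back.

```python
from sklearn.linear_model import LogisticRegression

# Tiny made-up data set, for illustration only.
X_train = [[0.0], [1.0], [2.0], [3.0]]
y_train = [0, 0, 1, 1]

model = LogisticRegression()

# Before fitting, learned attributes such as coef_ do not exist yet.
fitted_before = hasattr(model, "coef_")

# fit() modifies the model in place and also returns that same object.
returned = model.fit(X_train, y_train)
fitted_after = hasattr(model, "coef_")

# score() returns a plain number (mean accuracy) rather than a model.
accuracy = model.score(X_train, y_train)

print(fitted_before, fitted_after, returned is model)
```

So one way to tell whether a model has been fitted is to look for the attributes ending in an underscore, which only appear after fit has run.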
>  
>
>  
> Code 3:
>  from sklearn import linear_model
>  reg = linear_model.Ridge(alpha=0.5)
>  reg.fit([[0, 0], [0, 0], [1, 1]], [0, 0.1, 1])
>
>  
> In the code above, sklearn > linear_model > Ridge, one lives inside the other. It feels like there are multiple layers. How deep do I have to dig?
>  
>
>  
> - Again, this is the namespace idea. Python allows you to group functions, classes, and even namespaces themselves in namespaces. For larger packages, this can be very useful because you can structure your code accordingly.  
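
As for how deep you have to dig: the import statement lets you pick the level you want to type. A short sketch, reusing the Ridge example from Code 3, shows that the different spellings all reach the same class.

```python
import sklearn.linear_model
from sklearn import linear_model
from sklearn.linear_model import Ridge

# All three spellings refer to the very same class object.
same_class = sklearn.linear_model.Ridge is linear_model.Ridge is Ridge

# With the class imported directly, no prefix is needed at the call site.
reg = Ridge(alpha=0.5)
reg.fit([[0, 0], [0, 0], [1, 1]], [0, 0.1, 1])

print(same_class, reg.coef_.shape)
```

Which form to use is a matter of taste: the longer spelling says where Ridge comes from on every line, the shorter one says it once at the top of the file.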
>  
>  
