From ps1988.191919 at gmail.com Sun Oct 1 00:58:48 2017 From: ps1988.191919 at gmail.com (Paul Smith) Date: Sun, 1 Oct 2017 13:58:48 +0900 Subject: [scikit-learn] Commercial use of ML algorithms and scikit-learn Message-ID: Dear Scikit-learn users: My name is Paul and I am working on a large electronics company. Sorry that I cannot reveal the name of company. My boss asked me to improve our business using ML algorithms. However I recently found many of ML algorithms are patented. Are there any legal problems if I use ML algorithms like SVM, decision trees, clustering methods, and feature extractions for my company without permissions? If there are no problems, can I use scikit-learn for implementation? Could anyone advise me on this issue please? Thank you a lot and have a nice weekend. Best regards, Paul -------------- next part -------------- An HTML attachment was scrubbed... URL: From se.raschka at gmail.com Sun Oct 1 01:18:34 2017 From: se.raschka at gmail.com (Sebastian Raschka) Date: Sun, 1 Oct 2017 01:18:34 -0400 Subject: [scikit-learn] Commercial use of ML algorithms and scikit-learn In-Reply-To: References: Message-ID: <90DD3D37-E062-4E6D-942E-A31B1F263382@gmail.com> Hi, Paul, I think there should be no issue with that as scikit-learn is distributed under a BSD v3 license as long as you uphold the terms of that license. It's a bit tricky to find that license note as it's not called "LICENSE" in the GitHub repo like it is usually done for open source projects, but it is there in a file called "COPYING" (https://github.com/scikit-learn/scikit-learn/blob/master/COPYING): > New BSD License > > Copyright (c) 2007?2017 The scikit-learn developers. > All rights reserved. > > > Redistribution and use in source and binary forms, with or without > modification, are permitted provided that the following conditions are met: > > a. Redistributions of source code must retain the above copyright notice, > this list of conditions and the following disclaimer. > b. Redistributions in binary form must reproduce the above copyright > notice, this list of conditions and the following disclaimer in the > documentation and/or other materials provided with the distribution. > c. Neither the name of the Scikit-learn Developers nor the names of > its contributors may be used to endorse or promote products > derived from this software without specific prior written > permission. > > > THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" > AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE > IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE > ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE FOR > ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL > DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR > SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER > CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT > LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY > OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH > DAMAGE. > In a nutshell, it would mean that you can do anything with scikit-learn except that can't use the names of sklearn devs or sklearn itself to promote your products, and you have to include the license if you redistribute any parts of sklearn. 
However, I'd still suggest to consult someone in your legal department regarding the license to make sure that you don't run into any troubles later on. Best, Sebastian > On Oct 1, 2017, at 12:58 AM, Paul Smith wrote: > > Dear Scikit-learn users: > > My name is Paul and I am working on a large electronics company. Sorry that I cannot reveal the name of company. > > My boss asked me to improve our business using ML algorithms. However I recently found many of ML algorithms are patented. > > Are there any legal problems if I use ML algorithms like SVM, decision trees, clustering methods, and feature extractions for my company without permissions? > > If there are no problems, can I use scikit-learn for implementation? > > Could anyone advise me on this issue please? > > Thank you a lot and have a nice weekend. > > Best regards, > Paul > _______________________________________________ > scikit-learn mailing list > scikit-learn at python.org > https://mail.python.org/mailman/listinfo/scikit-learn From rvernica at gmail.com Sun Oct 1 18:53:36 2017 From: rvernica at gmail.com (Rares Vernica) Date: Sun, 1 Oct 2017 15:53:36 -0700 Subject: [scikit-learn] Combine already fitted models Message-ID: Hello, I have a distributed setup where subsets of the data is available at different hosts. I plan to have each host fit a model with the subset of the data it owns. Once these individual models are fitted, how can I go about and combine them under one model. I don't have a preference on a specific algorithm, but I am looking into a classification problem. I am looking at VotingClassifier but it seems that it is expected that the estimators are fitted when VotingClassifier.fit() is called. I don't see how I can have already fitted classifiers combined under a VotingClassifier. Thanks! Rares -------------- next part -------------- An HTML attachment was scrubbed... URL: From se.raschka at gmail.com Sun Oct 1 19:10:50 2017 From: se.raschka at gmail.com (Sebastian Raschka) Date: Sun, 1 Oct 2017 19:10:50 -0400 Subject: [scikit-learn] Combine already fitted models In-Reply-To: References: Message-ID: <1B8AF2A6-DC73-42BE-8C59-5335EA330135@gmail.com> Hi, Rares, > I am looking at VotingClassifier but it seems that it is expected that the estimators are fitted when VotingClassifier.fit() is called. I don't see how I can have already fitted classifiers combined under a VotingClassifier. I think the opposite is true: The classifiers provided via an `estimators` argument upon initialization will be cloned and fitted if you call VotingClassifier's fit(). Based on your follow-up question, I think you meant "it is expected that the estimators are *not* fitted when VotingClassifier.fit() is called," right?! > I don't see how I can have already fitted classifiers combined under a VotingClassifier. The VotingClassifier in scikit-learn is based on the EnsembleVoteClassifier I had implemented in mlxtend (http://rasbt.github.io/mlxtend/user_guide/classifier/EnsembleVoteClassifier/#api). While I generally recommend using the VotingClassifier in scikit-learn, the code base of EnsembleVoteClassifier should be quite similar, and I have added a `refit` param which can be set to True or False. If refit=True, it's the same behavior as in sklearn. If refit=False, however, it will not refit the estimators and will allow you to use pre-fit classifiers, which is what you are asking for, I think? 
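To make that concrete, here is a minimal sketch of what I mean (assuming a recent mlxtend and its `clfs`/`voting`/`refit` arguments; the two base classifiers below just stand in for models you fitted elsewhere, e.g. on different hosts):

```
from mlxtend.classifier import EnsembleVoteClassifier
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(random_state=0)

# pretend these were fitted earlier, each on its own subset of the data
clf1 = LogisticRegression().fit(X[:50], y[:50])
clf2 = DecisionTreeClassifier().fit(X[50:], y[50:])

# refit=False keeps the pre-fitted estimators as they are; the fit() call
# below only sets up the class labels and does not re-train clf1/clf2
eclf = EnsembleVoteClassifier(clfs=[clf1, clf2], voting='soft', refit=False)
eclf.fit(X, y)
print(eclf.predict(X[:5]))
```

The important bit is that predict()/predict_proba() then just combine the outputs of the estimators you already fitted.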
@scikit-learn devs: Not sure if such a parameter should be added to scikit-learn's VotingClassifier as it may cause some weird behavior in GridSearch etc? Otherwise, I am happy to add an issue or submit a PR to discuss/work on this further :) Best, Sebastian > On Oct 1, 2017, at 6:53 PM, Rares Vernica wrote: > > Hello, > > I have a distributed setup where subsets of the data is available at different hosts. I plan to have each host fit a model with the subset of the data it owns. Once these individual models are fitted, how can I go about and combine them under one model. > > I don't have a preference on a specific algorithm, but I am looking into a classification problem. > > I am looking at VotingClassifier but it seems that it is expected that the estimators are fitted when VotingClassifier.fit() is called. I don't see how I can have already fitted classifiers combined under a VotingClassifier. > > Thanks! > Rares > _______________________________________________ > scikit-learn mailing list > scikit-learn at python.org > https://mail.python.org/mailman/listinfo/scikit-learn From rvernica at gmail.com Sun Oct 1 19:22:55 2017 From: rvernica at gmail.com (Rares Vernica) Date: Sun, 1 Oct 2017 16:22:55 -0700 Subject: [scikit-learn] Combine already fitted models In-Reply-To: <1B8AF2A6-DC73-42BE-8C59-5335EA330135@gmail.com> References: <1B8AF2A6-DC73-42BE-8C59-5335EA330135@gmail.com> Message-ID: > > I am looking at VotingClassifier but it seems that it is expected that the estimators are fitted when VotingClassifier.fit() is called. I don't see how I can have already fitted classifiers combined under a VotingClassifier. > > I think the opposite is true: The classifiers provided via an `estimators` argument upon initialization will be cloned and fitted if you call VotingClassifier's fit(). Based on your follow-up question, I think you meant "it is expected that the estimators are *not* fitted when VotingClassifier.fit() is called," right?! Yes, you are right. Sorry for the confusion. Thanks for the pointer! I am also exploring something like: vc = VotingClassifier(...) vc.estimators_ = [e1, e2, ...] vc.le_ = ... vc.predict(...) But I am not sure it is recommended to modify the "private" estimators_ and le_ attributes. -- Rares -------------- next part -------------- An HTML attachment was scrubbed... URL: From se.raschka at gmail.com Sun Oct 1 19:39:29 2017 From: se.raschka at gmail.com (Sebastian Raschka) Date: Sun, 1 Oct 2017 19:39:29 -0400 Subject: [scikit-learn] Combine already fitted models In-Reply-To: References: <1B8AF2A6-DC73-42BE-8C59-5335EA330135@gmail.com> Message-ID: <6F759273-4160-4073-8C3C-C1509D53BF23@gmail.com> Hi, Rares, > vc = VotingClassifier(...) > vc.estimators_ = [e1, e2, ...] > vc.le_ = ... > vc.predict(...) > > But I am not sure it is recommended to modify the "private" estimators_ and le_ attributes. I think that this may work if you don't call the fit method of the VotingClassifier after that due to https://github.com/scikit-learn/scikit-learn/blob/ef5cb84a/sklearn/ensemble/voting_classifier.py#L186 Also, I see that we have only added one check in predict(), "check_is_fitted(self, 'estimators_')", for checking that the VotingClassifier was fit, so your proposed method could/should work as a workaround ;) Best, Sebastian > On Oct 1, 2017, at 7:22 PM, Rares Vernica wrote: > > > > I am looking at VotingClassifier but it seems that it is expected that the estimators are fitted when VotingClassifier.fit() is called. 
I don't see how I can have already fitted classifiers combined under a VotingClassifier. > > > > I think the opposite is true: The classifiers provided via an `estimators` argument upon initialization will be cloned and fitted if you call VotingClassifier's fit(). Based on your follow-up question, I think you meant "it is expected that the estimators are *not* fitted when VotingClassifier.fit() is called," right?! > > Yes, you are right. Sorry for the confusion. Thanks for the pointer! > > I am also exploring something like: > > vc = VotingClassifier(...) > vc.estimators_ = [e1, e2, ...] > vc.le_ = ... > vc.predict(...) > > But I am not sure it is recommended to modify the "private" estimators_ and le_ attributes. > > -- > Rares > > > _______________________________________________ > scikit-learn mailing list > scikit-learn at python.org > https://mail.python.org/mailman/listinfo/scikit-learn From rth.yurchak at gmail.com Mon Oct 2 03:28:37 2017 From: rth.yurchak at gmail.com (Roman Yurchak) Date: Mon, 2 Oct 2017 09:28:37 +0200 Subject: [scikit-learn] TF-IDF In-Reply-To: References: Message-ID: Hi Apurva, if you consider the operations done by the augmented frequency and the cosine normalization independently from everything else, they are somewhat similar. The normalization by max in a p-norm with p?+? . So apart from the 0.5 offset, both are can be seen document length normalization with a different p value. However, in TF-IDF you you would typically have an IDF document weighting operation between the term frequency weighting and the normalization, in which case the effect of both will be quite different. Generally I find that the SMART IR notation is very useful to represent different phases of the TF-IDF transformation. The default parameters of TfidfTransformer is a good choice that will work well in most cases. Also, depending on the algorithm that you use afterwards, not having your data normalized by a an actual norm (e.g. cosine) may be sub-optimal. Still, if you want to fine tune your document normalization have a look at the "Pivoted Document Length Normalization" paper by Singhal et al. There is a compatible implementation of this and a few other TF-IDF schemes in http://freediscovery.io/doc/stable/python/generated/freediscovery.feature_weighting.SmartTfidfTransformer.html In the end, it's probably easier to try different options on your dataset to see what works and what doesn't. You could just determine it by cross-validating.. -- Roman On 27/09/17 13:53, Apurva Nandan wrote: > Hello, > > Could anybody tell me the difference between using augmented frequency > (which is used for weighting term frequencies to eliminate the bias > towards larger documents) and cosine normalization (l2 norm which > scikit-learn uses for TfidfTransformer). > Augmented frequency is given by the following equation. It tries to > divide the natural term frequency by the maximum frequency of any term > in the document. > > Inline image 1 > > Do they both do the same thing when it comes to eliminating bias towards > larger documents? I suppose scikit-learn uses the natural term freq, and > using cosine normalization is enabled with using norm=l2 > > Any help would be appreciated! 
> > - Apurva > > > _______________________________________________ > scikit-learn mailing list > scikit-learn at python.org > https://mail.python.org/mailman/listinfo/scikit-learn > From rth.yurchak at gmail.com Mon Oct 2 05:14:52 2017 From: rth.yurchak at gmail.com (Roman Yurchak) Date: Mon, 2 Oct 2017 11:14:52 +0200 Subject: [scikit-learn] Accessing Clustering Feature Tree in Birch In-Reply-To: References: Message-ID: <64e30942-d377-5f78-4334-8364b1b32bad@gmail.com> Hello, sklearn.cluster.Birch follows the original BIRCH paper, that appears to be mostly focused on efficiently building the hierarchical clustering tree (and not so much on making the later analysis user friendly). The attributes exposed by Birch are those that could be reasonably exposed given the scikit-learn API constraints. Though, one does have access to the full cluster hierarchy via the Birch.root_. As Joel said, traversing the tree is a standard CS problem, and there is also probably a number of operations that could be done with it, depending on the application. For instance, for my use case, I found that re-constructing the Birch hierarchy using a custom container class for each subcluster was the easiest to run subsequent analysis with. A detailed example can be found here, http://freediscovery.io/doc/stable/python/examples/birch_cluster_hierarchy.html Alternatively, I wonder if converting the tree to a format readable by some tree/graph specialized library (e.g. networkx) could be useful for analysis. Generally there is a number of places in scikit-learn where trees are used (Birch, AgglomerativeClustering, tree bases classifiers, etc) but for now there is no way to export the constructed tree to some standard format (apart for sklearn.tree.export_graphviz). Not sure if this is realistically achievable though.. -- Roman On 20/09/17 13:40, Sema Atasever wrote: > I need this information to use it in a scientific study and > I think that a function interface would make this easier. > > Thank you for your answer. > > On Sat, Sep 16, 2017 at 1:53 PM, Joel Nothman > wrote: > > There is no such thing as "the data samples in this cluster". The > point of Birch being online is that it loses any reference to the > individual samples that contributed to each node, but stores some > statistics on their basis. Roman Yurchak has, however, offered a PR > where, for the non-online case, storage of the indices contributing > to each node can be optionally turned on: > https://github.com/scikit-learn/scikit-learn/pull/8808 > > > As for finding what is contained under any particular node, > traversing the tree is a fairly basic task from a computer science > perspective. Before we were to support something to make this much > easier, I think we'd need to be clear on what kinds of use case we > were supporting. What do you hope to do with this information, and > what would a function interface look like that would make this much > easier? > > Decimals aren't a practical option as the branching factor may be > greater than 10, it is a hard structure to inspect, and susceptible > to computational imprecision. Better off with a list of tuples, but > what for that is not easy enough to do now? 
> > > > _______________________________________________ > scikit-learn mailing list > scikit-learn at python.org > https://mail.python.org/mailman/listinfo/scikit-learn > > > > > > _______________________________________________ > scikit-learn mailing list > scikit-learn at python.org > https://mail.python.org/mailman/listinfo/scikit-learn > From s.atasever at gmail.com Tue Oct 3 09:11:31 2017 From: s.atasever at gmail.com (Sema Atasever) Date: Tue, 3 Oct 2017 16:11:31 +0300 Subject: [scikit-learn] Accessing Clustering Feature Tree in Birch In-Reply-To: <64e30942-d377-5f78-4334-8364b1b32bad@gmail.com> References: <64e30942-d377-5f78-4334-8364b1b32bad@gmail.com> Message-ID: Hi Roman, Thank you for the detailed and informative answer. On Mon, Oct 2, 2017 at 12:14 PM, Roman Yurchak wrote: > Hello, > > sklearn.cluster.Birch follows the original BIRCH paper, that appears to be > mostly focused on efficiently building the hierarchical clustering tree > (and not so much on making the later analysis user friendly). The > attributes exposed by Birch are those that could be reasonably exposed > given the scikit-learn API constraints. Though, one does have access to the > full cluster hierarchy via the Birch.root_. > > As Joel said, traversing the tree is a standard CS problem, and there is > also probably a number of operations that could be done with it, depending > on the application. For instance, for my use case, I found that > re-constructing the Birch hierarchy using a custom container class for each > subcluster was the easiest to run subsequent analysis with. A detailed > example can be found here, > http://freediscovery.io/doc/stable/python/examples/birch_clu > ster_hierarchy.html > Alternatively, I wonder if converting the tree to a format readable by > some tree/graph specialized library (e.g. networkx) could be useful for > analysis. > > Generally there is a number of places in scikit-learn where trees are used > (Birch, AgglomerativeClustering, tree bases classifiers, etc) but for now > there is no way to export the constructed tree to some standard format > (apart for sklearn.tree.export_graphviz). Not sure if this is realistically > achievable though.. > > -- > Roman > > On 20/09/17 13:40, Sema Atasever wrote: > >> I need this information to use it in a scientific study and >> I think that a function interface would make this easier. >> >> Thank you for your answer. >> >> On Sat, Sep 16, 2017 at 1:53 PM, Joel Nothman > > wrote: >> >> There is no such thing as "the data samples in this cluster". The >> point of Birch being online is that it loses any reference to the >> individual samples that contributed to each node, but stores some >> statistics on their basis. Roman Yurchak has, however, offered a PR >> where, for the non-online case, storage of the indices contributing >> to each node can be optionally turned on: >> https://github.com/scikit-learn/scikit-learn/pull/8808 >> >> >> As for finding what is contained under any particular node, >> traversing the tree is a fairly basic task from a computer science >> perspective. Before we were to support something to make this much >> easier, I think we'd need to be clear on what kinds of use case we >> were supporting. What do you hope to do with this information, and >> what would a function interface look like that would make this much >> easier? >> >> Decimals aren't a practical option as the branching factor may be >> greater than 10, it is a hard structure to inspect, and susceptible >> to computational imprecision. 
Better off with a list of tuples, but >> what for that is not easy enough to do now? >> >> >> >> _______________________________________________ >> scikit-learn mailing list >> scikit-learn at python.org >> https://mail.python.org/mailman/listinfo/scikit-learn >> >> >> >> >> >> _______________________________________________ >> scikit-learn mailing list >> scikit-learn at python.org >> https://mail.python.org/mailman/listinfo/scikit-learn >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From stuart at stuartreynolds.net Tue Oct 3 13:48:16 2017 From: stuart at stuartreynolds.net (Stuart Reynolds) Date: Tue, 3 Oct 2017 10:48:16 -0700 Subject: [scikit-learn] Confidence interval estimation for probability estimators Message-ID: Let's say I have a base estimator that predicts the likelihood of an binary (Bernoulli) outcome: model.fit(X, y) where y contains [0 or 1] P = model.predict(X)/predict_proba(X) give values in the range [0 to 1] (model here might be a calibrated LogisticRegression model). Is there a way to estimate confidences for the rows in P? Is seems like this can be done with Gaussian Process Regression for regression tasks: https://stats.stackexchange.com/questions/169995/why-does-my-train-data-not-fall-in-confidence-interval-with-scikit-learn-gaussia For regression task I this this method could be used to wrap other models and estimate the confidence. For example, it looks like we can do: gp = GaussianProcessorRegressor(..) gp.fit(model.predict(X), y) ypred, sigma = gp.predict(model.predict(X)) to give us an estimate of the confidence in the output of model, *for regression*. I'd like the same, for probability estimates. However, i don't think the above works directly: - my outcomes is constrained between 0..1 (the GP Regressor is not) - using normal approximation to obtain confidence intervals for Bernoulli processes can leads to some pretty awful estimates, particularly for probabilities close to 0 or 1. - the above example gives a single sigma value. For constrained outputs, the CI is not symmetric (this bound closer to 0.5 should be further from the probability prediction than the bound closes to 0 or 1. I was hoping that GaussianProcessClassifier might be able to generate intervals, but I don't see how. My current approach is: - for some prediction p, - pick y_p from y, the rows who have predictions close to p: - for this sample, estimate the CI with statsmodels.stats.proportion.proportion_confint( sum(y_p), len(y_p), alpha=1-ciwidth, method="wilson" # or "jeffrey" -- normal, beta are broken for p close to 0 or 1 Which works OK, but is quite slow and not very data efficient. Any thoughts? Thanks, - Stuart From t3kcit at gmail.com Tue Oct 3 15:11:42 2017 From: t3kcit at gmail.com (Andreas Mueller) Date: Tue, 3 Oct 2017 15:11:42 -0400 Subject: [scikit-learn] Commercial use of ML algorithms and scikit-learn In-Reply-To: <90DD3D37-E062-4E6D-942E-A31B1F263382@gmail.com> References: <90DD3D37-E062-4E6D-942E-A31B1F263382@gmail.com> Message-ID: <19934f8f-4cbe-8008-0cbb-f7a18a447e2c@gmail.com> Licensing and patents are orthogonal. They are pretty much unrelated. In terms of the license, you can do with the code whatever you like. If any of the algorithms were (are?) patented, independent of the implementation, you would have to pay a license fee to use it - no matter if you use a commercial reference implementation, the scikit-learn implementation or any other. I'm not aware of any of the algorithms in scikit-learn being protected by patents. 
If you are aware of any, please let us know. There is a trademark to Random Forests: https://www.stat.berkeley.edu/~breiman/RandomForests/ I don't think this is enforced, but a trademark again is different from licensing or patents. If it was enforced, this would mean you can't use the *phrase* Random Forest. The algorithm itself is not protected afaik. Also: IANAL, and this is also only related to US law. (US law has 4 kinds of intellectual property protection to my understanding: copyright (licensing), patents, trademarks and industrial designs. Since there are no physical items at play here, luckily we "only" have to deal with the first three.) Andy From markus.konrad at wzb.eu Wed Oct 4 07:35:31 2017 From: markus.konrad at wzb.eu (Markus Konrad) Date: Wed, 4 Oct 2017 13:35:31 +0200 Subject: [scikit-learn] Using perplexity from LatentDirichletAllocation for cross validation of Topic Models Message-ID: <56caf0d4-11eb-bfb5-a01d-af332fb5969a@wzb.eu> Hi there, I'm trying to find the optimal number of topics for Topic Modeling with Latent Dirichlet Allocation. I implemented a 5-fold cross validation method similar to the one described and implemented in R here [1]. I basically split the full data into 5 equal sized chunks. Then for each fold (`cur_fold`), 4 of 5 chunks are used for training and 1 for validation using the `perplexity()` method on the held-out data set: ``` dtm_train = data[split_folds != cur_fold, :] dtm_valid = data[split_folds == cur_fold, :] lda_instance = LatentDirichletAllocation(**params) lda_instance.fit(dtm_train) perpl = lda_instance.perplexity(dtm_valid) ``` This is done for a set of parameters, basically for a varying number of topics (n_components). I tried this out with a number of different data sets, for example with the "Associated Press" data mentioned in [1], which is the sample data for David M. Blei's LDA C implementation [2]. Using the same data, I would expect that I get similar results as in [1], which found that a model with ~100 topics fits the AP data best. However, my experiments always show that the perplexity is exponentially growing with the number of topics. The "best" model is always the one with the lowest number of topics. The same happens with other data sets, too. Similar results happen when calculating the perplexity on the full training data alone (so no cross validation on held-out data). Does anyone have an idea why these results are not consistent with those from [1]? Is the perplexity() method not the correct method to use when evaluating held-out data? Could it be a problem, that some of the columns of the training data term frequency matrix are all-zero? Best, Markus [1] http://ellisp.github.io/blog/2017/01/05/topic-model-cv [2] https://web.archive.org/web/20160930175144/http://www.cs.princeton.edu/~blei/lda-c/index.html From stuart at stuartreynolds.net Wed Oct 4 15:58:42 2017 From: stuart at stuartreynolds.net (Stuart Reynolds) Date: Wed, 4 Oct 2017 12:58:42 -0700 Subject: [scikit-learn] Can fit a model with a target array of probabilities? Message-ID: I'd like to fit a model that maps a matrix of continuous inputs to a target that's between 0 and 1 (a probability). In principle, I'd expect logistic regression should work out of the box with no modification (although its often posed as being strictly for classification, its loss function allows for fitting targets in the range 0 to 1, and not strictly zero or one.) However, scikit's LogisticRegression and LogisticRegressionCV reject target arrays that are continuous. 
Other LR implementations allow a matrix of probability estimates. Looking at: http://scikit-learn-general.narkive.com/4dSCktaM/using-logistic-regression-on-a-continuous-target-variable and the fix here: https://github.com/scikit-learn/scikit-learn/pull/5084, which disables continuous inputs, it looks like there was some reason for this. So ... I'm looking for alternatives. SGDClassifier allows log loss and (if I understood the docs correctly) adds a logistic link function, but also rejects continuous targets. Oddly, SGDRegressor only allows ?squared_loss?, ?huber?, ?epsilon_insensitive?, or ?squared_epsilon_insensitive?, and doesn't seems to give a logistic function. In principle, GLM allow this, but scikit's docs say the GLM models only allows strict linear functions of their input, and doesn't allow a logistic link function. The docs direct people to the LogisticRegression class for this case. In R, there is: glm(Total_Service_Points_Won/Total_Service_Points_Played ~ ... , family = binomial(link=logit), weights = Total_Service_Points_Played) which would be ideal. Is something similar available in scikit? (Or any continuous model that takes and 0 to 1 target and outputs a 0 to 1 target?) I was surprised to see that the implementation of CalibratedClassifierCV(method="sigmoid") uses an internal implementation of logistic regression to do its logistic regressing -- which I can use, although I'd prefer to use a user-facing library. Thanks, - Stuart From t3kcit at gmail.com Wed Oct 4 16:09:56 2017 From: t3kcit at gmail.com (Andreas Mueller) Date: Wed, 4 Oct 2017 16:09:56 -0400 Subject: [scikit-learn] Can fit a model with a target array of probabilities? In-Reply-To: References: Message-ID: Hi Stuart. There is no interface to do this in scikit-learn (and maybe we should at this to the FAQ). Yes, in principle this would be possible with several of the models. I think statsmodels can do that, and I think I saw another glm package for Python that does that? It's certainly a legitimate use-case but would require substantial changes to the code. I think so far we decided not to support this in scikit-learn. Basically we don't have a concept of a link function, and it's a concept that only applies to a subset of models. We try to have a consistent interface for all our estimators, and this doesn't really fit well within that interface. Hth, Andy On 10/04/2017 03:58 PM, Stuart Reynolds wrote: > I'd like to fit a model that maps a matrix of continuous inputs to a > target that's between 0 and 1 (a probability). > > In principle, I'd expect logistic regression should work out of the > box with no modification (although its often posed as being strictly > for classification, its loss function allows for fitting targets in > the range 0 to 1, and not strictly zero or one.) > > However, scikit's LogisticRegression and LogisticRegressionCV reject > target arrays that are continuous. Other LR implementations allow a > matrix of probability estimates. Looking at: > http://scikit-learn-general.narkive.com/4dSCktaM/using-logistic-regression-on-a-continuous-target-variable > and the fix here: > https://github.com/scikit-learn/scikit-learn/pull/5084, which disables > continuous inputs, it looks like there was some reason for this. So > ... I'm looking for alternatives. > > SGDClassifier allows log loss and (if I understood the docs correctly) > adds a logistic link function, but also rejects continuous targets. 
> Oddly, SGDRegressor only allows ?squared_loss?, ?huber?, > ?epsilon_insensitive?, or ?squared_epsilon_insensitive?, and doesn't > seems to give a logistic function. > > In principle, GLM allow this, but scikit's docs say the GLM models > only allows strict linear functions of their input, and doesn't allow > a logistic link function. The docs direct people to the > LogisticRegression class for this case. > > In R, there is: > > glm(Total_Service_Points_Won/Total_Service_Points_Played ~ ... , > family = binomial(link=logit), weights = Total_Service_Points_Played) > which would be ideal. > > Is something similar available in scikit? (Or any continuous model > that takes and 0 to 1 target and outputs a 0 to 1 target?) > > I was surprised to see that the implementation of > CalibratedClassifierCV(method="sigmoid") uses an internal > implementation of logistic regression to do its logistic regressing -- > which I can use, although I'd prefer to use a user-facing library. > > Thanks, > - Stuart > _______________________________________________ > scikit-learn mailing list > scikit-learn at python.org > https://mail.python.org/mailman/listinfo/scikit-learn From stuart at stuartreynolds.net Wed Oct 4 16:26:58 2017 From: stuart at stuartreynolds.net (Stuart Reynolds) Date: Wed, 4 Oct 2017 13:26:58 -0700 Subject: [scikit-learn] Can fit a model with a target array of probabilities? In-Reply-To: References: Message-ID: Hi Andy, Thanks -- I'll give another statsmodels another go. I remember I had some fitting speed issues with it in the past, and also some issues related their models keeping references to the data (=disaster for serialization and multiprocessing) -- although that was a long time ago. - Stuart On Wed, Oct 4, 2017 at 1:09 PM, Andreas Mueller wrote: > Hi Stuart. > There is no interface to do this in scikit-learn (and maybe we should at > this to the FAQ). > Yes, in principle this would be possible with several of the models. > > I think statsmodels can do that, and I think I saw another glm package > for Python that does that? > > It's certainly a legitimate use-case but would require substantial > changes to the code. I think so far we decided not to support > this in scikit-learn. Basically we don't have a concept of a link > function, and it's a concept that only applies to a subset of models. > We try to have a consistent interface for all our estimators, and > this doesn't really fit well within that interface. > > Hth, > Andy > > > On 10/04/2017 03:58 PM, Stuart Reynolds wrote: >> >> I'd like to fit a model that maps a matrix of continuous inputs to a >> target that's between 0 and 1 (a probability). >> >> In principle, I'd expect logistic regression should work out of the >> box with no modification (although its often posed as being strictly >> for classification, its loss function allows for fitting targets in >> the range 0 to 1, and not strictly zero or one.) >> >> However, scikit's LogisticRegression and LogisticRegressionCV reject >> target arrays that are continuous. Other LR implementations allow a >> matrix of probability estimates. Looking at: >> >> http://scikit-learn-general.narkive.com/4dSCktaM/using-logistic-regression-on-a-continuous-target-variable >> and the fix here: >> https://github.com/scikit-learn/scikit-learn/pull/5084, which disables >> continuous inputs, it looks like there was some reason for this. So >> ... I'm looking for alternatives. 
>> >> SGDClassifier allows log loss and (if I understood the docs correctly) >> adds a logistic link function, but also rejects continuous targets. >> Oddly, SGDRegressor only allows ?squared_loss?, ?huber?, >> ?epsilon_insensitive?, or ?squared_epsilon_insensitive?, and doesn't >> seems to give a logistic function. >> >> In principle, GLM allow this, but scikit's docs say the GLM models >> only allows strict linear functions of their input, and doesn't allow >> a logistic link function. The docs direct people to the >> LogisticRegression class for this case. >> >> In R, there is: >> >> glm(Total_Service_Points_Won/Total_Service_Points_Played ~ ... , >> family = binomial(link=logit), weights = Total_Service_Points_Played) >> which would be ideal. >> >> Is something similar available in scikit? (Or any continuous model >> that takes and 0 to 1 target and outputs a 0 to 1 target?) >> >> I was surprised to see that the implementation of >> CalibratedClassifierCV(method="sigmoid") uses an internal >> implementation of logistic regression to do its logistic regressing -- >> which I can use, although I'd prefer to use a user-facing library. >> >> Thanks, >> - Stuart >> _______________________________________________ >> scikit-learn mailing list >> scikit-learn at python.org >> https://mail.python.org/mailman/listinfo/scikit-learn > > > _______________________________________________ > scikit-learn mailing list > scikit-learn at python.org > https://mail.python.org/mailman/listinfo/scikit-learn From josef.pktd at gmail.com Wed Oct 4 18:43:23 2017 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Wed, 4 Oct 2017 18:43:23 -0400 Subject: [scikit-learn] Can fit a model with a target array of probabilities? In-Reply-To: References: Message-ID: On Wed, Oct 4, 2017 at 4:26 PM, Stuart Reynolds wrote: > Hi Andy, > Thanks -- I'll give another statsmodels another go. > I remember I had some fitting speed issues with it in the past, and > also some issues related their models keeping references to the data > (=disaster for serialization and multiprocessing) -- although that was > a long time ago. > The second has not changed and will not change, but there is a remove_data method that deletes all references to full, data sized arrays. However, once the data is removed, it is not possible anymore to compute any new results statistics which are almost all lazily computed. The fitting speed depends a lot on the optimizer, convergence criteria and difficulty of the problem, and availability of good starting parameters. Almost all nonlinear estimation problems use the scipy optimizers, all unconstrained optimizers can be used. There are no optimized special methods for cases with a very large number of features. Multinomial/multiclass models don't support continuous response (yet), all other GLM and discrete models allow for continuous data in the interval extension of the domain. Josef > - Stuart > > On Wed, Oct 4, 2017 at 1:09 PM, Andreas Mueller wrote: > > Hi Stuart. > > There is no interface to do this in scikit-learn (and maybe we should at > > this to the FAQ). > > Yes, in principle this would be possible with several of the models. > > > > I think statsmodels can do that, and I think I saw another glm package > > for Python that does that? > > > > It's certainly a legitimate use-case but would require substantial > > changes to the code. I think so far we decided not to support > > this in scikit-learn. 
Basically we don't have a concept of a link > > function, and it's a concept that only applies to a subset of models. > > We try to have a consistent interface for all our estimators, and > > this doesn't really fit well within that interface. > > > > Hth, > > Andy > > > > > > On 10/04/2017 03:58 PM, Stuart Reynolds wrote: > >> > >> I'd like to fit a model that maps a matrix of continuous inputs to a > >> target that's between 0 and 1 (a probability). > >> > >> In principle, I'd expect logistic regression should work out of the > >> box with no modification (although its often posed as being strictly > >> for classification, its loss function allows for fitting targets in > >> the range 0 to 1, and not strictly zero or one.) > >> > >> However, scikit's LogisticRegression and LogisticRegressionCV reject > >> target arrays that are continuous. Other LR implementations allow a > >> matrix of probability estimates. Looking at: > >> > >> http://scikit-learn-general.narkive.com/4dSCktaM/using- > logistic-regression-on-a-continuous-target-variable > >> and the fix here: > >> https://github.com/scikit-learn/scikit-learn/pull/5084, which disables > >> continuous inputs, it looks like there was some reason for this. So > >> ... I'm looking for alternatives. > >> > >> SGDClassifier allows log loss and (if I understood the docs correctly) > >> adds a logistic link function, but also rejects continuous targets. > >> Oddly, SGDRegressor only allows ?squared_loss?, ?huber?, > >> ?epsilon_insensitive?, or ?squared_epsilon_insensitive?, and doesn't > >> seems to give a logistic function. > >> > >> In principle, GLM allow this, but scikit's docs say the GLM models > >> only allows strict linear functions of their input, and doesn't allow > >> a logistic link function. The docs direct people to the > >> LogisticRegression class for this case. > >> > >> In R, there is: > >> > >> glm(Total_Service_Points_Won/Total_Service_Points_Played ~ ... , > >> family = binomial(link=logit), weights = > Total_Service_Points_Played) > >> which would be ideal. > >> > >> Is something similar available in scikit? (Or any continuous model > >> that takes and 0 to 1 target and outputs a 0 to 1 target?) > >> > >> I was surprised to see that the implementation of > >> CalibratedClassifierCV(method="sigmoid") uses an internal > >> implementation of logistic regression to do its logistic regressing -- > >> which I can use, although I'd prefer to use a user-facing library. > >> > >> Thanks, > >> - Stuart > >> _______________________________________________ > >> scikit-learn mailing list > >> scikit-learn at python.org > >> https://mail.python.org/mailman/listinfo/scikit-learn > > > > > > _______________________________________________ > > scikit-learn mailing list > > scikit-learn at python.org > > https://mail.python.org/mailman/listinfo/scikit-learn > _______________________________________________ > scikit-learn mailing list > scikit-learn at python.org > https://mail.python.org/mailman/listinfo/scikit-learn > -------------- next part -------------- An HTML attachment was scrubbed... URL: From sean.violante at gmail.com Thu Oct 5 04:24:38 2017 From: sean.violante at gmail.com (Sean Violante) Date: Thu, 5 Oct 2017 10:24:38 +0200 Subject: [scikit-learn] Can fit a model with a target array of probabilities? In-Reply-To: References: Message-ID: Hi Stuart the underlying logistic regression code in scikit learn (at least for the non liblinear implementation) allows sample weights which would allow you to do what you want. 
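In code, the idea would look roughly like this (just a sketch; `won`/`played` stand in for whatever count columns you actually have, e.g. Total_Service_Points_Won / Total_Service_Points_Played from your R call):

```
import numpy as np
from sklearn.linear_model import LogisticRegression

def fit_fractional_logit(X, won, played):
    # won, played: 1-D arrays of counts per row, so won / played is the
    # observed probability for that row.
    # Each original row appears twice: once as a success, once as a failure,
    # weighted by how often each outcome was observed.
    X2 = np.vstack([X, X])
    y2 = np.r_[np.ones(len(X)), np.zeros(len(X))]
    w2 = np.r_[won, played - won]
    clf = LogisticRegression(solver='lbfgs')  # a non-liblinear solver
    clf.fit(X2, y2, sample_weight=w2)
    return clf
```
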
[pass in sample weight Total_Service_Points_Won and target 1 and ( Total_Service_Points_Played-Total_Service_Points_Won) and target 0] ie for each 'instance' you pass in two rows. Unfortunately it has never been fully implemented see https://github.com/scikit-learn/scikit-learn/pull/2784#issuecomment-84734590 Unfortunately, it has never been fully exposed - I have given it a go and I ran into problems because the code is shared with the linear SVC model as I recall. ie logistic regression would work, but some of the test cases would fail with linear svc [note that there is also a version of the original liblinear code that supports sample weights] [I would point out having a single row rather than 2 is easier - eg crossvalidation is a pain] if you really want to give a continuous target then you probably want beta regression - an example would be predicting concentrations, then the sample weights are giving you the # times you observed that concentration [and you could replace concentration with probability too, eg if you literally had an 'oracle' that gave you the true probability of an instance] sean On Wed, Oct 4, 2017 at 10:26 PM, Stuart Reynolds wrote: > Hi Andy, > Thanks -- I'll give another statsmodels another go. > I remember I had some fitting speed issues with it in the past, and > also some issues related their models keeping references to the data > (=disaster for serialization and multiprocessing) -- although that was > a long time ago. > - Stuart > > On Wed, Oct 4, 2017 at 1:09 PM, Andreas Mueller wrote: > > Hi Stuart. > > There is no interface to do this in scikit-learn (and maybe we should at > > this to the FAQ). > > Yes, in principle this would be possible with several of the models. > > > > I think statsmodels can do that, and I think I saw another glm package > > for Python that does that? > > > > It's certainly a legitimate use-case but would require substantial > > changes to the code. I think so far we decided not to support > > this in scikit-learn. Basically we don't have a concept of a link > > function, and it's a concept that only applies to a subset of models. > > We try to have a consistent interface for all our estimators, and > > this doesn't really fit well within that interface. > > > > Hth, > > Andy > > > > > > On 10/04/2017 03:58 PM, Stuart Reynolds wrote: > >> > >> I'd like to fit a model that maps a matrix of continuous inputs to a > >> target that's between 0 and 1 (a probability). > >> > >> In principle, I'd expect logistic regression should work out of the > >> box with no modification (although its often posed as being strictly > >> for classification, its loss function allows for fitting targets in > >> the range 0 to 1, and not strictly zero or one.) > >> > >> However, scikit's LogisticRegression and LogisticRegressionCV reject > >> target arrays that are continuous. Other LR implementations allow a > >> matrix of probability estimates. Looking at: > >> > >> http://scikit-learn-general.narkive.com/4dSCktaM/using- > logistic-regression-on-a-continuous-target-variable > >> and the fix here: > >> https://github.com/scikit-learn/scikit-learn/pull/5084, which disables > >> continuous inputs, it looks like there was some reason for this. So > >> ... I'm looking for alternatives. > >> > >> SGDClassifier allows log loss and (if I understood the docs correctly) > >> adds a logistic link function, but also rejects continuous targets. 
> >> Oddly, SGDRegressor only allows ?squared_loss?, ?huber?, > >> ?epsilon_insensitive?, or ?squared_epsilon_insensitive?, and doesn't > >> seems to give a logistic function. > >> > >> In principle, GLM allow this, but scikit's docs say the GLM models > >> only allows strict linear functions of their input, and doesn't allow > >> a logistic link function. The docs direct people to the > >> LogisticRegression class for this case. > >> > >> In R, there is: > >> > >> glm(Total_Service_Points_Won/Total_Service_Points_Played ~ ... , > >> family = binomial(link=logit), weights = > Total_Service_Points_Played) > >> which would be ideal. > >> > >> Is something similar available in scikit? (Or any continuous model > >> that takes and 0 to 1 target and outputs a 0 to 1 target?) > >> > >> I was surprised to see that the implementation of > >> CalibratedClassifierCV(method="sigmoid") uses an internal > >> implementation of logistic regression to do its logistic regressing -- > >> which I can use, although I'd prefer to use a user-facing library. > >> > >> Thanks, > >> - Stuart > >> _______________________________________________ > >> scikit-learn mailing list > >> scikit-learn at python.org > >> https://mail.python.org/mailman/listinfo/scikit-learn > > > > > > _______________________________________________ > > scikit-learn mailing list > > scikit-learn at python.org > > https://mail.python.org/mailman/listinfo/scikit-learn > _______________________________________________ > scikit-learn mailing list > scikit-learn at python.org > https://mail.python.org/mailman/listinfo/scikit-learn > -------------- next part -------------- An HTML attachment was scrubbed... URL: From stuart at stuartreynolds.net Thu Oct 5 12:34:51 2017 From: stuart at stuartreynolds.net (Stuart Reynolds) Date: Thu, 5 Oct 2017 09:34:51 -0700 Subject: [scikit-learn] Can fit a model with a target array of probabilities? In-Reply-To: References: Message-ID: Thanks Josef. Was very useful. result.remove_data() reduces a 5 parameter Logit result object from megabytes to 5Kb (as compared to a minimum uncompressed size of the parameters of ~320 bytes). Is big improvement. I'll experiment with what you suggest -- since this is still >10x larger than possible. I think the difference is mostly attribute names. I don't mind the lack of a multinomial support. I've often had better results mixing independent models for each class. I'll experiment with the different solvers. I tried the Logit model in the past -- its fit function only exposed a maxiter, and not a tolerance -- meaning I had to set maxiter very high. The newer statsmodels GLM module looks great and seem to solve this. For other who come this way, I think the magic for ridge regression is: from statsmodels.genmod.generalized_linear_model import GLM from statsmodels.genmod.generalized_linear_model import families from statsmodels.genmod.generalized_linear_model.families import links model = GLM(y, Xtrain, family=families.Binomial(link=links.Logit)) result = model.fit_regularized(method='elastic_net', alpha=l2weight, L1_wt=0.0, tol=...) result.remove_data() result.predict(Xtest) One last thing -- its clear that it should be possible to do something like scikit's LogisticRegressionCV in order to quickly optimize a single parameter by re-using past coefficients. Are there any wrappers in statsmodels for doing this or should I roll my own? 
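The kind of loop I have in mind is below -- just a sketch, going through statsmodels.api rather than the deep module paths above, and it assumes fit_regularized() accepts start_params for a warm start the same way fit() does:

```
import numpy as np
import statsmodels.api as sm
from sklearn.datasets import make_classification
from sklearn.metrics import log_loss
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X = sm.add_constant(X)
X_train, X_valid, y_train, y_valid = train_test_split(X, y, random_state=0)

start, best = None, None
for a in np.logspace(2, -4, 10):          # strongest penalty first
    model = sm.GLM(y_train, X_train, family=sm.families.Binomial())
    res = model.fit_regularized(method='elastic_net', alpha=a, L1_wt=0.0,
                                start_params=start)
    start = res.params                    # reuse coefficients for the next alpha
    score = log_loss(y_valid, res.predict(X_valid))
    if best is None or score < best[0]:
        best = (score, a)

print(best)
```
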
- Stu On Wed, Oct 4, 2017 at 3:43 PM, wrote: > > > On Wed, Oct 4, 2017 at 4:26 PM, Stuart Reynolds > wrote: >> >> Hi Andy, >> Thanks -- I'll give another statsmodels another go. >> I remember I had some fitting speed issues with it in the past, and >> also some issues related their models keeping references to the data >> (=disaster for serialization and multiprocessing) -- although that was >> a long time ago. > > > The second has not changed and will not change, but there is a remove_data > method that deletes all references to full, data sized arrays. However, once > the data is removed, it is not possible anymore to compute any new results > statistics which are almost all lazily computed. > The fitting speed depends a lot on the optimizer, convergence criteria and > difficulty of the problem, and availability of good starting parameters. > Almost all nonlinear estimation problems use the scipy optimizers, all > unconstrained optimizers can be used. There are no optimized special methods > for cases with a very large number of features. > > Multinomial/multiclass models don't support continuous response (yet), all > other GLM and discrete models allow for continuous data in the interval > extension of the domain. > > Josef > > >> >> - Stuart >> >> On Wed, Oct 4, 2017 at 1:09 PM, Andreas Mueller wrote: >> > Hi Stuart. >> > There is no interface to do this in scikit-learn (and maybe we should at >> > this to the FAQ). >> > Yes, in principle this would be possible with several of the models. >> > >> > I think statsmodels can do that, and I think I saw another glm package >> > for Python that does that? >> > >> > It's certainly a legitimate use-case but would require substantial >> > changes to the code. I think so far we decided not to support >> > this in scikit-learn. Basically we don't have a concept of a link >> > function, and it's a concept that only applies to a subset of models. >> > We try to have a consistent interface for all our estimators, and >> > this doesn't really fit well within that interface. >> > >> > Hth, >> > Andy >> > >> > >> > On 10/04/2017 03:58 PM, Stuart Reynolds wrote: >> >> >> >> I'd like to fit a model that maps a matrix of continuous inputs to a >> >> target that's between 0 and 1 (a probability). >> >> >> >> In principle, I'd expect logistic regression should work out of the >> >> box with no modification (although its often posed as being strictly >> >> for classification, its loss function allows for fitting targets in >> >> the range 0 to 1, and not strictly zero or one.) >> >> >> >> However, scikit's LogisticRegression and LogisticRegressionCV reject >> >> target arrays that are continuous. Other LR implementations allow a >> >> matrix of probability estimates. Looking at: >> >> >> >> >> >> http://scikit-learn-general.narkive.com/4dSCktaM/using-logistic-regression-on-a-continuous-target-variable >> >> and the fix here: >> >> https://github.com/scikit-learn/scikit-learn/pull/5084, which disables >> >> continuous inputs, it looks like there was some reason for this. So >> >> ... I'm looking for alternatives. >> >> >> >> SGDClassifier allows log loss and (if I understood the docs correctly) >> >> adds a logistic link function, but also rejects continuous targets. >> >> Oddly, SGDRegressor only allows ?squared_loss?, ?huber?, >> >> ?epsilon_insensitive?, or ?squared_epsilon_insensitive?, and doesn't >> >> seems to give a logistic function. 
>> >> >> >> In principle, GLM allow this, but scikit's docs say the GLM models >> >> only allows strict linear functions of their input, and doesn't allow >> >> a logistic link function. The docs direct people to the >> >> LogisticRegression class for this case. >> >> >> >> In R, there is: >> >> >> >> glm(Total_Service_Points_Won/Total_Service_Points_Played ~ ... , >> >> family = binomial(link=logit), weights = >> >> Total_Service_Points_Played) >> >> which would be ideal. >> >> >> >> Is something similar available in scikit? (Or any continuous model >> >> that takes and 0 to 1 target and outputs a 0 to 1 target?) >> >> >> >> I was surprised to see that the implementation of >> >> CalibratedClassifierCV(method="sigmoid") uses an internal >> >> implementation of logistic regression to do its logistic regressing -- >> >> which I can use, although I'd prefer to use a user-facing library. >> >> >> >> Thanks, >> >> - Stuart >> >> _______________________________________________ >> >> scikit-learn mailing list >> >> scikit-learn at python.org >> >> https://mail.python.org/mailman/listinfo/scikit-learn >> > >> > >> > _______________________________________________ >> > scikit-learn mailing list >> > scikit-learn at python.org >> > https://mail.python.org/mailman/listinfo/scikit-learn >> _______________________________________________ >> scikit-learn mailing list >> scikit-learn at python.org >> https://mail.python.org/mailman/listinfo/scikit-learn > > > > _______________________________________________ > scikit-learn mailing list > scikit-learn at python.org > https://mail.python.org/mailman/listinfo/scikit-learn > From josef.pktd at gmail.com Thu Oct 5 12:57:25 2017 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Thu, 5 Oct 2017 12:57:25 -0400 Subject: [scikit-learn] Can fit a model with a target array of probabilities? In-Reply-To: References: Message-ID: On Thu, Oct 5, 2017 at 12:34 PM, Stuart Reynolds wrote: > Thanks Josef. Was very useful. > > result.remove_data() reduces a 5 parameter Logit result object from > megabytes to 5Kb (as compared to a minimum uncompressed size of the > parameters of ~320 bytes). Is big improvement. I'll experiment with > what you suggest -- since this is still >10x larger than possible. I > think the difference is mostly attribute names. > I don't mind the lack of a multinomial support. I've often had better > results mixing independent models for each class. > The only other possibly large array is the underlying cov_params, which could be larger if there are many explanatory variables/features. That one is not removed and there is no official way yet. > > I'll experiment with the different solvers. I tried the Logit model > in the past -- its fit function only exposed a maxiter, and not a > tolerance -- meaning I had to set maxiter very high. The newer > statsmodels GLM module looks great and seem to solve this. > > For other who come this way, I think the magic for ridge regression is: > > from statsmodels.genmod.generalized_linear_model import GLM > from statsmodels.genmod.generalized_linear_model import families > from statsmodels.genmod.generalized_linear_model.families import > links > > model = GLM(y, Xtrain, family=families.Binomial(link=links.Logit)) > result = model.fit_regularized(method='elastic_net', > alpha=l2weight, L1_wt=0.0, tol=...) 
> result.remove_data() > result.predict(Xtest) > > One last thing -- its clear that it should be possible to do something > like scikit's LogisticRegressionCV in order to quickly optimize a > single parameter by re-using past coefficients. > Are there any wrappers in statsmodels for doing this or should I roll my > own? > I'm not sure exactly what you mean. kind of, but not user facing, IIUC In general maximization is with respect to all parameters at once. Reusing past coefficients usually works with a warm start by providing the `start_params` for the optimization. (The L2 penalization for GLM using scipy optimizers or IRLS with simultaneous estimation of all parameters is not yet merged.) In GLM we can use `offset` to include a subset of variables with fixed parameters. This is currently used in our version of GLM elastic net for coordinate descent in `GLM.fit_regularized`. However, there is no helper function so users cannot use it directly, AFAIR. And because it goes through regular model creation it will be slower than an optimized algorithm that computes the steps directly. (flexibility and quick implementation at the cost of performance in the current version) Josef > > > - Stu > > > On Wed, Oct 4, 2017 at 3:43 PM, wrote: > > > > > > On Wed, Oct 4, 2017 at 4:26 PM, Stuart Reynolds < > stuart at stuartreynolds.net> > > wrote: > >> > >> Hi Andy, > >> Thanks -- I'll give another statsmodels another go. > >> I remember I had some fitting speed issues with it in the past, and > >> also some issues related their models keeping references to the data > >> (=disaster for serialization and multiprocessing) -- although that was > >> a long time ago. > > > > > > The second has not changed and will not change, but there is a > remove_data > > method that deletes all references to full, data sized arrays. However, > once > > the data is removed, it is not possible anymore to compute any new > results > > statistics which are almost all lazily computed. > > The fitting speed depends a lot on the optimizer, convergence criteria > and > > difficulty of the problem, and availability of good starting parameters. > > Almost all nonlinear estimation problems use the scipy optimizers, all > > unconstrained optimizers can be used. There are no optimized special > methods > > for cases with a very large number of features. > > > > Multinomial/multiclass models don't support continuous response (yet), > all > > other GLM and discrete models allow for continuous data in the interval > > extension of the domain. > > > > Josef > > > > > >> > >> - Stuart > >> > >> On Wed, Oct 4, 2017 at 1:09 PM, Andreas Mueller > wrote: > >> > Hi Stuart. > >> > There is no interface to do this in scikit-learn (and maybe we should > at > >> > this to the FAQ). > >> > Yes, in principle this would be possible with several of the models. > >> > > >> > I think statsmodels can do that, and I think I saw another glm package > >> > for Python that does that? > >> > > >> > It's certainly a legitimate use-case but would require substantial > >> > changes to the code. I think so far we decided not to support > >> > this in scikit-learn. Basically we don't have a concept of a link > >> > function, and it's a concept that only applies to a subset of models. > >> > We try to have a consistent interface for all our estimators, and > >> > this doesn't really fit well within that interface. 
> >> > > >> > Hth, > >> > Andy > >> > > >> > > >> > On 10/04/2017 03:58 PM, Stuart Reynolds wrote: > >> >> > >> >> I'd like to fit a model that maps a matrix of continuous inputs to a > >> >> target that's between 0 and 1 (a probability). > >> >> > >> >> In principle, I'd expect logistic regression should work out of the > >> >> box with no modification (although its often posed as being strictly > >> >> for classification, its loss function allows for fitting targets in > >> >> the range 0 to 1, and not strictly zero or one.) > >> >> > >> >> However, scikit's LogisticRegression and LogisticRegressionCV reject > >> >> target arrays that are continuous. Other LR implementations allow a > >> >> matrix of probability estimates. Looking at: > >> >> > >> >> > >> >> http://scikit-learn-general.narkive.com/4dSCktaM/using- > logistic-regression-on-a-continuous-target-variable > >> >> and the fix here: > >> >> https://github.com/scikit-learn/scikit-learn/pull/5084, which > disables > >> >> continuous inputs, it looks like there was some reason for this. So > >> >> ... I'm looking for alternatives. > >> >> > >> >> SGDClassifier allows log loss and (if I understood the docs > correctly) > >> >> adds a logistic link function, but also rejects continuous targets. > >> >> Oddly, SGDRegressor only allows ?squared_loss?, ?huber?, > >> >> ?epsilon_insensitive?, or ?squared_epsilon_insensitive?, and doesn't > >> >> seems to give a logistic function. > >> >> > >> >> In principle, GLM allow this, but scikit's docs say the GLM models > >> >> only allows strict linear functions of their input, and doesn't allow > >> >> a logistic link function. The docs direct people to the > >> >> LogisticRegression class for this case. > >> >> > >> >> In R, there is: > >> >> > >> >> glm(Total_Service_Points_Won/Total_Service_Points_Played ~ ... , > >> >> family = binomial(link=logit), weights = > >> >> Total_Service_Points_Played) > >> >> which would be ideal. > >> >> > >> >> Is something similar available in scikit? (Or any continuous model > >> >> that takes and 0 to 1 target and outputs a 0 to 1 target?) > >> >> > >> >> I was surprised to see that the implementation of > >> >> CalibratedClassifierCV(method="sigmoid") uses an internal > >> >> implementation of logistic regression to do its logistic regressing > -- > >> >> which I can use, although I'd prefer to use a user-facing library. > >> >> > >> >> Thanks, > >> >> - Stuart > >> >> _______________________________________________ > >> >> scikit-learn mailing list > >> >> scikit-learn at python.org > >> >> https://mail.python.org/mailman/listinfo/scikit-learn > >> > > >> > > >> > _______________________________________________ > >> > scikit-learn mailing list > >> > scikit-learn at python.org > >> > https://mail.python.org/mailman/listinfo/scikit-learn > >> _______________________________________________ > >> scikit-learn mailing list > >> scikit-learn at python.org > >> https://mail.python.org/mailman/listinfo/scikit-learn > > > > > > > > _______________________________________________ > > scikit-learn mailing list > > scikit-learn at python.org > > https://mail.python.org/mailman/listinfo/scikit-learn > > > _______________________________________________ > scikit-learn mailing list > scikit-learn at python.org > https://mail.python.org/mailman/listinfo/scikit-learn > -------------- next part -------------- An HTML attachment was scrubbed... 
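To make the warm-start and offset ideas above concrete, here is a minimal,
untested sketch against the statsmodels Logit/GLM interfaces (the toy data
and variable names are made up for illustration):

import numpy as np
import statsmodels.api as sm

rng = np.random.RandomState(0)
X = sm.add_constant(rng.randn(200, 3))
y = (rng.rand(200) < 0.5).astype(float)

# Warm start: pass a previous fit's coefficients as start_params.
res1 = sm.Logit(y, X).fit(disp=0)
res2 = sm.Logit(y, X).fit(start_params=res1.params, disp=0)

# Offset: hold the coefficients of some columns fixed while fitting the rest.
X_fixed, X_free = X[:, :2], X[:, 2:]
offset = X_fixed.dot(res1.params[:2])   # fixed linear contribution
res3 = sm.GLM(y, X_free, family=sm.families.Binomial(), offset=offset).fit()

When the new problem is close to the old one (for example, a slightly
different penalty weight), the warm-started fit usually needs far fewer
iterations.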
URL: From sean.violante at gmail.com Thu Oct 5 13:32:23 2017 From: sean.violante at gmail.com (Sean Violante) Date: Thu, 5 Oct 2017 19:32:23 +0200 Subject: [scikit-learn] Can fit a model with a target array of probabilities? In-Reply-To: References: Message-ID: Stuart have you tried glmnet ( in R) there is a python version https://web.stanford.edu/~hastie/glmnet_python/ .... On Thu, Oct 5, 2017 at 6:34 PM, Stuart Reynolds wrote: > Thanks Josef. Was very useful. > > result.remove_data() reduces a 5 parameter Logit result object from > megabytes to 5Kb (as compared to a minimum uncompressed size of the > parameters of ~320 bytes). Is big improvement. I'll experiment with > what you suggest -- since this is still >10x larger than possible. I > think the difference is mostly attribute names. > I don't mind the lack of a multinomial support. I've often had better > results mixing independent models for each class. > > I'll experiment with the different solvers. I tried the Logit model > in the past -- its fit function only exposed a maxiter, and not a > tolerance -- meaning I had to set maxiter very high. The newer > statsmodels GLM module looks great and seem to solve this. > > For other who come this way, I think the magic for ridge regression is: > > from statsmodels.genmod.generalized_linear_model import GLM > from statsmodels.genmod.generalized_linear_model import families > from statsmodels.genmod.generalized_linear_model.families import > links > > model = GLM(y, Xtrain, family=families.Binomial(link=links.Logit)) > result = model.fit_regularized(method='elastic_net', > alpha=l2weight, L1_wt=0.0, tol=...) > result.remove_data() > result.predict(Xtest) > > One last thing -- its clear that it should be possible to do something > like scikit's LogisticRegressionCV in order to quickly optimize a > single parameter by re-using past coefficients. > Are there any wrappers in statsmodels for doing this or should I roll my > own? > > > - Stu > > > On Wed, Oct 4, 2017 at 3:43 PM, wrote: > > > > > > On Wed, Oct 4, 2017 at 4:26 PM, Stuart Reynolds < > stuart at stuartreynolds.net> > > wrote: > >> > >> Hi Andy, > >> Thanks -- I'll give another statsmodels another go. > >> I remember I had some fitting speed issues with it in the past, and > >> also some issues related their models keeping references to the data > >> (=disaster for serialization and multiprocessing) -- although that was > >> a long time ago. > > > > > > The second has not changed and will not change, but there is a > remove_data > > method that deletes all references to full, data sized arrays. However, > once > > the data is removed, it is not possible anymore to compute any new > results > > statistics which are almost all lazily computed. > > The fitting speed depends a lot on the optimizer, convergence criteria > and > > difficulty of the problem, and availability of good starting parameters. > > Almost all nonlinear estimation problems use the scipy optimizers, all > > unconstrained optimizers can be used. There are no optimized special > methods > > for cases with a very large number of features. > > > > Multinomial/multiclass models don't support continuous response (yet), > all > > other GLM and discrete models allow for continuous data in the interval > > extension of the domain. > > > > Josef > > > > > >> > >> - Stuart > >> > >> On Wed, Oct 4, 2017 at 1:09 PM, Andreas Mueller > wrote: > >> > Hi Stuart. > >> > There is no interface to do this in scikit-learn (and maybe we should > at > >> > this to the FAQ). 
> >> > Yes, in principle this would be possible with several of the models. > >> > > >> > I think statsmodels can do that, and I think I saw another glm package > >> > for Python that does that? > >> > > >> > It's certainly a legitimate use-case but would require substantial > >> > changes to the code. I think so far we decided not to support > >> > this in scikit-learn. Basically we don't have a concept of a link > >> > function, and it's a concept that only applies to a subset of models. > >> > We try to have a consistent interface for all our estimators, and > >> > this doesn't really fit well within that interface. > >> > > >> > Hth, > >> > Andy > >> > > >> > > >> > On 10/04/2017 03:58 PM, Stuart Reynolds wrote: > >> >> > >> >> I'd like to fit a model that maps a matrix of continuous inputs to a > >> >> target that's between 0 and 1 (a probability). > >> >> > >> >> In principle, I'd expect logistic regression should work out of the > >> >> box with no modification (although its often posed as being strictly > >> >> for classification, its loss function allows for fitting targets in > >> >> the range 0 to 1, and not strictly zero or one.) > >> >> > >> >> However, scikit's LogisticRegression and LogisticRegressionCV reject > >> >> target arrays that are continuous. Other LR implementations allow a > >> >> matrix of probability estimates. Looking at: > >> >> > >> >> > >> >> http://scikit-learn-general.narkive.com/4dSCktaM/using- > logistic-regression-on-a-continuous-target-variable > >> >> and the fix here: > >> >> https://github.com/scikit-learn/scikit-learn/pull/5084, which > disables > >> >> continuous inputs, it looks like there was some reason for this. So > >> >> ... I'm looking for alternatives. > >> >> > >> >> SGDClassifier allows log loss and (if I understood the docs > correctly) > >> >> adds a logistic link function, but also rejects continuous targets. > >> >> Oddly, SGDRegressor only allows ?squared_loss?, ?huber?, > >> >> ?epsilon_insensitive?, or ?squared_epsilon_insensitive?, and doesn't > >> >> seems to give a logistic function. > >> >> > >> >> In principle, GLM allow this, but scikit's docs say the GLM models > >> >> only allows strict linear functions of their input, and doesn't allow > >> >> a logistic link function. The docs direct people to the > >> >> LogisticRegression class for this case. > >> >> > >> >> In R, there is: > >> >> > >> >> glm(Total_Service_Points_Won/Total_Service_Points_Played ~ ... , > >> >> family = binomial(link=logit), weights = > >> >> Total_Service_Points_Played) > >> >> which would be ideal. > >> >> > >> >> Is something similar available in scikit? (Or any continuous model > >> >> that takes and 0 to 1 target and outputs a 0 to 1 target?) > >> >> > >> >> I was surprised to see that the implementation of > >> >> CalibratedClassifierCV(method="sigmoid") uses an internal > >> >> implementation of logistic regression to do its logistic regressing > -- > >> >> which I can use, although I'd prefer to use a user-facing library. 
> >> >> > >> >> Thanks, > >> >> - Stuart > >> >> _______________________________________________ > >> >> scikit-learn mailing list > >> >> scikit-learn at python.org > >> >> https://mail.python.org/mailman/listinfo/scikit-learn > >> > > >> > > >> > _______________________________________________ > >> > scikit-learn mailing list > >> > scikit-learn at python.org > >> > https://mail.python.org/mailman/listinfo/scikit-learn > >> _______________________________________________ > >> scikit-learn mailing list > >> scikit-learn at python.org > >> https://mail.python.org/mailman/listinfo/scikit-learn > > > > > > > > _______________________________________________ > > scikit-learn mailing list > > scikit-learn at python.org > > https://mail.python.org/mailman/listinfo/scikit-learn > > > _______________________________________________ > scikit-learn mailing list > scikit-learn at python.org > https://mail.python.org/mailman/listinfo/scikit-learn > -------------- next part -------------- An HTML attachment was scrubbed... URL: From stuart at stuartreynolds.net Thu Oct 5 14:52:38 2017 From: stuart at stuartreynolds.net (Stuart Reynolds) Date: Thu, 5 Oct 2017 11:52:38 -0700 Subject: [scikit-learn] Can fit a model with a target array of probabilities? In-Reply-To: References: Message-ID: Turns out sm.Logit does allow setting the tolerance. Some and quick and dirty time profiling of different methods on a 100k * 30 features dataset, with different solvers and losses: sklearn.LogisticRegression: l1 1.13864398003 (seconds) sklearn.LogisticRegression: l2 0.0538778305054 sm.Logit l1 0.0922629833221 # Although didn't converge sm.Logit l1_cvxopt_cp 0.958268165588 sm.Logit newton 0.133476018906 sm.Logit nm 0.369864940643 sm.Logit bfgs 0.105798006058 sm.Logit lbfgs 0.06241106987 sm.Logit powell 1.64219808578 sm.Logit cg 0.2184278965 sm.Logit ncg 0.216138124466 sm.Logit basinhopping 8.82164621353 sm.GLM.fit IRLS 0.544688940048 sm.GLM L2: 1.29778695107 I've been getting good results from sm.Logit.fit (although unregularized). statsmodels GLM seems a little slow. Not sure why. My benchmark may be a little apples-to-oranges, since the stopping criteria probably aren't comparable. For tiny models, which I'm also building: 100 samples, 5 features sklearn.LogisticRegression: l1 0.00137376785278 sklearn.LogisticRegression: l2 0.00167894363403 sm.Logit l1 0.0198900699615 sm.Logit l1_cvxopt_cp 0.162448167801 sm.Logit newton 0.00689911842346 sm.Logit nm 0.0754928588867 sm.Logit bfgs 0.0210938453674 sm.Logit lbfgs 0.0156588554382 sm.Logit powell 0.0161390304565 sm.Logit cg 0.00759506225586 sm.Logit ncg 0.00541186332703 sm.Logit basinhopping 0.3076171875 sm.GLM.fit IRLS 0.00902199745178 sm.GLM L2: 0.0208361148834 I couldn't get sm.GLM.fit to work with non "IRLS" solvers. (hits a division by zero). 
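(One way to sanity-check the apples-to-oranges concern above is to compare
the coefficients the different implementations converge to; a rough,
untested sketch, reusing the synthetic data from the script below:)

import numpy as np
import statsmodels.api as sm
import sklearn.datasets
from sklearn.linear_model import LogisticRegression

X, y = sklearn.datasets.make_classification(n_samples=10000, n_features=30,
                                            random_state=123)
sk = LogisticRegression(C=1e6, tol=1e-8, solver='lbfgs',
                        max_iter=1000).fit(X, y)   # ~unpenalized
smres = sm.Logit(y, sm.add_constant(X)).fit(method="lbfgs", maxiter=1000, disp=0)

# If the stopping rules are comparable, both fits should sit at (nearly)
# the same optimum, so the feature coefficients should agree closely.
print(np.abs(sk.coef_.ravel() - smres.params[1:]).max())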
---- import sklearn.datasets from sklearn.preprocessing import StandardScaler X, y = sklearn.datasets.make_classification(n_samples=10000, n_features=30, random_state=123) X = StandardScaler(copy=True, with_mean=True, with_std=True).fit_transform(X) import time tol = 0.0001 maxiter = 100 DISP = 0 if 1: # sk.LogisticRegression import sklearn from sklearn.linear_model import LogisticRegression for method in ["l1", "l2"]: # TODO, add solvers: t = time.time() model = LogisticRegression(C=1, tol=tol, max_iter=maxiter, penalty=method) model.fit(X,y) print "sklearn.LogisticRegression:", method, time.time() - t if 1: # sm.Logit.fit_regularized from statsmodels.discrete.discrete_model import Logit for method in ["l1", "l1_cvxopt_cp"]: t = time.time() model = Logit(y,X) result = model.fit_regularized(method=method, maxiter=maxiter, alpha=1., abstol=tol, acc=tol, tol=tol, gtol=tol, pgtol=tol, disp=DISP) print "sm.Logit", method, time.time() - t if 1: # sm.Logit.fit from statsmodels.discrete.discrete_model import Logit SOLVERS = ["newton", "nm", "bfgs","lbfgs","powell","cg","ncg","basinhopping",] for method in SOLVERS: t = time.time() model = Logit(y,X) result = model.fit(method=method, maxiter=maxiter, niter=maxiter, ftol=tol, tol=tol, gtol=tol, pgtol=tol, # Hmmm.. needs to be reviewed. disp=DISP) print "sm.Logit", method, time.time() - t if 1: # sm.GLM.fit from statsmodels.genmod.generalized_linear_model import GLM from statsmodels.genmod.generalized_linear_model import families for method in ["IRLS"]: t = time.time() model = GLM(y, X, family=families.Binomial(link=families.links.logit)) result = model.fit(method=method, cnvrg_tol=tol, maxiter=maxiter, full_output=False, disp=DISP) print "sm.GLM.fit", method, time.time() - t if 1: # GLM.fit_regularized from statsmodels.genmod.generalized_linear_model import GLM from statsmodels.genmod.generalized_linear_model import families t = time.time() model = GLM(y, X, family=families.Binomial(link=families.links.logit)) result = model.fit_regularized(method='elastic_net', alpha=1.0, L1_wt=0.0, cnvrg_tol=tol, maxiter=maxiter) print "sm.GLM L2:", time.time() - t if 0: # GLM.fit # Hits division by zero. SOLVERS = ["bfgs","lbfgs", "netwon", "nm", "powell","cg","ncg","basinhopping",] from statsmodels.genmod.generalized_linear_model import GLM from statsmodels.genmod.generalized_linear_model import families for method in SOLVERS: t = time.time() model = GLM(y, X, family=families.Binomial(link=families.links.logit)) result = model.fit(method=method, # scale="X2", # alpha=1., # abstol=tol, # acc=tol, # tol=tol, gtol=tol, pgtol=tol, # maxiter=maxiter, # #full_output=False, disp=DISP) print "sm.GLM.fit", method, time.time() - t On Thu, Oct 5, 2017 at 10:32 AM, Sean Violante wrote: > Stuart > have you tried glmnet ( in R) there is a python version > https://web.stanford.edu/~hastie/glmnet_python/ .... > > > > > On Thu, Oct 5, 2017 at 6:34 PM, Stuart Reynolds > wrote: >> >> Thanks Josef. Was very useful. >> >> result.remove_data() reduces a 5 parameter Logit result object from >> megabytes to 5Kb (as compared to a minimum uncompressed size of the >> parameters of ~320 bytes). Is big improvement. I'll experiment with >> what you suggest -- since this is still >10x larger than possible. I >> think the difference is mostly attribute names. >> I don't mind the lack of a multinomial support. I've often had better >> results mixing independent models for each class. >> >> I'll experiment with the different solvers. 
I tried the Logit model >> in the past -- its fit function only exposed a maxiter, and not a >> tolerance -- meaning I had to set maxiter very high. The newer >> statsmodels GLM module looks great and seem to solve this. >> >> For other who come this way, I think the magic for ridge regression is: >> >> from statsmodels.genmod.generalized_linear_model import GLM >> from statsmodels.genmod.generalized_linear_model import families >> from statsmodels.genmod.generalized_linear_model.families import >> links >> >> model = GLM(y, Xtrain, family=families.Binomial(link=links.Logit)) >> result = model.fit_regularized(method='elastic_net', >> alpha=l2weight, L1_wt=0.0, tol=...) >> result.remove_data() >> result.predict(Xtest) >> >> One last thing -- its clear that it should be possible to do something >> like scikit's LogisticRegressionCV in order to quickly optimize a >> single parameter by re-using past coefficients. >> Are there any wrappers in statsmodels for doing this or should I roll my >> own? >> >> >> - Stu >> >> >> On Wed, Oct 4, 2017 at 3:43 PM, wrote: >> > >> > >> > On Wed, Oct 4, 2017 at 4:26 PM, Stuart Reynolds >> > >> > wrote: >> >> >> >> Hi Andy, >> >> Thanks -- I'll give another statsmodels another go. >> >> I remember I had some fitting speed issues with it in the past, and >> >> also some issues related their models keeping references to the data >> >> (=disaster for serialization and multiprocessing) -- although that was >> >> a long time ago. >> > >> > >> > The second has not changed and will not change, but there is a >> > remove_data >> > method that deletes all references to full, data sized arrays. However, >> > once >> > the data is removed, it is not possible anymore to compute any new >> > results >> > statistics which are almost all lazily computed. >> > The fitting speed depends a lot on the optimizer, convergence criteria >> > and >> > difficulty of the problem, and availability of good starting parameters. >> > Almost all nonlinear estimation problems use the scipy optimizers, all >> > unconstrained optimizers can be used. There are no optimized special >> > methods >> > for cases with a very large number of features. >> > >> > Multinomial/multiclass models don't support continuous response (yet), >> > all >> > other GLM and discrete models allow for continuous data in the interval >> > extension of the domain. >> > >> > Josef >> > >> > >> >> >> >> - Stuart >> >> >> >> On Wed, Oct 4, 2017 at 1:09 PM, Andreas Mueller >> >> wrote: >> >> > Hi Stuart. >> >> > There is no interface to do this in scikit-learn (and maybe we should >> >> > at >> >> > this to the FAQ). >> >> > Yes, in principle this would be possible with several of the models. >> >> > >> >> > I think statsmodels can do that, and I think I saw another glm >> >> > package >> >> > for Python that does that? >> >> > >> >> > It's certainly a legitimate use-case but would require substantial >> >> > changes to the code. I think so far we decided not to support >> >> > this in scikit-learn. Basically we don't have a concept of a link >> >> > function, and it's a concept that only applies to a subset of models. >> >> > We try to have a consistent interface for all our estimators, and >> >> > this doesn't really fit well within that interface. >> >> > >> >> > Hth, >> >> > Andy >> >> > >> >> > >> >> > On 10/04/2017 03:58 PM, Stuart Reynolds wrote: >> >> >> >> >> >> I'd like to fit a model that maps a matrix of continuous inputs to a >> >> >> target that's between 0 and 1 (a probability). 
>> >> >> >> >> >> In principle, I'd expect logistic regression should work out of the >> >> >> box with no modification (although its often posed as being strictly >> >> >> for classification, its loss function allows for fitting targets in >> >> >> the range 0 to 1, and not strictly zero or one.) >> >> >> >> >> >> However, scikit's LogisticRegression and LogisticRegressionCV reject >> >> >> target arrays that are continuous. Other LR implementations allow a >> >> >> matrix of probability estimates. Looking at: >> >> >> >> >> >> >> >> >> >> >> >> http://scikit-learn-general.narkive.com/4dSCktaM/using-logistic-regression-on-a-continuous-target-variable >> >> >> and the fix here: >> >> >> https://github.com/scikit-learn/scikit-learn/pull/5084, which >> >> >> disables >> >> >> continuous inputs, it looks like there was some reason for this. So >> >> >> ... I'm looking for alternatives. >> >> >> >> >> >> SGDClassifier allows log loss and (if I understood the docs >> >> >> correctly) >> >> >> adds a logistic link function, but also rejects continuous targets. >> >> >> Oddly, SGDRegressor only allows ?squared_loss?, ?huber?, >> >> >> ?epsilon_insensitive?, or ?squared_epsilon_insensitive?, and doesn't >> >> >> seems to give a logistic function. >> >> >> >> >> >> In principle, GLM allow this, but scikit's docs say the GLM models >> >> >> only allows strict linear functions of their input, and doesn't >> >> >> allow >> >> >> a logistic link function. The docs direct people to the >> >> >> LogisticRegression class for this case. >> >> >> >> >> >> In R, there is: >> >> >> >> >> >> glm(Total_Service_Points_Won/Total_Service_Points_Played ~ ... , >> >> >> family = binomial(link=logit), weights = >> >> >> Total_Service_Points_Played) >> >> >> which would be ideal. >> >> >> >> >> >> Is something similar available in scikit? (Or any continuous model >> >> >> that takes and 0 to 1 target and outputs a 0 to 1 target?) >> >> >> >> >> >> I was surprised to see that the implementation of >> >> >> CalibratedClassifierCV(method="sigmoid") uses an internal >> >> >> implementation of logistic regression to do its logistic regressing >> >> >> -- >> >> >> which I can use, although I'd prefer to use a user-facing library. 
>> >> >> >> >> >> Thanks, >> >> >> - Stuart >> >> >> _______________________________________________ >> >> >> scikit-learn mailing list >> >> >> scikit-learn at python.org >> >> >> https://mail.python.org/mailman/listinfo/scikit-learn >> >> > >> >> > >> >> > _______________________________________________ >> >> > scikit-learn mailing list >> >> > scikit-learn at python.org >> >> > https://mail.python.org/mailman/listinfo/scikit-learn >> >> _______________________________________________ >> >> scikit-learn mailing list >> >> scikit-learn at python.org >> >> https://mail.python.org/mailman/listinfo/scikit-learn >> > >> > >> > >> > _______________________________________________ >> > scikit-learn mailing list >> > scikit-learn at python.org >> > https://mail.python.org/mailman/listinfo/scikit-learn >> > >> _______________________________________________ >> scikit-learn mailing list >> scikit-learn at python.org >> https://mail.python.org/mailman/listinfo/scikit-learn > > > > _______________________________________________ > scikit-learn mailing list > scikit-learn at python.org > https://mail.python.org/mailman/listinfo/scikit-learn > From stuart at stuartreynolds.net Thu Oct 5 15:00:29 2017 From: stuart at stuartreynolds.net (Stuart Reynolds) Date: Thu, 5 Oct 2017 12:00:29 -0700 Subject: [scikit-learn] Can fit a model with a target array of probabilities? In-Reply-To: References: Message-ID: Hi Sean, I'll have a look glmnet (looks like its compiled from fortran!). Does it offer much over statsmodel's GLM? This looks great for researchy stuff, although a little less performant. - Stu On Thu, Oct 5, 2017 at 10:32 AM, Sean Violante wrote: > Stuart > have you tried glmnet ( in R) there is a python version > https://web.stanford.edu/~hastie/glmnet_python/ .... > > > > > On Thu, Oct 5, 2017 at 6:34 PM, Stuart Reynolds > wrote: >> >> Thanks Josef. Was very useful. >> >> result.remove_data() reduces a 5 parameter Logit result object from >> megabytes to 5Kb (as compared to a minimum uncompressed size of the >> parameters of ~320 bytes). Is big improvement. I'll experiment with >> what you suggest -- since this is still >10x larger than possible. I >> think the difference is mostly attribute names. >> I don't mind the lack of a multinomial support. I've often had better >> results mixing independent models for each class. >> >> I'll experiment with the different solvers. I tried the Logit model >> in the past -- its fit function only exposed a maxiter, and not a >> tolerance -- meaning I had to set maxiter very high. The newer >> statsmodels GLM module looks great and seem to solve this. >> >> For other who come this way, I think the magic for ridge regression is: >> >> from statsmodels.genmod.generalized_linear_model import GLM >> from statsmodels.genmod.generalized_linear_model import families >> from statsmodels.genmod.generalized_linear_model.families import >> links >> >> model = GLM(y, Xtrain, family=families.Binomial(link=links.Logit)) >> result = model.fit_regularized(method='elastic_net', >> alpha=l2weight, L1_wt=0.0, tol=...) >> result.remove_data() >> result.predict(Xtest) >> >> One last thing -- its clear that it should be possible to do something >> like scikit's LogisticRegressionCV in order to quickly optimize a >> single parameter by re-using past coefficients. >> Are there any wrappers in statsmodels for doing this or should I roll my >> own? 
>> >> >> - Stu >> >> >> On Wed, Oct 4, 2017 at 3:43 PM, wrote: >> > >> > >> > On Wed, Oct 4, 2017 at 4:26 PM, Stuart Reynolds >> > >> > wrote: >> >> >> >> Hi Andy, >> >> Thanks -- I'll give another statsmodels another go. >> >> I remember I had some fitting speed issues with it in the past, and >> >> also some issues related their models keeping references to the data >> >> (=disaster for serialization and multiprocessing) -- although that was >> >> a long time ago. >> > >> > >> > The second has not changed and will not change, but there is a >> > remove_data >> > method that deletes all references to full, data sized arrays. However, >> > once >> > the data is removed, it is not possible anymore to compute any new >> > results >> > statistics which are almost all lazily computed. >> > The fitting speed depends a lot on the optimizer, convergence criteria >> > and >> > difficulty of the problem, and availability of good starting parameters. >> > Almost all nonlinear estimation problems use the scipy optimizers, all >> > unconstrained optimizers can be used. There are no optimized special >> > methods >> > for cases with a very large number of features. >> > >> > Multinomial/multiclass models don't support continuous response (yet), >> > all >> > other GLM and discrete models allow for continuous data in the interval >> > extension of the domain. >> > >> > Josef >> > >> > >> >> >> >> - Stuart >> >> >> >> On Wed, Oct 4, 2017 at 1:09 PM, Andreas Mueller >> >> wrote: >> >> > Hi Stuart. >> >> > There is no interface to do this in scikit-learn (and maybe we should >> >> > at >> >> > this to the FAQ). >> >> > Yes, in principle this would be possible with several of the models. >> >> > >> >> > I think statsmodels can do that, and I think I saw another glm >> >> > package >> >> > for Python that does that? >> >> > >> >> > It's certainly a legitimate use-case but would require substantial >> >> > changes to the code. I think so far we decided not to support >> >> > this in scikit-learn. Basically we don't have a concept of a link >> >> > function, and it's a concept that only applies to a subset of models. >> >> > We try to have a consistent interface for all our estimators, and >> >> > this doesn't really fit well within that interface. >> >> > >> >> > Hth, >> >> > Andy >> >> > >> >> > >> >> > On 10/04/2017 03:58 PM, Stuart Reynolds wrote: >> >> >> >> >> >> I'd like to fit a model that maps a matrix of continuous inputs to a >> >> >> target that's between 0 and 1 (a probability). >> >> >> >> >> >> In principle, I'd expect logistic regression should work out of the >> >> >> box with no modification (although its often posed as being strictly >> >> >> for classification, its loss function allows for fitting targets in >> >> >> the range 0 to 1, and not strictly zero or one.) >> >> >> >> >> >> However, scikit's LogisticRegression and LogisticRegressionCV reject >> >> >> target arrays that are continuous. Other LR implementations allow a >> >> >> matrix of probability estimates. Looking at: >> >> >> >> >> >> >> >> >> >> >> >> http://scikit-learn-general.narkive.com/4dSCktaM/using-logistic-regression-on-a-continuous-target-variable >> >> >> and the fix here: >> >> >> https://github.com/scikit-learn/scikit-learn/pull/5084, which >> >> >> disables >> >> >> continuous inputs, it looks like there was some reason for this. So >> >> >> ... I'm looking for alternatives. 
>> >> >> >> >> >> SGDClassifier allows log loss and (if I understood the docs >> >> >> correctly) >> >> >> adds a logistic link function, but also rejects continuous targets. >> >> >> Oddly, SGDRegressor only allows ?squared_loss?, ?huber?, >> >> >> ?epsilon_insensitive?, or ?squared_epsilon_insensitive?, and doesn't >> >> >> seems to give a logistic function. >> >> >> >> >> >> In principle, GLM allow this, but scikit's docs say the GLM models >> >> >> only allows strict linear functions of their input, and doesn't >> >> >> allow >> >> >> a logistic link function. The docs direct people to the >> >> >> LogisticRegression class for this case. >> >> >> >> >> >> In R, there is: >> >> >> >> >> >> glm(Total_Service_Points_Won/Total_Service_Points_Played ~ ... , >> >> >> family = binomial(link=logit), weights = >> >> >> Total_Service_Points_Played) >> >> >> which would be ideal. >> >> >> >> >> >> Is something similar available in scikit? (Or any continuous model >> >> >> that takes and 0 to 1 target and outputs a 0 to 1 target?) >> >> >> >> >> >> I was surprised to see that the implementation of >> >> >> CalibratedClassifierCV(method="sigmoid") uses an internal >> >> >> implementation of logistic regression to do its logistic regressing >> >> >> -- >> >> >> which I can use, although I'd prefer to use a user-facing library. >> >> >> >> >> >> Thanks, >> >> >> - Stuart >> >> >> _______________________________________________ >> >> >> scikit-learn mailing list >> >> >> scikit-learn at python.org >> >> >> https://mail.python.org/mailman/listinfo/scikit-learn >> >> > >> >> > >> >> > _______________________________________________ >> >> > scikit-learn mailing list >> >> > scikit-learn at python.org >> >> > https://mail.python.org/mailman/listinfo/scikit-learn >> >> _______________________________________________ >> >> scikit-learn mailing list >> >> scikit-learn at python.org >> >> https://mail.python.org/mailman/listinfo/scikit-learn >> > >> > >> > >> > _______________________________________________ >> > scikit-learn mailing list >> > scikit-learn at python.org >> > https://mail.python.org/mailman/listinfo/scikit-learn >> > >> _______________________________________________ >> scikit-learn mailing list >> scikit-learn at python.org >> https://mail.python.org/mailman/listinfo/scikit-learn > > > > _______________________________________________ > scikit-learn mailing list > scikit-learn at python.org > https://mail.python.org/mailman/listinfo/scikit-learn > From josef.pktd at gmail.com Thu Oct 5 15:07:43 2017 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Thu, 5 Oct 2017 15:07:43 -0400 Subject: [scikit-learn] Can fit a model with a target array of probabilities? In-Reply-To: References: Message-ID: On Thu, Oct 5, 2017 at 3:00 PM, Stuart Reynolds wrote: > Hi Sean, > > I'll have a look glmnet (looks like its compiled from fortran!). Does > it offer much over statsmodel's GLM? This looks great for researchy > stuff, although a little less performant. > GLMNet is/wraps the original Fortran implementation of elastic net. I expect that it is much faster than the python version in statsmodels. I have no idea what option they support and what restrictions they have on the data. I have no guess on speed difference for the non-penalized version. I assume it's Fortran loops with coordinate descend versus iterative linear algebra. 
Josef > > - Stu > > > > On Thu, Oct 5, 2017 at 10:32 AM, Sean Violante > wrote: > > Stuart > > have you tried glmnet ( in R) there is a python version > > https://web.stanford.edu/~hastie/glmnet_python/ .... > > > > > > > > > > On Thu, Oct 5, 2017 at 6:34 PM, Stuart Reynolds < > stuart at stuartreynolds.net> > > wrote: > >> > >> Thanks Josef. Was very useful. > >> > >> result.remove_data() reduces a 5 parameter Logit result object from > >> megabytes to 5Kb (as compared to a minimum uncompressed size of the > >> parameters of ~320 bytes). Is big improvement. I'll experiment with > >> what you suggest -- since this is still >10x larger than possible. I > >> think the difference is mostly attribute names. > >> I don't mind the lack of a multinomial support. I've often had better > >> results mixing independent models for each class. > >> > >> I'll experiment with the different solvers. I tried the Logit model > >> in the past -- its fit function only exposed a maxiter, and not a > >> tolerance -- meaning I had to set maxiter very high. The newer > >> statsmodels GLM module looks great and seem to solve this. > >> > >> For other who come this way, I think the magic for ridge regression is: > >> > >> from statsmodels.genmod.generalized_linear_model import GLM > >> from statsmodels.genmod.generalized_linear_model import > families > >> from statsmodels.genmod.generalized_linear_model.families > import > >> links > >> > >> model = GLM(y, Xtrain, family=families.Binomial(link= > links.Logit)) > >> result = model.fit_regularized(method='elastic_net', > >> alpha=l2weight, L1_wt=0.0, tol=...) > >> result.remove_data() > >> result.predict(Xtest) > >> > >> One last thing -- its clear that it should be possible to do something > >> like scikit's LogisticRegressionCV in order to quickly optimize a > >> single parameter by re-using past coefficients. > >> Are there any wrappers in statsmodels for doing this or should I roll my > >> own? > >> > >> > >> - Stu > >> > >> > >> On Wed, Oct 4, 2017 at 3:43 PM, wrote: > >> > > >> > > >> > On Wed, Oct 4, 2017 at 4:26 PM, Stuart Reynolds > >> > > >> > wrote: > >> >> > >> >> Hi Andy, > >> >> Thanks -- I'll give another statsmodels another go. > >> >> I remember I had some fitting speed issues with it in the past, and > >> >> also some issues related their models keeping references to the data > >> >> (=disaster for serialization and multiprocessing) -- although that > was > >> >> a long time ago. > >> > > >> > > >> > The second has not changed and will not change, but there is a > >> > remove_data > >> > method that deletes all references to full, data sized arrays. > However, > >> > once > >> > the data is removed, it is not possible anymore to compute any new > >> > results > >> > statistics which are almost all lazily computed. > >> > The fitting speed depends a lot on the optimizer, convergence criteria > >> > and > >> > difficulty of the problem, and availability of good starting > parameters. > >> > Almost all nonlinear estimation problems use the scipy optimizers, all > >> > unconstrained optimizers can be used. There are no optimized special > >> > methods > >> > for cases with a very large number of features. > >> > > >> > Multinomial/multiclass models don't support continuous response (yet), > >> > all > >> > other GLM and discrete models allow for continuous data in the > interval > >> > extension of the domain. 
> >> > > >> > Josef > >> > > >> > > >> >> > >> >> - Stuart > >> >> > >> >> On Wed, Oct 4, 2017 at 1:09 PM, Andreas Mueller > >> >> wrote: > >> >> > Hi Stuart. > >> >> > There is no interface to do this in scikit-learn (and maybe we > should > >> >> > at > >> >> > this to the FAQ). > >> >> > Yes, in principle this would be possible with several of the > models. > >> >> > > >> >> > I think statsmodels can do that, and I think I saw another glm > >> >> > package > >> >> > for Python that does that? > >> >> > > >> >> > It's certainly a legitimate use-case but would require substantial > >> >> > changes to the code. I think so far we decided not to support > >> >> > this in scikit-learn. Basically we don't have a concept of a link > >> >> > function, and it's a concept that only applies to a subset of > models. > >> >> > We try to have a consistent interface for all our estimators, and > >> >> > this doesn't really fit well within that interface. > >> >> > > >> >> > Hth, > >> >> > Andy > >> >> > > >> >> > > >> >> > On 10/04/2017 03:58 PM, Stuart Reynolds wrote: > >> >> >> > >> >> >> I'd like to fit a model that maps a matrix of continuous inputs > to a > >> >> >> target that's between 0 and 1 (a probability). > >> >> >> > >> >> >> In principle, I'd expect logistic regression should work out of > the > >> >> >> box with no modification (although its often posed as being > strictly > >> >> >> for classification, its loss function allows for fitting targets > in > >> >> >> the range 0 to 1, and not strictly zero or one.) > >> >> >> > >> >> >> However, scikit's LogisticRegression and LogisticRegressionCV > reject > >> >> >> target arrays that are continuous. Other LR implementations allow > a > >> >> >> matrix of probability estimates. Looking at: > >> >> >> > >> >> >> > >> >> >> > >> >> >> http://scikit-learn-general.narkive.com/4dSCktaM/using- > logistic-regression-on-a-continuous-target-variable > >> >> >> and the fix here: > >> >> >> https://github.com/scikit-learn/scikit-learn/pull/5084, which > >> >> >> disables > >> >> >> continuous inputs, it looks like there was some reason for this. > So > >> >> >> ... I'm looking for alternatives. > >> >> >> > >> >> >> SGDClassifier allows log loss and (if I understood the docs > >> >> >> correctly) > >> >> >> adds a logistic link function, but also rejects continuous > targets. > >> >> >> Oddly, SGDRegressor only allows ?squared_loss?, ?huber?, > >> >> >> ?epsilon_insensitive?, or ?squared_epsilon_insensitive?, and > doesn't > >> >> >> seems to give a logistic function. > >> >> >> > >> >> >> In principle, GLM allow this, but scikit's docs say the GLM models > >> >> >> only allows strict linear functions of their input, and doesn't > >> >> >> allow > >> >> >> a logistic link function. The docs direct people to the > >> >> >> LogisticRegression class for this case. > >> >> >> > >> >> >> In R, there is: > >> >> >> > >> >> >> glm(Total_Service_Points_Won/Total_Service_Points_Played ~ ... , > >> >> >> family = binomial(link=logit), weights = > >> >> >> Total_Service_Points_Played) > >> >> >> which would be ideal. > >> >> >> > >> >> >> Is something similar available in scikit? (Or any continuous model > >> >> >> that takes and 0 to 1 target and outputs a 0 to 1 target?) 
> >> >> >> > >> >> >> I was surprised to see that the implementation of > >> >> >> CalibratedClassifierCV(method="sigmoid") uses an internal > >> >> >> implementation of logistic regression to do its logistic > regressing > >> >> >> -- > >> >> >> which I can use, although I'd prefer to use a user-facing library. > >> >> >> > >> >> >> Thanks, > >> >> >> - Stuart > >> >> >> _______________________________________________ > >> >> >> scikit-learn mailing list > >> >> >> scikit-learn at python.org > >> >> >> https://mail.python.org/mailman/listinfo/scikit-learn > >> >> > > >> >> > > >> >> > _______________________________________________ > >> >> > scikit-learn mailing list > >> >> > scikit-learn at python.org > >> >> > https://mail.python.org/mailman/listinfo/scikit-learn > >> >> _______________________________________________ > >> >> scikit-learn mailing list > >> >> scikit-learn at python.org > >> >> https://mail.python.org/mailman/listinfo/scikit-learn > >> > > >> > > >> > > >> > _______________________________________________ > >> > scikit-learn mailing list > >> > scikit-learn at python.org > >> > https://mail.python.org/mailman/listinfo/scikit-learn > >> > > >> _______________________________________________ > >> scikit-learn mailing list > >> scikit-learn at python.org > >> https://mail.python.org/mailman/listinfo/scikit-learn > > > > > > > > _______________________________________________ > > scikit-learn mailing list > > scikit-learn at python.org > > https://mail.python.org/mailman/listinfo/scikit-learn > > > _______________________________________________ > scikit-learn mailing list > scikit-learn at python.org > https://mail.python.org/mailman/listinfo/scikit-learn > -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Thu Oct 5 15:27:22 2017 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Thu, 5 Oct 2017 15:27:22 -0400 Subject: [scikit-learn] Can fit a model with a target array of probabilities? In-Reply-To: References: Message-ID: On Thu, Oct 5, 2017 at 2:52 PM, Stuart Reynolds wrote: > Turns out sm.Logit does allow setting the tolerance. > Some and quick and dirty time profiling of different methods on a 100k > * 30 features dataset, with different solvers and losses: > > sklearn.LogisticRegression: l1 1.13864398003 (seconds) > sklearn.LogisticRegression: l2 0.0538778305054 > sm.Logit l1 0.0922629833221 # Although didn't converge > sm.Logit l1_cvxopt_cp 0.958268165588 > sm.Logit newton 0.133476018906 > sm.Logit nm 0.369864940643 > sm.Logit bfgs 0.105798006058 > sm.Logit lbfgs 0.06241106987 > sm.Logit powell 1.64219808578 > sm.Logit cg 0.2184278965 > sm.Logit ncg 0.216138124466 > sm.Logit basinhopping 8.82164621353 > sm.GLM.fit IRLS 0.544688940048 > sm.GLM L2: 1.29778695107 > > I've been getting good results from sm.Logit.fit (although unregularized). > statsmodels GLM seems a little slow. Not sure why. > > My benchmark may be a little apples-to-oranges, since the stopping > criteria probably aren't comparable. > I think that's a problem with GLM IRLS. AFAIK, but never fully tested, is that the objective function is proportional to the number of observations and the convergence criterion becomes tighter as nobs increases. I don't find the issue or PR discussion anymore, but one of our contributors fixed maxiter at 15 or something like that for IRLS with around 4 to 5 million observations and mostly categorical explanatory variables in his application. 
unfortunately (no upfront design and decisions across models) https://github.com/statsmodels/statsmodels/issues/2825 Josef > > > For tiny models, which I'm also building: 100 samples, 5 features > > sklearn.LogisticRegression: l1 0.00137376785278 > sklearn.LogisticRegression: l2 0.00167894363403 > sm.Logit l1 0.0198900699615 > sm.Logit l1_cvxopt_cp 0.162448167801 > sm.Logit newton 0.00689911842346 > sm.Logit nm 0.0754928588867 > sm.Logit bfgs 0.0210938453674 > sm.Logit lbfgs 0.0156588554382 > sm.Logit powell 0.0161390304565 > sm.Logit cg 0.00759506225586 > sm.Logit ncg 0.00541186332703 > sm.Logit basinhopping 0.3076171875 > sm.GLM.fit IRLS 0.00902199745178 > sm.GLM L2: 0.0208361148834 > > I couldn't get sm.GLM.fit to work with non "IRLS" solvers. (hits a > division by zero). > > ---- > > import sklearn.datasets > from sklearn.preprocessing import StandardScaler > X, y = sklearn.datasets.make_classification(n_samples=10000, > n_features=30, random_state=123) > X = StandardScaler(copy=True, with_mean=True, with_std=True).fit_transform( > X) > > import time > tol = 0.0001 > maxiter = 100 > DISP = 0 > > > if 1: # sk.LogisticRegression > import sklearn > from sklearn.linear_model import LogisticRegression > > for method in ["l1", "l2"]: # TODO, add solvers: > t = time.time() > model = LogisticRegression(C=1, tol=tol, max_iter=maxiter, > penalty=method) > model.fit(X,y) > print "sklearn.LogisticRegression:", method, time.time() - t > > > > > if 1: # sm.Logit.fit_regularized > from statsmodels.discrete.discrete_model import Logit > for method in ["l1", "l1_cvxopt_cp"]: > t = time.time() > model = Logit(y,X) > result = model.fit_regularized(method=method, maxiter=maxiter, > alpha=1., > abstol=tol, > acc=tol, > tol=tol, gtol=tol, pgtol=tol, > disp=DISP) > print "sm.Logit", method, time.time() - t > > if 1: # sm.Logit.fit > from statsmodels.discrete.discrete_model import Logit > > SOLVERS = ["newton", "nm", > "bfgs","lbfgs","powell","cg","ncg","basinhopping",] > for method in SOLVERS: > t = time.time() > model = Logit(y,X) > result = model.fit(method=method, maxiter=maxiter, > niter=maxiter, > ftol=tol, > tol=tol, gtol=tol, pgtol=tol, # Hmmm.. > needs to be reviewed. > disp=DISP) > print "sm.Logit", method, time.time() - t > > if 1: # sm.GLM.fit > from statsmodels.genmod.generalized_linear_model import GLM > from statsmodels.genmod.generalized_linear_model import families > for method in ["IRLS"]: > t = time.time() > model = GLM(y, X, family=families.Binomial(link= > families.links.logit)) > result = model.fit(method=method, cnvrg_tol=tol, > maxiter=maxiter, full_output=False, disp=DISP) > print "sm.GLM.fit", method, time.time() - t > > > if 1: # GLM.fit_regularized > from statsmodels.genmod.generalized_linear_model import GLM > from statsmodels.genmod.generalized_linear_model import families > t = time.time() > model = GLM(y, X, family=families.Binomial(link=families.links.logit)) > result = model.fit_regularized(method='elastic_net', alpha=1.0, > L1_wt=0.0, cnvrg_tol=tol, maxiter=maxiter) > print "sm.GLM L2:", time.time() - t > > > > if 0: # GLM.fit > # Hits division by zero. 
> SOLVERS = ["bfgs","lbfgs", "netwon", "nm", > "powell","cg","ncg","basinhopping",] > from statsmodels.genmod.generalized_linear_model import GLM > from statsmodels.genmod.generalized_linear_model import families > for method in SOLVERS: > t = time.time() > model = GLM(y, X, family=families.Binomial(link= > families.links.logit)) > result = model.fit(method=method, > # scale="X2", > # alpha=1., > # abstol=tol, > # acc=tol, > # tol=tol, gtol=tol, pgtol=tol, > # maxiter=maxiter, > # #full_output=False, > disp=DISP) > print "sm.GLM.fit", method, time.time() - t > > > On Thu, Oct 5, 2017 at 10:32 AM, Sean Violante > wrote: > > Stuart > > have you tried glmnet ( in R) there is a python version > > https://web.stanford.edu/~hastie/glmnet_python/ .... > > > > > > > > > > On Thu, Oct 5, 2017 at 6:34 PM, Stuart Reynolds < > stuart at stuartreynolds.net> > > wrote: > >> > >> Thanks Josef. Was very useful. > >> > >> result.remove_data() reduces a 5 parameter Logit result object from > >> megabytes to 5Kb (as compared to a minimum uncompressed size of the > >> parameters of ~320 bytes). Is big improvement. I'll experiment with > >> what you suggest -- since this is still >10x larger than possible. I > >> think the difference is mostly attribute names. > >> I don't mind the lack of a multinomial support. I've often had better > >> results mixing independent models for each class. > >> > >> I'll experiment with the different solvers. I tried the Logit model > >> in the past -- its fit function only exposed a maxiter, and not a > >> tolerance -- meaning I had to set maxiter very high. The newer > >> statsmodels GLM module looks great and seem to solve this. > >> > >> For other who come this way, I think the magic for ridge regression is: > >> > >> from statsmodels.genmod.generalized_linear_model import GLM > >> from statsmodels.genmod.generalized_linear_model import > families > >> from statsmodels.genmod.generalized_linear_model.families > import > >> links > >> > >> model = GLM(y, Xtrain, family=families.Binomial(link= > links.Logit)) > >> result = model.fit_regularized(method='elastic_net', > >> alpha=l2weight, L1_wt=0.0, tol=...) > >> result.remove_data() > >> result.predict(Xtest) > >> > >> One last thing -- its clear that it should be possible to do something > >> like scikit's LogisticRegressionCV in order to quickly optimize a > >> single parameter by re-using past coefficients. > >> Are there any wrappers in statsmodels for doing this or should I roll my > >> own? > >> > >> > >> - Stu > >> > >> > >> On Wed, Oct 4, 2017 at 3:43 PM, wrote: > >> > > >> > > >> > On Wed, Oct 4, 2017 at 4:26 PM, Stuart Reynolds > >> > > >> > wrote: > >> >> > >> >> Hi Andy, > >> >> Thanks -- I'll give another statsmodels another go. > >> >> I remember I had some fitting speed issues with it in the past, and > >> >> also some issues related their models keeping references to the data > >> >> (=disaster for serialization and multiprocessing) -- although that > was > >> >> a long time ago. > >> > > >> > > >> > The second has not changed and will not change, but there is a > >> > remove_data > >> > method that deletes all references to full, data sized arrays. > However, > >> > once > >> > the data is removed, it is not possible anymore to compute any new > >> > results > >> > statistics which are almost all lazily computed. > >> > The fitting speed depends a lot on the optimizer, convergence criteria > >> > and > >> > difficulty of the problem, and availability of good starting > parameters. 
> >> > Almost all nonlinear estimation problems use the scipy optimizers, all > >> > unconstrained optimizers can be used. There are no optimized special > >> > methods > >> > for cases with a very large number of features. > >> > > >> > Multinomial/multiclass models don't support continuous response (yet), > >> > all > >> > other GLM and discrete models allow for continuous data in the > interval > >> > extension of the domain. > >> > > >> > Josef > >> > > >> > > >> >> > >> >> - Stuart > >> >> > >> >> On Wed, Oct 4, 2017 at 1:09 PM, Andreas Mueller > >> >> wrote: > >> >> > Hi Stuart. > >> >> > There is no interface to do this in scikit-learn (and maybe we > should > >> >> > at > >> >> > this to the FAQ). > >> >> > Yes, in principle this would be possible with several of the > models. > >> >> > > >> >> > I think statsmodels can do that, and I think I saw another glm > >> >> > package > >> >> > for Python that does that? > >> >> > > >> >> > It's certainly a legitimate use-case but would require substantial > >> >> > changes to the code. I think so far we decided not to support > >> >> > this in scikit-learn. Basically we don't have a concept of a link > >> >> > function, and it's a concept that only applies to a subset of > models. > >> >> > We try to have a consistent interface for all our estimators, and > >> >> > this doesn't really fit well within that interface. > >> >> > > >> >> > Hth, > >> >> > Andy > >> >> > > >> >> > > >> >> > On 10/04/2017 03:58 PM, Stuart Reynolds wrote: > >> >> >> > >> >> >> I'd like to fit a model that maps a matrix of continuous inputs > to a > >> >> >> target that's between 0 and 1 (a probability). > >> >> >> > >> >> >> In principle, I'd expect logistic regression should work out of > the > >> >> >> box with no modification (although its often posed as being > strictly > >> >> >> for classification, its loss function allows for fitting targets > in > >> >> >> the range 0 to 1, and not strictly zero or one.) > >> >> >> > >> >> >> However, scikit's LogisticRegression and LogisticRegressionCV > reject > >> >> >> target arrays that are continuous. Other LR implementations allow > a > >> >> >> matrix of probability estimates. Looking at: > >> >> >> > >> >> >> > >> >> >> > >> >> >> http://scikit-learn-general.narkive.com/4dSCktaM/using- > logistic-regression-on-a-continuous-target-variable > >> >> >> and the fix here: > >> >> >> https://github.com/scikit-learn/scikit-learn/pull/5084, which > >> >> >> disables > >> >> >> continuous inputs, it looks like there was some reason for this. > So > >> >> >> ... I'm looking for alternatives. > >> >> >> > >> >> >> SGDClassifier allows log loss and (if I understood the docs > >> >> >> correctly) > >> >> >> adds a logistic link function, but also rejects continuous > targets. > >> >> >> Oddly, SGDRegressor only allows ?squared_loss?, ?huber?, > >> >> >> ?epsilon_insensitive?, or ?squared_epsilon_insensitive?, and > doesn't > >> >> >> seems to give a logistic function. > >> >> >> > >> >> >> In principle, GLM allow this, but scikit's docs say the GLM models > >> >> >> only allows strict linear functions of their input, and doesn't > >> >> >> allow > >> >> >> a logistic link function. The docs direct people to the > >> >> >> LogisticRegression class for this case. > >> >> >> > >> >> >> In R, there is: > >> >> >> > >> >> >> glm(Total_Service_Points_Won/Total_Service_Points_Played ~ ... , > >> >> >> family = binomial(link=logit), weights = > >> >> >> Total_Service_Points_Played) > >> >> >> which would be ideal. 
> >> >> >>
> >> >> >> Is something similar available in scikit? (Or any continuous model
> >> >> >> that takes and 0 to 1 target and outputs a 0 to 1 target?)
> >> >> >>
> >> >> >> I was surprised to see that the implementation of
> >> >> >> CalibratedClassifierCV(method="sigmoid") uses an internal
> >> >> >> implementation of logistic regression to do its logistic regressing --
> >> >> >> which I can use, although I'd prefer to use a user-facing library.
> >> >> >>
> >> >> >> Thanks,
> >> >> >> - Stuart
> >> >> >> _______________________________________________
> >> >> >> scikit-learn mailing list
> >> >> >> scikit-learn at python.org
> >> >> >> https://mail.python.org/mailman/listinfo/scikit-learn
> >> >> >
> >> >> >
> >> >> > _______________________________________________
> >> >> > scikit-learn mailing list
> >> >> > scikit-learn at python.org
> >> >> > https://mail.python.org/mailman/listinfo/scikit-learn
> >> >> _______________________________________________
> >> >> scikit-learn mailing list
> >> >> scikit-learn at python.org
> >> >> https://mail.python.org/mailman/listinfo/scikit-learn
> >> >
> >> >
> >> >
> >> > _______________________________________________
> >> > scikit-learn mailing list
> >> > scikit-learn at python.org
> >> > https://mail.python.org/mailman/listinfo/scikit-learn
> >> >
> >> _______________________________________________
> >> scikit-learn mailing list
> >> scikit-learn at python.org
> >> https://mail.python.org/mailman/listinfo/scikit-learn
> >
> >
> >
> > _______________________________________________
> > scikit-learn mailing list
> > scikit-learn at python.org
> > https://mail.python.org/mailman/listinfo/scikit-learn
> >
> _______________________________________________
> scikit-learn mailing list
> scikit-learn at python.org
> https://mail.python.org/mailman/listinfo/scikit-learn
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From xulifan at udel.edu Thu Oct 5 18:53:56 2017
From: xulifan at udel.edu (Lifan Xu)
Date: Thu, 5 Oct 2017 15:53:56 -0700
Subject: [scikit-learn] question for using GridSearchCV on LocalOutlierFactor
Message-ID: <779b1d39-5767-d2df-1d7c-64e4e7b1a2f4@udel.edu>

    Hi,

    I was trying to train a model for anomaly detection. I only have the
normal data, which are all labeled as 1. Here is my code:

        clf = sklearn.model_selection.GridSearchCV(
            sklearn.neighbors.LocalOutlierFactor(),
            parameters, scoring="accuracy", cv=kfold, n_jobs=10)
        clf.fit(vectors, labels)

    But it complains "AttributeError: 'LocalOutlierFactor' object has
no attribute 'predict'". It looks like LocalOutlierFactor only has
fit_predict(), but no predict().

    My question is: will predict() be implemented?

    Thanks!


From sean.violante at gmail.com Fri Oct 6 02:26:11 2017
From: sean.violante at gmail.com (Sean Violante)
Date: Fri, 6 Oct 2017 08:26:11 +0200
Subject: [scikit-learn] Can fit a model with a target array of probabilities?
In-Reply-To: 
References: 
Message-ID: 

Stuart, I've only used the R implementation.

Glmnet does the warm starts; in fact, they recommend against trying a
single regularisation value. And it supports passing a 2d array of
positive and negative counts (or a multinomial generalisation).

My experience is that it is much more accurate than liblinear. For my
data, the liblinear tolerance param had to be on the order of 10^-13 to
compare.

On the other hand, iteration time was sometimes unpredictable. We had one
case where we added one variable (I believe continuous, not a
high-dimensional categorical) and the time went from minutes to hours.
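For a feel of what that tolerance point looks like on the scikit-learn side,
a rough, untested sketch (the dataset and the 1e-13 value are only
placeholders for illustration):

import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=100000, n_features=30, random_state=0)

# default stopping rule vs. a much tighter one, same liblinear solver
loose = LogisticRegression(solver='liblinear', C=1.0, tol=1e-4).fit(X, y)
tight = LogisticRegression(solver='liblinear', C=1.0, tol=1e-13,
                           max_iter=10000).fit(X, y)

# the gap between the two fits shows how far the default stopping rule
# is from the fully converged solution
print(np.abs(loose.coef_ - tight.coef_).max())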
Sean On 5 Oct 2017 9:28 pm, wrote: > > > On Thu, Oct 5, 2017 at 2:52 PM, Stuart Reynolds > wrote: > >> Turns out sm.Logit does allow setting the tolerance. >> Some and quick and dirty time profiling of different methods on a 100k >> * 30 features dataset, with different solvers and losses: >> >> sklearn.LogisticRegression: l1 1.13864398003 (seconds) >> sklearn.LogisticRegression: l2 0.0538778305054 >> sm.Logit l1 0.0922629833221 # Although didn't converge >> sm.Logit l1_cvxopt_cp 0.958268165588 >> sm.Logit newton 0.133476018906 >> sm.Logit nm 0.369864940643 >> sm.Logit bfgs 0.105798006058 >> sm.Logit lbfgs 0.06241106987 >> sm.Logit powell 1.64219808578 >> sm.Logit cg 0.2184278965 <(218)%20427-8965> >> sm.Logit ncg 0.216138124466 >> sm.Logit basinhopping 8.82164621353 >> sm.GLM.fit IRLS 0.544688940048 >> sm.GLM L2: 1.29778695107 >> >> I've been getting good results from sm.Logit.fit (although unregularized). >> statsmodels GLM seems a little slow. Not sure why. >> >> My benchmark may be a little apples-to-oranges, since the stopping >> criteria probably aren't comparable. >> > > I think that's a problem with GLM IRLS. > AFAIK, but never fully tested, is that the objective function is > proportional to the number of observations and the convergence > criterion becomes tighter as nobs increases. > I don't find the issue or PR discussion anymore, but one of our > contributors fixed maxiter at 15 or something like that for IRLS with > around 4 to 5 million observations and mostly categorical explanatory > variables in his application. > > unfortunately (no upfront design and decisions across models) > https://github.com/statsmodels/statsmodels/issues/2825 > > Josef > > > >> >> >> For tiny models, which I'm also building: 100 samples, 5 features >> >> sklearn.LogisticRegression: l1 0.00137376785278 >> sklearn.LogisticRegression: l2 0.00167894363403 >> sm.Logit l1 0.0198900699615 >> sm.Logit l1_cvxopt_cp 0.162448167801 >> sm.Logit newton 0.00689911842346 >> sm.Logit nm 0.0754928588867 >> sm.Logit bfgs 0.0210938453674 >> sm.Logit lbfgs 0.0156588554382 >> sm.Logit powell 0.0161390304565 >> sm.Logit cg 0.00759506225586 >> sm.Logit ncg 0.00541186332703 >> sm.Logit basinhopping 0.3076171875 <(307)%20617-1875> >> sm.GLM.fit IRLS 0.00902199745178 >> sm.GLM L2: 0.0208361148834 >> >> I couldn't get sm.GLM.fit to work with non "IRLS" solvers. (hits a >> division by zero). 
>> >> ---- >> >> import sklearn.datasets >> from sklearn.preprocessing import StandardScaler >> X, y = sklearn.datasets.make_classification(n_samples=10000, >> n_features=30, random_state=123) >> X = StandardScaler(copy=True, with_mean=True, >> with_std=True).fit_transform(X) >> >> import time >> tol = 0.0001 >> maxiter = 100 >> DISP = 0 >> >> >> if 1: # sk.LogisticRegression >> import sklearn >> from sklearn.linear_model import LogisticRegression >> >> for method in ["l1", "l2"]: # TODO, add solvers: >> t = time.time() >> model = LogisticRegression(C=1, tol=tol, max_iter=maxiter, >> penalty=method) >> model.fit(X,y) >> print "sklearn.LogisticRegression:", method, time.time() - t >> >> >> >> >> if 1: # sm.Logit.fit_regularized >> from statsmodels.discrete.discrete_model import Logit >> for method in ["l1", "l1_cvxopt_cp"]: >> t = time.time() >> model = Logit(y,X) >> result = model.fit_regularized(method=method, maxiter=maxiter, >> alpha=1., >> abstol=tol, >> acc=tol, >> tol=tol, gtol=tol, pgtol=tol, >> disp=DISP) >> print "sm.Logit", method, time.time() - t >> >> if 1: # sm.Logit.fit >> from statsmodels.discrete.discrete_model import Logit >> >> SOLVERS = ["newton", "nm", >> "bfgs","lbfgs","powell","cg","ncg","basinhopping",] >> for method in SOLVERS: >> t = time.time() >> model = Logit(y,X) >> result = model.fit(method=method, maxiter=maxiter, >> niter=maxiter, >> ftol=tol, >> tol=tol, gtol=tol, pgtol=tol, # Hmmm.. >> needs to be reviewed. >> disp=DISP) >> print "sm.Logit", method, time.time() - t >> >> if 1: # sm.GLM.fit >> from statsmodels.genmod.generalized_linear_model import GLM >> from statsmodels.genmod.generalized_linear_model import families >> for method in ["IRLS"]: >> t = time.time() >> model = GLM(y, X, family=families.Binomial(link= >> families.links.logit)) >> result = model.fit(method=method, cnvrg_tol=tol, >> maxiter=maxiter, full_output=False, disp=DISP) >> print "sm.GLM.fit", method, time.time() - t >> >> >> if 1: # GLM.fit_regularized >> from statsmodels.genmod.generalized_linear_model import GLM >> from statsmodels.genmod.generalized_linear_model import families >> t = time.time() >> model = GLM(y, X, family=families.Binomial(link= >> families.links.logit)) >> result = model.fit_regularized(method='elastic_net', alpha=1.0, >> L1_wt=0.0, cnvrg_tol=tol, maxiter=maxiter) >> print "sm.GLM L2:", time.time() - t >> >> >> >> if 0: # GLM.fit >> # Hits division by zero. >> SOLVERS = ["bfgs","lbfgs", "netwon", "nm", >> "powell","cg","ncg","basinhopping",] >> from statsmodels.genmod.generalized_linear_model import GLM >> from statsmodels.genmod.generalized_linear_model import families >> for method in SOLVERS: >> t = time.time() >> model = GLM(y, X, family=families.Binomial(link= >> families.links.logit)) >> result = model.fit(method=method, >> # scale="X2", >> # alpha=1., >> # abstol=tol, >> # acc=tol, >> # tol=tol, gtol=tol, pgtol=tol, >> # maxiter=maxiter, >> # #full_output=False, >> disp=DISP) >> print "sm.GLM.fit", method, time.time() - t >> >> >> On Thu, Oct 5, 2017 at 10:32 AM, Sean Violante >> wrote: >> > Stuart >> > have you tried glmnet ( in R) there is a python version >> > https://web.stanford.edu/~hastie/glmnet_python/ .... >> > >> > >> > >> > >> > On Thu, Oct 5, 2017 at 6:34 PM, Stuart Reynolds < >> stuart at stuartreynolds.net> >> > wrote: >> >> >> >> Thanks Josef. Was very useful. 
>> >> >> >> result.remove_data() reduces a 5 parameter Logit result object from >> >> megabytes to 5Kb (as compared to a minimum uncompressed size of the >> >> parameters of ~320 bytes). Is big improvement. I'll experiment with >> >> what you suggest -- since this is still >10x larger than possible. I >> >> think the difference is mostly attribute names. >> >> I don't mind the lack of a multinomial support. I've often had better >> >> results mixing independent models for each class. >> >> >> >> I'll experiment with the different solvers. I tried the Logit model >> >> in the past -- its fit function only exposed a maxiter, and not a >> >> tolerance -- meaning I had to set maxiter very high. The newer >> >> statsmodels GLM module looks great and seem to solve this. >> >> >> >> For other who come this way, I think the magic for ridge regression is: >> >> >> >> from statsmodels.genmod.generalized_linear_model import GLM >> >> from statsmodels.genmod.generalized_linear_model import >> families >> >> from statsmodels.genmod.generalized_linear_model.families >> import >> >> links >> >> >> >> model = GLM(y, Xtrain, family=families.Binomial(link= >> links.Logit)) >> >> result = model.fit_regularized(method='elastic_net', >> >> alpha=l2weight, L1_wt=0.0, tol=...) >> >> result.remove_data() >> >> result.predict(Xtest) >> >> >> >> One last thing -- its clear that it should be possible to do something >> >> like scikit's LogisticRegressionCV in order to quickly optimize a >> >> single parameter by re-using past coefficients. >> >> Are there any wrappers in statsmodels for doing this or should I roll >> my >> >> own? >> >> >> >> >> >> - Stu >> >> >> >> >> >> On Wed, Oct 4, 2017 at 3:43 PM, wrote: >> >> > >> >> > >> >> > On Wed, Oct 4, 2017 at 4:26 PM, Stuart Reynolds >> >> > >> >> > wrote: >> >> >> >> >> >> Hi Andy, >> >> >> Thanks -- I'll give another statsmodels another go. >> >> >> I remember I had some fitting speed issues with it in the past, and >> >> >> also some issues related their models keeping references to the data >> >> >> (=disaster for serialization and multiprocessing) -- although that >> was >> >> >> a long time ago. >> >> > >> >> > >> >> > The second has not changed and will not change, but there is a >> >> > remove_data >> >> > method that deletes all references to full, data sized arrays. >> However, >> >> > once >> >> > the data is removed, it is not possible anymore to compute any new >> >> > results >> >> > statistics which are almost all lazily computed. >> >> > The fitting speed depends a lot on the optimizer, convergence >> criteria >> >> > and >> >> > difficulty of the problem, and availability of good starting >> parameters. >> >> > Almost all nonlinear estimation problems use the scipy optimizers, >> all >> >> > unconstrained optimizers can be used. There are no optimized special >> >> > methods >> >> > for cases with a very large number of features. >> >> > >> >> > Multinomial/multiclass models don't support continuous response >> (yet), >> >> > all >> >> > other GLM and discrete models allow for continuous data in the >> interval >> >> > extension of the domain. >> >> > >> >> > Josef >> >> > >> >> > >> >> >> >> >> >> - Stuart >> >> >> >> >> >> On Wed, Oct 4, 2017 at 1:09 PM, Andreas Mueller >> >> >> wrote: >> >> >> > Hi Stuart. >> >> >> > There is no interface to do this in scikit-learn (and maybe we >> should >> >> >> > at >> >> >> > this to the FAQ). >> >> >> > Yes, in principle this would be possible with several of the >> models. 
>> >> >> > >> >> >> > I think statsmodels can do that, and I think I saw another glm >> >> >> > package >> >> >> > for Python that does that? >> >> >> > >> >> >> > It's certainly a legitimate use-case but would require substantial >> >> >> > changes to the code. I think so far we decided not to support >> >> >> > this in scikit-learn. Basically we don't have a concept of a link >> >> >> > function, and it's a concept that only applies to a subset of >> models. >> >> >> > We try to have a consistent interface for all our estimators, and >> >> >> > this doesn't really fit well within that interface. >> >> >> > >> >> >> > Hth, >> >> >> > Andy >> >> >> > >> >> >> > >> >> >> > On 10/04/2017 03:58 PM, Stuart Reynolds wrote: >> >> >> >> >> >> >> >> I'd like to fit a model that maps a matrix of continuous inputs >> to a >> >> >> >> target that's between 0 and 1 (a probability). >> >> >> >> >> >> >> >> In principle, I'd expect logistic regression should work out of >> the >> >> >> >> box with no modification (although its often posed as being >> strictly >> >> >> >> for classification, its loss function allows for fitting targets >> in >> >> >> >> the range 0 to 1, and not strictly zero or one.) >> >> >> >> >> >> >> >> However, scikit's LogisticRegression and LogisticRegressionCV >> reject >> >> >> >> target arrays that are continuous. Other LR implementations >> allow a >> >> >> >> matrix of probability estimates. Looking at: >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> http://scikit-learn-general.narkive.com/4dSCktaM/using-logis >> tic-regression-on-a-continuous-target-variable >> >> >> >> and the fix here: >> >> >> >> https://github.com/scikit-learn/scikit-learn/pull/5084, which >> >> >> >> disables >> >> >> >> continuous inputs, it looks like there was some reason for this. >> So >> >> >> >> ... I'm looking for alternatives. >> >> >> >> >> >> >> >> SGDClassifier allows log loss and (if I understood the docs >> >> >> >> correctly) >> >> >> >> adds a logistic link function, but also rejects continuous >> targets. >> >> >> >> Oddly, SGDRegressor only allows ?squared_loss?, ?huber?, >> >> >> >> ?epsilon_insensitive?, or ?squared_epsilon_insensitive?, and >> doesn't >> >> >> >> seems to give a logistic function. >> >> >> >> >> >> >> >> In principle, GLM allow this, but scikit's docs say the GLM >> models >> >> >> >> only allows strict linear functions of their input, and doesn't >> >> >> >> allow >> >> >> >> a logistic link function. The docs direct people to the >> >> >> >> LogisticRegression class for this case. >> >> >> >> >> >> >> >> In R, there is: >> >> >> >> >> >> >> >> glm(Total_Service_Points_Won/Total_Service_Points_Played ~ ... , >> >> >> >> family = binomial(link=logit), weights = >> >> >> >> Total_Service_Points_Played) >> >> >> >> which would be ideal. >> >> >> >> >> >> >> >> Is something similar available in scikit? (Or any continuous >> model >> >> >> >> that takes and 0 to 1 target and outputs a 0 to 1 target?) >> >> >> >> >> >> >> >> I was surprised to see that the implementation of >> >> >> >> CalibratedClassifierCV(method="sigmoid") uses an internal >> >> >> >> implementation of logistic regression to do its logistic >> regressing >> >> >> >> -- >> >> >> >> which I can use, although I'd prefer to use a user-facing >> library. 
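One user-facing workaround for fitting a logistic model to fractional 0-to-1 targets is the usual weighted 0/1 expansion; a minimal sketch (the helper name and the random data are made up for illustration):

import numpy as np
from sklearn.linear_model import LogisticRegression

def fit_fractional_logistic(X, p, C=1.0):
    # Each row appears twice: as class 1 with weight p and as class 0 with
    # weight 1 - p, which gives the same cross-entropy loss as fitting the
    # fractional targets directly.
    X2 = np.vstack([X, X])
    y2 = np.concatenate([np.ones(len(X)), np.zeros(len(X))])
    w2 = np.concatenate([p, 1.0 - p])
    return LogisticRegression(C=C, solver='lbfgs').fit(X2, y2, sample_weight=w2)

rng = np.random.RandomState(0)
X = rng.randn(200, 5)
p = rng.uniform(0, 1, size=200)
model = fit_fractional_logistic(X, p)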
>> >> >> >> >> >> >> >> Thanks, >> >> >> >> - Stuart >> >> >> >> _______________________________________________ >> >> >> >> scikit-learn mailing list >> >> >> >> scikit-learn at python.org >> >> >> >> https://mail.python.org/mailman/listinfo/scikit-learn >> >> >> > >> >> >> > >> >> >> > _______________________________________________ >> >> >> > scikit-learn mailing list >> >> >> > scikit-learn at python.org >> >> >> > https://mail.python.org/mailman/listinfo/scikit-learn >> >> >> _______________________________________________ >> >> >> scikit-learn mailing list >> >> >> scikit-learn at python.org >> >> >> https://mail.python.org/mailman/listinfo/scikit-learn >> >> > >> >> > >> >> > >> >> > _______________________________________________ >> >> > scikit-learn mailing list >> >> > scikit-learn at python.org >> >> > https://mail.python.org/mailman/listinfo/scikit-learn >> >> > >> >> _______________________________________________ >> >> scikit-learn mailing list >> >> scikit-learn at python.org >> >> https://mail.python.org/mailman/listinfo/scikit-learn >> > >> > >> > >> > _______________________________________________ >> > scikit-learn mailing list >> > scikit-learn at python.org >> > https://mail.python.org/mailman/listinfo/scikit-learn >> > >> _______________________________________________ >> scikit-learn mailing list >> scikit-learn at python.org >> https://mail.python.org/mailman/listinfo/scikit-learn >> > > > _______________________________________________ > scikit-learn mailing list > scikit-learn at python.org > https://mail.python.org/mailman/listinfo/scikit-learn > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Fri Oct 6 08:25:25 2017 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Fri, 6 Oct 2017 08:25:25 -0400 Subject: [scikit-learn] Can fit a model with a target array of probabilities? In-Reply-To: References: Message-ID: On Thu, Oct 5, 2017 at 3:27 PM, wrote: > > > On Thu, Oct 5, 2017 at 2:52 PM, Stuart Reynolds > wrote: > >> Turns out sm.Logit does allow setting the tolerance. >> Some and quick and dirty time profiling of different methods on a 100k >> * 30 features dataset, with different solvers and losses: >> >> sklearn.LogisticRegression: l1 1.13864398003 (seconds) >> sklearn.LogisticRegression: l2 0.0538778305054 >> sm.Logit l1 0.0922629833221 # Although didn't converge >> sm.Logit l1_cvxopt_cp 0.958268165588 >> sm.Logit newton 0.133476018906 >> sm.Logit nm 0.369864940643 >> sm.Logit bfgs 0.105798006058 >> sm.Logit lbfgs 0.06241106987 >> sm.Logit powell 1.64219808578 >> sm.Logit cg 0.2184278965 <(218)%20427-8965> >> sm.Logit ncg 0.216138124466 >> sm.Logit basinhopping 8.82164621353 >> sm.GLM.fit IRLS 0.544688940048 >> sm.GLM L2: 1.29778695107 >> >> I've been getting good results from sm.Logit.fit (although unregularized). >> statsmodels GLM seems a little slow. Not sure why. >> >> My benchmark may be a little apples-to-oranges, since the stopping >> criteria probably aren't comparable. >> > > I think that's a problem with GLM IRLS. > AFAIK, but never fully tested, is that the objective function is > proportional to the number of observations and the convergence > criterion becomes tighter as nobs increases. > I don't find the issue or PR discussion anymore, but one of our > contributors fixed maxiter at 15 or something like that for IRLS with > around 4 to 5 million observations and mostly categorical explanatory > variables in his application. 
> > unfortunately (no upfront design and decisions across models) > https://github.com/statsmodels/statsmodels/issues/2825 > Interesting timing excercise, I tried a bit more yesterday. GLM IRLS is not slow because of the convergence criterion, but it seems like it takes much longer when the design matrix is not well conditioned. The random dataset generated by sklearn has singular values in the range of 1e-14 or 1e-15 This doesn't affect the other estimators much and lbfgs is almost always the fastest with bfgs close behind. When I add some noise to the feature matrix so it's not so close to singular, then IRLS is roughly in the same neighborhood as the faster scipy optimizers With n_samples=1000000, n_features=50, Logit is around 5 or 6 seconds (for lbfgs, bfgs and newton) slightly faster than sklearnLogistic regression regularized, but GLM is about 4 times slower with 17 to 20 seconds GLM L2 is much slower in this case because of the current non-optimized implementation of coordinate descend. aside: In master and next release of statsmodels there is a interface to scipy.minimize, which allows that all new optimizers can be used, e.g. dogleg and other new trust region newton methods will be better optimizers for many cases. Josef > > Josef > > > >> >> >> For tiny models, which I'm also building: 100 samples, 5 features >> >> sklearn.LogisticRegression: l1 0.00137376785278 >> sklearn.LogisticRegression: l2 0.00167894363403 >> sm.Logit l1 0.0198900699615 >> sm.Logit l1_cvxopt_cp 0.162448167801 >> sm.Logit newton 0.00689911842346 >> sm.Logit nm 0.0754928588867 >> sm.Logit bfgs 0.0210938453674 >> sm.Logit lbfgs 0.0156588554382 >> sm.Logit powell 0.0161390304565 >> sm.Logit cg 0.00759506225586 >> sm.Logit ncg 0.00541186332703 >> sm.Logit basinhopping 0.3076171875 <(307)%20617-1875> >> sm.GLM.fit IRLS 0.00902199745178 >> sm.GLM L2: 0.0208361148834 >> >> I couldn't get sm.GLM.fit to work with non "IRLS" solvers. (hits a >> division by zero). > > >> ---- >> >> import sklearn.datasets >> from sklearn.preprocessing import StandardScaler >> X, y = sklearn.datasets.make_classification(n_samples=10000, >> n_features=30, random_state=123) >> X = StandardScaler(copy=True, with_mean=True, >> with_std=True).fit_transform(X) >> >> import time >> tol = 0.0001 >> maxiter = 100 >> DISP = 0 >> >> >> if 1: # sk.LogisticRegression >> import sklearn >> from sklearn.linear_model import LogisticRegression >> >> for method in ["l1", "l2"]: # TODO, add solvers: >> t = time.time() >> model = LogisticRegression(C=1, tol=tol, max_iter=maxiter, >> penalty=method) >> model.fit(X,y) >> print "sklearn.LogisticRegression:", method, time.time() - t >> >> >> >> >> if 1: # sm.Logit.fit_regularized >> from statsmodels.discrete.discrete_model import Logit >> for method in ["l1", "l1_cvxopt_cp"]: >> t = time.time() >> model = Logit(y,X) >> result = model.fit_regularized(method=method, maxiter=maxiter, >> alpha=1., >> abstol=tol, >> acc=tol, >> tol=tol, gtol=tol, pgtol=tol, >> disp=DISP) >> print "sm.Logit", method, time.time() - t >> >> if 1: # sm.Logit.fit >> from statsmodels.discrete.discrete_model import Logit >> >> SOLVERS = ["newton", "nm", >> "bfgs","lbfgs","powell","cg","ncg","basinhopping",] >> for method in SOLVERS: >> t = time.time() >> model = Logit(y,X) >> result = model.fit(method=method, maxiter=maxiter, >> niter=maxiter, >> ftol=tol, >> tol=tol, gtol=tol, pgtol=tol, # Hmmm.. >> needs to be reviewed. 
>> disp=DISP) >> print "sm.Logit", method, time.time() - t >> >> if 1: # sm.GLM.fit >> from statsmodels.genmod.generalized_linear_model import GLM >> from statsmodels.genmod.generalized_linear_model import families >> for method in ["IRLS"]: >> t = time.time() >> model = GLM(y, X, family=families.Binomial(link= >> families.links.logit)) >> result = model.fit(method=method, cnvrg_tol=tol, >> maxiter=maxiter, full_output=False, disp=DISP) >> print "sm.GLM.fit", method, time.time() - t >> >> >> if 1: # GLM.fit_regularized >> from statsmodels.genmod.generalized_linear_model import GLM >> from statsmodels.genmod.generalized_linear_model import families >> t = time.time() >> model = GLM(y, X, family=families.Binomial(link= >> families.links.logit)) >> result = model.fit_regularized(method='elastic_net', alpha=1.0, >> L1_wt=0.0, cnvrg_tol=tol, maxiter=maxiter) >> print "sm.GLM L2:", time.time() - t >> >> >> >> if 0: # GLM.fit >> # Hits division by zero. >> SOLVERS = ["bfgs","lbfgs", "netwon", "nm", >> "powell","cg","ncg","basinhopping",] >> from statsmodels.genmod.generalized_linear_model import GLM >> from statsmodels.genmod.generalized_linear_model import families >> for method in SOLVERS: >> t = time.time() >> model = GLM(y, X, family=families.Binomial(link= >> families.links.logit)) >> result = model.fit(method=method, >> # scale="X2", >> # alpha=1., >> # abstol=tol, >> # acc=tol, >> # tol=tol, gtol=tol, pgtol=tol, >> # maxiter=maxiter, >> # #full_output=False, >> disp=DISP) >> print "sm.GLM.fit", method, time.time() - t >> >> >> On Thu, Oct 5, 2017 at 10:32 AM, Sean Violante >> wrote: >> > Stuart >> > have you tried glmnet ( in R) there is a python version >> > https://web.stanford.edu/~hastie/glmnet_python/ .... >> > >> > >> > >> > >> > On Thu, Oct 5, 2017 at 6:34 PM, Stuart Reynolds < >> stuart at stuartreynolds.net> >> > wrote: >> >> >> >> Thanks Josef. Was very useful. >> >> >> >> result.remove_data() reduces a 5 parameter Logit result object from >> >> megabytes to 5Kb (as compared to a minimum uncompressed size of the >> >> parameters of ~320 bytes). Is big improvement. I'll experiment with >> >> what you suggest -- since this is still >10x larger than possible. I >> >> think the difference is mostly attribute names. >> >> I don't mind the lack of a multinomial support. I've often had better >> >> results mixing independent models for each class. >> >> >> >> I'll experiment with the different solvers. I tried the Logit model >> >> in the past -- its fit function only exposed a maxiter, and not a >> >> tolerance -- meaning I had to set maxiter very high. The newer >> >> statsmodels GLM module looks great and seem to solve this. >> >> >> >> For other who come this way, I think the magic for ridge regression is: >> >> >> >> from statsmodels.genmod.generalized_linear_model import GLM >> >> from statsmodels.genmod.generalized_linear_model import >> families >> >> from statsmodels.genmod.generalized_linear_model.families >> import >> >> links >> >> >> >> model = GLM(y, Xtrain, family=families.Binomial(link= >> links.Logit)) >> >> result = model.fit_regularized(method='elastic_net', >> >> alpha=l2weight, L1_wt=0.0, tol=...) >> >> result.remove_data() >> >> result.predict(Xtest) >> >> >> >> One last thing -- its clear that it should be possible to do something >> >> like scikit's LogisticRegressionCV in order to quickly optimize a >> >> single parameter by re-using past coefficients. >> >> Are there any wrappers in statsmodels for doing this or should I roll >> my >> >> own? 
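A hand-rolled search over a single penalty parameter can reuse the previous solution as the starting point; a rough sketch, assuming the installed statsmodels' fit_regularized accepts start_params (worth checking its docstring), with random placeholder data:

import numpy as np
import statsmodels.api as sm

rng = np.random.RandomState(0)
X = sm.add_constant(rng.randn(500, 10))
y = (rng.randn(500) + X[:, 1] > 0).astype(float)

fits = {}
start = None
for a in np.logspace(1, -3, 9):  # strong to weak regularisation
    model = sm.GLM(y, X, family=sm.families.Binomial())
    res = model.fit_regularized(method='elastic_net', alpha=a, L1_wt=0.0,
                                start_params=start)
    start = res.params  # warm start for the next alpha
    fits[a] = res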
>> >> >> >> >> >> - Stu >> >> >> >> >> >> On Wed, Oct 4, 2017 at 3:43 PM, wrote: >> >> > >> >> > >> >> > On Wed, Oct 4, 2017 at 4:26 PM, Stuart Reynolds >> >> > >> >> > wrote: >> >> >> >> >> >> Hi Andy, >> >> >> Thanks -- I'll give another statsmodels another go. >> >> >> I remember I had some fitting speed issues with it in the past, and >> >> >> also some issues related their models keeping references to the data >> >> >> (=disaster for serialization and multiprocessing) -- although that >> was >> >> >> a long time ago. >> >> > >> >> > >> >> > The second has not changed and will not change, but there is a >> >> > remove_data >> >> > method that deletes all references to full, data sized arrays. >> However, >> >> > once >> >> > the data is removed, it is not possible anymore to compute any new >> >> > results >> >> > statistics which are almost all lazily computed. >> >> > The fitting speed depends a lot on the optimizer, convergence >> criteria >> >> > and >> >> > difficulty of the problem, and availability of good starting >> parameters. >> >> > Almost all nonlinear estimation problems use the scipy optimizers, >> all >> >> > unconstrained optimizers can be used. There are no optimized special >> >> > methods >> >> > for cases with a very large number of features. >> >> > >> >> > Multinomial/multiclass models don't support continuous response >> (yet), >> >> > all >> >> > other GLM and discrete models allow for continuous data in the >> interval >> >> > extension of the domain. >> >> > >> >> > Josef >> >> > >> >> > >> >> >> >> >> >> - Stuart >> >> >> >> >> >> On Wed, Oct 4, 2017 at 1:09 PM, Andreas Mueller >> >> >> wrote: >> >> >> > Hi Stuart. >> >> >> > There is no interface to do this in scikit-learn (and maybe we >> should >> >> >> > at >> >> >> > this to the FAQ). >> >> >> > Yes, in principle this would be possible with several of the >> models. >> >> >> > >> >> >> > I think statsmodels can do that, and I think I saw another glm >> >> >> > package >> >> >> > for Python that does that? >> >> >> > >> >> >> > It's certainly a legitimate use-case but would require substantial >> >> >> > changes to the code. I think so far we decided not to support >> >> >> > this in scikit-learn. Basically we don't have a concept of a link >> >> >> > function, and it's a concept that only applies to a subset of >> models. >> >> >> > We try to have a consistent interface for all our estimators, and >> >> >> > this doesn't really fit well within that interface. >> >> >> > >> >> >> > Hth, >> >> >> > Andy >> >> >> > >> >> >> > >> >> >> > On 10/04/2017 03:58 PM, Stuart Reynolds wrote: >> >> >> >> >> >> >> >> I'd like to fit a model that maps a matrix of continuous inputs >> to a >> >> >> >> target that's between 0 and 1 (a probability). >> >> >> >> >> >> >> >> In principle, I'd expect logistic regression should work out of >> the >> >> >> >> box with no modification (although its often posed as being >> strictly >> >> >> >> for classification, its loss function allows for fitting targets >> in >> >> >> >> the range 0 to 1, and not strictly zero or one.) >> >> >> >> >> >> >> >> However, scikit's LogisticRegression and LogisticRegressionCV >> reject >> >> >> >> target arrays that are continuous. Other LR implementations >> allow a >> >> >> >> matrix of probability estimates. 
Looking at: >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> http://scikit-learn-general.narkive.com/4dSCktaM/using-logis >> tic-regression-on-a-continuous-target-variable >> >> >> >> and the fix here: >> >> >> >> https://github.com/scikit-learn/scikit-learn/pull/5084, which >> >> >> >> disables >> >> >> >> continuous inputs, it looks like there was some reason for this. >> So >> >> >> >> ... I'm looking for alternatives. >> >> >> >> >> >> >> >> SGDClassifier allows log loss and (if I understood the docs >> >> >> >> correctly) >> >> >> >> adds a logistic link function, but also rejects continuous >> targets. >> >> >> >> Oddly, SGDRegressor only allows ?squared_loss?, ?huber?, >> >> >> >> ?epsilon_insensitive?, or ?squared_epsilon_insensitive?, and >> doesn't >> >> >> >> seems to give a logistic function. >> >> >> >> >> >> >> >> In principle, GLM allow this, but scikit's docs say the GLM >> models >> >> >> >> only allows strict linear functions of their input, and doesn't >> >> >> >> allow >> >> >> >> a logistic link function. The docs direct people to the >> >> >> >> LogisticRegression class for this case. >> >> >> >> >> >> >> >> In R, there is: >> >> >> >> >> >> >> >> glm(Total_Service_Points_Won/Total_Service_Points_Played ~ ... , >> >> >> >> family = binomial(link=logit), weights = >> >> >> >> Total_Service_Points_Played) >> >> >> >> which would be ideal. >> >> >> >> >> >> >> >> Is something similar available in scikit? (Or any continuous >> model >> >> >> >> that takes and 0 to 1 target and outputs a 0 to 1 target?) >> >> >> >> >> >> >> >> I was surprised to see that the implementation of >> >> >> >> CalibratedClassifierCV(method="sigmoid") uses an internal >> >> >> >> implementation of logistic regression to do its logistic >> regressing >> >> >> >> -- >> >> >> >> which I can use, although I'd prefer to use a user-facing >> library. >> >> >> >> >> >> >> >> Thanks, >> >> >> >> - Stuart >> >> >> >> _______________________________________________ >> >> >> >> scikit-learn mailing list >> >> >> >> scikit-learn at python.org >> >> >> >> https://mail.python.org/mailman/listinfo/scikit-learn >> >> >> > >> >> >> > >> >> >> > _______________________________________________ >> >> >> > scikit-learn mailing list >> >> >> > scikit-learn at python.org >> >> >> > https://mail.python.org/mailman/listinfo/scikit-learn >> >> >> _______________________________________________ >> >> >> scikit-learn mailing list >> >> >> scikit-learn at python.org >> >> >> https://mail.python.org/mailman/listinfo/scikit-learn >> >> > >> >> > >> >> > >> >> > _______________________________________________ >> >> > scikit-learn mailing list >> >> > scikit-learn at python.org >> >> > https://mail.python.org/mailman/listinfo/scikit-learn >> >> > >> >> _______________________________________________ >> >> scikit-learn mailing list >> >> scikit-learn at python.org >> >> https://mail.python.org/mailman/listinfo/scikit-learn >> > >> > >> > >> > _______________________________________________ >> > scikit-learn mailing list >> > scikit-learn at python.org >> > https://mail.python.org/mailman/listinfo/scikit-learn >> > >> _______________________________________________ >> scikit-learn mailing list >> scikit-learn at python.org >> https://mail.python.org/mailman/listinfo/scikit-learn >> > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From gael.varoquaux at normalesup.org Fri Oct 6 08:04:37 2017 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Fri, 6 Oct 2017 14:04:37 +0200 Subject: [scikit-learn] Remembering Raghav, our friend, and a scikit-learn contributor Message-ID: <20171006120437.GA1236395@phare.normalesup.org> Raghav was a core contributor to scikit-learn. Venkat Raghav Rajagopalan, or @raghavrv -as we knew him- appeared out of the blue and started contributing early 2015. From Chennai, he was helping us make scikit-learn a better library. As often in open source, he was working with people that he had never met in person, to improve a tool used by the whole world. He successfully completed a Google summer of code for the project that year, and then was hired as a full time engineer to work on the project in Paris. Raghav was excited to join the scikit-learn team. When he became core contributor, in 2016 he said that it was a highlight of his year. In Paris, we got to know him and enjoy him. Raghav was a very enthusiastic and easygoing person. It was a delight to have him around. For scikit-learn, he was a huge driver. He tackled a large number of issues, include tedious and difficult ones such as revamping our cross-validation API, multiple-metrics support in grid-search, or 32bit support in various models. Raghav had left India to live an adventure in a new culture. Curiosity and goal-driven, he had found his own way. He was growing fast, moving from student to expert, on his way to a bright future. Raghav passed away a month ago. We have been in shock and sorrow here in Paris. He will be deeply missed. Gael Varoquaux and Alexandre Gramfort From jbbrown at kuhp.kyoto-u.ac.jp Fri Oct 6 10:22:38 2017 From: jbbrown at kuhp.kyoto-u.ac.jp (Brown J.B.) Date: Fri, 6 Oct 2017 23:22:38 +0900 Subject: [scikit-learn] Remembering Raghav, our friend, and a scikit-learn contributor In-Reply-To: <20171006120437.GA1236395@phare.normalesup.org> References: <20171006120437.GA1236395@phare.normalesup.org> Message-ID: This is truly, truly sad news. Leaving the home country you grew up in to find your way in a new language and culture takes considerable effort, and to thrive at it takes even more effort. He was to be commended for that. I think many of us knew of his enthusiasm for the project and benefited greatly from it. May his family and friends know of his contribution, and may he rest peacefully. J.B. Brown 2017-10-06 21:04 GMT+09:00 Gael Varoquaux : > Raghav was a core contributor to scikit-learn. Venkat Raghav Rajagopalan, > or @raghavrv -as we knew him- appeared out of the blue and started > contributing early 2015. From Chennai, he was helping us make scikit-learn > a better library. As often in open source, he was working with people that > he had never met in person, to improve a tool used by the whole world. He > successfully completed a Google summer of code for the project that year, > and then was hired as a full time engineer to work on the project in Paris. > Raghav was excited to join the scikit-learn team. When he became core > contributor, in 2016 he said that it was a highlight of his year. > > In Paris, we got to know him and enjoy him. Raghav was a very enthusiastic > and easygoing person. It was a delight to have him around. For > scikit-learn, he was a huge driver. He tackled a large number of issues, > include tedious and difficult ones such as revamping our cross-validation > API, multiple-metrics support in grid-search, or 32bit support in various > models. 
> > Raghav had left India to live an adventure in a new culture. Curiosity and > goal-driven, he had found his own way. He was growing fast, moving from > student to expert, on his way to a bright future. > > Raghav passed away a month ago. We have been in shock and sorrow here in > Paris. He will be deeply missed. > > Gael Varoquaux and Alexandre Gramfort > _______________________________________________ > scikit-learn mailing list > scikit-learn at python.org > https://mail.python.org/mailman/listinfo/scikit-learn > -------------- next part -------------- An HTML attachment was scrubbed... URL: From chyikwei.yau at gmail.com Fri Oct 6 12:38:36 2017 From: chyikwei.yau at gmail.com (chyi-kwei yau) Date: Fri, 06 Oct 2017 16:38:36 +0000 Subject: [scikit-learn] Using perplexity from LatentDirichletAllocation for cross validation of Topic Models In-Reply-To: <56caf0d4-11eb-bfb5-a01d-af332fb5969a@wzb.eu> References: <56caf0d4-11eb-bfb5-a01d-af332fb5969a@wzb.eu> Message-ID: Hi Markus, I find that in current LDA implementation we included "E[log p(beta | eta) - log q (beta | lambda)]" in the approx bound function and use it to calculate perplexity. But this part was not included in the likelihood function in Blei's C implementation. Maybe this caused some difference. (I am not sure which one is correct. will need some time to compare the difference.) Best, Chyi-Kwei reference code: sklearn https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/decomposition/online_lda.py#L707-L709 original onlineldavb https://github.com/blei-lab/onlineldavb/blob/master/onlineldavb.py#L384-L388 Blei's C implementation https://github.com/blei-lab/lda-c/blob/master/lda-inference.c#L94-L127 On Wed, Oct 4, 2017 at 7:56 AM Markus Konrad wrote: > Hi there, > > I'm trying to find the optimal number of topics for Topic Modeling with > Latent Dirichlet Allocation. I implemented a 5-fold cross validation > method similar to the one described and implemented in R here [1]. I > basically split the full data into 5 equal sized chunks. Then for each > fold (`cur_fold`), 4 of 5 chunks are used for training and 1 for > validation using the `perplexity()` method on the held-out data set: > > ``` > dtm_train = data[split_folds != cur_fold, :] > dtm_valid = data[split_folds == cur_fold, :] > > lda_instance = LatentDirichletAllocation(**params) > lda_instance.fit(dtm_train) > > perpl = lda_instance.perplexity(dtm_valid) > ``` > > This is done for a set of parameters, basically for a varying number of > topics (n_components). > > I tried this out with a number of different data sets, for example with > the "Associated Press" data mentioned in [1], which is the sample data > for David M. Blei's LDA C implementation [2]. > Using the same data, I would expect that I get similar results as in > [1], which found that a model with ~100 topics fits the AP data best. > However, my experiments always show that the perplexity is exponentially > growing with the number of topics. The "best" model is always the one > with the lowest number of topics. The same happens with other data sets, > too. Similar results happen when calculating the perplexity on the full > training data alone (so no cross validation on held-out data). > > Does anyone have an idea why these results are not consistent with those > from [1]? Is the perplexity() method not the correct method to use when > evaluating held-out data? Could it be a problem, that some of the > columns of the training data term frequency matrix are all-zero? 
> > Best, > Markus > > > [1] http://ellisp.github.io/blog/2017/01/05/topic-model-cv > [2] > > https://web.archive.org/web/20160930175144/http://www.cs.princeton.edu/~blei/lda-c/index.html > > _______________________________________________ > scikit-learn mailing list > scikit-learn at python.org > https://mail.python.org/mailman/listinfo/scikit-learn > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ashimb9 at gmail.com Fri Oct 6 18:40:13 2017 From: ashimb9 at gmail.com (Ashim Bhattarai) Date: Fri, 6 Oct 2017 17:40:13 -0500 Subject: [scikit-learn] Remembering Raghav, our friend, and a scikit-learn contributor In-Reply-To: References: <20171006120437.GA1236395@phare.normalesup.org> Message-ID: This is really sad indeed. I did not know Raghav personally, but I got to know the engineer in him through his work with the Scikit Learn project. Almost every time I looked up an issue or tried to find a PR for a lacking feature, there he was, either working on it himself or starting a discussion and motivating others to work on it. His enthusiasm and dedication to improving Scikit Learn was, and will remain, an inspiration for all contributors.The community has no doubt lost a talented engineer and I cannot even begin to imagine the loss for those he touched personally. My sincere condolences to his family and friends. On Fri, Oct 6, 2017 at 9:22 AM, Brown J.B. via scikit-learn < scikit-learn at python.org> wrote: > This is truly, truly sad news. > Leaving the home country you grew up in to find your way in a new language > and culture takes considerable effort, and to thrive at it takes even more > effort. > He was to be commended for that. > > I think many of us knew of his enthusiasm for the project and benefited > greatly from it. > May his family and friends know of his contribution, and may he rest > peacefully. > > J.B. Brown > > 2017-10-06 21:04 GMT+09:00 Gael Varoquaux : > >> Raghav was a core contributor to scikit-learn. Venkat Raghav Rajagopalan, >> or @raghavrv -as we knew him- appeared out of the blue and started >> contributing early 2015. From Chennai, he was helping us make scikit-learn >> a better library. As often in open source, he was working with people that >> he had never met in person, to improve a tool used by the whole world. He >> successfully completed a Google summer of code for the project that year, >> and then was hired as a full time engineer to work on the project in Paris. >> Raghav was excited to join the scikit-learn team. When he became core >> contributor, in 2016 he said that it was a highlight of his year. >> >> In Paris, we got to know him and enjoy him. Raghav was a very >> enthusiastic and easygoing person. It was a delight to have him around. For >> scikit-learn, he was a huge driver. He tackled a large number of issues, >> include tedious and difficult ones such as revamping our cross-validation >> API, multiple-metrics support in grid-search, or 32bit support in various >> models. >> >> Raghav had left India to live an adventure in a new culture. Curiosity >> and goal-driven, he had found his own way. He was growing fast, moving from >> student to expert, on his way to a bright future. >> >> Raghav passed away a month ago. We have been in shock and sorrow here in >> Paris. He will be deeply missed. 
>> >> Gael Varoquaux and Alexandre Gramfort >> _______________________________________________ >> scikit-learn mailing list >> scikit-learn at python.org >> https://mail.python.org/mailman/listinfo/scikit-learn >> > > > _______________________________________________ > scikit-learn mailing list > scikit-learn at python.org > https://mail.python.org/mailman/listinfo/scikit-learn > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From t3kcit at gmail.com Sat Oct 7 10:53:58 2017 From: t3kcit at gmail.com (Andreas Mueller) Date: Sat, 7 Oct 2017 10:53:58 -0400 Subject: [scikit-learn] Combine already fitted models In-Reply-To: <6F759273-4160-4073-8C3C-C1509D53BF23@gmail.com> References: <1B8AF2A6-DC73-42BE-8C59-5335EA330135@gmail.com> <6F759273-4160-4073-8C3C-C1509D53BF23@gmail.com> Message-ID: <0372c1f5-6473-f81c-ed87-b255d94fe665@gmail.com> For some reason I thought we had a "prefit" parameter. I think we should. On 10/01/2017 07:39 PM, Sebastian Raschka wrote: > Hi, Rares, > >> vc = VotingClassifier(...) >> vc.estimators_ = [e1, e2, ...] >> vc.le_ = ... >> vc.predict(...) >> >> But I am not sure it is recommended to modify the "private" estimators_ and le_ attributes. > > I think that this may work if you don't call the fit method of the VotingClassifier after that due to > https://github.com/scikit-learn/scikit-learn/blob/ef5cb84a/sklearn/ensemble/voting_classifier.py#L186 > > Also, I see that we have only added one check in predict(), "check_is_fitted(self, 'estimators_')", for checking that the VotingClassifier was fit, so your proposed method could/should work as a workaround ;) > > Best, > Sebastian > >> On Oct 1, 2017, at 7:22 PM, Rares Vernica wrote: >> >>>> I am looking at VotingClassifier but it seems that it is expected that the estimators are fitted when VotingClassifier.fit() is called. I don't see how I can have already fitted classifiers combined under a VotingClassifier. >>> I think the opposite is true: The classifiers provided via an `estimators` argument upon initialization will be cloned and fitted if you call VotingClassifier's fit(). Based on your follow-up question, I think you meant "it is expected that the estimators are *not* fitted when VotingClassifier.fit() is called," right?! >> Yes, you are right. Sorry for the confusion. Thanks for the pointer! >> >> I am also exploring something like: >> >> vc = VotingClassifier(...) >> vc.estimators_ = [e1, e2, ...] >> vc.le_ = ... >> vc.predict(...) >> >> But I am not sure it is recommended to modify the "private" estimators_ and le_ attributes. >> >> -- >> Rares >> >> >> _______________________________________________ >> scikit-learn mailing list >> scikit-learn at python.org >> https://mail.python.org/mailman/listinfo/scikit-learn > _______________________________________________ > scikit-learn mailing list > scikit-learn at python.org > https://mail.python.org/mailman/listinfo/scikit-learn From se.raschka at gmail.com Sat Oct 7 12:55:50 2017 From: se.raschka at gmail.com (Sebastian Raschka) Date: Sat, 7 Oct 2017 12:55:50 -0400 Subject: [scikit-learn] Combine already fitted models In-Reply-To: <0372c1f5-6473-f81c-ed87-b255d94fe665@gmail.com> References: <1B8AF2A6-DC73-42BE-8C59-5335EA330135@gmail.com> <6F759273-4160-4073-8C3C-C1509D53BF23@gmail.com> <0372c1f5-6473-f81c-ed87-b255d94fe665@gmail.com> Message-ID: <4EDA0B3B-0D21-4DF3-A0B5-7D99A77FFEA4@gmail.com> I agree. I had added sth like that to the original version in mlxtend (not sure if it was before or after we ported it to sklearn). 
In any case though, I'd be happy to open a PR about that later today :) Best, Sebastian > On Oct 7, 2017, at 10:53 AM, Andreas Mueller wrote: > > For some reason I thought we had a "prefit" parameter. > > I think we should. > > >> On 10/01/2017 07:39 PM, Sebastian Raschka wrote: >> Hi, Rares, >> >>> vc = VotingClassifier(...) >>> vc.estimators_ = [e1, e2, ...] >>> vc.le_ = ... >>> vc.predict(...) >>> >>> But I am not sure it is recommended to modify the "private" estimators_ and le_ attributes. >> >> I think that this may work if you don't call the fit method of the VotingClassifier after that due to >> https://github.com/scikit-learn/scikit-learn/blob/ef5cb84a/sklearn/ensemble/voting_classifier.py#L186 >> >> Also, I see that we have only added one check in predict(), "check_is_fitted(self, 'estimators_')", for checking that the VotingClassifier was fit, so your proposed method could/should work as a workaround ;) >> >> Best, >> Sebastian >> >>> On Oct 1, 2017, at 7:22 PM, Rares Vernica wrote: >>> >>>>> I am looking at VotingClassifier but it seems that it is expected that the estimators are fitted when VotingClassifier.fit() is called. I don't see how I can have already fitted classifiers combined under a VotingClassifier. >>>> I think the opposite is true: The classifiers provided via an `estimators` argument upon initialization will be cloned and fitted if you call VotingClassifier's fit(). Based on your follow-up question, I think you meant "it is expected that the estimators are *not* fitted when VotingClassifier.fit() is called," right?! >>> Yes, you are right. Sorry for the confusion. Thanks for the pointer! >>> >>> I am also exploring something like: >>> >>> vc = VotingClassifier(...) >>> vc.estimators_ = [e1, e2, ...] >>> vc.le_ = ... >>> vc.predict(...) >>> >>> But I am not sure it is recommended to modify the "private" estimators_ and le_ attributes. >>> >>> -- >>> Rares >>> >>> >>> _______________________________________________ >>> scikit-learn mailing list >>> scikit-learn at python.org >>> https://mail.python.org/mailman/listinfo/scikit-learn >> _______________________________________________ >> scikit-learn mailing list >> scikit-learn at python.org >> https://mail.python.org/mailman/listinfo/scikit-learn > > _______________________________________________ > scikit-learn mailing list > scikit-learn at python.org > https://mail.python.org/mailman/listinfo/scikit-learn From joel.nothman at gmail.com Sat Oct 7 18:57:32 2017 From: joel.nothman at gmail.com (Joel Nothman) Date: Sun, 8 Oct 2017 09:57:32 +1100 Subject: [scikit-learn] question for using GridSearchCV on LocalOutlierFactor In-Reply-To: <779b1d39-5767-d2df-1d7c-64e4e7b1a2f4@udel.edu> References: <779b1d39-5767-d2df-1d7c-64e4e7b1a2f4@udel.edu> Message-ID: I don't think LOF is designed to apply to unseen data. -------------- next part -------------- An HTML attachment was scrubbed...
URL: From joel.nothman at gmail.com Sun Oct 8 00:16:26 2017 From: joel.nothman at gmail.com (Joel Nothman) Date: Sun, 8 Oct 2017 15:16:26 +1100 Subject: [scikit-learn] question for using GridSearchCV on LocalOutlierFactor In-Reply-To: References: <779b1d39-5767-d2df-1d7c-64e4e7b1a2f4@udel.edu> Message-ID: actually I'm probably wrong there, but you may not be able to use accuracy -------------- next part -------------- An HTML attachment was scrubbed... URL: From chrispfeifer8557 at gmail.com Sun Oct 8 01:08:12 2017 From: chrispfeifer8557 at gmail.com (Christopher Pfeifer) Date: Sun, 8 Oct 2017 00:08:12 -0500 Subject: [scikit-learn] Validating L2 - Least Squares - sum of squares, During a Normalization Function Message-ID: I am attempting to validate the output of an L2 normalization function: *data_l2 = preprocessing.normalize(data, norm='l2') * # raw data is below at end of this email output: array([[ 0.57649683, 0.53806371, 0.61492995], [-0.53806371, -0.57649683, -0.61492995], [ 0.3359268 , 0.90089461, -0.2748492 ], [ 0.6676851 , -0.39566524, -0.63059148], [-0.70710678, 0. , 0.70710678], [-0.63116874, 0.45083482, 0.63116874]]) Each row being a set of three features of an observation I am under the belief that the sum of the 'squared' values of an instance (row) should be virtually equal to 1 (normalized). *Problem - 1:* the np.square() function is returning the absolute value of the sum of the three features, even when the sum of the squares is clearly negative. np.square(-0.53806371) returns 0.28951255601896408 however, (-0.53806371**2) returns -0.2895125560189641 The correct square of -0.53806371 is -0.2895125560189641 (a negative number), even my 10 year old calculator gets it right. I can find nothing in the numpy documentation that indicates np.square() always returns the absolute value, instead of the correctly signed value. *Question:* Is there a way to force np.square() to return the correctly signed square value not the absolute value? *Problem - 2:* For some of the observations (rows), the sum of the squared values (which should be virtually 1), are nowhere near 1. print 0.57649683**2 + 0.53806371**2 + 0.61492995**2 row 1 0.9999999944260154 (this is virtually 1) print -0.63116874**2 + 0.45083482**2 + 0.63116874**2 row 6 0.203252034924 (*this is nowhere near 1*) sum of the 'squared' values of an instance (row) should be virtually equal to 1. *Question:* Is the preprocessing.normalize(data, norm='l2') messing up, or is my raw data being fed into the normalization routine to unrealistic (I made it up of both positive and negative numbers. *Raw Data* array([[ 1.5, 1.4, 1.6], [-1.4, -1.5, -1.6], [ 2.2, 5.9, -1.8], [ 5.4, -3.2, -5.1], [-1.4, 0. , 1.4], [-1.4, 1. , 1.4]]) Thanks: Chris P.S.: Not a real world problem, just trying to understand the functionality of scikit-learn. Have only been working with the package for two weeks. -------------- next part -------------- An HTML attachment was scrubbed... URL: From joel.nothman at gmail.com Sun Oct 8 06:32:27 2017 From: joel.nothman at gmail.com (Joel Nothman) Date: Sun, 8 Oct 2017 21:32:27 +1100 Subject: [scikit-learn] Validating L2 - Least Squares - sum of squares, During a Normalization Function In-Reply-To: References: Message-ID: (normalize(X) * normalize(X)).sum(axis=1) works fine here. But I was unaware of these quirks in Python's implementation of pow: Numpy seems to be consistent in returning nan when a negative float is raised to a non-integer (or equivalent float) power. 
By only calculating integer powers of negative floats, the absolute value is returned in suqareing. I assume this follows C conventions? Python, on the other hand, seems to do strange things: Numpy: >>> np.array(-.6) ** 2.1 nan >>> np.array(-.6+0j) ** 2.1 (0.32532987876940411+0.10570608538524294j) Python 3.6.2 returns the norm of the complex power: >>> -.6 ** 2.1 -0.3420720779420435 >>> (-.6 + 0j) ** 2.1 (0.3253298787694041+0.10570608538524294j) >>> (((-.6 + 0j) ** 2.1).real ** 2 + ((-.6 + 0j) ** 2.1).imag ** 2) ** .5 0.3420720779420434 Very strangely, putting the LHS in parentheses performs complex power in Python. >>> (-.6) ** 2.1 (0.3253298787694041+0.10570608538524294j) At https://docs.python.org/3/reference/expressions.html: Raising a negative number to a fractional power results in a complex number. (In earlier versions it raised a ValueError .) By "in earlier versions" it means Python 2. I don't know why this should only be the case where the LHS is parenthesised. Seems like a CPython bug! On 8 October 2017 at 16:08, Christopher Pfeifer wrote: > I am attempting to validate the output of an L2 normalization function: > > *data_l2 = preprocessing.normalize(data, norm='l2') * # raw data > is below at end of this email > > output: > > array([[ 0.57649683, 0.53806371, 0.61492995], > [-0.53806371, -0.57649683, -0.61492995], > [ 0.3359268 , 0.90089461, -0.2748492 ], > [ 0.6676851 , -0.39566524, -0.63059148], > [-0.70710678, 0. , 0.70710678], > [-0.63116874, 0.45083482, 0.63116874]]) > > > Each row being a set of three features of an observation > > > I am under the belief that the sum of the 'squared' values of an instance (row) should be virtually equal to 1 (normalized). > > > *Problem - 1:* > > the np.square() function is returning the absolute value of the sum of the three features, even when the sum of the squares is clearly negative. > > np.square(-0.53806371) returns 0.28951255601896408 however, (-0.53806371**2) returns -0.2895125560189641 > > The correct square of -0.53806371 is -0.2895125560189641 (a negative number), even my 10 year old calculator gets it right. > > I can find nothing in the numpy documentation that indicates np.square() always returns the absolute value, instead of the correctly signed value. > > *Question:* > > Is there a way to force np.square() to return the correctly signed square value not the absolute value? > > > *Problem - 2:* > > For some of the observations (rows), the sum of the squared values (which should be virtually 1), are nowhere near 1. > > > print 0.57649683**2 + 0.53806371**2 + 0.61492995**2 row 1 > > 0.9999999944260154 (this is virtually 1) > > > print -0.63116874**2 + 0.45083482**2 + 0.63116874**2 row 6 > > 0.203252034924 (*this is nowhere near 1*) > > > sum of the 'squared' values of an instance (row) should be virtually equal to 1. > > > *Question:* > > Is the preprocessing.normalize(data, norm='l2') messing up, or is my raw data being fed into the normalization routine to unrealistic (I made it up of both positive and negative numbers. > > > *Raw Data* > > array([[ 1.5, 1.4, 1.6], > [-1.4, -1.5, -1.6], > [ 2.2, 5.9, -1.8], > [ 5.4, -3.2, -5.1], > [-1.4, 0. , 1.4], > [-1.4, 1. , 1.4]]) > > Thanks: Chris > > > P.S.: Not a real world problem, just trying to understand the functionality of scikit-learn. Have only been working with the package for two weeks. 
> > > _______________________________________________ > scikit-learn mailing list > scikit-learn at python.org > https://mail.python.org/mailman/listinfo/scikit-learn > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jlopez at ende.cc Sun Oct 8 06:40:15 2017 From: jlopez at ende.cc (=?UTF-8?Q?Javier_L=C3=B3pez?=) Date: Sun, 08 Oct 2017 10:40:15 +0000 Subject: [scikit-learn] Validating L2 - Least Squares - sum of squares, During a Normalization Function In-Reply-To: References: Message-ID: Why would the square of a real number ever be negative? I believe the "quirk" in python is just operator precedence, as the power gets evaluated before applying the unary "-" On Sun, Oct 8, 2017 at 11:34 AM Joel Nothman wrote: > (normalize(X) * normalize(X)).sum(axis=1) works fine here. > > But I was unaware of these quirks in Python's implementation of pow: > > Numpy seems to be consistent in returning nan when a negative float is > raised to a non-integer (or equivalent float) power. By only calculating > integer powers of negative floats, the absolute value is returned in > suqareing. I assume this follows C conventions? > > Python, on the other hand, seems to do strange things: > > Numpy: > >>> np.array(-.6) ** 2.1 > nan > >>> np.array(-.6+0j) ** 2.1 > (0.32532987876940411+0.10570608538524294j) > > Python 3.6.2 returns the norm of the complex power: > >>> -.6 ** 2.1 > -0.3420720779420435 > >>> (-.6 + 0j) ** 2.1 > (0.3253298787694041+0.10570608538524294j) > >>> (((-.6 + 0j) ** 2.1).real ** 2 + ((-.6 + 0j) ** 2.1).imag ** 2) ** .5 > 0.3420720779420434 > > Very strangely, putting the LHS in parentheses performs complex power in > Python. > > >>> (-.6) ** 2.1 > (0.3253298787694041+0.10570608538524294j) > > At https://docs.python.org/3/reference/expressions.html: > > Raising a negative number to a fractional power results in a complex > number. (In > earlier versions it raised a ValueError > .) > > By "in earlier versions" it means Python 2. I don't know why this should > only be the case where the LHS is parenthesised. Seems like a CPython bug! > > On 8 October 2017 at 16:08, Christopher Pfeifer < > chrispfeifer8557 at gmail.com> wrote: > >> I am attempting to validate the output of an L2 normalization function: >> >> *data_l2 = preprocessing.normalize(data, norm='l2') * # raw data >> is below at end of this email >> >> output: >> >> array([[ 0.57649683, 0.53806371, 0.61492995], >> [-0.53806371, -0.57649683, -0.61492995], >> [ 0.3359268 , 0.90089461, -0.2748492 ], >> [ 0.6676851 , -0.39566524, -0.63059148], >> [-0.70710678, 0. , 0.70710678], >> [-0.63116874, 0.45083482, 0.63116874]]) >> >> >> Each row being a set of three features of an observation >> >> >> I am under the belief that the sum of the 'squared' values of an instance (row) should be virtually equal to 1 (normalized). >> >> >> *Problem - 1:* >> >> the np.square() function is returning the absolute value of the sum of the three features, even when the sum of the squares is clearly negative. >> >> np.square(-0.53806371) returns 0.28951255601896408 however, (-0.53806371**2) returns -0.2895125560189641 >> >> The correct square of -0.53806371 is -0.2895125560189641 (a negative number), even my 10 year old calculator gets it right. >> >> I can find nothing in the numpy documentation that indicates np.square() always returns the absolute value, instead of the correctly signed value. >> >> *Question:* >> >> Is there a way to force np.square() to return the correctly signed square value not the absolute value? 
>> >> >> *Problem - 2:* >> >> For some of the observations (rows), the sum of the squared values (which should be virtually 1), are nowhere near 1. >> >> >> print 0.57649683**2 + 0.53806371**2 + 0.61492995**2 row 1 >> >> 0.9999999944260154 (this is virtually 1) >> >> >> print -0.63116874**2 + 0.45083482**2 + 0.63116874**2 row 6 >> >> 0.203252034924 (*this is nowhere near 1*) >> >> >> sum of the 'squared' values of an instance (row) should be virtually equal to 1. >> >> >> *Question:* >> >> Is the preprocessing.normalize(data, norm='l2') messing up, or is my raw data being fed into the normalization routine to unrealistic (I made it up of both positive and negative numbers. >> >> >> *Raw Data* >> >> array([[ 1.5, 1.4, 1.6], >> [-1.4, -1.5, -1.6], >> [ 2.2, 5.9, -1.8], >> [ 5.4, -3.2, -5.1], >> [-1.4, 0. , 1.4], >> [-1.4, 1. , 1.4]]) >> >> Thanks: Chris >> >> >> P.S.: Not a real world problem, just trying to understand the functionality of scikit-learn. Have only been working with the package for two weeks. >> >> >> _______________________________________________ >> scikit-learn mailing list >> scikit-learn at python.org >> https://mail.python.org/mailman/listinfo/scikit-learn >> >> > _______________________________________________ > scikit-learn mailing list > scikit-learn at python.org > https://mail.python.org/mailman/listinfo/scikit-learn > -------------- next part -------------- An HTML attachment was scrubbed... URL: From joel.nothman at gmail.com Sun Oct 8 06:44:42 2017 From: joel.nothman at gmail.com (Joel Nothman) Date: Sun, 8 Oct 2017 21:44:42 +1100 Subject: [scikit-learn] Validating L2 - Least Squares - sum of squares, During a Normalization Function In-Reply-To: References: Message-ID: Ah of course. Thanks. -------------- next part -------------- An HTML attachment was scrubbed... URL: From albertthomas88 at gmail.com Sun Oct 8 09:42:12 2017 From: albertthomas88 at gmail.com (Albert Thomas) Date: Sun, 08 Oct 2017 13:42:12 +0000 Subject: [scikit-learn] question for using GridSearchCV on LocalOutlierFactor In-Reply-To: References: <779b1d39-5767-d2df-1d7c-64e4e7b1a2f4@udel.edu> Message-ID: Hi, As Joel said LOF is not designed to be applied on unseen data. Therefore there is no public predict. Albert On Sun 8 Oct 2017 at 06:17, Joel Nothman wrote: > actually I'm probably wrong there, but you may not be able to use accuracy > _______________________________________________ > scikit-learn mailing list > scikit-learn at python.org > https://mail.python.org/mailman/listinfo/scikit-learn > -------------- next part -------------- An HTML attachment was scrubbed... URL: From joel.nothman at gmail.com Mon Oct 9 03:14:12 2017 From: joel.nothman at gmail.com (Joel Nothman) Date: Mon, 9 Oct 2017 18:14:12 +1100 Subject: [scikit-learn] Remembering Raghav, our friend, and a scikit-learn contributor In-Reply-To: <20171006120437.GA1236395@phare.normalesup.org> References: <20171006120437.GA1236395@phare.normalesup.org> Message-ID: Ga?l and Alex have spoken well. He was distinctive among contributors in his dedication and persistence: he basically started from nothing and slowly, eventually become an invaluable and knowledgeable member of the team (despite being, I think, younger and less formally qualified than most of us); even an expert on the tree code! I feel his loss as a contributor, and a constant presence, but also as a personal student whose achievements rewarded my mentoring efforts, and promised great things to come. 
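For completeness, a short snippet that checks both points on the raw data quoted above -- the row norms of the normalized output, and the fact that the sign surprise is operator precedence rather than np.square (illustrative only):

import numpy as np
from sklearn.preprocessing import normalize

data = np.array([[ 1.5,  1.4,  1.6],
                 [-1.4, -1.5, -1.6],
                 [-1.4,  1.0,  1.4]])
data_l2 = normalize(data, norm='l2')

# Row-wise sums of squares of the normalized rows: each is ~1.0.
print((data_l2 ** 2).sum(axis=1))

# ** binds tighter than the unary minus, so -x**2 means -(x**2).
print(-0.53806371 ** 2)        # -0.2895...
print((-0.53806371) ** 2)      #  0.2895...
print(np.square(-0.53806371))  #  0.2895..., same as the parenthesised form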
It is unimaginable how great those personal attributes must make the loss his family feels. His determination and his enthusiasm are already and will continue to be missed. -------------- next part -------------- An HTML attachment was scrubbed... URL:
From t3kcit at gmail.com Mon Oct 9 03:58:39 2017 From: t3kcit at gmail.com (Andreas Mueller) Date: Mon, 9 Oct 2017 09:58:39 +0200 Subject: [scikit-learn] question for using GridSearchCV on LocalOutlierFactor In-Reply-To: <779b1d39-5767-d2df-1d7c-64e4e7b1a2f4@udel.edu> References: <779b1d39-5767-d2df-1d7c-64e4e7b1a2f4@udel.edu> Message-ID:
What are you trying to achieve with this code? If you label everything as 1, the highest accuracy will be obtained if everything is labeled as 1. So even if the interface was implemented, the result would not be helpful.
On 10/06/2017 12:53 AM, Lifan Xu wrote: > Hi, > > I was trying to train a model for anomaly detection. I only have > the normal data which are all labeled as 1. Here is my code: > > > clf = > sklearn.model_selection.GridSearchCV(sklearn.neighbors.LocalOutlierFactor(), > parameters, > scoring="accuracy", > cv=kfold, > n_jobs=10) > clf.fit(vectors, labels) > > > But it complains "AttributeError: 'LocalOutlierFactor' object has > no attribute 'predict'". > > It looks like LocalOutlierFactor only has fit_predict(), but no > predict(). > > My question is will predict() be implemented? > > > Thanks! > > _______________________________________________ > scikit-learn mailing list > scikit-learn at python.org > https://mail.python.org/mailman/listinfo/scikit-learn
From osaid.nasir at gmail.com Mon Oct 9 15:14:33 2017 From: osaid.nasir at gmail.com (Osaid Nasir) Date: Tue, 10 Oct 2017 00:44:33 +0530 Subject: [scikit-learn] Wrong docs of sklearn/neighbours Message-ID:
Hi, So a few days ago I made a PR and after working around a bit it was concluded that the docs of neighbours are incorrect. Just wanted to know if an issue was opened for the incorrect docs or not. More details about this issue - https://github.com/scikit-learn/scikit-learn/pull/9727
From joel.nothman at gmail.com Mon Oct 9 17:21:14 2017 From: joel.nothman at gmail.com (Joel Nothman) Date: Tue, 10 Oct 2017 08:21:14 +1100 Subject: [scikit-learn] Wrong docs of sklearn/neighbours In-Reply-To: References: Message-ID:
I don't know what you're asking. The documentation at http://scikit-learn.org/dev should reflect that pull request -------------- next part -------------- An HTML attachment was scrubbed... URL:
From jmschreiber91 at gmail.com Mon Oct 9 18:12:09 2017 From: jmschreiber91 at gmail.com (Jacob Schreiber) Date: Mon, 9 Oct 2017 15:12:09 -0700 Subject: [scikit-learn] pomegranate v0.8.0 released Message-ID:
Howdy everyone! I am pleased to announce the release of pomegranate v0.8.0, for fast and flexible probabilistic modeling in Python. The core set of models in pomegranate include Bayesian networks, hidden Markov models, mixtures, and Bayes classifiers, among others. While no new models have been added in this release, this update adds many more features, including extending out-of-core learning, minibatch learning, semi-supervised learning, GPU support, and more built-in parallelization support. pomegranate is pip installable using `pip install pomegranate`. Wheels have been built for Windows, bypassing the need for a C++ compiler. 
Please see the full announcement here , check out the GitHub here , or read the documentation here . I'd love to answer any questions or hear any comments! Jacob -------------- next part -------------- An HTML attachment was scrubbed... URL: From osaid.nasir at gmail.com Tue Oct 10 14:02:46 2017 From: osaid.nasir at gmail.com (Osaid Nasir) Date: Tue, 10 Oct 2017 23:32:46 +0530 Subject: [scikit-learn] Wrong docs of sklearn/neighbours Message-ID: What I meant was - The docstrings of sklearn/neighbours is incorrect. As stated by jnothman - "it looks like those docs in nearest neighbors are incorrect. When using ball tree and kdtree, the metrics listed in dist_metrics.pyx are available, and they match many of those offered in scipy, but are implemented separately. (That's right, isn't it, @jakevdp, and our neighbors docs are incorrect to say that only euclidean, manhattan, cosine are implemented in scikit-learn?)" I wanted to know if an issue was opened for the incorrect docstrings. Sorry if it's still unclear, I am kind of new to OSS. Link to discussion - https://github.com/scikit-learn/scikit-learn/pull/9727 From joel.nothman at gmail.com Tue Oct 10 17:22:10 2017 From: joel.nothman at gmail.com (Joel Nothman) Date: Wed, 11 Oct 2017 08:22:10 +1100 Subject: [scikit-learn] Wrong docs of sklearn/neighbours In-Reply-To: References: Message-ID: yes, I think that statement is imprecise, at least in the context of nearest neighbours, and I think it is the kind of statement that is hard to maintain consistent with the library in any case. No issue has been opened to my knowledge. thanks for following up, and feel free to submit a PR even without an issue. -------------- next part -------------- An HTML attachment was scrubbed... URL: From markus.konrad at wzb.eu Wed Oct 11 10:33:43 2017 From: markus.konrad at wzb.eu (Markus Konrad) Date: Wed, 11 Oct 2017 16:33:43 +0200 Subject: [scikit-learn] Using perplexity from LatentDirichletAllocation for cross validation of Topic Models In-Reply-To: References: Message-ID: <980dec89-cfa0-c326-9fec-fc3c47aa965e@wzb.eu> Hi again, > just a note that if you're using this for topic modelling, perplexity might > not be a good choice of objective function. others have been proposed. see > the diagnostic functions for MALLET topic modelling for instance. unfortunately I don't find any of these methods implemented in Python and as they seem to be rather complicated, I don't think I can implement them myself. Since perplexity on held-out data is reported quite often in papers on topic modeling, I wanted to use it for my own experiments in topic modeling. There are also methods that don't rely on validation with held-out data (like Cao, Juan 2009 or Arun 2010) and I'm using them but still I'd like to compare those results with cross validation of models with different num. of topics. 
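To make that comparison concrete, this is roughly the kind of grid search I have in mind (a rough sketch on synthetic count data; it relies only on LatentDirichletAllocation's built-in score() and perplexity(), with the default scoring in GridSearchCV maximising the approximate held-out log-likelihood):

import numpy as np
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.model_selection import GridSearchCV

# Synthetic stand-in for a document-term count matrix
# (in practice this would come from CountVectorizer).
X = np.random.RandomState(0).randint(0, 5, size=(200, 100))

# Without an explicit `scoring` argument, GridSearchCV uses the estimator's
# own score(), which for LDA is the approximate log-likelihood of the
# held-out fold (higher is better, unlike perplexity).
search = GridSearchCV(
    LatentDirichletAllocation(learning_method='batch', random_state=0),
    param_grid={'n_components': [5, 10, 20]},
    cv=3)
search.fit(X)

print(search.best_params_)
print(search.best_estimator_.perplexity(X))  # perplexity of the refit model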
Bye, Markus From mcapizzi at email.arizona.edu Wed Oct 11 12:52:54 2017 From: mcapizzi at email.arizona.edu (Michael Capizzi) Date: Wed, 11 Oct 2017 09:52:54 -0700 Subject: [scikit-learn] purpose of test: check_classifiers_train Message-ID: I?m wondering if anyone can identify the purpose of this test: check_classifiers_train(), specifically this line: https://github.com/scikit-learn/scikit-learn/blob/ef5cb84a/sklearn/utils/estimator_checks.py#L1106 My custom classifier (which I?m hoping to submit to scikit-learn-contrib) is failing this test: File "/Users/mcapizzi/miniconda3/envs/nb_plus_svm/lib/python3.6/site-packages/sklearn/utils/estimator_checks.py", line 1106, in check_classifiers_train assert_greater(accuracy_score(y, y_pred), 0.83) AssertionError: 0.31333333333333335 not greater than 0.83 And while it?s disturbing that my classifier is getting 31% accuracy when, clearly, the test writer expects it to be in the upper-80s, I?m not sure I understand why that would be a test condition. Thanks for any insight. ? -------------- next part -------------- An HTML attachment was scrubbed... URL: From g.lemaitre58 at gmail.com Wed Oct 11 13:09:21 2017 From: g.lemaitre58 at gmail.com (=?UTF-8?Q?Guillaume_Lema=C3=AEtre?=) Date: Wed, 11 Oct 2017 19:09:21 +0200 Subject: [scikit-learn] purpose of test: check_classifiers_train In-Reply-To: References: Message-ID: Not sure 100% but this is an integration/sanity check since all classifiers are supposed to predict quite well and data used to train. This is true that 83% is empirical but it allows to spot any changes done in the algorithms even if the unit tests are passing for some reason. On 11 October 2017 at 18:52, Michael Capizzi wrote: > I?m wondering if anyone can identify the purpose of this test: > check_classifiers_train(), specifically this line: > https://github.com/scikit-learn/scikit-learn/blob/ef5cb84a/sklearn/utils/ > estimator_checks.py#L1106 > > My custom classifier (which I?m hoping to submit to scikit-learn-contrib) > is failing this test: > > File "/Users/mcapizzi/miniconda3/envs/nb_plus_svm/lib/python3.6/site-packages/sklearn/utils/estimator_checks.py", line 1106, in check_classifiers_train > assert_greater(accuracy_score(y, y_pred), 0.83) > AssertionError: 0.31333333333333335 not greater than 0.83 > > And while it?s disturbing that my classifier is getting 31% accuracy > when, clearly, the test writer expects it to be in the upper-80s, I?m not > sure I understand why that would be a test condition. > > Thanks for any insight. > ? > > _______________________________________________ > scikit-learn mailing list > scikit-learn at python.org > https://mail.python.org/mailman/listinfo/scikit-learn > > -- Guillaume Lemaitre INRIA Saclay - Parietal team Center for Data Science Paris-Saclay https://glemaitre.github.io/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From t3kcit at gmail.com Thu Oct 12 03:01:47 2017 From: t3kcit at gmail.com (Andreas Mueller) Date: Thu, 12 Oct 2017 09:01:47 +0200 Subject: [scikit-learn] purpose of test: check_classifiers_train In-Reply-To: References: Message-ID: <77cfd611-1f28-7ced-7c25-7fdba1ae608b@gmail.com> Yes, it's pretty empirical, and with the estimator tags PR (https://github.com/scikit-learn/scikit-learn/pull/8022) we will be able to relax it if there's a good reason you're not passing. But the dataset is pretty trivial (iris), and you're getting chance performance (it's a balanced three class problem). So that is not a great sign for your estimator. 
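If it helps, the same battery of checks can be run locally outside the contrib test setup; a minimal sketch (the scikit-learn classifier below is only a stand-in for your own estimator class):

from sklearn.linear_model import LogisticRegression
from sklearn.utils.estimator_checks import check_estimator

# Runs the common estimator checks, including check_classifiers_train;
# an AssertionError is raised at the first failing check.
check_estimator(LogisticRegression)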
On 10/11/2017 07:09 PM, Guillaume Lema?tre wrote: > Not sure 100% but this is an integration/sanity check since all > classifiers are supposed to predict quite well and data used to train. > This is true that 83% is empirical but it allows to spot any changes > done in the algorithms even if the unit tests are passing for some reason. > > On 11 October 2017 at 18:52, Michael Capizzi > > wrote: > > I?m wondering if anyone can identify the purpose of this test: > |check_classifiers_train()|, specifically this line: > https://github.com/scikit-learn/scikit-learn/blob/ef5cb84a/sklearn/utils/estimator_checks.py#L1106 > > > My custom classifier (which I?m hoping to submit to > |scikit-learn-contrib|) is failing this test: > > |File > "/Users/mcapizzi/miniconda3/envs/nb_plus_svm/lib/python3.6/site-packages/sklearn/utils/estimator_checks.py", > line 1106, in check_classifiers_train > assert_greater(accuracy_score(y, y_pred), 0.83) AssertionError: > 0.31333333333333335 not greater than 0.83 | > > And while it?s disturbing that my classifier is getting 31% > |accuracy| when, clearly, the test writer expects it to be in the > upper-80s, I?m not sure I understand why that would be a test > condition. > > Thanks for any insight. > > ? > > _______________________________________________ > scikit-learn mailing list > scikit-learn at python.org > https://mail.python.org/mailman/listinfo/scikit-learn > > > > > > -- > Guillaume Lemaitre > INRIA Saclay - Parietal team > Center for Data Science Paris-Saclay > https://glemaitre.github.io/ > > > _______________________________________________ > scikit-learn mailing list > scikit-learn at python.org > https://mail.python.org/mailman/listinfo/scikit-learn -------------- next part -------------- An HTML attachment was scrubbed... URL: From mcapizzi at email.arizona.edu Thu Oct 12 14:27:22 2017 From: mcapizzi at email.arizona.edu (Michael Capizzi) Date: Thu, 12 Oct 2017 11:27:22 -0700 Subject: [scikit-learn] purpose of test: check_classifiers_train In-Reply-To: <77cfd611-1f28-7ced-7c25-7fdba1ae608b@gmail.com> References: <77cfd611-1f28-7ced-7c25-7fdba1ae608b@gmail.com> Message-ID: Thanks @andreas, for your comments, especially the info that it's the `iris` dataset. I have to dig a bit deeper to see what's going on with the performance there. But now that I know it's `iris`, I can try to recreate. -M On Thu, Oct 12, 2017 at 12:01 AM, Andreas Mueller wrote: > Yes, it's pretty empirical, and with the estimator tags PR ( > https://github.com/scikit-learn/scikit-learn/pull/8022) we will be able > to relax it if there's a good reason you're not passing. > But the dataset is pretty trivial (iris), and you're getting chance > performance (it's a balanced three class problem). So that is not a great > sign for your estimator. > > > On 10/11/2017 07:09 PM, Guillaume Lema?tre wrote: > > Not sure 100% but this is an integration/sanity check since all > classifiers are supposed to predict quite well and data used to train. > This is true that 83% is empirical but it allows to spot any changes done > in the algorithms even if the unit tests are passing for some reason. 
> > On 11 October 2017 at 18:52, Michael Capizzi > wrote: > >> I?m wondering if anyone can identify the purpose of this test: >> check_classifiers_train(), specifically this line: >> https://github.com/scikit-learn/scikit-learn/blob/ef5cb84a/ >> sklearn/utils/estimator_checks.py#L1106 >> >> My custom classifier (which I?m hoping to submit to scikit-learn-contrib) >> is failing this test: >> >> File "/Users/mcapizzi/miniconda3/envs/nb_plus_svm/lib/python3.6/site-packages/sklearn/utils/estimator_checks.py", line 1106, in check_classifiers_train >> assert_greater(accuracy_score(y, y_pred), 0.83) >> AssertionError: 0.31333333333333335 not greater than 0.83 >> >> And while it?s disturbing that my classifier is getting 31% accuracy >> when, clearly, the test writer expects it to be in the upper-80s, I?m not >> sure I understand why that would be a test condition. >> >> Thanks for any insight. >> ? >> >> _______________________________________________ >> scikit-learn mailing list >> scikit-learn at python.org >> https://mail.python.org/mailman/listinfo/scikit-learn >> >> > > > -- > Guillaume Lemaitre > INRIA Saclay - Parietal team > Center for Data Science Paris-Saclay > https://glemaitre.github.io/ > > > _______________________________________________ > scikit-learn mailing listscikit-learn at python.orghttps://mail.python.org/mailman/listinfo/scikit-learn > > > > _______________________________________________ > scikit-learn mailing list > scikit-learn at python.org > https://mail.python.org/mailman/listinfo/scikit-learn > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mcapizzi at email.arizona.edu Thu Oct 12 16:10:57 2017 From: mcapizzi at email.arizona.edu (Michael Capizzi) Date: Thu, 12 Oct 2017 13:10:57 -0700 Subject: [scikit-learn] purpose of test: check_classifiers_train In-Reply-To: References: <77cfd611-1f28-7ced-7c25-7fdba1ae608b@gmail.com> Message-ID: So it appears that the test check_classifiers_train() ( https://github.com/scikit-learn/scikit-learn/blob/ef5cb84a/sklearn/utils/estimator_checks.py#L1079) does *not* use the iris dataset after all: X_m, y_m = make_blobs(n_samples=300, random_state=0) X_m, y_m = shuffle(X_m, y_m, random_state=7) X_m = StandardScaler().fit_transform(X_m) But, this also explains why my classifier only gets accuracy of only 31%. My classifier that I?m trying to build to contribute to scikit-learn-contrib is designed to be used on NLP data where the features are *non-negative* counts: https://nlp.stanford.edu/pubs/sidaw12_simple_sentiment.pdf Interestingly enough, this classifier reports 100% accuracy on the iris dataset (when last 10% is used for testing). But again, the main purpose of this classifier is in NLP cases. So @andreas mentioned that this can be relaxed ?if there?s a good reason.? Does the above situation qualify? -M ? On Thu, Oct 12, 2017 at 11:27 AM, Michael Capizzi < mcapizzi at email.arizona.edu> wrote: > Thanks @andreas, for your comments, especially the info that it's the > `iris` dataset. I have to dig a bit deeper to see what's going on with the > performance there. But now that I know it's `iris`, I can try to recreate. > > -M > > On Thu, Oct 12, 2017 at 12:01 AM, Andreas Mueller > wrote: > >> Yes, it's pretty empirical, and with the estimator tags PR ( >> https://github.com/scikit-learn/scikit-learn/pull/8022) we will be able >> to relax it if there's a good reason you're not passing. 
>> But the dataset is pretty trivial (iris), and you're getting chance >> performance (it's a balanced three class problem). So that is not a great >> sign for your estimator. >> >> >> On 10/11/2017 07:09 PM, Guillaume Lema?tre wrote: >> >> Not sure 100% but this is an integration/sanity check since all >> classifiers are supposed to predict quite well and data used to train. >> This is true that 83% is empirical but it allows to spot any changes done >> in the algorithms even if the unit tests are passing for some reason. >> >> On 11 October 2017 at 18:52, Michael Capizzi >> wrote: >> >>> I?m wondering if anyone can identify the purpose of this test: >>> check_classifiers_train(), specifically this line: >>> https://github.com/scikit-learn/scikit-learn/blob/ef5cb84a/s >>> klearn/utils/estimator_checks.py#L1106 >>> >>> My custom classifier (which I?m hoping to submit to scikit-learn-contrib) >>> is failing this test: >>> >>> File "/Users/mcapizzi/miniconda3/envs/nb_plus_svm/lib/python3.6/site-packages/sklearn/utils/estimator_checks.py", line 1106, in check_classifiers_train >>> assert_greater(accuracy_score(y, y_pred), 0.83) >>> AssertionError: 0.31333333333333335 not greater than 0.83 >>> >>> And while it?s disturbing that my classifier is getting 31% accuracy >>> when, clearly, the test writer expects it to be in the upper-80s, I?m not >>> sure I understand why that would be a test condition. >>> >>> Thanks for any insight. >>> ? >>> >>> _______________________________________________ >>> scikit-learn mailing list >>> scikit-learn at python.org >>> https://mail.python.org/mailman/listinfo/scikit-learn >>> >>> >> >> >> -- >> Guillaume Lemaitre >> INRIA Saclay - Parietal team >> Center for Data Science Paris-Saclay >> https://glemaitre.github.io/ >> >> >> _______________________________________________ >> scikit-learn mailing listscikit-learn at python.orghttps://mail.python.org/mailman/listinfo/scikit-learn >> >> >> >> _______________________________________________ >> scikit-learn mailing list >> scikit-learn at python.org >> https://mail.python.org/mailman/listinfo/scikit-learn >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From t3kcit at gmail.com Fri Oct 13 03:10:29 2017 From: t3kcit at gmail.com (Andreas Mueller) Date: Fri, 13 Oct 2017 09:10:29 +0200 Subject: [scikit-learn] purpose of test: check_classifiers_train In-Reply-To: References: <77cfd611-1f28-7ced-7c25-7fdba1ae608b@gmail.com> Message-ID: Sorry for the misinformation. Yes, actually I'd argue you should raise an error on data that's not non-negative, if that's not valid input. Right now there is no way to specify to the testing suite that your model requires positive data, that's what the PR is about (among other things) that I referenced earlier. On 10/12/2017 10:10 PM, Michael Capizzi wrote: > > So it appears that the test |check_classifiers_train()| > (https://github.com/scikit-learn/scikit-learn/blob/ef5cb84a/sklearn/utils/estimator_checks.py#L1079) > does /not/ use the |iris| dataset after all: > > |X_m, y_m = make_blobs(n_samples=300, random_state=0) X_m, y_m = > shuffle(X_m, y_m, random_state=7) X_m = > StandardScaler().fit_transform(X_m) | > > But, this also explains why my classifier only gets accuracy of only > |31%|. 
My classifier that I?m trying to build to contribute to > |scikit-learn-contrib| is designed to be used on NLP data where the > features are /non-negative/ counts: > https://nlp.stanford.edu/pubs/sidaw12_simple_sentiment.pdf > > Interestingly enough, this classifier reports 100% accuracy on the > |iris| dataset (when last 10% is used for testing). But again, the > main purpose of this classifier is in NLP cases. > > So @andreas mentioned that this can be relaxed ?if there?s a good > reason.? Does the above situation qualify? > > -M > > ? > > On Thu, Oct 12, 2017 at 11:27 AM, Michael Capizzi > > wrote: > > Thanks @andreas, for your comments, especially the info that it's > the `iris` dataset.? I have to dig a bit deeper to see what's > going on with the performance there.? But now that I know it's > `iris`, I can try to recreate. > > -M > > On Thu, Oct 12, 2017 at 12:01 AM, Andreas Mueller > > wrote: > > Yes, it's pretty empirical, and with the estimator tags PR > (https://github.com/scikit-learn/scikit-learn/pull/8022 > ) we > will be able to relax it if there's a good reason you're not > passing. > But the dataset is pretty trivial (iris), and you're getting > chance performance (it's a balanced three class problem). So > that is not a great sign for your estimator. > > > On 10/11/2017 07:09 PM, Guillaume Lema?tre wrote: >> Not sure 100% but this is an integration/sanity check since >> all classifiers are supposed to predict quite well and data >> used to train. >> This is true that 83% is empirical but it allows to spot any >> changes done in the algorithms even if the unit tests are >> passing for some reason. >> >> On 11 October 2017 at 18:52, Michael Capizzi >> > > wrote: >> >> I?m wondering if anyone can identify the purpose of this >> test: |check_classifiers_train()|, specifically this >> line: >> https://github.com/scikit-learn/scikit-learn/blob/ef5cb84a/sklearn/utils/estimator_checks.py#L1106 >> >> >> My custom classifier (which I?m hoping to submit to >> |scikit-learn-contrib|) is failing this test: >> >> |File >> "/Users/mcapizzi/miniconda3/envs/nb_plus_svm/lib/python3.6/site-packages/sklearn/utils/estimator_checks.py", >> line 1106, in check_classifiers_train >> assert_greater(accuracy_score(y, y_pred), 0.83) >> AssertionError: 0.31333333333333335 not greater than 0.83 | >> >> And while it?s disturbing that my classifier is getting >> 31% |accuracy| when, clearly, the test writer expects it >> to be in the upper-80s, I?m not sure I understand why >> that would be a test condition. >> >> Thanks for any insight. >> >> ? >> >> _______________________________________________ >> scikit-learn mailing list >> scikit-learn at python.org >> https://mail.python.org/mailman/listinfo/scikit-learn >> >> >> >> >> >> -- >> Guillaume Lemaitre >> INRIA Saclay - Parietal team >> Center for Data Science Paris-Saclay >> https://glemaitre.github.io/ >> >> >> _______________________________________________ >> scikit-learn mailing list >> scikit-learn at python.org >> https://mail.python.org/mailman/listinfo/scikit-learn >> > > > _______________________________________________ > scikit-learn mailing list > scikit-learn at python.org > https://mail.python.org/mailman/listinfo/scikit-learn > > > > > > > _______________________________________________ > scikit-learn mailing list > scikit-learn at python.org > https://mail.python.org/mailman/listinfo/scikit-learn -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From lemhadri at stanford.edu Sun Oct 15 21:42:56 2017 From: lemhadri at stanford.edu (Ismael Lemhadri) Date: Sun, 15 Oct 2017 18:42:56 -0700 Subject: [scikit-learn] unclear help file for sklearn.decomposition.pca Message-ID: Dear all, The help file for the PCA class is unclear about the preprocessing performed to the data. You can check on line 410 here: https://github.com/scikit-learn/scikit-learn/blob/ef5cb84a/sklearn/ decomposition/pca.py#L410 that the matrix is centered but NOT scaled, before performing the singular value decomposition. However, the help files do not make any mention of it. This is unclear for someone who, like me, just wanted to compare that the PCA and np.linalg.svd give the same results. In academic settings, students are often asked to compare different methods and to check that they yield the same results. I expect that many students have confronted this problem before... Best, Ismael Lemhadri -------------- next part -------------- An HTML attachment was scrubbed... URL: From rth.yurchak at gmail.com Mon Oct 16 09:16:45 2017 From: rth.yurchak at gmail.com (Roman Yurchak) Date: Mon, 16 Oct 2017 15:16:45 +0200 Subject: [scikit-learn] unclear help file for sklearn.decomposition.pca In-Reply-To: References: Message-ID: Ismael, as far as I saw the sklearn.decomposition.PCA doesn't mention scaling at all (except for the whiten parameter which is post-transformation scaling). So since it doesn't mention it, it makes sense that it doesn't do any scaling of the input. Same as np.linalg.svd. You can verify that PCA and np.linalg.svd yield the same results, with ``` >>> import numpy as np >>> from sklearn.decomposition import PCA >>> import numpy.linalg >>> X = np.random.RandomState(42).rand(10, 4) >>> n_components = 2 >>> PCA(n_components, svd_solver='full').fit_transform(X) ``` and ``` >>> U, s, V = np.linalg.svd(X - X.mean(axis=0), full_matrices=False) >>> (X - X.mean(axis=0)).dot(V[:n_components].T) ``` -- Roman On 16/10/17 03:42, Ismael Lemhadri wrote: > Dear all, > The help file for the PCA class is unclear about the preprocessing > performed to the data. > You can check on line 410 here: > https://github.com/scikit-learn/scikit-learn/blob/ef5cb84a/sklearn/decomposition/pca.py#L410 > > that the matrix is centered but NOT scaled, before performing the > singular value decomposition. > However, the help files do not make any mention of it. > This is unclear for someone who, like me, just wanted to compare that > the PCA and np.linalg.svd give the same results. In academic settings, > students are often asked to compare different methods and to check that > they yield the same results. I expect that many students have confronted > this problem before... > Best, > Ismael Lemhadri > > > _______________________________________________ > scikit-learn mailing list > scikit-learn at python.org > https://mail.python.org/mailman/listinfo/scikit-learn > From seralouk at gmail.com Mon Oct 16 09:27:48 2017 From: seralouk at gmail.com (Serafeim Loukas) Date: Mon, 16 Oct 2017 15:27:48 +0200 Subject: [scikit-learn] Question about LDA's coef_ attribute Message-ID: <58C6D0DA-9DE5-4EF5-97C1-48159831F5A9@gmail.com> Dear Scikit-learn community, Since the documentation of the LDA (http://scikit-learn.org/stable/modules/generated/sklearn.discriminant_analysis.LinearDiscriminantAnalysis.html ) is not so clear, I would like to ask if the lda.coef_ attribute stores the eigenvectors from the SVD decomposition. 
Thank you in advance, Serafeim -------------- next part -------------- An HTML attachment was scrubbed... URL:
From alexandre.gramfort at inria.fr Mon Oct 16 10:57:52 2017 From: alexandre.gramfort at inria.fr (Alexandre Gramfort) Date: Mon, 16 Oct 2017 16:57:52 +0200 Subject: Re: [scikit-learn] Question about LDA's coef_ attribute In-Reply-To: <58C6D0DA-9DE5-4EF5-97C1-48159831F5A9@gmail.com> References: <58C6D0DA-9DE5-4EF5-97C1-48159831F5A9@gmail.com> Message-ID:
no it stores the direction of the decision function to match the API of linear models. HTH Alex
On Mon, Oct 16, 2017 at 3:27 PM, Serafeim Loukas wrote: > Dear Scikit-learn community, > > Since the documentation of the LDA > (http://scikit-learn.org/stable/modules/generated/sklearn.discriminant_analysis.LinearDiscriminantAnalysis.html) > is not so clear, I would like to ask if the lda.coef_ attribute stores the > eigenvectors from the SVD decomposition. > > Thank you in advance, > Serafeim > > _______________________________________________ > scikit-learn mailing list > scikit-learn at python.org > https://mail.python.org/mailman/listinfo/scikit-learn >
From seralouk at gmail.com Mon Oct 16 11:02:46 2017 From: seralouk at gmail.com (Serafeim Loukas) Date: Mon, 16 Oct 2017 17:02:46 +0200 Subject: Re: [scikit-learn] Question about LDA's coef_ attribute In-Reply-To: References: <58C6D0DA-9DE5-4EF5-97C1-48159831F5A9@gmail.com> Message-ID: <413210D2-56AE-41A4-873F-D171BB36539D@gmail.com>
Dear Alex, Thank you for the prompt response. Are the eigenvectors stored in some variable? Does the lda.scalings_ attribute contain the eigenvectors? Best, Serafeim
> On 16 Oct 2017, at 16:57, Alexandre Gramfort wrote: > > no it stores the direction of the decision function to match the API of > linear models. > > HTH > Alex > > On Mon, Oct 16, 2017 at 3:27 PM, Serafeim Loukas wrote: >> Dear Scikit-learn community, >> >> Since the documentation of the LDA >> (http://scikit-learn.org/stable/modules/generated/sklearn.discriminant_analysis.LinearDiscriminantAnalysis.html) >> is not so clear, I would like to ask if the lda.coef_ attribute stores the >> eigenvectors from the SVD decomposition. >> >> Thank you in advance, >> Serafeim >> >> _______________________________________________ >> scikit-learn mailing list >> scikit-learn at python.org >> https://mail.python.org/mailman/listinfo/scikit-learn >> > _______________________________________________ > scikit-learn mailing list > scikit-learn at python.org > https://mail.python.org/mailman/listinfo/scikit-learn -------------- next part -------------- An HTML attachment was scrubbed... URL:
From lemhadri at stanford.edu Mon Oct 16 11:16:38 2017 From: lemhadri at stanford.edu (Ismael Lemhadri) Date: Mon, 16 Oct 2017 08:16:38 -0700 Subject: [scikit-learn] unclear help file for sklearn.decomposition.pca Message-ID:
Dear Roman, My concern is actually not about not mentioning the scaling but about not mentioning the centering. That is, the sklearn PCA removes the mean but it does not mention it in the help file. This was quite messy for me to debug as I expected it to either: 1/ center and scale simultaneously, or 2/ neither center nor scale. It would be beneficial to make the behavior explicit in the help file, in my opinion.
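A minimal check that makes the behavior visible (an illustrative sketch, not code from the docs): the fitted components match an SVD of the centered data but not of the standardized data.

import numpy as np
from sklearn.decomposition import PCA

rng = np.random.RandomState(0)
X = rng.rand(20, 4) + 5.0   # data with a clearly non-zero mean

pca = PCA(n_components=2, svd_solver='full').fit(X)

# PCA subtracts the column means internally ...
_, _, Vt_centered = np.linalg.svd(X - X.mean(axis=0), full_matrices=False)
print(np.allclose(np.abs(pca.components_), np.abs(Vt_centered[:2])))  # True

# ... but it does not divide by the standard deviations:
X_standardized = (X - X.mean(axis=0)) / X.std(axis=0)
_, _, Vt_scaled = np.linalg.svd(X_standardized, full_matrices=False)
print(np.allclose(np.abs(pca.components_), np.abs(Vt_scaled[:2])))    # False in general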
Ismael
-------------- next part -------------- An HTML attachment was scrubbed... URL:
From rth.yurchak at gmail.com Mon Oct 16 11:34:17 2017 From: rth.yurchak at gmail.com (Roman Yurchak) Date: Mon, 16 Oct 2017 17:34:17 +0200 Subject: Re: [scikit-learn] unclear help file for sklearn.decomposition.pca In-Reply-To: References: Message-ID: <47eb67c2-89be-8390-3b31-8e39d1bf2b1d at gmail.com>
On 16/10/17 17:16, Ismael Lemhadri wrote: > My concern is actually not about not mentioning the scaling but about > not mentioning the centering. > That is, the sklearn PCA removes the mean but it does not mention it in > the help file. I think it's currently assumed given the definition of the PCA, but you are right, the subtraction of the mean and the relationship to the SVD decomposition (i.e. TruncatedSVD) could be more clearly stated in the docstring and in the user manual, http://scikit-learn.org/stable/modules/decomposition.html#principal-component-analysis-pca Feel free to open an issue on Github about it or to submit a pull request to improve the documentation, -- Roman
From t3kcit at gmail.com Mon Oct 16 13:19:57 2017 From: t3kcit at gmail.com (Andreas Mueller) Date: Mon, 16 Oct 2017 13:19:57 -0400 Subject: Re: [scikit-learn] unclear help file for sklearn.decomposition.pca In-Reply-To: References: Message-ID: <04fc445c-d8f3-a3a9-4ab2-0535826a2d03 at gmail.com>
The definition of PCA has a centering step, but no scaling step.
On 10/16/2017 11:16 AM, Ismael Lemhadri wrote: > Dear Roman, > My concern is actually not about not mentioning the scaling but about > not mentioning the centering. > That is, the sklearn PCA removes the mean but it does not mention it > in the help file. > This was quite messy for me to debug as I expected it to either: 1/ > center and scale simultaneously, or 2/ neither center nor scale. > It would be beneficial to make the behavior explicit in the help file, > in my opinion. 
> Ismael
-------------- next part -------------- An HTML attachment was scrubbed... URL:
From lemhadri at stanford.edu Mon Oct 16 14:27:11 2017 From: lemhadri at stanford.edu (Ismael Lemhadri) Date: Mon, 16 Oct 2017 11:27:11 -0700 Subject: [scikit-learn] 1. Re: unclear help file for sklearn.decomposition.pca Message-ID:
@Andreas Muller: My references do not assume centering, e.g. http://ufldl.stanford.edu/wiki/index.php/PCA any reference?
From michael.eickenberg at gmail.com Mon Oct 16 14:41:26 2017
From: michael.eickenberg at gmail.com (Michael Eickenberg)
Date: Mon, 16 Oct 2017 11:41:26 -0700
Subject: [scikit-learn] 1. Re: unclear help file for sklearn.decomposition.pca
In-Reply-To:
References:
Message-ID:

Your document says:

> This data has already been pre-processed so that each of the features x_1 and x_2 have about the same mean (zero) and variance.

This means that the centering is done before the eigendecomposition.

Check the Wikipedia article https://en.wikipedia.org/wiki/Principal_component_analysis - it says:

> To find the axes of the ellipsoid, we must first subtract the mean of each variable from the dataset to center the data around the origin.

More intuitively: PCA diagonalizes the empirical covariance matrix. The covariance matrix is the matrix of centered second-order moments. To obtain it you have to center the data.

Hope this helps.

Michael

On Mon, Oct 16, 2017 at 11:27 AM, Ismael Lemhadri wrote:
> @Andreas Muller:
> My references do not assume centering, e.g.
> http://ufldl.stanford.edu/wiki/index.php/PCA
> any reference?
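A minimal sketch of that last point (illustrative only, assuming a recent NumPy and scikit-learn): the principal axes returned by PCA coincide, up to sign, with the eigenvectors of the empirical covariance matrix of the centered data.

```
>>> import numpy as np
>>> from sklearn.decomposition import PCA
>>> X = np.random.RandomState(0).rand(20, 3)
>>> # np.cov centers the data internally before forming the covariance matrix
>>> C = np.cov(X, rowvar=False)
>>> eigvals, eigvecs = np.linalg.eigh(C)   # eigenvalues in ascending order
>>> pca = PCA(n_components=3).fit(X)
>>> # same directions, up to sign: PCA component rows vs. covariance eigenvectors
>>> np.allclose(np.abs(pca.components_), np.abs(eigvecs[:, ::-1].T))
```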
From t3kcit at gmail.com Mon Oct 16 14:44:51 2017
From: t3kcit at gmail.com (Andreas Mueller)
Date: Mon, 16 Oct 2017 14:44:51 -0400
Subject: [scikit-learn] 1. Re: unclear help file for sklearn.decomposition.pca
In-Reply-To:
References:
Message-ID: <35142868-fce9-6cb3-eba3-015a0b106163@gmail.com>

On 10/16/2017 02:27 PM, Ismael Lemhadri wrote:
> @Andreas Muller:
> My references do not assume centering, e.g.
> http://ufldl.stanford.edu/wiki/index.php/PCA
> any reference?
>
It kinda does, but is not very clear about it:

"This data has already been pre-processed so that each of the features x_1 and x_2 have about the same mean (zero) and variance."

Wikipedia is much clearer:

"Consider a data matrix X with column-wise zero empirical mean (the sample mean of each column has been shifted to zero), where each of the n rows represents a different repetition of the experiment, and each of the p columns gives a particular kind of feature (say, the results from a particular sensor)."
https://en.wikipedia.org/wiki/Principal_component_analysis#Details

I'm a bit surprised to find that ESL says "The SVD of the centered matrix X is another way of expressing the principal components of the variables in X", so they assume scaling? They don't really have a great treatment of PCA, though.

Bishop and Murphy are pretty clear that they subtract the mean (or assume zero mean) but don't standardize.

From olivertomic at zoho.com Mon Oct 16 14:48:29 2017
From: olivertomic at zoho.com (Oliver Tomic)
Date: Mon, 16 Oct 2017 20:48:29 +0200
Subject: [scikit-learn] 1. Re: unclear help file for sklearn.decomposition.pca
In-Reply-To:
References:
Message-ID: <15f26840d65.e97b33c25239.3934951873824890747@zoho.com>

Dear Ismael,

PCA should always involve at least centering or, if the variables are to contribute equally, scaling. Here is a reference from the scientific area named "chemometrics". In chemometrics, PCA is used not only for dimensionality reduction, but also for interpretation of variance by use of scores, loadings, correlation loadings, etc.

If you scroll down to the subsection "Preprocessing" you will find more info on centering and scaling.

http://pubs.rsc.org/en/content/articlehtml/2014/ay/c3ay41907j

best
Oliver

---- On Mon, 16 Oct 2017 20:27:11 +0200 Ismael Lemhadri <lemhadri at stanford.edu> wrote ----

> @Andreas Muller:
> My references do not assume centering, e.g. http://ufldl.stanford.edu/wiki/index.php/PCA
> any reference?
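For completeness, a minimal sketch of the kind of centered-and-scaled ("correlation-matrix") PCA discussed above, assuming StandardScaler is an acceptable choice for the scaling step:

```
>>> import numpy as np
>>> from sklearn.pipeline import make_pipeline
>>> from sklearn.preprocessing import StandardScaler
>>> from sklearn.decomposition import PCA
>>> X = np.random.RandomState(0).rand(20, 3)
>>> # PCA itself only centers; add an explicit scaling step in front
>>> # if the variables should contribute equally
>>> scaled_pca = make_pipeline(StandardScaler(), PCA(n_components=2))
>>> scores = scaled_pca.fit_transform(X)
```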
From se.raschka at gmail.com Mon Oct 16 15:25:46 2017
From: se.raschka at gmail.com (Sebastian Raschka)
Date: Mon, 16 Oct 2017 15:25:46 -0400
Subject: [scikit-learn] 1. Re: unclear help file for sklearn.decomposition.pca
In-Reply-To:
References:
Message-ID: <033391A5-C191-4DFF-B80D-8E3C7AF3A74D@gmail.com>

Hi,

if you compute the principal components (i.e., the eigendecomposition) from the covariance matrix, it shouldn't matter whether the data is centered or not, since the covariance matrix is computed as

CovMat = \frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x}) (x_i - \bar{x})^T

where \bar{x} is the vector of feature means.

So, if you center the data prior to computing the covariance matrix, \bar{x} is simply 0.

Best,
Sebastian

> On Oct 16, 2017, at 2:27 PM, Ismael Lemhadri wrote:
>
> @Andreas Muller:
> My references do not assume centering, e.g. http://ufldl.stanford.edu/wiki/index.php/PCA
> any reference?
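A quick numerical check of that formula (an illustrative sketch with made-up data, using plain NumPy): centering the data first drives \bar{x} to zero but leaves the covariance matrix unchanged.

```
>>> import numpy as np
>>> X = np.random.RandomState(42).rand(10, 4)
>>> xbar = X.mean(axis=0)
>>> # CovMat = (1/n) * sum_i (x_i - xbar)(x_i - xbar)^T
>>> cov_raw = (X - xbar).T.dot(X - xbar) / X.shape[0]
>>> # center first: the new mean is (numerically) zero, the covariance is the same
>>> Xc = X - xbar
>>> cov_centered = (Xc - Xc.mean(axis=0)).T.dot(Xc - Xc.mean(axis=0)) / Xc.shape[0]
>>> np.allclose(cov_raw, cov_centered)
```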
From se.raschka at gmail.com Mon Oct 16 15:29:13 2017
From: se.raschka at gmail.com (Sebastian Raschka)
Date: Mon, 16 Oct 2017 15:29:13 -0400
Subject: [scikit-learn] 1. Re: unclear help file for sklearn.decomposition.pca
In-Reply-To: <033391A5-C191-4DFF-B80D-8E3C7AF3A74D@gmail.com>
References: <033391A5-C191-4DFF-B80D-8E3C7AF3A74D@gmail.com>
Message-ID:

Oh, never mind my previous email: while the components themselves should be the same, the projection of the data points onto those components would still be affected by centering vs. non-centering, I guess.

Best,
Sebastian
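To make that concrete, a small sketch (illustrative only): projecting the raw, uncentered data onto the fitted components does not reproduce PCA.transform; the mean has to be subtracted first.

```
>>> import numpy as np
>>> from sklearn.decomposition import PCA
>>> X = np.random.RandomState(42).rand(10, 4)
>>> pca = PCA(n_components=2, svd_solver='full').fit(X)
>>> # projecting the centered data reproduces pca.transform(X)
>>> np.allclose((X - X.mean(axis=0)).dot(pca.components_.T), pca.transform(X))
>>> # projecting the raw data gives scores shifted by the projection of the mean
>>> np.allclose(X.dot(pca.components_.T), pca.transform(X))
```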
From lemhadri at stanford.edu Mon Oct 16 15:29:11 2017
From: lemhadri at stanford.edu (Ismael Lemhadri)
Date: Mon, 16 Oct 2017 19:29:11 +0000
Subject: [scikit-learn] Unclear help file about sklearn.decomposition.pca
In-Reply-To:
References:
Message-ID:

Thank you all for your feedback.

The initial problem I came to the list with wasn't the definition of PCA but what the sklearn method actually does. In practice I would always make sure the data is both centered and scaled before performing PCA. This is the recommended approach because, without scaling, the feature with the largest scale could wrongly seem to explain a huge fraction of the variance.

So my point was simply to clarify, in the help file and the user guide, precisely what the PCA class does, so that no ambiguity is left for the reader.

Moving forward, I have now submitted a pull request on GitHub, as initially suggested by Roman on this thread.
Best,
Ismael
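As an illustration of that scaling caveat (a sketch with synthetic data; the numbers are hypothetical, not from the thread): when one feature lives on a much larger scale, it dominates the first component unless the data is standardized first.

```
>>> import numpy as np
>>> from sklearn.decomposition import PCA
>>> from sklearn.preprocessing import scale
>>> rng = np.random.RandomState(0)
>>> X = rng.rand(100, 3)
>>> X[:, 0] *= 1000   # put one feature on a much larger scale
>>> # without scaling, the large-scale feature dominates the first component
>>> PCA(n_components=3).fit(X).explained_variance_ratio_
>>> # after standardizing, the explained variance is spread more evenly
>>> PCA(n_components=3).fit(scale(X)).explained_variance_ratio_
```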
From rth.yurchak at gmail.com  Mon Oct 16 15:38:17 2017
From: rth.yurchak at gmail.com (Roman Yurchak)
Date: Mon, 16 Oct 2017 21:38:17 +0200
Subject: [scikit-learn] 1. Re: unclear help file for sklearn.decomposition.pca
In-Reply-To:
References: <033391A5-C191-4DFF-B80D-8E3C7AF3A74D@gmail.com>
Message-ID: <780c15c5-8405-2d87-1d04-b5e124184a8d@gmail.com>

It might be useful to have some of these comments in the docs.

Currently the PCA docstring only states that PCA is computed with SVD and then goes on discussing randomized SVD solvers. The user guide is not more helpful on this subject either; Ismael opened a documentation PR on it in
https://github.com/scikit-learn/scikit-learn/pull/9934

--
Roman

On 16/10/17 21:29, Sebastian Raschka wrote:
> Oh, never mind my previous email: while the components should be the same, the projection of the data points onto those components would still be affected by centering vs non-centering, I guess.
>
> Best,
> Sebastian
>
>> On Oct 16, 2017, at 3:25 PM, Sebastian Raschka wrote:
>>
>> Hi,
>>
>> if you compute the principal components (i.e., the eigendecomposition) from the covariance matrix, it shouldn't matter whether the data is centered or not, since the covariance matrix is computed as
>>
>>     Cov = \frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x}) (x_i - \bar{x})^T
>>
>> where \bar{x} is the vector of feature means.
>>
>> So, if you center the data prior to computing the covariance matrix, \bar{x} is simply 0.
>>
>> Best,
>> Sebastian
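A quick numerical check of the quoted point (a minimal sketch; the random data is only illustrative):

```
import numpy as np

rng = np.random.RandomState(42)
X = rng.rand(10, 4)
Xc = X - X.mean(axis=0)

# np.cov subtracts the column means internally, so pre-centering changes nothing ...
C_raw = np.cov(X, rowvar=False)
C_centered = np.cov(Xc, rowvar=False)
print(np.allclose(C_raw, C_centered))          # True

# ... and the eigenvectors (principal directions) are identical.
_, vecs = np.linalg.eigh(C_raw)

# The projections, however, differ unless the mean is subtracted first.
print(np.allclose(X.dot(vecs), Xc.dot(vecs)))  # False
```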
From t3kcit at gmail.com  Tue Oct 17 11:40:40 2017
From: t3kcit at gmail.com (Andreas Mueller)
Date: Tue, 17 Oct 2017 11:40:40 -0400
Subject: [scikit-learn] Unclear help file about sklearn.decomposition.pca
In-Reply-To:
References:
Message-ID: <7a74179e-1e52-afc9-05ba-68f41868e2b5@gmail.com>

In general scikit-learn avoids automatic preprocessing.
That's a convention to give the user more control and to decrease surprising behavior (ostensibly).
So scikit-learn will usually do what the algorithm is supposed to do, and nothing more.

I'm not sure what the best way to document this is, as this has come up with different models.
For example, the R wrapper of libsvm does automatic scaling, while we just apply the SVM to the data as given.

We could add "this model does not do any automatic preprocessing" to all docstrings, but that seems a bit redundant.
We could add it to https://github.com/scikit-learn/scikit-learn/pull/9517, but that is probably not where you would have looked.

Other suggestions welcome.

On 10/16/2017 03:29 PM, Ismael Lemhadri wrote:
> Thank you all for your feedback.
> The initial problem I raised wasn't the definition of PCA but what the sklearn method actually does. In practice I would always make sure the data is both centered and scaled before performing PCA, because without scaling the direction with the largest raw variance can wrongly appear to explain a huge fraction of the total variance.
> So my point was simply to clarify in the help file and the user guide what the PCA class does precisely, so that there is no ambiguity for the reader. Moving forward, I have now submitted a pull request on GitHub, as initially suggested by Roman on this thread.
> Best,
> Ismael
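To make the convention concrete, here is a minimal sketch (the data and numbers are made up purely for illustration): SVC is applied to the features exactly as given, and scaling is something the user adds explicitly, e.g. with a Pipeline.

```
import numpy as np
from sklearn.svm import SVC
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline
from sklearn.model_selection import train_test_split

rng = np.random.RandomState(0)
# Two equally informative features, the second on a much larger scale.
X = np.c_[rng.randn(200), 100 * rng.randn(200)]
y = (X[:, 0] + X[:, 1] / 100 > 0).astype(int)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# No automatic preprocessing: the SVM sees the raw, badly scaled features.
raw_svm = SVC().fit(X_train, y_train)

# Scaling is opted into explicitly by the user.
scaled_svm = make_pipeline(StandardScaler(), SVC()).fit(X_train, y_train)

print(raw_svm.score(X_test, y_test), scaled_svm.score(X_test, y_test))
```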
From drraph at gmail.com  Tue Oct 17 11:44:55 2017
From: drraph at gmail.com (Raphael C)
Date: Tue, 17 Oct 2017 16:44:55 +0100
Subject: [scikit-learn] Unclear help file about sklearn.decomposition.pca
In-Reply-To: <7a74179e-1e52-afc9-05ba-68f41868e2b5@gmail.com>
References: <7a74179e-1e52-afc9-05ba-68f41868e2b5@gmail.com>
Message-ID:

How about including the scaling that people might want to use in the User Guide examples?

Raphael

On 17 October 2017 at 16:40, Andreas Mueller wrote:
> In general scikit-learn avoids automatic preprocessing.
> That's a convention to give the user more control and to decrease surprising behavior (ostensibly).
> So scikit-learn will usually do what the algorithm is supposed to do, and nothing more.
>
> I'm not sure what the best way to document this is, as this has come up with different models.
> For example, the R wrapper of libsvm does automatic scaling, while we just apply the SVM to the data as given.
>
> We could add "this model does not do any automatic preprocessing" to all docstrings, but that seems a bit redundant.
> We could add it to https://github.com/scikit-learn/scikit-learn/pull/9517, but that is probably not where you would have looked.
>
> Other suggestions welcome.
From lemhadri at stanford.edu  Tue Oct 17 23:18:27 2017
From: lemhadri at stanford.edu (Ismael Lemhadri)
Date: Tue, 17 Oct 2017 20:18:27 -0700
Subject: [scikit-learn] scikit-learn Digest, Vol 19, Issue 37
In-Reply-To:
References:
Message-ID:

How about editing the various chunks of code concerned to add the option to
scale the parameters, and set it by default to NOT scale? This would make
what happens clear without the redundancy Andreas mentioned, and would add
more convenience to the user should they want to scale their data.
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
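To make the proposal concrete, a purely hypothetical sketch follows. scikit-learn's PCA has no scale parameter; the class name ScaledPCA, the scale keyword, and the choice to override only fit_transform are illustrative assumptions about what an opt-in, off-by-default option could look like, not an existing or planned API.

```
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler


class ScaledPCA(PCA):
    """Hypothetical sketch: PCA with an optional, off-by-default scaling step."""

    def __init__(self, n_components=None, scale=False):
        super().__init__(n_components=n_components)
        self.scale = scale  # default False keeps the current behaviour

    def fit_transform(self, X, y=None):
        # Only fit_transform is shown; a real implementation would also cover
        # fit/transform and store the fitted scaler as an attribute.
        if self.scale:
            X = StandardScaler().fit_transform(X)
        return super().fit_transform(X, y)


X = np.random.RandomState(0).rand(10, 4)
Z_default = ScaledPCA(n_components=2).fit_transform(X)             # unchanged default
Z_scaled = ScaledPCA(n_components=2, scale=True).fit_transform(X)  # opt-in scaling
```

Whether such a switch belongs on every estimator, or whether an explicit Pipeline step (as in the sketch above) is the clearer place for scaling, is exactly the trade-off being discussed in this thread.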
From jbbrown at kuhp.kyoto-u.ac.jp  Tue Oct 17 23:30:27 2017
From: jbbrown at kuhp.kyoto-u.ac.jp (Brown J.B.)
Date: Wed, 18 Oct 2017 12:30:27 +0900
Subject: [scikit-learn] scikit-learn Digest, Vol 19, Issue 37
In-Reply-To:
References:
Message-ID:

2017-10-18 12:18 GMT+09:00 Ismael Lemhadri :

> How about editing the various chunks of code concerned to add the option
> to scale the parameters, and set it by default to NOT scale? This would
> make what happens clear without the redundancy Andreas mentioned, and
> would add more convenience to the user should they want to scale their
> data.

From my perspective: That's a very nice, rational idea. For end users, it
preserves compatibility of existing codebases, but allows both
near-effortless updating of code for those who want to use Scikit-learn's
scaling and ease of application for new users and tools.

One issue of caution would be where the scaling occurs: globally before any
cross-validation, or per split, with the transformation stored and applied
to the prediction data in each fold of CV. One more keyword argument would
be needed to let the user specify this, and a state variable would have to
be stored and accessible from the methods of the parent estimator.

J.B.

> > >> Today's Topics: >> >> 1. Re: Unclear help file about sklearn.decomposition.pca (Raphael C) >> >> >> ---------------------------------------------------------------------- >> >> Message: 1 >> Date: Tue, 17 Oct 2017 16:44:55 +0100 >> From: Raphael C >> To: Scikit-learn mailing list >> Subject: Re: [scikit-learn] Unclear help file about >> sklearn.decomposition.pca >> Message-ID: >> > ail.com> >> Content-Type: text/plain; charset="UTF-8" >> >> How about including the scaling that people might want to use in the >> User Guide examples? >> >> Raphael >> >> On 17 October 2017 at 16:40, Andreas Mueller wrote: >> > In general scikit-learn avoids automatic preprocessing. >> > That's a convention to give the user more control and decrease >> surprising >> > behavior (ostensibly). >> > So scikit-learn will usually do what the algorithm is supposed to do, >> and >> > nothing more.
>> > >> > I'm not sure what the best way do document this is, as this has come up >> with >> > different models. >> > For example the R wrapper of libsvm does automatic scaling, while we >> apply >> > the SVM. >> > >> > We could add "this model does not do any automatic preprocessing" to all >> > docstrings, but that seems >> > a bit redundant. We could add it to >> > https://github.com/scikit-learn/scikit-learn/pull/9517, but >> > that is probably not where you would have looked. >> > >> > Other suggestions welcome. >> > >> > >> > On 10/16/2017 03:29 PM, Ismael Lemhadri wrote: >> > >> > Thank you all for your feedback. >> > The initial problem I came with wasnt the definition of PCA but what the >> > sklearn method does. In practice I would always make sure the data is >> both >> > centered and scaled before performing PCA. This is the recommended >> method >> > because without scaling, the biggest direction could wrongly seem to >> explain >> > a huge fraction of the variance. >> > So my point was simply to clarify in the help file and the user guide >> what >> > the PCA class does precisely to leave no unclarity to the reader. Moving >> > forward I have now submitted a pull request on github as initially >> suggested >> > by Roman on this thread. >> > Best, >> > Ismael >> > >> > On Mon, 16 Oct 2017 at 11:49 AM, >> wrote: >> >> >> >> Send scikit-learn mailing list submissions to >> >> scikit-learn at python.org >> >> >> >> To subscribe or unsubscribe via the World Wide Web, visit >> >> https://mail.python.org/mailman/listinfo/scikit-learn >> >> or, via email, send a message with subject or body 'help' to >> >> scikit-learn-request at python.org >> >> >> >> You can reach the person managing the list at >> >> scikit-learn-owner at python.org >> >> >> >> When replying, please edit your Subject line so it is more specific >> >> than "Re: Contents of scikit-learn digest..." >> >> >> >> >> >> Today's Topics: >> >> >> >> 1. Re: 1. Re: unclear help file for sklearn.decomposition.pca >> >> (Andreas Mueller) >> >> 2. Re: 1. Re: unclear help file for sklearn.decomposition.pca >> >> (Oliver Tomic) >> >> >> >> >> >> ---------------------------------------------------------------------- >> >> >> >> Message: 1 >> >> Date: Mon, 16 Oct 2017 14:44:51 -0400 >> >> From: Andreas Mueller >> >> To: scikit-learn at python.org >> >> Subject: Re: [scikit-learn] 1. Re: unclear help file for >> >> sklearn.decomposition.pca >> >> Message-ID: <35142868-fce9-6cb3-eba3-015a0b106163 at gmail.com> >> >> Content-Type: text/plain; charset="utf-8"; Format="flowed" >> >> >> >> >> >> >> >> On 10/16/2017 02:27 PM, Ismael Lemhadri wrote: >> >> > @Andreas Muller: >> >> > My references do not assume centering, e.g. >> >> > http://ufldl.stanford.edu/wiki/index.php/PCA >> >> > any reference? >> >> > >> >> It kinda does but is not very clear about it: >> >> >> >> This data has already been pre-processed so that each of the >> >> features\textstyle x_1and\textstyle x_2have about the same mean (zero) >> >> and variance. >> >> >> >> >> >> >> >> Wikipedia is much clearer: >> >> Consider a datamatrix >> >> ,*X*, with >> >> column-wise zeroempirical mean >> >> (the sample mean of each >> >> column has been shifted to zero), where each of the/n/rows represents a >> >> different repetition of the experiment, and each of the/p/columns gives >> >> a particular kind of feature (say, the results from a particular >> sensor). 
>> >> https://en.wikipedia.org/wiki/Principal_component_analysis#Details >> >> >> >> I'm a bit surprised to find that ESL says "The SVD of the centered >> >> matrix X is another way of expressing the principal components of the >> >> variables in X", >> >> so they assume scaling? They don't really have a great treatment of >> PCA, >> >> though. >> >> >> >> Bishop and Murphy >> >> are pretty clear >> >> that they subtract the mean (or assume zero mean) but don't >> standardize. >> >> -------------- next part -------------- >> >> An HTML attachment was scrubbed... >> >> URL: >> >> > 20171016/81b3014b/attachment-0001.html> >> >> >> >> ------------------------------ >> >> >> >> Message: 2 >> >> Date: Mon, 16 Oct 2017 20:48:29 +0200 >> >> From: Oliver Tomic >> >> To: "Scikit-learn mailing list" >> >> Cc: >> >> Subject: Re: [scikit-learn] 1. Re: unclear help file for >> >> sklearn.decomposition.pca >> >> Message-ID: <15f26840d65.e97b33c25239.3934951873824890747 at zoho.com> >> >> Content-Type: text/plain; charset="utf-8" >> >> >> >> Dear Ismael, >> >> >> >> >> >> >> >> PCA should always involve at the least centering, or, if the variables >> are >> >> to contribute equally, scaling. Here is a reference from the >> scientific area >> >> named "chemometrics". In Chemometrics PCA used not only for >> dimensionality >> >> reduction, but also for interpretation of variance by use of scores, >> >> loadings, correlation loadings, etc. >> >> >> >> >> >> >> >> If you scroll down to subsection "Preprocessing" you will find more >> info >> >> on centering and scaling. >> >> >> >> >> >> http://pubs.rsc.org/en/content/articlehtml/2014/ay/c3ay41907j >> >> >> >> >> >> >> >> best >> >> >> >> Oliver >> >> >> >> >> >> >> >> >> >> ---- On Mon, 16 Oct 2017 20:27:11 +0200 Ismael Lemhadri >> >> <lemhadri at stanford.edu> wrote ---- >> >> >> >> >> >> >> >> >> >> @Andreas Muller: >> >> >> >> My references do not assume centering, e.g. >> >> http://ufldl.stanford.edu/wiki/index.php/PCA >> >> >> >> any reference? >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> On Mon, Oct 16, 2017 at 10:20 AM, <scikit-learn-request at python.org >> > >> >> wrote: >> >> >> >> Send scikit-learn mailing list submissions to >> >> >> >> scikit-learn at python.org >> >> >> >> >> >> >> >> To subscribe or unsubscribe via the World Wide Web, visit >> >> >> >> https://mail.python.org/mailman/listinfo/scikit-learn >> >> >> >> or, via email, send a message with subject or body 'help' to >> >> >> >> scikit-learn-request at python.org >> >> >> >> >> >> >> >> You can reach the person managing the list at >> >> >> >> scikit-learn-owner at python.org >> >> >> >> >> >> >> >> When replying, please edit your Subject line so it is more specific >> >> >> >> than "Re: Contents of scikit-learn digest..." >> >> >> >> >> >> >> >> >> >> >> >> Today's Topics: >> >> >> >> >> >> >> >> 1. 
Re: unclear help file for sklearn.decomposition.pca >> >> >> >> (Andreas Mueller) >> >> >> >> >> >> >> >> >> >> >> >> ------------------------------------------------------------ >> ---------- >> >> >> >> >> >> >> >> Message: 1 >> >> >> >> Date: Mon, 16 Oct 2017 13:19:57 -0400 >> >> >> >> From: Andreas Mueller <t3kcit at gmail.com> >> >> >> >> To: scikit-learn at python.org >> >> >> >> Subject: Re: [scikit-learn] unclear help file for >> >> >> >> sklearn.decomposition.pca >> >> >> >> Message-ID: <04fc445c-d8f3-a3a9-4ab2-0535826a2d03 at gmail.com> >> >> >> >> Content-Type: text/plain; charset="utf-8"; Format="flowed" >> >> >> >> >> >> >> >> The definition of PCA has a centering step, but no scaling step. >> >> >> >> >> >> >> >> On 10/16/2017 11:16 AM, Ismael Lemhadri wrote: >> >> >> >> > Dear Roman, >> >> >> >> > My concern is actually not about not mentioning the scaling but >> >> about >> >> >> >> > not mentioning the centering. >> >> >> >> > That is, the sklearn PCA removes the mean but it does not >> mention it >> >> >> >> > in the help file. >> >> >> >> > This was quite messy for me to debug as I expected it to either: >> 1/ >> >> >> >> > center and scale simultaneously or / not scale and not center >> >> either. >> >> >> >> > It would be beneficial to explicit the behavior in the help file >> in >> >> my >> >> >> >> > opinion. >> >> >> >> > Ismael >> >> >> >> > >> >> >> >> > On Mon, Oct 16, 2017 at 8:02 AM, <scikit-learn-request at pytho >> n.org >> >> >> >> > <mailto:scikit-learn-request at python.org>> wrote: >> >> >> >> > >> >> >> >> > Send scikit-learn mailing list submissions to >> >> >> >> > scikit-learn at python.org <mailto:scikit-learn at python.org >> > >> >> >> >> > >> >> >> >> > To subscribe or unsubscribe via the World Wide Web, visit >> >> >> >> > https://mail.python.org/mailman/listinfo/scikit-learn >> >> >> >> > <https://mail.python.org/mailman/listinfo/scikit-learn> >> ; >> >> >> >> > or, via email, send a message with subject or body 'help' to >> >> >> >> > scikit-learn-request at python.org >> >> >> >> > <mailto:scikit-learn-request at python.org> >> >> >> >> > >> >> >> >> > You can reach the person managing the list at >> >> >> >> > scikit-learn-owner at python.org >> >> <mailto:scikit-learn-owner at python.org> >> >> >> >> > >> >> >> >> > When replying, please edit your Subject line so it is more >> >> specific >> >> >> >> > than "Re: Contents of scikit-learn digest..." >> >> >> >> > >> >> >> >> > >> >> >> >> > Today's Topics: >> >> >> >> > >> >> >> >> > ? ?1. unclear help file for sklearn.decomposition.pca (Ismael >> >> >> >> > Lemhadri) >> >> >> >> > ? ?2. Re: unclear help file for sklearn.decomposition.pca >> >> >> >> > ? ? ? (Roman Yurchak) >> >> >> >> > ? ?3. Question about LDA's coef_ attribute (Serafeim Loukas) >> >> >> >> > ? ?4. Re: Question about LDA's coef_ attribute (Alexandre >> >> Gramfort) >> >> >> >> > ? ?5. Re: Question about LDA's coef_ attribute (Serafeim >> Loukas) >> >> >> >> > >> >> >> >> > >> >> >> >> > >> >> ---------------------------------------------------------------------- >> >> >> >> > >> >> >> >> > Message: 1 >> >> >> >> > Date: Sun, 15 Oct 2017 18:42:56 -0700 >> >> >> >> > From: Ismael Lemhadri <lemhadri at stanford.edu >> >> >> >> > <mailto:lemhadri at stanford.edu>> >> >> >> >> > To: scikit-learn at python.org >> >> <mailto:scikit-learn at python.org> >> >> >> >> > Subject: [scikit-learn] unclear help file for >> >> >> >> > ? ? ? ? sklearn.decomposition.pca >> >> >> >> > Message-ID: >> >> >> >> > ? ? ? ? 
>> >> >> >> > >> >> <CANpSPFTgv+Oz7f97dandmrBBayqf_o9w=18oKHCFN0u5DNzj+g at mail.gmail.com >> >> >> >> > <mailto:18oKHCFN0u5DNzj%2Bg at mail.gmail.com>> >> >> >> >> > Content-Type: text/plain; charset="utf-8" >> >> >> >> > >> >> >> >> > Dear all, >> >> >> >> > The help file for the PCA class is unclear about the >> >> preprocessing >> >> >> >> > performed to the data. >> >> >> >> > You can check on line 410 here: >> >> >> >> > >> >> https://github.com/scikit-learn/scikit-learn/blob/ef5cb84a/sklearn/ >> >> >> >> > decomposition/pca.py#L410 >> >> >> >> > >> >> <https://github.com/scikit-learn/scikit-learn/blob/ef5cb8 >> 4a/sklearn/%0Adecomposition/pca.py#L410> >> >> >> >> > that the matrix is centered but NOT scaled, before performing >> >> the >> >> >> >> > singular >> >> >> >> > value decomposition. >> >> >> >> > However, the help files do not make any mention of it. >> >> >> >> > This is unclear for someone who, like me, just wanted to >> compare >> >> >> >> > that the >> >> >> >> > PCA and np.linalg.svd give the same results. In academic >> >> settings, >> >> >> >> > students >> >> >> >> > are often asked to compare different methods and to check >> that >> >> >> >> > they yield >> >> >> >> > the same results. I expect that many students have confronted >> >> this >> >> >> >> > problem >> >> >> >> > before... >> >> >> >> > Best, >> >> >> >> > Ismael Lemhadri >> >> >> >> > -------------- next part -------------- >> >> >> >> > An HTML attachment was scrubbed... >> >> >> >> > URL: >> >> >> >> > >> >> <http://mail.python.org/pipermail/scikit-learn/attachment >> s/20171015/c465bde7/attachment-0001.html >> >> >> >> > >> >> <http://mail.python.org/pipermail/scikit-learn/attachment >> s/20171015/c465bde7/attachment-0001.html>> >> >> >> >> > >> >> >> >> > ------------------------------ >> >> >> >> > >> >> >> >> > Message: 2 >> >> >> >> > Date: Mon, 16 Oct 2017 15:16:45 +0200 >> >> >> >> > From: Roman Yurchak <rth.yurchak at gmail.com >> >> >> >> > <mailto:rth.yurchak at gmail.com>> >> >> >> >> > To: Scikit-learn mailing list <scikit-learn at python.org >> >> >> >> > <mailto:scikit-learn at python.org>> >> >> >> >> > Subject: Re: [scikit-learn] unclear help file for >> >> >> >> > ? ? ? ? sklearn.decomposition.pca >> >> >> >> > Message-ID: <b2abdcfd-4736-929e-6304-b9 >> 3832932043 at gmail.com >> >> >> >> > >> >> <mailto:b2abdcfd-4736-929e-6304-b93832932043 at gmail.com>> >> >> >> >> > Content-Type: text/plain; charset=utf-8; format=flowed >> >> >> >> > >> >> >> >> > Ismael, >> >> >> >> > >> >> >> >> > as far as I saw the sklearn.decomposition.PCA doesn't mention >> >> >> >> > scaling at >> >> >> >> > all (except for the whiten parameter which is >> >> post-transformation >> >> >> >> > scaling). >> >> >> >> > >> >> >> >> > So since it doesn't mention it, it makes sense that it >> doesn't >> >> do any >> >> >> >> > scaling of the input. Same as np.linalg.svd. 
>> >> >> >> > >> >> >> >> > You can verify that PCA and np.linalg.svd yield the same >> >> results, with >> >> >> >> > >> >> >> >> > ``` >> >> >> >> > ?>>> import numpy as np >> >> >> >> > ?>>> from sklearn.decomposition import PCA >> >> >> >> > ?>>> import numpy.linalg >> >> >> >> > ?>>> X = np.random.RandomState(42).rand(10, 4) >> >> >> >> > ?>>> n_components = 2 >> >> >> >> > ?>>> PCA(n_components, >> >> svd_solver='full').fit_transform(X) >> >> >> >> > ``` >> >> >> >> > >> >> >> >> > and >> >> >> >> > >> >> >> >> > ``` >> >> >> >> > ?>>> U, s, V = np.linalg.svd(X - X.mean(axis=0), >> >> full_matrices=False) >> >> >> >> > ?>>> (X - X.mean(axis=0)).dot(V[:n_components].T) >> >> >> >> > ``` >> >> >> >> > >> >> >> >> > -- >> >> >> >> > Roman >> >> >> >> > >> >> >> >> > On 16/10/17 03:42, Ismael Lemhadri wrote: >> >> >> >> > > Dear all, >> >> >> >> > > The help file for the PCA class is unclear about the >> >> preprocessing >> >> >> >> > > performed to the data. >> >> >> >> > > You can check on line 410 here: >> >> >> >> > > >> >> >> >> > >> >> https://github.com/scikit-learn/scikit-learn/blob/ef5cb84a/ >> sklearn/decomposition/pca.py#L410 >> >> >> >> > >> >> <https://github.com/scikit-learn/scikit-learn/blob/ef5cb8 >> 4a/sklearn/decomposition/pca.py#L410> >> >> >> >> > > >> >> >> >> > >> >> <https://github.com/scikit-learn/scikit-learn/blob/ef5cb8 >> 4a/sklearn/decomposition/pca.py#L410 >> >> >> >> > >> >> <https://github.com/scikit-learn/scikit-learn/blob/ef5cb8 >> 4a/sklearn/decomposition/pca.py#L410>> >> >> >> >> > > that the matrix is centered but NOT scaled, before >> >> performing the >> >> >> >> > > singular value decomposition. >> >> >> >> > > However, the help files do not make any mention of it. >> >> >> >> > > This is unclear for someone who, like me, just wanted to >> >> compare >> >> >> >> > that >> >> >> >> > > the PCA and np.linalg.svd give the same results. In >> >> academic >> >> >> >> > settings, >> >> >> >> > > students are often asked to compare different methods >> and >> >> to >> >> >> >> > check that >> >> >> >> > > they yield the same results. I expect that many students >> >> have >> >> >> >> > confronted >> >> >> >> > > this problem before... >> >> >> >> > > Best, >> >> >> >> > > Ismael Lemhadri >> >> >> >> > > >> >> >> >> > > >> >> >> >> > > _______________________________________________ >> >> >> >> > > scikit-learn mailing list >> >> >> >> > > scikit-learn at python.org >> >> <mailto:scikit-learn at python.org> >> >> >> >> > > https://mail.python.org/mailman/listinfo/scikit-learn >> >> >> >> > <https://mail.python.org/mailman/listinfo/scikit-learn> >> ; >> >> >> >> > > >> >> >> >> > >> >> >> >> > >> >> >> >> > >> >> >> >> > ------------------------------ >> >> >> >> > >> >> >> >> > Message: 3 >> >> >> >> > Date: Mon, 16 Oct 2017 15:27:48 +0200 >> >> >> >> > From: Serafeim Loukas <seralouk at gmail.com >> >> <mailto:seralouk at gmail.com>> >> >> >> >> > To: scikit-learn at python.org >> >> <mailto:scikit-learn at python.org> >> >> >> >> > Subject: [scikit-learn] Question about LDA's coef_ attribute >> >> >> >> > Message-ID: <58C6D0DA-9DE5-4EF5-97C1-48 >> 159831F5A9 at gmail.com >> >> >> >> > >> >> <mailto:58C6D0DA-9DE5-4EF5-97C1-48159831F5A9 at gmail.com>> >> >> >> >> > Content-Type: text/plain; charset="us-ascii" >> >> >> >> > >> >> >> >> > Dear Scikit-learn community, >> >> >> >> > >> >> >> >> > Since the documentation of the LDA >> >> >> >> > >> >> (http://scikit-learn.org/stable/modules/generated/sklearn. 
>> discriminant_analysis.LinearDiscriminantAnalysis.html >> >> >> >> > >> >> <http://scikit-learn.org/stable/modules/generated/sklearn >> .discriminant_analysis.LinearDiscriminantAnalysis.html> >> >> >> >> > >> >> <http://scikit-learn.org/stable/modules/generated/sklearn >> .discriminant_analysis.LinearDiscriminantAnalysis.html >> >> >> >> > >> >> <http://scikit-learn.org/stable/modules/generated/sklearn >> .discriminant_analysis.LinearDiscriminantAnalysis.html>>) >> >> >> >> > is not so clear, I would like to ask if the lda.coef_ >> attribute >> >> >> >> > stores the eigenvectors from the SVD decomposition. >> >> >> >> > >> >> >> >> > Thank you in advance, >> >> >> >> > Serafeim >> >> >> >> > -------------- next part -------------- >> >> >> >> > An HTML attachment was scrubbed... >> >> >> >> > URL: >> >> >> >> > >> >> <http://mail.python.org/pipermail/scikit-learn/attachment >> s/20171016/4263df5c/attachment-0001.html >> >> >> >> > >> >> <http://mail.python.org/pipermail/scikit-learn/attachment >> s/20171016/4263df5c/attachment-0001.html>> >> >> >> >> > >> >> >> >> > ------------------------------ >> >> >> >> > >> >> >> >> > Message: 4 >> >> >> >> > Date: Mon, 16 Oct 2017 16:57:52 +0200 >> >> >> >> > From: Alexandre Gramfort <alexandre.gramfort at inria.fr >> >> >> >> > <mailto:alexandre.gramfort at inria.fr>> >> >> >> >> > To: Scikit-learn mailing list <scikit-learn at python.org >> >> >> >> > <mailto:scikit-learn at python.org>> >> >> >> >> > Subject: Re: [scikit-learn] Question about LDA's coef_ >> attribute >> >> >> >> > Message-ID: >> >> >> >> > ? ? ? ? >> >> >> >> > >> >> <CADeotZricOQhuHJMmW2Z14cqffEQyndYoxn-OgKAvTMQ7V0Y2g at mail.gmail.com >> >> >> >> > >> >> <mailto:CADeotZricOQhuHJMmW2Z14cqffEQyndYoxn-OgKAvTMQ7V0Y >> 2g at mail.gmail.com>> >> >> >> >> > Content-Type: text/plain; charset="UTF-8" >> >> >> >> > >> >> >> >> > no it stores the direction of the decision function to match >> the >> >> >> >> > API of >> >> >> >> > linear models. >> >> >> >> > >> >> >> >> > HTH >> >> >> >> > Alex >> >> >> >> > >> >> >> >> > On Mon, Oct 16, 2017 at 3:27 PM, Serafeim Loukas >> >> >> >> > <seralouk at gmail.com <mailto:seralouk at gmail.com>> >> >> wrote: >> >> >> >> > > Dear Scikit-learn community, >> >> >> >> > > >> >> >> >> > > Since the documentation of the LDA >> >> >> >> > > >> >> >> >> > >> >> (http://scikit-learn.org/stable/modules/generated/sklearn. >> discriminant_analysis.LinearDiscriminantAnalysis.html >> >> >> >> > >> >> <http://scikit-learn.org/stable/modules/generated/sklearn >> .discriminant_analysis.LinearDiscriminantAnalysis.html>) >> >> >> >> > > is not so clear, I would like to ask if the lda.coef_ >> >> attribute >> >> >> >> > stores the >> >> >> >> > > eigenvectors from the SVD decomposition. 
>> >> >> >> > > >> >> >> >> > > Thank you in advance, >> >> >> >> > > Serafeim >> >> >> >> > > >> >> >> >> > > _______________________________________________ >> >> >> >> > > scikit-learn mailing list >> >> >> >> > > scikit-learn at python.org >> >> <mailto:scikit-learn at python.org> >> >> >> >> > > https://mail.python.org/mailman/listinfo/scikit-learn >> >> >> >> > <https://mail.python.org/mailman/listinfo/scikit-learn> >> ; >> >> >> >> > > >> >> >> >> > >> >> >> >> > >> >> >> >> > ------------------------------ >> >> >> >> > >> >> >> >> > Message: 5 >> >> >> >> > Date: Mon, 16 Oct 2017 17:02:46 +0200 >> >> >> >> > From: Serafeim Loukas <seralouk at gmail.com >> >> <mailto:seralouk at gmail.com>> >> >> >> >> > To: Scikit-learn mailing list <scikit-learn at python.org >> >> >> >> > <mailto:scikit-learn at python.org>> >> >> >> >> > Subject: Re: [scikit-learn] Question about LDA's coef_ >> attribute >> >> >> >> > Message-ID: <413210D2-56AE-41A4-873F-D1 >> 71BB36539D at gmail.com >> >> >> >> > >> >> <mailto:413210D2-56AE-41A4-873F-D171BB36539D at gmail.com>> >> >> >> >> > Content-Type: text/plain; charset="us-ascii" >> >> >> >> > >> >> >> >> > Dear Alex, >> >> >> >> > >> >> >> >> > Thank you for the prompt response. >> >> >> >> > >> >> >> >> > Are the eigenvectors stored in some variable ? >> >> >> >> > Does the lda.scalings_ attribute contain the eigenvectors ? >> >> >> >> > >> >> >> >> > Best, >> >> >> >> > Serafeim >> >> >> >> > >> >> >> >> > > On 16 Oct 2017, at 16:57, Alexandre Gramfort >> >> >> >> > <alexandre.gramfort at inria.fr >> >> <mailto:alexandre.gramfort at inria.fr>> >> >> >> >> > wrote: >> >> >> >> > > >> >> >> >> > > no it stores the direction of the decision function to >> >> match the >> >> >> >> > API of >> >> >> >> > > linear models. >> >> >> >> > > >> >> >> >> > > HTH >> >> >> >> > > Alex >> >> >> >> > > >> >> >> >> > > On Mon, Oct 16, 2017 at 3:27 PM, Serafeim Loukas >> >> >> >> > <seralouk at gmail.com <mailto:seralouk at gmail.com>> >> >> wrote: >> >> >> >> > >> Dear Scikit-learn community, >> >> >> >> > >> >> >> >> >> > >> Since the documentation of the LDA >> >> >> >> > >> >> >> >> >> > >> >> (http://scikit-learn.org/stable/modules/generated/sklearn. >> discriminant_analysis.LinearDiscriminantAnalysis.html >> >> >> >> > >> >> <http://scikit-learn.org/stable/modules/generated/sklearn >> .discriminant_analysis.LinearDiscriminantAnalysis.html>) >> >> >> >> > >> is not so clear, I would like to ask if the >> lda.coef_ >> >> attribute >> >> >> >> > stores the >> >> >> >> > >> eigenvectors from the SVD decomposition. >> >> >> >> > >> >> >> >> >> > >> Thank you in advance, >> >> >> >> > >> Serafeim >> >> >> >> > >> >> >> >> >> > >> _______________________________________________ >> >> >> >> > >> scikit-learn mailing list >> >> >> >> > >> scikit-learn at python.org >> >> <mailto:scikit-learn at python.org> >> >> >> >> > >> https://mail.python.org/mailma >> n/listinfo/scikit-learn >> >> >> >> > <https://mail.python.org/mailman/listinfo/scikit-learn> >> ; >> >> >> >> > >> >> >> >> >> > > _______________________________________________ >> >> >> >> > > scikit-learn mailing list >> >> >> >> > > scikit-learn at python.org >> >> <mailto:scikit-learn at python.org> >> >> >> >> > > https://mail.python.org/mailman/listinfo/scikit-learn >> >> >> >> > <https://mail.python.org/mailman/listinfo/scikit-learn> >> ; >> >> >> >> > >> >> >> >> > -------------- next part -------------- >> >> >> >> > An HTML attachment was scrubbed... 
>> >> >> >> > URL: >> >> >> >> > >> >> <http://mail.python.org/pipermail/scikit-learn/attachment >> s/20171016/505c7da3/attachment.html >> >> >> >> > >> >> <http://mail.python.org/pipermail/scikit-learn/attachment >> s/20171016/505c7da3/attachment.html>> >> >> >> >> > >> >> >> >> > ------------------------------ >> >> >> >> > >> >> >> >> > Subject: Digest Footer >> >> >> >> > >> >> >> >> > _______________________________________________ >> >> >> >> > scikit-learn mailing list >> >> >> >> > scikit-learn at python.org <mailto:scikit-learn at python.org >> > >> >> >> >> > https://mail.python.org/mailman/listinfo/scikit-learn >> >> >> >> > <https://mail.python.org/mailman/listinfo/scikit-learn> >> ; >> >> >> >> > >> >> >> >> > >> >> >> >> > ------------------------------ >> >> >> >> > >> >> >> >> > End of scikit-learn Digest, Vol 19, Issue 25 >> >> >> >> > ******************************************** >> >> >> >> > >> >> >> >> > >> >> >> >> > >> >> >> >> > >> >> >> >> > _______________________________________________ >> >> >> >> > scikit-learn mailing list >> >> >> >> > scikit-learn at python.org >> >> >> >> > https://mail.python.org/mailman/listinfo/scikit-learn >> >> >> >> >> >> >> >> -------------- next part -------------- >> >> >> >> An HTML attachment was scrubbed... >> >> >> >> URL: >> >> <http://mail.python.org/pipermail/scikit-learn/attachment >> s/20171016/f47e63a9/attachment.html> >> >> >> >> >> >> >> >> ------------------------------ >> >> >> >> >> >> >> >> Subject: Digest Footer >> >> >> >> >> >> >> >> _______________________________________________ >> >> >> >> scikit-learn mailing list >> >> >> >> scikit-learn at python.org >> >> >> >> https://mail.python.org/mailman/listinfo/scikit-learn >> >> >> >> >> >> >> >> >> >> >> >> ------------------------------ >> >> >> >> >> >> >> >> End of scikit-learn Digest, Vol 19, Issue 28 >> >> >> >> ******************************************** >> >> >> >> >> >> >> >> >> >> >> >> >> >> _______________________________________________ >> >> >> >> scikit-learn mailing list >> >> >> >> scikit-learn at python.org >> >> >> >> https://mail.python.org/mailman/listinfo/scikit-learn >> >> >> >> >> >> >> >> >> >> >> >> >> >> -------------- next part -------------- >> >> An HTML attachment was scrubbed... 
>> >> URL: >> >> > 20171016/620a9401/attachment.html> >> >> >> >> ------------------------------ >> >> >> >> Subject: Digest Footer >> >> >> >> _______________________________________________ >> >> scikit-learn mailing list >> >> scikit-learn at python.org >> >> https://mail.python.org/mailman/listinfo/scikit-learn >> >> >> >> >> >> ------------------------------ >> >> >> >> End of scikit-learn Digest, Vol 19, Issue 31 >> >> ******************************************** >> > >> > -- >> > >> > Sent from a mobile phone and may contain errors >> > >> > >> > _______________________________________________ >> > scikit-learn mailing list >> > scikit-learn at python.org >> > https://mail.python.org/mailman/listinfo/scikit-learn >> > >> > >> > >> > _______________________________________________ >> > scikit-learn mailing list >> > scikit-learn at python.org >> > https://mail.python.org/mailman/listinfo/scikit-learn >> > >> >> >> ------------------------------ >> >> Subject: Digest Footer >> >> _______________________________________________ >> scikit-learn mailing list >> scikit-learn at python.org >> https://mail.python.org/mailman/listinfo/scikit-learn >> >> >> ------------------------------ >> >> End of scikit-learn Digest, Vol 19, Issue 37 >> ******************************************** >> > > > _______________________________________________ > scikit-learn mailing list > scikit-learn at python.org > https://mail.python.org/mailman/listinfo/scikit-learn > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jbbrown at kuhp.kyoto-u.ac.jp Tue Oct 17 23:30:28 2017 From: jbbrown at kuhp.kyoto-u.ac.jp (Brown J.B.) Date: Wed, 18 Oct 2017 12:30:28 +0900 Subject: [scikit-learn] scikit-learn Digest, Vol 19, Issue 37 In-Reply-To: References: Message-ID: 2017-10-18 12:18 GMT+09:00 Ismael Lemhadri : > How about editing the various chunks of code concerned to add the option > to scale the parameters, and set it by default to NOT scale? This would > make what happens clear without the redundancy Andreas mentioned, and would > add more convenience to the user shall they want to scale their data. > >From my perspectives: That's a very nice, rational idea. For end users, it preserves compatibility of existing codebases, but allows both near-effortless updating of code for those who want to use Scikit-learn's scaling as well as ease of application for new users and tools. One issue of caution would be where the scaling occurs, such as globally before any cross-validation, or per-split with the transformation stored and applied to prediction data per fold of CV. One more keyword argument would need to be added to allow user specification of this, and a state variable would have to be stored and accessible from the methods of the parent estimator. J.B. > > >> Today's Topics: >> >> 1. Re: Unclear help file about sklearn.decomposition.pca (Raphael C) >> >> >> ---------------------------------------------------------------------- >> >> Message: 1 >> Date: Tue, 17 Oct 2017 16:44:55 +0100 >> From: Raphael C >> To: Scikit-learn mailing list >> Subject: Re: [scikit-learn] Unclear help file about >> sklearn.decomposition.pca >> Message-ID: >> > ail.com> >> Content-Type: text/plain; charset="UTF-8" >> >> How about including the scaling that people might want to use in the >> User Guide examples? >> >> Raphael >> >> On 17 October 2017 at 16:40, Andreas Mueller wrote: >> > In general scikit-learn avoids automatic preprocessing. 
>> > That's a convention to give the user more control and decrease >> surprising >> > behavior (ostensibly). >> > So scikit-learn will usually do what the algorithm is supposed to do, >> and >> > nothing more. >> > >> > I'm not sure what the best way do document this is, as this has come up >> with >> > different models. >> > For example the R wrapper of libsvm does automatic scaling, while we >> apply >> > the SVM. >> > >> > We could add "this model does not do any automatic preprocessing" to all >> > docstrings, but that seems >> > a bit redundant. We could add it to >> > https://github.com/scikit-learn/scikit-learn/pull/9517, but >> > that is probably not where you would have looked. >> > >> > Other suggestions welcome. >> > >> > >> > On 10/16/2017 03:29 PM, Ismael Lemhadri wrote: >> > >> > Thank you all for your feedback. >> > The initial problem I came with wasnt the definition of PCA but what the >> > sklearn method does. In practice I would always make sure the data is >> both >> > centered and scaled before performing PCA. This is the recommended >> method >> > because without scaling, the biggest direction could wrongly seem to >> explain >> > a huge fraction of the variance. >> > So my point was simply to clarify in the help file and the user guide >> what >> > the PCA class does precisely to leave no unclarity to the reader. Moving >> > forward I have now submitted a pull request on github as initially >> suggested >> > by Roman on this thread. >> > Best, >> > Ismael >> > >> > On Mon, 16 Oct 2017 at 11:49 AM, >> wrote: >> >> >> >> Send scikit-learn mailing list submissions to >> >> scikit-learn at python.org >> >> >> >> To subscribe or unsubscribe via the World Wide Web, visit >> >> https://mail.python.org/mailman/listinfo/scikit-learn >> >> or, via email, send a message with subject or body 'help' to >> >> scikit-learn-request at python.org >> >> >> >> You can reach the person managing the list at >> >> scikit-learn-owner at python.org >> >> >> >> When replying, please edit your Subject line so it is more specific >> >> than "Re: Contents of scikit-learn digest..." >> >> >> >> >> >> Today's Topics: >> >> >> >> 1. Re: 1. Re: unclear help file for sklearn.decomposition.pca >> >> (Andreas Mueller) >> >> 2. Re: 1. Re: unclear help file for sklearn.decomposition.pca >> >> (Oliver Tomic) >> >> >> >> >> >> ---------------------------------------------------------------------- >> >> >> >> Message: 1 >> >> Date: Mon, 16 Oct 2017 14:44:51 -0400 >> >> From: Andreas Mueller >> >> To: scikit-learn at python.org >> >> Subject: Re: [scikit-learn] 1. Re: unclear help file for >> >> sklearn.decomposition.pca >> >> Message-ID: <35142868-fce9-6cb3-eba3-015a0b106163 at gmail.com> >> >> Content-Type: text/plain; charset="utf-8"; Format="flowed" >> >> >> >> >> >> >> >> On 10/16/2017 02:27 PM, Ismael Lemhadri wrote: >> >> > @Andreas Muller: >> >> > My references do not assume centering, e.g. >> >> > http://ufldl.stanford.edu/wiki/index.php/PCA >> >> > any reference? >> >> > >> >> It kinda does but is not very clear about it: >> >> >> >> This data has already been pre-processed so that each of the >> >> features\textstyle x_1and\textstyle x_2have about the same mean (zero) >> >> and variance. 
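To make the optional scaling step concrete, here is a minimal sketch (with made-up data, illustrative only, not taken from the messages above) of putting a scaler in front of PCA inside a scikit-learn Pipeline. Because the scaler is part of the pipeline, it is re-fit on each training fold during cross-validation, which addresses the per-split concern raised above:

```python
# Minimal sketch: explicit (optional) scaling before PCA via a Pipeline.
# The data and the downstream classifier are made up for illustration.
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.RandomState(0)
X = rng.rand(100, 5) * [1, 10, 100, 1000, 10000]  # features on very different scales
y = rng.randint(0, 2, size=100)

# PCA on its own only centers the data; StandardScaler adds the scaling step,
# and it is fit on the training portion of each CV split only.
pipe = make_pipeline(StandardScaler(), PCA(n_components=2), LogisticRegression())
print(cross_val_score(pipe, X, y, cv=5))
```

PCA's centering stays where it is, and whether to scale remains an explicit, user-visible step in the pipeline rather than a hidden default.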
>> >> URL: >> >> > 20171016/620a9401/attachment.html> >> >> >> >> ------------------------------ >> >> >> >> Subject: Digest Footer >> >> >> >> _______________________________________________ >> >> scikit-learn mailing list >> >> scikit-learn at python.org >> >> https://mail.python.org/mailman/listinfo/scikit-learn >> >> >> >> >> >> ------------------------------ >> >> >> >> End of scikit-learn Digest, Vol 19, Issue 31 >> >> ******************************************** >> > >> > -- >> > >> > Sent from a mobile phone and may contain errors >> > >> > >> > _______________________________________________ >> > scikit-learn mailing list >> > scikit-learn at python.org >> > https://mail.python.org/mailman/listinfo/scikit-learn >> > >> > >> > >> > _______________________________________________ >> > scikit-learn mailing list >> > scikit-learn at python.org >> > https://mail.python.org/mailman/listinfo/scikit-learn >> > >> >> >> ------------------------------ >> >> Subject: Digest Footer >> >> _______________________________________________ >> scikit-learn mailing list >> scikit-learn at python.org >> https://mail.python.org/mailman/listinfo/scikit-learn >> >> >> ------------------------------ >> >> End of scikit-learn Digest, Vol 19, Issue 37 >> ******************************************** >> > > > _______________________________________________ > scikit-learn mailing list > scikit-learn at python.org > https://mail.python.org/mailman/listinfo/scikit-learn > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From stuart at stuartreynolds.net Wed Oct 18 14:37:51 2017 From: stuart at stuartreynolds.net (Stuart Reynolds) Date: Wed, 18 Oct 2017 11:37:51 -0700 Subject: [scikit-learn] Can fit a model with a target array of probabilities? In-Reply-To: References: Message-ID: Good know -- thank you. On Fri, Oct 6, 2017 at 5:25 AM, wrote: > > > On Thu, Oct 5, 2017 at 3:27 PM, wrote: >> >> >> >> On Thu, Oct 5, 2017 at 2:52 PM, Stuart Reynolds >> wrote: >>> >>> Turns out sm.Logit does allow setting the tolerance. >>> Some and quick and dirty time profiling of different methods on a 100k >>> * 30 features dataset, with different solvers and losses: >>> >>> sklearn.LogisticRegression: l1 1.13864398003 (seconds) >>> sklearn.LogisticRegression: l2 0.0538778305054 >>> sm.Logit l1 0.0922629833221 # Although didn't converge >>> sm.Logit l1_cvxopt_cp 0.958268165588 >>> sm.Logit newton 0.133476018906 >>> sm.Logit nm 0.369864940643 >>> sm.Logit bfgs 0.105798006058 >>> sm.Logit lbfgs 0.06241106987 >>> sm.Logit powell 1.64219808578 >>> sm.Logit cg 0.2184278965 >>> sm.Logit ncg 0.216138124466 >>> sm.Logit basinhopping 8.82164621353 >>> sm.GLM.fit IRLS 0.544688940048 >>> sm.GLM L2: 1.29778695107 >>> >>> I've been getting good results from sm.Logit.fit (although >>> unregularized). >>> statsmodels GLM seems a little slow. Not sure why. >>> >>> My benchmark may be a little apples-to-oranges, since the stopping >>> criteria probably aren't comparable. >> >> >> I think that's a problem with GLM IRLS. >> AFAIK, but never fully tested, is that the objective function is >> proportional to the number of observations and the convergence >> criterion becomes tighter as nobs increases. >> I don't find the issue or PR discussion anymore, but one of our >> contributors fixed maxiter at 15 or something like that for IRLS with >> around 4 to 5 million observations and mostly categorical explanatory >> variables in his application. 
>> >> unfortunately (no upfront design and decisions across models) >> https://github.com/statsmodels/statsmodels/issues/2825 > > > > Interesting timing excercise, I tried a bit more yesterday. > > GLM IRLS is not slow because of the convergence criterion, but it seems like > it takes much longer when the design matrix is not well conditioned. > The random dataset generated by sklearn has singular values in the range of > 1e-14 or 1e-15 > This doesn't affect the other estimators much and lbfgs is almost always the > fastest with bfgs close behind. > > When I add some noise to the feature matrix so it's not so close to > singular, then IRLS is roughly in the same neighborhood as the faster scipy > optimizers > With n_samples=1000000, n_features=50, Logit is around 5 or 6 seconds (for > lbfgs, bfgs and newton) slightly faster than sklearnLogistic regression > regularized, but GLM is about 4 times slower with 17 to 20 seconds > GLM L2 is much slower in this case because of the current non-optimized > implementation of coordinate descend. > > aside: In master and next release of statsmodels there is a interface to > scipy.minimize, which allows that all new optimizers can be used, e.g. > dogleg and other new trust region newton methods will be better optimizers > for many cases. > > Josef > > >> >> >> Josef >> >> >>> >>> >>> >>> For tiny models, which I'm also building: 100 samples, 5 features >>> >>> sklearn.LogisticRegression: l1 0.00137376785278 >>> sklearn.LogisticRegression: l2 0.00167894363403 >>> sm.Logit l1 0.0198900699615 >>> sm.Logit l1_cvxopt_cp 0.162448167801 >>> sm.Logit newton 0.00689911842346 >>> sm.Logit nm 0.0754928588867 >>> sm.Logit bfgs 0.0210938453674 >>> sm.Logit lbfgs 0.0156588554382 >>> sm.Logit powell 0.0161390304565 >>> sm.Logit cg 0.00759506225586 >>> sm.Logit ncg 0.00541186332703 >>> sm.Logit basinhopping 0.3076171875 >>> sm.GLM.fit IRLS 0.00902199745178 >>> sm.GLM L2: 0.0208361148834 >>> >>> I couldn't get sm.GLM.fit to work with non "IRLS" solvers. (hits a >>> division by zero). >>> >>> >>> ---- >>> >>> import sklearn.datasets >>> from sklearn.preprocessing import StandardScaler >>> X, y = sklearn.datasets.make_classification(n_samples=10000, >>> n_features=30, random_state=123) >>> X = StandardScaler(copy=True, with_mean=True, >>> with_std=True).fit_transform(X) >>> >>> import time >>> tol = 0.0001 >>> maxiter = 100 >>> DISP = 0 >>> >>> >>> if 1: # sk.LogisticRegression >>> import sklearn >>> from sklearn.linear_model import LogisticRegression >>> >>> for method in ["l1", "l2"]: # TODO, add solvers: >>> t = time.time() >>> model = LogisticRegression(C=1, tol=tol, max_iter=maxiter, >>> penalty=method) >>> model.fit(X,y) >>> print "sklearn.LogisticRegression:", method, time.time() - t >>> >>> >>> >>> >>> if 1: # sm.Logit.fit_regularized >>> from statsmodels.discrete.discrete_model import Logit >>> for method in ["l1", "l1_cvxopt_cp"]: >>> t = time.time() >>> model = Logit(y,X) >>> result = model.fit_regularized(method=method, maxiter=maxiter, >>> alpha=1., >>> abstol=tol, >>> acc=tol, >>> tol=tol, gtol=tol, pgtol=tol, >>> disp=DISP) >>> print "sm.Logit", method, time.time() - t >>> >>> if 1: # sm.Logit.fit >>> from statsmodels.discrete.discrete_model import Logit >>> >>> SOLVERS = ["newton", "nm", >>> "bfgs","lbfgs","powell","cg","ncg","basinhopping",] >>> for method in SOLVERS: >>> t = time.time() >>> model = Logit(y,X) >>> result = model.fit(method=method, maxiter=maxiter, >>> niter=maxiter, >>> ftol=tol, >>> tol=tol, gtol=tol, pgtol=tol, # Hmmm.. 
>>> needs to be reviewed. >>> disp=DISP) >>> print "sm.Logit", method, time.time() - t >>> >>> if 1: # sm.GLM.fit >>> from statsmodels.genmod.generalized_linear_model import GLM >>> from statsmodels.genmod.generalized_linear_model import families >>> for method in ["IRLS"]: >>> t = time.time() >>> model = GLM(y, X, >>> family=families.Binomial(link=families.links.logit)) >>> result = model.fit(method=method, cnvrg_tol=tol, >>> maxiter=maxiter, full_output=False, disp=DISP) >>> print "sm.GLM.fit", method, time.time() - t >>> >>> >>> if 1: # GLM.fit_regularized >>> from statsmodels.genmod.generalized_linear_model import GLM >>> from statsmodels.genmod.generalized_linear_model import families >>> t = time.time() >>> model = GLM(y, X, >>> family=families.Binomial(link=families.links.logit)) >>> result = model.fit_regularized(method='elastic_net', alpha=1.0, >>> L1_wt=0.0, cnvrg_tol=tol, maxiter=maxiter) >>> print "sm.GLM L2:", time.time() - t >>> >>> >>> >>> if 0: # GLM.fit >>> # Hits division by zero. >>> SOLVERS = ["bfgs","lbfgs", "netwon", "nm", >>> "powell","cg","ncg","basinhopping",] >>> from statsmodels.genmod.generalized_linear_model import GLM >>> from statsmodels.genmod.generalized_linear_model import families >>> for method in SOLVERS: >>> t = time.time() >>> model = GLM(y, X, >>> family=families.Binomial(link=families.links.logit)) >>> result = model.fit(method=method, >>> # scale="X2", >>> # alpha=1., >>> # abstol=tol, >>> # acc=tol, >>> # tol=tol, gtol=tol, pgtol=tol, >>> # maxiter=maxiter, >>> # #full_output=False, >>> disp=DISP) >>> print "sm.GLM.fit", method, time.time() - t >>> >>> >>> On Thu, Oct 5, 2017 at 10:32 AM, Sean Violante >>> wrote: >>> > Stuart >>> > have you tried glmnet ( in R) there is a python version >>> > https://web.stanford.edu/~hastie/glmnet_python/ .... >>> > >>> > >>> > >>> > >>> > On Thu, Oct 5, 2017 at 6:34 PM, Stuart Reynolds >>> > >>> > wrote: >>> >> >>> >> Thanks Josef. Was very useful. >>> >> >>> >> result.remove_data() reduces a 5 parameter Logit result object from >>> >> megabytes to 5Kb (as compared to a minimum uncompressed size of the >>> >> parameters of ~320 bytes). Is big improvement. I'll experiment with >>> >> what you suggest -- since this is still >10x larger than possible. I >>> >> think the difference is mostly attribute names. >>> >> I don't mind the lack of a multinomial support. I've often had better >>> >> results mixing independent models for each class. >>> >> >>> >> I'll experiment with the different solvers. I tried the Logit model >>> >> in the past -- its fit function only exposed a maxiter, and not a >>> >> tolerance -- meaning I had to set maxiter very high. The newer >>> >> statsmodels GLM module looks great and seem to solve this. >>> >> >>> >> For other who come this way, I think the magic for ridge regression >>> >> is: >>> >> >>> >> from statsmodels.genmod.generalized_linear_model import GLM >>> >> from statsmodels.genmod.generalized_linear_model import >>> >> families >>> >> from statsmodels.genmod.generalized_linear_model.families >>> >> import >>> >> links >>> >> >>> >> model = GLM(y, Xtrain, >>> >> family=families.Binomial(link=links.Logit)) >>> >> result = model.fit_regularized(method='elastic_net', >>> >> alpha=l2weight, L1_wt=0.0, tol=...) >>> >> result.remove_data() >>> >> result.predict(Xtest) >>> >> >>> >> One last thing -- its clear that it should be possible to do something >>> >> like scikit's LogisticRegressionCV in order to quickly optimize a >>> >> single parameter by re-using past coefficients. 
>>> >> Are there any wrappers in statsmodels for doing this or should I roll >>> >> my >>> >> own? >>> >> >>> >> >>> >> - Stu >>> >> >>> >> >>> >> On Wed, Oct 4, 2017 at 3:43 PM, wrote: >>> >> > >>> >> > >>> >> > On Wed, Oct 4, 2017 at 4:26 PM, Stuart Reynolds >>> >> > >>> >> > wrote: >>> >> >> >>> >> >> Hi Andy, >>> >> >> Thanks -- I'll give another statsmodels another go. >>> >> >> I remember I had some fitting speed issues with it in the past, and >>> >> >> also some issues related their models keeping references to the >>> >> >> data >>> >> >> (=disaster for serialization and multiprocessing) -- although that >>> >> >> was >>> >> >> a long time ago. >>> >> > >>> >> > >>> >> > The second has not changed and will not change, but there is a >>> >> > remove_data >>> >> > method that deletes all references to full, data sized arrays. >>> >> > However, >>> >> > once >>> >> > the data is removed, it is not possible anymore to compute any new >>> >> > results >>> >> > statistics which are almost all lazily computed. >>> >> > The fitting speed depends a lot on the optimizer, convergence >>> >> > criteria >>> >> > and >>> >> > difficulty of the problem, and availability of good starting >>> >> > parameters. >>> >> > Almost all nonlinear estimation problems use the scipy optimizers, >>> >> > all >>> >> > unconstrained optimizers can be used. There are no optimized special >>> >> > methods >>> >> > for cases with a very large number of features. >>> >> > >>> >> > Multinomial/multiclass models don't support continuous response >>> >> > (yet), >>> >> > all >>> >> > other GLM and discrete models allow for continuous data in the >>> >> > interval >>> >> > extension of the domain. >>> >> > >>> >> > Josef >>> >> > >>> >> > >>> >> >> >>> >> >> - Stuart >>> >> >> >>> >> >> On Wed, Oct 4, 2017 at 1:09 PM, Andreas Mueller >>> >> >> wrote: >>> >> >> > Hi Stuart. >>> >> >> > There is no interface to do this in scikit-learn (and maybe we >>> >> >> > should >>> >> >> > at >>> >> >> > this to the FAQ). >>> >> >> > Yes, in principle this would be possible with several of the >>> >> >> > models. >>> >> >> > >>> >> >> > I think statsmodels can do that, and I think I saw another glm >>> >> >> > package >>> >> >> > for Python that does that? >>> >> >> > >>> >> >> > It's certainly a legitimate use-case but would require >>> >> >> > substantial >>> >> >> > changes to the code. I think so far we decided not to support >>> >> >> > this in scikit-learn. Basically we don't have a concept of a link >>> >> >> > function, and it's a concept that only applies to a subset of >>> >> >> > models. >>> >> >> > We try to have a consistent interface for all our estimators, and >>> >> >> > this doesn't really fit well within that interface. >>> >> >> > >>> >> >> > Hth, >>> >> >> > Andy >>> >> >> > >>> >> >> > >>> >> >> > On 10/04/2017 03:58 PM, Stuart Reynolds wrote: >>> >> >> >> >>> >> >> >> I'd like to fit a model that maps a matrix of continuous inputs >>> >> >> >> to a >>> >> >> >> target that's between 0 and 1 (a probability). >>> >> >> >> >>> >> >> >> In principle, I'd expect logistic regression should work out of >>> >> >> >> the >>> >> >> >> box with no modification (although its often posed as being >>> >> >> >> strictly >>> >> >> >> for classification, its loss function allows for fitting targets >>> >> >> >> in >>> >> >> >> the range 0 to 1, and not strictly zero or one.) 
>>> >> >> >>
>>> >> >> >> However, scikit's LogisticRegression and LogisticRegressionCV reject
>>> >> >> >> target arrays that are continuous. Other LR implementations allow a
>>> >> >> >> matrix of probability estimates. Looking at:
>>> >> >> >>
>>> >> >> >> http://scikit-learn-general.narkive.com/4dSCktaM/using-logistic-regression-on-a-continuous-target-variable
>>> >> >> >>
>>> >> >> >> and the fix here:
>>> >> >> >> https://github.com/scikit-learn/scikit-learn/pull/5084, which
>>> >> >> >> disables continuous inputs, it looks like there was some reason for
>>> >> >> >> this. So ... I'm looking for alternatives.
>>> >> >> >>
>>> >> >> >> SGDClassifier allows log loss and (if I understood the docs
>>> >> >> >> correctly) adds a logistic link function, but also rejects
>>> >> >> >> continuous targets. Oddly, SGDRegressor only allows 'squared_loss',
>>> >> >> >> 'huber', 'epsilon_insensitive', or 'squared_epsilon_insensitive',
>>> >> >> >> and doesn't seem to give a logistic function.
>>> >> >> >>
>>> >> >> >> In principle, GLMs allow this, but scikit's docs say the GLM models
>>> >> >> >> only allow strict linear functions of their input, and don't allow a
>>> >> >> >> logistic link function. The docs direct people to the
>>> >> >> >> LogisticRegression class for this case.
>>> >> >> >>
>>> >> >> >> In R, there is:
>>> >> >> >>
>>> >> >> >>     glm(Total_Service_Points_Won/Total_Service_Points_Played ~ ...,
>>> >> >> >>         family = binomial(link=logit),
>>> >> >> >>         weights = Total_Service_Points_Played)
>>> >> >> >>
>>> >> >> >> which would be ideal.
>>> >> >> >>
>>> >> >> >> Is something similar available in scikit? (Or any continuous model
>>> >> >> >> that takes a 0 to 1 target and outputs a 0 to 1 target?)
>>> >> >> >>
>>> >> >> >> I was surprised to see that the implementation of
>>> >> >> >> CalibratedClassifierCV(method="sigmoid") uses an internal
>>> >> >> >> implementation of logistic regression to do its logistic regression
>>> >> >> >> -- which I can use, although I'd prefer to use a user-facing library.
>>> >> >> >>
>>> >> >> >> Thanks,
>>> >> >> >> - Stuart
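For readers landing on this thread later: the closest analogue of that R call
that I know of is statsmodels' GLM with a Binomial family, which accepts the
response as a two-column array of (successes, failures) counts -- the same
information as R's proportion-plus-weights form. A minimal sketch; the names
n_won, n_played, and X are placeholders, not from the thread:

    import numpy as np
    from statsmodels.genmod.generalized_linear_model import GLM, families

    # n_won[i] service points won out of n_played[i] played; X is the design matrix.
    endog = np.column_stack([n_won, n_played - n_won])   # (successes, failures)
    model = GLM(endog, X, family=families.Binomial())    # logit link is the default
    result = model.fit()
    predicted_proportion = result.predict(X)              # fitted values in [0, 1]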
_______________________________________________
scikit-learn mailing list
scikit-learn at python.org
https://mail.python.org/mailman/listinfo/scikit-learn

From t3kcit at gmail.com Wed Oct 18 15:01:26 2017
From: t3kcit at gmail.com (Andreas Mueller)
Date: Wed, 18 Oct 2017 15:01:26 -0400
Subject: [scikit-learn] scikit-learn Digest, Vol 19, Issue 37
In-Reply-To:
References:
Message-ID:

On 10/17/2017 11:18 PM, Ismael Lemhadri wrote:
> How about editing the various chunks of code concerned to add the
> option to scale the parameters, and set it by default to NOT scale?
> This would make what happens clear without the redundancy Andreas
> mentioned, and would add more convenience to users should they want
> to scale their data.
>
I don't feel that would add a lot, and it would still require the users
to read the docs. There are many ways to scale, and applying any of them
is very easy with scikit-learn.
The main source of confusion seems to be that you expected PCA to scale,
and it doesn't. It doesn't say anywhere that it scales, and scaling is not
part of the definition of PCA (in contrast to subtracting the mean). I
guess part of the confusion came from the somewhat cryptic docstring about
SVD, but you fixed that.
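For anyone following along, "applying any of them" really is a one-liner with
a pipeline. The estimators below are standard scikit-learn; only the toy
dataset is an assumption for the sake of a runnable example:

    from sklearn.datasets import load_iris
    from sklearn.decomposition import PCA
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler

    X, _ = load_iris(return_X_y=True)
    # PCA only centers the data; StandardScaler adds the unit-variance scaling.
    scaled_pca = make_pipeline(StandardScaler(), PCA(n_components=2))
    X_reduced = scaled_pca.fit_transform(X)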
From ynr.info at gmail.com Thu Oct 19 15:36:59 2017
From: ynr.info at gmail.com (Yaser Norouzzadeh)
Date: Thu, 19 Oct 2017 21:36:59 +0200
Subject: [scikit-learn] How to compute weighted accuracy for multi-class classification?
Message-ID:

I do multi-class classification on unbalanced classes. I'm using
SGDClassifier(), GradientBoostingClassifier(), RandomForestClassifier(), and
LogisticRegression() with class_weight='balanced'. To compare the results, I
need to compute the accuracy.

I tried the following way to compute weighted accuracy:

    n_samples = len(y_train)
    weights_cof = float(n_samples) / (n_classes * np.bincount(
        data[target_label].as_matrix().astype(int))[1:])
    sample_weights = np.ones((n_samples, n_classes)) * weights_cof

    print accuracy_score(y_test, y_pred, sample_weight=sample_weights)

y_train is a binary array, so sample_weights has the same shape as y_train:
(n_samples, n_classes). When I run the script, I get the following error:

    Traceback (most recent call last):
      File "C:\Program Files (x86)\JetBrains\PyCharm Community Edition 2016.3.2\helpers\pydev\pydevd.py", line 1596, in
        globals = debugger.run(setup['file'], None, None, is_module)
      File "C:\Program Files (x86)\JetBrains\PyCharm Community Edition 2016.3.2\helpers\pydev\pydevd.py", line 974, in run
        pydev_imports.execfile(file, globals, locals)  # execute the script
      File "D:/Destiny/DestinyScripts/MainLocationAware.py", line 424, in
        predict_country(featuresDF, score, featuresLabel, country_sample_size, 'gbc')
      File "D:/Destiny/DestinyScripts/MainLocationAware.py", line 313, in predict_country
        print accuracy_score(y_test, y_pred, sample_weight=sample_weights)
      File "C:\ProgramData\Anaconda2\lib\site-packages\sklearn\metrics\classification.py", line 183, in accuracy_score
        return _weighted_sum(score, sample_weight, normalize)
      File "C:\ProgramData\Anaconda2\lib\site-packages\sklearn\metrics\classification.py", line 108, in _weighted_sum
        return np.average(sample_score, weights=sample_weight)
      File "C:\ProgramData\Anaconda2\lib\site-packages\numpy\lib\function_base.py", line 1124, in average
        "Axis must be specified when shapes of a and weights "
    TypeError: Axis must be specified when shapes of a and weights differ.
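The error comes from the shape of sample_weight: accuracy_score expects one
weight per sample, i.e. an array of shape (n_samples,), not (n_samples,
n_classes), and the weights should correspond to the samples being scored
(y_test), not to y_train. A minimal sketch of one way to do this, assuming
y_train, y_test, and y_pred are 1-D arrays of integer class labels (if they
are one-hot arrays, take argmax along axis 1 first):

    import numpy as np
    from sklearn.metrics import accuracy_score
    from sklearn.utils.class_weight import compute_class_weight

    classes = np.unique(y_train)
    # One 'balanced' weight per class, matching class_weight='balanced'.
    class_weights = compute_class_weight(class_weight='balanced',
                                         classes=classes, y=y_train)
    # Look up each test sample's weight from its true class label.
    sample_weights = class_weights[np.searchsorted(classes, y_test)]

    print(accuracy_score(y_test, y_pred, sample_weight=sample_weights))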
From s.atasever at gmail.com Fri Oct 20 09:13:58 2017
From: s.atasever at gmail.com (Sema Atasever)
Date: Fri, 20 Oct 2017 16:13:58 +0300
Subject: [scikit-learn] How to get centroids from SciPy's hierarchical agglomerative clustering?
Message-ID:

Dear scikit-learn members,

I am using SciPy's hierarchical agglomerative clustering methods to cluster a
1000 x 22 matrix of features. After clustering my data set with
scipy.cluster.hierarchy.linkage and assigning each sample to a cluster, I
can't seem to figure out how to get the centroid from the resulting clusters.
I would like to extract one element (or a few) from each cluster, whichever is
closest to that cluster's centroid.

Below follows my code:

    D = np.loadtxt(open("C:\dataset.txt", "rb"), delimiter=";")
    Y = hierarchy.linkage(D, 'ward')
    assignments = hierarchy.fcluster(Y, 5, criterion="maxclust")

I am taking my matrix of features, computing the Euclidean distance between
the samples, and then passing them to the hierarchical clustering method.
From there, I am creating flat clusters, with a maximum of 5 clusters.

Now, based on the flat cluster *assignments*, how do I get the 1 x 22 centroid
that represents each flat cluster?

Best.
-------------- next part --------------
import time

import numpy as np
from scipy.cluster import hierarchy
from scipy.spatial.distance import pdist

start_time = time.clock()

D = np.loadtxt(open("C:\dataset.txt", "rb"), delimiter=";")
print('D')
print(D)
print('D.shape')
print(D.shape)

# clustering = scipy.cluster.hierarchy.ward(D)
print('Clustering')
Y = hierarchy.linkage(D, 'ward')

print('\nClustering results:')
assignments = hierarchy.fcluster(Y, 5, criterion="maxclust")
print("\nCLUSTERS --> " + str(assignments) + "\n")

print('tofile')
assignments.tofile('assignments.out', sep=',', format='%s')

c, coph_dists = hierarchy.cophenet(Y, pdist(D))
print("Cophenetic Correlation Coefficient:" + str(c) + "\n")

print("\nRunning Time:")
print(time.clock() - start_time, "seconds")
-------------- next part --------------
[Attachment: dataset.txt -- the 1000 x 22 feature matrix described above, as
rows of 22 semicolon-separated values; omitted here.]
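SciPy's fcluster does not return centroids itself, but since the linkage used
Ward (Euclidean geometry), the per-cluster mean is a natural centroid, and the
representative element is just the member closest to that mean. A minimal
sketch reusing the variable names from the script above (D, assignments); the
helper names here are illustrative, not part of SciPy:

    import numpy as np
    from scipy.spatial.distance import cdist

    cluster_ids = np.unique(assignments)              # e.g. array([1, 2, 3, 4, 5])
    # One 1 x 22 centroid (mean vector) per flat cluster.
    centroids = np.vstack([D[assignments == c].mean(axis=0) for c in cluster_ids])

    # Index into D of the sample closest to each cluster's centroid.
    closest_sample = {}
    for c, centroid in zip(cluster_ids, centroids):
        members = np.where(assignments == c)[0]
        dists = cdist(centroid[None, :], D[members])[0]
        closest_sample[c] = members[np.argmin(dists)]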
0.000052;0.999485;0.993937;0.00005;0;0.268941;0.741734;0.991298;0.00005;0.008653;0.99338;0;0.00662;0.9992;0;0.0008;0.986392;0.000049;0;0.936015;0.991345;0 0.00005;0.99973;0.991298;0.00005;0;0.047426;0.677433;0.986392;0.000049;0.01356;0.97732;0;0.02268;0.99885;0;0.00115;0.959869;0.000734;0;0.93506;0.927085;0 0.00005;0.9992;0.986392;0.000049;0;0.047426;0.41411;0.959869;0.000734;0.039397;0.936015;0.000005;0.06398;0.991345;0;0.008655;0.930654;0.00031;0;0.89332;0.827545;0 0.000049;0.99885;0.959869;0.000734;0;0.119203;0.242688;0.930654;0.00031;0.069037;0.93506;0.000025;0.06492;0.927085;0;0.072915;0.850705;0.000559;0;0.72336;0.186605;0.000035 0.000734;0.991345;0.930654;0.00031;0;0.119203;0.638763;0.850705;0.000559;0.148739;0.89332;0.00015;0.10654;0.827545;0;0.172455;0.406755;0.002126;0.000035;0.010995;0.005055;0.000125 0.00031;0.927085;0.850705;0.000559;0;0.268941;0.558481;0.406755;0.002126;0.59112;0.72336;0.00061;0.27603;0.186605;0.000035;0.81336;0.026838;0.013985;0.000125;0.007855;0.003725;0.00212 0.045581;0.003725;0.017474;0.030584;0.00293;0.006693;0.425802;0.990065;0.000029;0.009906;0.99053;0.000005;0.00947;0.992875;0;0.00712;0.992701;0.000028;0;0.99602;0.998105;0 0.030584;0.007785;0.990065;0.000029;0;0.047426;0.674366;0.992701;0.000028;0.00727;0.995605;0;0.004395;0.99748;0;0.002515;0.993223;0.000028;0;0.999155;0.999925;0 0.000029;0.992875;0.992701;0.000028;0;0.119203;0.460085;0.993223;0.000028;0.00675;0.99602;0;0.00398;0.998105;0;0.001895;0.99432;0.000031;0;0.993065;0.99991;0 0.000028;0.99748;0.993223;0.000028;0;0.880797;0.654301;0.99432;0.000031;0.005649;0.999155;0.00001;0.000835;0.999925;0;0.000075;0.991634;0.000031;0;0.992505;0.99978;0 0.000028;0.998105;0.99432;0.000031;0;0.119203;0.943588;0.991634;0.000031;0.008337;0.993065;0.00001;0.00693;0.99991;0;0.00009;0.983706;0.000029;0;0.990995;0.999515;0 0.000031;0.999925;0.991634;0.000031;0;0.119203;0.516244;0.983706;0.000029;0.016263;0.992505;0.000005;0.007485;0.99978;0;0.00022;0.977937;0.000033;0;0.863215;0.858215;0 0.000031;0.99991;0.983706;0.000029;0;0.119203;0.711155;0.977937;0.000033;0.022031;0.990995;0.000015;0.00899;0.999515;0;0.000485;0.845221;0.000098;0;0.09421;0.00026;0.000055 0.000029;0.99978;0.977937;0.000033;0;0.5;0.649081;0.845221;0.000098;0.154681;0.863215;0.00021;0.136575;0.858215;0;0.141785;0.04336;0.000106;0.000055;0.107615;0.000715;0.000035 0.000321;0.000715;0.048489;0.000277;0.00002;0.017986;0.560699;0.820928;0.000821;0.178252;0.530395;0.000115;0.469495;0.99119;0.00001;0.0088;0.863209;0.001434;0.000015;0.699615;0.99851;0.00005 0.000277;0.005355;0.820928;0.000821;0.00001;0.119203;0.541157;0.863209;0.001434;0.135354;0.638845;0.00256;0.35859;0.99503;0.000015;0.00495;0.890425;0.003466;0.00005;0.87806;0.999095;0.00009 0.000821;0.99119;0.863209;0.001434;0.000015;0.5;0.744977;0.890425;0.003466;0.106105;0.699615;0.00862;0.29176;0.99851;0.00005;0.001435;0.952441;0.006001;0.00009;0.88481;0.999255;0.00008 0.001434;0.99503;0.890425;0.003466;0.00005;0.993307;0.188773;0.952441;0.006001;0.041558;0.87806;0.01783;0.10411;0.999095;0.00009;0.000815;0.949341;0.006039;0.00008;0.914125;0.99932;0.00007 0.003466;0.99851;0.952441;0.006001;0.00009;0.047426;0.546862;0.949341;0.006039;0.044624;0.88481;0.01795;0.097245;0.999255;0.00008;0.00067;0.956362;0.009821;0.00007;0.911185;0.99881;0.000065 0.006001;0.999095;0.949341;0.006039;0.00008;0.047426;0.623868;0.956362;0.009821;0.033816;0.914125;0.029305;0.05657;0.99932;0.00007;0.000605;0.951798;0.010139;0.000065;0.74046;0.94771;0.00007 
0.006039;0.999255;0.956362;0.009821;0.00007;0.047426;0.300903;0.951798;0.010139;0.038064;0.911185;0.029045;0.05977;0.99881;0.000065;0.00113;0.828896;0.01042;0.00007;0.13319;0.35828;0.000045 0.009821;0.99932;0.951798;0.010139;0.000065;0.5;0.202943;0.828896;0.01042;0.160684;0.74046;0.03021;0.22933;0.94771;0.00007;0.05222;0.307728;0.009116;0.000045;0.09318;0.08325;0.000015 0.017436;0.16625;0.284925;0.014029;0.01574;0.731059;0.569546;0.921145;0.004507;0.074347;0.9877;0.000105;0.01219;0.826805;0.000865;0.17233;0.944256;0.002158;0.00024;0.99453;0.968465;0 0.014029;0.186115;0.921145;0.004507;0.000865;0.047426;0.591701;0.944256;0.002158;0.053584;0.988945;0.000085;0.010965;0.90219;0.00024;0.09757;0.973556;0.000105;0;0.998665;0.999845;0 0.004507;0.826805;0.944256;0.002158;0.00024;0.731059;0.403236;0.973556;0.000105;0.026337;0.99453;0.000095;0.00537;0.968465;0;0.031535;0.984;0.005515;0;0.99947;0.99998;0 0.002158;0.90219;0.973556;0.000105;0;0.731059;0.670843;0.984;0.005515;0.010485;0.998665;0.000165;0.00117;0.999845;0;0.000155;0.98331;0.006372;0;0.99939;0.999965;0 0.000105;0.968465;0.984;0.005515;0;0.731059;0.647713;0.98331;0.006372;0.01032;0.99947;0.00016;0.000375;0.99998;0;0.00002;0.983529;0.000127;0;0.99927;0.99995;0 0.005515;0.999845;0.98331;0.006372;0;0.731059;0.648397;0.983529;0.000127;0.016344;0.99939;0.000165;0.000445;0.999965;0;0.000035;0.983404;0.000119;0;0.99767;0.999965;0 0.006372;0.99998;0.983529;0.000127;0;0.5;0.474273;0.983404;0.000119;0.016475;0.99927;0.00015;0.000575;0.99995;0;0.00005;0.989049;0.000111;0;0.99492;0.99981;0 0.000127;0.999965;0.983404;0.000119;0;0.119203;0.874022;0.989049;0.000111;0.010839;0.99767;0.000125;0.0022;0.999965;0;0.000035;0.989007;0.000106;0;0.988035;0.997405;0 0.000119;0.99995;0.989049;0.000111;0;0.119203;0.690188;0.989007;0.000106;0.010887;0.99492;0.00011;0.00497;0.99981;0;0.00019;0.980341;0.000108;0;0.526095;0.58035;0.00001 0.000111;0.999965;0.989007;0.000106;0;0.731059;0.759328;0.980341;0.000108;0.019551;0.988035;0.000115;0.01185;0.997405;0;0.002595;0.570886;0.000126;0.00001;0.01005;0.000315;0 0.000106;0.99981;0.980341;0.000108;0;0.268941;0.105741;0.570886;0.000126;0.428989;0.526095;0.00016;0.47375;0.58035;0.00001;0.41964;0.034664;0.016534;0;0.00194;0.000095;0.00153 0.007686;0.000095;0.01485;0.001999;0.002085;0.982014;0.562423;0.837819;0.001893;0.160291;0.849295;0.00085;0.14986;0.949475;0.004605;0.045925;0.88749;0.002598;0.00579;0.908275;0.965845;0.015695 0.001999;0.00075;0.837819;0.001893;0.004605;0.047426;0.479012;0.88749;0.002598;0.10991;0.852845;0.001775;0.14538;0.96004;0.00579;0.034165;0.927171;0.006;0.015695;0.84288;0.9234;0.016605 0.001893;0.949475;0.88749;0.002598;0.00579;0.047426;0.329378;0.927171;0.006;0.066833;0.908275;0.00206;0.08967;0.965845;0.015695;0.018465;0.873447;0.00682;0.016605;0.609305;0.763585;0.011885 0.00682;0.9234;0.667733;0.044298;0.011885;0.017986;0.580786;0.616588;0.019968;0.363443;0.75779;0.00007;0.242135;0.62486;0.003705;0.371435;0.587801;0.028494;0.00153;0.82616;0.657235;0.000375 0.044298;0.763585;0.616588;0.019968;0.003705;0.047426;0.232723;0.587801;0.028494;0.383704;0.79082;0.00005;0.20913;0.60124;0.00153;0.39723;0.70343;0.000574;0.000375;0.917945;0.820595;0.00033 0.019968;0.62486;0.587801;0.028494;0.00153;0.268941;0.192943;0.70343;0.000574;0.295995;0.82616;0.000065;0.173775;0.657235;0.000375;0.34239;0.83762;0.000666;0.00033;0.99624;0.8311;0.00007 
0.028494;0.60124;0.70343;0.000574;0.000375;0.119203;0.562915;0.83762;0.000666;0.161714;0.917945;0.000005;0.08205;0.820595;0.00033;0.179075;0.866332;0.00091;0.00007;0.997035;0.84518;0.000195 0.000574;0.657235;0.83762;0.000666;0.00033;0.047426;0.696355;0.866332;0.00091;0.132758;0.99624;0;0.00376;0.8311;0.00007;0.16883;0.908206;0.001531;0.000195;0.99945;0.98656;0.000255 0.000666;0.820595;0.866332;0.00091;0.00007;0.047426;0.410476;0.908206;0.001531;0.090263;0.997035;0.00002;0.00294;0.84518;0.000195;0.15463;0.990397;0.002568;0.000255;0.999185;0.98512;0.00016 0.00091;0.8311;0.908206;0.001531;0.000195;0.880797;0.599408;0.990397;0.002568;0.007035;0.99945;0.00004;0.00051;0.98656;0.000255;0.013185;0.988479;0.003215;0.00016;0.992115;0.97375;0.000325 0.001531;0.84518;0.990397;0.002568;0.000255;0.268941;0.657461;0.988479;0.003215;0.008305;0.999185;0.00005;0.000765;0.98512;0.00016;0.014715;0.978729;0.005073;0.000325;0.98836;0.92396;0.00087 0.002568;0.98656;0.988479;0.003215;0.00016;0.119203;0.539666;0.978729;0.005073;0.016196;0.992115;0.000055;0.00783;0.97375;0.000325;0.02592;0.95616;0.000468;0.00087;0.956485;0.712735;0.002005 0.003215;0.98512;0.978729;0.005073;0.000325;0.880797;0.374491;0.95616;0.000468;0.043372;0.98836;0.000065;0.011575;0.92396;0.00087;0.07517;0.83461;0.001043;0.002005;0.87522;0.6837;0.001705 0.005073;0.97375;0.95616;0.000468;0.00087;0.268941;0.858514;0.83461;0.001043;0.16435;0.956485;0.00008;0.04344;0.712735;0.002005;0.28526;0.77946;0.000883;0.001705;0.653005;0.61257;0.00996 0.01564;0.04553;0.290042;0.010728;0.02103;0.017986;0.842374;0.399617;0.00021;0.600172;0.56267;0.00003;0.4373;0.236565;0.00039;0.763045;0.63882;0.000063;0.000115;0.97811;0.83704;0.000235 0.010728;0.11161;0.399617;0.00021;0.00039;0.006693;0.747439;0.63882;0.000063;0.361117;0.958805;0.00001;0.041185;0.318835;0.000115;0.68105;0.907575;0.000127;0.000235;0.992855;0.959525;0.001165 0.00021;0.236565;0.63882;0.000063;0.000115;0.006693;0.612302;0.907575;0.000127;0.092297;0.97811;0.00002;0.02187;0.83704;0.000235;0.162725;0.97619;0.000765;0.001165;0.99501;0.99454;0.000045 0.000063;0.318835;0.907575;0.000127;0.000235;0.268941;0.82186;0.97619;0.000765;0.023045;0.992855;0.000365;0.00678;0.959525;0.001165;0.03931;0.994775;0.000432;0.000045;0.994265;0.99584;0.000005 0.000127;0.83704;0.97619;0.000765;0.001165;0.982014;0.337825;0.994775;0.000432;0.00479;0.99501;0.00082;0.004165;0.99454;0.000045;0.005415;0.995052;0.000375;0.000005;0.991925;0.996335;0.00002 0.000765;0.959525;0.994775;0.000432;0.000045;0.017986;0.744407;0.995052;0.000375;0.004567;0.994265;0.000745;0.004985;0.99584;0.000005;0.00415;0.698019;0.035516;0.00002;0.99122;0.999385;0 0.000432;0.99454;0.995052;0.000375;0.000005;0.047426;0.771182;0.698019;0.035516;0.266467;0.991925;0.00073;0.007355;0.996335;0.00002;0.00364;0.985321;0.000891;0;0.98882;0.99948;0 0.000375;0.99584;0.698019;0.035516;0.00002;0.002473;0.067926;0.985321;0.000891;0.013789;0.99122;0.00035;0.008435;0.999385;0;0.000615;0.987359;0.00069;0;0.989335;0.999465;0 0.035516;0.996335;0.985321;0.000891;0;0.017986;0.143195;0.987359;0.00069;0.011951;0.98882;0.00031;0.01087;0.99948;0;0.00052;0.99514;0.000677;0;0.9921;0.997935;0 0.000891;0.999385;0.987359;0.00069;0;0.047426;0.828353;0.99514;0.000677;0.004183;0.989335;0.00034;0.010325;0.999465;0;0.000535;0.99568;0.000759;0;0.99373;0.995715;0.000005 0.00069;0.99948;0.99514;0.000677;0;0.880797;0.743263;0.99568;0.000759;0.003561;0.9921;0.00078;0.00712;0.997935;0;0.002065;0.995501;0.00119;0.000005;0.994305;0.992805;0 
0.000677;0.999465;0.99568;0.000759;0;0.047426;0;0.995501;0.00119;0.003309;0.99373;0.002095;0.004175;0.995715;0.000005;0.00428;0.994722;0.001429;0;0.992685;0.98594;0.000005 0.000759;0.997935;0.995501;0.00119;0.000005;0.047426;0.174221;0.994722;0.001429;0.003847;0.994305;0.002815;0.002875;0.992805;0;0.007195;0.991894;0.001465;0.000005;0.98756;0.96571;0.00006 0.00119;0.995715;0.994722;0.001429;0;0.952574;0.586618;0.991894;0.001465;0.00664;0.992685;0.00292;0.004395;0.98594;0.000005;0.014055;0.983442;0.001519;0.00006;0.962325;0.954445;0.000005 0.001429;0.992805;0.991894;0.001465;0.000005;0.880797;0;0.983442;0.001519;0.015037;0.98756;0.003025;0.00941;0.96571;0.00006;0.03423;0.971276;0.001489;0.000005;0.947785;0.95847;0.00003 0.001465;0.98594;0.983442;0.001519;0.00006;0.047426;0.659934;0.971276;0.001489;0.027237;0.962325;0.00299;0.03469;0.954445;0.000005;0.04555;0.967771;0.001357;0.00003;0.90679;0.94371;0.000045 0.001519;0.96571;0.971276;0.001489;0.000005;0.119203;0.735362;0.967771;0.001357;0.030872;0.947785;0.00257;0.049645;0.95847;0.00003;0.0415;0.949186;0.00155;0.000045;0.616485;0.95689;0.000145 0.001489;0.954445;0.967771;0.001357;0.00003;0.047426;0.819505;0.949186;0.00155;0.049265;0.90679;0.003135;0.090075;0.94371;0.000045;0.05625;0.856811;0.001535;0.000145;0.44766;0.9595;0.026255 0.001357;0.95847;0.949186;0.00155;0.000045;0.731059;0.523233;0.856811;0.001535;0.141654;0.616485;0.00299;0.380525;0.95689;0.000145;0.042965;0.801397;0.010113;0.026255;0.463005;0.959085;0.02656 0.00155;0.94371;0.856811;0.001535;0.000145;0.047426;0.431435;0.801397;0.010113;0.188491;0.44766;0.0026;0.54974;0.9595;0.026255;0.01425;0.806374;0.010923;0.02656;0.462105;0.99036;0.00024 0.001535;0.95689;0.801397;0.010113;0.026255;0.5;0.807057;0.806374;0.010923;0.182701;0.463005;0.004725;0.53227;0.959085;0.02656;0.01435;0.81649;0.003514;0.00024;0.45188;0.992355;0.0002 0.010113;0.9595;0.806374;0.010923;0.02656;0.5;0.266198;0.81649;0.003514;0.179996;0.462105;0.008805;0.52909;0.99036;0.00024;0.0094;0.812562;0.006458;0.0002;0.61938;0.99343;0.000175 0.010923;0.959085;0.81649;0.003514;0.00024;0.119203;0.678743;0.812562;0.006458;0.18098;0.45188;0.0159;0.53222;0.992355;0.0002;0.007445;0.849717;0.007718;0.000175;0.618415;0.99176;0.00018 0.003514;0.99036;0.812562;0.006458;0.0002;0.119203;0.859603;0.849717;0.007718;0.142565;0.61938;0.019125;0.361495;0.99343;0.000175;0.006395;0.555754;0.037422;0.00018;0.59242;0.939245;0.000625 0.006458;0.992355;0.849717;0.007718;0.000175;0.880797;0.490251;0.555754;0.037422;0.406824;0.618415;0.055;0.32659;0.99176;0.00018;0.008055;0.529584;0.058647;0.000625;0.340065;0.474595;0.000135 0.007718;0.99343;0.555754;0.037422;0.00018;0.982014;0.79883;0.529584;0.058647;0.411769;0.59242;0.11823;0.28935;0.939245;0.000625;0.06013;0.292512;0.044162;0.000135;0.051765;0.20853;0.00028 0.000957;0.98768;0.730081;0.000559;0;0.268941;0.830335;0.980776;0.000497;0.018725;0.9461;0.000005;0.053895;0.99918;0.00001;0.000805;0.997544;0.000497;0.000015;0.99917;0.99878;0 0.000559;0.99557;0.980776;0.000497;0.00001;0.268941;0.856313;0.997544;0.000497;0.001959;0.99779;0;0.00221;0.997795;0.000015;0.00219;0.998332;0.000494;0;0.99849;0.9935;0 0.000497;0.99918;0.997544;0.000497;0.000015;0.731059;0.827213;0.998332;0.000494;0.001172;0.99917;0.000005;0.000825;0.99878;0;0.001215;0.996346;0.000494;0;0.9985;0.98461;0 0.000497;0.997795;0.998332;0.000494;0;0.731059;0.81427;0.996346;0.000494;0.00316;0.99849;0.000005;0.001505;0.9935;0;0.0065;0.993368;0.000501;0;0.992605;0.988;0.000015 
0.000494;0.99878;0.996346;0.000494;0;0.119203;0.961134;0.993368;0.000501;0.006131;0.9985;0;0.0015;0.98461;0;0.01539;0.992533;0.000506;0.000015;0.989965;0.98581;0.00003 0.000494;0.9935;0.993368;0.000501;0;0.119203;0.832018;0.992533;0.000506;0.006961;0.992605;0;0.007395;0.988;0.000015;0.011985;0.990923;0.000511;0.00003;0.99728;0.995275;0.00003 0.000501;0.98461;0.992533;0.000506;0.000015;0.047426;0.809382;0.990923;0.000511;0.008568;0.989965;0;0.010035;0.98581;0.00003;0.014165;0.996516;0.000523;0.00003;0.996675;0.98707;0.000015 0.000506;0.988;0.990923;0.000511;0.00003;0.5;0.067232;0.996516;0.000523;0.002961;0.99728;0.000035;0.002685;0.995275;0.00003;0.004695;0.993579;0.000525;0.000015;0.99261;0.981265;0.00003 0.000511;0.98581;0.996516;0.000523;0.00003;0.731059;0.920999;0.993579;0.000525;0.005895;0.996675;0.000055;0.00327;0.98707;0.000015;0.01291;0.990289;0.000538;0.00003;0.988385;0.947705;0.00003 0.000523;0.995275;0.993579;0.000525;0.000015;0.731059;0.799633;0.990289;0.000538;0.009175;0.99261;0.00008;0.00731;0.981265;0.00003;0.01871;0.977686;0.000546;0.00003;0.97561;0.923975;0.00028 0.000525;0.98707;0.990289;0.000538;0.00003;0.952574;0.435364;0.977686;0.000546;0.021771;0.988385;0.00009;0.011525;0.947705;0.00003;0.05227;0.965517;0.000622;0.00028;0.952515;0.93705;0.00055 0.000538;0.981265;0.977686;0.000546;0.00003;0.268941;0.932579;0.965517;0.000622;0.033859;0.97561;0.00007;0.024315;0.923975;0.00028;0.075745;0.962168;0.000718;0.00055;0.854885;0.924345;0.000995 0.000546;0.947705;0.965517;0.000622;0.00028;0.268941;0.447692;0.962168;0.000718;0.037113;0.952515;0.000075;0.04741;0.93705;0.00055;0.0624;0.925381;0.000871;0.000995;0.363205;0.9256;0.001055 0.000622;0.923975;0.962168;0.000718;0.00055;0.268941;0.333811;0.925381;0.000871;0.073748;0.854885;0.000075;0.14504;0.924345;0.000995;0.07466;0.761896;0.000888;0.001055;0.328265;0.943525;0.00169 0.000718;0.93705;0.925381;0.000871;0.000995;0.268941;0.490251;0.761896;0.000888;0.237219;0.363205;0.00005;0.636745;0.9256;0.001055;0.073355;0.586986;0.001108;0.00169;0.308075;0.896365;0.001855 0.000871;0.924345;0.761896;0.000888;0.001055;0.047426;0.652263;0.586986;0.001108;0.411906;0.328265;0.000075;0.67166;0.943525;0.00169;0.054785;0.564536;0.001588;0.001855;0.304215;0.883225;0.003075 0.000888;0.9256;0.586986;0.001108;0.00169;0.268941;0.569301;0.564536;0.001588;0.433879;0.308075;0.00135;0.69058;0.896365;0.001855;0.101785;0.396333;0.002181;0.003075;0.301465;0.737975;0.02046 0.001108;0.943525;0.564536;0.001588;0.001855;0.017986;0.684386;0.396333;0.002181;0.601486;0.304215;0.00191;0.693875;0.883225;0.003075;0.1137;0.346999;0.033141;0.02046;0.29365;0.233375;0.01128 0.001588;0.896365;0.396333;0.002181;0.003075;0.017986;0.096477;0.346999;0.033141;0.61986;0.301465;0.077405;0.62113;0.737975;0.02046;0.241565;0.176194;0.009508;0.01128;0.12172;0.231395;0.008695 0.016925;0.374755;0.233115;0.012332;0.01381;0.047426;0.764228;0.630499;0.00924;0.36026;0.276805;0.01643;0.70677;0.617895;0.00969;0.37241;0.772146;0.012769;0.013045;0.732945;0.60029;0.021355 0.012332;0.469085;0.630499;0.00924;0.00969;0.047426;0.865879;0.772146;0.012769;0.215087;0.690085;0.02366;0.286255;0.629555;0.013045;0.357405;0.776677;0.025177;0.021355;0.726385;0.48773;0.01814 0.00924;0.617895;0.772146;0.012769;0.013045;0.952574;0.219943;0.776677;0.025177;0.198149;0.732945;0.052575;0.214485;0.60029;0.021355;0.37836;0.736971;0.019104;0.01814;0.37464;0.46342;0.01562 
0.012769;0.629555;0.776677;0.025177;0.021355;0.731059;0.27788;0.736971;0.019104;0.243925;0.726385;0.03757;0.236045;0.48773;0.01814;0.49413;0.317226;0.050084;0.01562;0.35038;0.373345;0.002765 0.005976;0.373345;0.614189;0.001306;0.0005;0.047426;0.181384;0.912772;0.000908;0.086319;0.838805;0.001195;0.160005;0.910095;0.000025;0.089875;0.937605;0.000561;0.000005;0.928195;0.99178;0.00001 0.001306;0.454815;0.912772;0.000908;0.000025;0.119203;0.795272;0.937605;0.000561;0.061832;0.849815;0.00019;0.14999;0.975345;0.000005;0.02465;0.969245;0.000549;0.00001;0.93743;0.994425;0.000055 0.000908;0.910095;0.937605;0.000561;0.000005;0.119203;0.477515;0.969245;0.000549;0.030208;0.928195;0.00016;0.071645;0.99178;0.00001;0.008215;0.972595;0.000576;0.000055;0.95139;0.9984;0 0.000561;0.975345;0.969245;0.000549;0.00001;0.5;0.136933;0.972595;0.000576;0.026831;0.93743;0.0002;0.062375;0.994425;0.000055;0.00552;0.978573;0.000506;0;0.978105;0.997615;0 0.000549;0.99178;0.972595;0.000576;0.000055;0.268941;0.46456;0.978573;0.000506;0.020921;0.95139;0.000045;0.04857;0.9984;0;0.001595;0.987216;0.000498;0;0.985105;0.996075;0 0.000576;0.994425;0.978573;0.000506;0;0.047426;0.887455;0.987216;0.000498;0.012286;0.978105;0.00002;0.021875;0.997615;0;0.002385;0.992744;0.0005;0;0.994865;0.97265;0 0.000506;0.9984;0.987216;0.000498;0;0.119203;0.702452;0.992744;0.0005;0.006756;0.985105;0.000025;0.014875;0.996075;0;0.00392;0.988197;0.000489;0;0.99444;0.979705;0 0.000498;0.997615;0.992744;0.0005;0;0.5;0.875665;0.988197;0.000489;0.011314;0.994865;0.000005;0.00513;0.97265;0;0.02735;0.990406;0.000488;0;0.99607;0.989265;0.000005 0.0005;0.996075;0.988197;0.000489;0;0.119203;0.767456;0.990406;0.000488;0.009106;0.99444;0;0.00556;0.979705;0;0.020295;0.991066;0.00049;0.000005;0.99732;0.992175;0.000005 0.000489;0.97265;0.990406;0.000488;0;0.119203;0.870456;0.991066;0.00049;0.008444;0.99607;0;0.00393;0.989265;0.000005;0.01073;0.992453;0.00049;0.000005;0.99849;0.998165;0 0.000488;0.979705;0.991066;0.00049;0.000005;0.119203;0.404681;0.992453;0.00049;0.007056;0.99732;0;0.002675;0.992175;0.000005;0.00782;0.993078;0.000701;0;0.996475;0.997235;0 0.00049;0.989265;0.992453;0.00049;0.000005;0.5;0.528718;0.993078;0.000701;0.006219;0.99849;0;0.001505;0.998165;0;0.001835;0.986028;0.001689;0;0.99557;0.994495;0 0.00049;0.992175;0.993078;0.000701;0;0.119203;0.731059;0.986028;0.001689;0.012283;0.996475;0;0.003525;0.997235;0;0.002765;0.964063;0.005651;0;0.99068;0.99584;0 0.000701;0.998165;0.986028;0.001689;0;0.268941;0.927372;0.964063;0.005651;0.030286;0.99557;0;0.00443;0.994495;0;0.005505;0.959323;0.006267;0;0.990665;0.9978;0 0.001689;0.997235;0.964063;0.005651;0;0.880797;0.917208;0.959323;0.006267;0.03441;0.99068;0;0.00932;0.99584;0;0.00416;0.959971;0.006267;0;0.99072;0.999215;0 0.005651;0.994495;0.959323;0.006267;0;0.268941;0.78228;0.959971;0.006267;0.033762;0.990665;0;0.009335;0.9978;0;0.0022;0.988766;0.001747;0;0.997225;0.99971;0 0.006267;0.99584;0.959971;0.006267;0;0.119203;0.774692;0.988766;0.001747;0.009487;0.99072;0.000005;0.009275;0.999215;0;0.000785;0.994391;0.001031;0;0.99681;0.998675;0 0.006267;0.9978;0.988766;0.001747;0;0.268941;0.573709;0.994391;0.001031;0.004579;0.997225;0.000045;0.002735;0.99971;0;0.00029;0.995306;0.000733;0;0.995085;0.990985;0 0.001747;0.999215;0.994391;0.001031;0;0.731059;0.898256;0.995306;0.000733;0.003962;0.99681;0.00008;0.003115;0.998675;0;0.001325;0.99305;0.00079;0;0.97747;0.98036;0.0001 
0.001031;0.99971;0.995306;0.000733;0;0.731059;0.816977;0.99305;0.00079;0.00616;0.995085;0.000245;0.004675;0.990985;0;0.00901;0.983932;0.000736;0.0001;0.91756;0.87817;0.000095 0.000733;0.998675;0.99305;0.00079;0;0.982014;0.942133;0.983932;0.000736;0.015332;0.97747;0.000255;0.022275;0.98036;0.0001;0.01954;0.929899;0.000692;0.000095;0.621795;0.741475;0.000065 0.00079;0.990985;0.983932;0.000736;0.0001;0.731059;0.946495;0.929899;0.000692;0.069407;0.91756;0.00013;0.08231;0.87817;0.000095;0.12173;0.786519;0.000657;0.000065;0.306305;0.65228;0.000035 0.000736;0.98036;0.929899;0.000692;0.000095;0.047426;0.58589;0.786519;0.000657;0.212822;0.621795;0.00005;0.37815;0.741475;0.000065;0.25846;0.320147;0.000649;0.000035;0.137985;0.6676;0.000015 0.000692;0.87817;0.786519;0.000657;0.000065;0.268941;0.726115;0.320147;0.000649;0.679204;0.306305;0.000055;0.69364;0.65228;0.000035;0.347685;0.269077;0.00057;0.000015;0.226085;0.73308;0.00004 0.000649;0.65228;0.269077;0.00057;0.000015;0.017986;0.58686;0.652072;0.00051;0.347418;0.226085;0.000015;0.7739;0.73308;0.00004;0.26688;0.835977;0.000981;0.00001;0.69475;0.90209;0.00004 0.00057;0.6676;0.652072;0.00051;0.00004;0.047426;0.858635;0.835977;0.000981;0.163042;0.6709;0.00146;0.32764;0.839975;0.00001;0.160015;0.856337;0.01417;0.00004;0.701525;0.98958;0.00004 0.00051;0.73308;0.835977;0.000981;0.00001;0.268941;0.864244;0.856337;0.01417;0.129495;0.69475;0.016075;0.28918;0.90209;0.00004;0.09787;0.896084;0.003634;0.00004;0.689315;0.99145;0.000005 0.000981;0.839975;0.856337;0.01417;0.00004;0.880797;0.696355;0.896084;0.003634;0.100281;0.701525;0.009435;0.28904;0.98958;0.00004;0.010375;0.892637;0.002077;0.000005;0.76921;0.99113;0 0.01417;0.90209;0.896084;0.003634;0.00004;0.017986;0.427025;0.892637;0.002077;0.105286;0.689315;0.0048;0.30588;0.99145;0.000005;0.00855;0.919162;0.001536;0;0.783955;0.988235;0.000025 0.003634;0.98958;0.892637;0.002077;0.000005;0.268941;0.860926;0.919162;0.001536;0.079302;0.76921;0.00318;0.227615;0.99113;0;0.008865;0.919701;0.006413;0.000025;0.812805;0.98671;0.00003 0.002077;0.99145;0.919162;0.001536;0;0.119203;0.92484;0.919701;0.006413;0.073884;0.783955;0.007535;0.208505;0.988235;0.000025;0.01174;0.926317;0.009928;0.00003;0.808425;0.98278;0.00001 0.001536;0.99113;0.919701;0.006413;0.000025;0.880797;0.11165;0.926317;0.009928;0.063756;0.812805;0.010585;0.17661;0.98671;0.00003;0.013265;0.898436;0.009508;0.00001;0.82216;0.975785;0.000125 0.006413;0.988235;0.926317;0.009928;0.00003;0.047426;0.652036;0.898436;0.009508;0.092054;0.808425;0.006785;0.18479;0.98278;0.00001;0.017205;0.768988;0.003593;0.000125;0.81598;0.944165;0.000235 0.009928;0.98671;0.898436;0.009508;0.00001;0.047426;0.796571;0.768988;0.003593;0.22742;0.82216;0.009255;0.16858;0.975785;0.000125;0.024095;0.759377;0.003509;0.000235;0.843795;0.905635;0.000205 0.009508;0.98278;0.768988;0.003593;0.000125;0.993307;0.343891;0.759377;0.003509;0.237113;0.81598;0.008875;0.17514;0.944165;0.000235;0.0556;0.9015;0.000622;0.000205;0.83146;0.865585;0.000145 0.003593;0.975785;0.759377;0.003509;0.000235;0.047426;0.660607;0.9015;0.000622;0.097878;0.843795;0.000255;0.15595;0.905635;0.000205;0.09416;0.883214;0.000605;0.000145;0.831165;0.66026;0.000635 0.003509;0.944165;0.9015;0.000622;0.000205;0.017986;0.665299;0.883214;0.000605;0.116182;0.83146;0.00027;0.168265;0.865585;0.000145;0.134275;0.807254;0.008317;0.000635;0.79281;0.63202;0.000215 
0.000622;0.905635;0.883214;0.000605;0.000145;0.119203;0.763506;0.807254;0.008317;0.184432;0.831165;0.00066;0.16818;0.66026;0.000635;0.33911;0.800022;0.008268;0.000215;0.83577;0.693355;0.00048 0.000605;0.865585;0.807254;0.008317;0.000635;0.006693;0;0.800022;0.008268;0.191712;0.79281;0.00117;0.20602;0.63202;0.000215;0.36777;0.834787;0.008956;0.00048;0.83721;0.708485;0.0008 0.008317;0.66026;0.800022;0.008268;0.000215;0.002473;0.196866;0.834787;0.008956;0.156257;0.83577;0.00297;0.16126;0.693355;0.00048;0.306165;0.840311;0.008663;0.0008;0.837855;0.80477;0.00096 0.008268;0.63202;0.834787;0.008956;0.00048;0.047426;0.906447;0.840311;0.008663;0.151025;0.83721;0.00177;0.16102;0.708485;0.0008;0.29071;0.879978;0.001297;0.00096;0.861185;0.897515;0.00145 0.008956;0.693355;0.840311;0.008663;0.0008;0.006693;0;0.879978;0.001297;0.118725;0.837855;0.001585;0.16056;0.80477;0.00096;0.19427;0.918668;0.001536;0.00145;0.86976;0.94828;0.000555 0.008663;0.708485;0.879978;0.001297;0.00096;0.000335;0;0.918668;0.001536;0.079796;0.861185;0.00181;0.137005;0.897515;0.00145;0.101035;0.935537;0.00076;0.000555;0.870135;0.94317;0.00462 0.001297;0.80477;0.918668;0.001536;0.00145;0.002473;0.642906;0.935537;0.00076;0.063707;0.86976;0.000345;0.1299;0.94828;0.000555;0.05117;0.935059;0.002356;0.00462;0.862075;0.757925;0.005535 0.001536;0.897515;0.935537;0.00076;0.000555;0.119203;0.732433;0.935059;0.002356;0.062588;0.870135;0.00104;0.12883;0.94317;0.00462;0.052215;0.872384;0.00263;0.005535;0.831585;0.628115;0.0169 0.00076;0.94828;0.935059;0.002356;0.00462;0.017986;0;0.872384;0.00263;0.124985;0.862075;0.00093;0.13699;0.757925;0.005535;0.23654;0.487121;0.006569;0.0169;0.720505;0.445455;0.0052 0.215145;0.023915;0.436929;0.093897;0.176545;0.017986;0.42678;0.697251;0.024701;0.278049;0.71705;0.000505;0.28245;0.57;0.01179;0.41821;0.82364;0.009498;0.005895;0.892945;0.954385;0.001335 0.093897;0.199265;0.697251;0.024701;0.01179;0.047426;0.590976;0.82364;0.009498;0.166862;0.73817;0.000085;0.26175;0.88729;0.005895;0.10681;0.910777;0.006181;0.001335;0.987505;0.988275;0.000305 0.024701;0.57;0.82364;0.009498;0.005895;0.982014;0.445715;0.910777;0.006181;0.083044;0.892945;0.00016;0.106895;0.954385;0.001335;0.044285;0.97476;0.000229;0.000305;0.996145;0.99543;0.000095 0.009498;0.88729;0.910777;0.006181;0.001335;0.880797;0.462819;0.97476;0.000229;0.025011;0.987505;0.00027;0.01222;0.988275;0.000305;0.011425;0.997116;0.000178;0.000095;0.99748;0.99904;0.000005 0.006181;0.954385;0.97476;0.000229;0.000305;0.119203;0.628083;0.997116;0.000178;0.002704;0.996145;0.000325;0.00353;0.99543;0.000095;0.00447;0.998764;0.000203;0.000005;0.997555;0.999705;0 0.000229;0.988275;0.997116;0.000178;0.000095;0.5;0.537181;0.998764;0.000203;0.001035;0.99748;0.00049;0.00203;0.99904;0.000005;0.00096;0.999008;0.000209;0;0.994705;0.99978;0 0.000178;0.99543;0.998764;0.000203;0.000005;0.731059;0.879637;0.999008;0.000209;0.000779;0.997555;0.00051;0.00193;0.999705;0;0.00029;0.990253;0.000226;0;0.992255;0.999705;0 0.000203;0.99904;0.999008;0.000209;0;0.047426;0.612065;0.990253;0.000226;0.00952;0.994705;0.000555;0.00474;0.99978;0;0.000215;0.997233;0.000272;0;0.817805;0.999255;0 0.000209;0.999705;0.990253;0.000226;0;0.017986;0.53096;0.997233;0.000272;0.002494;0.992255;0.000685;0.00706;0.999705;0;0.00029;0.938901;0.000291;0;0.222375;0.992435;0.00005 0.000226;0.99978;0.997233;0.000272;0;0.731059;0.647484;0.938901;0.000291;0.060805;0.817805;0.000695;0.18149;0.999255;0;0.000745;0.72451;0.000367;0.00005;0.036905;0.947175;0.00003 
0.00013;0.922025;0.364249;0.00213;0.005715;0.047426;0.518991;0.308918;0.002281;0.688803;0.723365;0.00033;0.276305;0.12559;0.006335;0.86808;0.357071;0.020147;0.060185;0.8305;0.156635;0.095225 0.00213;0.668675;0.308918;0.002281;0.006335;0.119203;0.307464;0.357071;0.020147;0.622786;0.806825;0.00004;0.19314;0.12354;0.060185;0.81628;0.375391;0.042258;0.095225;0.92828;0.25798;0.09275 0.002281;0.12559;0.357071;0.020147;0.060185;0.268941;0.511748;0.375391;0.042258;0.582353;0.8305;0.000095;0.16941;0.156635;0.095225;0.74814;0.443494;0.048388;0.09275;0.928895;0.369355;0.035785 0.020147;0.12354;0.375391;0.042258;0.095225;0.952574;0.118888;0.443494;0.048388;0.50812;0.92828;0.000155;0.071565;0.25798;0.09275;0.649275;0.494535;0.028246;0.035785;0.99971;0.742055;0.02584 0.042258;0.156635;0.443494;0.048388;0.09275;0.268941;0.622929;0.494535;0.028246;0.477223;0.928895;0.00005;0.07106;0.369355;0.035785;0.594865;0.846338;0.026611;0.02584;0.999785;0.76705;0.024415 0.048388;0.25798;0.494535;0.028246;0.035785;0.047426;0.562423;0.846338;0.026611;0.127053;0.99971;0.000035;0.000255;0.742055;0.02584;0.23211;0.857148;0.031749;0.024415;0.9998;0.90537;0.016645 0.028246;0.369355;0.846338;0.026611;0.02584;0.5;0.797057;0.857148;0.031749;0.111103;0.999785;0.000035;0.00018;0.76705;0.024415;0.208535;0.869248;0.037365;0.016645;0.999785;0.923295;0.01349 0.026611;0.742055;0.857148;0.031749;0.024415;0.000335;0.172216;0.869248;0.037365;0.093385;0.9998;0.00001;0.00019;0.90537;0.016645;0.07798;0.863812;0.005462;0.01349;0.99969;0.940925;0.0069 0.031749;0.76705;0.869248;0.037365;0.016645;0.000911;0.356635;0.863812;0.005462;0.130727;0.999785;0.000005;0.00021;0.923295;0.01349;0.063215;0.869657;0.003265;0.0069;0.99849;0.95727;0.002485 0.037365;0.90537;0.863812;0.005462;0.01349;0.047426;0.408058;0.869657;0.003265;0.127078;0.99969;0.000005;0.000305;0.940925;0.0069;0.052175;0.918198;0.002636;0.002485;0.99212;0.94279;0.002545 0.005462;0.923295;0.869657;0.003265;0.0069;0.017986;0.558235;0.918198;0.002636;0.079164;0.99849;0.000005;0.001505;0.95727;0.002485;0.04024;0.971191;0.004413;0.002545;0.99019;0.942785;0.00162 0.003265;0.940925;0.918198;0.002636;0.002485;0.017986;0.337602;0.971191;0.004413;0.024393;0.99212;0.000025;0.00785;0.94279;0.002545;0.05466;0.970546;0.004215;0.00162;0.99108;0.95137;0.000585 0.002636;0.95727;0.971191;0.004413;0.002545;0.5;0.567584;0.970546;0.004215;0.025236;0.99019;0.000355;0.009455;0.942785;0.00162;0.055585;0.973876;0.004464;0.000585;0.99098;0.98178;0.000335 0.004413;0.94279;0.970546;0.004215;0.00162;0.002473;0.596283;0.973876;0.004464;0.02166;0.99108;0.002395;0.00652;0.95137;0.000585;0.04805;0.98581;0.003557;0.000335;0.992525;0.990695;0.00028 0.004215;0.942785;0.973876;0.004464;0.000585;0.006693;0.350236;0.98581;0.003557;0.010635;0.99098;0.00267;0.006355;0.98178;0.000335;0.017885;0.99319;0.00201;0.00028;0.994555;0.99393;0.00024 0.004464;0.95137;0.98581;0.003557;0.000335;0.017986;0.490001;0.99319;0.00201;0.004797;0.992525;0.003925;0.003545;0.990695;0.00028;0.00902;0.975266;0.001886;0.00024;0.993715;0.98917;0.000305 0.003557;0.98178;0.99319;0.00201;0.00028;0.002473;0.289667;0.975266;0.001886;0.022848;0.994555;0.00446;0.00099;0.99393;0.00024;0.005825;0.963298;0.001794;0.000305;0.925885;0.968695;0.00031 0.00201;0.990695;0.975266;0.001886;0.00024;0.952574;0.512497;0.963298;0.001794;0.034909;0.993715;0.00438;0.001905;0.98917;0.000305;0.010525;0.92978;0.001558;0.00031;0.540275;0.93395;0.00062 
0.001886;0.99393;0.963298;0.001794;0.000305;0.268941;0.630881;0.92978;0.001558;0.068664;0.925885;0.0038;0.070315;0.968695;0.00031;0.031;0.787683;0.001195;0.00062;0.355865;0.779765;0.00167 0.001794;0.98917;0.92978;0.001558;0.00031;0.047426;0.717684;0.787683;0.001195;0.211123;0.540275;0.00243;0.457295;0.93395;0.00062;0.065435;0.600138;0.005584;0.00167;0.31428;0.59979;0.00468 0.001558;0.968695;0.787683;0.001195;0.00062;0.731059;0.425557;0.600138;0.005584;0.394278;0.355865;0.00173;0.64241;0.779765;0.00167;0.21856;0.402265;0.006805;0.00468;0.138625;0.31866;0.0042 0.001195;0.93395;0.600138;0.005584;0.00167;0.119203;0.53544;0.402265;0.006805;0.590928;0.31428;0.00261;0.68311;0.59979;0.00468;0.395525;0.183378;0.00551;0.0042;0.15613;0.40126;0.00302 0.005088;0.76473;0.309906;0.00554;0.012715;0.002473;0.406127;0.777393;0.003535;0.219075;0.59713;0.00187;0.401;0.84125;0.007995;0.150765;0.807087;0.002486;0.003905;0.803955;0.948585;0.003685 0.00554;0.788885;0.777393;0.003535;0.007995;0.006693;0.556754;0.807087;0.002486;0.190428;0.63132;0.002555;0.36613;0.880335;0.003905;0.11576;0.887243;0.002843;0.003685;0.794365;0.936445;0.00316 0.003535;0.84125;0.807087;0.002486;0.003905;0.731059;0.433889;0.887243;0.002843;0.109914;0.803955;0.00384;0.192205;0.948585;0.003685;0.04773;0.893323;0.002102;0.00316;0.83752;0.93592;0.00189 0.002486;0.880335;0.887243;0.002843;0.003685;0.982014;0.479761;0.893323;0.002102;0.104576;0.794365;0.002205;0.20343;0.936445;0.00316;0.0604;0.900027;0.012638;0.00189;0.9926;0.97545;0.000095 0.002843;0.948585;0.893323;0.002102;0.00316;0.047426;0.361468;0.900027;0.012638;0.087333;0.83752;0.03516;0.127315;0.93592;0.00189;0.06219;0.975056;0.000199;0.000095;0.99454;0.98534;0.00001 0.002102;0.936445;0.900027;0.012638;0.00189;0.268941;0.554532;0.975056;0.000199;0.024744;0.9926;0;0.0074;0.97545;0.000095;0.02445;0.978655;0.000121;0.00001;0.9988;0.99275;0.00003 0.012638;0.93592;0.975056;0.000199;0.000095;0.047426;0.663739;0.978655;0.000121;0.021224;0.99454;0;0.00546;0.98534;0.00001;0.01465;0.977053;0.000113;0.00003;0.999965;0.993755;0.00003 0.000199;0.97545;0.978655;0.000121;0.00001;0.5;0.509249;0.977053;0.000113;0.022832;0.9988;0;0.0012;0.99275;0.00003;0.007215;0.992569;0.000105;0.00003;0.99999;0.997675;0.000025 0.000121;0.98534;0.977053;0.000113;0.00003;0.119203;0.556754;0.992569;0.000105;0.007324;0.999965;0;0.00003;0.993755;0.00003;0.006215;0.994002;0.000101;0.000025;0.999945;0.998245;0.00001 0.000113;0.99275;0.992569;0.000105;0.00003;0.268941;0.510748;0.994002;0.000101;0.005896;0.99999;0;0.00001;0.997675;0.000025;0.0023;0.994146;0.000097;0.00001;0.999945;0.998665;0.000015 0.000105;0.993755;0.994002;0.000101;0.000025;0.047426;0.668631;0.994146;0.000097;0.005757;0.999945;0;0.000055;0.998245;0.00001;0.001745;0.990392;0.000091;0.000015;0.999985;0.9958;0.00003 0.000101;0.997675;0.994146;0.000097;0.00001;0.880797;0.563653;0.990392;0.000091;0.009516;0.999945;0;0.000055;0.998665;0.000015;0.00132;0.976833;0.000095;0.00003;0.99955;0.96302;0.00002 0.000097;0.998245;0.990392;0.000091;0.000015;0.268941;0.624103;0.976833;0.000095;0.023071;0.999985;0;0.000015;0.9958;0.00003;0.004165;0.964373;0.000079;0.00002;0.99815;0.89511;0.00004 0.000091;0.998665;0.976833;0.000095;0.00003;0.047426;0.732824;0.964373;0.000079;0.035549;0.99955;0;0.00045;0.96302;0.00002;0.03696;0.927369;0.000084;0.00004;0.67103;0.3607;0.0012 0.000095;0.9958;0.964373;0.000079;0.00002;0.5;0.257309;0.927369;0.000084;0.072547;0.99815;0;0.00185;0.89511;0.00004;0.10485;0.53775;0.000467;0.0012;0.00005;0.029465;0.00055 
0.000079;0.96302;0.927369;0.000084;0.00004;0.268941;0.426291;0.53775;0.000467;0.461783;0.67103;0;0.32897;0.3607;0.0012;0.6381;0.03403;0.00025;0.00055;0.00004;0.165985;0.006415 0.00383;0.38926;0.236014;0.003239;0.003625;0.047426;0.567093;0.915407;0.003566;0.081023;0.98159;0;0.01841;0.947535;0.0003;0.052155;0.943768;0.003522;0.000025;0.998065;0.990915;0.000015 0.003239;0.67505;0.915407;0.003566;0.0003;0.119203;0.649991;0.943768;0.003522;0.05271;0.993855;0;0.006145;0.978435;0.000025;0.02154;0.97497;0.003647;0.000015;0.999925;0.9988;0.000005 0.003566;0.947535;0.943768;0.003522;0.000025;0.268941;0.168682;0.97497;0.003647;0.021383;0.998065;0;0.001935;0.990915;0.000015;0.00907;0.977859;0.006134;0.000005;0.99993;0.99963;0 0.003522;0.978435;0.97497;0.003647;0.000015;0.731059;0.626212;0.977859;0.006134;0.016007;0.999925;0;0.000075;0.9988;0.000005;0.001195;0.979186;0.002391;0;0.99992;0.99975;0 0.003647;0.990915;0.977859;0.006134;0.000005;0.047426;0.643595;0.979186;0.002391;0.018423;0.99993;0;0.00007;0.99963;0;0.00037;0.976956;0.019416;0;0.999955;0.999865;0 0.006134;0.9988;0.979186;0.002391;0;0.047426;0.633207;0.976956;0.019416;0.003626;0.99992;0;0.000075;0.99975;0;0.00025;0.977775;0.018765;0;0.999475;0.999735;0 0.002391;0.99963;0.976956;0.019416;0;0.731059;0.451652;0.977775;0.018765;0.003458;0.999955;0;0.00004;0.999865;0;0.000135;0.958516;0.019427;0;0.997625;0.998795;0.000005 0.019416;0.99975;0.977775;0.018765;0;0.5;0.351831;0.958516;0.019427;0.022057;0.999475;0;0.000525;0.999735;0;0.000265;0.939596;0.012129;0.000005;0.995565;0.995135;0.00001 0.018765;0.999865;0.958516;0.019427;0;0.017986;0.649764;0.939596;0.012129;0.048277;0.997625;0;0.002375;0.998795;0.000005;0.001205;0.929099;0.000103;0.00001;0.99212;0.93475;0.000075 0.019427;0.999735;0.939596;0.012129;0.000005;0.5;0.72889;0.929099;0.000103;0.070799;0.995565;0.000065;0.00437;0.995135;0.00001;0.004855;0.899209;0.000145;0.000075;0.69978;0.86029;0.000645 0.012129;0.998795;0.929099;0.000103;0.00001;0.047426;0;0.899209;0.000145;0.100646;0.99212;0.00015;0.00773;0.93475;0.000075;0.065175;0.688027;0.019032;0.000645;0.541115;0.76356;0.000835 0.000103;0.995135;0.899209;0.000145;0.000075;0.119203;0.529715;0.688027;0.019032;0.292942;0.69978;0.000195;0.300025;0.86029;0.000645;0.13907;0.535122;0.023184;0.000835;0.072645;0.577295;0.00155 0.000145;0.93475;0.688027;0.019032;0.000645;0.119203;0.594837;0.535122;0.023184;0.441695;0.541115;0.00068;0.458205;0.76356;0.000835;0.23561;0.253804;0.037863;0.00155;0.0537;0.539185;0.014415 0.016194;0.871655;0.474513;0.003796;0.00002;0.999089;0.854706;0.886393;0.003584;0.110021;0.72637;0.00009;0.273535;0.99664;0.000105;0.003255;0.972395;0.002091;0.00001;0.98776;0.999535;0.00001 0.003796;0.89761;0.886393;0.003584;0.000105;0.5;0.574443;0.972395;0.002091;0.025514;0.95491;0;0.04509;0.998805;0.00001;0.001185;0.983808;0.004594;0.00001;0.999035;0.999955;0 0.003584;0.99664;0.972395;0.002091;0.00001;0.017986;0.69572;0.983808;0.004594;0.011598;0.98776;0;0.01224;0.999535;0.00001;0.000455;0.988746;0.00354;0;0.99984;0.99996;0 0.002091;0.998805;0.983808;0.004594;0.00001;0.880797;0.587587;0.988746;0.00354;0.007714;0.999035;0;0.000965;0.999955;0;0.000045;0.987907;0.003515;0;0.999925;0.99997;0 0.004594;0.999535;0.988746;0.00354;0;0.268941;0.421407;0.987907;0.003515;0.008577;0.99984;0;0.00016;0.99996;0;0.00004;0.987366;0.003642;0;0.999925;0.99997;0 0.00354;0.999955;0.987907;0.003515;0;0.017986;0.703704;0.987366;0.003642;0.008992;0.999925;0;0.000075;0.99997;0;0.00003;0.992792;0.003669;0;0.99868;0.999075;0 
0.003515;0.99996;0.987366;0.003642;0;0.047426;0.541157;0.992792;0.003669;0.003539;0.999925;0;0.000075;0.99997;0;0.00003;0.978008;0.000283;0;0.995235;0.995175;0 0.003642;0.99997;0.992792;0.003669;0;0.119203;0.345925;0.978008;0.000283;0.021709;0.99868;0;0.00132;0.999075;0;0.000925;0.949489;0.000294;0;0.983435;0.93294;0 0.003669;0.99997;0.978008;0.000283;0;0.047426;0.492501;0.949489;0.000294;0.050217;0.995235;0;0.004765;0.995175;0;0.004825;0.887114;0.008341;0;0.09773;0.21798;0.00002 0.000283;0.999075;0.949489;0.000294;0;0.047426;0.69211;0.887114;0.008341;0.104545;0.983435;0;0.016565;0.93294;0;0.06706;0.162053;0.001193;0.00002;0.00002;0.00001;0.00013 0.058759;0.05644;0.087108;0.045649;0.005085;0.119203;0.515995;0.765074;0.000727;0.234199;0.83774;0.000005;0.162255;0.81827;0.00214;0.17959;0.882753;0.021212;0.00241;0.99501;0.95542;0.001065 0.045649;0.2046;0.765074;0.000727;0.00214;0.047426;0.485754;0.882753;0.021212;0.096033;0.961865;0.00001;0.038125;0.904835;0.00241;0.09275;0.921078;0.020797;0.001065;0.9979;0.99656;0.001015 0.000727;0.81827;0.882753;0.021212;0.00241;0.5;0.105458;0.921078;0.020797;0.058127;0.99501;0.00011;0.00488;0.95542;0.001065;0.04352;0.952567;0.01298;0.001015;0.99753;0.99849;0.000055 0.021212;0.904835;0.921078;0.020797;0.001065;0.982014;0.724321;0.952567;0.01298;0.034452;0.9979;0.000165;0.00193;0.99656;0.001015;0.002425;0.959229;0.008507;0.000055;0.997095;0.998675;0.000035 0.020797;0.95542;0.952567;0.01298;0.001015;0.017986;0.564145;0.959229;0.008507;0.032264;0.99753;0.00004;0.00243;0.99849;0.000055;0.001455;0.952852;0.004328;0.000035;0.99754;0.999615;0.00002 0.01298;0.99656;0.959229;0.008507;0.000055;0.119203;0.426046;0.952852;0.004328;0.042819;0.997095;0.000045;0.002855;0.998675;0.000035;0.00129;0.967478;0.00761;0.00002;0.96629;0.998805;0.000025 0.008507;0.99849;0.952852;0.004328;0.000035;0.731059;0.483007;0.967478;0.00761;0.02491;0.99754;0.00004;0.00242;0.999615;0.00002;0.00036;0.965641;0.013501;0.000025;0.93346;0.994795;0.000005 0.004328;0.998675;0.967478;0.00761;0.00002;0.993307;0.629017;0.965641;0.013501;0.020856;0.96629;0.00007;0.033635;0.998805;0.000025;0.00117;0.937103;0.009307;0.000005;0.79533;0.958305;0.000015 0.00761;0.999615;0.965641;0.013501;0.000025;0.047426;0.646571;0.937103;0.009307;0.05359;0.93346;0.00003;0.06651;0.994795;0.000005;0.0052;0.872211;0.006475;0.000015;0.29515;0.417065;0.0001 0.013501;0.998805;0.937103;0.009307;0.000005;0.006693;0.752688;0.872211;0.006475;0.121314;0.79533;0.000025;0.204645;0.958305;0.000015;0.04168;0.403012;0.010731;0.0001;0.086925;0.047445;0.0001 0.000248;0.047445;0.159499;0.01337;0.01208;0.047426;0.379187;0.376677;0.009884;0.613436;0.39702;0.009645;0.59333;0.25185;0.012515;0.73563;0.461333;0.047876;0.013425;0.54763;0.36767;0.033715 0.01337;0.075375;0.376677;0.009884;0.012515;0.017986;0.390788;0.461333;0.047876;0.490791;0.511055;0.00433;0.484615;0.355585;0.013425;0.63099;0.463707;0.06173;0.033715;0.499995;0.38941;0.11568 0.009884;0.25185;0.461333;0.047876;0.013425;0.731059;0.477515;0.463707;0.06173;0.474562;0.54763;0.009625;0.44274;0.36767;0.033715;0.598615;0.358241;0.107791;0.11568;0.43526;0.3975;0.120315 0.433684;0.354385;0.456107;0.285092;0.374965;0.000911;0.212822;0.451319;0.21337;0.335313;0.455275;0.318615;0.22611;0.34145;0.28912;0.369435;0.468819;0.128847;0.224045;0.48841;0.44711;0.151545 0.285092;0.34851;0.451319;0.21337;0.28912;0.047426;0.45289;0.468819;0.128847;0.402336;0.461635;0.120335;0.41803;0.36007;0.224045;0.41589;0.390947;0.152922;0.151545;0.443355;0.37222;0.1348 
0.21337;0.34145;0.468819;0.128847;0.224045;0.997527;0.357783;0.390947;0.152922;0.456131;0.48841;0.11718;0.39441;0.44711;0.151545;0.401345;0.28364;0.157172;0.1348;0.44061;0.38131;0.060585 0.009523;0.543745;0.303167;0.003024;0.001865;0.047426;0.585162;0.722601;0.000382;0.277015;0.76568;0;0.23432;0.84027;0.000335;0.15939;0.952793;0.002588;0.000035;0.97546;0.99394;0.000045 0.003024;0.62532;0.722601;0.000382;0.000335;0.047426;0.622694;0.952793;0.002588;0.044617;0.958605;0.000005;0.04139;0.97097;0.000035;0.02899;0.960747;0.010213;0.000045;0.99695;0.99964;0.00001 0.000382;0.84027;0.952793;0.002588;0.000035;0.993307;0.715856;0.960747;0.010213;0.029042;0.97546;0.000015;0.02453;0.99394;0.000045;0.006015;0.97127;0.009996;0.00001;0.999055;0.999745;0 0.002588;0.97097;0.960747;0.010213;0.000045;0.000911;0.409992;0.97127;0.009996;0.018734;0.99695;0.000145;0.002905;0.99964;0.00001;0.00035;0.978003;0.008433;0;0.999665;0.99988;0 0.010213;0.99394;0.97127;0.009996;0.00001;0.017986;0.621048;0.978003;0.008433;0.01356;0.999055;0.00017;0.00077;0.999745;0;0.00025;0.97468;0.005086;0;0.999495;0.99988;0 0.009996;0.99964;0.978003;0.008433;0;0.017986;0.572241;0.97468;0.005086;0.020234;0.999665;0.000175;0.00016;0.99988;0;0.00012;0.971391;0.002684;0;0.999035;0.99942;0 0.008433;0.999745;0.97468;0.005086;0;0.982014;0.779714;0.971391;0.002684;0.025924;0.999495;0.000175;0.000325;0.99988;0;0.00012;0.983543;0.002671;0;0.99605;0.998565;0 0.005086;0.99988;0.971391;0.002684;0;0.5;0.676121;0.983543;0.002671;0.013788;0.999035;0.000165;0.000805;0.99942;0;0.00058;0.976944;0.003737;0;0.820965;0.975085;0 0.002684;0.99988;0.983543;0.002671;0;0.047426;0.568811;0.976944;0.003737;0.019321;0.99605;0.00014;0.003815;0.998565;0;0.001435;0.90339;0.003918;0;0.62775;0.79208;0.00001 0.002671;0.99942;0.976944;0.003737;0;0.268941;0.478763;0.90339;0.003918;0.092692;0.820965;0.00012;0.178915;0.975085;0;0.024915;0.688066;0.001286;0.00001;0.0077;0.0004;0.000005 0.003737;0.998565;0.90339;0.003918;0;0.731059;0.151614;0.688066;0.001286;0.310651;0.62775;0.00013;0.37213;0.79208;0.00001;0.20791;0.0283;0.006459;0.000005;0.00589;0.00034;0.002375 0.015602;0.00034;0.007961;0.007639;0.014385;0.268941;0.680267;0.533576;0.01568;0.450742;0.362265;0.021065;0.61667;0.46985;0.016435;0.51371;0.528723;0.033213;0.029375;0.36286;0.46981;0.22959 0.007639;0.00034;0.533576;0.01568;0.016435;0.006693;0.531209;0.528723;0.033213;0.438063;0.3624;0.04356;0.59404;0.469805;0.029375;0.500815;0.526536;0.150858;0.22959;0.28257;0.00239;0.97829 0.01568;0.46985;0.528723;0.033213;0.029375;0.002473;0.284958;0.526536;0.150858;0.322606;0.36286;0.112155;0.524985;0.46981;0.22959;0.3006;0.098255;0.838856;0.97829;0.2797;0.00208;0.99112 0.123275;0.009245;0.02676;0.016445;0.01395;0.047426;0.367955;0.698385;0.004956;0.296659;0.67018;0.00288;0.32694;0.57482;0.00155;0.42363;0.726327;0.004661;0.000345;0.695505;0.625075;0.00009 0.016445;0.036845;0.698385;0.004956;0.00155;0.000335;0.576641;0.726327;0.004661;0.26901;0.69382;0.007765;0.298415;0.612335;0.000345;0.387315;0.744239;0.007437;0.00009;0.706095;0.991795;0.000305 0.004956;0.57482;0.726327;0.004661;0.000345;0.000123;0.232188;0.744239;0.007437;0.248323;0.695505;0.008815;0.29568;0.625075;0.00009;0.37483;0.875082;0.017297;0.000305;0.699285;0.998455;0.0002 0.004661;0.612335;0.744239;0.007437;0.00009;0.993307;0.514496;0.875082;0.017297;0.107625;0.706095;0.044475;0.24943;0.991795;0.000305;0.00791;0.859601;0.014418;0.0002;0.68658;0.99909;0.000175 
0.007437;0.625075;0.875082;0.017297;0.000305;0.731059;0.602765;0.859601;0.014418;0.125979;0.699285;0.028595;0.27212;0.998455;0.0002;0.00134;0.854739;0.003121;0.000175;0.696075;0.9978;0.000175 0.017297;0.991795;0.859601;0.014418;0.0002;0.000335;0.862238;0.854739;0.003121;0.14214;0.68658;0.001615;0.3118;0.99909;0.000175;0.00074;0.858006;0.001603;0.000175;0.685765;0.993635;0.00006 0.014418;0.998455;0.854739;0.003121;0.000175;0.006693;0.933392;0.858006;0.001603;0.140395;0.696075;0.004605;0.299325;0.9978;0.000175;0.00203;0.846209;0.005215;0.00006;0.53054;0.95905;0.000055 0.003121;0.99909;0.858006;0.001603;0.000175;0.268941;0.264833;0.846209;0.005215;0.148576;0.685765;0.01489;0.299345;0.993635;0.00006;0.006305;0.741611;0.085383;0.000055;0.139405;0.618475;0.00003 0.001603;0.9978;0.846209;0.005215;0.00006;0.047426;0.689761;0.741611;0.085383;0.173007;0.53054;0.24932;0.22014;0.95905;0.000055;0.040895;0.347325;0.095584;0.00003;0.02677;0.000095;0.00003 0.005215;0.993635;0.741611;0.085383;0.000055;0.5;0.912456;0.347325;0.095584;0.557091;0.139405;0.276475;0.58412;0.618475;0.00003;0.381495;0.019142;0.10675;0.00003;0.01456;0.00019;0.095115 0.052629;0.004505;0.027964;0.00876;0.00003;0.5;0.358933;0.806463;0.00049;0.193047;0.604655;0.001425;0.39392;0.99924;0.000005;0.000755;0.834297;0.000289;0;0.64702;0.999875;0 0.00876;0.006095;0.806463;0.00049;0.000005;0.119203;0.754545;0.834297;0.000289;0.165412;0.6077;0.000825;0.39147;0.999665;0;0.000335;0.849292;0.000396;0;0.766675;0.99963;0 0.00049;0.99924;0.834297;0.000289;0;0.006693;0.56193;0.849292;0.000396;0.150313;0.64702;0.001145;0.351835;0.999875;0;0.000125;0.858368;0.003047;0;0.771335;0.9995;0 0.000289;0.999665;0.849292;0.000396;0;0.880797;0.219428;0.858368;0.003047;0.138585;0.766675;0.001715;0.23161;0.99963;0;0.00037;0.882154;0.006152;0;0.780665;0.999055;0.00001 0.000396;0.999875;0.858368;0.003047;0;0.952574;0.633439;0.882154;0.006152;0.11169;0.771335;0.003365;0.22529;0.9995;0;0.0005;0.896231;0.007104;0.00001;0.773745;0.998305;0.0001 0.003047;0.99963;0.882154;0.006152;0;0.006693;0.481508;0.896231;0.007104;0.096669;0.780665;0.00441;0.21493;0.999055;0.00001;0.00094;0.875222;0.009731;0.0001;0.62026;0.87086;0.001515 0;0;0.154717;0.033982;0.00901;0.268941;0;0.450862;0.029515;0.519625;0.882605;0.000885;0.11651;0.01912;0.058145;0.92274;0.49784;0.000487;0.000925;0.982985;0.938575;0.00043 0.033982;0.00946;0.450862;0.029515;0.058145;0.017986;0;0.49784;0.000487;0.501675;0.973215;0.00005;0.02674;0.022465;0.000925;0.97661;0.671119;0.030746;0.00043;0.99773;0.941545;0.007965 0.029515;0.01912;0.49784;0.000487;0.000925;0.047426;0.783808;0.671119;0.030746;0.298134;0.982985;0.00001;0.017005;0.938575;0.00043;0.060995;0.926306;0.029383;0.007965;0.99928;0.974755;0.000575 0.000487;0.022465;0.671119;0.030746;0.00043;0.268941;0;0.926306;0.029383;0.044313;0.99773;0.000005;0.00227;0.941545;0.007965;0.05049;0.943898;0.023915;0.000575;0.999575;0.983035;0.00038 0.030746;0.938575;0.926306;0.029383;0.007965;0.999089;0;0.943898;0.023915;0.032188;0.99928;0;0.00072;0.974755;0.000575;0.024675;0.946757;0.02385;0.00038;0.9997;0.98451;0.00048 0.029383;0.941545;0.943898;0.023915;0.000575;0.119203;0;0.946757;0.02385;0.029393;0.999575;0;0.000425;0.983035;0.00038;0.016585;0.973433;0.010812;0.00048;0.99946;0.98263;0.000315 0.023915;0.974755;0.946757;0.02385;0.00038;0.119203;0;0.973433;0.010812;0.015755;0.9997;0;0.0003;0.98451;0.00048;0.01501;0.98272;0.00576;0.000315;0.99959;0.970715;0.00037 
0.02385;0.983035;0.973433;0.010812;0.00048;0.268941;0.457354;0.98272;0.00576;0.01152;0.99946;0;0.00054;0.98263;0.000315;0.017055;0.925524;0.004902;0.00037;0.99937;0.990615;0.000005 0.010812;0.98451;0.98272;0.00576;0.000315;0.880797;0;0.925524;0.004902;0.069574;0.99959;0;0.00041;0.970715;0.00037;0.028915;0.934042;0.004635;0.000005;0.99861;0.99236;0.000025 0.00576;0.98263;0.925524;0.004902;0.00037;0.047426;0;0.934042;0.004635;0.061323;0.99937;0;0.00063;0.990615;0.000005;0.00938;0.935987;0.004522;0.000025;0.99752;0.99594;0.000035 0.004902;0.970715;0.934042;0.004635;0.000005;0.047426;0.327613;0.935987;0.004522;0.059493;0.99861;0;0.00139;0.99236;0.000025;0.00762;0.930653;0.004258;0.000035;0.9634;0.992505;0.000025 0.004635;0.990615;0.935987;0.004522;0.000025;0.006693;0.734388;0.930653;0.004258;0.065089;0.99752;0;0.00248;0.99594;0.000035;0.004025;0.914764;0.004364;0.000025;0.953745;0.949785;0.000025 0.004522;0.99236;0.930653;0.004258;0.000035;0.047426;0.401072;0.914764;0.004364;0.080869;0.9634;0.00001;0.036585;0.992505;0.000025;0.007465;0.958324;0.004768;0.000025;0.87023;0.89622;0.000045 0.004258;0.99594;0.914764;0.004364;0.000025;0.119203;0.641757;0.958324;0.004768;0.036908;0.953745;0;0.046255;0.949785;0.000025;0.05019;0.912132;0.005029;0.000045;0.56526;0.746965;0.0002 0.004364;0.992505;0.958324;0.004768;0.000025;0.017986;0.356864;0.912132;0.005029;0.08284;0.87023;0.000015;0.129755;0.89622;0.000045;0.10374;0.758128;0.006438;0.0002;0.002995;0.20104;0.0003 0.016831;0.358595;0.578182;0.013816;0.002785;0.952574;0.323223;0.555629;0.005534;0.438835;0.98473;0.000925;0.014345;0.38724;0.002925;0.60983;0.945764;0.003957;0.002115;0.99187;0.96638;0.00051 0.013816;0.375275;0.555629;0.005534;0.002925;0.017986;0;0.945764;0.003957;0.050278;0.987845;0.00098;0.011175;0.866995;0.002115;0.130885;0.981456;0.002807;0.00051;0.99677;0.97407;0.00029 0.005534;0.38724;0.945764;0.003957;0.002115;0.047426;0.287;0.981456;0.002807;0.015737;0.99187;0.00097;0.00716;0.96638;0.00051;0.03311;0.986032;0.002599;0.00029;0.99697;0.98336;0.000355 0.003957;0.866995;0.981456;0.002807;0.00051;0.268941;0.280497;0.986032;0.002599;0.011367;0.99677;0.001135;0.002095;0.97407;0.00029;0.025635;0.963723;0.002297;0.000355;0.99642;0.993795;0.000235 0.002807;0.96638;0.986032;0.002599;0.00029;0.997527;0.787346;0.963723;0.002297;0.033983;0.99697;0.001105;0.001935;0.98336;0.000355;0.016285;0.946536;0.002093;0.000235;0.976085;0.99305;0.000045 0.002599;0.97407;0.963723;0.002297;0.000355;0.731059;0.393888;0.946536;0.002093;0.051368;0.99642;0.001135;0.00244;0.993795;0.000235;0.005965;0.899785;0.001629;0.000045;0.94598;0.99252;0.00001 0.002297;0.98336;0.946536;0.002093;0.000235;0.119203;0.664408;0.899785;0.001629;0.098585;0.976085;0.000975;0.02294;0.99305;0.000045;0.006905;0.957911;0.001307;0.00001;0.90795;0.964075;0.000025 0.002093;0.993795;0.899785;0.001629;0.000045;0.047426;0.541653;0.957911;0.001307;0.04078;0.94598;0.00086;0.053155;0.99252;0.00001;0.00747;0.925846;0.000656;0.000025;0.929;0.962145;0.00002 0.001629;0.99305;0.957911;0.001307;0.00001;0.997527;0.461824;0.925846;0.000656;0.073499;0.90795;0.000145;0.0919;0.964075;0.000025;0.03591;0.931595;0.003894;0.00002;0.90583;0.992855;0 0.001307;0.99252;0.925846;0.000656;0.000025;0.006693;0.040738;0.931595;0.003894;0.06451;0.929;0.00005;0.070945;0.962145;0.00002;0.037835;0.939859;0.00652;0;0.93054;0.99086;0 0.000656;0.964075;0.931595;0.003894;0.00002;0.119203;0.348418;0.939859;0.00652;0.05362;0.90583;0.000045;0.094125;0.992855;0;0.007145;0.946984;0.015602;0;0.98002;0.99467;0.000025 
0.003894;0.962145;0.939859;0.00652;0;0.047426;0.552308;0.946984;0.015602;0.037413;0.93054;0.00009;0.06937;0.99086;0;0.00914;0.952612;0.019741;0.000025;0.984675;0.99892;0.00001 0.00652;0.992855;0.946984;0.015602;0;0.5;0.656334;0.952612;0.019741;0.027647;0.98002;0.004865;0.015115;0.99467;0.000025;0.005305;0.975809;0.005229;0.00001;0.981175;0.99972;0.000005 0.015602;0.99086;0.952612;0.019741;0.000025;0.5;0.110171;0.975809;0.005229;0.018963;0.984675;0.00638;0.008945;0.99892;0.00001;0.001075;0.9729;0.002372;0.000005;0.98068;0.99965;0.000005 0.019741;0.99467;0.975809;0.005229;0.00001;0.119203;0.410234;0.9729;0.002372;0.024725;0.981175;0.00664;0.012175;0.99972;0.000005;0.000275;0.967987;0.002747;0.000005;0.979735;0.999585;0.000005 0.005229;0.99892;0.9729;0.002372;0.000005;0.731059;0.43561;0.967987;0.002747;0.029265;0.98068;0.00781;0.01151;0.99965;0.000005;0.000345;0.964951;0.002741;0.000005;0.94575;0.998035;0.00001 0.002372;0.99972;0.967987;0.002747;0.000005;0.993307;0.873471;0.964951;0.002741;0.03231;0.979735;0.00782;0.01245;0.999585;0.000005;0.00041;0.949247;0.005199;0.00001;0.80056;0.994005;0.000005 0.002747;0.99965;0.964951;0.002741;0.000005;0.268941;0.076209;0.949247;0.005199;0.045555;0.94575;0.0079;0.04635;0.998035;0.00001;0.00196;0.887366;0.005654;0.000005;0.688945;0.9708;0.00001 0.002741;0.999585;0.949247;0.005199;0.00001;0.047426;0.599888;0.887366;0.005654;0.106978;0.80056;0.00768;0.19176;0.994005;0.000005;0.005985;0.817977;0.001293;0.00001;0.53396;0.58927;0.000075 0.005199;0.998035;0.887366;0.005654;0.000005;0.119203;0.546862;0.817977;0.001293;0.180732;0.688945;0.003475;0.30758;0.9708;0.00001;0.029195;0.480834;0.001184;0.000075;0.06866;0.290245;0.00006 0.039691;0.29076;0.217925;0.037461;0.001875;0.006693;0.033118;0.897403;0.003192;0.099406;0.89427;0.00116;0.104575;0.84816;0.004905;0.146935;0.943572;0.004495;0.004455;0.94774;0.96776;0.000035 0.037461;0.35052;0.897403;0.003192;0.004905;0.731059;0.640377;0.943572;0.004495;0.051933;0.921725;0.005605;0.07267;0.94332;0.004455;0.052225;0.960054;0.003227;0.000035;0.945885;0.99667;0.000065 0.003192;0.84816;0.943572;0.004495;0.004455;0.119203;0.617512;0.960054;0.003227;0.036719;0.94774;0.006435;0.04583;0.96776;0.000035;0.0322;0.965698;0.006847;0.000065;0.978995;0.99865;0.000015 0.004495;0.94332;0.960054;0.003227;0.000035;0.119203;0.880271;0.965698;0.006847;0.027455;0.945885;0.00681;0.04731;0.99667;0.000065;0.00326;0.977105;0.006575;0.000015;0.97542;0.99948;0.00001 0.003227;0.96776;0.965698;0.006847;0.000065;0.047426;0.949213;0.977105;0.006575;0.016322;0.978995;0.00609;0.01492;0.99865;0.000015;0.001335;0.968994;0.00653;0.00001;0.9719;0.995305;0.00002 0.006847;0.99667;0.977105;0.006575;0.000015;0.119203;0.656109;0.968994;0.00653;0.024472;0.97542;0.00669;0.017885;0.99948;0.00001;0.0005;0.963838;0.007135;0.00002;0.85043;0.882385;0.000075 0.006575;0.99865;0.968994;0.00653;0.00001;0.731059;0.656334;0.963838;0.007135;0.029026;0.9719;0.007075;0.021025;0.995305;0.00002;0.00467;0.840779;0.006575;0.000075;0.22736;0.000145;0.00002 0.00653;0.99948;0.963838;0.007135;0.00002;0.119203;0.573464;0.840779;0.006575;0.152646;0.85043;0.006285;0.143285;0.882385;0.000075;0.11754;0.091996;0.003345;0.00002;0.211615;0.000335;0.00004 0.004151;0.000335;0.088489;0.002625;0.000015;0.006693;0.383906;0.84977;0.000989;0.149242;0.667905;0.000715;0.331385;0.95094;0.000005;0.049055;0.876225;0.001092;0.00002;0.735625;0.987495;0.00011 
0.002625;0.005325;0.84977;0.000989;0.000005;0.017986;0.512997;0.876225;0.001092;0.122681;0.68807;0.00101;0.310915;0.986735;0.00002;0.013245;0.899106;0.012279;0.00011;0.87947;0.99907;0.000085 0.000989;0.95094;0.876225;0.001092;0.00002;0.047426;0.709921;0.899106;0.012279;0.088617;0.735625;0.03448;0.229895;0.987495;0.00011;0.0124;0.953778;0.012985;0.000085;0.89393;0.999365;0.00004 0.001092;0.986735;0.899106;0.012279;0.00011;0.268941;0.394126;0.953778;0.012985;0.033236;0.87947;0.03662;0.083905;0.99907;0.000085;0.000845;0.943407;0.012475;0.00004;0.91121;0.99955;0.000025 0.012279;0.987495;0.953778;0.012985;0.000085;0.006693;0.587345;0.943407;0.012475;0.044118;0.89393;0.03726;0.06881;0.999365;0.00004;0.000595;0.93755;0.015337;0.000025;0.93434;0.999645;0 0.012985;0.99907;0.943407;0.012475;0.00004;0.017986;0.415081;0.93755;0.015337;0.047114;0.91121;0.041845;0.046945;0.99955;0.000025;0.000425;0.947392;0.014637;0;0.94503;0.998915;0.000005 0.012475;0.999365;0.93755;0.015337;0.000025;0.017986;0.165896;0.947392;0.014637;0.037967;0.93434;0.03917;0.026485;0.999645;0;0.00035;0.919015;0.01572;0.000005;0.93852;0.99677;0 0.015337;0.99955;0.947392;0.014637;0;0.5;0.397474;0.919015;0.01572;0.065266;0.94503;0.03722;0.01775;0.998915;0.000005;0.001085;0.929965;0.016396;0;0.915355;0.990225;0.00004 0.014637;0.999645;0.919015;0.01572;0.000005;0.268941;0.639455;0.929965;0.016396;0.053639;0.93852;0.03296;0.02852;0.99677;0;0.00323;0.926139;0.014298;0.00004;0.545265;0.840905;0.000195 0.01572;0.998915;0.929965;0.016396;0;0.047426;0.591942;0.926139;0.014298;0.059563;0.915355;0.02418;0.060465;0.990225;0.00004;0.009735;0.738654;0.011581;0.000195;0.013195;0.04587;0.000325 0;0;0;0;0;0.5;0;0.391082;0.042862;0.56605;0.580285;0.00943;0.41028;0.20188;0.076295;0.72182;0.378362;0.01362;0.01168;0.57477;0.174885;0.014035 0;0;0.391082;0.042862;0.076295;0.047426;0.924909;0.378362;0.01362;0.608015;0.585175;0.01556;0.399265;0.17155;0.01168;0.816765;0.374827;0.028905;0.014035;0.432765;0.13603;0.01416 0.042862;0.20188;0.378362;0.01362;0.01168;0.047426;0.592425;0.374827;0.028905;0.59627;0.57477;0.043775;0.381455;0.174885;0.014035;0.811085;0.284398;0.04922;0.01416;0.38998;0.08367;0.00108 0.01362;0.17155;0.374827;0.028905;0.014035;0.268941;0;0.284398;0.04922;0.666382;0.432765;0.08428;0.482955;0.13603;0.01416;0.84981;0.236825;0.01542;0.00108;0.19747;0.031885;0.001405 0.028905;0.174885;0.284398;0.04922;0.01416;0.119203;0.935896;0.236825;0.01542;0.747752;0.38998;0.02976;0.580255;0.08367;0.00108;0.91525;0.114677;0.008075;0.001405;0.116255;0.106835;0.002185 0.196522;0.006;0.085432;0.067487;0.13455;0.268941;0.524231;0.72818;0.020202;0.251616;0.21978;0.059835;0.72038;0.9791;0.00059;0.02031;0.929024;0.000584;0.000005;0.824555;0.997305;0 0.067487;0.00652;0.72818;0.020202;0.00059;0.017986;0.733802;0.929024;0.000584;0.07039;0.79313;0.001565;0.205305;0.994305;0.000005;0.005685;0.940505;0.000199;0;0.994645;0.999965;0 0.020202;0.9791;0.929024;0.000584;0.000005;0.047426;0.582732;0.940505;0.000199;0.059294;0.824555;0.000425;0.175015;0.997305;0;0.002695;0.995205;0.000782;0;0.995995;0.999985;0 0.000584;0.994305;0.940505;0.000199;0;0.993307;0.764408;0.995205;0.000782;0.004014;0.994645;0.000455;0.004905;0.999965;0;0.000035;0.995366;0.00036;0;0.997315;0.999985;0 0.000199;0.997305;0.995205;0.000782;0;0.5;0.67787;0.995366;0.00036;0.004275;0.995995;0.00092;0.003085;0.999985;0;0.000015;0.995832;0.000262;0;0.998145;0.999955;0 
0.000782;0.999965;0.995366;0.00036;0;0.047426;0.643365;0.995832;0.000262;0.003905;0.997315;0.00063;0.002055;0.999985;0;0.000015;0.982158;0.003321;0;0.996555;0.999705;0 0.00036;0.999985;0.995832;0.000262;0;0.731059;0.71095;0.982158;0.003321;0.014521;0.998145;0.000355;0.0015;0.999955;0;0.000045;0.998649;0.000167;0;0.983645;0.99948;0 0.000262;0.999985;0.982158;0.003321;0;0.880797;0.110073;0.998649;0.000167;0.001185;0.996555;0.000345;0.003105;0.999705;0;0.000295;0.942811;0.000159;0;0.96035;0.993775;0 0.003321;0.999955;0.998649;0.000167;0;0.017986;0.594355;0.942811;0.000159;0.05703;0.983645;0.00032;0.016035;0.99948;0;0.00052;0.861163;0.000161;0;0.8695;0.951335;0.000025 0.000167;0.999705;0.942811;0.000159;0;0.047426;0.642447;0.861163;0.000161;0.138676;0.96035;0.000325;0.039325;0.993775;0;0.006225;0.741963;0.000211;0.000025;0.66911;0.65815;0.000665 0.000159;0.99948;0.861163;0.000161;0;0.952574;0.691043;0.741963;0.000211;0.257824;0.8695;0.00044;0.13006;0.951335;0.000025;0.048635;0.507518;0.000489;0.000665;0.008865;0.14373;0.000185 0.048465;0.037485;0.039065;0.00768;0.00252;0.047426;0.389123;0.461798;0.004353;0.533849;0.019205;0.0099;0.970895;0.42302;0.00299;0.57399;0.533577;0.022381;0.00311;0.130295;0.941575;0.009075 0.00768;0.110085;0.461798;0.004353;0.00299;0.982014;0.658586;0.533577;0.022381;0.44404;0.104295;0.05235;0.84335;0.54824;0.00311;0.44865;0.671838;0.058067;0.009075;0.515535;0.96784;0.01217 0.004353;0.42302;0.533577;0.022381;0.00311;0.119203;0.604201;0.671838;0.058067;0.270094;0.130295;0.152835;0.71687;0.941575;0.009075;0.049345;0.822636;0.122701;0.01217;0.508255;0.97043;0.013275 0.022381;0.54824;0.671838;0.058067;0.009075;0.952574;0.289873;0.822636;0.122701;0.054661;0.515535;0.34076;0.143705;0.96784;0.01217;0.019985;0.822419;0.123597;0.013275;0.507845;0.97443;0.01317 0.058067;0.941575;0.822636;0.122701;0.01217;0.006693;0.519989;0.822419;0.123597;0.053988;0.508255;0.3463;0.14545;0.97043;0.013275;0.0163;0.823818;0.140793;0.01317;0.264705;0.976055;0.01299 0.122701;0.96784;0.822419;0.123597;0.013275;0.268941;0.868641;0.823818;0.140793;0.03539;0.507845;0.40898;0.08318;0.97443;0.01317;0.0124;0.724353;0.140958;0.01299;0.243235;0.950135;0.01436 0.123597;0.97043;0.823818;0.140793;0.01317;0.731059;0.807213;0.724353;0.140958;0.134688;0.264705;0.40967;0.325625;0.976055;0.01299;0.01095;0.681298;0.141753;0.01436;0.1433;0.943525;0.014515 0.140793;0.97443;0.724353;0.140958;0.01299;0.006693;0.443986;0.681298;0.141753;0.17695;0.243235;0.410685;0.34608;0.950135;0.01436;0.035505;0.655003;0.177905;0.014515;0.105825;0.88375;0.02716 0.140958;0.976055;0.681298;0.141753;0.01436;0.047426;0.577861;0.655003;0.177905;0.167092;0.1433;0.519005;0.33769;0.943525;0.014515;0.041965;0.439965;0.176959;0.02716;0.064255;0.6663;0.02771 0.135941;0.982965;0.494251;0.130105;0.0005;0.017986;0.82419;0.687801;0.076645;0.235555;0.11218;0.225745;0.66208;0.99447;0.00023;0.0053;0.715229;0.050735;0.000145;0.346325;0.99888;0.000135 0.130105;0.991505;0.687801;0.076645;0.00023;0.880797;0.365169;0.715229;0.050735;0.234036;0.152095;0.1476;0.700305;0.99811;0.000145;0.001745;0.780232;0.027103;0.000135;0.48979;0.999215;0.000155 0.076645;0.99447;0.715229;0.050735;0.000145;0.006693;0.73711;0.780232;0.027103;0.192662;0.346325;0.081115;0.572555;0.99888;0.000135;0.00098;0.827835;0.011504;0.000155;0.556275;0.999195;0.000185 0.050735;0.99811;0.780232;0.027103;0.000135;0.017986;0.89697;0.827835;0.011504;0.160661;0.48979;0.0343;0.47591;0.999215;0.000155;0.00063;0.849788;0.012991;0.000185;0.572015;0.99947;0.000185 
0.027103;0.99888;0.827835;0.011504;0.000155;0.047426;0.951386;0.849788;0.012991;0.137223;0.556275;0.03873;0.405;0.999195;0.000185;0.00062;0.856847;0.03394;0.000185;0.5826;0.998835;0.00028 0.011504;0.999215;0.849788;0.012991;0.000185;0.731059;0.333144;0.856847;0.03394;0.109214;0.572015;0.10158;0.32641;0.99947;0.000185;0.000345;0.860244;0.040007;0.00028;0.611555;0.99878;0.000335 0.012991;0.999195;0.856847;0.03394;0.000185;0.119203;0.697833;0.860244;0.040007;0.099745;0.5826;0.119685;0.29771;0.998835;0.00028;0.00088;0.869899;0.039657;0.000335;0.61741;0.9967;0.00049 0.03394;0.99947;0.860244;0.040007;0.00028;0.047426;0.91849;0.869899;0.039657;0.09044;0.611555;0.11858;0.26986;0.99878;0.000335;0.00088;0.868502;0.041394;0.00049;0.617595;0.98451;0.00055 0.040007;0.998835;0.869899;0.039657;0.000335;0.017986;0.920488;0.868502;0.041394;0.090102;0.61741;0.123635;0.25895;0.9967;0.00049;0.00281;0.846343;0.049927;0.00055;0.585585;0.80116;0.000615 0.039657;0.99878;0.868502;0.041394;0.00049;0.731059;0.460334;0.846343;0.049927;0.103728;0.617595;0.149175;0.23323;0.98451;0.00055;0.014935;0.768281;0.01581;0.000615;0.0927;0.211975;0.0005 0.041394;0.9967;0.846343;0.049927;0.00055;0.880797;0.939289;0.768281;0.01581;0.215911;0.585585;0.046755;0.367665;0.80116;0.000615;0.198225;0.38895;0.010584;0.0005;0.001205;0.03347;0.00035 0.656645;0.00001;0.003507;0.343683;0.03902;0.002473;0.042574;0.183935;0.034452;0.78161;0.00176;0.08322;0.915015;0.01635;0.01423;0.969415;0.483025;0.015169;0.000785;0.15604;0.83889;0.015835 0.343683;0.000045;0.183935;0.034452;0.01423;0.002473;0.376366;0.483025;0.015169;0.501806;0.035615;0.044295;0.92009;0.78056;0.000785;0.218655;0.659011;0.036557;0.015835;0.48023;0.914475;0.0037 0.034452;0.01635;0.483025;0.015169;0.000785;0.017986;0.829911;0.659011;0.036557;0.304432;0.15604;0.090095;0.753865;0.83889;0.015835;0.145275;0.788224;0.053553;0.0037;0.541425;0.962635;0.001565 0.015169;0.78056;0.659011;0.036557;0.015835;0.047426;0.52822;0.788224;0.053553;0.158221;0.48023;0.15302;0.366745;0.914475;0.0037;0.081825;0.816171;0.06148;0.001565;0.543155;0.96357;0.00086 0.036557;0.83889;0.788224;0.053553;0.0037;0.119203;0.300063;0.816171;0.06148;0.122344;0.541425;0.17878;0.27979;0.962635;0.001565;0.03579;0.61041;0.06702;0.00086;0.545335;0.940435;0.000375 0.053553;0.914475;0.816171;0.06148;0.001565;0.119203;0.051612;0.61041;0.06702;0.322567;0.543155;0.19621;0.26063;0.96357;0.00086;0.035565;0.757486;0.042628;0.000375;0.55229;0.98244;0.0001 0.06702;0.96357;0.757486;0.042628;0.000375;0.5;0.53917;0.795484;0.028673;0.175842;0.55229;0.085855;0.36185;0.98244;0.0001;0.01746;0.843293;0.009765;0.000065;0.829545;0.992755;0.000075 0.042628;0.940435;0.795484;0.028673;0.0001;0.017986;0.5045;0.843293;0.009765;0.146943;0.737355;0.02917;0.233475;0.984405;0.000065;0.01553;0.922754;0.009138;0.000075;0.845565;0.99923;0.000095 0.028673;0.98244;0.843293;0.009765;0.000065;0.006693;0.822882;0.922754;0.009138;0.068108;0.829545;0.02728;0.143175;0.992755;0.000075;0.00717;0.947489;0.013114;0.000095;0.87844;0.999195;0.000085 0.009765;0.984405;0.922754;0.009138;0.000075;0.731059;0.226356;0.947489;0.013114;0.039399;0.845565;0.03919;0.11525;0.99923;0.000095;0.000675;0.950012;0.014957;0.000085;0.85613;0.999005;0.000095 0.009138;0.992755;0.947489;0.013114;0.000095;0.880797;0.762964;0.950012;0.014957;0.035031;0.87844;0.043285;0.078275;0.999195;0.000085;0.00072;0.933441;0.01297;0.000095;0.87702;0.99868;0.000135 
0.013114;0.99923;0.950012;0.014957;0.000085;0.119203;0.49575;0.933441;0.01297;0.053589;0.85613;0.037315;0.106555;0.999005;0.000095;0.0009;0.939395;0.011747;0.000135;0.85386;0.75198;0.0001 0.014957;0.999195;0.933441;0.01297;0.000095;0.119203;0.926967;0.939395;0.011747;0.04886;0.87702;0.03505;0.087935;0.99868;0.000135;0.001185;0.843773;0.010351;0.0001;0.845845;0.592815;0.0001 0.01297;0.999005;0.939395;0.011747;0.000135;0.993307;0.82977;0.843773;0.010351;0.145872;0.85386;0.030895;0.11524;0.75198;0.0001;0.24791;0.783118;0.008009;0.0001;0.54552;0.38329;0.00025 0.011747;0.99868;0.843773;0.010351;0.0001;0.017986;0.157626;0.783118;0.008009;0.208872;0.845845;0.02387;0.130285;0.592815;0.0001;0.40708;0.362769;0.007787;0.00025;0.01407;0.00516;0.000135 0.135499;0.05353;0.114052;0.296745;0.05371;0.006693;0.700987;0.302434;0.278437;0.419129;0.293185;0.315315;0.3915;0.294375;0.074375;0.63125;0.43709;0.163904;0.089695;0.56235;0.76037;0.078715 0.296745;0.061315;0.302434;0.278437;0.074375;0.119203;0.623868;0.43709;0.163904;0.399003;0.56148;0.40194;0.03658;0.40252;0.089695;0.507775;0.564683;0.160196;0.078715;0.57023;0.834395;0.063955 0.278437;0.294375;0.43709;0.163904;0.089695;0.880797;0.122711;0.564683;0.160196;0.275119;0.56235;0.40178;0.035865;0.76037;0.078715;0.160915;0.631301;0.261389;0.063955;0.606495;0.84442;0.053555 0.163904;0.40252;0.564683;0.160196;0.078715;0.047426;0.272297;0.631301;0.261389;0.107309;0.57023;0.36351;0.06626;0.834395;0.063955;0.10165;0.646499;0.16864;0.053555;0.599175;0.863395;0.052585 0.160196;0.76037;0.631301;0.261389;0.063955;0.119203;0.39556;0.646499;0.16864;0.184863;0.606495;0.28002;0.113485;0.84442;0.053555;0.10203;0.712859;0.106811;0.052585;0.589005;0.86396;0.021865 0.261389;0.834395;0.646499;0.16864;0.053555;0.006693;0.37332;0.712859;0.106811;0.180332;0.599175;0.226645;0.174185;0.863395;0.052585;0.08402;0.64705;0.04561;0.021865;0.524695;0.816375;0.011905 0.16864;0.84442;0.712859;0.106811;0.052585;0.006693;0.358243;0.64705;0.04561;0.307341;0.589005;0.09895;0.312045;0.86396;0.021865;0.114175;0.614437;0.027183;0.011905;0.45374;0.768515;0.00949 0.106811;0.863395;0.64705;0.04561;0.021865;0.268941;0.304915;0.614437;0.027183;0.358381;0.524695;0.069075;0.406225;0.816375;0.011905;0.171725;0.703697;0.011573;0.00949;0.29578;0.684785;0.00636 0.04561;0.86396;0.614437;0.027183;0.011905;0.5;0.724721;0.703697;0.011573;0.284728;0.45374;0.023365;0.522895;0.768515;0.00949;0.22199;0.534543;0.011006;0.00636;0.0345;0.6574;0.00472 0.027183;0.816375;0.703697;0.011573;0.00949;0.268941;0.282114;0.534543;0.011006;0.454454;0.29578;0.02253;0.6817;0.684785;0.00636;0.308855;0.232159;0.005177;0.00472;0.031035;0.700195;0.02213 0.00473;0.3241;0.350221;0.00506;0.000755;0.047426;0.143441;0.699583;0.073229;0.227185;0.553395;0.21869;0.227905;0.96644;0.00041;0.03315;0.816351;0.070852;0.000155;0.566575;0.99858;0.000105 0.00506;0.366695;0.699583;0.073229;0.00041;0.952574;0.84064;0.816351;0.070852;0.112796;0.557975;0.2078;0.23422;0.997025;0.000155;0.00282;0.833272;0.071005;0.000105;0.563555;0.99931;0.00008 0.073229;0.96644;0.816351;0.070852;0.000155;0.047426;0.760058;0.833272;0.071005;0.095722;0.566575;0.208185;0.22524;0.99858;0.000105;0.001315;0.825541;0.064411;0.00008;0.537945;0.99643;0.00007 0.070852;0.997025;0.833272;0.071005;0.000105;0.952574;0.221836;0.825541;0.064411;0.110047;0.563555;0.187865;0.24858;0.99931;0.00008;0.000605;0.76448;0.057256;0.00007;0.529915;0.977655;0.00004 
0.071005;0.99858;0.825541;0.064411;0.00008;0.047426;0.944486;0.76448;0.057256;0.178265;0.537945;0.16652;0.295535;0.99643;0.00007;0.0035;0.799794;0.039371;0.00004;0.491185;0.85463;0.000045 0.064411;0.99931;0.76448;0.057256;0.00007;0.006693;0.63158;0.799794;0.039371;0.160835;0.529915;0.11448;0.355605;0.977655;0.00004;0.022305;0.729742;0.006509;0.000045;0.40531;0.444645;0.000035 0.057256;0.99643;0.799794;0.039371;0.00004;0.047426;0.907795;0.729742;0.006509;0.263749;0.491185;0.016765;0.49205;0.85463;0.000045;0.145325;0.426999;0.007413;0.000035;0.02225;0.00028;0.000575 0.145666;0.000695;0.033272;0.055623;0.00853;0.5;0.541405;0.926382;0.000096;0.073522;0.843945;0.00017;0.15588;0.970595;0.00006;0.02935;0.940511;0.000023;0;0.953875;0.98702;0 0.055623;0.006995;0.926382;0.000096;0.00006;0.047426;0.699937;0.940511;0.000023;0.059467;0.862105;0.00001;0.137885;0.97443;0;0.02557;0.976222;0.000022;0;0.982095;0.998355;0.000005 0.000096;0.970595;0.940511;0.000023;0;0.017986;0.671284;0.976222;0.000022;0.023757;0.953875;0.00001;0.04612;0.98702;0;0.01298;0.993243;0.000024;0.000005;0.994185;0.99968;0 0.000023;0.97443;0.976222;0.000022;0;0.731059;0.328495;0.993243;0.000024;0.006733;0.982095;0.00001;0.01789;0.998355;0.000005;0.001645;0.993217;0.000019;0;0.994545;0.99993;0 0.000022;0.98702;0.993243;0.000024;0.000005;0.5;0.532205;0.993217;0.000019;0.006764;0.994185;0;0.005815;0.99968;0;0.00032;0.981312;0.00002;0;0.99868;0.999955;0 0.000024;0.998355;0.993217;0.000019;0;0.119203;0.758413;0.981312;0.00002;0.018669;0.994545;0;0.005455;0.99993;0;0.00007;0.986215;0.000021;0;0.99815;0.99994;0 0.000019;0.99968;0.981312;0.00002;0;0.880797;0.915444;0.986215;0.000021;0.01376;0.99868;0.000005;0.001305;0.999955;0;0.000045;0.981592;0.000023;0;0.99711;0.99996;0 0.00002;0.99993;0.986215;0.000021;0;0.880797;0.301535;0.981592;0.000023;0.018385;0.99815;0.00001;0.00184;0.99994;0;0.00006;0.941308;0.000358;0;0.996185;0.999855;0 0.000021;0.999955;0.981592;0.000023;0;0.047426;0.634599;0.941308;0.000358;0.058334;0.99711;0.00001;0.00288;0.99996;0;0.00004;0.93448;0.009178;0;0.99366;0.999525;0 0.000023;0.99994;0.941308;0.000358;0;0.119203;0.677214;0.93448;0.009178;0.056342;0.996185;0.000015;0.0038;0.999855;0;0.000145;0.949114;0.009702;0;0.9835;0.996155;0.000005 0.000358;0.99996;0.93448;0.009178;0;0.5;0.928906;0.949114;0.009702;0.041186;0.99366;0.000055;0.00629;0.999525;0;0.000475;0.944706;0.011433;0.000005;0.974695;0.979895;0.000015 0.009178;0.999855;0.949114;0.009702;0;0.5;0.478264;0.944706;0.011433;0.043861;0.9835;0.00026;0.01624;0.996155;0.000005;0.00384;0.913931;0.000824;0.000015;0.92854;0.956505;0.00005 0.009702;0.999525;0.944706;0.011433;0.000005;0.119203;0.79249;0.913931;0.000824;0.085247;0.974695;0.0005;0.024805;0.979895;0.000015;0.020095;0.930718;0.001761;0.00005;0.713685;0.90563;0.000395 0.011433;0.996155;0.913931;0.000824;0.000015;0.731059;0.324098;0.930718;0.001761;0.067521;0.92854;0.001045;0.070415;0.956505;0.00005;0.043445;0.864235;0.001042;0.000395;0.162975;0.19894;0.00024 0;0;0.012563;0.031383;0.04252;0.047426;0;0.04613;0.028078;0.925792;0.070975;0.008555;0.92047;0.021285;0.0476;0.931115;0.081138;0.058992;0.1173;0.15175;0.11005;0.3226 0.031383;0.016605;0.04613;0.028078;0.0476;0.119203;0;0.081138;0.058992;0.859867;0.12939;0.000685;0.86992;0.032885;0.1173;0.849815;0.1309;0.16166;0.3226;0.16313;0.11136;0.16339 0.028078;0.021285;0.081138;0.058992;0.1173;0.268941;0.4955;0.1309;0.16166;0.70744;0.15175;0.00072;0.84753;0.11005;0.3226;0.56735;0.137245;0.082743;0.16339;0.1591;0.09556;0.17981 
0.058992;0.032885;0.1309;0.16166;0.3226;0.268941;0.338721;0.137245;0.082743;0.78001;0.16313;0.002095;0.834775;0.11136;0.16339;0.725245;0.12733;0.091878;0.17981;0.156315;0.026305;0.13399 0.16166;0.11005;0.137245;0.082743;0.16339;0.268941;0.274283;0.12733;0.091878;0.780795;0.1591;0.003945;0.836955;0.09556;0.17981;0.724635;0.09131;0.067623;0.13399;0.136865;0.016955;0.053115 0.082743;0.11136;0.12733;0.091878;0.17981;0.119203;0.827641;0.09131;0.067623;0.841065;0.156315;0.001255;0.842425;0.026305;0.13399;0.839705;0.07691;0.026867;0.053115;0.186475;0.01868;0.032075 0.375831;0.0047;0.072876;0.256222;0.3165;0.047426;0.449918;0.353882;0.107159;0.538961;0.528155;0.010745;0.46111;0.243185;0.17387;0.58294;0.433259;0.075603;0.04413;0.54029;0.33045;0.03016 0.256222;0.005755;0.353882;0.107159;0.17387;0.268941;0.530711;0.433259;0.075603;0.491138;0.550045;0.01088;0.43907;0.32176;0.04413;0.634115;0.522615;0.011902;0.03016;0.538125;0.309965;0.101805 0.107159;0.243185;0.433259;0.075603;0.04413;0.017986;0.335146;0.522615;0.011902;0.465481;0.54029;0.00242;0.457285;0.33045;0.03016;0.63939;0.308761;0.03825;0.101805;0.421285;0.10569;0.089135 0.075603;0.32176;0.522615;0.011902;0.03016;0.880797;0.399392;0.308761;0.03825;0.652989;0.538125;0.002875;0.459;0.309965;0.101805;0.58823;0.204036;0.040263;0.089135;0.42249;0.101195;0.090405 0.516911;0.016385;0.19982;0.352548;0.51205;0.5;0.295046;0.560497;0.07015;0.369355;0.372485;0.01183;0.615685;0.37679;0.19802;0.425195;0.541587;0.074902;0.18791;0.308155;0.4377;0.187975 0.352548;0.016055;0.560497;0.07015;0.19802;0.119203;0.633671;0.541587;0.074902;0.383513;0.28225;0.008255;0.709495;0.41645;0.18791;0.395645;0.553029;0.065657;0.187975;0.33866;0.458785;0.15128 0.07015;0.37679;0.541587;0.074902;0.18791;0.268941;0.354573;0.553029;0.065657;0.381315;0.308155;0.00836;0.683485;0.4377;0.187975;0.374325;0.462528;0.053313;0.15128;0.353415;0.31059;0.08192 0.026577;0.020385;0.008295;0.014807;0.022775;0.119203;0.744787;0.014138;0.011437;0.974422;0.010095;0.003725;0.986175;0.01818;0.01915;0.96267;0.014365;0.017495;0.030405;0.05698;0.014175;0.046875 0.014807;0.011135;0.014138;0.011437;0.01915;0.006693;0;0.014365;0.017495;0.968143;0.017475;0.004585;0.977945;0.011255;0.030405;0.95834;0.035577;0.035033;0.046875;0.06479;0.021495;0.08489 0.011437;0.01818;0.014365;0.017495;0.030405;0.119203;0.43807;0.035577;0.035033;0.92939;0.05698;0.02319;0.91983;0.014175;0.046875;0.93895;0.043143;0.233167;0.08489;0.06641;0.025315;0.44885 0.613505;0.01631;0.009938;0.376255;0.639675;0.268941;0;0.030718;0.06462;0.904662;0.03681;0.06119;0.902;0.024625;0.06805;0.907325;0.03539;0.0937;0.05172;0.094125;0.028745;0.052275 0.376255;0.0141;0.030718;0.06462;0.06805;0.119203;0.425557;0.03539;0.0937;0.87091;0.042995;0.13568;0.82133;0.027785;0.05172;0.92049;0.061435;0.116125;0.052275;0.090085;0.06051;0.0343 0.06462;0.024625;0.03539;0.0937;0.05172;0.268941;0;0.061435;0.116125;0.82244;0.094125;0.179975;0.725895;0.028745;0.052275;0.918985;0.075297;0.111107;0.0343;0.06458;0.080855;0.02412 0.0937;0.027785;0.061435;0.116125;0.052275;0.5;0.49725;0.075297;0.111107;0.813595;0.090085;0.187915;0.722;0.06051;0.0343;0.90519;0.072718;0.06182;0.02412;0.04134;0.08724;0.02173 0.00354;0.010515;0.04845;0.032935;0.00107;0.047426;0;0.070268;0.036757;0.892978;0.11816;0.064965;0.81688;0.022375;0.00855;0.969075;0.46148;0.072568;0.0191;0.673245;0.30018;0.01642 0.032935;0.01134;0.070268;0.036757;0.00855;0.982014;0.473026;0.46148;0.072568;0.46595;0.661285;0.126035;0.212675;0.261675;0.0191;0.719225;0.486712;0.080392;0.01642;0.688385;0.303385;0.01304 
0.036757;0.022375;0.46148;0.072568;0.0191;0.268941;0.322786;0.486712;0.080392;0.432895;0.673245;0.144365;0.18239;0.30018;0.01642;0.6834;0.495885;0.082237;0.01304;0.70222;0.315885;0.01877 0.072568;0.261675;0.486712;0.080392;0.01642;0.047426;0.669517;0.495885;0.082237;0.421878;0.688385;0.151435;0.16018;0.303385;0.01304;0.683575;0.509052;0.11391;0.01877;0.70478;0.262315;0.016575 0.080392;0.30018;0.495885;0.082237;0.01304;0.047426;0.236855;0.509052;0.11391;0.377035;0.70222;0.20905;0.08873;0.315885;0.01877;0.66534;0.483548;0.122138;0.016575;0.70299;0.24378;0.001265 0.082237;0.303385;0.509052;0.11391;0.01877;0.5;0;0.483548;0.122138;0.394317;0.70478;0.2277;0.067525;0.262315;0.016575;0.72111;0.473385;0.075258;0.001265;0.715495;0.228565;0.00325 0.11391;0.315885;0.483548;0.122138;0.016575;0.119203;0;0.473385;0.075258;0.45136;0.70299;0.14925;0.147765;0.24378;0.001265;0.754955;0.47203;0.05679;0.00325;0.66485;0.157805;0.016885 0.122138;0.262315;0.473385;0.075258;0.001265;0.047426;0.428494;0.47203;0.05679;0.471183;0.715495;0.11033;0.17418;0.228565;0.00325;0.768185;0.411328;0.04956;0.016885;0.591525;0.097895;0.01743 0.04956;0.157805;0.34471;0.041895;0.01743;0.5;0;0.339703;0.0525;0.6078;0.57445;0.069335;0.356215;0.104955;0.035665;0.859385;0.329855;0.070038;0.04478;0.397255;0.018535;0.03945 0.041895;0.097895;0.339703;0.0525;0.035665;0.268941;0;0.329855;0.070038;0.600105;0.5536;0.095295;0.3511;0.10611;0.04478;0.84911;0.207895;0.035182;0.03945;0.33793;0.0151;0.00957 0.0525;0.104955;0.329855;0.070038;0.04478;0.047426;0.283128;0.207895;0.035182;0.756917;0.397255;0.030915;0.571825;0.018535;0.03945;0.94201;0.176515;0.01056;0.00957;0.330375;0.0046;0.0074 0.070038;0.10611;0.207895;0.035182;0.03945;0.268941;0.614437;0.176515;0.01056;0.812928;0.33793;0.01155;0.65053;0.0151;0.00957;0.975325;0.167487;0.0043;0.0074;0.434915;0.0063;0.00106 0.01056;0.0151;0.167487;0.0043;0.0074;0.047426;0.65701;0.220608;0.001595;0.777798;0.434915;0.00213;0.562955;0.0063;0.00106;0.99264;0.313553;0.009798;0.00009;0.75055;0.077505;0.00011 0.0043;0.0046;0.220608;0.001595;0.00106;0.047426;0.673927;0.313553;0.009798;0.676652;0.571895;0.019505;0.408605;0.05521;0.00009;0.9447;0.414027;0.017125;0.00011;0.839935;0.37029;0.001435 0.001595;0.0063;0.313553;0.009798;0.00009;0.5;0.613014;0.414027;0.017125;0.568845;0.75055;0.03414;0.215305;0.077505;0.00011;0.922385;0.605112;0.03315;0.001435;0.82286;0.39205;0.001015 0.009798;0.05521;0.414027;0.017125;0.00011;0.5;0.864713;0.605112;0.03315;0.361738;0.839935;0.064865;0.0952;0.37029;0.001435;0.628275;0.607455;0.034493;0.001015;0.813385;0.41155;0.005405 0.017125;0.077505;0.605112;0.03315;0.001435;0.119203;0.365401;0.607455;0.034493;0.358055;0.82286;0.06797;0.109175;0.39205;0.001015;0.606935;0.612468;0.034733;0.005405;0.67088;0.36469;0.177345 0.03315;0.37029;0.607455;0.034493;0.001015;0.5;0;0.612468;0.034733;0.352797;0.813385;0.06406;0.122555;0.41155;0.005405;0.58304;0.517785;0.122218;0.177345;0.588985;0.30619;0.00855 0.034493;0.39205;0.612468;0.034733;0.005405;0.731059;0.335369;0.517785;0.122218;0.359995;0.67088;0.06709;0.262025;0.36469;0.177345;0.457965;0.447587;0.020215;0.00855;0.570465;0.29797;0.00858 0.03825;0.017545;0.014318;0.019893;0.01298;0.017986;0;0.0692;0.004787;0.92601;0.113395;0.00435;0.88225;0.025005;0.005225;0.96977;0.092275;0.003103;0.00169;0.170975;0.074675;0.004075 0.019893;0.00723;0.0692;0.004787;0.005225;0.047426;0.540163;0.092275;0.003103;0.904618;0.15964;0.004515;0.83584;0.02491;0.00169;0.973395;0.122825;0.003;0.004075;0.191775;0.0848;0.27785 
0.004787;0.025005;0.092275;0.003103;0.00169;0.017986;0;0.122825;0.003;0.874175;0.170975;0.001925;0.8271;0.074675;0.004075;0.92125;0.138288;0.143592;0.27785;0.23689;0.08324;0.37499 0.104458;0.17356;0.46859;0.003233;0.005525;0.268941;0.153683;0.694742;0.002003;0.303252;0.814055;0.000565;0.185375;0.57543;0.00344;0.42113;0.78621;0.003978;0.007265;0.99522;0.71965;0.00615 0.003233;0.183335;0.694742;0.002003;0.00344;0.047426;0.697833;0.78621;0.003978;0.209813;0.88789;0.00069;0.11142;0.68453;0.007265;0.308205;0.857435;0.00372;0.00615;0.995955;0.964195;0.00631 0.002003;0.57543;0.78621;0.003978;0.007265;0.952574;0;0.857435;0.00372;0.138843;0.99522;0.00129;0.003485;0.71965;0.00615;0.2742;0.980075;0.004015;0.00631;0.99626;0.97884;0.00576 0.003978;0.68453;0.857435;0.00372;0.00615;0.268941;0;0.980075;0.004015;0.015915;0.995955;0.00172;0.002325;0.964195;0.00631;0.029505;0.98755;0.003655;0.00576;0.996685;0.98117;0.005295 0.00372;0.71965;0.980075;0.004015;0.00631;0.017986;0;0.98755;0.003655;0.008797;0.99626;0.00155;0.00219;0.97884;0.00576;0.015405;0.988928;0.003363;0.005295;0.994825;0.99049;0.005015 0.004015;0.964195;0.98755;0.003655;0.00576;0.047426;0;0.988928;0.003363;0.007707;0.996685;0.00143;0.001885;0.98117;0.005295;0.01353;0.992657;0.003273;0.005015;0.99044;0.990525;0.00481 0.003655;0.97884;0.988928;0.003363;0.005295;0.731059;0;0.992657;0.003273;0.004072;0.994825;0.00153;0.00365;0.99049;0.005015;0.004495;0.990483;0.003162;0.00481;0.976775;0.992055;0.0045 0.003363;0.98117;0.992657;0.003273;0.005015;0.5;0;0.990483;0.003162;0.006355;0.99044;0.001515;0.008045;0.990525;0.00481;0.004665;0.984415;0.00307;0.0045;0.96587;0.98822;0.00452 0.001027;0.00815;0.04548;0.005155;0.008125;0.5;0;0.047235;0.002805;0.949962;0.08795;0.00286;0.909195;0.00652;0.00275;0.99073;0.025753;0.010873;0.00697;0.02939;0.00602;0.00928 0.005155;0.00748;0.047235;0.002805;0.00275;0.119203;0.518991;0.025753;0.010873;0.963375;0.046045;0.014775;0.93918;0.00546;0.00697;0.98757;0.017705;0.013395;0.00928;0.010945;0.003995;0.000995 0.002805;0.00652;0.025753;0.010873;0.00697;0.5;0;0.017705;0.013395;0.968897;0.02939;0.01751;0.953095;0.00602;0.00928;0.9847;0.00747;0.002553;0.000995;0.01015;0.0037;0.00114 0.011648;0.00326;0.259949;0.009214;0.001485;0.047426;0.422139;0.542494;0.008225;0.449279;0.93471;0.00001;0.06527;0.61042;0.00009;0.389495;0.651345;0.008246;0.000395;0.990185;0.96649;0.00072 0.009214;0.36762;0.542494;0.008225;0.00009;0.119203;0.890123;0.651345;0.008246;0.340411;0.97164;0.00023;0.028135;0.882705;0.000395;0.1169;0.941853;0.008034;0.00072;0.99696;0.978495;0.000385 0.008225;0.61042;0.651345;0.008246;0.000395;0.880797;0.483506;0.941853;0.008034;0.050111;0.990185;0.000435;0.009375;0.96649;0.00072;0.03279;0.979765;0.00628;0.000385;0.996245;0.97443;0.00022 0.008246;0.882705;0.941853;0.008034;0.00072;0.268941;0.425557;0.979765;0.00628;0.01396;0.99696;0.000375;0.00267;0.978495;0.000385;0.02113;0.978436;0.006093;0.00022;0.991095;0.968015;0.00022 0.008034;0.96649;0.979765;0.00628;0.000385;0.119203;0.856436;0.978436;0.006093;0.015471;0.996245;0.000375;0.003385;0.97443;0.00022;0.025345;0.974771;0.006033;0.00022;0.963555;0.95887;0.00023 0.00628;0.978495;0.978436;0.006093;0.00022;0.268941;0.317562;0.974771;0.006033;0.019199;0.991095;0.00048;0.00843;0.968015;0.00022;0.03177;0.962937;0.005887;0.00023;0.874775;0.920665;0.00004 0.006093;0.97443;0.974771;0.006033;0.00022;0.880797;0;0.962937;0.005887;0.031176;0.963555;0.000625;0.03582;0.95887;0.00023;0.0409;0.920609;0.005801;0.00004;0.82676;0.89538;0.00002 
0.006033;0.968015;0.962937;0.005887;0.00023;0.119203;0.17814;0.920609;0.005801;0.073589;0.874775;0.000555;0.12467;0.920665;0.00004;0.07929;0.895763;0.005974;0.00002;0.624945;0.79742;0.00003 0.005887;0.95887;0.920609;0.005801;0.00004;0.047426;0.166311;0.895763;0.005974;0.098264;0.82676;0.000475;0.172765;0.89538;0.00002;0.1046;0.796023;0.006204;0.00003;0.60107;0.73603;0.001305 0.005801;0.920665;0.895763;0.005974;0.00002;0.047426;0.359854;0.796023;0.006204;0.197773;0.624945;0.001435;0.37362;0.79742;0.00003;0.20255;0.767601;0.016983;0.001305;0.42314;0.58837;0.011735 0.03328;0.58837;0.590761;0.029599;0.005965;0.5;0;0.657496;0.041377;0.301124;0.395335;0.10213;0.50253;0.608135;0.00651;0.38535;0.55314;0.061807;0.005345;0.375245;0.430355;0.010735 0.029599;0.553525;0.657496;0.041377;0.00651;0.268941;0.695932;0.55314;0.061807;0.385053;0.387495;0.164585;0.44792;0.596125;0.005345;0.39853;0.4938;0.063865;0.010735;0.38687;0.430265;0.008865 0.041377;0.608135;0.55314;0.061807;0.005345;0.119203;0.743073;0.4938;0.063865;0.442335;0.375245;0.16537;0.459385;0.430355;0.010735;0.55891;0.497645;0.073357;0.008865;0.21599;0.118605;0.049355 0.061807;0.596125;0.4938;0.063865;0.010735;0.5;0;0.497645;0.073357;0.429001;0.38687;0.195715;0.41742;0.430265;0.008865;0.560875;0.187418;0.085925;0.049355;0.141505;0.06293;0.006785 0.063865;0.430355;0.497645;0.073357;0.008865;0.880797;0.166311;0.187418;0.085925;0.726658;0.21599;0.19293;0.59108;0.118605;0.049355;0.832045;0.102045;0.01035;0.006785;0.14212;0.238355;0.00159 0.008394;0.00667;0.263455;0.009003;0.000215;0.119203;0.172787;0.915401;0.011049;0.073551;0.956775;0.00012;0.043105;0.85256;0.00146;0.14598;0.929632;0.010315;0.000005;0.99021;0.991325;0.00006 0.009003;0.00714;0.915401;0.011049;0.00146;0.047426;0.533699;0.929632;0.010315;0.060053;0.982315;0.00003;0.01766;0.954495;0.000005;0.045495;0.94454;0.01034;0.00006;0.996785;0.990885;0.000045 0.011049;0.85256;0.929632;0.010315;0.000005;0.047426;0.769058;0.94454;0.01034;0.045118;0.99021;0.00005;0.009745;0.991325;0.00006;0.008605;0.975858;0.010046;0.000045;0.997925;0.99358;0.000045 0.010315;0.954495;0.94454;0.01034;0.00006;0.5;0;0.975858;0.010046;0.014098;0.996785;0.000045;0.003175;0.990885;0.000045;0.00907;0.975785;0.010735;0.000045;0.998065;0.993775;0.00014 0.01034;0.991325;0.975858;0.010046;0.000045;0.047426;0;0.975785;0.010735;0.01348;0.997925;0.000085;0.00199;0.99358;0.000045;0.006375;0.976515;0.010468;0.00014;0.99815;0.993945;0.000225 0.010046;0.990885;0.975785;0.010735;0.000045;0.119203;0.575909;0.976515;0.010468;0.013018;0.998065;0.000115;0.00182;0.993775;0.00014;0.006085;0.97543;0.011144;0.000225;0.99589;0.99136;0.00064 0.010735;0.99358;0.976515;0.010468;0.00014;0.119203;0.184975;0.97543;0.011144;0.013427;0.99815;0.000305;0.001555;0.993945;0.000225;0.005825;0.971504;0.012501;0.00064;0.9927;0.99131;0.00016 0.010468;0.993775;0.97543;0.011144;0.000225;0.268941;0;0.971504;0.012501;0.015995;0.99589;0.000495;0.003615;0.99136;0.00064;0.008;0.82295;0.020809;0.00016;0.987745;0.986835;0.00172 0.011144;0.993945;0.971504;0.012501;0.00064;0.119203;0.142095;0.82295;0.020809;0.156242;0.9927;0.00049;0.006815;0.99131;0.00016;0.00853;0.909587;0.041755;0.00172;0.92169;0.955895;0.004415 0.012501;0.99136;0.82295;0.020809;0.00016;0.119203;0.184072;0.909587;0.041755;0.048658;0.987745;0.000635;0.01162;0.986835;0.00172;0.011445;0.938792;0.00256;0.004415;0.73264;0.422545;0.007725 
0.020809;0.99131;0.909587;0.041755;0.00172;0.993307;0.279086;0.938792;0.00256;0.058647;0.92169;0.000705;0.077605;0.955895;0.004415;0.03969;0.577593;0.004443;0.007725;0;0;0 0.17708;0.09371;0.150078;0.097634;0.022995;0.002473;0.581759;0.1483;0.057911;0.793788;0.072085;0.093645;0.83427;0.23194;0.014575;0.753485;0.172221;0.078638;0.0174;0.052905;0.281195;0.03608 0.097634;0.226165;0.1483;0.057911;0.014575;0.017986;0.391979;0.172221;0.078638;0.749137;0.07542;0.14842;0.776155;0.31226;0.0174;0.670335;0.151958;0.135481;0.03608;0.0519;0.355;0.061285 0.057911;0.23194;0.172221;0.078638;0.0174;0.002473;0.312169;0.151958;0.135481;0.71256;0.052905;0.30274;0.644355;0.281195;0.03608;0.682725;0.181127;0.215524;0.061285;0.04679;0.35547;0.025615 0.195372;0.35547;0.19666;0.096011;0.05566;0.002473;0.415809;0.518003;0.065275;0.416722;0.25957;0.0885;0.65193;0.62944;0.034315;0.336245;0.49048;0.08103;0.03386;0.23328;0.56809;0.11162 0.096011;0.35042;0.518003;0.065275;0.034315;0.006693;0.302378;0.49048;0.08103;0.428492;0.253985;0.10395;0.64207;0.58629;0.03386;0.37985;0.454348;0.163195;0.11162;0.22249;0.57109;0.209715 0.065275;0.62944;0.49048;0.08103;0.03386;0.017986;0.147669;0.454348;0.163195;0.382457;0.23328;0.228075;0.538645;0.56809;0.11162;0.32029;0.267234;0.58237;0.209715;0.213535;0.639275;0.22055 0.022855;0.48835;0.289493;0.027182;0.003885;0.5;0.473774;0.776303;0.013859;0.209836;0.98874;0.000165;0.01109;0.7428;0.002985;0.254215;0.803057;0.009798;0.00086;0.995515;0.941915;0.001135 0.027182;0.52987;0.776303;0.013859;0.002985;0.119203;0.518741;0.803057;0.009798;0.187148;0.99412;0.00016;0.00572;0.846245;0.00086;0.152905;0.832054;0.00825;0.001135;0.995795;0.984245;0.00023 0.013859;0.7428;0.803057;0.009798;0.00086;0.047426;0.5015;0.832054;0.00825;0.159693;0.995515;0.000205;0.004275;0.941915;0.001135;0.056945;0.941852;0.010194;0.00023;0.99777;0.991585;0.00009 0.009798;0.846245;0.832054;0.00825;0.001135;0.047426;0.696355;0.941852;0.010194;0.047956;0.995795;0.00029;0.003915;0.984245;0.00023;0.01553;0.959365;0.004951;0.00009;0.99695;0.994855;0.00007 0.00825;0.941915;0.941852;0.010194;0.00023;0.5;0.667744;0.959365;0.004951;0.035683;0.99777;0.000365;0.00186;0.991585;0.00009;0.008325;0.95915;0.005164;0.00007;0.982845;0.997385;0.000065 0.010194;0.984245;0.959365;0.004951;0.00009;0.006693;0.213157;0.95915;0.005164;0.035684;0.99695;0.000375;0.00267;0.994855;0.00007;0.005075;0.960774;0.011928;0.000065;0.970085;0.998845;0.000025 0.004951;0.991585;0.95915;0.005164;0.00007;0.017986;0.241404;0.960774;0.011928;0.027299;0.982845;0.00056;0.0166;0.997385;0.000065;0.00255;0.955816;0.011862;0.000025;0.954725;0.997235;0.00002 0.005164;0.994855;0.960774;0.011928;0.000065;0.268941;0.246011;0.955816;0.011862;0.03232;0.970085;0.000515;0.0294;0.998845;0.000025;0.001125;0.954553;0.008754;0.00002;0.92295;0.99026;0.000025 0.011928;0.997385;0.955816;0.011862;0.000025;0.047426;0.401793;0.954553;0.008754;0.036693;0.954725;0.00044;0.04484;0.997235;0.00002;0.00274;0.941829;0.011506;0.000025;0.849865;0.992925;0.00008 0.011862;0.998845;0.954553;0.008754;0.00002;0.002473;0.450909;0.941829;0.011506;0.046666;0.92295;0.000875;0.076175;0.99026;0.000025;0.009715;0.91486;0.01039;0.00008;0.86574;0.99447;0.000135 0.008754;0.997235;0.941829;0.011506;0.000025;0.119203;0.462073;0.91486;0.01039;0.074749;0.849865;0.003085;0.14705;0.992925;0.00008;0.006995;0.905965;0.010567;0.000135;0.87188;0.99534;0.000295 
0.011506;0.99026;0.91486;0.01039;0.00008;0.047426;0.396995;0.905965;0.010567;0.083468;0.86574;0.01311;0.12115;0.99447;0.000135;0.005395;0.894234;0.021695;0.000295;0.81786;0.990035;0.000295 0.01039;0.992925;0.905965;0.010567;0.000135;0.993307;0.49625;0.894234;0.021695;0.084072;0.87188;0.035925;0.092195;0.99534;0.000295;0.004365;0.868185;0.01984;0.000295;0.585615;0.939105;0.000185 0.010567;0.99447;0.894234;0.021695;0.000295;0.047426;0.597726;0.868185;0.01984;0.111977;0.81786;0.031685;0.15046;0.990035;0.000295;0.00967;0.728245;0.011937;0.000185;0.504485;0.81877;0.00026 0.03836;0.327785;0.241021;0.085151;0.007975;0.880797;0.669959;0.565981;0.00072;0.433298;0.886705;0.000115;0.113175;0.693515;0.00184;0.304645;0.632878;0.018191;0.00223;0.982515;0.9415;0.00438 0.085151;0.518175;0.565981;0.00072;0.00184;0.268941;0.461079;0.632878;0.018191;0.348931;0.978735;0.000205;0.02106;0.74984;0.00223;0.24793;0.708821;0.026391;0.00438;0.99865;0.958705;0.005475 0.00072;0.693515;0.632878;0.018191;0.00223;0.5;0.632742;0.708821;0.026391;0.264788;0.982515;0.00022;0.017265;0.9415;0.00438;0.05412;0.730203;0.238877;0.005475;0.99287;0.951705;0.005335 0.018191;0.74984;0.708821;0.026391;0.00438;0.880797;0.15046;0.730203;0.238877;0.03092;0.99865;0.00024;0.001105;0.958705;0.005475;0.035825;0.742679;0.109392;0.005335;0.96456;0.905375;0.00483 0.026391;0.9415;0.730203;0.238877;0.005475;0.731059;0.598447;0.742679;0.109392;0.147928;0.99287;0.00025;0.00688;0.951705;0.005335;0.042955;0.749228;0.002024;0.00483;0.936635;0.864805;0.00445 0.238877;0.958705;0.742679;0.109392;0.005335;0.5;0.681354;0.749228;0.002024;0.248748;0.96456;0.000235;0.035205;0.905375;0.00483;0.089795;0.700694;0.001945;0.00445;0.899215;0.826625;0.00701 0.109392;0.951705;0.749228;0.002024;0.00483;0.119203;0.689546;0.700694;0.001945;0.29736;0.936635;0.000255;0.063105;0.864805;0.00445;0.130745;0.673652;0.003296;0.00701;0.457495;0.778875;0.0016 0.002024;0.905375;0.700694;0.001945;0.00445;0.5;0.336038;0.673652;0.003296;0.323051;0.899215;0.001275;0.09951;0.826625;0.00701;0.166365;0.412924;0.001634;0.0016;0.258015;0.735555;0.001175 0.001945;0.864805;0.673652;0.003296;0.00701;0.268941;0.113046;0.412924;0.001634;0.585443;0.457495;0.0009;0.54161;0.778875;0.0016;0.219525;0.391856;0.002479;0.001175;0.103055;0.68894;0.00374 0.020389;0.293805;0.188952;0.01907;0.023475;0.268941;0.756392;0.169143;0.029758;0.8011;0.059395;0.012925;0.92768;0.27889;0.04659;0.67452;0.174125;0.029215;0.04579;0.05402;0.27581;0.051765 0.01907;0.317505;0.169143;0.029758;0.04659;0.268941;0.50375;0.174125;0.029215;0.79666;0.059445;0.01264;0.92791;0.288805;0.04579;0.66541;0.164915;0.031023;0.051765;0.043755;0.12112;0.030935 0.029758;0.27889;0.174125;0.029215;0.04579;0.268941;0.582246;0.164915;0.031023;0.804067;0.05402;0.01028;0.935705;0.27581;0.051765;0.67243;0.082438;0.019455;0.030935;0.03989;0.05164;0.017685 0.028612;0.121105;0.31783;0.022798;0.04005;0.731059;0.591942;0.317827;0.032988;0.649188;0.581325;0.01146;0.40722;0.05433;0.054515;0.891155;0.353158;0.022745;0.02496;0.65008;0.11955;0.05328 0.022798;0.105835;0.317827;0.032988;0.054515;0.017986;0.85483;0.353158;0.022745;0.624098;0.60596;0.02053;0.37351;0.100355;0.02496;0.874685;0.384815;0.04347;0.05328;0.66562;0.124735;0.081675 0.032988;0.05433;0.353158;0.022745;0.02496;0.017986;0;0.384815;0.04347;0.571715;0.65008;0.03366;0.31626;0.11955;0.05328;0.82717;0.395178;0.066348;0.081675;0.623705;0.268185;0.306255 
0.022745;0.100355;0.384815;0.04347;0.05328;0.119203;0.822299;0.395178;0.066348;0.538473;0.66562;0.05102;0.283355;0.124735;0.081675;0.79359;0.445945;0.219855;0.306255;0.615645;0.428125;0.38477 0.04347;0.11955;0.395178;0.066348;0.081675;0.5;0.326513;0.445945;0.219855;0.3342;0.623705;0.133455;0.24284;0.268185;0.306255;0.42556;0.521885;0.268503;0.38477;0.572595;0.42866;0.393385 0.066348;0.124735;0.445945;0.219855;0.306255;0.017986;0.794293;0.521885;0.268503;0.209613;0.615645;0.152235;0.23212;0.428125;0.38477;0.187105;0.500628;0.302847;0.393385;0.54993;0.402245;0.476185 0.219855;0.268185;0.521885;0.268503;0.38477;0.731059;0.457106;0.500628;0.302847;0.196525;0.572595;0.21231;0.215095;0.42866;0.393385;0.177955;0.476087;0.35931;0.476185;0.537075;0.34198;0.43785 0.268503;0.428125;0.500628;0.302847;0.393385;0.017986;0.7062;0.476087;0.35931;0.164605;0.54993;0.242435;0.20764;0.402245;0.476185;0.12157;0.439527;0.374792;0.43785;0.502805;0.29895;0.442065 0.302847;0.42866;0.476087;0.35931;0.476185;0.047426;0.7833;0.439527;0.374792;0.185683;0.537075;0.311735;0.15119;0.34198;0.43785;0.220175;0.400877;0.382895;0.442065;0.497505;0.28882;0.478415 0.35931;0.402245;0.439527;0.374792;0.43785;0.002473;0.595801;0.400877;0.382895;0.216228;0.502805;0.323725;0.17347;0.29895;0.442065;0.258985;0.393163;0.400915;0.478415;0.44422;0.18275;0.4564 0.500741;0.02485;0.037466;0.278571;0.329055;0.006693;0.5055;0.37334;0.171343;0.455315;0.085345;0.37033;0.544315;0.311375;0.06161;0.62702;0.503327;0.063648;0.01798;0.18141;0.675985;0.02312 0.278571;0.034715;0.37334;0.171343;0.06161;0.119203;0.654753;0.503327;0.063648;0.433027;0.12577;0.1729;0.70133;0.59572;0.01798;0.386305;0.619089;0.052034;0.02312;0.28779;0.83779;0.076975 0.171343;0.311375;0.503327;0.063648;0.01798;0.006693;0.131244;0.619089;0.052034;0.328873;0.18141;0.13292;0.685665;0.675985;0.02312;0.30089;0.708482;0.081531;0.076975;0.30064;0.848975;0.082075 0.063648;0.59572;0.619089;0.052034;0.02312;0.119203;0.164242;0.708482;0.081531;0.209987;0.28779;0.16755;0.544665;0.83779;0.076975;0.08523;0.703303;0.159909;0.082075;0.29958;0.862085;0.081775 0.052034;0.675985;0.708482;0.081531;0.076975;0.017986;0.624806;0.703303;0.159909;0.136788;0.30064;0.39758;0.301775;0.848975;0.082075;0.068955;0.697688;0.207407;0.081775;0.25714;0.86025;0.079415 0.081531;0.83779;0.703303;0.159909;0.082075;0.952574;0.434381;0.697688;0.207407;0.094906;0.29958;0.540365;0.16005;0.862085;0.081775;0.056145;0.662207;0.208228;0.079415;0.06188;0.776365;0.05837 0.159909;0.848975;0.697688;0.207407;0.081775;0.880797;0.760605;0.662207;0.208228;0.129566;0.25714;0.54519;0.197675;0.86025;0.079415;0.060335;0.512643;0.146896;0.05837;0.017115;0.552755;0.041415 0.207407;0.862085;0.662207;0.208228;0.079415;0.047426;0.449918;0.512643;0.146896;0.340462;0.06188;0.38223;0.555895;0.776365;0.05837;0.165265;0.255339;0.130962;0.041415;0.017345;0.56452;0.036745 0.036203;0.07372;0.043445;0.163358;0.02432;0.119203;0.1483;0.330288;0.171219;0.498491;0.04471;0.445235;0.510055;0.040935;0.01426;0.9448;0.337796;0.235398;0.17788;0.0463;0.05337;0.167195 0.163358;0.079605;0.330288;0.171219;0.01426;0.5;0.710538;0.337796;0.235398;0.426806;0.04644;0.474455;0.479105;0.05673;0.17788;0.76539;0.34783;0.229794;0.167195;0.034015;0.057275;0.192105 0.171219;0.040935;0.337796;0.235398;0.17788;0.047426;0.205217;0.34783;0.229794;0.422378;0.0463;0.466205;0.4875;0.05337;0.167195;0.779435;0.260772;0.204249;0.192105;0.01028;0.022295;0.159655 
[Raw data block omitted: many rows of semicolon-separated decimal values (roughly 22 per row), apparently a dataset sample included inline with a message; no column headers or description were present.]
0.000516;0.908855;0.913277;0.000491;0.000055;0.047426;0.859241;0.947004;0.000657;0.052341;0.872965;0.00088;0.12616;0.97014;0.000045;0.029815;0.990912;0.001126;0.000015;0.97461;0.99926;0.000005 0.000491;0.9431;0.947004;0.000657;0.000045;0.268941;0.843565;0.990912;0.001126;0.007964;0.976505;0.002315;0.02118;0.998325;0.000015;0.001665;0.990592;0.001056;0.000005;0.966895;0.999225;0 0.000657;0.97014;0.990912;0.001126;0.000015;0.017986;0.640377;0.990592;0.001056;0.008352;0.97461;0.002115;0.02328;0.99926;0.000005;0.00073;0.988009;0.000997;0;0.933145;0.999175;0.000005 0.001126;0.998325;0.990592;0.001056;0.000005;0.119203;0.706615;0.988009;0.000997;0.010996;0.966895;0.001945;0.031165;0.999225;0;0.000775;0.976742;0.000994;0.000005;0.928095;0.999275;0 0.001056;0.99926;0.988009;0.000997;0;0.5;0.267567;0.976742;0.000994;0.022266;0.933145;0.00193;0.06493;0.999175;0.000005;0.00082;0.975092;0.000954;0;0.93032;0.999575;0 0.000997;0.999225;0.976742;0.000994;0.000005;0.000911;0;0.975092;0.000954;0.023954;0.928095;0.001815;0.07009;0.999275;0;0.000725;0.975934;0.000601;0;0.95702;0.998955;0 0.000994;0.999175;0.975092;0.000954;0;0.006693;0.582003;0.975934;0.000601;0.023466;0.93032;0.000755;0.068925;0.999575;0;0.000425;0.984627;0.000592;0;0.948755;0.99477;0.00003 0.000954;0.999275;0.975934;0.000601;0;0.006693;0.195919;0.984627;0.000592;0.014779;0.95702;0.00073;0.04225;0.998955;0;0.00104;0.980477;0.000939;0.00003;0.92894;0.979865;0.00001 0.000601;0.999575;0.984627;0.000592;0;0.999089;0.770122;0.980477;0.000939;0.018586;0.948755;0.00174;0.049505;0.99477;0.00003;0.005205;0.968904;0.000576;0.00001;0.572635;0.86469;0.000005 0.000592;0.998955;0.980477;0.000939;0.00003;0.017986;0.717278;0.968904;0.000576;0.030521;0.92894;0.00067;0.07039;0.979865;0.00001;0.020125;0.811744;0.000461;0.000005;0.48962;0.755425;0.00026 0.000939;0.99477;0.968904;0.000576;0.00001;0.006693;0.128308;0.811744;0.000461;0.187792;0.572635;0.00033;0.427035;0.86469;0.000005;0.135295;0.747651;0.000499;0.00026;0.44755;0.66023;0.00067 0.000576;0.979865;0.811744;0.000461;0.000005;0.006693;0.665967;0.747651;0.000499;0.251849;0.48962;0.00019;0.510185;0.755425;0.00026;0.244315;0.701896;0.000792;0.00067;0.489705;0.647525;0.000215 0.394946;0.05515;0.017304;0.359422;0.071675;0.268941;0.517243;0.071247;0.015992;0.912764;0.126005;0.00325;0.87075;0.08669;0.04368;0.869635;0.080344;0.020387;0.053595;0.15408;0.078355;0.110005 0.359422;0.046015;0.071247;0.015992;0.04368;0.047426;0.452642;0.080344;0.020387;0.899269;0.154265;0.00639;0.839345;0.08559;0.053595;0.860815;0.077926;0.04283;0.110005;0.14104;0.037705;0.32469 0.015992;0.08669;0.080344;0.020387;0.053595;0.119203;0.391979;0.077926;0.04283;0.879245;0.15408;0.01714;0.82878;0.078355;0.110005;0.811645;0.089373;0.183947;0.32469;0.196935;0.02657;0.613585 0.731227;0.103465;0.016222;0.633662;0.449795;0.047426;0.4955;0.008109;0.004287;0.987604;0.001465;0.003575;0.99496;0.021815;0.00824;0.969945;0.006544;0.005036;0.012075;0.001395;0.01142;0.050475 0.633662;0.04515;0.008109;0.004287;0.00824;0.5;0.553544;0.006544;0.005036;0.988419;0.001375;0.001985;0.996635;0.01721;0.012075;0.970715;0.004621;0.030017;0.050475;0.00204;0.000435;0.117535 0.004287;0.021815;0.006544;0.005036;0.012075;0.5;0.060825;0.004621;0.030017;0.965362;0.001395;0.03853;0.960075;0.01142;0.050475;0.938105;0.001174;0.068579;0.117535;0.031135;0.000405;0.84667 0.00083;0.07071;0.58924;0.001614;0.000675;0.119203;0.094149;0.970646;0.002144;0.027211;0.971725;0.004925;0.02335;0.942305;0.00046;0.057235;0.979617;0.002134;0.00007;0.98288;0.99449;0.000015 
0.001614;0.15774;0.970646;0.002144;0.00046;0.006693;0.511748;0.979617;0.002134;0.018251;0.98328;0.005285;0.01144;0.957665;0.00007;0.042265;0.991759;0.002197;0.000015;0.973905;0.99851;0.00002 0.002144;0.942305;0.979617;0.002134;0.00007;0.017986;0.533201;0.991759;0.002197;0.006044;0.98288;0.00553;0.01159;0.99449;0.000015;0.005495;0.990107;0.001976;0.00002;0.94422;0.982305;0.000015 0.002134;0.957665;0.991759;0.002197;0.000015;0.997527;0.471531;0.990107;0.001976;0.007921;0.973905;0.00486;0.02124;0.99851;0.00002;0.001475;0.974811;0.001339;0.000015;0.901405;0.963705;0.00001 0.002197;0.99449;0.990107;0.001976;0.00002;0.017986;0.376601;0.974811;0.001339;0.023851;0.94422;0.002955;0.052825;0.982305;0.000015;0.01768;0.954339;0.000376;0.00001;0.8092;0.921455;0.000035 0.001976;0.99851;0.974811;0.001339;0.000015;0.047426;0.568565;0.954339;0.000376;0.045287;0.901405;0.00007;0.09853;0.963705;0.00001;0.036285;0.909521;0.000384;0.000035;0.536595;0.69848;0.000105 0.001339;0.982305;0.954339;0.000376;0.00001;0.268941;0.508749;0.909521;0.000384;0.090097;0.8092;0.00007;0.19073;0.921455;0.000035;0.078515;0.744327;0.000507;0.000105;0.273175;0.49226;0.000035 0.000507;0.69848;0.255494;0.000426;0.000035;0.119203;0.662845;0.520587;0.000404;0.47901;0.753145;0.00002;0.246835;0.80744;0.000015;0.19255;0.81141;0.000123;0.000005;0.836405;0.86008;0.000025 0.000426;0.49226;0.520587;0.000404;0.000015;0.006693;0.645656;0.81141;0.000123;0.188463;0.79577;0.00024;0.20399;0.82705;0.000005;0.172935;0.565844;0.001336;0.000025;0.969475;0.997845;0.000055 0.000404;0.80744;0.81141;0.000123;0.000005;0.5;0.348645;0.565844;0.001336;0.432819;0.836405;0.002935;0.160655;0.86008;0.000025;0.139895;0.656122;0.004552;0.000055;0.96856;0.99939;0.000035 0.000123;0.82705;0.565844;0.001336;0.000025;0.880797;0;0.656122;0.004552;0.339324;0.969475;0.012555;0.01796;0.997845;0.000055;0.002105;0.656332;0.003846;0.000035;0.96538;0.9993;0.000025 0.001336;0.86008;0.656122;0.004552;0.000055;0.119203;0.492751;0.656332;0.003846;0.339819;0.96856;0.010455;0.02098;0.99939;0.000035;0.00057;0.655242;0.003469;0.000025;0.965975;0.999015;0.00003 0.004552;0.997845;0.656332;0.003846;0.000035;0.002473;0.578105;0.655242;0.003469;0.341287;0.96538;0.009335;0.02528;0.9993;0.000025;0.000675;0.655346;0.003316;0.00003;0.963365;0.984315;0.000035 0.003846;0.99939;0.655242;0.003469;0.000025;0.006693;0.309598;0.655346;0.003316;0.341339;0.965975;0.00887;0.025155;0.999015;0.00003;0.000955;0.649576;0.001339;0.000035;0.94756;0.94064;0.00002 0.003469;0.9993;0.655346;0.003316;0.00003;0.119203;0.239577;0.649576;0.001339;0.349086;0.963365;0.002935;0.0337;0.984315;0.000035;0.01565;0.629749;0.000877;0.00002;0.86325;0.83018;0.000255 0.003316;0.999015;0.649576;0.001339;0.000035;0.017986;0.398912;0.629749;0.000877;0.369374;0.94756;0.001565;0.050875;0.94064;0.00002;0.05934;0.564826;0.000901;0.000255;0.583445;0.5316;0.016715 0.354647;0.096795;0.920014;0.001482;0.002345;0.047426;0.285366;0.971172;0.000611;0.028216;0.936665;0.000595;0.06274;0.978945;0.00019;0.02086;0.977431;0.000456;0.000025;0.944865;0.99471;0.00001 0.001482;0.84609;0.971172;0.000611;0.00019;0.047426;0.352744;0.977431;0.000456;0.022112;0.94663;0.000295;0.05307;0.987755;0.000025;0.01222;0.979161;0.000464;0.00001;0.90633;0.998805;0.000005 0.000611;0.978945;0.977431;0.000456;0.000025;0.952574;0.737884;0.979161;0.000464;0.020376;0.944865;0.000335;0.0548;0.99471;0.00001;0.00528;0.967681;0.000434;0.000005;0.8714;0.996115;0.000005 
0.000456;0.987755;0.979161;0.000464;0.00001;0.119203;0.256737;0.967681;0.000434;0.031886;0.90633;0.00025;0.09342;0.998805;0.000005;0.00119;0.955141;0.000367;0.000005;0.8663;0.9919;0 0.000464;0.99471;0.967681;0.000434;0.000005;0.047426;0.466301;0.955141;0.000367;0.044494;0.8714;0.00005;0.12855;0.996115;0.000005;0.003885;0.952036;0.000427;0;0.753095;0.70699;0.000005 0.000434;0.998805;0.955141;0.000367;0.000005;0.017986;0.700987;0.952036;0.000427;0.047534;0.8663;0.000235;0.13346;0.9919;0;0.008095;0.487044;0.000447;0.000005;0.309805;0.01765;0.00004 0.002531;0.066015;0.167337;0.003514;0.001915;0.006693;0.570527;0.259626;0.009479;0.730896;0.35822;0.014025;0.627755;0.41961;0.013365;0.567025;0.271761;0.019424;0.040445;0.250835;0.443425;0.08734 0.003514;0.15915;0.259626;0.009479;0.013365;0.993307;0.740775;0.271761;0.019424;0.708816;0.359815;0.01678;0.623405;0.45442;0.040445;0.505135;0.231769;0.035709;0.08734;0.04983;0.12242;0.034425 0.009479;0.41961;0.271761;0.019424;0.040445;0.017986;0.07433;0.231769;0.035709;0.732524;0.250835;0.01874;0.730425;0.443425;0.08734;0.46924;0.057766;0.015526;0.034425;0.027835;0.042395;0.012585 0.846291;0.084875;0.060585;0.819976;0.845115;0.017986;0.361698;0.067862;0.371845;0.560294;0.07561;0.27304;0.651345;0.09291;0.41818;0.488915;0.087926;0.249604;0.18578;0.0723;0.092335;0.186765 0.819976;0.091255;0.067862;0.371845;0.41818;0.047426;0.632045;0.087926;0.249604;0.66247;0.073885;0.21214;0.713975;0.094765;0.18578;0.719455;0.136736;0.103644;0.186765;0.051345;0.078905;0.210595 0.371845;0.09291;0.087926;0.249604;0.18578;0.5;0.326293;0.136736;0.103644;0.75962;0.0723;0.076545;0.851155;0.092335;0.186765;0.7209;0.090418;0.120135;0.210595;0.05694;0.084695;0.64752 0.02653;0.00065;0.021099;0.003349;0.00296;0.047426;0.587102;0.540865;0.000251;0.458882;0.861405;0.000045;0.13855;0.722105;0.00001;0.27788;0.567283;0.00371;0.00003;0.938535;0.801155;0.000035 0.003349;0.01393;0.540865;0.000251;0.00001;0.119203;0.787011;0.567283;0.00371;0.429005;0.9039;0.000015;0.09608;0.758225;0.00003;0.241745;0.598061;0.002727;0.000035;0.994755;0.999755;0.000005 0.000251;0.722105;0.567283;0.00371;0.00003;0.047426;0.675463;0.598061;0.002727;0.399213;0.938535;0.000015;0.06145;0.801155;0.000035;0.198815;0.99477;0.000459;0.000005;0.99803;0.9999;0 0.00371;0.758225;0.598061;0.002727;0.000035;0.047426;0;0.99477;0.000459;0.004772;0.994755;0.00002;0.005225;0.999755;0.000005;0.000245;0.998302;0.000509;0;0.999155;0.99991;0 0.002727;0.801155;0.99477;0.000459;0.000005;0.017986;0.251807;0.998302;0.000509;0.001187;0.99803;0.000015;0.001955;0.9999;0;0.000095;0.99857;0.000566;0;0.999675;0.999895;0 0.000459;0.999755;0.998302;0.000509;0;0.119203;0.825347;0.99857;0.000566;0.000864;0.999155;0.00002;0.000825;0.99991;0;0.00009;0.998302;0.000784;0;0.9997;0.99958;0 0.000509;0.9999;0.99857;0.000566;0;0.006693;0.116603;0.998302;0.000784;0.000916;0.999675;0.00002;0.00031;0.999895;0;0.000105;0.993794;0.002995;0;0.99909;0.98541;0 0.000566;0.99991;0.998302;0.000784;0;0.997527;0.513247;0.993794;0.002995;0.003208;0.9997;0.000035;0.00026;0.99958;0;0.000415;0.986671;0.00409;0;0.998285;0.973455;0.000005 0.000784;0.999895;0.993794;0.002995;0;0.119203;0.775216;0.986671;0.00409;0.009238;0.99909;0.000025;0.00088;0.98541;0;0.01459;0.661328;0.004091;0.000005;0.967545;0.92464;0.001375 0.002995;0.99958;0.986671;0.00409;0;0.047426;0.624338;0.661328;0.004091;0.334581;0.998285;0.000025;0.00169;0.973455;0.000005;0.02654;0.63481;0.004546;0.001375;0.896735;0.851765;0.00136 
-------------- next part --------------
A non-text attachment was scrubbed...
Name: assignments.out
Type: application/octet-stream
Size: 1999 bytes
Desc: not available
URL: 

From se.raschka at gmail.com Fri Oct 20 13:08:40 2017
From: se.raschka at gmail.com (Sebastian Raschka)
Date: Fri, 20 Oct 2017 13:08:40 -0400
Subject: [scikit-learn] How to get centroids from SciPy's hierarchical agglomerative clustering?
In-Reply-To: 
References: 
Message-ID: <7890BE40-0A0B-49CD-9719-957CA105F93A@gmail.com>

Independent of the implementation, and unless you use the 'centroid' or
'average linkage' method, cluster centroids don't need to be computed when
performing agglomerative hierarchical clustering. But you can always compute
them manually by simply averaging all samples from a cluster (for each feature).

Best.
Sebastian

> On Oct 20, 2017, at 9:13 AM, Sema Atasever wrote:
>
> Dear scikit-learn members,
>
> I am using SciPy's hierarchical agglomerative clustering methods to cluster a
> 1000 x 22 matrix of features. After clustering my data set with
> scipy.cluster.hierarchy.linkage and assigning each sample to a cluster,
> I can't seem to figure out how to get the centroid from the resulting clusters.
> I would like to extract one element or a few out of each cluster, whichever
> are closest to that cluster's centroid.
>
> Below follows my code:
>
> D = np.loadtxt(open("C:\dataset.txt", "rb"), delimiter=";")
> Y = hierarchy.linkage(D, 'ward')
> assignments = hierarchy.fcluster(Y, 5, criterion="maxclust")
>
> I am taking my matrix of features, computing the Euclidean distance between
> them, and then passing them on to the hierarchical clustering method. From
> there, I am creating flat clusters, with a maximum of 5 clusters.
>
> Now, based on the flat cluster assignments, how do I get the 1 x 22
> centroid that represents each flat cluster?
>
> Best.
> _______________________________________________
> scikit-learn mailing list
> scikit-learn at python.org
> https://mail.python.org/mailman/listinfo/scikit-learn
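For reference, the manual averaging described above can be written in a few
lines. This is only a sketch: it reuses the variable names from the quoted
snippet, and the file name "dataset.txt" is a placeholder for the actual data
file.

    import numpy as np
    from scipy.cluster import hierarchy
    from scipy.spatial.distance import cdist

    # placeholder path; the file holds the 1000 x 22 feature matrix
    D = np.loadtxt("dataset.txt", delimiter=";")
    Y = hierarchy.linkage(D, 'ward')
    assignments = hierarchy.fcluster(Y, 5, criterion="maxclust")

    # one 1 x 22 centroid per flat cluster: the feature-wise mean of its members
    labels = np.unique(assignments)
    centroids = np.vstack([D[assignments == k].mean(axis=0) for k in labels])

    # for each cluster, the index (into D) of the member closest to its centroid
    reps = []
    for k, c in zip(labels, centroids):
        members = np.where(assignments == k)[0]
        reps.append(members[cdist(c[None, :], D[members]).argmin()])

Restricting the distance search to each cluster's own members guarantees that
the chosen representative actually belongs to that cluster.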
From t3kcit at gmail.com Fri Oct 20 15:07:04 2017
From: t3kcit at gmail.com (Andreas Mueller)
Date: Fri, 20 Oct 2017 15:07:04 -0400
Subject: [scikit-learn] How to get centroids from SciPy's hierarchical agglomerative clustering?
In-Reply-To: <7890BE40-0A0B-49CD-9719-957CA105F93A@gmail.com>
References: <7890BE40-0A0B-49CD-9719-957CA105F93A@gmail.com>
Message-ID: 

The centroids don't "represent" the clusters, though, and you can construct
arbitrarily complex clusterings where all the centroids are identical.

On 10/20/2017 01:08 PM, Sebastian Raschka wrote:
> Independent of the implementation, and unless you use the 'centroid' or
> 'average linkage' method, cluster centroids don't need to be computed when
> performing agglomerative hierarchical clustering. But you can always compute
> them manually by simply averaging all samples from a cluster (for each feature).
>
> Best.
> Sebastian
>
>> On Oct 20, 2017, at 9:13 AM, Sema Atasever wrote:
>>
>> Dear scikit-learn members,
>>
>> I am using SciPy's hierarchical agglomerative clustering methods to cluster a
>> 1000 x 22 matrix of features. After clustering my data set with
>> scipy.cluster.hierarchy.linkage and assigning each sample to a cluster,
>> I can't seem to figure out how to get the centroid from the resulting clusters.
>> I would like to extract one element or a few out of each cluster, whichever
>> are closest to that cluster's centroid.
>>
>> Below follows my code:
>>
>> D = np.loadtxt(open("C:\dataset.txt", "rb"), delimiter=";")
>> Y = hierarchy.linkage(D, 'ward')
>> assignments = hierarchy.fcluster(Y, 5, criterion="maxclust")
>>
>> I am taking my matrix of features, computing the Euclidean distance between
>> them, and then passing them on to the hierarchical clustering method. From
>> there, I am creating flat clusters, with a maximum of 5 clusters.
>>
>> Now, based on the flat cluster assignments, how do I get the 1 x 22
>> centroid that represents each flat cluster?
>>
>> Best.
>> _______________________________________________
>> scikit-learn mailing list
>> scikit-learn at python.org
>> https://mail.python.org/mailman/listinfo/scikit-learn
> _______________________________________________
> scikit-learn mailing list
> scikit-learn at python.org
> https://mail.python.org/mailman/listinfo/scikit-learn

From hristo.a.georgiev at gmail.com Sun Oct 22 08:12:20 2017
From: hristo.a.georgiev at gmail.com (Hristo Georgiev)
Date: Sun, 22 Oct 2017 13:12:20 +0100
Subject: [scikit-learn] question for using GridSearchCV on LocalOutlierFactor
In-Reply-To: 
References: <779b1d39-5767-d2df-1d7c-64e4e7b1a2f4@udel.edu>
Message-ID: 

Hi,

As other members have indicated, methods such as ``LocalOutlierFactor`` do
not expose a ``predict`` method by design. However, if you would nevertheless
like to keep experimenting with making predictions on "unseen" data, you
could simply create a sub-class with a ``predict()`` wrapper, as in:

https://gist.github.com/hristog/b6151d21aa38a6c80d80d160b7771ce9

Hristo

> On 10/06/2017 12:53 AM, Lifan Xu wrote:
>
>> Hi,
>>
>> I was trying to train a model for anomaly detection. I only have the
>> normal data, which is all labeled as 1. Here is my code:
>>
>> clf = sklearn.model_selection.GridSearchCV(sklearn.neighbors.LocalOutlierFactor(),
>>                                            parameters,
>>                                            scoring="accuracy",
>>                                            cv=kfold,
>>                                            n_jobs=10)
>> clf.fit(vectors, labels)
>>
>> But it complains "AttributeError: 'LocalOutlierFactor' object has no
>> attribute 'predict'".
>>
>> It looks like LocalOutlierFactor only has fit_predict(), but no
>> predict().
>>
>> My question is: will predict() be implemented?
>>
>> Thanks!
>>
>> _______________________________________________
>> scikit-learn mailing list
>> scikit-learn at python.org
>> https://mail.python.org/mailman/listinfo/scikit-learn
>>
>
> _______________________________________________
> scikit-learn mailing list
> scikit-learn at python.org
> https://mail.python.org/mailman/listinfo/scikit-learn
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 
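The wrapper suggested above can be condensed to a few lines. Note that this
is only a sketch: it relies on ``_predict()``, a private method of
LocalOutlierFactor in scikit-learn 0.19 (an assumption to verify against the
installed version), not a supported public API.

    from sklearn.neighbors import LocalOutlierFactor

    class LOFWithPredict(LocalOutlierFactor):
        # Expose predict() so that GridSearchCV can score held-out folds.
        # Relies on the private _predict() of scikit-learn 0.19's
        # LocalOutlierFactor; check your installed version.
        def predict(self, X=None):
            return self._predict(X)

    # hypothetical usage, mirroring the original question:
    # clf = GridSearchCV(LOFWithPredict(), {"n_neighbors": [10, 20, 35]},
    #                    scoring="accuracy", cv=5)
    # clf.fit(vectors, labels)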
From s.atasever at gmail.com Mon Oct 23 08:33:49 2017
From: s.atasever at gmail.com (Sema Atasever)
Date: Mon, 23 Oct 2017 15:33:49 +0300
Subject: [scikit-learn] How to get centroids from SciPy's hierarchical agglomerative clustering?
In-Reply-To: 
References: <7890BE40-0A0B-49CD-9719-957CA105F93A@gmail.com>
Message-ID: 

Thank you very much for your answer.

On Fri, Oct 20, 2017 at 10:07 PM, Andreas Mueller wrote:

> The centroids don't "represent" the clusters, though, and you can
> construct arbitrarily complex
> clusterings where all the centroids are identical.
>
>
> On 10/20/2017 01:08 PM, Sebastian Raschka wrote:
>> Independent of the implementation, and unless you use the 'centroid' or
>> 'average linkage' method, cluster centroids don't need to be computed when
>> performing agglomerative hierarchical clustering. But you can always
>> compute them manually by simply averaging all samples from a cluster (for
>> each feature).
>>
>> Best.
>> Sebastian
>>
>>> On Oct 20, 2017, at 9:13 AM, Sema Atasever wrote:
>>>
>>> Dear scikit-learn members,
>>>
>>> I am using SciPy's hierarchical agglomerative clustering methods to
>>> cluster a
>>> 1000 x 22 matrix of features. After clustering my data set with
>>> scipy.cluster.hierarchy.linkage and assigning each sample to a
>>> cluster,
>>> I can't seem to figure out how to get the centroid from the resulting
>>> clusters.
>>> I would like to extract one element or a few out of each cluster,
>>> whichever are closest to that cluster's centroid.
>>>
>>> Below follows my code:
>>>
>>> D = np.loadtxt(open("C:\dataset.txt", "rb"), delimiter=";")
>>> Y = hierarchy.linkage(D, 'ward')
>>> assignments = hierarchy.fcluster(Y, 5, criterion="maxclust")
>>>
>>> I am taking my matrix of features, computing the Euclidean distance
>>> between them, and then passing them on to the hierarchical clustering
>>> method. From there, I am creating flat clusters, with a maximum of 5
>>> clusters.
>>>
>>> Now, based on the flat cluster assignments, how do I get the 1 x 22
>>> centroid that represents each flat cluster?
>>>
>>> Best.
>>> _______________________________________________
>>> scikit-learn mailing list
>>> scikit-learn at python.org
>>> https://mail.python.org/mailman/listinfo/scikit-learn
>> _______________________________________________
>> scikit-learn mailing list
>> scikit-learn at python.org
>> https://mail.python.org/mailman/listinfo/scikit-learn
>
> _______________________________________________
> scikit-learn mailing list
> scikit-learn at python.org
> https://mail.python.org/mailman/listinfo/scikit-learn
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From t3kcit at gmail.com Mon Oct 23 12:23:11 2017
From: t3kcit at gmail.com (Andreas Mueller)
Date: Mon, 23 Oct 2017 12:23:11 -0400
Subject: [scikit-learn] [ANN] scikit-learn 0.19.1 is out!
Message-ID: <786dc996-bd54-b3a4-3674-649a1b828334@gmail.com>

Hey everybody.

We just released 0.19.1, fixing some issues and bugs in the last release.
It's highly recommended you upgrade your installation. The new release is
available via pip, conda (main) and conda-forge.

A big thank you to everybody who contributed, in particular Joel (@jnothman)!

The release includes several improvements and fixes to the model_selection
and pipeline modules, and t-SNE.

You can find the full changelog here:
http://scikit-learn.org/stable/whats_new.html#version-0-19-1

Happy learning!

Andy
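A quick way to confirm which release is active after upgrading (plain Python,
nothing specific to 0.19.1 is assumed beyond an installed scikit-learn):

    import sklearn
    print(sklearn.__version__)  # expected to print '0.19.1' after the upgrade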
From gael.varoquaux at normalesup.org Mon Oct 23 12:27:33 2017
From: gael.varoquaux at normalesup.org (Gael Varoquaux)
Date: Mon, 23 Oct 2017 18:27:33 +0200
Subject: [scikit-learn] [ANN] scikit-learn 0.19.1 is out!
In-Reply-To: <786dc996-bd54-b3a4-3674-649a1b828334@gmail.com>
References: <786dc996-bd54-b3a4-3674-649a1b828334@gmail.com>
Message-ID: <20171023162733.GA1862803@phare.normalesup.org>

Hurray! Great job; thanks to all involved!

Gaël

On Mon, Oct 23, 2017 at 12:23:11PM -0400, Andreas Mueller wrote:
> Hey everybody.
> We just released 0.19.1, fixing some issues and bugs in the last release.
> It's highly recommended you upgrade your installation. The new release is
> available via pip, conda (main) and conda-forge.
> A big thank you to everybody who contributed, in particular Joel
> (@jnothman)!
> The release includes several improvements and fixes to the model_selection
> and pipeline modules, and t-SNE.
> You can find the full changelog here:
> http://scikit-learn.org/stable/whats_new.html#version-0-19-1
> Happy learning!
> Andy
> _______________________________________________
> scikit-learn mailing list
> scikit-learn at python.org
> https://mail.python.org/mailman/listinfo/scikit-learn
--
Gael Varoquaux
Researcher, INRIA Parietal
NeuroSpin/CEA Saclay, Bat 145, 91191 Gif-sur-Yvette France
Phone: ++ 33-1-69-08-79-68
http://gael-varoquaux.info            http://twitter.com/GaelVaroquaux

From raga.markely at gmail.com Mon Oct 23 13:10:48 2017
From: raga.markely at gmail.com (Raga Markely)
Date: Mon, 23 Oct 2017 12:10:48 -0500
Subject: [scikit-learn] [ANN] scikit-learn 0.19.1 is out!
In-Reply-To: <20171023162733.GA1862803@phare.normalesup.org>
References: <786dc996-bd54-b3a4-3674-649a1b828334@gmail.com> <20171023162733.GA1862803@phare.normalesup.org>
Message-ID: 

Great! Thank you very much!

Best,
Raga

On Oct 23, 2017 11:44 AM, "Gael Varoquaux" wrote:

> Hurray! Great job; thanks to all involved!
>
> Gaël
>
> On Mon, Oct 23, 2017 at 12:23:11PM -0400, Andreas Mueller wrote:
> > Hey everybody.
>
> > We just released 0.19.1, fixing some issues and bugs in the last release.
> > It's highly recommended you upgrade your installation. The new release is
> > available via pip, conda (main) and conda-forge.
>
> > A big thank you to everybody who contributed, in particular Joel
> > (@jnothman)!
>
> > The release includes several improvements and fixes to the
> model_selection
> > and pipeline modules, and t-SNE.
>
> > You can find the full changelog here:
> > http://scikit-learn.org/stable/whats_new.html#version-0-19-1
>
> > Happy learning!
>
> > Andy
> > _______________________________________________
> > scikit-learn mailing list
> > scikit-learn at python.org
> > https://mail.python.org/mailman/listinfo/scikit-learn
>
> --
> Gael Varoquaux
> Researcher, INRIA Parietal
> NeuroSpin/CEA Saclay, Bat 145, 91191 Gif-sur-Yvette France
> Phone: ++ 33-1-69-08-79-68
> http://gael-varoquaux.info http://twitter.com/GaelVaroquaux
> _______________________________________________
> scikit-learn mailing list
> scikit-learn at python.org
> https://mail.python.org/mailman/listinfo/scikit-learn
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From nikhil684 at gmail.com Tue Oct 24 15:20:27 2017
From: nikhil684 at gmail.com (Nikhil Rayaprolu)
Date: Wed, 25 Oct 2017 00:50:27 +0530
Subject: [scikit-learn] Interested to Contribute to Scikit Learn
Message-ID: 

Hello everyone,

I am Nikhil Rayaprolu, an undergraduate student from India, and I would like
to contribute to scikit-learn.
Can anyone help me by giving me some critical issue that could be a good
contribution to scikit-learn? I have come across Parallel decision tree
building in the proposed projects, so I would like to know its status; if it
is still open to contribution, I am happy to work on that issue.

Thank You
Nikhil Rayaprolu
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From joel.nothman at gmail.com Tue Oct 24 16:57:39 2017
From: joel.nothman at gmail.com (Joel Nothman)
Date: Wed, 25 Oct 2017 07:57:39 +1100
Subject: [scikit-learn] Interested to Contribute to Scikit Learn
In-Reply-To: 
References: 
Message-ID: 

hello and welcome Nikhil,

as described in our contributor guide, which you should read, we would much
prefer to make your acquaintance through non-critical contributions.

please start by looking for issues labelled as "easy" or "good first issue";
"help wanted" more generally indicates issues where assistance may be
appreciated (although that label can sometimes be out of date).

we look forward to your contributions.

On 25 Oct 2017 6:26 am, "Nikhil Rayaprolu" wrote:

> Hello everyone,
>
> I am Nikhil Rayaprolu, an undergraduate student from India, and I would
> like to contribute to scikit-learn. Can anyone help me by giving me some
> critical issue that could be a good contribution to scikit-learn? I have
> come across Parallel decision tree building in the proposed projects, so I
> would like to know its status; if it is still open to contribution, I am
> happy to work on that issue.
>
> Thank You
> Nikhil Rayaprolu
>
> _______________________________________________
> scikit-learn mailing list
> scikit-learn at python.org
> https://mail.python.org/mailman/listinfo/scikit-learn
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From l.lomasto at innovationengineering.eu Thu Oct 26 05:34:47 2017
From: l.lomasto at innovationengineering.eu (Luigi Lomasto)
Date: Thu, 26 Oct 2017 11:34:47 +0200
Subject: [scikit-learn] loading multi-targets
Message-ID: <003b01d34e3d$aa26fdd0$fe74f970$@innovationengineering.eu>

Hi,

I'm writing to ask a question. Can I load a multi-label dataset with the
load_files function? For example, I have a dataset like this:

folder
    folder1
        txt1
        txt2
    folder2
        txt2
        txt3

So, item txt2 has two labels, folder1 and folder2. How can I load
multi-targets?

Thanks for the answer. Best regards,

Luigi
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 
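One possible approach for a layout like the one above: load_files assigns a
single label per document (the folder it was read from), so the folder
memberships can instead be collected by hand and binarized into a multi-label
target matrix. This is only a sketch; the root path, file encoding, and the
assumption that the root contains only label folders are taken from the
example rather than from load_files itself.

    import os
    from sklearn.preprocessing import MultiLabelBinarizer

    root = "folder"                  # placeholder for the dataset root
    membership = {}                  # file name -> set of folder labels
    for sub in sorted(os.listdir(root)):
        for fname in os.listdir(os.path.join(root, sub)):
            membership.setdefault(fname, set()).add(sub)

    names = sorted(membership)
    # read each text once, from the first folder that contains it
    texts = [open(os.path.join(root, min(membership[n]), n), encoding="utf-8").read()
             for n in names]
    # binary indicator matrix: one column per folder label
    Y = MultiLabelBinarizer().fit_transform([membership[n] for n in names])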
From bruno.moura.pesquisa at gmail.com Thu Oct 26 06:14:18 2017
From: bruno.moura.pesquisa at gmail.com (Bruno Lustosa)
Date: Thu, 26 Oct 2017 07:14:18 -0300
Subject: [scikit-learn] loading multi-targets
In-Reply-To: <003b01d34e3d$aa26fdd0$fe74f970$@innovationengineering.eu>
References: <003b01d34e3d$aa26fdd0$fe74f970$@innovationengineering.eu>
Message-ID: 

Maybe you need to use np.loadtxt and specify the columns. Try using
"usecols" for this task:

file1 = np.loadtxt(path, usecols=(1, 2))
file2 = np.loadtxt(path, usecols=(1, 2))

I hope this helps!

Regards,
=========================
Prof. Bruno Lustosa de Moura
Instituto Federal do R. G. do Norte
IFRN - Campus Natal Central
+55 84 99991-9550
=========================
"Dai me Senhor Deus aquilo Que Vos resta..."

On 26 Oct 2017, at 06:34, Luigi Lomasto wrote:

> Hi,
>
> I'm writing to ask a question. Can I load a multi-label dataset with the
> load_files function? For example, I have a dataset like this:
>
> folder
>     folder1
>         txt1
>         txt2
>     folder2
>         txt2
>         txt3
>
> So, item txt2 has two labels, folder1 and folder2. How can I load
> multi-targets?
>
> Thanks for the answer. Best regards,
>
> Luigi
>
> _______________________________________________
> scikit-learn mailing list
> scikit-learn at python.org
> https://mail.python.org/mailman/listinfo/scikit-learn
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From gauravdhingra.gxyd at gmail.com Tue Oct 31 15:31:27 2017
From: gauravdhingra.gxyd at gmail.com (Gaurav Dhingra)
Date: Wed, 1 Nov 2017 01:01:27 +0530
Subject: [scikit-learn] Topic for thesis work on scikit-learn
Message-ID: <069bd230-f2e0-e5ef-4fdf-7d0c529c5d5f@gmail.com>

Hi everyone,

I am a final year (5th year) undergraduate Applied Mathematics student
in India. I am thinking of doing my final year thesis by doing some work
(the coding part) on scikit-learn, so I was wondering if anyone could tell
me whether there are available topics (not necessarily the names of those
topics) that I could work on as an undergraduate student. I would want to
expand upon this in December when my exams are over, but in the meantime I
want to take a step in that direction by just knowing whether there will be
available topics that I could work on.

It could be the case that the available topics are not so easy for an
undergraduate; in that case I would still like to do some research on the
topics first.

--
Best,
Gaurav Dhingra
(sent from Thunderbird email client)

From t3kcit at gmail.com Tue Oct 31 16:13:18 2017
From: t3kcit at gmail.com (Andreas Mueller)
Date: Tue, 31 Oct 2017 16:13:18 -0400
Subject: [scikit-learn] Topic for thesis work on scikit-learn
In-Reply-To: <069bd230-f2e0-e5ef-4fdf-7d0c529c5d5f@gmail.com>
References: <069bd230-f2e0-e5ef-4fdf-7d0c529c5d5f@gmail.com>
Message-ID: <9641a578-194f-183c-fa2c-22cb45a7c76d@gmail.com>

Hi Gaurav.

Do you have a local mentor? I think having a mentor who can guide you
during a thesis is very important. You could get some feedback from the
community for a contribution, but that can be slow, and it is entirely on a
volunteer basis, so there is no guarantee that you'll get the necessary
feedback in time to finish your thesis.

Mentoring a thesis - in particular without knowing you - is a serious
commitment, so I'm not sure someone from inside the project will want to do
this. I saw you already made a contribution in
https://github.com/scikit-learn/scikit-learn/pull/10005
but that's a very different scope from what I expect would be several
months of work.

Best,
Andy

On 10/31/2017 03:31 PM, Gaurav Dhingra wrote:
> Hi everyone,
>
> I am a final year (5th year) undergraduate Applied Mathematics student
> in India. I am thinking of doing my final year thesis by doing some work
> (the coding part) on scikit-learn, so I was wondering if anyone could tell
> me whether there are available topics (not necessarily the names of those
> topics) that I could work on as an undergraduate student. I would want to
> expand upon this in December when my exams are over, but in the meantime I
> want to take a step in that direction by just knowing whether there will
> be available topics that I could work on.
>
> It could be the case that the available topics are not so easy for an
> undergraduate; in that case I would still like to do some research on the
> topics first.
>

From gauravdhingra.gxyd at gmail.com Tue Oct 31 16:36:12 2017
From: gauravdhingra.gxyd at gmail.com (Gaurav Dhingra)
Date: Wed, 1 Nov 2017 02:06:12 +0530
Subject: [scikit-learn] Topic for thesis work on scikit-learn
In-Reply-To: <9641a578-194f-183c-fa2c-22cb45a7c76d@gmail.com>
References: <069bd230-f2e0-e5ef-4fdf-7d0c529c5d5f@gmail.com> <9641a578-194f-183c-fa2c-22cb45a7c76d@gmail.com>
Message-ID: 

Hi Andreas,

No, I don't have a local mentor (at least not right now; I hadn't thought of
it). I'll talk to a professor whom I know in my college's CS department. My
plan was to make some good volunteer contributions in December and, on the
basis of those contributions, to ask for a mentor. I totally understand that
core scikit-learn developers are not available full-time for this.

Considering my past experience contributing to SymPy with the help of my
mentor (in Finland), I think I can try to overcome the absence of a local
mentor (in case I don't get access to one); both of my past projects were
pure maths based. My only worry is whether I can choose a project with good
enough impact on scikit-learn's development. So the point is, I'll try to
make contributions, and if I think I've made good enough ones I'll ask on
this mailing list thread.

PS: Sorry, I mistakenly sent that mail to you; I intended to send it to the
list.

On Wednesday 01 November 2017 01:43 AM, Andreas Mueller wrote:
> Hi Gaurav.
>
> Do you have a local mentor? I think having a mentor who can guide you
> during a thesis is very important.
> You could get some feedback from the community for a contribution, but
> that can be slow, and it is entirely on a volunteer basis, so there is no
> guarantee that you'll get the necessary feedback in time to finish your
> thesis.
>
> Mentoring a thesis - in particular without knowing you - is a serious
> commitment, so I'm not sure someone
> from inside the project will want to do this. I saw you already made a
> contribution in https://github.com/scikit-learn/scikit-learn/pull/10005
> but that's a very different scope from what I expect would be several
> months of work.
>
> Best,
> Andy
>
> On 10/31/2017 03:31 PM, Gaurav Dhingra wrote:
>> Hi everyone,
>>
>> I am a final year (5th year) undergraduate Applied Mathematics student
>> in India. I am thinking of doing my final year thesis by doing some
>> work (the coding part) on scikit-learn, so I was wondering if anyone
>> could tell me whether there are available topics (not necessarily the
>> names of those topics) that I could work on as an undergraduate student.
>> I would want to expand upon this in December when my exams are over, but
>> in the meantime I want to take a step in that direction by just knowing
>> whether there will be available topics that I could work on.
>>
>> It could be the case that the available topics are not so easy for an
>> undergraduate; in that case I would still like to do some research on
>> the topics first.
>>
>
> _______________________________________________
> scikit-learn mailing list
> scikit-learn at python.org
> https://mail.python.org/mailman/listinfo/scikit-learn

--
Best,
Gaurav Dhingra
(sent from Thunderbird email client)