[scikit-learn] scikit-learn Digest, Vol 42, Issue 14
Sayak Paul
spsayakpaul at gmail.com
Fri Sep 13 01:16:09 EDT 2019
I was able to solve the problem using -
mlb = MultiLabelBinarizer()
mlb.fit([y_train])
Thanks for the suggestions. The output of mlb.classes_ now looks like the
following (first ten classes):
[image: image.png]
However, when I transform it using mlb.transform([y_train]), another
problem arises -
[image: image.png]
Kindly suggest :)
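
For reference, a minimal sketch of the input shape MultiLabelBinarizer expects: one iterable of labels per sample. It assumes (this is not confirmed in the thread) that each entry of the labels column is still a string such as "['stat.ML', 'cs.LG']" after loading the CSV. The names parse_label_strings, binarize, and raw are hypothetical, introduced only for illustration.

```python
import ast


def parse_label_strings(raw_labels):
    """Turn strings such as "['cs.CV', 'cs.LG']" into real Python lists."""
    return [ast.literal_eval(item) for item in raw_labels]


def binarize(label_lists):
    """Fit MultiLabelBinarizer on one list of labels per sample."""
    # Imported lazily so the parsing helper above works without scikit-learn.
    from sklearn.preprocessing import MultiLabelBinarizer

    mlb = MultiLabelBinarizer()
    indicator = mlb.fit_transform(label_lists)  # shape: (n_samples, n_classes)
    return mlb.classes_, indicator


# Hypothetical stand-in for the 'labels' column (assumed to hold strings).
raw = ["['stat.ML', 'cs.LG']", "['cs.CV', 'cs.RO']"]
parsed = parse_label_strings(raw)

# Iterating over a plain string yields characters, which is why fitting on
# unparsed strings produces single-character classes.
chars = sorted(set(raw[0]))
```

Calling binarize(parsed) returns the class names and an indicator matrix with one row per sample. Wrapping the labels in an extra list, as in fit([y_train]), makes the whole training set a single sample, so transform([y_train]) returns a single row rather than one row per document.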
Sayak Paul | sayak.dev
On Thu, Sep 12, 2019 at 9:33 PM <scikit-learn-request at python.org> wrote:
> Send scikit-learn mailing list submissions to
> scikit-learn at python.org
>
> To subscribe or unsubscribe via the World Wide Web, visit
> https://mail.python.org/mailman/listinfo/scikit-learn
> or, via email, send a message with subject or body 'help' to
> scikit-learn-request at python.org
>
> You can reach the person managing the list at
> scikit-learn-owner at python.org
>
> When replying, please edit your Subject line so it is more specific
> than "Re: Contents of scikit-learn digest..."
>
>
> Today's Topics:
>
> 1. Re: MultiLabelBinarizer gives individual characters instead
> of the classes (Loïc Estève)
> 2. Re: Vote on SLEP009: keyword only arguments (Guillaume Lemaître)
> 3. How can I enable line tracing for cython modules.
> (Alejandro Javier Peralta Frias)
>
>
> ----------------------------------------------------------------------
>
> Message: 1
> Date: Thu, 12 Sep 2019 07:24:48 +0200
> From: Loïc Estève <loic.esteve at ymail.com>
> To: Scikit-learn mailing list <scikit-learn at python.org>
> Subject: Re: [scikit-learn] MultiLabelBinarizer gives individual
> characters instead of the classes
> Message-ID: <vnokwoeeozmn.fsf at ymail.com>
> Content-Type: text/plain; charset=utf-8
>
> I think this caveat has been added in the dev doc (not yet in the stable
> doc). You may want to read:
>
> https://scikit-learn.org/dev/modules/generated/sklearn.preprocessing.MultiLabelBinarizer.html
> and in particular the part that starts with "A common mistake is to pass
> in a list".
>
> Cheers,
> Loïc
>
> > Hi.
> >
> > I am working on a multi-label text classification problem. In order to
> > encode the labels, I am using MultiLabelBinarizer. The labels of the
> > dataset look like -
> >
> > image
> >
> > When I am using
> >
> > mlb = MultiLabelBinarizer()
> > mlb.fit(labels)
> > print(mlb.classes_)
> >
> > I am getting -
> >
> > image
> >
> > Whereas, the output (sample output) I want is -
> >
> > image
> >
> > I got the above output by -
> >
> > mlb = MultiLabelBinarizer()
> > sample_labels = [
> >     ['stat.ML', 'cs.LG'],
> >     ['cs.CV', 'cs.RO']
> > ]
> > mlb.fit(sample_labels)
> > print(mlb.classes_)
> >
> > Help would be very much appreciated here.
> >
> > Here's the dataset I had prepared:
> > arXivdata.csv.zip
> >
> > I stripped away the double quotes in the labels after loading it in a
> > pandas DataFrame by -
> >
> > import re
> >
> > arxiv_data['labels'] = arxiv_data['labels'].str.replace(r"[\"]", '')
> >
> > scikit-learn version: '0.21.3'
> >
> > Sayak Paul | sayak.dev
>
>
>
> ------------------------------
>
> Message: 2
> Date: Thu, 12 Sep 2019 10:06:30 +0200
> From: Guillaume Lemaître <g.lemaitre58 at gmail.com>
> To: Scikit-learn mailing list <scikit-learn at python.org>
> Subject: Re: [scikit-learn] Vote on SLEP009: keyword only arguments
> Message-ID: <CACDxx9jCkE5GAjRNj3TKinbuyWZQvXMrrcHBBqn6q_FXYdPrbQ at mail.gmail.com>
> Content-Type: text/plain; charset="utf-8"
>
> To the question: do we want to utilise Python 3's force-keyword-argument
> syntax and to change existing APIs which support arguments positionally to
> use this syntax, via a deprecation period?
>
> I am +1.
>
> IMO, even if the syntax might be unknown, it will remain unknown as long as
> projects from the ecosystem are not using it.
>
> To the question: which methods should be impacted?
>
> I think we should be as gentle as possible at first. I am a little
> concerned about breaking code that was working fine before.
>
> On Thu, 12 Sep 2019 at 04:43, Joel Nothman <joel.nothman at gmail.com> wrote:
>
> > There are details of specific API changes to be decided:
> >
> > The question being put, as per the SLEP, is:
> > do we want to utilise Python 3's force-keyword-argument syntax
> > and to change existing APIs which support arguments positionally to use
> > this syntax, via a deprecation period?
> >
>
>
> --
> Guillaume Lemaitre
> INRIA Saclay - Parietal team
> Center for Data Science Paris-Saclay
> https://glemaitre.github.io/
>
> ------------------------------
>
> Message: 3
> Date: Thu, 12 Sep 2019 09:23:03 -0300
> From: Alejandro Javier Peralta Frias
> <alejandro.peralta at mercadolibre.com>
> To: scikit-learn at python.org
> Subject: [scikit-learn] How can I enable line tracing for cython
> modules.
> Message-ID: <CAL+ZpG6ccwnnJm1Q2CQM4qt+sfiMtHV5Tr=mgsgFpcmASzUhZA at mail.gmail.com>
> Content-Type: text/plain; charset="utf-8"
>
> Hello all,
>
> To enable Cython tracing (in particular, I want to line trace the neighbors
> module), I understand that I have to recompile the Cython modules with
> CYTHON_TRACE=1, but I'm not sure where I should set this.
>
> Should I use:
>
> # distutils: define_macros=CYTHON_TRACE_NOGIL=1
>
>
> In the files I want to trace?
>
> Regards,
> --
> Ale
>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: image.png
Type: image/png
Size: 16117 bytes
Desc: not available
URL: <http://mail.python.org/pipermail/scikit-learn/attachments/20190913/921c80cd/attachment-0002.png>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: image.png
Type: image/png
Size: 7675 bytes
Desc: not available
URL: <http://mail.python.org/pipermail/scikit-learn/attachments/20190913/921c80cd/attachment-0003.png>