<html>
<head>
<meta content="text/html; charset=UTF-8" http-equiv="Content-Type">
</head>
<body bgcolor="#FFFFFF" text="#000000">
<p>I think I saw it in the Deep Learning book:
<a class="moz-txt-link-freetext" href="http://www.deeplearningbook.org/">http://www.deeplearningbook.org/</a><br>
</p>
Bill<br>
<br>
<div class="moz-cite-prefix">On 3/28/17 9:48 AM, Henrique C. S.
Junior wrote:<br>
</div>
<blockquote
cite="mid:CAEeMmB=+wWLedcOJWu8yCPLbieyuuoDwROA3RYcb1C9B3xmL8Q@mail.gmail.com"
type="cite">
<div dir="ltr">
<div class="gmail_default"
style="font-family:monospace,monospace">@Tommaso, this is
something like Internal Coordinates[1], right?</div>
<div class="gmail_default"
style="font-family:monospace,monospace">@Bill, thanks for the
hint, I'll definitely take a look at this.</div>
<div class="gmail_default"
style="font-family:monospace,monospace"><br>
</div>
<div class="gmail_default"
style="font-family:monospace,monospace">[1] - <a
moz-do-not-send="true"
href="https://en.wikipedia.org/wiki/Z-matrix_%28chemistry%29">https://en.wikipedia.org/wiki/Z-matrix_(chemistry)</a></div>
</div>
<div class="gmail_extra"><br>
<div class="gmail_quote">On Tue, Mar 28, 2017 at 2:12 AM, Bill
Ross <span dir="ltr"><<a moz-do-not-send="true"
href="mailto:ross@cgl.ucsf.edu" target="_blank">ross@cgl.ucsf.edu</a>></span>
wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0
.8ex;border-left:1px #ccc solid;padding-left:1ex">
<div bgcolor="#FFFFFF" text="#000000">
<p>Image processing deals with xy coordinates by (as I
understand it) training with multiple transformed copies of the
raw data, in the form of translations and rotations in
2D space. If training with 3D data, there would be
that much more translating and rotating to do, in order
to divorce the learning from the incidentals.</p>
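<p>A minimal sketch of that augmentation idea for 3D coordinates, assuming NumPy is available. The helper names (random_rotation, augment), the coordinates, and the translation range are hypothetical and only for illustration. Interatomic distances are unchanged by these operations, which is the incidental part the model should not have to learn.</p>
<pre>
```python
import numpy as np


def random_rotation(rng):
    """Return a random 3x3 rotation matrix (orthogonal, determinant +1)."""
    q, _ = np.linalg.qr(rng.normal(size=(3, 3)))  # q is orthogonal
    q[:, 0] *= np.sign(np.linalg.det(q))          # flip one axis if q is a reflection
    return q


def augment(coords, n_copies, seed=0):
    """Return n_copies randomly rotated and translated copies of coords (N x 3)."""
    rng = np.random.default_rng(seed)
    copies = []
    for _ in range(n_copies):
        rot = random_rotation(rng)
        shift = rng.uniform(-5.0, 5.0, size=3)  # arbitrary translation range
        copies.append(coords @ rot.T + shift)
    return copies


# Hypothetical three-atom molecule, one row per atom (x, y, z).
molecule = np.array([[0.0, 0.0, 0.0],
                     [1.0, 0.0, 0.0],
                     [0.0, 1.5, 0.0]])
augmented = augment(molecule, n_copies=4)
```
</pre>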
<span class="HOEnZb"><font color="#888888">
<p>Bill<br>
</p>
</font></span>
<div>
<div class="h5"> <br>
<div class="m_-7569063688226978064moz-cite-prefix">On
3/27/17 4:35 PM, Tommaso Costanzo wrote:<br>
</div>
<blockquote type="cite">
<div dir="ltr">
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>Dear Henrique,<br>
</div>
I am sorry for the unclear email I wrote before. My point was
simply that if you use the raw coordinates from an .xyz file as
"features", the model will learn at which coordinates certain
atoms occur, so it can only make predictions about coordinates.
However, if I understood correctly, the "features" that determine
the coupling J are distance, angle, and number of electrons.
These properties can be derived from the XYZ file with simple
geometric calculations, and the number of electrons depends on
the type of atom. So, instead of using the XYZ file directly as
input for scikit-learn, I suggest calculating the angles,
distances, and electron counts in advance (with other software
or directly in Python) and using the resulting matrix as input
for scikit-learn. In this case the model will learn how J(AB)
varies as a function of angle, distance, and number of electrons. <br>
</div>
For example <br>
</div>
<br>
distance angle n el.<br>
1 90 1<br>
1 90 1<br>
2 90 1<br>
.... ... ...<br>
<br>
</div>
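<div>As a rough sketch, rows like these could be computed directly in Python with NumPy. The function names and coordinates are hypothetical, and taking the angle at a third atom that bridges A and B is an assumption about which angle is meant.</div>
<pre>
```python
import numpy as np


def distance(a, b):
    """Euclidean distance between two points."""
    return float(np.linalg.norm(np.asarray(a, dtype=float) - np.asarray(b, dtype=float)))


def angle_deg(a, vertex, b):
    """Angle a-vertex-b in degrees."""
    v1 = np.asarray(a, dtype=float) - np.asarray(vertex, dtype=float)
    v2 = np.asarray(b, dtype=float) - np.asarray(vertex, dtype=float)
    cos_theta = np.dot(v1, v2) / (np.linalg.norm(v1) * np.linalg.norm(v2))
    # Clip to guard against rounding slightly outside [-1, 1].
    return float(np.degrees(np.arccos(np.clip(cos_theta, -1.0, 1.0))))


# Hypothetical coordinates: centers A and B linked through a bridging atom X.
atom_a = [1.0, 0.0, 0.0]
atom_x = [0.0, 0.0, 0.0]
atom_b = [0.0, 1.0, 0.0]
n_unpaired = 1  # would come from the atom type; hard-coded here

# One feature row: distance, angle, number of electrons.
row = [distance(atom_a, atom_b), angle_deg(atom_a, atom_x, atom_b), n_unpaired]
```
</pre>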
If you are using supervised learning, you will have to add a
4th column (in practice, a separate target vector) with your
J(AB) values, on which you can train your model and then
predict the unknown samples.<br>
<br>
</div>
For example <br>
distance angle n el. J(AB)<br>
1 90 1 1<br>
1 90 1 1<br>
2 90 1 0.5<br>
.... ... ...
...<br>
<br>
</div>
<div>Now if you train the model on the
second matrix and then try to predict
the first one, you should expect a result
like:<br>
<br>
1<br>
1<br>
0.5<br>
<br>
</div>
Of course, in this case the "features" to predict are
identical to the training ones, so the example is
completely unrealistic. However, I hope that
it helps clarify what I was
explaining in the previous email.<br>
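<div>The toy example above could be written with scikit-learn roughly as follows. LinearRegression is an arbitrary choice made here for illustration, not a recommendation; any regressor with fit and predict methods would be used the same way.</div>
<pre>
```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Feature matrix from the toy example: distance, angle, number of electrons.
X_train = np.array([[1.0, 90.0, 1.0],
                    [1.0, 90.0, 1.0],
                    [2.0, 90.0, 1.0]])
# J(AB) is kept as a separate target vector, not as a 4th column of X.
y_train = np.array([1.0, 1.0, 0.5])

model = LinearRegression()
model.fit(X_train, y_train)

# Predict the same rows back, as in the toy example.
predicted = model.predict(X_train)
print(predicted)
```
</pre>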
If you want, you can contact me directly at
this email, and I hope that you got
additional hints from Robert, who seems
to be even more knowledgeable than I am.<br>
</div>
<br>
</div>
Sincerely <br>
</div>
Tommaso<br>
<div>
<div><br>
<div><br>
</div>
</div>
</div>
</div>
<div class="gmail_extra"><br>
<div class="gmail_quote">2017-03-27 18:44
GMT-04:00 Henrique C. S. Junior <span dir="ltr"><<a
moz-do-not-send="true"
href="mailto:henriquecsj@gmail.com"
target="_blank">henriquecsj@gmail.com</a>></span>:<br>
<blockquote class="gmail_quote" style="margin:0
0 0 .8ex;border-left:1px #ccc
solid;padding-left:1ex">
<div dir="ltr">
<div class="gmail_default"
style="font-family:monospace,monospace">Dear
Tommaso, thank you for your kind reply.</div>
<div class="gmail_default"
style="font-family:monospace,monospace">I
know I have a lot to study before actually
starting any code and that's why any
suggestion is so valuable.</div>
<div class="gmail_default"
style="font-family:monospace,monospace">So,
you're suggesting that a simplification of
the system using only the paramagnetic
centers can be a good approach? (I'm not
sure if I understood it correctly).</div>
<div class="gmail_default"
style="font-family:monospace,monospace">My
main idea was, at first, to try to represent
the systems as realistically as possible
(using coordinates). I know that the
software will not know what a bond or
an intermolecular interaction is but,
let's say, after including thousands of
examples in the training set, I was expecting
that (as an example) finding a C at 0.000 and
an H at 1.000 would start to "make sense"
because it leads to an experimental trend.
And I totally agree that my way of
representing the system is not the best.</div>
<div class="gmail_default"
style="font-family:monospace,monospace"><br>
</div>
<div class="gmail_default"
style="font-family:monospace,monospace">Thank
you so much for all the help.</div>
</div>
<div class="m_-7569063688226978064HOEnZb">
<div class="m_-7569063688226978064h5">
<div class="gmail_extra"><br>
<div class="gmail_quote">On Mon, Mar 27,
2017 at 4:15 PM, Tommaso Costanzo <span
dir="ltr"><<a
moz-do-not-send="true"
href="mailto:tommaso.costanzo01@gmail.com"
target="_blank">tommaso.costanzo01@gmail.com</a>></span>
wrote:<br>
<blockquote class="gmail_quote"
style="margin:0 0 0
.8ex;border-left:1px #ccc
solid;padding-left:1ex">
<div dir="ltr">
<p
style="margin:0px;text-indent:0px"><span
style="font-family:"ubuntu";font-size:12pt">Dear Henrique,</span></p>
<p
style="margin:0px;text-indent:0px;font-family:"ubuntu";font-size:12pt"><br>
</p>
<p
style="margin:0px;text-indent:0px"><span
style="font-family:"ubuntu";font-size:12pt">I agree with
Robert on the use of a
supervised algorithm, and I
would also suggest trying
a semi-supervised one if you
have trouble labeling your
data. </span></p>
<p
style="margin:0px;text-indent:0px;font-family:"ubuntu";font-size:12pt"><br>
</p>
<p
style="margin:0px;text-indent:0px"><span
style="font-family:"ubuntu";font-size:12pt">Moreover, as a
chemist, I think that the input
you are planning to use is not
in the best form for
machine learning, because you
are trying to predict the coupling
J(AB) but in the feature space
you have only coordinates
(XYZ). What I suggest is to
generate the pairs of atoms
externally and then use a
matrix of the form (Mx3),
where M is the number of atom pairs
for which you want to predict J and
3 is the number of features for each
pair (distance, angle,
unpaired electrons). For a
supervised approach you will
need a training set where
J is known, so your training
data will be of the form Mx4,
where the fourth column is
the known J.</span></p>
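<p style="margin:0px;text-indent:0px">A hedged sketch of how such an Mx3 matrix could be assembled with NumPy and itertools. The function name pair_features, the data layout, the use of one shared bridging position for the angle, and summing the unpaired electrons of both centers are all assumptions made for illustration. Molecules of different sizes contribute different numbers of rows, but every row has the same three columns.</p>
<pre>
```python
from itertools import combinations

import numpy as np


def pair_features(centers, bridge):
    """Return one row per pair of centers: distance, angle, unpaired electrons.

    centers: list of (xyz, n_unpaired) tuples (hypothetical layout).
    bridge:  xyz of the atom at which the angle is measured (an assumption).
    """
    bridge = np.asarray(bridge, dtype=float)
    rows = []
    for (xyz_a, n_a), (xyz_b, n_b) in combinations(centers, 2):
        a = np.asarray(xyz_a, dtype=float)
        b = np.asarray(xyz_b, dtype=float)
        v1, v2 = a - bridge, b - bridge
        cos_theta = np.dot(v1, v2) / (np.linalg.norm(v1) * np.linalg.norm(v2))
        angle = np.degrees(np.arccos(np.clip(cos_theta, -1.0, 1.0)))
        rows.append([np.linalg.norm(a - b), angle, n_a + n_b])
    return np.array(rows)


# Two hypothetical molecules with different numbers of paramagnetic centers.
small = [([1.0, 0.0, 0.0], 1), ([0.0, 1.0, 0.0], 1)]
large = [([1.0, 0.0, 0.0], 1), ([0.0, 1.0, 0.0], 1), ([-1.0, 0.0, 0.0], 2)]
bridge = [0.0, 0.0, 0.0]

# One row per pair (1 + 3 = 4 rows), always 3 columns.
X = np.vstack([pair_features(small, bridge), pair_features(large, bridge)])
```
</pre>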
<p
style="margin:0px;text-indent:0px"><span
style="font-family:"ubuntu";font-size:12pt">I hope this is
clear; if not, I will be happy
to help more.</span></p>
<p
style="margin:0px;text-indent:0px;font-family:"ubuntu";font-size:12pt"><br>
</p>
<p
style="margin:0px;text-indent:0px"><span
style="font-family:"ubuntu";font-size:12pt">Sincerely</span></p>
<p
style="margin:0px;text-indent:0px"><span
style="font-family:"ubuntu";font-size:12pt">Tommaso</span></p>
</div>
<div class="gmail_extra">
<div>
<div
class="m_-7569063688226978064m_-419284271361902240h5"><br>
<div class="gmail_quote">2017-03-27
13:46 GMT-04:00 Henrique C.
S. Junior <span dir="ltr"><<a
moz-do-not-send="true"
href="mailto:henriquecsj@gmail.com"
target="_blank">henriquecsj@gmail.com</a>></span>:<br>
<blockquote
class="gmail_quote"
style="margin:0 0 0
.8ex;border-left:1px #ccc
solid;padding-left:1ex">
<div dir="ltr">
<div
class="gmail_default"
style="font-family:monospace,monospace">Dear Robert, thank you. Yes, I'd
like to talk about
some specifics on the
project.</div>
<div
class="gmail_default"
style="font-family:monospace,monospace">Thank you again.</div>
</div>
<div
class="m_-7569063688226978064m_-419284271361902240m_-8383123951498439579HOEnZb">
<div
class="m_-7569063688226978064m_-419284271361902240m_-8383123951498439579h5">
<div
class="gmail_extra"><br>
<div
class="gmail_quote">On
Mon, Mar 27, 2017
at 2:25 PM, Robert
Slater <span
dir="ltr"><<a
moz-do-not-send="true" href="mailto:rdslater@gmail.com" target="_blank">rdslater@gmail.com</a>></span>
wrote:<br>
<blockquote
class="gmail_quote"
style="margin:0
0 0
.8ex;border-left:1px
#ccc
solid;padding-left:1ex">
<div dir="ltr">You definitely can use some of the tools in
scikit-learn for supervised machine learning. The real trick
will be how representative your training set is of the systems
you will later predict on. All of the various regression
algorithms would be of some value, and you may even consider an
ensemble to help generalize. There will be some important
questions to answer: what kind of loss function do you want to
look at? I assumed regression (continuous response), but it
could also be classification: paramagnetic, diamagnetic,
ferromagnetic, etc.
<div><br>
</div>
<div>Another
task to think
about might be
dimension
reduction.</div>
<div>There is no guarantee you will get fantastic results. Every
problem is unique, and much will depend on exactly what you want
out of the solution. It may be that we get '10%' accuracy at
best; for some systems that is quite good, for others it is
horrible.<br>
</div>
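<div>A minimal sketch of how a plain regressor and an ensemble could be compared with cross-validation in scikit-learn. The data below are synthetic stand-ins (not real couplings), and the choice of Ridge, RandomForestRegressor, and mean absolute error as the loss is an assumption for illustration only.</div>
<pre>
```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import Ridge
from sklearn.model_selection import cross_val_score

# Synthetic stand-in data (NOT real couplings): distance, angle, electrons.
rng = np.random.default_rng(0)
n_samples = 80
X = np.column_stack([
    rng.uniform(1.0, 4.0, n_samples),     # distance
    rng.uniform(80.0, 180.0, n_samples),  # angle in degrees
    rng.integers(1, 6, n_samples),        # unpaired electrons
])
# Made-up relation plus noise, used only so the models have something to fit.
y = 1.0 / X[:, 0] + 0.01 * X[:, 2] + rng.normal(0.0, 0.02, n_samples)

models = {
    "ridge": Ridge(alpha=1.0),
    "random_forest": RandomForestRegressor(n_estimators=50, random_state=0),
}

scores = {}
for name, model in models.items():
    # Mean absolute error is one possible loss; scikit-learn reports it negated.
    cv = cross_val_score(model, X, y, cv=5, scoring="neg_mean_absolute_error")
    scores[name] = -cv.mean()
    print(name, round(scores[name], 4))
```
</pre>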
<div><br>
</div>
<div>If you'd like to talk specifics, feel free to contact me at
this email. I have a background in magnetism (PhD in magnetic
multilayers; I was in physics, but as you are probably aware,
chemistry and physics blend in this area) and have a fairly good
knowledge of scikit-learn and machine learning. </div>
<div><br>
</div>
<div><br>
</div>
</div>
<div
class="gmail_extra"><br>
<div
class="gmail_quote">
<div>
<div
class="m_-7569063688226978064m_-419284271361902240m_-8383123951498439579m_6033336047822367828h5">On
Mon, Mar 27,
2017 at 10:50
AM, Henrique
C. S. Junior <span
dir="ltr"><<a
moz-do-not-send="true" href="mailto:henriquecsj@gmail.com"
target="_blank">henriquecsj@gmail.com</a>></span>
wrote:<br>
</div>
</div>
<blockquote
class="gmail_quote"
style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div>
<div
class="m_-7569063688226978064m_-419284271361902240m_-8383123951498439579m_6033336047822367828h5">
<div dir="ltr">
<div
class="gmail_default"
style="font-family:monospace,monospace">
<p
style="margin:0cm
0cm
12pt;background-image:initial;background-position:initial;background-size:initial;background-repeat:initial;background-origin:initial;background-clip:initial"><span
style="color:rgb(36,39,41)" lang="EN-US">I'm a chemist with some
rudimentary programming skills (getting started with Python),
and in the middle of the year I'll be starting a Ph.D. project
that uses computers to describe magnetism in molecular
systems.<span></span></span></p>
<p
style="margin:0cm
0cm
12pt;background-image:initial;background-position:initial;background-size:initial;background-repeat:initial;background-origin:initial;background-clip:initial"><span
style="color:rgb(36,39,41)" lang="EN-US">Most of the time I get my
results after several simulations and experiments, so I know
that one of the hardest tasks in molecular magnetism is to
predict the nature of magnetic interactions. That's why I'll try
to tackle this problem with machine learning (because such
interactions depend, basically, on distances, angles and number
of unpaired electrons). The idea is to feed the computer a large
training set (with the number of unpaired electrons, the XYZ
coordinates of each molecule and the experimental magnetic
couplings) and see if it can predict the magnetic couplings
(J(AB)) of new systems:<span></span></span></p>
</div>
<div>
<div
class="gmail_default"
style="font-family:monospace,monospace">(see example in the attached
image)</div>
<div
class="gmail_default"
style="font-family:monospace,monospace"><br>
</div>
<div
class="gmail_default"
style="font-family:monospace,monospace">Can scikit-learn handle the
task, knowing that the matrices used to represent atomic
coordinates will probably have different numbers of rows
(because some molecules have more atoms than others)? Or is
this a job better suited to other software or another approach?
</div>
<span
class="m_-7569063688226978064m_-419284271361902240m_-8383123951498439579m_6033336047822367828m_-1717598575983325084HOEnZb"><font
color="#888888"><br>
</font></span></div>
<span
class="m_-7569063688226978064m_-419284271361902240m_-8383123951498439579m_6033336047822367828m_-1717598575983325084HOEnZb"><font
color="#888888">
<div><br>
</div>
-- <br>
<div
class="m_-7569063688226978064m_-419284271361902240m_-8383123951498439579m_6033336047822367828m_-1717598575983325084m_-4201444065020757644gmail_signature">
<div dir="ltr">
<div>
<div dir="ltr">
<div>
<div dir="ltr">
<div>
<div dir="ltr">
<div>
<div dir="ltr">
<div>
<div dir="ltr">
<div>
<div dir="ltr">
<div>
<div dir="ltr"><span
style="color:rgb(139,139,139)"><font face="monospace, monospace"><b><font
color="#808080">Henrique C. S. Junior</font></b><br>
Industrial
Chemist -
UFRRJ</font></span></div>
<div dir="ltr"><span
style="color:rgb(139,139,139)"><font face="monospace, monospace">M. Sc.
Inorganic
Chemistry -
UFRRJ<br>
Data
Processing
Center - PMP</font><br>
</span></div>
</div>
<div><span
style="color:rgb(139,139,139)"><font
face="monospace, monospace">Visite o <a moz-do-not-send="true"
href="http://mundoquimico.com.br"
target="_blank">Mundo Químico</a></font></span></div>
</div>
</div>
</div>
</div>
</div>
</div>
</div>
</div>
</div>
</div>
</div>
</div>
</div>
</div>
</font></span></div>
<br>
</div>
</div>
<span>______________________________<wbr>_________________<br>
scikit-learn
mailing list<br>
<a
moz-do-not-send="true"
href="mailto:scikit-learn@python.org" target="_blank">scikit-learn@python.org</a><br>
<a
moz-do-not-send="true"
href="https://mail.python.org/mailman/listinfo/scikit-learn"
rel="noreferrer"
target="_blank">https://mail.python.org/mailma<wbr>n/listinfo/scikit-learn</a><br>
<br>
</span></blockquote>
</div>
<br>
</div>
<br>
</blockquote>
</div>
<br>
<br clear="all">
<div><br>
</div>
</div>
</div>
</div>
<br>
</blockquote>
</div>
<br>
<br clear="all">
<br>
-- <br>
</div>
</div>
<div
class="m_-7569063688226978064m_-419284271361902240m_-8383123951498439579gmail_signature"
data-smartmail="gmail_signature">
<div dir="ltr"><span></span><span>Please
do NOT send Microsoft Office
Attachments:</span><br>
<div> <a
moz-do-not-send="true"
href="http://www.gnu.org/philosophy/no-word-attachments.html"
target="_blank">http://www.gnu.org/philosophy/<wbr>no-word-attachments.html</a></div>
</div>
</div>
</div>
<br>
</blockquote>
</div>
<br>
<br clear="all">
<div><br>
</div>
</div>
</div>
</div>
<br>
</blockquote>
</div>
<br>
<br clear="all">
<br>
</div>
<br>
<fieldset
class="m_-7569063688226978064mimeAttachmentHeader"></fieldset>
<br>
</blockquote>
</div></div></div>
</blockquote></div>
<div>
</div>--
<div class="gmail_signature" data-smartmail="gmail_signature"><div dir="ltr"><div><div dir="ltr"><div><div dir="ltr"><div><div dir="ltr"><div><div dir="ltr"><div><div dir="ltr"><div><div dir="ltr"><div><div dir="ltr"><span style="color:rgb(139,139,139)"><font face="monospace, monospace"><b><font color="#808080">Henrique C. S. Junior</font></b>
Industrial Chemist - UFRRJ</font></span></div><div dir="ltr"><span style="color:rgb(139,139,139)"><font face="monospace, monospace">M. Sc. Inorganic Chemistry - UFRRJ
Data Processing Center - PMP</font>
</span></div></div><div><span style="color:rgb(139,139,139)"><font face="monospace, monospace">Visite o <a moz-do-not-send="true" href="http://mundoquimico.com.br" target="_blank">Mundo Químico</a></font></span></div></div></div></div></div></div></div></div></div></div></div></div></div></div></div>
</div>
<fieldset class="mimeAttachmentHeader"></fieldset>
<pre wrap="">_______________________________________________
scikit-learn mailing list
<a class="moz-txt-link-abbreviated" href="mailto:scikit-learn@python.org">scikit-learn@python.org</a>
<a class="moz-txt-link-freetext" href="https://mail.python.org/mailman/listinfo/scikit-learn">https://mail.python.org/mailman/listinfo/scikit-learn</a>
</pre>
</blockquote>
</body></html>