[Numpy-discussion] Problem with np.savetxt
Eric Wieser
wieser.eric+numpy at gmail.com
Thu Oct 10 10:22:46 EDT 2019
You're trying to read a file whose name is literally `${d}.log`, which is
unlikely to be the name of your file. `${}` is bash syntax, not Python
syntax, so Python never substitutes the value of `d` into the string.
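
One common way to hand the name over is as a command-line argument, which
Python exposes as `sys.argv`. The following is only a minimal sketch under
some assumptions: the helper names `file_names` and `convert` are made up for
illustration, and the `genfromtxt`/`savetxt` arguments are simply copied from
the working script quoted below.

```python
import sys


def file_names(ligand):
    """Return the (input, output) file names for one ligand label."""
    return f"{ligand}.log", f"{ligand}.dG"


def convert(ligand):
    """Read the affinity column from <ligand>.log and save it to <ligand>.dG."""
    import numpy as np  # imported here so file_names() works without NumPy

    log_name, out_name = file_names(ligand)
    # Same arguments as in the quoted script that works for "14-7.log".
    data = np.genfromtxt(log_name, usecols=(1), skip_header=27,
                         skip_footer=1, encoding=None)
    np.savetxt(out_name, data, fmt="%12.9f", header=ligand)


if __name__ == "__main__":
    # sys.argv[0] is the script name; sys.argv[1] is the first argument,
    # i.e. whatever bash substituted for "${d}" on the command line.
    if len(sys.argv) != 2:
        sys.exit("usage: python3 SCRIPT.py LIGAND")
    convert(sys.argv[1])
```

In the bash loop the call would then look like `python3 your_script.py "${d}"`,
where `your_script.py` stands for whatever the script is actually called. Bash
expands `${d}` before Python starts, so Python only ever sees `14-7`, `15-7`,
and so on.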
This has drifted out of numpy territory and into "how to coordinate between
bash and python" territory - I'd recommend you put this question to a wider
Python audience on StackOverflow, where you'll get a faster response.
Eric
On Thu, 10 Oct 2019 at 15:11, Stephen P. Molnar <s.molnar at sbcglobal.net>
wrote:
> I am slowly and not quickly stumbling forward, but at this point my
> degree of mental entropy (confusion) is monumental.
>
> This works:
>
> > import numpy as np
> >
> > print('${d}')
> >
> > data = np.genfromtxt("14-7.log", usecols=(1), skip_header=27,
> > skip_footer=1, encoding=None)
> >
> > print(data)
> >
> > np.savetxt('14-7.dG', data, fmt='%12.9f', header='14-7')
> > print(data)
>
> which produces:
>
> > runfile('/home/comp/Apps/Python/PsoVina/DeltaGTable_V_s.py',
> > wdir='/home/comp/Apps/Python/PsoVina', current_namespace=True)
> > ${d}
> > [-9.96090267 -8.97950478 -8.94261136 -8.91552301 -8.73650883 -8.66338714
> > -8.41073971 -8.38914635 -8.29679891 -8.16845411 -8.12799082 -8.12710377
> > -7.97909074 -7.94187268 -7.90076621 -7.88148523 -7.83782648 -7.8159095
> > -7.72254029 -7.72034674]
> > [-9.96090267 -8.97950478 -8.94261136 -8.91552301 -8.73650883 -8.66338714
> > -8.41073971 -8.38914635 -8.29679891 -8.16845411 -8.12799082 -8.12710377
> > -7.97909074 -7.94187268 -7.90076621 -7.88148523 -7.83782648 -7.8159095
> > -7.72254029 -7.72034674]
> Note: the print statements are for a quick check of the output, which is:
>
> > # 14-7
> > -9.960902669
> > -8.979504781
> > -8.942611364
> > -8.915523010
> > -8.736508831
> > -8.663387139
> > -8.410739711
> > -8.389146347
> > -8.296798909
> > -8.168454106
> > -8.127990818
> > -8.127103774
> > -7.979090739
> > -7.941872682
> > -7.900766215
> > -7.881485228
> > -7.837826485
> > -7.815909505
> > -7.722540286
> > -7.720346742
> Also, this bash script works:
>
> > #!/bin/bash
> >
> > # Run.dG.list_1
> >
> > while IFS= read -r d
> > do
> > echo "${d}.log"
> >
> > done <ligand.list
> which returns the four log file names:
>
> > 14-7.log
> > 15-7.log
> > 18-7.log
> > C-VX3.log
>
>
> But, if I run this bash script:
>
> > #!/bin/bash
> >
> > # Run.dG.list_1
> >
> > while IFS= read -r d
> > do
> > echo "${d}.log"
> > python3 DeltaGTable_V_sl.py
> >
> >
> > done <ligand.list
> >
> where DeltaGTable_V_sl.py is:
>
> > import numpy as np
> >
> > print('${d}')
> >
> > data = np.genfromtxt('${d}.log', usecols=(1), skip_header=27,
> > skip_footer=1, encoding=None)
> > print(data)
> >
> > np.savetxt('${d}.dG', data, fmt='%12.9f', header='${d}')
> > print(data.dG)
>
> I get:
>
> > (base) comp at AbNormal:~/Apps/Python/PsoVina$ sh ./Run.dG.list_1.sh
> > 14-7.log
> > python3: can't open file 'DeltaGTable_V_sl.py': [Errno 2] No such file
> > or directory
> > 15-7.log
> > python3: can't open file 'DeltaGTable_V_sl.py': [Errno 2] No such file
> > or directory
> > 18-7.log
> > python3: can't open file 'DeltaGTable_V_sl.py': [Errno 2] No such file
> > or directory
> > C-VX3.log
> > python3: can't open file 'DeltaGTable_V_sl.py': [Errno 2] No such file
> > or directory
>
> So, it would appear that the log file labels are in the workspace, but
> '${d}.log' is not being recognized as fname by genfromtxt. Although I
> have googled every combination of terms I can think of I am obviously
> missing something.
>
> As I have potentially hundreds of files to process, I would appreciate
> pointers towards a solution to the problem.
>
> Thanks in advance.
>
> On 10/08/2019 10:49 AM, Stephen P. Molnar wrote:
> > Many thanks for your kind replies.
> >
> > I really appreciate your suggestions.
> >
> > On 10/08/2019 09:44 AM, Andras Deak wrote:
> >> PS. if you just want to specify the width of the fields you wouldn't
> >> have to convert anything, because you can specify the size and
> >> justification of a %s format. But arguably having float data as floats
> >> is more natural anyway.
> >>
> >> On Tue, Oct 8, 2019 at 3:42 PM Andras Deak <deak.andris at gmail.com>
> >> wrote:
> >>> On Tue, Oct 8, 2019 at 3:17 PM Stephen P. Molnar
> >>> <s.molnar at sbcglobal.net> wrote:
> >>>> I am embarrassed to be asking this question, but I have exhausted
> >>>> Google at this point.
> >>>>
> >>>> I have a number of identically formatted text files from which I
> >>>> want to extract data. Here is an example (hopefully, putting it in
> >>>> as a quote will preserve the format):
> >>>>
> >>>>> =======================================================================
> >>>>>
> >>>>> PSOVina version 2.0
> >>>>> Giotto H. K. Tai & Shirley W. I. Siu
> >>>>>
> >>>>> Computational Biology and Bioinformatics Lab
> >>>>> University of Macau
> >>>>>
> >>>>> Visit http://cbbio.cis.umac.mo for more information.
> >>>>>
> >>>>> PSOVina was developed based on the framework of AutoDock Vina.
> >>>>>
> >>>>> For more information about Vina, please visit
> >>>>> http://vina.scripps.edu.
> >>>>>
> >>>>> =======================================================================
> >>>>>
> >>>>>
> >>>>> Output will be 13-7_out.pdbqt
> >>>>> Reading input ... done.
> >>>>> Setting up the scoring function ... done.
> >>>>> Analyzing the binding site ... done.
> >>>>> Using random seed: 1828390527
> >>>>> Performing search ... done.
> >>>>>
> >>>>> Refining results ... done.
> >>>>>
> >>>>> mode | affinity | dist from best mode
> >>>>> | (kcal/mol) | rmsd l.b.| rmsd u.b.
> >>>>> -----+------------+----------+----------
> >>>>> 1 -8.862004149 0.000 0.000
> >>>>> 2 -8.403522829 2.992 6.553
> >>>>> 3 -8.401384636 2.707 5.220
> >>>>> 4 -7.886402037 4.907 6.862
> >>>>> 5 -7.845519031 3.233 5.915
> >>>>> 6 -7.837434227 3.954 5.641
> >>>>> 7 -7.834584887 3.188 7.294
> >>>>> 8 -7.694395765 3.746 7.553
> >>>>> 9 -7.691211177 3.536 5.745
> >>>>> 10 -7.670759445 3.698 7.587
> >>>>> 11 -7.661882758 4.882 7.044
> >>>>> 12 -7.636280303 2.347 3.284
> >>>>> 13 -7.635788052 3.511 6.250
> >>>>> 14 -7.611175249 2.427 3.449
> >>>>> 15 -7.586368357 2.142 2.864
> >>>>> 16 -7.531307666 2.976 4.980
> >>>>> 17 -7.520501084 3.085 5.775
> >>>>> 18 -7.512906514 4.220 7.672
> >>>>> 19 -7.307403528 3.240 4.354
> >>>>> 20 -7.256063348 3.694 7.252
> >>>>> Writing output ... done.
> >>>> At this point, my python script consists of only the following:
> >>>>
> >>>>> #!/usr/bin/env python3
> >>>>> # -*- coding: utf-8 -*-
> >>>>> """
> >>>>>
> >>>>> Created on Tue Sep 24 07:51:11 2019
> >>>>>
> >>>>> """
> >>>>> import numpy as np
> >>>>>
> >>>>> data = []
> >>>>>
> >>>>> data = np.genfromtxt("13-7.log", usecols=(1), dtype=None,
> >>>>> skip_header=27, skip_footer=1, encoding=None)
> >>>>>
> >>>>> print(data)
> >>>>>
> >>>>> np.savetxt('13-7', [data], fmt='%15.9f', header='13-7')
> >>>> The problem lies in the np.savetxt line; on execution I get:
> >>>>
> >>>>> runfile('/home/comp/Apps/Models/1-PhosphorusLigands/CombinedLigands/MOL/Docking/VINA20/SmallSet/DeltaGTable_V_s.py',
> >>>>> wdir='/home/comp/Apps/Models/1-PhosphorusLigands/CombinedLigands/MOL/Docking/VINA20/SmallSet',
> >>>>> current_namespace=True)
> >>>>> ['-8.839713733' '-8.743377250' '-8.151051167' '-8.090452911'
> >>>>> '-7.967494477' '-7.854890056' '-7.757417879' '-7.741557490'
> >>>>> '-7.643885488' '-7.611595767' '-7.507605524' '-7.413920814'
> >>>>> '-7.389408331' '-7.384446364' '-7.374206276' '-7.368808179'
> >>>>> '-7.346641418' '-7.325037898' '-7.309614787' '-7.113209147']
> >>>>> Traceback (most recent call last):
> >>>>>
> >>>>> File "/home/comp/Apps/Models/1-PhosphorusLigands/CombinedLigands/MOL/Docking/VINA20/SmallSet/DeltaGTable_V_s.py",
> >>>>> line 16, in <module>
> >>>>> np.savetxt('13-7', [data], fmt='%16.9f', header='13-7')
> >>>>>
> >>>>> File "<__array_function__ internals>", line 6, in savetxt
> >>>>>
> >>>>> File "/home/comp/Apps/Miniconda3/lib/python3.7/site-packages/numpy/lib/npyio.py",
> >>>>> line 1438, in savetxt
> >>>>> % (str(X.dtype), format))
> >>>>>
> >>>>> TypeError: Mismatch between array dtype ('<U12') and format specifier
> >>>>> ('%16.9f %16.9f %16.9f %16.9f %16.9f %16.9f %16.9f %16.9f %16.9f
> >>>>> %16.9f %16.9f %16.9f %16.9f %16.9f %16.9f %16.9f %16.9f %16.9f %16.9f
> >>>>> %16.9f')
> >>>> The data is in the data file, but the only entry in '13-7', the saved
> >>>> file, is the label. Obviously, the error is in the format argument.
> >>> Hi,
> >>>
> >>> One problem is the format: the error is telling you that you have
> >>> strings in your array (compare the `'<U12'` dtype and the output of
> >>> your `print(data)` call with strings inside), whereas %16.9f can only
> >>> be used to format floats (f for float). You would first have to
> >>> convert your array of strings to an array of numbers. I don't usually use
> >>> genfromtxt so I'm not sure how you can make it return floats for you
> >>> in the first place, but I suspect `dtype=None` in the call to
> >>> genfromtxt might be responsible. In any case making it return numbers
> >>> should be the easier case.
> >>> The second problem is that you should make sure you mean `[data]` in
> >>> the call to savetxt. As it is now this would give you a 2d array of
> >>> shape (1, 20), and the output would correspondingly contain a single
> >>> row of 20 values (hence the 20 instances of '%16.9f' in the error
> >>> message). In case you meant to print one value per row in a single
> >>> column, you should drop the brackets around `data`:
> >>> np.savetxt('13-7', data, fmt='%16.9f', header='13-7')
> >>>
> >>> And just a personal note, but I'd find an output file named '13-7' to
> >>> be a bit surprising. Perhaps some extension or prefix would help
> >>> organize these files?
> >>> Regards,
> >>>
> >>> András
> >>>
> >>>> Help will be much appreciated.
> >>>>
> >>>> Thanks in advance.
> >>>>
> >>>> --
> >>>> Stephen P. Molnar, Ph.D.
> >>>> www.molecular-modeling.net
> >>>> 614.312.7528 (c)
> >>>> Skype: smolnar1
> >>>>
> >>>> _______________________________________________
> >>>> NumPy-Discussion mailing list
> >>>> NumPy-Discussion at python.org
> >>>> https://mail.python.org/mailman/listinfo/numpy-discussion