[Numpy-discussion] Problem with np.savetxt

Eric Wieser wieser.eric+numpy at gmail.com
Thu Oct 10 10:22:46 EDT 2019


You're trying to read a file whose name is literally `${d}.log`, which is
unlikely to be the name of your file. `${}` is bash syntax, not Python
syntax, so Python leaves it in the string unexpanded.
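
One way to hand the name over is a command-line argument. A minimal sketch,
assuming the bash loop passes each name to the script and the script reads it
from `sys.argv`; the function name `convert` is made up for illustration, and
the `genfromtxt`/`savetxt` arguments are copied from your working version:

```python
import sys

import numpy as np


def convert(name):
    """Read column 1 of '<name>.log' and write it to '<name>.dG'."""
    # Build the file names in Python; '${d}' means nothing inside a Python string.
    data = np.genfromtxt(f"{name}.log", usecols=(1), skip_header=27,
                         skip_footer=1, encoding=None)
    np.savetxt(f"{name}.dG", data, fmt="%12.9f", header=name)
    return data


if __name__ == "__main__":
    # Bash expands "${d}" before Python starts, so argv[1] is e.g. '14-7'.
    for name in sys.argv[1:]:
        print(convert(name))
```

In the bash loop you would then call `python3 DeltaGTable_V_sl.py "${d}"`, so
bash expands `${d}` before Python ever sees it. Separately, the output you
posted shows `python3` failing to open `DeltaGTable_V_sl.py` at all, so it is
worth checking that script's name and directory (your earlier run used
`DeltaGTable_V_s.py`).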

This has drifted out of numpy territory and into "how to coordinate between
bash and Python" territory. I'd recommend asking a wider Python audience on
StackOverflow, where you'll get a faster response.

Eric

On Thu, 10 Oct 2019 at 15:11, Stephen P. Molnar <s.molnar at sbcglobal.net>
wrote:

> I am slowly and not quickly stumbling forward, but at this point my
> degree of mental entropy (confusion) is monumental.
>
> This works:
>
> > import numpy as np
> >
> > print('${d}')
> >
> > data = np.genfromtxt("14-7.log", usecols=(1), skip_header=27,
> > skip_footer=1, encoding=None)
> >
> > print(data)
> >
> > np.savetxt('14-7.dG', data, fmt='%12.9f', header='14-7')
> > print(data)
>
> which produces:
>
> > runfile('/home/comp/Apps/Python/PsoVina/DeltaGTable_V_s.py',
> > wdir='/home/comp/Apps/Python/PsoVina', current_namespace=True)
> > ${d}
> > [-9.96090267 -8.97950478 -8.94261136 -8.91552301 -8.73650883 -8.66338714
> >  -8.41073971 -8.38914635 -8.29679891 -8.16845411 -8.12799082 -8.12710377
> >  -7.97909074 -7.94187268 -7.90076621 -7.88148523 -7.83782648 -7.8159095
> >  -7.72254029 -7.72034674]
> > [-9.96090267 -8.97950478 -8.94261136 -8.91552301 -8.73650883 -8.66338714
> >  -8.41073971 -8.38914635 -8.29679891 -8.16845411 -8.12799082 -8.12710377
> >  -7.97909074 -7.94187268 -7.90076621 -7.88148523 -7.83782648 -7.8159095
> >  -7.72254029 -7.72034674]
> Note: the print statements are for a quick check of the output, which is:
>
> > # 14-7
> > -9.960902669
> > -8.979504781
> > -8.942611364
> > -8.915523010
> > -8.736508831
> > -8.663387139
> > -8.410739711
> > -8.389146347
> > -8.296798909
> > -8.168454106
> > -8.127990818
> > -8.127103774
> > -7.979090739
> > -7.941872682
> > -7.900766215
> > -7.881485228
> > -7.837826485
> > -7.815909505
> > -7.722540286
> > -7.720346742
>   Also, this bash script works:
>
> > #!/bin/bash
> >
> > # Run.dG.list_1
> >
> > while IFS= read -r d
> > do
> >     echo "${d}.log"
> >
> > done <ligand.list
> which returns the four log file names:
>
> > 14-7.log
> > 15-7.log
> > 18-7.log
> > C-VX3.log
>
>
> But, if I run this bash script:
>
> > #!/bin/bash
> >
> > # Run.dG.list_1
> >
> > while IFS= read -r d
> > do
> >     echo "${d}.log"
> >     python3 DeltaGTable_V_sl.py
> >
> >
> > done <ligand.list
> >
> where DeltaGTable_V_sl.py is:
>
> > import numpy as np
> >
> > print('${d}')
> >
> > data = np.genfromtxt('${d}.log', usecols=(1), skip_header=27,
> > skip_footer=1, encoding=None)
> > print(data)
> >
> > np.savetxt('${d}.dG', data, fmt='%12.9f', header='${d}')
> > print(data.dG)
>
> I get:
>
> > (base) comp at AbNormal:~/Apps/Python/PsoVina$ sh ./Run.dG.list_1.sh
> > 14-7.log
> > python3: can't open file 'DeltaGTable_V_sl.py': [Errno 2] No such file
> > or directory
> > 15-7.log
> > python3: can't open file 'DeltaGTable_V_sl.py': [Errno 2] No such file
> > or directory
> > 18-7.log
> > python3: can't open file 'DeltaGTable_V_sl.py': [Errno 2] No such file
> > or directory
> > C-VX3.log
> > python3: can't open file 'DeltaGTable_V_sl.py': [Errno 2] No such file
> > or directory
>
> So, it would appear that the log file labels are in the workspace, but
> '${d}.log' is not being recognized as fname by genfromtxt. Although I
> have googled every combination of terms I can think of I am obviously
> missing something.
>
> As I have potentially hundreds of files to process, I would appreciate
> pointers towards a solution to the problem.
>
> Thanks in advance.
>
> On 10/08/2019 10:49 AM, Stephen P. Molnar wrote:
> > Many thanks for your kind replies.
> >
> > I really appreciate your suggestions.
> >
> > On 10/08/2019 09:44 AM, Andras Deak wrote:
> >> PS. if you just want to specify the width of the fields you wouldn't
> >> have to convert anything, because you can specify the size and
> >> justification of a %s format. But arguably having float data as floats
> >> is more natural anyway.
> >>
> >> On Tue, Oct 8, 2019 at 3:42 PM Andras Deak <deak.andris at gmail.com>
> >> wrote:
> >>> On Tue, Oct 8, 2019 at 3:17 PM Stephen P. Molnar
> >>> <s.molnar at sbcglobal.net> wrote:
> >>>> I am embarrassed to be asking this question, but I have exhausted
> >>>> Google
> >>>> at this point.
> >>>>
> >>>> I have a number of identically formatted text files from which I
> >>>> want to
> >>>> extract data, as an example (hopefully, putting these in as quotes
> >>>> will
> >>>> preserve the format):
> >>>>
> >>>>> =======================================================================
> >>>>>
> >>>>> PSOVina version 2.0
> >>>>> Giotto H. K. Tai & Shirley W. I. Siu
> >>>>>
> >>>>> Computational Biology and Bioinformatics Lab
> >>>>> University of Macau
> >>>>>
> >>>>> Visit http://cbbio.cis.umac.mo for more information.
> >>>>>
> >>>>> PSOVina was developed based on the framework of AutoDock Vina.
> >>>>>
> >>>>> For more information about Vina, please visit
> >>>>> http://vina.scripps.edu.
> >>>>>
> >>>>> =======================================================================
> >>>>>
> >>>>>
> >>>>> Output will be 13-7_out.pdbqt
> >>>>> Reading input ... done.
> >>>>> Setting up the scoring function ... done.
> >>>>> Analyzing the binding site ... done.
> >>>>> Using random seed: 1828390527
> >>>>> Performing search ... done.
> >>>>>
> >>>>> Refining results ... done.
> >>>>>
> >>>>> mode |   affinity | dist from best mode
> >>>>>       | (kcal/mol) | rmsd l.b.| rmsd u.b.
> >>>>> -----+------------+----------+----------
> >>>>>     1    -8.862004149      0.000      0.000
> >>>>>     2    -8.403522829      2.992      6.553
> >>>>>     3    -8.401384636      2.707      5.220
> >>>>>     4    -7.886402037      4.907      6.862
> >>>>>     5    -7.845519031      3.233      5.915
> >>>>>     6    -7.837434227      3.954      5.641
> >>>>>     7    -7.834584887      3.188      7.294
> >>>>>     8    -7.694395765      3.746      7.553
> >>>>>     9    -7.691211177      3.536      5.745
> >>>>>    10    -7.670759445      3.698      7.587
> >>>>>    11    -7.661882758      4.882      7.044
> >>>>>    12    -7.636280303      2.347      3.284
> >>>>>    13    -7.635788052      3.511      6.250
> >>>>>    14    -7.611175249      2.427      3.449
> >>>>>    15    -7.586368357      2.142      2.864
> >>>>>    16    -7.531307666      2.976      4.980
> >>>>>    17    -7.520501084      3.085      5.775
> >>>>>    18    -7.512906514      4.220      7.672
> >>>>>    19    -7.307403528      3.240      4.354
> >>>>>    20    -7.256063348      3.694      7.252
> >>>>> Writing output ... done.
> >>>>    At this point, my python script consists of only the following:
> >>>>
> >>>>> #!/usr/bin/env python3
> >>>>> # -*- coding: utf-8 -*-
> >>>>> """
> >>>>>
> >>>>> Created on Tue Sep 24 07:51:11 2019
> >>>>>
> >>>>> """
> >>>>> import numpy as np
> >>>>>
> >>>>> data = []
> >>>>>
> >>>>> data = np.genfromtxt("13-7.log", usecols=(1), dtype=None,
> >>>>> skip_header=27, skip_footer=1, encoding=None)
> >>>>>
> >>>>> print(data)
> >>>>>
> >>>>> np.savetxt('13-7', [data], fmt='%15.9f', header='13-7')
> >>>> The problem lies in the np.savetxt line; on execution I get:
> >>>>
> >>>>> runfile('/home/comp/Apps/Models/1-PhosphorusLigands/CombinedLigands/MOL/Docking/VINA20/SmallSet/DeltaGTable_V_s.py',
> >>>>> wdir='/home/comp/Apps/Models/1-PhosphorusLigands/CombinedLigands/MOL/Docking/VINA20/SmallSet',
> >>>>> current_namespace=True)
> >>>>> ['-8.839713733' '-8.743377250' '-8.151051167' '-8.090452911'
> >>>>>   '-7.967494477' '-7.854890056' '-7.757417879' '-7.741557490'
> >>>>>   '-7.643885488' '-7.611595767' '-7.507605524' '-7.413920814'
> >>>>>   '-7.389408331' '-7.384446364' '-7.374206276' '-7.368808179'
> >>>>>   '-7.346641418' '-7.325037898' '-7.309614787' '-7.113209147']
> >>>>> Traceback (most recent call last):
> >>>>>
> >>>>>    File "/home/comp/Apps/Models/1-PhosphorusLigands/CombinedLigands/MOL/Docking/VINA20/SmallSet/DeltaGTable_V_s.py", line 16, in <module>
> >>>>>      np.savetxt('13-7', [data], fmt='%16.9f', header='13-7')
> >>>>>
> >>>>>    File "<__array_function__ internals>", line 6, in savetxt
> >>>>>
> >>>>>    File "/home/comp/Apps/Miniconda3/lib/python3.7/site-packages/numpy/lib/npyio.py", line 1438, in savetxt
> >>>>>      % (str(X.dtype), format))
> >>>>>
> >>>>> TypeError: Mismatch between array dtype ('<U12') and format specifier
> >>>>> ('%16.9f %16.9f %16.9f %16.9f %16.9f %16.9f %16.9f %16.9f %16.9f
> >>>>> %16.9f %16.9f %16.9f %16.9f %16.9f %16.9f %16.9f %16.9f %16.9f %16.9f
> >>>>> %16.9f')
> >>>> The data is in the data file, but the only entry in '13-7', the saved
> >>>> file, is the label. Obviously, the error is in the format argument.
> >>> Hi,
> >>>
> >>> One problem is the format: the error is telling you that you have
> >>> strings in your array (compare the `'<U12'` dtype and the output of
> >>> your `print(data)` call with strings inside), whereas %16.9f can only
> >>> be used to format floats (f for float). You would first have to
> >>> convert your array of strings to an array of numbers. I don't usually use
> >>> genfromtxt so I'm not sure how you can make it return floats for you
> >>> in the first place, but I suspect `dtype=None` in the call to
> >>> genfromtxt might be responsible. In any case making it return numbers
> >>> should be the easier case.
> >>> The second problem is that you should make sure you mean `[data]` in
> >>> the call to savetxt. As it is now this would give you a 2d array of
> >>> shape (1, 20), and the output would correspondingly contain a single
> >>> row of 20 values (hence the 20 instances of '%16.9f' in the error
> >>> message). In case you meant to print one value per row in a single
> >>> column, you should drop the brackets around `data`:
> >>> np.savetxt('13-7', data, fmt='%16.9f', header='13-7')
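
To make the two points above concrete, here is a small sketch with made-up
values rather than your file. Converting with `astype(float)` is just one
option I'm assuming here, and the `io.StringIO` buffers stand in for real
output files so the example is self-contained:

```python
import io

import numpy as np

# String array resembling the one in the error message (values are made up).
strings = np.array(['-8.839713733', '-8.743377250', '-8.151051167'])
print(strings.dtype)   # a '<U...' (unicode string) dtype, which '%f' rejects

data = strings.astype(float)   # convert to float64 so '%16.9f' applies

# 1-D input: savetxt writes one value per line.
one_per_row = io.StringIO()
np.savetxt(one_per_row, data, fmt='%16.9f', header='13-7')

# [data] is 2-D with shape (1, 3): savetxt writes a single row of three values.
single_row = io.StringIO()
np.savetxt(single_row, [data], fmt='%16.9f', header='13-7')

print(one_per_row.getvalue())
print(single_row.getvalue())
```

Dropping the brackets gives one value per line; keeping them gives a single
row holding every value.
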
> >>>
> >>> And just a personal note, but I'd find an output file named '13-7' to
> >>> be a bit surprising. Perhaps some extension or prefix would help
> >>> organize these files?
> >>> Regards,
> >>>
> >>> András
> >>>
> >>>> Help will be much appreciated.
> >>>>
> >>>> Thanks in advance.
> >>>>
> >>>> --
> >>>> Stephen P. Molnar, Ph.D.
> >>>> www.molecular-modeling.net
> >>>> 614.312.7528 (c)
> >>>> Skype:  smolnar1
> >>>>
> >>>> _______________________________________________
> >>>> NumPy-Discussion mailing list
> >>>> NumPy-Discussion at python.org
> >>>> https://mail.python.org/mailman/listinfo/numpy-discussion
> >
>
> --
> Stephen P. Molnar, Ph.D.
> www.molecular-modeling.net
> 614.312.7528 (c)
> Skype:  smolnar1
>