[Numpy-discussion] Fwd: Re: Problem with np.savetxt
Stephen P. Molnar
s.molnar at sbcglobal.net
Thu Oct 10 13:18:56 EDT 2019
-------- Forwarded Message --------
Subject: Re: [Numpy-discussion] Problem with np.savetxt
Date: Thu, 10 Oct 2019 10:10:58 -0400
From: Stephen P. Molnar <s.molnar at sbcglobal.net>
To: numpy-discussion at python.org
I am slowly stumbling forward, but at this point my
degree of mental entropy (confusion) is monumental.
This works:
> import numpy as np
>
> print('${d}')
>
> data = np.genfromtxt("14-7.log", usecols=(1), skip_header=27,
> skip_footer=1, encoding=None)
>
> print(data)
>
> np.savetxt('14-7.dG', data, fmt='%12.9f', header='14-7')
> print(data)
which produces:
> runfile('/home/comp/Apps/Python/PsoVina/DeltaGTable_V_s.py',
> wdir='/home/comp/Apps/Python/PsoVina', current_namespace=True)
> ${d}
> [-9.96090267 -8.97950478 -8.94261136 -8.91552301 -8.73650883 -8.66338714
> -8.41073971 -8.38914635 -8.29679891 -8.16845411 -8.12799082 -8.12710377
> -7.97909074 -7.94187268 -7.90076621 -7.88148523 -7.83782648 -7.8159095
> -7.72254029 -7.72034674]
> [-9.96090267 -8.97950478 -8.94261136 -8.91552301 -8.73650883 -8.66338714
> -8.41073971 -8.38914635 -8.29679891 -8.16845411 -8.12799082 -8.12710377
> -7.97909074 -7.94187268 -7.90076621 -7.88148523 -7.83782648 -7.8159095
> -7.72254029 -7.72034674]
Note: the print statements are for a quick check of the output. The saved file, 14-7.dG, contains:
> # 14-7
> -9.960902669
> -8.979504781
> -8.942611364
> -8.915523010
> -8.736508831
> -8.663387139
> -8.410739711
> -8.389146347
> -8.296798909
> -8.168454106
> -8.127990818
> -8.127103774
> -7.979090739
> -7.941872682
> -7.900766215
> -7.881485228
> -7.837826485
> -7.815909505
> -7.722540286
> -7.720346742
Also, this bash script works:
> #!/bin/bash
>
> # Run.dG.list_1
>
> while IFS= read -r d
> do
> echo "${d}.log"
>
> done <ligand.list
which returns the four log file names:
> 14-7.log
> 15-7.log
> 18-7.log
> C-VX3.log
But, if I run this bash script:
> #!/bin/bash
>
> # Run.dG.list_1
>
> while IFS= read -r d
> do
> echo "${d}.log"
> python3 DeltaGTable_V_sl.py
>
>
> done <ligand.list
>
where DeltaGTable_V_sl.py is:
> import numpy as np
>
> print('${d}')
>
> data = np.genfromtxt('${d}.log', usecols=(1), skip_header=27,
> skip_footer=1, encoding=None)
> print(data)
>
> np.savetxt('${d}.dG', data, fmt='%12.9f', header='${d}')
> print(data.dG)
I get:
> (base) comp@AbNormal:~/Apps/Python/PsoVina$ sh ./Run.dG.list_1.sh
> 14-7.log
> python3: can't open file 'DeltaGTable_V_sl.py': [Errno 2] No such file
> or directory
> 15-7.log
> python3: can't open file 'DeltaGTable_V_sl.py': [Errno 2] No such file
> or directory
> 18-7.log
> python3: can't open file 'DeltaGTable_V_sl.py': [Errno 2] No such file
> or directory
> C-VX3.log
> python3: can't open file 'DeltaGTable_V_sl.py': [Errno 2] No such file
> or directory
So, it would appear that the log file labels are in the workspace, but
'${d}.log' is not being recognized as fname by genfromtxt. Although I
have googled every combination of terms I can think of, I am obviously
missing something.
As I have potentially hundreds of files to process, I would appreciate
pointers towards a solution to the problem.
Thanks in advance.
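
A note on the likely cause, offered as a sketch rather than a definitive fix: bash expands ${d} only inside the shell script itself, so within the Python file the string '${d}.log' reaches genfromtxt literally. One common approach is to pass the ligand name on the command line and read it from sys.argv. The output above also shows python3 failing to find 'DeltaGTable_V_sl.py', while the script that ran earlier was named DeltaGTable_V_s.py, so the file name used in the loop may need checking as well. Separately, print(data.dG) would fail because a NumPy array has no dG attribute. The function name write_affinities and its parameters are hypothetical, chosen only for illustration.

```python
import sys

import numpy as np


def write_affinities(name, skip_header=27, skip_footer=1):
    """Read column 1 of '<name>.log' and save it to '<name>.dG'.

    The defaults mirror the skip_header/skip_footer values used in the
    original script; 'name' is the ligand label, e.g. '14-7'.
    """
    data = np.genfromtxt(f"{name}.log", usecols=1,
                         skip_header=skip_header, skip_footer=skip_footer)
    # One value per line, with the ligand label as a '# ...' header line.
    np.savetxt(f"{name}.dG", data, fmt="%12.9f", header=name)
    return data


if __name__ == "__main__":
    # Every command-line argument is treated as one ligand name.
    for name in sys.argv[1:]:
        write_affinities(name)
```

With something like this in place, the loop body in the bash script would call python3 DeltaGTable_V_s.py "${d}", so that the shell substitutes the name before Python starts.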
On 10/08/2019 10:49 AM, Stephen P. Molnar wrote:
> Many thanks for your kind replies.
>
> I really appreciate your suggestions.
>
> On 10/08/2019 09:44 AM, Andras Deak wrote:
>> PS. if you just want to specify the width of the fields you wouldn't
>> have to convert anything, because you can specify the size and
>> justification of a %s format. But arguably having float data as floats
>> is more natural anyway.
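
As a small illustration of the %s option mentioned above (a sketch, not necessarily what was intended): a width such as %16s right-justifies each string in a 16-character field, and %-16s left-justifies it. The sample strings are taken from the print(data) output quoted further down; writing to an in-memory buffer instead of a file is an assumption made only for demonstration.

```python
import io

import numpy as np

# Two of the string values that genfromtxt returned with dtype=None.
strings = np.array(['-8.839713733', '-8.743377250'])

# Right-justified in a 16-character field.
right = io.StringIO()
np.savetxt(right, strings, fmt='%16s')

# Left-justified in a 16-character field.
left = io.StringIO()
np.savetxt(left, strings, fmt='%-16s')

lines_right = right.getvalue().splitlines()
lines_left = left.getvalue().splitlines()
print(lines_right)
print(lines_left)
```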
>>
>> On Tue, Oct 8, 2019 at 3:42 PM Andras Deak <deak.andris at gmail.com>
>> wrote:
>>> On Tue, Oct 8, 2019 at 3:17 PM Stephen P. Molnar
>>> <s.molnar at sbcglobal.net> wrote:
>>>> I am embarrassed to be asking this question, but I have exhausted
>>>> Google
>>>> at this point.
>>>>
>>>> I have a number of identically formatted text files from which I
>>>> want to
>>>> extract data, as an example (hopefully, putting these in as quotes
>>>> will
>>>> preserve the format):
>>>>
>>>>> =======================================================================
>>>>>
>>>>> PSOVina version 2.0
>>>>> Giotto H. K. Tai & Shirley W. I. Siu
>>>>>
>>>>> Computational Biology and Bioinformatics Lab
>>>>> University of Macau
>>>>>
>>>>> Visit http://cbbio.cis.umac.mo for more information.
>>>>>
>>>>> PSOVina was developed based on the framework of AutoDock Vina.
>>>>>
>>>>> For more information about Vina, please visit
>>>>> http://vina.scripps.edu.
>>>>>
>>>>> =======================================================================
>>>>>
>>>>>
>>>>> Output will be 13-7_out.pdbqt
>>>>> Reading input ... done.
>>>>> Setting up the scoring function ... done.
>>>>> Analyzing the binding site ... done.
>>>>> Using random seed: 1828390527
>>>>> Performing search ... done.
>>>>>
>>>>> Refining results ... done.
>>>>>
>>>>> mode | affinity | dist from best mode
>>>>> | (kcal/mol) | rmsd l.b.| rmsd u.b.
>>>>> -----+------------+----------+----------
>>>>> 1 -8.862004149 0.000 0.000
>>>>> 2 -8.403522829 2.992 6.553
>>>>> 3 -8.401384636 2.707 5.220
>>>>> 4 -7.886402037 4.907 6.862
>>>>> 5 -7.845519031 3.233 5.915
>>>>> 6 -7.837434227 3.954 5.641
>>>>> 7 -7.834584887 3.188 7.294
>>>>> 8 -7.694395765 3.746 7.553
>>>>> 9 -7.691211177 3.536 5.745
>>>>> 10 -7.670759445 3.698 7.587
>>>>> 11 -7.661882758 4.882 7.044
>>>>> 12 -7.636280303 2.347 3.284
>>>>> 13 -7.635788052 3.511 6.250
>>>>> 14 -7.611175249 2.427 3.449
>>>>> 15 -7.586368357 2.142 2.864
>>>>> 16 -7.531307666 2.976 4.980
>>>>> 17 -7.520501084 3.085 5.775
>>>>> 18 -7.512906514 4.220 7.672
>>>>> 19 -7.307403528 3.240 4.354
>>>>> 20 -7.256063348 3.694 7.252
>>>>> Writing output ... done.
>>>> At this point, my python script consists of only the following:
>>>>
>>>>> #!/usr/bin/env python3
>>>>> # -*- coding: utf-8 -*-
>>>>> """
>>>>>
>>>>> Created on Tue Sep 24 07:51:11 2019
>>>>>
>>>>> """
>>>>> import numpy as np
>>>>>
>>>>> data = []
>>>>>
>>>>> data = np.genfromtxt("13-7.log", usecols=(1), dtype=None,
>>>>> skip_header=27, skip_footer=1, encoding=None)
>>>>>
>>>>> print(data)
>>>>>
>>>>> np.savetxt('13-7', [data], fmt='%15.9f', header='13-7')
>>>> The problem lies in the np.savetxt line; on execution I get:
>>>>
>>>>> runfile('/home/comp/Apps/Models/1-PhosphorusLigands/CombinedLigands/MOL/Docking/VINA20/SmallSet/DeltaGTable_V_s.py',
>>>>>
>>>>> wdir='/home/comp/Apps/Models/1-PhosphorusLigands/CombinedLigands/MOL/Docking/VINA20/SmallSet',
>>>>>
>>>>> current_namespace=True)
>>>>> ['-8.839713733' '-8.743377250' '-8.151051167' '-8.090452911'
>>>>> '-7.967494477' '-7.854890056' '-7.757417879' '-7.741557490'
>>>>> '-7.643885488' '-7.611595767' '-7.507605524' '-7.413920814'
>>>>> '-7.389408331' '-7.384446364' '-7.374206276' '-7.368808179'
>>>>> '-7.346641418' '-7.325037898' '-7.309614787' '-7.113209147']
>>>>> Traceback (most recent call last):
>>>>>
>>>>> File
>>>>> "/home/comp/Apps/Models/1-PhosphorusLigands/CombinedLigands/MOL/Docking/VINA20/SmallSet/DeltaGTable_V_s.py",
>>>>>
>>>>> line 16, in <module>
>>>>> np.savetxt('13-7', [data], fmt='%16.9f', header='13-7')
>>>>>
>>>>> File "<__array_function__ internals>", line 6, in savetxt
>>>>>
>>>>> File
>>>>> "/home/comp/Apps/Miniconda3/lib/python3.7/site-packages/numpy/lib/npyio.py",
>>>>>
>>>>> line 1438, in savetxt
>>>>> % (str(X.dtype), format))
>>>>>
>>>>> TypeError: Mismatch between array dtype ('<U12') and format specifier
>>>>> ('%16.9f %16.9f %16.9f %16.9f %16.9f %16.9f %16.9f %16.9f %16.9f
>>>>> %16.9f %16.9f %16.9f %16.9f %16.9f %16.9f %16.9f %16.9f %16.9f %16.9f
>>>>> %16.9f')
>>>> The data is in the data file, but the only entry in '13-7', the saved
>>>> file, is the label. Obviously, the error is in the format argument.
>>> Hi,
>>>
>>> One problem is the format: the error is telling you that you have
>>> strings in your array (compare the `'<U12'` dtype and the output of
>>> your `print(data)` call with strings inside), whereas %16.9f can only
>>> be used to format floats (f for float). You would first have to
>>> convert your array of strings to an array of numbers. I don't usually use
>>> genfromtxt so I'm not sure how you can make it return floats for you
>>> in the first place, but I suspect `dtype=None` in the call to
>>> genfromtxt might be responsible. In any case making it return numbers
>>> should be the easier case.
>>> The second problem is that you should make sure you mean `[data]` in
>>> the call to savetxt. As it is now this would give you a 2d array of
>>> shape (1, 20), and the output would correspondingly contain a single
>>> row of 20 values (hence the 20 instances of '%16.9f' in the error
>>> message). In case you meant to print one value per row in a single
>>> column, you should drop the brackets around `data`:
>>> np.savetxt('13-7', data, fmt='%16.9f', header='13-7')
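
A minimal sketch of the two points above, under stated assumptions: the string array is built by hand from the first three values in the printed output rather than read with genfromtxt, and the results go to in-memory buffers rather than to a file named '13-7'. It shows the conversion with astype(float) and the difference between passing data and [data] to savetxt.

```python
import io

import numpy as np

# Strings as returned by genfromtxt with dtype=None (first three values).
strings = np.array(['-8.839713733', '-8.743377250', '-8.151051167'])

# Convert the '<U12' strings to float64 so that '%16.9f' can format them.
values = strings.astype(float)

# 1-D input: one value per row, in a single column.
column = io.StringIO()
np.savetxt(column, values, fmt='%16.9f', header='13-7')

# [values] has shape (1, 3): a single row holding all the values.
row = io.StringIO()
np.savetxt(row, [values], fmt='%16.9f', header='13-7')

column_lines = column.getvalue().splitlines()
row_lines = row.getvalue().splitlines()
print(len(column_lines), len(row_lines))
```

Omitting dtype=None in the genfromtxt call, as in the later version of the script, returns floats directly, so the astype step is then unnecessary.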
>>>
>>> And just a personal note, but I'd find an output file named '13-7' to
>>> be a bit surprising. Perhaps some extension or prefix would help
>>> organize these files?
>>> Regards,
>>>
>>> András
>>>
>>>> Help will be much appreciated.
>>>>
>>>> Thanks in advance.
>>>>
>>>> --
>>>> Stephen P. Molnar, Ph.D.
>>>> www.molecular-modeling.net
>>>> 614.312.7528 (c)
>>>> Skype: smolnar1
>>>>
>>>> _______________________________________________
>>>> NumPy-Discussion mailing list
>>>> NumPy-Discussion at python.org
>>>> https://mail.python.org/mailman/listinfo/numpy-discussion
>
--
Stephen P. Molnar, Ph.D.
www.molecular-modeling.net
614.312.7528 (c)
Skype: smolnar1