[Numpy-discussion] Problem with np.savetxt
Stephen P. Molnar
s.molnar at sbcglobal.net
Tue Oct 8 10:49:31 EDT 2019
Many thanks for your kind replies.
I really appreciate your suggestions.
On 10/08/2019 09:44 AM, Andras Deak wrote:
> PS. if you just want to specify the width of the fields you wouldn't
> have to convert anything, because you can specify the size and
> justification of a %s format. But arguably having float data as floats
> is more natural anyway.
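To illustrate the PS above, here is a minimal sketch that keeps the values as strings and only sets the field width. The three sample values are copied from the printed output in the original post; the width of 16 and the in-memory buffer (used instead of a real file) are just illustrative assumptions.

```python
import io
import numpy as np

# A few of the string values printed in the original post (dtype '<U12').
data = np.array(['-8.839713733', '-8.743377250', '-8.151051167'])

# '%16s' right-justifies each string in a 16-character field;
# '%-16s' would left-justify it instead.
buf = io.StringIO()
np.savetxt(buf, data, fmt='%16s', header='13-7')
text = buf.getvalue()
print(text)
```

Passing a file name instead of the buffer writes the same lines to disk.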
>
> On Tue, Oct 8, 2019 at 3:42 PM Andras Deak <deak.andris at gmail.com> wrote:
>> On Tue, Oct 8, 2019 at 3:17 PM Stephen P. Molnar <s.molnar at sbcglobal.net> wrote:
>>> I am embarrassed to be asking this question, but I have exhausted Google
>>> at this point.
>>>
>>> I have a number of identically formatted text files from which I want to
>>> extract data, as an example (hopefully, putting these in as quotes will
>>> preserve the format):
>>>
>>>> =======================================================================
>>>> PSOVina version 2.0
>>>> Giotto H. K. Tai & Shirley W. I. Siu
>>>>
>>>> Computational Biology and Bioinformatics Lab
>>>> University of Macau
>>>>
>>>> Visit http://cbbio.cis.umac.mo for more information.
>>>>
>>>> PSOVina was developed based on the framework of AutoDock Vina.
>>>>
>>>> For more information about Vina, please visit http://vina.scripps.edu.
>>>>
>>>> =======================================================================
>>>>
>>>> Output will be 13-7_out.pdbqt
>>>> Reading input ... done.
>>>> Setting up the scoring function ... done.
>>>> Analyzing the binding site ... done.
>>>> Using random seed: 1828390527
>>>> Performing search ... done.
>>>>
>>>> Refining results ... done.
>>>>
>>>> mode |   affinity | dist from best mode
>>>>      | (kcal/mol) | rmsd l.b.| rmsd u.b.
>>>> -----+------------+----------+----------
>>>>    1  -8.862004149      0.000      0.000
>>>>    2  -8.403522829      2.992      6.553
>>>>    3  -8.401384636      2.707      5.220
>>>>    4  -7.886402037      4.907      6.862
>>>>    5  -7.845519031      3.233      5.915
>>>>    6  -7.837434227      3.954      5.641
>>>>    7  -7.834584887      3.188      7.294
>>>>    8  -7.694395765      3.746      7.553
>>>>    9  -7.691211177      3.536      5.745
>>>>   10  -7.670759445      3.698      7.587
>>>>   11  -7.661882758      4.882      7.044
>>>>   12  -7.636280303      2.347      3.284
>>>>   13  -7.635788052      3.511      6.250
>>>>   14  -7.611175249      2.427      3.449
>>>>   15  -7.586368357      2.142      2.864
>>>>   16  -7.531307666      2.976      4.980
>>>>   17  -7.520501084      3.085      5.775
>>>>   18  -7.512906514      4.220      7.672
>>>>   19  -7.307403528      3.240      4.354
>>>>   20  -7.256063348      3.694      7.252
>>>> Writing output ... done.
>>> At this point, my python script consists of only the following:
>>>
>>>> #!/usr/bin/env python3
>>>> # -*- coding: utf-8 -*-
>>>> """
>>>>
>>>> Created on Tue Sep 24 07:51:11 2019
>>>>
>>>> """
>>>> import numpy as np
>>>>
>>>> data = []
>>>>
>>>> data = np.genfromtxt("13-7.log", usecols=(1), dtype=None,
>>>> skip_header=27, skip_footer=1, encoding=None)
>>>>
>>>> print(data)
>>>>
>>>> np.savetxt('13-7', [data], fmt='%15.9f', header='13-7')
>>> The problem lies in the np.savetxt line; on execution I get:
>>>
>>>> runfile('/home/comp/Apps/Models/1-PhosphorusLigands/CombinedLigands/MOL/Docking/VINA20/SmallSet/DeltaGTable_V_s.py',
>>>> wdir='/home/comp/Apps/Models/1-PhosphorusLigands/CombinedLigands/MOL/Docking/VINA20/SmallSet',
>>>> current_namespace=True)
>>>> ['-8.839713733' '-8.743377250' '-8.151051167' '-8.090452911'
>>>> '-7.967494477' '-7.854890056' '-7.757417879' '-7.741557490'
>>>> '-7.643885488' '-7.611595767' '-7.507605524' '-7.413920814'
>>>> '-7.389408331' '-7.384446364' '-7.374206276' '-7.368808179'
>>>> '-7.346641418' '-7.325037898' '-7.309614787' '-7.113209147']
>>>> Traceback (most recent call last):
>>>>
>>>> File
>>>> "/home/comp/Apps/Models/1-PhosphorusLigands/CombinedLigands/MOL/Docking/VINA20/SmallSet/DeltaGTable_V_s.py",
>>>> line 16, in <module>
>>>> np.savetxt('13-7', [data], fmt='%16.9f', header='13-7')
>>>>
>>>> File "<__array_function__ internals>", line 6, in savetxt
>>>>
>>>> File
>>>> "/home/comp/Apps/Miniconda3/lib/python3.7/site-packages/numpy/lib/npyio.py",
>>>> line 1438, in savetxt
>>>> % (str(X.dtype), format))
>>>>
>>>> TypeError: Mismatch between array dtype ('<U12') and format specifier
>>>> ('%16.9f %16.9f %16.9f %16.9f %16.9f %16.9f %16.9f %16.9f %16.9f
>>>> %16.9f %16.9f %16.9f %16.9f %16.9f %16.9f %16.9f %16.9f %16.9f %16.9f
>>>> %16.9f')
>>> The data is in the data file, but the only entry in '13-7', the saved
>>> file, is the label. Obviously, the error is in the format argument.
>> Hi,
>>
>> One problem is the format: the error is telling you that you have
>> strings in your array (compare the `'<U12'` dtype and the output of
>> your `print(data)` call with strings inside), whereas %16.9f can only
>> be used to format floats (f for float). You would first have to
>> convert your array of strings to an array of numbers. I don't usually use
>> genfromtxt so I'm not sure how you can make it return floats for you
>> in the first place, but I suspect `dtype=None` in the call to
>> genfromtxt might be responsible. In any case making it return numbers
>> should be the easier case.
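One way to do the conversion described above is sketched here, assuming every entry in the string array is a valid number. The sample values are taken from the printed output in the original post, and the in-memory buffer stands in for the output file; why genfromtxt returned strings in the first place is not addressed by this sketch.

```python
import io
import numpy as np

# String array like the one printed in the original post (dtype '<U12').
data = np.array(['-8.839713733', '-8.743377250', '-8.151051167'])

# Convert the strings to floats. This raises ValueError if any entry
# is not a valid number, which would point to a parsing problem upstream.
values = data.astype(float)

# With a float array, a float format such as '%16.9f' is accepted.
buf = io.StringIO()
np.savetxt(buf, values, fmt='%16.9f', header='13-7')
text = buf.getvalue()
print(text)
```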
>> The second problem is that you should make sure you mean `[data]` in
>> the call to savetxt. As it is now this would give you a 2d array of
>> shape (1, 20), and the output would correspondingly contain a single
>> row of 20 values (hence the 20 instances of '%16.9f' in the error
>> message). In case you meant to print one value per row in a single
>> column, you should drop the brackets around `data`:
>> np.savetxt('13-7', data, fmt='%16.9f', header='13-7')
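The difference between `[data]` and `data` can be checked with a small sketch like the following. It assumes the values are already floats, uses three sample values from the original post, and writes to an in-memory buffer; `saved_lines` is a hypothetical helper introduced only for this illustration.

```python
import io
import numpy as np

values = np.array([-8.839713733, -8.743377250, -8.151051167])

def saved_lines(arr):
    """Return the lines np.savetxt writes for arr."""
    buf = io.StringIO()
    np.savetxt(buf, arr, fmt='%16.9f')
    return buf.getvalue().splitlines()

row_form = saved_lines([values])   # shape (1, 3): one row of three values
column_form = saved_lines(values)  # shape (3,): one value per row

print(np.shape([values]), len(row_form))
print(values.shape, len(column_form))
```

The wrapped form produces a single line holding all values, while the bare array produces one line per value.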
>>
>> And just a personal note, but I'd find an output file named '13-7' to
>> be a bit surprising. Perhaps some extension or prefix would help
>> organize these files?
>> Regards,
>>
>> András
>>
>>> Help will be much appreciated.
>>>
>>> Thanks in advance.
>>>
>>> --
>>> Stephen P. Molnar, Ph.D.
>>> www.molecular-modeling.net
>>> 614.312.7528 (c)
>>> Skype: smolnar1
>>>
>>> _______________________________________________
>>> NumPy-Discussion mailing list
>>> NumPy-Discussion at python.org
>>> https://mail.python.org/mailman/listinfo/numpy-discussion
--
Stephen P. Molnar, Ph.D.
www.molecular-modeling.net
614.312.7528 (c)
Skype: smolnar1