[Numpy-discussion] Problem with np.savetxt

Tue Oct 8 09:44:30 EDT 2019

PS. If you just want to specify the width of the fields you wouldn't
have to convert anything, because you can specify the width and
justification of a %s format (e.g. %16s or %-16s). But arguably having
float data as floats is more natural anyway.
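
For what it's worth, here is a minimal sketch of the %s route. The three
values are copied from the printed array in the original post; the
in-memory buffer is only my stand-in for the real output file, and the
width of 16 is an assumption borrowed from the format in the traceback.

```python
import io

import numpy as np

# Strings as genfromtxt returned them in the original post (a '<U12' array).
data = np.array(['-8.839713733', '-8.743377250', '-8.151051167'])

# In-memory stand-in for the output file (assumption, for illustration only).
buf = io.StringIO()

# '%16s' right-justifies each string in a 16-character field;
# '%-16s' would left-justify it instead.
np.savetxt(buf, data, fmt='%16s', header='13-7')

text = buf.getvalue()
lines = text.splitlines()
print(text)
```

No conversion happens here: the strings are written as-is, only padded to
the requested width.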

On Tue, Oct 8, 2019 at 3:42 PM Andras Deak <deak.andris at gmail.com> wrote:
>
> On Tue, Oct 8, 2019 at 3:17 PM Stephen P. Molnar <s.molnar at sbcglobal.net> wrote:
> >
> > I am embarrassed to be asking this question, but I have exhausted Google
> > at this point.
> >
> > I have a number of identically formatted text files from which I want to
> > extract data, as an example (hopefully, putting these in as quotes will
> > preserve the format):
> >
> > > =======================================================================
> > > PSOVina version 2.0
> > > Giotto H. K. Tai & Shirley W. I. Siu
> > >
> > > Computational Biology and Bioinformatics Lab
> > > University of Macau
> > >
> > > Visit http://cbbio.cis.umac.mo for more information.
> > >
> > > PSOVina was developed based on the framework of AutoDock Vina.
> > >
> > > For more information about Vina, please visit http://vina.scripps.edu.
> > >
> > > =======================================================================
> > >
> > > Output will be 13-7_out.pdbqt
> > > Reading input ... done.
> > > Setting up the scoring function ... done.
> > > Analyzing the binding site ... done.
> > > Using random seed: 1828390527
> > > Performing search ... done.
> > >
> > > Refining results ... done.
> > >
> > > mode |   affinity | dist from best mode
> > >      | (kcal/mol) | rmsd l.b.| rmsd u.b.
> > > -----+------------+----------+----------
> > >    1    -8.862004149      0.000      0.000
> > >    2    -8.403522829      2.992      6.553
> > >    3    -8.401384636      2.707      5.220
> > >    4    -7.886402037      4.907      6.862
> > >    5    -7.845519031      3.233      5.915
> > >    6    -7.837434227      3.954      5.641
> > >    7    -7.834584887      3.188      7.294
> > >    8    -7.694395765      3.746      7.553
> > >    9    -7.691211177      3.536      5.745
> > >   10    -7.670759445      3.698      7.587
> > >   11    -7.661882758      4.882      7.044
> > >   12    -7.636280303      2.347      3.284
> > >   13    -7.635788052      3.511      6.250
> > >   14    -7.611175249      2.427      3.449
> > >   15    -7.586368357      2.142      2.864
> > >   16    -7.531307666      2.976      4.980
> > >   17    -7.520501084      3.085      5.775
> > >   18    -7.512906514      4.220      7.672
> > >   19    -7.307403528      3.240      4.354
> > >   20    -7.256063348      3.694      7.252
> > > Writing output ... done.
> > At this point, my Python script consists of only the following:
> >
> > > #!/usr/bin/env python3
> > > # -*- coding: utf-8 -*-
> > > """
> > >
> > > Created on Tue Sep 24 07:51:11 2019
> > >
> > > """
> > > import numpy as np
> > >
> > > data = []
> > >
> > > data = np.genfromtxt("13-7.log", usecols=(1), dtype=None,
> > > skip_header=27, skip_footer=1, encoding=None)
> > >
> > > print(data)
> > >
> > > np.savetxt('13-7', [data], fmt='%15.9f', header='13-7')
> >
> > The problem lies in the np.savetxt line; on execution I get:
> >
> > > runfile('/home/comp/Apps/Models/1-PhosphorusLigands/CombinedLigands/MOL/Docking/VINA20/SmallSet/DeltaGTable_V_s.py',
> > > wdir='/home/comp/Apps/Models/1-PhosphorusLigands/CombinedLigands/MOL/Docking/VINA20/SmallSet',
> > > current_namespace=True)
> > > ['-8.839713733' '-8.743377250' '-8.151051167' '-8.090452911'
> > >  '-7.967494477' '-7.854890056' '-7.757417879' '-7.741557490'
> > >  '-7.643885488' '-7.611595767' '-7.507605524' '-7.413920814'
> > >  '-7.389408331' '-7.384446364' '-7.374206276' '-7.368808179'
> > >  '-7.346641418' '-7.325037898' '-7.309614787' '-7.113209147']
> > > Traceback (most recent call last):
> > >
> > >   File
> > > "/home/comp/Apps/Models/1-PhosphorusLigands/CombinedLigands/MOL/Docking/VINA20/SmallSet/DeltaGTable_V_s.py",
> > > line 16, in <module>
> > >     np.savetxt('13-7', [data], fmt='%16.9f', header='13-7')
> > >
> > >   File "<__array_function__ internals>", line 6, in savetxt
> > >
> > >   File
> > > "/home/comp/Apps/Miniconda3/lib/python3.7/site-packages/numpy/lib/npyio.py",
> > > line 1438, in savetxt
> > >     % (str(X.dtype), format))
> > >
> > > TypeError: Mismatch between array dtype ('<U12') and format specifier
> > > ('%16.9f %16.9f %16.9f %16.9f %16.9f %16.9f %16.9f %16.9f %16.9f
> > > %16.9f %16.9f %16.9f %16.9f %16.9f %16.9f %16.9f %16.9f %16.9f %16.9f
> > > %16.9f')
> >
> > The data is in the data file, but the only entry in '13-7', the saved
> > file, is the label. Obviously, the error is in the format argument.
>
> Hi,
>
> One problem is the format: the error is telling you that you have
> strings in your array (compare the `'<U12'` dtype and the output of
> your `print(data)` call with strings inside), whereas %16.9f can only
> be used to format floats (f for float). You would first have to
> convert your array of strings to an array of numbers. I don't usually use
> genfromtxt so I'm not sure how you can make it return floats for you
> in the first place, but I suspect `dtype=None` in the call to
> genfromtxt might be responsible. In any case, making it return numbers
> should be the easier route.
> The second problem is that you should make sure you mean `[data]` in
> the call to savetxt. As it is now this would give you a 2d array of
> shape (1, 20), and the output would correspondingly contain a single
> row of 20 values (hence the 20 instances of '%16.9f' in the error
> message). In case you meant to print one value per row in a single
> column, you should drop the brackets around `data`:
> np.savetxt('13-7', data, fmt='%16.9f', header='13-7')
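
To make both points above concrete, a small sketch: it converts the
strings with `astype(float)` (one possible conversion, not necessarily
the neatest fix on the genfromtxt side) and compares writing `values`
with writing `[values]`. The three numbers are copied from the printed
array in the original post, and the in-memory buffers are only
stand-ins for the real output file.

```python
import io

import numpy as np

# Strings as they came back from genfromtxt in the original post.
strings = np.array(['-8.839713733', '-8.743377250', '-8.151051167'])

# Convert the string array to floats so that a %f format applies.
values = strings.astype(float)

# 1-D input: savetxt writes one value per row (a single column).
col_buf = io.StringIO()
np.savetxt(col_buf, values, fmt='%16.9f', header='13-7')
column_lines = col_buf.getvalue().splitlines()

# Wrapped in a list: shape (1, 3), so all values land on one row.
row_buf = io.StringIO()
np.savetxt(row_buf, [values], fmt='%16.9f', header='13-7')
row_lines = row_buf.getvalue().splitlines()

print(len(column_lines), len(row_lines))
```

With the brackets the file has a header plus a single data row; without
them it has a header plus one row per value.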
>
> And just a personal note, but I'd find an output file named '13-7' to
> be a bit surprising. Perhaps some extension or prefix would help
> organize these files?
> Regards,
>
> András
>
> >
> > Help will be much appreciated.
> >
> > Thanks in advance.
> >
> > --
> > Stephen P. Molnar, Ph.D.
> > www.molecular-modeling.net
> > 614.312.7528 (c)
> > Skype:  smolnar1
> >
> > _______________________________________________
> > NumPy-Discussion mailing list
> > NumPy-Discussion at python.org
> > https://mail.python.org/mailman/listinfo/numpy-discussion