On 5 Oct 2019, at 12:15 am, Andras Deak <deak.andris@gmail.com> wrote:
On Fri, Oct 4, 2019 at 7:31 PM Stephen P. Molnar <s.molnar@sbcglobal.net> wrote:
I have a snippet of code
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
Created on Tue Sep 24 07:51:11 2019
"""
import numpy as np
files = []
data = np.genfromtxt(files, usecols=(3), dtype=None, skip_header=8, skip_footer=1, encoding=None)
print(data)
If files is a single file, the code generates the data that I want. However, I have a list of files that I want to process. According to the numpy.genfromtxt documentation, fname can be a "File, filename, list, or generator to read." If I use [13-7a_apo-1acl.RMSD 13-7_apo-1acl.RMSD 14-7_apo-1acl.RMSD 15-7_apo-1acl.RMSD 17-7_apo-1acl.RMSD ] I get the error:
Hi Stephen,
As far as I know, genfromtxt is designed to read the contents of a single file. Consider this quote from the docs for the first parameter: "The strings in a list or produced by a generator are treated as lines." And the general description of the function says "Load data from a text file, with missing values handled as specified." ("a text file", singular.) So if I understand correctly, the list case is there so that you can pass `f.readlines()` or equivalent into genfromtxt.

From a higher-level standpoint, how would reading multiple files behave if the files have different structure, and what type and shape should the function return in that case?

If one file can be read just fine, then I suggest looping over them to read each, one after the other. You can then tell Python what to do with each returned array, so it doesn't have to guess.
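A minimal sketch of the loop suggested above, reusing the genfromtxt arguments from the original snippet. The helper name read_rmsd_columns and the choice to return a dict keyed by file name are assumptions made for illustration; what to do with each array is up to you.

```python
import numpy as np


def read_rmsd_columns(paths, column=3, skip_header=8, skip_footer=1):
    """Read one column from each file and return {path: array}.

    Hypothetical helper: the defaults mirror the arguments in the
    original snippet (usecols=(3), skip_header=8, skip_footer=1).
    """
    results = {}
    for path in paths:
        # One genfromtxt call per file, so each file is parsed on its own.
        results[path] = np.genfromtxt(
            path,
            usecols=column,
            dtype=None,
            skip_header=skip_header,
            skip_footer=skip_footer,
            encoding=None,
        )
    return results
```

Keeping the arrays separate sidesteps the question of what shape a combined result should have. If the files do turn out to be compatible, something like np.concatenate over the values (after np.atleast_1d, since a file with a single data row yields a 0-d array) would join them.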
The above is correct in that genfromtxt expects a single file or file-like object. That said, assuming all input files have compatible format (i.e. identical number of columns with matching dtypes), which really is the only case that would make sense to pass to genfromtxt, you could try creating a pipe to concatenate all input files into a single object. Something like this might work:

    fobj = os.popen('cat 1[3457]-7a_apo-1acl.RMSD')
    data = np.genfromtxt(fobj, usecols=(3), dtype=None, …)

However, the multiple headers and footers in your concatenated file may cause trouble here - maybe you find a way to remove them in the popen call with some '[e]grep -v' artistry. Depending on this, the loop over input files might be the easier solution.

HTH,
Derek
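For what it's worth, the same concatenation can be sketched in pure Python, without a shell pipe. This relies on the documented behaviour quoted earlier, that strings in a list are treated as lines. The helper name read_concatenated is hypothetical, and the sketch assumes every file has the same fixed number of header and footer lines (8 and 1, as in the original snippet) and compatible columns.

```python
import numpy as np


def read_concatenated(paths, column=3, skip_header=8, skip_footer=1):
    """Strip headers/footers per file, then parse all data lines at once.

    Hypothetical helper: assumes each file has exactly `skip_header`
    leading and `skip_footer` trailing lines that are not data.
    """
    lines = []
    for path in paths:
        with open(path) as handle:
            file_lines = handle.readlines()
        # Keep only the data lines between the header and the footer.
        lines.extend(file_lines[skip_header:len(file_lines) - skip_footer])
    # A list of strings is treated by genfromtxt as the lines of one file.
    return np.genfromtxt(lines, usecols=column, dtype=None, encoding=None)
```

Because the headers and footers are dropped before parsing, there is no need for skip_header or skip_footer in the genfromtxt call itself. The result is one combined array, so it no longer records which file each value came from.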