save data from multiple txt files

Cameron Simpson cs at cskk.id.au
Sat Mar 26 22:11:15 EDT 2022


On 26Mar2022 15:47, alberto <voodoo.bender at gmail.com> wrote:
>Hi to everyone,
>I would save data from multiple files in one using pandas.
>below my script

Well, it looks like you're doing the right thing. You've got this:

    results = pd.DataFrame()
    for counter, current_file in enumerate(glob.glob("results_*.log")):
        gcmcdatadf = pd.read_csv(current_file, header=2, sep=" ", usecols=[1, 2, 3])
        print(gcmcdatadf)
        results = pd.concat([results, gcmcdatadf])
    #results.to_csv('resuls_tot.log', header=None, sep=" ")

You're printing `gcmcdatadf`, and for each file it shows:

    Empty DataFrame
    Columns: [182.244, 10, 0.796176]
    Index: []
    Empty DataFrame
    Columns: [181.126, 12.5, 0.995821]
    Index: []

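It looks to me like it is getting the column names from the wrong row:
the "columns" shown above are numbers, which look like data values. You
have supplied `header=2`, which says that row 2 contains the column
names. Rows count from 0 in this context (and blank lines are skipped
by default), so that is the third non-blank line; maybe you want
`header=1`? Have a look at one of the log files to check.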

Also, are there data lines after that row in these files? Just
wondering, since it says "Empty DataFrame": if the header row is the
last line, there is nothing left to read as data.
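
Here is a minimal sketch of how `header` picks the column-name row. I don't know what your logs actually contain, so the file contents below are invented: the title line and the column names `volume`, `pressure` and `density` are assumptions, and only the numbers come from your output.

```python
import io
import pandas as pd

# Invented file contents: a title line, a column-name line, one data line.
text = (
    "gcmc run summary\n"
    "volume pressure density\n"
    "182.244 10 0.796176\n"
)

# header=2: row 2 (the data line) becomes the column names,
# so nothing is left to be read as data.
wrong = pd.read_csv(io.StringIO(text), header=2, sep=" ")
print(wrong)

# header=1: row 1 supplies the names and row 2 is read as data.
right = pd.read_csv(io.StringIO(text), header=1, sep=" ")
print(right)
```

If your files look anything like this, the first call should reproduce the empty frame you are seeing, and the second should give one row per file.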

The docs for `read_csv()` are here: 
https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.read_csv.html#pandas.read_csv

You've also commented out:

    #results.to_csv('resuls_tot.log', header=None, sep=" ")

which means you're not writing the concatenated results to a file. Also
note that your output filename is `'resuls_tot.log'`, which looks
misspelled. As spelled it does not match `"results_*.log"`, but if you
correct it to `'results_tot.log'` it will: the file is in the same
directory as the other logs, so the next run would pick up the
concatenated results along with the original files. I'd give it another
name which will not match your pattern, such as `"all_results.log"`.
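
Putting that together, something like the sketch below might do. It is untested against your real logs: `combine_logs` is a name I made up, `header_row=1` is a guess at where your column names are, and the demo files it writes are invented. It collects the frames in a list and concatenates once, and writes to a name the glob pattern cannot match.

```python
import fnmatch
import glob
import os
import tempfile

import pandas as pd


def combine_logs(directory, pattern="results_*.log",
                 output_name="all_results.log", header_row=1):
    """Concatenate every file matching pattern and write one output file."""
    paths = sorted(glob.glob(os.path.join(directory, pattern)))
    if not paths:
        raise FileNotFoundError(f"no files match {pattern!r} in {directory!r}")
    # Read each file, then concatenate once at the end.
    frames = [pd.read_csv(p, header=header_row, sep=" ") for p in paths]
    combined = pd.concat(frames, ignore_index=True)
    # index=False leaves the row index out of the file; drop it if you want the index.
    combined.to_csv(os.path.join(directory, output_name),
                    header=False, index=False, sep=" ")
    return combined


# A corrected spelling of the old output name would match the input pattern.
print(fnmatch.fnmatch("results_tot.log", "results_*.log"))  # True
print(fnmatch.fnmatch("all_results.log", "results_*.log"))  # False

# Demo with invented files: a title line, a column-name line, one data line.
with tempfile.TemporaryDirectory() as tmp:
    rows = {"results_1.log": "182.244 10 0.796176",
            "results_2.log": "181.126 12.5 0.995821"}
    for name, row in rows.items():
        with open(os.path.join(tmp, name), "w") as f:
            f.write("gcmc run summary\nvolume pressure density\n" + row + "\n")
    first = combine_logs(tmp)
    second = combine_logs(tmp)  # the output from the first run is not picked up
    print(second)
```

Running it twice in the same directory gives the same two rows both times, because `all_results.log` is never matched by `"results_*.log"`.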

Cheers,
Cameron Simpson <cs at cskk.id.au>

