Problem with concatenating two dataframes
MRAB
python at mrabarnett.plus.com
Sat Nov 6 15:50:56 EDT 2021
On 2021-11-06 16:16, Mahmood Naderan via Python-list wrote:
> In the following code, I am trying to create some key-value pairs in a dictionary where the first element is a name and the second element is a dataframe.
>
> # Creating a dictionary
> data = {'Value':[0,0,0]}
> kernel_df = pd.DataFrame(data, index=['M1','M2','M3'])
> dict = {'dummy':kernel_df}
> # dummy -> Value
> # M1 0
> # M2 0
> # M3 0
>
>
> Then I read a file and create some batches and compare the name in the batch with the stored names in dictionary. If it doesn't exist, a new key-value (name and dataframe) is created. Otherwise, the Value column is appended to the existing dataframe.
>
>
> df = pd.read_csv('test.batch.csv')
> print(df)
> for i in range(0, len(df), 3):
>     print("\n------BATCH BEGIN")
>     batch_df = df.iloc[i:i+3]
>     name = batch_df.loc[i].at["Name"]
>     values = batch_df.loc[:,["Value"]]
>     print(name)
>     print(values)
>     print("------BATCH END")
>     if name in dict:
>         # Append values to the existing key
>         dict[name] = pd.concat( dict[name],values ) #### ERROR
>     else:
>         # Create a new pair in dictionary
>         dict[name] = values;
>
>
>
> As you can see in the output, the concat statement has an error.
>
>
>
>    ID Name Metric  Value
> 0   0   K1     M1     10
> 1   0   K1     M2      5
> 2   0   K1     M3     10
> 3   1   K2     M1     20
> 4   1   K2     M2     10
> 5   1   K2     M3     15
> 6   2   K1     M1      2
> 7   2   K1     M2      2
> 8   2   K1     M3      2
>
> ------BATCH BEGIN
> K1
> Value
> 0 10
> 1 5
> 2 10
> ------BATCH END
>
> ------BATCH BEGIN
> K2
> Value
> 3 20
> 4 10
> 5 15
> ------BATCH END
>
> ------BATCH BEGIN
> K1
> Value
> 6 2
> 7 2
> 8 2
> ------BATCH END
>
>
>
>
> As it reaches the concat() statement, I get this error:
>
> TypeError: first argument must be an iterable of pandas objects, you passed an object of type "DataFrame"
>
>
> Based on the definition I wrote at the beginning of the code, "dict[name]" should be a dataframe. Isn't it?
>
> How can I fix that?
>
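You're passing the 2 dataframes as the first 2 positional arguments to
pd.concat, but it expects them together as a single _iterable_, e.g. a
list, as its first argument. That's what the TypeError is saying:
dict[name] is indeed a DataFrame, but pd.concat wants a list (or other
iterable) of pandas objects there, not one bare DataFrame.

Try this instead:

    dict[name] = pd.concat([dict[name], values])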
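
For reference, here is a minimal sketch of the whole loop with that change applied. It makes a few assumptions that are not in the original post: the CSV is replaced by an in-memory DataFrame built from the printed output above, the dictionary is named kernels (a hypothetical name, chosen only to avoid shadowing the built-in dict), and the 'dummy' entry is left out.

```python
import pandas as pd

# Assumption: stand-in for test.batch.csv, rebuilt from the output printed in the post.
df = pd.DataFrame({
    "ID": [0, 0, 0, 1, 1, 1, 2, 2, 2],
    "Name": ["K1"] * 3 + ["K2"] * 3 + ["K1"] * 3,
    "Metric": ["M1", "M2", "M3"] * 3,
    "Value": [10, 5, 10, 20, 10, 15, 2, 2, 2],
})

# Hypothetical name; the post uses 'dict', which shadows the built-in type.
kernels = {}

for i in range(0, len(df), 3):
    batch_df = df.iloc[i:i + 3]
    name = batch_df.loc[i].at["Name"]
    values = batch_df.loc[:, ["Value"]]
    if name in kernels:
        # Pass both frames inside one list: pd.concat takes an iterable first.
        kernels[name] = pd.concat([kernels[name], values])
    else:
        kernels[name] = values

print(kernels["K1"])
print(kernels["K2"])
```

With the sample data, the K1 entry ends up holding the rows from the first and third batches, stacked vertically and keeping their original row labels. If fresh row labels are wanted instead, pd.concat also accepts ignore_index=True.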