pandas (in jupyter?) problem
Paulo da Silva
p_d_a_s_i_l_v_a_ns at nonetnoaddress.pt
Fri May 6 15:11:07 EDT 2022
Hi all!
I'm having the following problem. Consider the code below (the
commented-out loop and the uncommented lines, which I think do the same thing):
#for col in missing_cols:
#    df[col] = np.nan
df = df.copy()
df[missing_cols] = np.nan
df has about 20000 columns and len(missing_cols) is about 18000.
I'm getting lots (one per missing column?) of the following message from
ipykernel:
"PerformanceWarning: DataFrame is highly fragmented. This is usually
the result of calling `frame.insert` many times, which has poor
performance. Consider joining all columns at once using
pd.concat(axis=1) instead. To get a de-fragmented frame, use `newframe =
frame.copy()`
df[missing_cols]=np.nan"
At first I didn't have df = df.copy(). I added it later, but the problem
remains. This slows down the code a lot, perhaps because Jupyter is taking
too much time issuing these messages!
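For reference, the pd.concat(axis=1) route that the warning mentions could be sketched roughly as below. This is only an illustration, assuming the warnings come from the missing columns being inserted one at a time; the tiny frame, the column names and the helper variable `filler` are made up and stand in for the real data.

```python
import numpy as np
import pandas as pd

# Small stand-in for the real frame (hypothetical data and names).
df = pd.DataFrame(np.arange(6.0).reshape(3, 2), columns=["a", "b"])
missing_cols = [f"m{i}" for i in range(5)]

# Build every missing column at once as a single all-NaN frame
# that shares the index of df.
filler = pd.DataFrame(np.nan, index=df.index, columns=missing_cols)

# Join the two frames column-wise in one operation instead of
# assigning the new columns one by one.
df = pd.concat([df, filler], axis=1)

# A possible alternative with the same result for this case:
# df = df.reindex(columns=[*df.columns, *missing_cols])
```

With the real data, missing_cols would be the list of about 18000 names, and the concat would still be a single call.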
Thanks for any comments.