<p dir="ltr">> df = pd.read_csv('nhamcsopd2010.csv' , index_col='PATCODE', low_memory=False)<br>

> col_init = list(df.columns.values)<br>

> keep_col = ['PATCODE', 'PATWT', 'VDAY', 'VMONTH', 'VYEAR', 'MED1', 'MED2', 'MED3', 'MED4', 'MED5']<br>

> for col in col_init:<br>

>     if col not in keep_col:<br>

>         del df[col]</p>

<p dir="ltr">I'm no pandas expert, but a few things come to mind. First, where is your code slow? Profile it, even if only with a few well-placed prints. If the time goes to read_csv, there may be little you can do, unless you load the same data repeatedly, in which case you can save a pickled data frame as a cache. Second, you loop over the columns, deciding one by one whether to keep or drop each. Instead, try selecting the columns you want in one step (leave 'PATCODE' out of keep_col for this, since index_col='PATCODE' makes it the index rather than a column):<br>
<br>
df = df[keep_col]</p>
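<p dir="ltr">To make the first two points concrete, here is a rough sketch that times the read and the column selection separately. I haven't run it against your file: the small inline CSV is a made-up stand-in for nhamcsopd2010.csv, and the AGE and SEX columns are invented just to have something to drop. Note that 'PATCODE' is left out of keep_col because index_col='PATCODE' turns it into the index.</p>

```python
import io
import time

import pandas as pd

# Made-up stand-in for nhamcsopd2010.csv (assumption: the real file has
# these columns plus many others; AGE and SEX are invented extras).
CSV_TEXT = """PATCODE,PATWT,VDAY,VMONTH,VYEAR,MED1,MED2,MED3,MED4,MED5,AGE,SEX
1,10.5,3,1,2010,a,b,c,d,e,40,1
2,20.0,5,2,2010,f,g,h,i,j,55,2
"""

# PATCODE is the index after read_csv, so it is not listed here.
keep_col = ['PATWT', 'VDAY', 'VMONTH', 'VYEAR',
            'MED1', 'MED2', 'MED3', 'MED4', 'MED5']

# Time the read on its own.
start = time.perf_counter()
df = pd.read_csv(io.StringIO(CSV_TEXT), index_col='PATCODE', low_memory=False)
read_seconds = time.perf_counter() - start

# Time the one-step column selection that replaces the del loop.
start = time.perf_counter()
df = df[keep_col]
select_seconds = time.perf_counter() - start

print(f"read_csv: {read_seconds:.4f}s, column selection: {select_seconds:.4f}s")
```

<p dir="ltr">With the real file, swap io.StringIO(CSV_TEXT) for the file name. If read_csv dominates, the column handling is not where your time is going.</p>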

<p dir="ltr">Third, if deleting the other columns is costly, could you simply leave them in place and ignore them?</p>
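<p dir="ltr">A related option, which is my own suggestion rather than something I've tried on your data, is to avoid loading the unwanted columns at all by passing usecols to read_csv. The sketch below again uses a made-up inline CSV in place of nhamcsopd2010.csv, with invented AGE and SEX columns. Here 'PATCODE' does belong in the list, because the index column has to be among the columns that are read.</p>

```python
import io

import pandas as pd

# Made-up stand-in for nhamcsopd2010.csv (AGE and SEX are invented extras).
CSV_TEXT = """PATCODE,PATWT,VDAY,VMONTH,VYEAR,MED1,MED2,MED3,MED4,MED5,AGE,SEX
1,10.5,3,1,2010,a,b,c,d,e,40,1
2,20.0,5,2,2010,f,g,h,i,j,55,2
"""

# Columns to read; PATCODE is included so it can serve as the index.
keep_col = ['PATCODE', 'PATWT', 'VDAY', 'VMONTH', 'VYEAR',
            'MED1', 'MED2', 'MED3', 'MED4', 'MED5']

# usecols tells the parser to skip every column not listed,
# so nothing needs to be deleted afterwards.
df = pd.read_csv(io.StringIO(CSV_TEXT), usecols=keep_col, index_col='PATCODE')

print(list(df.columns))
```

<p dir="ltr">Whether this helps depends on where the time actually goes, so measure before and after.</p>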

<p dir="ltr">Can't be more investigative right now. I don't have pandas on Android. :-)</p>

<p dir="ltr">Skip</p>