speed up pandas calculation

Steven D'Aprano steve+comp.lang.python at pearwood.info
Thu Jul 31 03:46:39 CEST 2014


On Wed, 30 Jul 2014 18:57:15 -0600, Vincent Davis wrote:

> On Wed, Jul 30, 2014 at 6:28 PM, Vincent Davis
> <vincent at vincentdavis.net> wrote:
> 
>> The real slow part seems to be
>> for n in drugs:
>>     df[n] = df[['MED1','MED2','MED3','MED4','MED5']].isin([drugs[n]]).any(axis=1)
>>
>>
> I was wrong, this part is fast; it was selecting the columns that was
> slow. Using
> keep_col = ['PATCODE', 'PATWT', 'VDAYR', 'VMONTH', 'MED1', 'MED2',
> 'MED3', 'MED4', 'MED5']
> df = df[keep_col]
> 
> took the time down from 19 sec to 2 sec.
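
For anyone following along, here is a minimal sketch of the approach quoted above: trim the frame to the needed columns first, then flag each drug across the MED columns. The sample data, the extra UNUSED column, and the contents of the drugs dict are made up for illustration (the thread does not show them), and the .copy() is my addition to avoid a possible SettingWithCopyWarning when new columns are assigned.

```python
import pandas as pd

MED_COLS = ['MED1', 'MED2', 'MED3', 'MED4', 'MED5']
keep_col = ['PATCODE', 'PATWT', 'VDAYR', 'VMONTH'] + MED_COLS

# Made-up sample data; the real data set is not shown in the thread.
df = pd.DataFrame({
    'PATCODE': [1, 2, 3],
    'PATWT': [10.0, 20.0, 30.0],
    'VDAYR': [1, 2, 3],
    'VMONTH': [1, 1, 2],
    'MED1': [100, 200, 300],
    'MED2': [0, 100, 0],
    'MED3': [0, 0, 0],
    'MED4': [0, 0, 0],
    'MED5': [0, 0, 400],
    'UNUSED': ['a', 'b', 'c'],  # stands in for columns that are not needed
})

# Hypothetical mapping: new column name -> drug code to look for.
drugs = {'drug_a': 100, 'drug_b': 400}

# Keep only the needed columns before doing any further work.
df = df[keep_col].copy()

# One boolean column per drug: True if any MED column holds that code.
for name, code in drugs.items():
    df[name] = df[MED_COLS].isin([code]).any(axis=1)

print(df[['PATCODE', 'drug_a', 'drug_b']])
```

Whether trimming the columns helps as much on other data will depend on how many columns the original frame has and what types they hold.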


19 seconds? I thought you said it was taking multiple minutes?




-- 
Steven 


