speed up pandas calculation
Steven D'Aprano
steve+comp.lang.python at pearwood.info
Wed Jul 30 21:46:39 EDT 2014
On Wed, 30 Jul 2014 18:57:15 -0600, Vincent Davis wrote:
> On Wed, Jul 30, 2014 at 6:28 PM, Vincent Davis
> <vincent at vincentdavis.net> wrote:
>
>> The real slow part seems to be
>>     for n in drugs:
>>         df[n] = df[['MED1','MED2','MED3','MED4','MED5']].isin([drugs[n]]).any(1)
>>
>>
> I was wrong: this part is fast. It was selecting the columns that was slow.
> Using
>     keep_col = ['PATCODE', 'PATWT', 'VDAYR', 'VMONTH', 'MED1', 'MED2',
>                 'MED3', 'MED4', 'MED5']
>     df = df[keep_col]
>
> took the time down from 19 sec to 2 sec.
19 seconds? I thought you said it was taking multiple minutes?
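
For anyone following along, here is a rough sketch of the approach as I read it from the quoted snippets: trim the frame to the needed columns once, then build one boolean column per drug. The sample rows, the EXTRA column, and the shape of `drugs` (a dict mapping a new column name to a single drug value) are my assumptions for illustration, not taken from the original data.

```python
import pandas as pd

# Hypothetical sample data: column names follow the thread, values are invented.
df = pd.DataFrame({
    "PATCODE": [1, 2, 3],
    "PATWT": [10.0, 20.0, 30.0],
    "VDAYR": [1, 2, 3],
    "VMONTH": [1, 1, 2],
    "MED1": ["aspirin", "ibuprofen", None],
    "MED2": [None, "aspirin", "codeine"],
    "MED3": [None, None, None],
    "MED4": [None, None, None],
    "MED5": [None, None, None],
    "EXTRA": ["x", "y", "z"],  # stands in for the many unused columns
})

# Assumed shape of `drugs`: new column name -> single drug value to look for.
drugs = {"has_aspirin": "aspirin", "has_codeine": "codeine"}

keep_col = ["PATCODE", "PATWT", "VDAYR", "VMONTH",
            "MED1", "MED2", "MED3", "MED4", "MED5"]
med_cols = ["MED1", "MED2", "MED3", "MED4", "MED5"]

# Drop the unneeded columns once, up front; .copy() gives an independent frame.
df = df[keep_col].copy()

# Select the medication columns a single time, outside the loop.
meds = df[med_cols]

# One boolean column per drug: True if any MED column holds that drug.
for name, drug in drugs.items():
    df[name] = meds.isin([drug]).any(axis=1)
```

Writing `axis=1` as a keyword instead of the positional `.any(1)` is my choice here. Whether trimming the columns gives the same speedup on other data is something to measure rather than assume.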
--
Steven