optimizations
Roy Smith
roy at panix.com
Mon Apr 22 21:53:11 EDT 2013
In article <mailman.944.1366680414.3114.python-list at python.org>,
Rodrick Brown <rodrick.brown at gmail.com> wrote:
> I would like some feedback on possible solutions to make this script run
> faster.
If I had to guess, I would suspect this part:
> line = line.replace('mediacdn.xxx.com', 'media.xxx.com')
> line = line.replace('staticcdn.xxx.co.uk', 'static.xxx.co.uk')
> line = line.replace('cdn.xxx', 'www.xxx')
> line = line.replace('cdn.xxx', 'www.xxx')
> line = line.replace('cdn.xx', 'www.xx')
> siteurl = line.split()[6].split('/')[2]
> line = re.sub(r'\bhttps?://%s\b' % siteurl, "", line, 1)
You make up to 6 copies of every line, since each replace() and the
re.sub() returns a new string. That's slow. But I'm also going to
quote something I wrote here a couple of months back:
> I've been doing some log analysis. It's been taking a grovelingly long
> time, so I decided to fire up the profiler and see what's taking so
> long. I had a pretty good idea of where the ONLY TWO POSSIBLE hotspots
> might be (looking up IP addresses in the geolocation database, or
> producing some pretty pictures using matplotlib). It was just a matter
> of figuring out which it was.
>
> As with most attempts to out-guess the profiler, I was totally,
> absolutely, and embarrassingly wrong.
So, my real advice to you is to fire up the profiler and see what it
says.
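
For what it's worth, here is a minimal sketch of what "fire up the
profiler" could look like using the standard library's cProfile and
pstats modules. The names process_line, profile_lines and SAMPLE are
made up for illustration, and the sample log line is invented (I don't
know what your real input looks like); the body of process_line just
mirrors the quoted snippet, with re.escape added so the dots in the
host name are matched literally.

```python
import cProfile
import io
import pstats
import re

# Invented log line: field 6 (zero-based) must be a full URL for the
# quoted split() logic to work. The real log format is an assumption.
SAMPLE = ('1.2.3.4 - - [22/Apr/2013:21:53:11 -0400] '
          '"GET http://mediacdn.xxx.com/img/a.png HTTP/1.1" 200 512')


def process_line(line):
    """Hypothetical wrapper around the operations in the quoted script."""
    line = line.replace('mediacdn.xxx.com', 'media.xxx.com')
    line = line.replace('staticcdn.xxx.co.uk', 'static.xxx.co.uk')
    line = line.replace('cdn.xxx', 'www.xxx')
    line = line.replace('cdn.xx', 'www.xx')
    siteurl = line.split()[6].split('/')[2]
    # re.escape is an addition here, not part of the quoted code.
    return re.sub(r'\bhttps?://%s\b' % re.escape(siteurl), "", line, 1)


def profile_lines(lines):
    """Run process_line over lines under cProfile; return results and report."""
    profiler = cProfile.Profile()
    profiler.enable()
    results = [process_line(line) for line in lines]
    profiler.disable()

    # Collect the ten most expensive entries by cumulative time as text.
    out = io.StringIO()
    stats = pstats.Stats(profiler, stream=out)
    stats.sort_stats('cumulative').print_stats(10)
    return results, out.getvalue()


if __name__ == '__main__':
    _, report = profile_lines([SAMPLE] * 100000)
    print(report)
```

If you would rather not touch the script at all, running it as
"python -m cProfile -s cumulative yourscript.py" (script name
hypothetical) prints the same kind of table for the whole run. Either
way, look at where the cumulative time actually goes before rewriting
anything.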