sorting 1172026 entries
Cameron Simpson
cs at zip.com.au
Sun May 6 20:31:14 EDT 2012
On 06May2012 17:10, Chris Rebert <clp2 at rebertia.com> wrote:
| On Sun, May 6, 2012 at 4:54 PM, Cameron Simpson <cs at zip.com.au> wrote:
| > On 06May2012 18:36, J. Mwebaze <jmwebaze at gmail.com> wrote:
| > | > for filename in txtfiles:
| > | > temp=[]
| > | > f=open(filename)
| > | > for line in f.readlines():
| > | > line = line.strip()
| > | > line=line.split()
| > | > temp.append((parser.parse(line[0]), float(line[1])))
| >
| > Have you timed the different parts of your code instead of the whole
| > thing?
| >
| > Specificly, do you know the sort time is the large cost?
| >
| > I would point out that the loop above builds the list by append(), one
| > item at a time. That should have runtime cost of the square of the list
| > length, 1172026 * 1172026. Though I've just done this:
|
| Er, what? list.append() is O(1) amortized.
I didn't mean per .append() call (which I'd expect to be O(n) for large
n), I meant overall for the completed list.
Don't the realloc()s make it O(n^2) overall for large n? The list
must get copied when the underlying space fills. I confess to being
surprised at how quick it went for me though. I suppose reallocing in
chunks (eg to double the available size) might conceal this for a while.
It should still be well over O(n) overall (not amortised per .append()
call).
| Perhaps you're confusing list.append() with list.insert(), which is indeed O(n)?
I wasn't, though .insert() is surely way more expensive.
Cheers,
--
Cameron Simpson <cs at zip.com.au> DoD#743
http://www.cskk.ezoshosting.com/cs/
The Borg assimilated my race and all I got was this lousy tagline.
- Cath Lawrence <Cath_Lawrence at premium.com.au>
More information about the Python-list
mailing list