Alternative C implementation idea for collections.deque
Many years ago I implemented a deque type in C (for use in C programs) using a single dynamically sized array as the underlying type, along with an extra integer to designate where the "start" of the list is. Somehow, I had always imagined that Python's implementation worked the same way, but I was just looking at the code and found that it uses a doubly-linked list of 62-item blocks. I just wanted to toss my approach out there to see if it might offer better performance and/or simpler code (my pure-C implementation is only 80 lines of code). If there's interest, I'd be happy to work it up into a patch. I could also do some performance tests if others have specific use cases they'd like to see tested.

I see the advantages as follows:

- __getitem__ would take O(1) time instead of O(n)
- Dirt-simple implementation

Compared to the existing code, memory allocation/deallocation is less frequent but more expensive when necessary, working out to O(1) amortized time (the same as the current implementation).

The basic idea is to use a single dynamically sized array (much like Python's list type). An extra integer points to the start of the deque within the array, and the data implicitly wraps around the end of the array back to the beginning. Here's the basic data structure:

    struct deque {
        PyObject **obs;
        int max_len;  /* Number of objects we can store before having to reallocate */
        int len;      /* Number of objects currently in the deque */
        int start;    /* Index in obs where the deque starts */
    };

When len == max_len and the user adds an element, we reallocate obs. Much like Python's list type, this can be done in such a way that memory allocation takes O(1) amortized time (e.g., by doubling max_len). The first element of the deque is always obs[start], the last element is at obs[(start + len - 1) % max_len], and the nth element is obs[(start + n) % max_len].

I think an example would be helpful. For illustrative purposes, suppose we start with max_len = 4.
Each line is followed by a picture of the data structure's contents:
    x = deque()          {obs = {NULL, NULL, NULL, NULL}, max_len = 4, len = 0, start = 0}
    x.extend((1,2,3,4))  {obs = {1, 2, 3, 4},             max_len = 4, len = 4, start = 0}
    x.popleft()          {obs = {NULL, 2, 3, 4},          max_len = 4, len = 3, start = 1}
    x.append(5)          {obs = {5, 2, 3, 4},             max_len = 4, len = 4, start = 1}
    x.popleft()          {obs = {5, NULL, 3, 4},          max_len = 4, len = 3, start = 2}
    x.pop()              {obs = {NULL, NULL, 3, 4},       max_len = 4, len = 2, start = 2}
Comments? Thoughts? Would a patch be welcome?

--
Daniel Stutzbach, Ph.D.
President, Stutzbach Enterprises LLC
From: "Daniel Stutzbach" <daniel@stutzbachenterprises.com>
Many years ago I implemented a deque type in C (for use in C programs) using a single dynamically sized array as the underlying type,
That approach was deliberately not used, so that we could avoid the data movement associated with resizing.
I see the advantages as follows:
- __getitem__ would take O(1) time instead of O(n)
While nice, that is somewhat at odds with how deques are used. The current implementation has O(1) access time with a tiny constant for access near the endpoints. The only reason that __getitem__ was put in was to support fast access to the head and tail without actually popping the value. That it can access the middle was incidental, and imho optimizing middle access would only encourage misuse of the data structure.
Compared to the existing code, memory allocation/deallocation is less frequent but more expensive when necessary,
For most use cases, the current version rolls around with no memory allocations (reusable blocks are kept on a free list).
Comments? Thoughts? Would a patch be welcome?
Might as well put it in the tracker for history/research purposes, but I think you're barking up the wrong tree. Deques are not about accessing the nth item in the middle. For their purpose, the current design works really well (the blocks are sized to fit in cache lines, and no mass movements are necessary when the size changes -- the data never moves). If you're going to spend time, why not put it into developing a data structure we don't already have (red-black trees or some such)?

FWIW, my development time is now somewhat limited and I prefer to spend it on a todo list that has been around for some time. I dread throwing that time away and spending it on reviewing your patch, timing tons of test cases, arguing the merits of alternate approaches, and ending up with substantially the same functionality that we already have.

The big win for deques was getting the O(1) append/pop on each end. That gave a huge algorithmic speed boost to the Queue module and a number of other tools that were using lists for deque-style operations.

Raymond
On Nov 15, 2007 12:57 AM, Raymond Hettinger <python@rcn.com> wrote:
FWIW, my development time is now somewhat limited and I prefer to spend it on a todo list that has been around for some time. I dread throwing that time away and spending it on reviewing your patch, timing tons of test cases, arguing the merits of alternate approaches, and ending up with substantially the same functionality that we already have.
I respect that. I won't waste either of our time, then.

Cheers,
--
Daniel Stutzbach, Ph.D.
President, Stutzbach Enterprises LLC
Participants (2):
- Daniel Stutzbach
- Raymond Hettinger