Popping from the middle of a deque + deque rotation speed

Discussion:

(too old to reply)

Russell Warren

2006-04-28 22:42:44 UTC

Does anyone have an easier/faster/better way of popping from the middle
of a deque than this?

class mydeque(deque):
def popmiddle(self, pos):
self.rotate(-pos)
ret = self.popleft()
self.rotate(pos)
return ret

I do recognize that this is not the intent of a deque, given the
clearly non-"double-ended" nature. I'm using a deque in a place where
99.999 of the time it will be a fifo, but occasionally I will want to
pop from the middle.

I initially wrote that thinking that the rotate duration would be
independent of the rotation distance, but...

import timeit
s = "from collections import deque; d = deque(xrange(1000000))"
timeit.Timer(stmt="d.rotate(1)", setup = s).timeit(number=100000)

0.1372316872675583

timeit.Timer(stmt="d.rotate(1000)", setup = s).timeit(number=100000)

3.5050192133357996

timeit.Timer(stmt="d.rotate(10000)", setup = s).timeit(number=100000)

32.756590851630563

timeit.Timer(stmt="d.rotate(100000)", setup = s).timeit(number=100000)

325.59845064107299

timeit.Timer(stmt="d.rotate(999999)", setup = s).timeit(number=100000)

0.14491059617921564

Boy was I wrong. Given that it scales linearly it looks like it
cut-pastes the rotation an element at a time! At least it recognizes
the shortest rotation path, though.

On first guess of how the deque is implemented I would have thought
that rotation could be achieved simply by diddling some pointers, but I
could see how that would mess with popping efficiency (seems you'd have
to remap memory in the event of a pop after rotation). Worst case I
figured a rotate would just do a single shot memory remapping of the
deque contents so that the speed was the same regardless of rotation
size...

My guessing/figuring skills clearly need some work.

What's up with this deque rotation? If I were to hazard one more guess
I'd think it is trying to conserve transient memory usage during
rotation... in my (poor) mental scheme it seems that cutting/relocating
could take 50% more memory than the deque itself for a full rotation.

I should stop guessing. Or at least figure out how to find the source
code for the deque implementation...

Should I avoid using deques with large iterables?

Thanks,
Russ

Tim Chase

2006-04-28 23:13:01 UTC

Permalink

Post by Russell Warren
Does anyone have an easier/faster/better way of popping from the middle
of a deque than this?
self.rotate(-pos)
ret = self.popleft()
self.rotate(pos)
return ret

My first thought would just be to use indexing:

def popmiddle(self, pos):
ret = self[pos]
del(self[pos])
return ret

It seems to work with my Python2.4 here. If you're
interested in efficiency, I'll leave their comparison as an
exercise to the reader... :)

-tkc

Russell Warren

2006-05-01 23:38:49 UTC

Permalink

Thanks for the responses.

Post by Tim Chase
It seems to work with my Python2.4 here. If you're
interested in efficiency, I'll leave their comparison as an
exercise to the reader... :)

Ok, exercise complete! :) For the record, they are pretty much the
same speed...

Post by Tim Chase

s = """

... from collections import deque
... class mydeque(deque):
... def popmiddle(self, pos):
... self.rotate(-pos)
... ret=self.popleft()
... self.rotate(pos)
... return ret
... d = mydeque(xrange(1000000))

Post by Tim Chase

timeit.Timer(stmt="x=d.popmiddle(1000)", setup = s).timeit(number=100000)

5.4620059253340969

Post by Tim Chase

s2="""

... from collections import deque
... class mydeque(deque):
... def popmiddle(self, pos):
... ret = self[pos]
... del(self[pos])
... return ret
... d = mydeque(xrange(1000000))
... """

Post by Tim Chase

timeit.Timer(stmt="x=d.popmiddle(1000)", setup = s2).timeit(number=100000)

5.3937888754018104

Thanks for the alternative solution.

Russ

Tim Peters

2006-04-29 03:37:08 UTC

Permalink

[Russell Warren]
|> Does anyone have an easier/faster/better way of popping from the middle

Post by Russell Warren
of a deque than this?
self.rotate(-pos)
ret = self.popleft()
self.rotate(pos)
return ret

As Tim Chase said, the easiest way is to do "del self[pos]" instead of
manually fiddling with rotations. However, deque's implementation of
__delitem__ does rotations under the covers, so it's not necessarily
faster.

Post by Russell Warren
I do recognize that this is not the intent of a deque, given the
clearly non-"double-ended" nature. I'm using a deque in a place where
99.999 of the time it will be a fifo, but occasionally I will want to
pop from the middle.

So does the speed of the remaining 0.001 cases really matter? Note
that even just indexing into a deque takes O(index) time.

Post by Russell Warren
...
I should stop guessing. Or at least figure out how to find the source
code for the deque implementation...

It's in Modules/collectionsmodule.c. You'll see that a deque is
implemented as a doubly-linked list of buckets, where each bucket is a
contiguous vector of no more than 62 (pointers to) elements. There
appears to be an invariant that all "interior" (non-endpoint) buckets
contain exactly 62 elements, presumably to speed indexing. It all
works very well for deque's intended purposes.

Post by Russell Warren
Should I avoid using deques with large iterables?

Can't answer that without more detail about the criteria you use to judge ;-)

Russell Warren

2006-05-01 23:53:26 UTC

Permalink

Post by Tim Peters
So does the speed of the remaining 0.001 cases really matter? Note
that even just indexing into a deque takes O(index) time.

It doesn't matter as much, of course, but I was looking to make every
step as efficient as possible (while staying in python).

As to indexing into a deque being O(index)... I didn't realize that.
It is certainly something to keep in mind, though... looping through
the contents of a deque would obviously be a bad idea with this being
the case! I wonder if the generator for the deque helps reduce this?
Will check later.

Proof of the O(n) for indexing into a deque (not that I doubted Tim #2!
:)...

Post by Tim Peters

import timeit
s = "from collections import deque; d = deque(xrange(1000000))"
timeit.Timer(stmt="x=d[10000]", setup = s).timeit(number=100000)

0.14770257113683627

Post by Tim Peters

timeit.Timer(stmt="x=d[100000]", setup = s).timeit(number=100000)

1.4016418287799155

Russ

Tim Peters

2006-05-02 06:23:37 UTC

Permalink

[Russell Warren]

Post by Russell Warren
...
As to indexing into a deque being O(index)... I didn't realize that.
It is certainly something to keep in mind, though... looping through
the contents of a deque would obviously be a bad idea with this being
the case! I wonder if the generator for the deque helps reduce this?
Will check later.

FYI, deque implements efficient forward and reverse iterators. So, e.g.,

for x in a_deque:
pass

and

for x in reversed(a_deque):
pass

takes time proportional to len(a_deque). In contrast,

for i in xrange(len(a_deque)):
x = a_deque[i]

take time quadratic in len(a_deque).

The iterators don't use __getitem__ under the covers; explicitly
indexing does, and that's the difference here.