In this post, we'll look at some issues that arise when trying to put cuckoo hashing into practice. If I can wax a bit philosophical here (and I can, since it's my blog), putting things into practice is rarely as easy as taking the theoretical result and implementing it. There are always details to be dealt with. Often theorists shy away from (or run away screaming from) such details, which is unfortunate. I believe that more ideas from theory would be applied much more readily if we theory folk put in a little extra legwork to make things easier for the non-theory folk. If we want them to try something new and potentially challenging, we could help by making the benefits as clear and convincing as possible, and by attempting to address natural implementation concerns ahead of time. As a bonus, sometimes new theory problems come out of such efforts.

A natural place to try to apply cuckoo hashing ideas in practice is in routers. Routers can use hash tables for IP lookups, network measurement and monitoring, and other related tasks. If the hashing is to be done frequently, say on a per packet basis, then really the only way to do it without slowing the router down beyond reason is to build the hash table directly into the hardware. So now we have to think in terms of hardware implementation, a mindset which is generally not present in the theoretical analysis.

By the way, besides compelling applications, there's another reason to try to apply these ideas to routers: multiple-choice hashing is already there! I have heard the technique is known and in use at Cisco, though sadly being an outsider I am not privy to any top-secret details.

In thinking about hardware implementations, there are some key positives.

- There's no need for pointers. The hash table can be simply stored in an array or a collection of arrays. This is apparently very useful in hardware, and there's no wasted pointer space.
- Lookups are trivial, as one knows exactly where to look for an element. The multiple accesses to memory required can be easily parallelized.

- The biggest impediment is cuckoo hashing's biggest strength: all that moving around of items. In many cases, the hash table will be stored in a chip's slow memory. We simply don't have time t0 do all those moves. The average number of moves is constant, which already may be too high. And with some non-trivial probability you'll need a logarithmic number of moves. In most hardware implementations, the worst-case time for an operation is what's important, and the worst-case insertion time is a weakness of cuckoo hashing, even if the table is stored in fast memory.
- The failure mode of cuckoo hashing is unpleasant. One thing I glossed over last time is that there's some (small) chance that cuckoo hashing gets caught in a loop; in moving things around to try to find an empty space, it can get stuck in a cycle, moving the same things back and forth repeatedly without ever being able to squeeze in a new element. In the theoretical analysis, this can swept under the rug by saying that after a certain number of steps, declare a failure and re-hash everything with new hash functions. In practical settings, like a router, this is often not a feasible option.
- The analysis of advanced cuckoo variants that perform well remains incomplete. Generally, proofs give that you can hold n elements with O(n) space, but the analysis gives up something non-trivial in the constant factors. Arguably, this is only a theoretical concern, but this gap in the analysis makes designing, testing, and optimizing such structures much harder, in that one must rely almost exclusively on simulation. More accurate theory can not only help convince others to use these schemes, it can actively aid in the development and deployment.

## 2 comments:

Hi, I am just playing around with cockoo hashing, I think you are missing something in point 2: You are right that cuckoo hashing can get caught in a loop, but this is easy to detect. On insertion, remember the position where the new element is put. Elements are pushed around, and after a while it might be that the inserted element is pushed to another position (count up by 1). Now some more elements are pushed around, and eventually the first inserted element is pushed again, and this is when there has to be a circle.

So in practice it should be enough to count the moves of the inserted element. If it is moved 3 times, you have a loop.

PS: I am no mathematician so I can't prove this, but it seems to work :-)

martinus, I think you misunderstood the point. The loop is certainly detectable. The problem is that it might take a long time to detect, which is unacceptable to some real time applications.

--Haoyu

Post a Comment