To wrap up, the other paper I presented at Allerton (pdf, talk slides), joint with Adam Kirsch, continues our work in analyzing the power of moving objects around in multi-level hash tables. The gold standard here, at least in theory, is cuckoo hashing, which achieves very high loads but can require a large number of moves when inserting an item. That might not be practical for many applications (especially in network router hardware). In previous work, we had considered the gains obtainable from moving at most 1 item on an insertion. In this work, we add the power to move items on a deletion as well. (Lookups we keep fast and don't do any extra work.)
[The discussion below is, admittedly, probably more for those at least familiar with our previous work...]
Moving an item after another item is deleted is difficult -- generally, you'd want to move some other item deeper in your Multi-Level Hash table into the now empty space higher up in your hash table, but how can you find what item to move? Actually, we'd had ideas on how to do this back when we were working on the insertion paper. But we didn't include it then, for several reasons. First, our ideas required additional space (as explained below). Second, we couldn't analyze them, and one of the points we were aiming for in that paper was to analyze schemes. Third, we didn't have space -- not in the hardware, but in the actual paper. Blame page limits (yes, even for the journal version).
Another recent INFOCOM paper that offered an idea of how to handle moves on deletions inspired me to have us write down our approach -- because I feel our approach is clearly better than that proposed by this other INFOCOM paper, as we discuss in our paper. Even if we can't prove what we'd like, we can still show it works well through simulation.
Our (obvious) idea is just to keep a hint at each hash table cell, if there was a collision there, of where the collided item was placed, so you can find it later on a delete. Hints take space, but not too much. If multiple collisions occur, you have to choose which possible hint to keep; we found most recent collision worked best. Moves on deletes work reasonably well, but the real gains naturally occur from combining moves on insertions and deletes. This gives pretty good performance -- you can get space utilizations more in line with (though of course still not as good as) cuckoo hashing, albeit with more hash functions -- but again, using just 1 move per insert/delete. More details in the paper.