Friday, August 21, 2009

Lossy Difference Aggregators

This post is about 2 papers and a student.

The first paper was presented at SIGCOMM Thursday: Every Microsecond Counts: Tracking Fine-Grain Latencies with a Lossy Difference Aggregator , by Ramana Rao Kompella (Purdue University), Kirill Levchenko (UC San Diego), Alex C. Snoeren (UC San Diego), and George Varghese (UC San Diego). The paper introduces a new hash-based data structure for the following situation: packets are being sent from a source to destination over a lossy channel (like the Internet), and you want to estimate the average latency over a time window. (They can also handle jitter -- variance in latency -- as well.) If each packet was timestamped at the source, this would be easy; at the destination you could add up individual latencies on the fly. But adding sufficiently accurate timestamps is an expensive proposition. If there weren't any losses, this would be easy: add up the timestamps over the time window at the source, send that over separately, and subtract it from the sum of the reception times at the receiver, then divide to get the average. If there is packet loss, however, this won't work. The authors show that by sampling and bucketing packets at the right rate, though, you can get buckets that contain a random sample of the packets with no loss, from which you can estimate the latency.

It's a clever idea and a great paper. (Obviously, I'm biased -- novel hash-based data structures with real-world applications appeal greatly to me!) So when I got a preprint from George Varghese, I asked if I could please give it to my seminar class on network algorithms, CS 222, for an in-class discussion. (The students wouldn't know it had been accepted to SIGCOMM, but would be asked to review the paper; I wanted to see their reaction.) In return I'd send George feedback, before the final version was due. George and his co-authors pleasantly agreed.

Now about the student. Hilary Finucane was a standout student in my undergraduate algorithms class, and has continued to impress me since. She started working with me as her advisor on her senior thesis, on "floating codes"; the thesis is available here as a technical report, and some of this work is part of this Allerton paper (the "journal version" of which was just accepted to Transactions on Information Theory). So when Hilary, who was taking CS 222 as a senior, asked during the discussion why the authors were doing X, when it seemed clear that Y would work better, I listened, and we sent a note to the authors; they very nicely acknowledged Hilary in the final version for her contribution.

Hilary and I discussed the paper further and felt we could improve on the analysis. We're making the results available in a preprint here; we'll be submitting a version somewhere soon. (This is the 2nd paper of today's post.) Besides giving an improved analysis, we also showed how competitive analysis can be used in this setting to help decide suitable parameters for implementations. (Despite my previous rants about competitive analysis, this seemed like an appropriate place for it!)

It's been a great pleasure having the chance to work with Hilary, who will be starting graduate school at Weizmann next year. While I've been influencing her to work on the more "practical" problems I'm interested in, I imagine she'll be moving back to problems she claims to like more -- more complexity theory and mathematically oriented problems -- as a graduate student. Given what she's been able to contribute and accomplish as an undergraduate, I'm really looking forward to seeing what she'll do in the future!

7 comments:

Anonymous said...

Do you mind sharing what was the "X" they were originally doing? They reveal Hilary's "Y" in the paper.

Michael Mitzenmacher said...

Well, I think it's up to them to describe it (as they like) in a future (journal?) version, but essentially they were using a more complicated bucketing procedure for some seemingly compelling reasons. Hilary saw that the simpler bucketing procedure actually worked just fine, and the more complicated procedure wasn't necessary.

Anonymous said...

Thank you. Unfortunately, I doubt they will describe it unless there is really some parameter in which it outperforms the simple solution.
Too bad, because it often really helps to see the thought process behind something, instead of just seeing the end result.

Anonymous said...

One of the problems with the paper is with clock synchronization, or lack thereof. I was there at the presentation and the authors gave a hand-wavy argument.

Michael Mitzenmacher said...

Anon #4: I'd certainly like to hear more about the issue of the clock synchronization assumption -- how big an assumption it would be in practice, and what the implications are -- if you (or anyone) would care to elaborate.

Anonymous said...

They basically assume perfect synchronization at both sides of the link.

When these sides are ingress and egress ports of a router, I suppose that *might* make sense.
They briefly describe a router architecture that supports this in the paper.

But when talking about two different routers, I just can't see how this will work.

--not anon #4

Anonymous said...

I'm one of the authors of the papers so I can take a shot at answering both of Anonymous's questions.

1) First, what about clock synchronization. Modern routers do very well (few usec) internally because they use a hardware bus that connects all ports. The hardware bus is dedicated to clock synchronization messages so when an Input port says its 2 am the output will receive it a few usec later. The crucial point is that the clock synchronization message does not travel on the same data path as the packets in which case congestion can cause variable delays. We make a big deal of the fact that most of the delays are in routers. However, there is also a lot of work going on in something called IEEE 1588 (ieee1588.nist.gov) *between routers*. The idea is that the sender and receiver chip sets at say a Cisco and Juniper router agree to send a similar clock synchronization message periodically that is intercepted by the hardware link chip (again no interposing variable delays). This is a big movement and will happen in a few years. They should get synchronization in the order of 100's of usec with short links as opposed to msec for NTP

Second, what was Hilary's insight? We have to take you back to our thinking. The basic LDA idea is to divide the set of packets into random groups and sum the time stamps and number of packets for each. If the number of losses is smaller than the number of groups, then at least one group survives and we get a good estimate. But each group has a cost (2 counters), so a real router can afford say 200 counters. If the number of losses is bigger than 200, say 400, we can "fit" the loss into 200 by sampling only half the packets to get say 200 losses. This works if we know the loss and so the last idea was to keep multiple banksm each tuned to a different loss rate (high, medium, low, Hilary and Mike show 1 is enough in their note!). Each bank had multiple rows representing the groups within the bank. We assumed that one of the banks would work well for any real loss and decided to build our estimator only from the "best" bank.
Hilary realized that any row (group) that had not suffered loss in *any* bank (not just the "best" bank) would also provide useful information and produce "more samples" and hence "less error". Kind of obvious when she pointed it out, but given our thinking about "best" banks, we missed it.