Tuesday, November 03, 2009

Conference Reviews

I promised at some point to get back to discussing the reviewing process for two conferences whose PCs I'm currently on, NSDI and LATIN. Since I happily just finished my "first drafts" of the reviews for both conferences, now seems like a good time. As usual, I've finished a bit early and most reviews are not yet in, so I'm writing this without the benefit of seeing most of the other reviews.

I should point out that comparing NSDI and LATIN is definitely an apples and oranges comparison, and not just because one is systems and one is theory. LATIN is a "2nd tier" conference (and one could probably argue that's being polite), held every other year, with no specific theme other than theory; the acceptance rate is probably in the 25-35% range. That is not to say the papers are bad, but generally the papers utilize known techniques, and the question is whether the underlying question seems interesting, whether the paper is well written, etc. I'm not looking for papers that everyone would want to read; I'm looking for papers that I think somebody wants to read. Since interests vary greatly, I suspect there may be some substantial score deviations among reviewers, corresponding to different opinions about how interesting something is. I don't mean to sound negative about the conference; some very nice papers have appeared in LATIN, with my favorites including The LCA Problem Revisited and On Clusters in Markov chains. But I don't think it's a first choice destination for many papers -- unless, of course, an author lives in Latin America or wants to go to Latin America.

NSDI is arguably a "1st tier" systems conference for networks/distributed systems. While it doesn't have the prestige of a SIGCOMM, it's certainly aiming at that level -- although I think perhaps even more than SIGCOMM there's a bit of bias at NSDI for concrete implementations demonstrating actual improvements. In the last two years the acceptance rate has dropped below 20% and I expect it to be there again. Generally I'm looking for a solid, well-explained idea or system design, with some experimental evidence to back up that the idea really could be useful. I admit I would prefer to have some definitions, equations, theorems, or at least well-structured arguments in these submissions -- this is something I push on regularly -- as for me these are highlights of having a well-explained idea, but a paper can still possibly be good without them (and sometimes a paper that is too theoretically oriented wanders too far off from reality, even for an open-minded idealist such as myself).

Now for concrete differences. For LATIN I only have 10 or so papers to review; there's a big PC and the meeting will all be electronic. I imagine I might get asked to read one or two more papers where the reviews don't agree but that's probably it. Most papers will probably have 3 reviews. There's a 0-5 point scale, from strong reject to strong accept, but no "percentages" assigned to the ratings. There's also a whole lot of other scores (originality, innovation, correctness, presentation, difficulty) I have to give that I think are overkill. Even though the number of papers is small, it seems a number of people are using outside reviewers. (I generally don't, unless I feel I'm so far from the area of the paper I need someone else to read it.) We're using Easychair, which these days seems passable, but is far from my favorite.

For NSDI, we have a first round of 20 or so papers. Each paper is getting 3 reviews in the first round, and then we'll probably cut the bottom X% (about 40-50%?). Everyone reviews their own papers. In the second round papers will probably get 1-2 more reviews (or more), and outside reviewers will be used if it's thought their expertise could help. (Usually the chairs, I believe, assign outside reviewers, often based on comments or suggestions by the first-round reviewers.) After the second round of reviews is in we have a face-to-face PC meeting. We're using the standard 1-5 networking scale, with 1 being "bottom 50%" and 5 being "top 5%". I've actually found that helpful; I was going over my scores, realized I had a bit less than 50% with scores of 1, and went back and decided that there were papers I was being a bit too generous to. (Giving scores of 1 is hard, but if everyone tries to follow the guidelines -- unless they really believe they had a well-above-average set of papers -- I find it makes things run much more smoothly.) We're using hotcrp, which I like much better than Easychair -- I can easily see from the first screen the other scores for each paper, the average over all reviews, how many other reviews have been completed, etc.

Once all the reviews are in, we'll see how things work beyond the mechanics.


Anonymous said...

"The LCA Problem Revisited" is a curious choice as favourite paper. Their result is identical in terms of space and time complexity to that published already twice before. The specific technique they deploy to get the result was itself taken from O. Berkman, D. Breslauer, Z. Galil, B. Schieber, and U. Vishkin, Highly parallelizable problems, STOC '89, as they say in the paper. The novelty and hence their contribution is almost entirely in the clarity of the exposition. This has indeed been very useful to researchers interested in this area but who were not keen to read the previous papers.

Michael Mitzenmacher said...

Anon #1 : Indeed, I'd have to say, I like the LCA paper for the clarity of the exposition. I found it so enjoyable, I ended up making it one of my last lectures for my undergraduate algorithms class. It brings back dynamic programming in interesting ways, gives a nice demonstration of offline setup cost/query cost/memory tradeoffs for data structures, and shows the power of a clever "recursive" argument. And combined with suffix trees (which I quickly go over on the last lecture) it allows for some pretty fun results.

I'd be happy to see more papers like it. But then again, I regularly do things like write survey articles, so what do I know.
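For readers who haven't seen the paper, the core idea it popularizes -- reduce LCA to a range-minimum query over an Euler tour of the tree -- can be sketched in a few lines. This is only an illustrative sketch: it uses a plain O(n log n) sparse table for the RMQ step rather than the paper's O(n) "+-1 RMQ" refinement, and all the names are mine, not the paper's.

```python
def build_lca(tree, root):
    """tree: {node: [children]}. Returns an lca(u, v) query function.

    Sketch of the Euler-tour + RMQ reduction from "The LCA Problem
    Revisited"; uses a simple O(n log n) sparse table, not the paper's
    linear-space +-1 RMQ structure.
    """
    euler, depth, first = [], [], {}

    def dfs(u, d):
        first[u] = len(euler)
        euler.append(u)
        depth.append(d)
        for c in tree.get(u, []):
            dfs(c, d + 1)
            euler.append(u)   # revisit u after returning from each child
            depth.append(d)

    dfs(root, 0)

    # Sparse table over depths: table[k][i] holds the index of the
    # minimum-depth entry in euler[i : i + 2**k].
    n = len(euler)
    table = [list(range(n))]
    for k in range(1, n.bit_length()):
        prev, half = table[-1], 1 << (k - 1)
        row = []
        for i in range(n - (1 << k) + 1):
            a, b = prev[i], prev[i + half]
            row.append(a if depth[a] <= depth[b] else b)
        table.append(row)

    def lca(u, v):
        # The LCA is the shallowest node between the first occurrences
        # of u and v in the Euler tour.
        l, r = sorted((first[u], first[v]))
        k = (r - l + 1).bit_length() - 1   # two overlapping 2**k blocks
        a, b = table[k][l], table[k][r - (1 << k) + 1]
        return euler[a if depth[a] <= depth[b] else b]

    return lca
```

After the O(n log n) preprocessing, each query is O(1) -- exactly the offline setup cost / query cost / memory tradeoff mentioned above.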

Semi said...

The LCA paper is also one of my favorites. It took a cryptic theoretical solution and made it practicable.

Solving problems for problem's sake is mathematics, solving them in a way that is relevant to CS is TCS. The previous two versions made progress in the LCA problem but since they were not implemented by others they did not fully solve the TCS problem. They solved the math problem of proving that a solution exists in theory. The LCA paper proved that the solution exists in practice and countless implementations of the algorithm soon followed.

Anonymous said...

Solving problems for problem's sake is mathematics, solving them in a way that is relevant to CS is TCS.

In fact, to the contrary, multiple proofs of the same theorem are quite highly valued in mathematics -- and there is usually no problem in publishing a different (even if more complicated) proof of a known theorem, if the new proof shows interesting connections to other areas. In TCS, it is a rare occurrence.

Anonymous said...

Coming back to reviewing processes, I also am a fan of the multiple round system, which focuses attention and reviewer time on where it is needed: on the upper ranked papers. Some conferences even have a third round (e.g., OSDI) and perhaps even a fourth. I once had an SOSP reject with 8 reviews.

This approach essentially takes the ranking side of the ranking/rating debate in PC reviewing, which I agree with. The job of the PC is to determine which are the N top-ranked papers in the pool of submissions they get.

Dave Backus said...

I'm trying to understand the difference between conferences in CS and economics (my field). Are yours published? Ours are more informal -- early ideas exposed to discussion -- so that might change things. Here are two approaches I've found interesting. (i) Do formal reviews (like those you described) and aggregate them somehow. (ii) Have a group of young people in the field do whatever they think best. In my experience, (ii) generates a much more interesting program.