My Biased Coin: February 2013

Wednesday, February 27, 2013

Discussing STOC 2013 PC with Joan Feigenbaum

Joan Feigenbaum is the Program Committee Chair for STOC 2013, where paper decisions were recently announced; I served as part of the Executive Committee. Joan did an excellent job running the entire process, and experimented with a "two-tiered" PC. We agreed that it would be interesting to talk about her experience on the blog, and she agreed to answer some questions I posed. We hope you'll find the discussion interesting.

1. You're now completing your stint as Program Committee Chair for STOC 2013. How do you think the program looks?
I think it looks great. We had roughly 20% more submissions than last year, and many of them were excellent -- an embarrassment of riches. Once we decided to stick with the recent STOC practice of a three-day program with two parallel tracks of talks, we were faced with the usual problem for STOC PCs, namely having to reject many clearly acceptable submissions. I guess that's a much better problem to have than an insufficient number of clearly acceptable submissions, but I still have reservations about this approach to conferences. (There's more on that in my answers to questions 2 and 5 below.)

2. You tried a number of new things this year -- a "two-tiered" PC being the most notable. How do you think it worked? Where do you think it improved things, and where did it not work as you might have hoped?
When Lance Fortnow, SIGACT Past Chair, asked me to be the Program Chair for STOC 2013, he strongly encouraged me to "experiment" and, in particular, strongly encouraged me to try a two-tiered PC. I agreed to do so, but it was a strange "experiment" in that it was not clear to me (or to anyone, for that matter) what problem a two-tiered PC might solve. There was no hypothesis to test, and the whole exercise wasn't a controlled experiment in any well-defined sense. Nonetheless, I was able to reverse-engineer my way into some potential advantages of a two-tiered PC and hence some good reasons for trying it.
Before I get into those reasons, however, I should state the primary conclusion that I drew from this experience: Given the extraordinarily high quantity and quality of STOC submissions, it's extremely easy to put together a good program, and any reasonable PC structure will do. That is, assuming that you don't want to change the nature of the product (where the product is a three-day, two-track STOC that has a fairly but not ridiculously low acceptance rate), you have a lot of latitude in the program-committee process that you use to produce it. There's nothing sacred about the "traditional," 20-person PC with one chair and no PC-authored submissions; there's nothing definitively wrong with it either.
Now what did we try this year, and what were some of its potential advantages? First of all, we briefly considered changing the product, e.g., by having three parallel sessions, but decided against it; we set out to put together a STOC program that was similar in quality and quantity to other recent STOC programs but to do so using a different process. We had an Executive Committee (EC) of nine people (including me) and a Program Committee (PC) of 62 people. PC members were allowed to submit, but EC members were not. The job of the PC was to read the submissions in detail and write reviews, and the job of the EC was to oversee and coordinate the reviewing process. For example, EC members reassigned submissions that HotCRP had assigned to inappropriate reviewers, looked for submissions that required extra scrutiny because they might have subtle technical flaws, and, most importantly, looked for pairs of submissions that were directly comparable and needed to have at least one reviewer in common. In order to promote high-quality reviews (which I thought should be attainable, because each PC member had fewer submissions to review than he would have in a traditional PC), I put together a list of suggested review questions and regularly reminded PC members to flesh out, revise, and polish their reviews based on committee discussions. We made accept/reject decisions about a hefty fraction of the submissions fairly early in the process, based on two reviews of each submission. For the rest of the submissions, we got additional reviews or asked the original two reviewers to consider them in more detail or both; for each set of comparable submissions that survived the first cut, an EC member conducted an online "meeting" (using both email and HotCRP comments) of all of the reviewers of submissions in the set.
One potential big advantage of this way of doing things over the traditional way is that PC service can be much less burdensome. Each PC member can review far fewer submissions than he would for a traditional program committee and can also submit his own papers. He can devote considerably more time and attention to each submission assigned to him and still wind up spending considerably less total time and effort than he would under the old system. He's also less likely to have to review submissions that are outside of his area(s) of expertise, because there are many more PC members to choose from when finalizing assignments. The hope is that almost everyone in the theory community will be willing to serve on a STOC PC when asked if the workload is manageable, that PC members will be more satisfied with the quality of their work if they can spend more time on each submission and don't have to review submissions outside of their area(s), and that authors will get higher quality reviews.
A second potential advantage is that the managerial and oversight responsibilities can be shared by the entire EC and don't all fall on the chair. In almost every traditional program committee I've served on (not just STOC committees), there has been a great deal of last-minute scrambling. In particular, I've been in many face-to-face program-committee meetings at which we discovered that various pairs of papers needed to be compared but had been read by disjoint sets of reviewers. That's not surprising, of course, when everyone (except the chair) had spent the previous few months trying to read the 60 submissions assigned to him and hence hadn't had a minute in which to at least skim all of the other submissions. These relationships among submissions can be discovered early in the process if there are enough people whose job it is to look for them. Having an EC that can facilitate many parallel, online "meetings" about disjoint sets of gray-area submissions is also a big win over a monolithic face-to-face program-committee meeting. The latter inevitably requires each PC member to sit through long, tense discussions of submissions that he hasn't read and isn't interested in; our procedure enabled everyone to participate in the discussions to which he could really make a contribution -- and only those.
I think that most of these hoped-for improvements actually materialized. Certainly almost everyone whom I invited to serve on the PC said yes, and many said explicitly "OK, I'll do it because the workload looks as though it won't be crushing," or "I really appreciate the opportunity to submit papers!" Similarly, we had no last-minute scrambling, and I attribute that to the oversight work done by the EC. All of the potential technical flaws in submissions that we discovered were discovered early in the process and resolved one way or the other (sometimes with the help of outside experts); similarly, all of the pairs of submissions that, by the end, we thought should be compared were assigned to common reviewers early in the process.
Unfortunately, the effect of the lower workload on quality of reviews was disappointing. There was some improvement over the reviews produced by traditional STOC PCs but not as much as I had hoped for.

3. In my experience, our major PCs -- STOC and FOCS -- have small amounts of institutional memory and even smaller amounts of actual analysis of performance. What data would you like to have to help evaluate whether the PC process went better this year?
For this year, I'd like to hear from PC members whether they did in fact spend less time overall but more time per submission than they have in the past on "traditional" PCs. I'd also like to know whether they found the whole experience to be manageable and unstressful (if that's a word) enough to be willing to do it often, by which I mean significantly more often than they'd be willing to serve on traditional PCs. Finally, I'd like to know whether the opportunity to submit papers was a factor in their willingness to serve and whether they found it awkward to review their fellow PC members' submissions.
If future PC Chairs continue to experiment with the process or even with the product, as I suggest that they do in my answer to question 5 below, then I hope they'll capture their PC members' opinions of the experimental steps they take.

4. Are there things you did for the PC that you would change if you had to do it again?
Because the goals of this "experiment" were so amorphous, I and the rest of the EC members made up a great deal of the process as we went along. If I were to run this committee process again, I would start by creating a detailed schedule, and I would distribute and explain it to the entire PC at the beginning of the review process. I'd also lengthen the amount of time PC members had to write their first round of reviews (used to make the "first-cut" accept/reject decisions) by a week or two. I'd also assign second-round reviewers at the beginning, rather than waiting as we did until after the first round of decisions had already been made; we wound up losing a fair amount of time while we figured out whom to ask for additional reviews, and I suspect that many PC members wound up losing interest during this down time. So each submission would still receive just two reviews in the first round, but third (and perhaps fourth) reviewers would have their assignments and be ready to start immediately on all submissions on which early decisions weren't made.

5. Are there things you would strongly recommend to future PC chairs?
I hope that the theory community as a whole will consider fundamental changes to the form and function of STOC. As I said in my answer to question 2, if we want to continue producing the same type of product (a three-day, two-track conference with an acceptance rate somewhere between 25% and 30%), then there are many PC processes that would work well enough; each PC chair might as well choose the process that he or she thinks will be easiest for all concerned. The more interesting question is whether we want to change the product. Do we want more parallel sessions, no parallel sessions, different numbers of parallel sessions on different days, more invited talks, more papers but the same number of talks (which could be achieved by having some papers presented only in poster sessions), or something even more radical? What do we want the goals of STOC to be, and how should we arrange the program to achieve our goals?
The community should discuss these and other options. We should elect SIGACT officers who support experimentation and empower future PC Chairs to try fundamentally new things.
More specifically, I recommend that future PC chairs include, as we did, a subcommittee whose job it is to oversee the reviewing process rather than actually to review submissions; in our case, this oversight function was the responsibility of the executive "tier," but there might be other ways to do it. As I said in my answer to question 2, giving oversight and management responsibility to more people than just the PC Chair really helped in uncovering problems early and in making sure that related submissions were compared early.
Finally, I'd of course recommend that future PC chairs not make the same mistakes I made -- see my answer to question 4.

6. In my experience, the theoretical computer science community is known for comparatively poor conference reviewing. Having been PC chair, do you agree or disagree? Do you think the two-tiered structure helped make for better reviews? Do you have any thoughts on how to make reviewing better in the future?
In my experience, reviews on submissions to theory conferences range enormously in quality. The worst consist of just a few tossed-off remarks and the best of very clear, well thought out, constructive criticism. As I said in my answer to question 2, I had hoped that the two-tiered PC and its concomitant lighter reviewing load (together with my suggested review questions and regular prodding) would lead to a marked improvement in the quality of reviews, but we got only a small improvement. I was extremely disappointed. Frankly, I don't know what the theory community can do about review quality. Maybe we should start by discussing it frankly and finding out whether people really think it's a problem. If most people don't see it as a serious problem, then perhaps we don't have to do anything.

7. As you know, I'm a big fan of HotCRP. How did you like it?
I've used three web-based conference-management systems: HotCRP, EasyChair, and Shai Halevi's system (the name of which I don't remember). In my experience, they're all reasonable and certainly capable of getting the job done, but none of them is great; HotCRP is the best, but not by a wide margin. Part of my problem was that I had unrealistic expectations going in. I'd been told that HotCRP was almost infinitely flexible and configurable, and I thought that it would be easy to set things up exactly as I wanted them; that turned out not to be true. On the other hand, if you use HotCRP exactly as it was designed to be used, it works quite well. I have the feeling that it is a "system builder's system" in that it's very powerful and very efficient but not all that easy on users; the UI is not great. Anyway, you and I do agree on one thing: HotCRP's "tagging" feature is amazing; PCs of all shapes and sizes should make heavy use of it.

Thursday, February 14, 2013

ICALP formatting

Given the loud outcry regarding the STOC 2013 formatting, which gave you 10 double-column pages to work with (at the cost of, you know, having to turn your paper into double-column format), I thought I'd again express my annual dismay at the format for ICALP submissions. Twelve LNCS pages is simply not enough space to present anything interesting at a suitable level of detail. I'm tempted as always to turn in a one-page paper that says "If the one-paragraph abstract sounds interesting, here's the arxiv link to something you can read." Why they haven't pushed LNCS to allow at least 14 pages remains a mystery to me.

Back to formatting.

Wednesday, February 13, 2013

Online Censorship Day and Other Links

1) Sharon Goldberg and Nick Feamster asked me to announce the following:

In the tradition of CAEC, NYCE, etc., we are holding a "Day" on online censorship at BU on March 8, with speakers from technology, law and public policy. We're currently soliciting abstracts for short talks and posters (due Feb 21). Info is here:

http://www.bu.edu/cs/bfoc/

2) The Crimson has a nice article on CS at Harvard, leading with

The computer science concentration has nearly doubled in size in the last two years and continues to drive growth in Harvard’s School of Engineering and Applied Sciences, according to new data released by the SEAS Communications Office.

3) I wanted to point to this essay by Don Rosa. Don Rosa is well-known as the writer and illustrator for many of the tales of Scrooge McDuck, which I didn't read as a kid but have enjoyed with my kids as an adult. (I'd recommend the Life and Times of Scrooge McDuck, but there doesn't appear to be an affordable version available on Amazon right now.) The essay is a poignant explanation of why he stopped, which I expect might resonate with many people, including those who have never read a comic.

Monday, February 11, 2013

Daily Show

Anyone else watching Jon Stewart making fun of Harvard with regard to the "cheating scandal"?

[I need the exact wording for the punch line -- "Open Internet? Is this Harvard or the University of Phoenix?"]

Wednesday, February 06, 2013

Zachary Quinto and Cherry Jones are in Town...

Harvard had a special faculty meet-the-director-deal thing for a preview of The Glass Menagerie, playing the next few weeks at the American Repertory Theater, that my wife and I went out to tonight. Cherry Jones and Zachary Quinto are the leads. It was excellent, though, of course, totally depressing in that Tennessee Williams play way. Shockingly (to me), there appear to be tickets available. If you're in the Boston area, I'd highly recommend making an evening out of it. Enough so that I thought to blog about it.

Sunday, February 03, 2013

Ad Board Update

We've finally got an update on the Government 1310 situation. As reported by the Crimson, FAS Dean Mike Smith sent out an email Friday, where he wrote that

“somewhat more than half” of cases heard by the College’s Administrative Board last fall resulted in forced withdrawals

and many of the other half resulted in disciplinary probation. While this includes more than Gov 1310, given the size of the case, that probably represents the bulk of the withdrawals.

On the positive side, the university tried to limit any financial damage to students given the long time frame required to reach decisions, treating the withdrawals for tuition purposes as though the students had withdrawn on September 30. It seems like they could have decided and announced that previously, but at least they did it.

For better or worse, I suspect we won't be getting substantially more details given the (appropriate) confidentiality with which these cases are handled. I would like there to be some way that more could come out of this, though it appears that will be indirect. A fairly recently formed Committee on Academic Integrity (which does pre-date the Gov 1310 situation) will be making recommendations on issues such as whether Harvard should institute an honor code and how faculty should structure assessments. I'm skeptical that these will get to the deeper issues -- what we mean by cheating (especially in the Internet age) and whether the faculty have a consistent and clear policy about it, what leads students to cheat, and what the role of the University is in developing morality in the student body. At the same time, I'm sure these issues are much closer to the surface now than they have been in the past, and are discussed more amongst faculty and students in unofficial settings.

Further update: I highly recommend Harry Lewis's latest (and last?) post on the issue, and the discussion in Harvard Magazine, which includes the full text of Dean Smith's letter.

Saturday, February 02, 2013

Friday Ruminations

It's felt like a bad few weeks at Harvard.

Not that anything actually BAD has happened, like an inexplicable paper rejection or some interdepartmental fight or anything. It's just that January is filled with time-sucking (or, really, just sucking) administrative work normally, and it's worse this year as I have more administrative duties.

My weeks have been spent looking over faculty applications and graduate applications, and dealing with the paperwork and relevant meetings associated with such. Handling multiple promotion cases. Writing letters for students applying for internships or summer programs or whatever. Preparing for my class this semester -- a process exacerbated by Harvard's fairly recent change of schedule, which places many students away from campus from December finals until the day classes start, making it hard to organize the mostly undergraduate teaching assistants. Handling papers as part of the STOC executive committee. Reading some undergraduate admission folders. Chairing a grant panel (for a friendly foreign country). Writing letters for colleagues outside Harvard going through promotion cases. (You're welcome.) Dealing with the myriad issues of the other CS faculty that pass through me while I'm Area Dean. It's rare that I've had the 20 minutes to think about research, talk to my collaborators, or work out things with my graduate students.

Individually, there's nothing bad about any of these tasks. They're part of the job. But packed together, they make me feel like I'm on an administrative treadmill, and it's wearing me out. If it's optional in February, I'm saying no. (That means you, ISIT papers people have asked me to review.)

On a more positive note, I was at a new concentrator event, and ended up talking for a while with four or five women who plan to major in CS at Harvard, most of whom are currently taking my class. I'm happy that we're seeing many, many more women in CS at Harvard; my only disappointment is that it hasn't been that way for so long.

CS 124 is holding steady at about 100-110 students this year, maybe a little smaller than last year, but at the level of noise. We've got four CS classes over 100 people this semester from the looks of things. (To calibrate, that's a lot for us spoiled Ivy League faculty.) Overall CS enrollments keep on growing.

Finally, an amusing note: in Harvard's Courses of Instruction I'm listed as the teacher next year for CS 221, our graduate complexity course. There always seem to be many bugs in the data for course listings, but this is particularly funny, as I'm planning to be on sabbatical next year, and I've never taught (or had it suggested that I teach) 221. Someone transposed something somewhere.
