Tuesday, November 10, 2009

Graduate School? How to Decide...

What do people think about students going to work for a year or two and then applying to graduate school? Or applying but then deferring to work for a year or two?

It's that time of year when seniors are thinking about graduate school. (I have multiple requests for NSF letters pending...) So, naturally, the other day I talked with a student who, essentially, had the question, "Should I go to graduate school?"

In this case, the question wasn't one of talent; the student would, I'm sure, do very well in graduate school. But he also has a job offer from a top company in computing where he could do interesting work and, I'm sure, also do very well.

In these tough situations, I try my best not to give direct advice, but instead try to get a student to talk about their own concerns and issues to help them realize which way they really want to go. While I feel positive about the outcomes from my having gone to graduate school, I'm a very biased sample, and I know lots of others -- very bright, talented, capable people -- who found it wasn't worth it for them. I don't think I would attempt to give advice even if I thought I could perfectly distinguish those who would find great personal success from graduate school from those who won't, and it's perfectly clear to me that I'm far from a perfect distinguisher.

Where possible, I try to give facts. Inevitably, people who find both work and graduate school compelling options want to know how difficult it would be to switch from working back to school. My take was that at the application level, a year or two working generally, at worst, does minimal harm to an application. Your professors still remember who you are well enough to write useful and informative letters, and your academic skills are assumed to have not gotten rusty. Coming back after an extended period, however, might make the application harder to judge.

The greater difficulty in switching is that the longer you work, the harder it can become. You get used to a real paycheck instead of a subsistence wage. Who wants to move again, uprooting their life (friends, relationships, etc.)? And you probably start to become attached to your job and your co-workers in various ways. [Interestingly, the same sorts of issues can arise for people who are thinking about academic jobs vs. research labs/other jobs after their PhD.]

Happily, the student seemed to not need my most important advice -- that both possibilities offered him great opportunities for success and happiness, so he should not stress about making a choice that was "wrong".

Does anyone have further, general advice for those facing this decision?

Friday, November 06, 2009

Ranking

An anonymous commenter asked an insightful question, worthy of a real answer: "Hi Prof, Why are you so obsessed with ranking things?"*

Honestly, I don't think I am. I have 3 children, and I have thus far avoided assigning them a preference ordering.** If you asked me for my favorite TV shows (or movies, or songs, etc.) I could think of some off the top of my head, but I haven't ever thought hard about coming up with a list of favorites.*** Same with restaurants, food, vacation destinations, whatever. I don't spend my time giving rankings for Netflix or things like that. I could probably come up with rankings with some thought, but it's not like I go around ranking things constantly.

That is, I don't do that in my personal life. In my professional life, come to think about it, I spend an awful lot of my time ranking things. I serve on multiple program committees each year where I'm asked to rank papers. (And I send my papers to conferences, where they are in turn ranked, and my submission is, implicitly a ranking of sorts on the conference.) I serve on NSF panels to rank grants. I write letters of recommendation which, implicitly or explicitly, provide a ranking of students (and, occasionally, faculty). I interview and evaluate faculty candidates. I grade and assign grades in my classes, and similarly grade senior theses. I serve on a Harvard committee that decides undergraduate thesis prizes. And I'm sure if I thought it about some more, I could come up with even more examples.

My blog is meant to be a professional blog, about my professional life. If it seems that I'm obsessed about ranking, that is a reflection of my professional life. I am asked to rank a lot as part of my job.

So I think I can turn the question back -- why are all of you so obsessed with ranking, that I end up having to spend so much time doing it?


* This comment came up in my last post about the possibility of FOCS/STOC asymmetry, where ranking was at most a tangential concern. But my previous post was on ranking networking conferences, so I can understand where the comment comes from.
** That's meant to be humorous.
*** Well, that's perhaps not quite true. Any undergraduate who has taken my algorithms class can correctly tell you that my favorite TV show of all time is Buffy the Vampire Slayer, so I'd best admit to it before it comes up in the comments.

FOCS/STOC and Asymmetry

I had a funny conversation with Madhu Sudan yesterday, with him relaying an idea he said he heard from Umesh Vazirani (and perhaps the trail goes on further from there) -- roughly that FOCS should double in size and STOC should halve in size. Or, I guess vice versa -- the point is that right now the two are pretty symmetric, and it's not clear that's the best setup.

The idea (or my interpretation of it) is that in theory we could use a more selective "top" conference -- one that people felt they should really try to go to, even if they didn't have a paper in it, because it would have the major results from the year. Hence we halve one of the conferences and make it more selective (and, naturally, make it single-session, maybe have some special tutorials or other activities). At the same time, we don't want to lessen the number of papers that currently are taken in FOCS/STOC -- indeed, since the community (or at least the number of papers being written) has expanded, we should probably accept more. (So maybe people wouldn't feel the need to start yet more conferences, like ICS.) So we double the other. Again, this would be a conference that, ideally, more people would attend, because more people would have papers in it. Indeed, this could help get papers out of the system faster (instead of papers being resubmitted quite so frequently). By introducing asymmetry, perhaps we could make both conferences more appealing and better attended.

I pointed out that one community I know of already does this -- this is very similar to SIGCOMM and INFOCOM in networking. I think that model works, though there are certainly tensions and problems with it -- as you can see in the comments on my recent post on Ranking Networking Conferences. (Bigger conferences are more variable in quality, primarily; also, they require large-scale parallel sessions.) Again, we'd have asymmetry -- the larger conference might become perceived as "weaker", but it would play the important role of bringing the community together and being an outlet for more papers.

Interesting as though the idea is, I have trouble imagining the theory community moving in that direction. Big changes are always hard to get moving, and it's not clear how many people really think the current system is broken -- though the ICS movement clearly seemed to think something was wrong. I'd be willing to try it, myself, but of course I also like the "two-tiered" (or maybe 1.5-tiered) SIGCOMM/INFOCOM system.

Thursday, November 05, 2009

Harvard Financial Aid

This post will talk about Harvard's financial aid program, and why it's a perfectly good thing to give money to Harvard, despite what you might read in the New York Times.

I am motivated to write about this also because some weeks ago, I got into a blog-argument with some Chronicle of Higher Education writer who gave an incoherent argument that Harvard should have been spending its endowment increasing its undergraduate class size. (See the bottom of this post for the starting point if you want.) One point I argued was that Harvard had in fact been spending its endowment to make college more affordable through its financial aid program, and that that was probably doing more to open Harvard up to a wider talent pool than simply admitting more students would do.

Certainly one can argue whether teaching more students or making Harvard financially available to more students is a more important goal. But one thing that became clear is that that author, the author of the New York Times opinion piece, and I presume many other people, just don't understand the financial aid picture at Harvard. So I'll say something about it, that's actually based on facts and numbers.

Let me start with a back of the envelope calculation. (I recently got access to some official numbers, but they may be confidential, and the back of the envelope calculation is easy and accurate enough.) About 2/3 of Harvard undergraduates get financial aid from Harvard, and on average it covers about 2/3 of their tuition. That's approximately 4000 students, getting an average of about $35,000 per year in aid from Harvard, for about $140 million per year. Let's call it $125 million in case my numbers are off and to make the math easier.

Long-term endowment spending rates are about 5%. (This seems to be a standard rule across most major universities, but I haven't seen an economic analysis to explain this number. Please give pointers in the comments.) So Harvard's undergrad financial aid corresponds to roughly $2.5 billion of endowment money.

This is a much bigger proportion of the endowment than people realize. Usually people bandy about a figure of $27 billion or so post-crash for Harvard's endowment, but the endowment for the Faculty of Arts and Sciences -- that is, for the undergrads, as opposed to the law/business/medical/graduate/etc. schools -- is only about $11 billion. So Harvard is now using, by my estimates, well over 20% of its annual endowment spending (for FAS) for financial aid. I've argued in the past that Harvard should make itself tuition-free for undergraduates -- but even I'm impressed by and happy with these numbers.

Think of it this way: the projected deficit for FAS over the next few years, roughly speaking, could disappear entirely without any budget cuts if we just turned off financial aid. Of course that's a terrible idea, and financial aid is one area where Harvard, so far, is making sure not to cut. But that gives an idea of the scope.

So when I hear people say that Harvard isn't doing enough to open its educational doors, or suggesting that giving to Harvard is not morally sound, I admit I feel obliged to politely correct them. (Or, sometimes, less politely correct them.) If you believe that affordable education is important, there are of course many institutions deserving of support. Harvard remains on that list.

Tuesday, November 03, 2009

Conference Reviews

I promised at some point to get back to discussing the reviewing process for two conferences I am currently on the PC for, NSDI and LATIN. Since I happily just finished my "first drafts" of the reviews for both conferences, now seems like a good time. As usual, I've finished a bit early and most reviews are not yet in, so I'm writing this without benefit of seeing most of the other reviews yet.

I should point out that comparing NSDI and LATIN is definitely an apples and oranges comparison, and not just because one is systems and one is theory. LATIN is a "2nd tier" conference (and one would probably argue that was being polite), held every other year, with no specific theme other than theory; the acceptance rate is probably in the 25-35% range. That is not to say the papers are bad, but generally the papers generally utilize known techniques, and the question is whether the underlying question seems interesting, the paper was written well, etc. I'm not looking for papers that everyone would want to read; I'm looking for papers that I think somebody wants to read. Since interests vary greatly, I suspect there may be some substantial score deviations among reviewers, corresponding to different opinions about how interesting something is. I don't mean to sound negative about the conference; some very nice papers have appeared in LATIN, with my favorites including The LCA Problem Revisited, and On Clusters in Markov chains. But I don't think it's a first choice destination for many papers -- unless, of course, an author lives in Latin America or wants to go to Latin America.

NSDI is arguably a "1st tier" systems conference for networks/distributed systems. While it doesn't have the prestige of a SIGCOMM, it's certainly aiming at that level -- although I think perhaps even more than SIGCOMM there's a bit of bias at NSDI for concrete implementations demonstrating actual improvements. In the last two years the acceptance rate has dropped below 20% and I expect it to be there again. Generally I'm looking for a solid, well-explained idea or system design, with some experimental evidence to back up that the idea really could be useful. I admit I would prefer to have some definitions, equations, theorems, or at least well-structured arguments in these submissions -- this is something I push on regularly -- as for me these are highlights of having a well-explained idea, but a paper can still possibly be good without them (and sometimes a paper that is too theoretically oriented wanders too far off from reality, even for an open-minded idealist such as myself).

Now for concrete differences. For LATIN I only have 10 or so papers to review; there's a big PC and the meeting will all be electronic. I imagine I might get asked to read one or two more papers where the reviews don't agree but that's probably it. Most papers will probably have 3 reviews. There's a 0-5 point scale, from strong reject to strong accept, but no "percentages" assigned to the ratings. There's also a whole lot of other scores (originality, innovation, correctness, presentation, difficulty) I have to give that I think are overkill. Even though the number of papers is small, it seems a number of people are using outside reviewers. (I generally don't, unless I feel I'm so far from the area of the paper I need someone else to read it.) We're using Easychair, which these days seems passable, but is far from my favorite.

For NSDI, we have a first round of 20 or so papers. Each paper is getting 3 reviews in the first round, and then we'll probably cut the bottom X% (about 40-50%?). Everyone reviews their own papers. In the second round papers will probably get 1-2 more reviews (or more), and outside reviewers will be used if it's thought their expertise could help. (Usually the chairs, I believe, assign outside reviewers, often based on comments or suggestions by the first-round reviewers.) After the second round of reviews are in we have a face-to-face PC meeting. We're using the standard 1-5 networking scale with 1 being "bottom 50%", and 5 being "top 5%". I've actually found that helpful; I was going over my scores, realized I had bit less than 50% with scores of 1, and went back and decided that there were papers I was being a bit too generous to. (Giving scores of 1 is hard, but if everyone tries to follow the guidelines -- unless they really believe they had a well-above-average set of papers -- I find it makes things run much more smoothly.) We're using hotcrp, which I like much better than Easychair -- I can easily see from the first screen the other scores for each paper, the average over all reviews, how many other reviews have been completed, etc.

Once all the reviews are in, we'll see how things work beyond the mechanics.

Monday, November 02, 2009

ICS Papers Announced

As pointed out many places, the paper for the (strangely named) new theory conference Innovations in Computer Science are out, with the list here and list with abstracts here.

I suppose the future will tell how "innovative" these papers are compared to, say, the normal collection at FOCS/STOC/SODA. I'm not surprised to see the trendy areas of game theory and quantum fairly heavily represented. I was a bit shocked, however, to see a number of papers on what I would consider "mainstream" coding/information theory, in that I wouldn't be at all shocked to see papers with similar abstracts (but different authors) at say an International Symposium on Information Theory. The example nearest and dearest to me would have to be

Global Alignment of Molecular Sequences via Ancestral State Reconstruction
Authors: Alexandr Andoni, Constantinos Daskalakis, Avinatan Hassidim, Sebastien Roch

which, while sounding all biological, is really just studying trace reconstruction problems on a tree. I'm a fan of the under-studied trace reconstruction problem, as it's tied closely to insertion and deletion channels; I was a co-author on a paper on a different variant of the problem back in SODA 2008. (I also cover the problem in my survey on insertion/deletion channels.) I guess I'm glad to see that work on this very challenging problem is considered "innovative".

Wednesday, October 28, 2009

Ranking Networking Conferences

I'm curious if various readers out there would be willing to offer their ranking of networking conferences. The issue has come up in some conversations recently, and I was wondering what other possibly more informed sources think.

Besides your ranking, of course, I'm interested in the reasons behind the rankings. Is it just acceptance rate? Do certain networking conferences specialize in subareas where they are remarkably strong? How does/did such a ranking get built and maintained; does it ever get lost?

Friday, October 23, 2009

Major (and Minor) Happenings : Gu-Yeon Wei

I'm thrilled to announce that my colleague Gu-Yeon Wei, in EE here at Harvard, received tenure.

I feel this is worth a mention because:

1) Strangely, people sometimes seem to forget we have EE here are Harvard. They shouldn't. (Gu, for example, is one of the leaders of the RoboBee project, and with David Brooks in CS, has been writing a slew of papers that spans the circuits and architecture divide.)
2) Strangely, people sometimes seem to have kept the notion that Harvard EE+CS do not tenure their junior people. That's an outdated impression. Like most other universities, we aim to hire people who will eventually earn tenure.

That's the major happening of the day. The minor happening, from yesterday, was that I visited Yale and gave my talk on Open Questions in Cuckoo Hashing. I had a great day, but will pass on just one great insight : if Joan Feigenbaum recommends a restaurant, it's worth listening to. (Dinner after the talk was a true treat.)

Tuesday, October 20, 2009

Old References

One interesting aspect of our WSDM paper is that we have multiple references from the 1930's and 40's. It turns out our problem is related to some of the problems from the early (earliest?) days of experiment design.

This was actually a stumbling block for us for a while. In one sense, we had a very positive starting point, in that I knew there was something out there related to our problem. As a youth (literally, back in high school) I had seen some stuff on the theory of combinatorial design, and while it was too abstract for me to find a direct connection, I knew there must be some stuff out there we better be aware of. We eventually found what we really needed by random searching of keywords; some variation of "experiment design" led us to the Design of experiments Wikipedia page, which used Hotelling's problem as an example. Once we had this (our magic keyword!), we could forward track to other relevant references and information.

In many cases, the problem is not only that you don't know what you should be referencing -- you may not even know you should be referencing something at all. This happens a lot in problems at the boundaries -- econ/CS problems, for example. Most notably, this was a big problem in the early work on power laws, as I pointed out in my survey on power laws -- that's the most egregious example I know, where a lot was "re-invented" without people realizing it for quite some time.

I still get the feeling that, despite the great tools we now have available to us, people don't do enough searching for related work. I can understand why. First, it's not easy. If you don't know what the right keywords are, you have to use trial and error (possibly helped by asking others who might have a better idea). For multiple papers I have written, I have spent multiple hours typing semi-random things into Google and Google scholar, looking around for related work. (In the old days, as a graduate student, I actually pulled out lots of physical books from the library shelves on anything that seemed related -- I like this new system better.) It can seem like a waste of time -- but I really, really encourage authors to do this before submitting a paper. Second, in many cases there's a negative payoff. Who wants to find out (some of) what they did was already done? (In fact, I think everyone who expects to have a long research career would actually prefer to find this out as soon as possible -- but it still can be hard to actively seek such news out.)

On the positive side, I can say that good things can come out of it. Reading all the original work and related problems really helped us (or at least me) better understand the right framework for our variation of the problem. It also, I think, can help get your paper accepted. I feel we tried hard to clearly explain the historical context of our problem -- I think it makes our paper richer than it would be without it, exposing some interesting connections -- and I think it paid off; one reviewer specifically mentioned our strong discussion of related work.

Monday, October 19, 2009

WSDM Paper : Acceptance Rates

I'm happy to announce our paper "Adaptive Weighing Designs for Keyword Value Computation" -- by me, John Byers, and Georgios Zervas -- was accepted to WSDM 2010 -- The Third ACM Int'l Conference on Web Search and Data Mining. (The submission version is available as a technical report here.) The abstract is at the bottom the post for those who are interested.

The paper's acceptance gives me an excuse to discuss some issues on paper writing, research, conferences, and so on, which I'll do this week. To start, I found it interesting that WSDM had 290 submissions, a 70% increase in submissions over 2009. Apparently, Web Search and Data Mining is a healthy research area in terms of the quantity of papers and researchers. They accepted 45, or just about 15.5%. This turns out not to be too far off from the first two years, where acceptance rates were also in the 16-17% range. I'm glad I didn't know that ahead of time, or I might not have submitted!

I'm curious -- why would a new conference, trying to establish itself and gain a viable, long-term group of researchers who will attend, limit itself to such small acceptance rates when starting out? Apparently they thought the key to success would be a high quality bar, but I find the low acceptance rate quite surprising. I can imagine that the rate is low because there are a number of very poor submissions -- even the very top conferences, I've found, get a non-trivial percentage of junk submitted, and although I have no inside knowledge I could see how a conference with the words "International" and "Web" in the title might receive a number of obviously subpar submissions. But even if I assume that a third of the submissions were immediate rejects, the acceptance rate on the remaining papers is a not particularly large 23.3%.

The topic of low acceptance rates for CS conferences has been a subject of some discussion lately -- see Birman and Schneider's article at the CACM, Matt Welsh's thoughts, Dan Wallach's thoughts, and Lance Fortnow's article at the CACM for instance. Here we have an interesting example case to study -- a new conference that starts out with an accept rate in the 16% range, and an apparent abundance of submissions. Anyone have any thoughts on why that should be? (I'll see if I can get some of the conference organizers to comment.) Or opinions on if that's the way it should be?

Now for that abstract:
Attributing a dollar value to a keyword is an essential part of running any profitable search engine advertising campaign. When an advertiser has complete control over the interaction with and monetization of each user arriving on a given keyword, the value of that term can be accurately tracked. However, in many instances, the advertiser may monetize arrivals indirectly through one or more third parties. In such cases, it is typical for the third party to provide only coarse-grained reporting: rather than report each monetization event, users are aggregated into larger channels and the third party reports aggregate information such as total daily revenue for each channel. Examples of third parties that use channels include Amazon and Google AdSense.

In such scenarios, the number of channels is generally much smaller than the number of keywords whose value per click (VPC) we wish to learn. However, the advertiser has flexibility as to how to assign keywords to channels over time. We introduce the channelization problem: how do we adaptively assign keywords to channels over the course of multiple days to quickly obtain accurate VPC estimates of all keywords? We relate this problem to classical results in weighing design, devise new adaptive algorithms for this problem, and quantify the performance of these algorithms experimentally. Our results demonstrate that adaptive weighing designs that exploit statistics of term frequency, variability in VPCs across keywords, and flexible channel assignments over time provide the best estimators of keyword VPCs.