Monday, December 31, 2007
Theory at Bloomingdale's
Theory on Sale
I didn't know CS theory was popular in the mainstream!
Theory is apparently a New York clothing design firm, and their wares are in most of the major chains (Saks, Neiman Marcus, Nordstrom's, etc.) So when you look up anything related to theory, and it's vague enough that Google isn't sure what sort of ad to throw at you, you end with clothing ads. I believe their website is www.theory.com, an address I suppose I should have purchased years ago.
Yes, www.cstheory.com is taken. (Smart thinking, Kevin...)
Wednesday, December 26, 2007
I never know what to make of these sorts of stories. While I admire the enthusiasm, is this really what gets these students (and others) excited about computing? Somehow, it seems like the equivalent of writing your own iambic pentameter poetry in Latin. The result should be worthwhile for a small, small number of people.
Yet here I am linking to it.
Thursday, December 20, 2007
I'm always torn when I hear about such things. I think it's a very good and important idea, but at the same time, I look forward to the day when such workshops won't be necessary (and wonder why we're not there yet).
I do think that peer support is key to getting women into computer science, and having them stay. Some years ago at Harvard we were fortunate to have a group of four very mathematically talented undergraduate women, all of whom apparently had at least met each other in high school, come through the computer science ranks. It was clear that the "group dynamic" of being able to work together and talk with each other was very helpful to them. Three of the four went to graduate school in computer science; two are/have finished (in theory), one switched over to economics. While I have seen many strong women undergraduates come through Harvard, I haven't seen a group quite like this one, and I wish I saw more.
Of course, it also helps to have successful role models, so I suppose now is an apropos time to belatedly congratulate Susanne Albers for winning the Leibniz Prize. I had the great fortune to start working with Susanne while I was a graduate student at Berkeley and she was a postdoc at ICSI. It was a great experience for me, as at the time, I was still figuring out how this whole research thing worked; I learned a lot working with her. I'm thrilled she's getting this outstanding prize and the corresponding recognition for her body of work.
Monday, December 17, 2007
Wednesday, December 12, 2007
Monday, December 10, 2007
It's not as wacky or ludicrous an idea as not charging anyone tuition, but I'm still glad to hear it. (Wait, this doesn't cut my salary, does it...)
Friday, December 07, 2007
Now occasionally I catch students who admit to reading this blog. I mean all students, from whatever school, undergraduates and graduate students, not just students from my classes or Harvard students. I hope some of you are reading now. Because I'd like to ask you to enlighten me. (That means, for instance, I'll keep quiet on the comments.) Please tell me, in your experience, did your education prepare you for your life after in the real world. (For current students, you can comment on how you feel your education is preparing you.)
While I'd expect you to comment anonymously, I'd ask that you provide salient information where possible. (Harvard student or not, current undergraduate or long-time real-world person, CS or EE or other major, etc.) I'd also greatly enjoy hearing specific comments and criticism regarding my own classes, from any ex-students out there.
And in advance of some annoying anonymous commenter who might feel the need to say how out of touch I must be that I need to find out how students are doing by asking on my blog, please rest assured I have other sources of information (both personal and data-driven) on the subject, but this is, hopefully, an interesting opportunity for me and others to gain more insight.
Thursday, December 06, 2007
Wednesday, December 05, 2007
We are so pleased that you are going to be a part of the ACM-SIAM Symposium on Discrete Algorithms (SODA08).Please be advised SIAM has not yet received your conference registration. Although you may register on-site, SIAM requests you register in advance to confirm presentation of your paper.
I had to search my e-mail, to confirm my memory that I had in fact registered. I sent a note back to the sender explaining I had an e-mail confirmation, and got back this response.
Dear Michael Mitzenmacher,
Please note that this is a mass mailing. It clearly states and the end of the registration reminder
“If you have already registered please disregard this message”.
Indeed it does. But why then send a mass mailing saying clearly at the beginning of the message that they haven't gotten a registration, instead of just sending a reminder that speakers need to register? What makes this SIAM conference person (whose name I'm politely removing) think this is OK? It's not to me. I've sent a complaint, and I plan to complain further to SIAM staff at the conference.
Tuesday, December 04, 2007
Wednesday, November 28, 2007
One positive for me is the chance to catch up with various networking people who I don't see so regularly. The socializing aspect -- just talking to people, finding out what they're working on, making sure they know Harvard is hiring, all this is good. (Jennifer Rexford and I chatted during a break, and when I sat down after the break, she had already sent me a paper to look at. Which I need to go ask her some questions about... I also finally caught up on some overdue researchy discussion with Dev Shah of MIT; yes, I have to go to DC to meet up with someone from MIT...) In this sense, it's like going to a networking conference, and obviously interacting with peers is one of the pleasures of a conference.
The invited presentations aren't doing much for me, however. They're not highly technical talks -- more high-level talks, connecting network research to network users. (We're hearing two talks right now about issues in the practice of emergency networks in disasters.) It's not that the talks are bad, but it's a bit hard to see their purpose. This is a room full of networking people -- they all have well-developed individual research agendas and know what they think the problems are. I don't think the talks are adding much new. (Let's just say I'm not the only one tapping away at the keyboard when I should be listening to the talk. So again, it's just like a conference.)
Besides invited presentations, there are break-out sessions. I'm sitting in on the "network management" breakout session, which far and away seems the largest. About 25-30 people or so in here. It's actually kind of like a conference panel, only a bit more chaotic with so many people. While the discussion is high-level, and I'm not sure we're headed anyplace specific, it's actually a pretty interesting discussion that highlights the scope of the problems of network management. (So many requirements, so many heterogeneous user classes, so many ill-defined goals, so many ill-defined or at least hard to measure quantities, etc.) Interestingly, the group as a whole seems quite convinced of the importance of what I call theory -- modeling, control, algorithms. (The pro-theory comments seemed to begin from Jennifer Rexford and Anja Feldmann, two incredibly pro-theory and powerful network researchers.) This gives me a good excuse to chime in and further emphasize the importance of theory to network management, which I manage to do repeatedly.
Ellen Zegura is giving a talk about GENI and the GENI Science Council. The GENI Science Council is supposed to come up with a "research plan" for GENI, an experimental facility/infrastructure for the development of the next generation Internet. (Network people rightfully complain it's hard to do experimental research on the current Internet at a large scale, so the plan is to design a research infrastructure for these sorts of problems.) There's a notable shortage of theorists on the council, but they seem to have noticed this and added Joan Feigenbaum and Michael Kearns recently. (From my perspective, if you're going to design a whole new network infrastructure, it seems to me you'd want some theorists to help make sure it's done right.) They're still putting together the research agenda that goes along with the planned facility, and they're still looking for input. Essentially, it seems like they're looking for research examples and research experiments that would utilize or benefit from a GENI-scale system, so that they can explain clearly why building a GENI-scale system is a good idea. So if you have ideas for experiments that you'd like to do on a network-wide scale, you might start looking at what GENI's up to. Overall, GENI seems like such a huge project, and it's still a bit amorphous at this point, so that people are a bit confused. Or maybe that's just me. (I did have to get up at 4 am to catch the early plane down here.)
I talked with Ellen later, and got clarification of the GENI state. I had thought this was well on its way to becoming a possibly multi-hundred-million dollar new NSF project, but apparently it's much less further along than I had thought. If GENI is going to happen, it apparently needs to regain some momentum. And GENI seems a key component of the FIND program; it's hard to develop next-generation network architectures if you don't have a network to try them out on.
Before dinner they have a poster session. It's well done for a poster session -- they have food and a cash bar, and food and drink always seem to be requirements for a poster session where people actually stay and look around. There's a nice mix of work; my favorite is an idea multiple groups seem to be working on that there shouldn't be just one network architecture, but multiple network architectures running over the same network substrate. My interpretation is that you have routers/other infrastructure that can run multiple architectures efficiently in parallel, so running side by side you might have our standard Internet with another network with much stronger (or weaker!) quality of service guarantees, or another network where connections are based on virtual circuits, and so on. This makes sense if you believe that a one-size-fits-all Internet is no longer the best idea.
Day 2 is mostly having the breakout session coalesce their ideas into short presentations. Pablo Rodriguez, of fame for example for his work on Avalanche/network coding at Microsoft, gave a nice talk about the economics of P2P and ISPs; he's left Microsoft to work for Telefonica Barcelona, and has developed some insight from the ISP point of view.
Overall, I ended up with mixed feelings about this working meeting. It's useful to get people together to find out about what everyone is doing and to talk about high-level issues. Conferences already serve some of this purpose, though. Some of the value I got from this meeting derives from the fact I haven't been to a networking conference for a while (I didn't feel like traveling to Japan for SIGCOMM this year...). High-level issues don't necessarily get discussed at conferences, but it's also hard to get 50+ people together to talk about high-level issues and get any sort of agreement that wasn't already obvious. I'm skeptical that 3 such meetings are needed each year. However, the plan seems to be for the next meeting to be a graduate student focused meeting, which may be a good idea -- getting graduate students together to meet and talk like this seems like a potentially very interesting idea.
Going to the meeting will pay off, however, if it leads to any new collaborations. If any FIND people are reading this, and you want to add some theory to your work, send me an e-mail. (Of course, you could also go down the hall and talk to the theorist(s) in your locale, which I'd really strongly recommend, unless I'd be interested, in which case try me first!)
Monday, November 26, 2007
Why does this story remind me of Digital Fountain? I recall the early, early days going around giving talks about Tornado codes and their potential uses in networks, and in the beginning, the greatest skeptics were the coding theorists. One talk in particular, I remember, early on a very senior coding theorist insisted repeatedly that highly optimized Reed-Solomon codes could do anything we were claiming. The rest of the talk was a bit more difficult to get through after that argument. (I like to think by the end of the talk he thought that maybe there might be something new there...)
So is this to say that I think the academic crowd is being too hard on D-Wave? Absolutely not. It sounds like they're doing their job -- trying to figure out what's going on and asking tough questions. The point of my parallel is that the coding theorists had the right to be skeptical. (One reason for it was there was a huge language barrier, me not knowing what a "decibel" was, and them not really getting why linear vs. quadratic complexity was such a big deal. The divide being coding theorists and CS theorists has shrunk an awful lot this past decade.) It was our job to convince them that we had the goods. And we provided evidence with equations, papers, talks, and demos. While they took some convincing, the coding theorists listened to the evidence, and rapidly came around.
Here, the story seems to be (my interpretation!) that D-Wave is presenting painfully unconvincing demos and releasing not nearly enough information for experts to see if they have any new ideas. Then they're compounding this by excessive PR overstating what might eventually be possible with quantum computing. (There seem to be some arguments whether they themselves are doing the overstating, or uninformed journalists, in which case D-Wave is just failing to correct the press, but whatever.) Their excuse for this is that they're a private company so they have to keep their secrets. Understood! But they shouldn't then be surprised when knowledgeable academics studying the issue voice their opinion that the emperor has no clothes. I'm quite sure the scientific community has enough open-minded people that if they start producing convincing evidence, or give enough details on what they're doing so people can verify if the path they're taking is productive, they'll be understood and praised. Until then, the feedback they're getting is rightly telling them they have more to do.
Sunday, November 25, 2007
So maybe someday I'll be watching IPTV using a Digital-Fountain enabled product.
Wednesday, November 21, 2007
I can imagine two reasons for instituting page limits on journal papers: paper costs, and reviewer time. Neither seems particularly compelling to me. Are there other reasons I'm missing? (Notice that both of these are more compelling in the case of conferences, where I would agree page limits make more sense. And even in that setting, papers costs is rapidly disappearing as a reason, and at some point, the question of page limits should probably be reconsidered.)
I do admit that there are many papers where the author should be forced to edit things down further, and most authors have a tendency to write too much rather than too little. But it seems the right answer to this is to have good reviewers and a good editor firmly tell an author that the paper is not yet ready and needs further editing. A page limit doesn't adequately solve this problem and raises new, more unpleasant ones.
Anyone want to defend this practice? Or at least suggest reasons for it I haven't thought of?
Sunday, November 18, 2007
Friday, November 16, 2007
Let me explain. As an algorithmist, a major interest for me is designing algorithms that people will actually use -- perhaps not in the original form that I describe them in, of course, but that the ideas make it out into the real world. And while it is not my aim that every project I'm involved in should have a clear potential practical impact, it is the case that this goal guides my overall research agenda quite significantly.
IP issues have the potential to limit the practical impact of an algorithm I design. For example, suppose someone has a patent on SuperHashSchemeX (SHSX). (If you want something more specific, suppose Bloom filters were patented.) I might come up to an improvement to SHSX, or a better analysis of SHSX, but if the patent is written well, the patent owner may have rights controlling who could use SHSX, including my improvement. There may really be no hope that anyone else will actually be able to make use of my ideas, until the patent runs out, or unless the user is willing to buy a license. Indeed, unless I go through the process of obtaining a patent, it's possible that my work might simply add value for the initial patent-holder, and I've actually rewarded those who are keeping my work from being used.
Perhaps this wouldn't bother me if my theory blood ran pure, but I like being an "applied theorist". In my mind, this affects the calculus I use in thinking about what I should work on.
Should it? Has anyone actually run into this problem?
Monday, November 12, 2007
Thanks to some friends who are working in Web marketing, I have a chance to try a viral marketing experiment right here on the blog. I have been give some discount codes for a new ticket site -- ticketheroes.com -- which offers tickets for sports/concerts/other events. Anyone can use them, at least until 1/1/08, so you can use them, or tell a friend to use them, or have a friend tell a friend, or whatever. According to that whole 6 degrees of separation thing, it's possible that everyone in the world could learn my discount codes, which by the way are:
MMBLOG1 [$5 off any ticket purchase]
MMBLOG2 [5% off any ticket purchase]
[Yes, you can only use one of them...]
To be clear, I am officially acting as a consultant for the SEM company that is trying to drive traffic to TicketHeroes. I am not getting a direct cut back from sales using these codes. (Though if this experiment is hugely successful, perhaps I'll try to negotiate one!) But I will get back data -- I'll get to see how many sales used the codes. (Don't worry, I won't get your credit card number.) I find this all very interesting. Can this little blog, focusing primarily on academic topics, help sell things? Really? I kind of doubt it, but hey, I remember thinking around the year 1999 that it would be a really impressive thing if Google became a 1 billion dollar company, so clearly I don't know as much as I should about marketing.
I've done my own checks and the site's prices look to be slightly less than or the same as other similar sites, even without the discount. So this should be a win for anyone looking for tickets, although of course you shouldn't trust me, you should compare.
Next post, we'll return to less crass, commercial topics, like how to re-finance your mortgage.
Saturday, November 10, 2007
Thursday, November 08, 2007
Having served on a half dozen or so NSF review panels, I have some sympathy for this point of view. In particular, being anti-travel, I'm not sure why we have to fly in for a two-day meeting to make decisions; it's expensive for the NSF and time-consuming for me. (It's not clear to me that decisions are any better because of the face-to-face meeting. Indeed, arguably, they can be worse, depending on the group interaction. But government process is government process, so unlike for PCs, not having the face-to-face meeting is not currently an option...)
However, despite the time, I've tried to make myself available when the calls for NSF panels go out, because I've always figured that if the NSF is paying my summer salary (which, in large part, they do), they have the right to ask for some of my time. Indeed, my gut reaction (without working through the arguments) is that it's objectionable that they don't actually require people who have received funding to serve on a review panel at some point during the term of their grant, though I suppose the implementation of that rule could become unpleasant. In short, my moral interpretation is that by taking their money, I'm part of their system, so when they come calling, I should respond, even without an explicit quid pro quo.
Even if one does not subscribe to this implicit understanding, I think it's important for us individually and as a community (including non-academics who don't get NSF funding directly) to do our service for the NSF, particularly in the way of review panels. In general, citizens should be keeping an eye out on how the government uses their money, and in this specific case, as scientists, we should be paying especially close attention to how the government is distributing research money to scientists in our own and related fields, and we should be attempting to influence the process to move in the right directions as much as possible. The panels give us some opportunity to influence the process, and these otherwise near-useless face-to-face meetings give us some opportunity to influence the NSF administrators who shape and make the decisions that affect us all.
So, with all due respect to my senior colleague, and any of you other slackers out there, get over yourselves, and go serve on an NSF panel every other year or so.
Monday, November 05, 2007
Besides helping create a more pleasant graduate student atmosphere and experience, these activities let graduate students get to know more about each other and what they were doing. And I believe expanding your own work-related social network pays dividends in the long run.
I don't know what the current state of graduate student life is like these days, but if you don't think there's enough of this sort of stuff where you are, I encourage you, take action! (Don't wait for the department to do it; you'll do it better anyhow.) Try to get a grad student poker game going. Or a biweekly pub night. Or start an informal seminar. Or an open problems session. Or whatever it is you want to do, where graduate students can just hang out together, without faculty, with it being about fun, instead of or combined with work. Besides having more fun, you'll open up more opportunities for the serendipitous events in your work-life that lead to exciting projects.
Friday, November 02, 2007
I'll state clearly that my opinion is that breadth requirements are a good thing. Breadth requirements for Ph.D. students are like vegetables for kids -- they don't always like them, but they're good for them. When I was at Berkeley the breadth requirements were reasonably onerous and I think they were very positive for me (even if, as contemporaries will suggest, it seemed like I slept through most of them). I especially think it's good for theoretical people to have a background in many areas of computer science outside theory, and classes are arguably the best way to get that background. And certainly I would argue that people in systems need some theory in their background.
I'd enjoy hearing arguments from the other side. You can argue that breadth requirements in general are a bad idea, or specifically that theory people shouldn't have to take classes in other CS areas (they should be allowed to just do all theory!), or that systems people shouldn't have to take a theory class. Any EE readers or readers from other areas should chime in with arguments based on their experiences as well!
Wednesday, October 31, 2007
Monday, October 29, 2007
- It fulfills a need to be more directly relevant to the real world.
- It gives me a chance to work with different sets of people.
- It allows me to learn new things and exercise skills that I don't necessarily use as a professor.
- It pays well.
Although I do know several algorithmists that consult at times, my impression is that the consulting culture is surprisingly small in computer science theory, even among algorithmists, and that it is much more prevalent in networking and EE theory. (The exceptional case I can recall is around the time of Akamai, where once it seemed like a sure bet, a lot of theorists hopped on the bandwagon for at least a short stint.) I generally hear people talk about consulting much more at networking and EE conferences than I do at theory conferences. Maybe it's just impolite to talk about it. Or maybe theorists don't consult much because most don’t have the skills people are looking for in consultants. Or maybe most theorists just aren’t that interested in consulting, and that's part of why they ended up being theorists.
Do you consult? (I'm particularly curious -- do complexity theorists consult?) If not, why not? Are my impressions about theory+consulting just way off base?
Sunday, October 28, 2007
Monday, October 22, 2007
Strangely, I can't think of any time where a "philosophy of service" has been explained to me -- in graduate school or as a new faculty member, from the theory CS community or from my institution that employs me. Thinking about this, I find it rather odd. Surely, someone should be suggesting what's an appropriate level of time and effort to put into service, and where these efforts might best be applied.
(There is some advice I've seen in books for new faculty members. For example, Robert Boice's book Advice for New Faculty Memberssuggests that new faculty essentially spend as little time as possible on service activities, and make early service opportunities as self-serving as self-benefitting as possible. I roughly agree, but we should make this part of a well-reasoned philosophy of service -- the community should protect it's youngest members and help them flourish.)
I think this topic is ripe for community discussion, and some resulting general guidelines. People should have some guidelines as to what's expected -- what's usual and what's exceptional community service. (And to be clear, here I mean service to the scientific community at large; university service, for example, is separate.)
I'll start with two basic, over-arching questions. (I have more specific ones in mind, but I'll save that for the next post.)
- How much time should be spent on academic service activities? [My take -- 1/8 time, or at least one hour a day on average, for senior faculty. Less for junior faculty starting out, but increasing steadily toward that. Ostensibly, people in research labs should also be at 1/8 service time -- anyone know of any policies on that?]
- Should we be better rewarding service, and if so, how?
Saturday, October 20, 2007
Also, I wanted to give a big thanks to the local arrangements team of Philip Klein, Anna Lysyanskaya, and Claire Mathieu, as well as everyone assisting them, since I know from experience that local arrangements is a thankless job. I found today's setup wonderful. The room at Brown was great -- perfect size, fine acoustics. Lots of food and refreshments were put out, and to save the conference money, it wasn't catered, but the local arrangements team brought stuff in themselves. (I arrived early and saw Anna driving up with a car full of goodies!) That's the kind of extra dedication that helps keep conferences affordable for everyone, at the expense of the personal time of the people who have to make it all run. So thanks again!
The FOCS setup has reminded me that I wanted to do a series of posts on the theme of service, something I'll try to get to this week.
Friday, October 19, 2007
Wednesday, October 17, 2007
The book takes a stark look at the directions of modern college education, painting a picture of increasing directionlessness at the leading research universities, using Harvard and Harry's experience as Dean of Harvard college as a backdrop. Whether you agree with it or not -- and I have to admit I found a strong resonance with most of the issues in the book -- it is definite food for thought, well-reasoned and well-argued through and through.
There are two very small points of criticism I can make with the book. The first is that while the book sheds light on a number of challenging problems, it frustratingly offers little advice in the way of solutions. However, I think this was quite intentional. Indeed, one of the points in the book is that for many of these problems, what is needed isn't a "solution", but leadership, discussion, and the building of a consensus within the university.
The second nitpick is that one issue raised repeatedly in the book is the invasion of the consumer culture in education. Students pay a great deal for an education, particularly at private institutions, and they expect to get what they pay for; Harry argues forcefully that this trend is not good for the education of students. It would seem to me that this should be another compelling reason why Harvard shouldn't charge for tuition, as it might lessen the "I paid for this" attitude of many students (and parents), but perhaps Harry believes that even if there was no tuition, the consumer attitude would remain.
Monday, October 15, 2007
- Classes specialized for graduate students in a subfield (e.g., theory, or usually the next level down, e.g. cryptography, pseudorandomness).
- Classes for graduate students in CS.
- Classes specialized for undergraduate CS majors.
- Classes for undergraduates primarily in other sciences.
- Classes for general, non-science undergraduates.
Perhaps more challenging is classes of type 5 -- classes for non-scientists. The question of how to design a science distribution requirement course for non-scientists is not specific to computer science, but in CS it perhaps especially challenging. Programming is often taken to be off-limits, since how much worthwhile programming can be taught to people who have never seen a computer program before? And theory seems to require too much math. Plus there is the age-old question of how to make it all relevant to non-scientists.
Harry Lewis co-designed a course, Bits, which I think is a great example of what's possible, and is the course I wish I could have designed, if I wasn't so busy teaching my regular classes. I enjoy this first paragraph from the course description:
This is a course about information for students who are not technically or mathematically oriented. It aims to explain the dramatic increase in the amount of information that is being communicated and stored, and the consequences for society of that increase in capacity.We are on the cusp of a revolution, potentially as consequential as the invention of printing or the replacement of animals by steam engines as the vehicles of transportation.The course aims to equip those who will determining policies, whether as legislators, corporate leaders, or ordinary citizens in their multiple roles, with an awareness of the choices that lie only a short time ahead of us.I am also strongly biased in that Harry followed my advice, with much of the technical meat in the beginning of the course covering basic information theory, specifically compression and coding. (Note: I emphasize that Harry followed my advice, not that he took my advice. As far as I know, he always planned to structure the course this way, and it's just happenstance that his take on what was worthwhile matched mine. I'm pleased nonetheless.)
The course is an interesting mix of science, technology, and policy. The lectures are now apparently freely available online now that the course is over for those who want to see it. Someday, perhaps I'll get to design a course like this. I'd be curious to hear from others what other examples there are for CS-based courses with non-trivial technical content meant for non-science majors. (Jon Kleinberg's course new course comes to mind, but it really seems to be type 4, built for scientists, including mathematically oriented economists and sociologists.)
Tuesday, October 09, 2007
Sticky channels have the nice property that contiguous blocks of (resp 1s) at the input correspond to contiguous blocks of 0s (resp 1s) at the output. This property makes sticky channels easier than more general insertion/deletion channels.
I've just had a paper on sticky channels accepted to Transactions on Information Theory; here's a link to a preprint. The main result is that for that simplest channel above, I can numerically obtain very tight bounds on the channel capacity. But of course I'd still like to know -- is there a simple formula that gives the capacity as a function of p? And is there are simple and efficient coding scheme that nearly reaches the capacity?
Friday, October 05, 2007
For graduate school, your letters are probably the most important deciding factor for whether you'll get accepted or not. For job searches, good letters won't really get you the job, but you'll need great letters to even get your foot in the door for an interview.
Yes, great work will trump everything (if you do really great work, you'll get the great letters). But I've seen numerous students do less well than they should have because they didn't think well enough in advance about this part of the game.
For undergraduates thinking about graduate school, have you formed a close working relationship with one or more faculty members? (This can be done by doing research with them, writing a thesis with them as an advisor, being a teaching assistant for them, etc.) Lots of students will have exceptional grades and test scores; we want to know about you as an individual, and your capacity for research. Letters are the best way for us to get this information. Particularly if you go to a smaller school or a school with a smaller CS department, you may want to try to find a way to do a summer internship with a "bigger name" person -- a letter from such a person will likely make the difference between getting in and not for many places.
For graduate students thinking about jobs, who besides your advisor is going to write you a letter? We expect your advisor to say that you're wonderful; we want to hear from less biased sources. (Many students fall into the trap of working almost exclusively with their advisor -- good in the short-term, not in the long-term.) What other faculty have you worked at your institution? Have you worked with anyone from another institution? Have you thought about doing a summer internship -- which can be worthwhile just to get a good letter! It's important to work with multiple people, not just your advisor, so we can see multiple letters explaining and confirming why you and your work are so outstanding.
I thought about this after a discussion on the complexity blog about graduate school detoured slightly into the subject of letters for jobs (in particular, check Paul Beame's insightful comment #41).
By the time you get to tenure, letters are still important, but ostensibly you have less control about where the letters come from. The same general advice seems to apply, though -- great work trumps all, but working with or having some shared experience with many colleagues, including more senior ones, can be key to career advancement.
Tuesday, October 02, 2007
Somehow it made me think, will the pick and play model of iTunes ever make it to universities? Imagine selecting your own collection of courses for your undergraduate or graduate degree, without being limited to your actual physical university; you could take say Kleinberg's Structure of Information Networks, Roughgarden's Introduction to Algorithmic Game Theory, Demaine's Geometric Folding Algorithms, and so on... of course, an appropriate payment mechanism would have to be designed, but the possibilities are exciting and frightening at the same time...
Monday, October 01, 2007
Thursday, September 27, 2007
The highlight for me was the sessions on Networking Games and Algorithms. Bruce Hajek has been organizing excellent networking sessions for years, and I think this year the focus on network games really paid off and led to some very compelling sessions. I arrived to hear Ramesh Johari, Vijay Vazirani, Tim Roughgarden, and David Kempe all give great talks back-to-back. I'm expecting they'll have a similar set of sessions next year, hopefully expanded!
A small disappointment was that there seemed to be fewer talks on coding theory this year. Plenty on network coding, but very little on standard coding theory. (This may have to do with Ralf Koetter recently leaving UIUC; in any case, many people I usually see there, like Muriel Medard, Ralf Koetter, Michelle Effros, Alexander Vardy, and so on didn't seem to be there.) A top coding theorist who was there independently confirmed this impression; hopefully, things will be back to normal next year. This year, I'm glad the network games sessions kept me so busy.
Wednesday, September 26, 2007
In our new paper, we look at a different approach; here we use the CAM as a queue, so the CAM holds elements that are being moved but have not reached a resting place. Using the CAM as a queue is somewhat more complicated in terms of an implementation, but it allows one to essentially implement a full cuckoo hashing scheme. The queue is used to de-amortize the amortized performance of cuckoo hashing. That is, on average inserting an item in the queue take a constant number of moves, but in the worst case it can take Omega(log n) moves (where n is the number of items being stored). If we allow the queue a constant number of moves per insert, then the queue can simply buffer the insertions that take too long. The result is even better space utilization (naturally, since more moves are used per insert). The paper and talk slides are available.
Sadly, this ended up being essentially an experimental paper. The problem is that there are too many open questions in the setting of cuckoo hashing. For example, in the practical implementations I know of, for cuckoo hashing with d > 2 choices of locations for an element, when an element has to be kicked out, one chooses the element randomly. The analyses I know of are based on doing a breadth-first-search to find an element to kick out, which isn't really practical (certainly not in hardware). As a result, I believe it's still open to prove things like constant average moves/worst-case logarithmic moves for standard cuckoo hashing when d > 2. To really do an analysis of the queueing process, we'd need even more detailed information about the distribution of the number of moves required per inserted element. Lots of good open questions left here...
Monday, September 24, 2007
One of the interesting things about Allerton is that it's a mixed of invited and submitted papers. The understanding is that if you're invited to give a paper, you're supposed to go and give the talk, not a student. So there are a lot of big-name people giving talks at the conference. Also, invited papers have a lot more leeway. Sometimes, a talk will mostly be a survey, or a discussion of preliminary work on a problem. Some of the best talks I've seen there have been these kinds! Attendance the past few years has been over 300, so it's a nice-sized conference.
I've heard some complaints about Allerton. First, UIUC is out of the way, and the Allerton site is about 20-30 minutes from the main town. If you don't know the place, it's easy to get bored post-conference. Usually the various UIUC hosts have some sort of dinner, and the town is actually a nice small college town. So if you get to know the place and the people, it's not really a problem. (Also, it means the conference is a great place to start a new collaboration, after the talks.) Second, generally there are 6 parallel sessions going on. Many people don't enjoy that. I usually find I'm interested in 2 (coding something and network something) at the same time and have to hop around. Third, I've heard CS people say they don't know which talks they should go listen to -- who are the really good speakers and what are the interesting topics. CS and EE people don't really know each other as well as they should, and there is some cost in getting to know a slightly different area and group of researchers. Fourth, the quality of papers is mixed. Some blame this on the "invited papers", but I think it's just true anywhere, especially for large conferences. Fifth, they don't have a published proceedings in the way that other conferences do. I think that's changing from what I've heard, and these days people just post papers on arxiv or their web page anyway.
The biggest complaint that I would agree with is the conference has grown beyond its location. Some rooms are small and easily get too noisy, making it hard to hear the speaker. The conference has just proven too successful!
I really enjoy the Allerton conference. It's a different set of people than I usually see, but work I'm very interested in. It was at Allerton that I first heard about the coding problem for deletion channels, which I've since worked on quite a lot. I've had multiple collaborations come out of the Allerton conference. I'd really encourage people who don't know about it to take a look.
In the other posts this week, I'll talk about what I'm presenting there, and about interesting things I see at the conference.
Friday, September 21, 2007
I imagine the crowd reading this blog is more "technical" than the average, so I'm curious what you all think. Is Star out of touch with reality, are the police in a state of extreme over-paranoia, or both?
Thursday, September 20, 2007
This is a remarkably time-intensive committee. We meet weekly, and it's not at all unusual for the meetings to last 2-3 hours. Plus there's prep time involved in reading the cases before the meeting. There are only a few faculty members on the committee at a time. Most of the committee is made up of administrators and the "resident deans" -- people who live at the Harvard dorms (or "houses" in Harvard-speak) and are responsible for the care of students.
Several of the resident deans have asked me what I did to get stuck on the committee. I generally answer that I didn't realize it was a punishment! (In fact, then-Dean of Students Dick Gross asked me to serve. As he and Harry Lewis were my senior thesis advisors as an undergraduate, I found it hard to say no. I've found this sort of thing is a problem about going back as a professor where you were an undergraduate; people are more familiar with your name and stick you on committees.)
It is indeed a very challenging committee. I've learned some important things, including all about the infrastructure for helping Harvard students with problems and how I can help lead students to this infrastructure. One perhaps obvious lesson is that most often students get into academic trouble -- in particular, with plagiarism -- because they leave assignments until the night before and then take shortcuts to get them in. I've always been quite strict with undergraduates about homework deadlines -- I think overall students need better time management skills and try to do too much -- but I admit I've softened up since being on the committee. Contrary to many students' belief, I don't want my class to spur a breakdown. But remind your students over and over -- the night before is no time to start a theory problem set (or any other assignment)!
Monday, September 17, 2007
At a high level, I know what else I do: research, write papers, give talks, go to conferences, manage and work with graduate students, advise undergraduates, seek funding, sit on department and university and theory-community committees, review papers and grant proposals, act as editor for a few journals, serve on program committees, write letters of recommendation, deal with standard administrative paperwork, and... well, I'm sure there are a few other things as well that I've forgotten. Let's count blogging as part of my job, too. And I consult on the side. Come to think of it, it's a wonder I have time to teach. (That probably explains the students' question. They're wondering if I have 40 working hours a week, why isn't this class better?) Of course, I'm know I'm not unusual; all this seems like standard faculty stuff.
What I've been realizing lately I don't have a good handle on is how much time I spend on these various activities. It has seemed to me that in the last few years -- since getting tenure -- a lot more of my time is going to administrative duties than to research-style activities. If true, that fact along with my poor time-management skills seems like a recipe for disaster. So this semester, I'm going to try a little experiment, and try to track my time in a spreadsheet somewhere, to start getting a better handle on what's going on. And figure out what activities to cut.
Of course, I'm curious -- what are others experiencing? How much time do you think you spend on various activities, and is there a pre-post tenure change? What's the biggest impediment to research time?
Thursday, September 13, 2007
How best to handle such a situation? The authors are good researchers, not hacks. It may be that they already know there's a problem, although we haven't seen any sort of retraction or explanatory note anywhere. At some point, I imagine we send a polite note to them, explaining what we found, although for now we'd like to try to fix the argument (and prove our extension!) ourselves. It seemed like it must be a common enough situation -- in the past I've found major and even unfixable bugs in papers -- that people might have an opinion on how to best handle the problem.
Monday, September 10, 2007
We're looking at a situation similar to the well-known "secretary problem". For those unaware of the secretary problem, it goes like this: you need to hire a new secretary. There are n applicants, and you will interview them in a random order. The result of each interview is the relative rank of the applicant against those you have already interviewed. ("This applicant was the 3rd best so far.") After each interview, you have to decide whether to hire that applicant before the next interview; once you pass on an applicant, they're gone. Your goal is to maximize the probability that you hire the best of the n applicants. What's your optimal strategy? (I won't give away the answer for those who haven't seen the problem; it's easily found online.)
In our setting, we are considering growth strategies for a small startup company. Each applicant has a quality score, which we take to be uniform on [0,1], that we learn in the course of the interview. Unlike the secretary problem setting, we want more than one employee, although there is not a fixed number of employees we are aiming for. We simply want to grow, balancing the tradeoff between the speed of growth and the quality of new employees.
In the paper, we consider the strategies where we choose to hire an applicant only if their quality score is above the average of the current employees. (We study both the mean average and the median average.) We call these strategies Lake Wobegon strategies, after the fictional town of Lake Wobegon, where
all the women are strong, all the men are good looking, and all the children are above average.Here, everyone we hire is above average (at least when we hire them). Google claims to use this strategy.
I should hasten to emphasize that (despite the Google claim) we feel that the hiring problem is about hiring employees just as much as the secretary problem is about hiring a secretary: very little. Rather, it's an abstraction, leading to problems about random processes and decision-making in the context of incomplete information. I've given the talk about this work a few times, and some people get hung up on setting it up as an optimization problem and determining the optimal strategy. You could, I suppose, but you'd have to be precise about the tradeoff about growth speed vs. quality, and the abstraction is already removed quite a bit from reality. I personally think it's simply interesting to consider how growth processes like this work under Lake Wobegon strategies; does anyone know of any related examples in nature?
If you're a puzzle person, then you might consider the following: suppose we start with a single person, of quality 1/2, and then interview until we hire n people. Which strategy will lead to higher average quality -- hiring above the median, or hiring above the mean? (Or are they asymptotically the same?)
The hiring problem leads to a great deal of curious mathematical fun : lognormal distributions, the Berry-Esseen inequality, the difference between hiring above the median and hiring above the mean (yes, they're very different!), and potentially lots of open problems and further variations to look at. Hopefully, the problem will become at least a fraction as popular as the secretary problem has.
I'm not sure why people at Yahoo were interested in this. In this case, I think they too got caught up in the mathematics of the problem. I'm glad that in the research labs, people still just follow their curiosity from time to time, even if the immediate application isn't obvious.
Sunday, September 09, 2007
Thursday, September 06, 2007
Negative dependence is one of those nice tricks that hardly ever gets used because it usually ends up being harder than it should be. While there are many flavors of negative dependence, the most natural is probably "negative assocation". The intuition is simple -- given a collection of random variables, if when one goes up that means the others should go down, then they are negatively associated. More formally, given any monotone non-decreasing function f of a subset of the variables, and a monotone non-decreasing function g of another disjoint subset of the variables, if f goes up, g should go down. Proving this holds formally is often more difficult than one would expect.
Why should we care? Well, it turns out that if a collection of random variables are negatively associated, then even though they are dependent, you can just apply your standard Chernoff bounds to them, without a care. Chernoff bounds (and other tail probability bounds) pop up in many arguments, but they can be very difficult to deal with when the random variables are dependent. Usually, you have to switch to a martingale argument, since standard Chernoff bounds apply only to independent random variables. If you can get negative dependence, it's much cleaner.
The best introduction to negative dependence is probably this surveyish article by Dubhashi and Ranjan. The focus is on how balls and bins problems are a natural example of negative dependence -- if a ball lands in one bin, it can't be in any other! Naturally, this explains my interest.
For example, the following problem came up for me some years ago in this paper. Suppose we thrown n points uniformly at random on the boundary of the unit circle. We want bounds on the number of arcs of length larger than say c/n for some constant c. If arc lengths were independent, we could just apply a Chernoff bound, easy enough. But of course they're dependent -- the sum of the arc lengths is 1! Intuitively, though, if one arc gets longer, then the others must get shorter, so there should be negative dependence. We proved what we needed to get the Chernoff bound, but it wasn't pretty. (I've since seen that a stronger version of this result is given as an exercise on negative association in the very nice-looking draft monograph by Dubhashi and Panconesi; to tell the truth, I'd like to see it worked out, as it seems to me that the conditioning is a bit subtle, but then again, geometry confuses me.)
In fact, in that paper we actually wanted a similar result in the following setting. Suppose one throws n points uniformly at random into a fixed region (say, for example, the unit square, or to avoid boundary issues, the 2-d unit torus). We want bounds on the number of Voronoi cells of size larger than say c/n for some constant c. If Voronoi cell sizes were independent, we could just apply a Chernoff bound, easy enough. But of course they're dependent! Intuitively, though, if one cell gets bigger, then the others must get smaller, so there should be negative dependence. Or maybe that isn't the case. Now I can't remember if I ever found an example that led me to believe it wasn't the case... Anyhow, we couldn't prove negative dependence easily, so we ended up using an uglier martingale argument that sufficed.
I was surprised at the time that I couldn't find any reference to type of this problem in the geometric literature. If negative dependence of Voronoi regions in random settings is still open (and true!), I'd be happy to ponder it with anyone who has a better sense of geometry than I do. In general, the area of negative dependence seems like a promising area for additional results.
Monday, September 03, 2007
Strangely, this idea (which hit me back in '99, when I came back to Harvard, and which at least one Dean told me was unrealistic) does not seem so wacky today. Harvard has devoted significant new resources to financial aid over the last decade, so college is essentially free for families earning less than $60,000, as outlined here and here.
While some would view this in the mystical light of commitment to the educational mission, social good, etc., I'll be a bit more cynical and actually try to back up why this has been a good idea with an economic argument. More financial aid is just good business. Harvard's cost was limiting its access to the talent pool of graduating seniors; moreover, the resulting debt was at least a potential and probably real barrier to the future success of many graduates. Since Harvard's greatest asset (after its remarkably talented, and modest, faculty, and maybe also its endowment) is its reputation, losing top students because of cost or limiting their success through debt simply cannot be Harvard's optimal strategy. Increasing financial aid just makes good sense, especially given Harvard's resources -- it can take the short-term cost for the long-term good. (The fact that all this coincides with altruism is certainly fortunate, as naturally pretty much all the faculty really do have a mystical commitment to the educational mission, social good, etc.)
This argument, however, doesn't justify making Harvard free for all accepted, as I have proposed. In fact, one might think the other way -- that the rich should be soaked for as much as they can to help pay for the not-rich. My cynical but rational argument for a free Harvard undergraduate education for all is that, if tuition was free, but Harvard then encouraged people to donate what they thought their education was worth -- say, perhaps, a small percentage of their annual income for life -- in the long run, they'd more than make up the tuition loss with increased funding of the endowment (thanks to the power law that is the wealth distribution). This is not a tax, but it is based on the idea that your college education is related to your future earnings, and giving back a percentage of those earnings is an arguably fair way to pay for that education.
Harvard might not even have to wait for the long-term to get the benefit. Indeed, imagine Harvard's next fund-raising campaign beginning with the announcement that after the campaign, as long as their goals were met, Harvard would be free for all undergraduates! What PR! What kind of donations would that bring in immediately!
Longer term, I would suspect the benefit would be even more. The pay-what-you-think-is-fair approach has been tried in some restaurants (see the SAME cafe, or the One World Cafe), and many museums have no fee but "suggested donations" at least part of the time, so this approach is not entirely unprecedented. In Harvard's case, I think the mix of guilt, altruism, and competition would push wealthy alumni (and wealthy parents of students) to give much more. (So you see, I am soaking the rich -- I just think you should wait to soak people until after they are rich, and inspire them to give generously, rather than bill them for a tuition payment.)
There are all sorts of possible side-benefits. It could work out that by moving from a system of tuition to voluntary donations, there would be an immediate jump because the donations, as opposed to tuition, could be made with pre-tax instead of post-tax money. (Note: I am not a tax attorney...) Students not laden with debt may prove more entrepreneurial (which besides helping the economy might lead to bigger donations down the line), or perhaps might take on more public-service oriented employment.
I can see why this might not have been tried elsewhere -- there's a lot of up-front cost while waiting for future payoff. Even for Harvard, perhaps the risk of too many free-riders is too large, although I doubt it. Harvard could also label the plan an experiment, suggesting that after some time tuition would have to be re-instituted if donations didn't keep pace with projections, to limit the risk. Perhaps the biggest negative is if Harvard tried this unilaterally, it would be seen as an unfriendly neighbor to all the other colleges. Still, I can't help but wish this idea would be given a try. Perhaps it could change the way we talk about funding education in this country, moving increased resources to our education system. I'd like to see Harvard lead the way in this regard.
Friday, August 31, 2007
Trace reconstruction has some history in the information theory community, and the first CS-style paper I saw on it was by Batu, Kannan, Khanna, and McGregor in SODA 2004. The main result of this paper dealt with random input X and considered p values that were O(1/log n). It seems to me much more natural for p to be constant, and it has remained an open problem to determine an efficient algorithm for constant p.
I mentioned this problem last time I visited Microsoft, and it seemed to resonate with some of the people there. Thomas Holenstein, Rina Panigrahy, Udi Wieder and I have a submission with several results, including an algorithm that for random X and sufficiently small constant probability p requires only a polynomial number of traces and polynomial time (with high probability).
The SODA 2004 uses a majority voting technique -- the bits are determined sequentially, with each string voting on the next bit. A key idea in our new algorithm is a "smart voting" technique. We only let traces vote if there is good reason (based on the already determined bits) to think that the trace has a good prediction for the subsequent bit. That is, only well-informed strings are allowed to vote. Feel free to make your own political analogies. My intuition is that this smart voting technique is a closer analogue to the full belief propagation (or Bayesian analysis) that we want to do than just majority voting. Because of this, I hope this "smart voting" technique is a general approach that will find other applications.
I don't yet have an analysis of a belief-propagation-based algorithm. Also, currently we can't analyze a maximum-likelihood algorithm, that finds the most likely original string X. I also don't know how to implement maximum-likelihood efficiently in this setting. So there's still plenty of open questions in this area.
Wednesday, August 29, 2007
For example, one statement I found fairly misguided is the following:
Math departments usually believe theI accept the constructive criticism that in computer science we perhaps publish too quickly, and too many incremental things. On the other hand, this conjecture has little to nothing to do with reality. For every Wiles, who spent essentially a decade on what is a very important result, there are probably 100 mathematicians who spent a decade on the problem getting essentially nowhere and publishing what in retrospect are worthless papers. And the implication that TCS is producing only a stream of worthless papers is fundamentally incorrect. The question is really whether the culture should be that a person works only on big problems with the goal of having a very small number of big results (say, 1) over their lifetime, or that a person helps the community make progress through smaller and quicker increments (as well as the occasional larger contribution). Given the important tie between TCS and real-world applications, there's a clear reason why the community has developed with a larger focus on small/quick progress, although as a community we are also supportive of people who want to work the other way as well. The phrasing of the conjecture is clever, but I actually think TCS progresses much better with the culture it has than the one Koblitz favors. (Just like sometimes fast local heuristics are much better than slow exact algorithms.)
Conjecture. For the development of mathematics it is better for someone to publish one excellent paper in n years than n nearly worthless papers in one year.
In certain other fields of science - including, unfortunately, computer science and cryptography - the analogous conjecture, while most likely true, is not widely believed.
Rather than just start a tirade against Koblitz, however, I think we'd be best off dissecting the article and understanding what aspects of the TCS culture has led to his opinions and, in many cases, misperceptions. We may find some things worth changing, either to improve our culture, or at least to present ourselves differently to other sciences.
Tuesday, August 28, 2007
The main purpose of the visit was to talk to people and try to come up with new things to work on. Corporate money seems to synchronize with annual budgets. If you want another check next year, you need to convince them that funding you is still worthwhile. (That's not unusual; I understand if I ever get a DoD-based grant, it will be similar.) I should point out that all research money I've gotten from Cisco and Yahoo is given as a gift -- no requirements (and much less in overhead taken by Harvard!). It's just that it's nice to get these gifts annually, like a research anniversary present, and that means maintaining the relationship.
I like to think that getting corporate money is a simple winning proposition. I get access to interesting problems; I get to work with talented people; and I get research money for doing what I would be doing anyway. But perhaps I'm deluding myself? Perhaps I'd be working on other grander problems if my agenda wasn't influenced by the vision of others? In my own personal case, I really doubt it, and it feels like a pure jackpot to me, but it's something every researcher probably has to think about.
I'd certainly advise that people interested in "practical algorithms" and "algorithm engineering" should seek collaborations with (and funding from!) such industrial sources. In my experience, it takes some work. (But then again, so does writing NSF grants.) Usually the funding comes after a successful collaboration, not before, or there has to be a clear champion in the organization who can make the case that your work is related to something going on at the company. Building such relationships takes time, and doesn't happen overnight. But it's very worthwhile.
To provide evidence, I'll spend some upcoming posts discussing problems I've recently collaborated on with people at Cisco, Microsoft, and Yahoo.
Friday, August 24, 2007
The course is taught at the level of an introductory graduate class, meant for non-theorists as well as theorists. These days, who doesn't need to know randomized algorithms and probabilistic analysis? If you know someone, for example in industry, who might like to take such a course, here's the link to the bare-bones syllabus.
Wednesday, August 22, 2007
In a binary deletion channel, n bits are sent, and the channel deletes each bit independently with probability p. So, for example, the message sent might be 00110011 and the received message could be 010011 if the 2nd and 4th bits were deleted. Now suppose the same message is sent through k independent binary deletion channels, and the receiver sees all of results. Can we say anything useful here? The problem is automatically more challenging since we only have bounds and don't even know the capacity of the standard deletion channel (when k is 1). This is yet another simply stated question from the theory of coding for deletion channels in need of an idea.
Monday, August 20, 2007
Please go to http://focs2007.org/
Note that the early registration deadline is September 20, and the deadline for reserving a room at the hotel at the conference rate is also September 20. (The hotel's regular rate is much higher.)
The Knuth Prize lecture will be given by Nancy Lynch on October 21.
A program of tutorials takes place on October 20, the Saturday before the regular conference program begins. The tutorial talks are as follows:
Terrence Tao : Combinatorial Number Theory
Dan Boneh : Recent Developments in Cryptography
Daniel Spielman : Theory and Applications of Graph Spectra
Abstracts for the tutorials should be available soon.
Thursday, August 16, 2007
We request submissions of position papers that propose new directions of research, advocate nontraditional approaches to old (or new) ideas, or generate insightful discussion... As a venue for exploring new ideas, HotOS encourages contributions influenced by other fields such as hardware design, networking, economics, social organizations, biological systems, and the impact of compiler developments on systems and vice versa. We particularly look for position papers containing highly original ideas.Submissions were just due for the sixth HotNets workshop. Here is the HotNets mission statement:
The Workshop on Hot Topics in Networks (HotNets) was created in 2002 to discuss early-stage, creative networking research and to debate positions that reflect on the research direction and needs of the broad networking community. Architecture, high-level design work, and positions that may shape long-term research direction are especially welcome. HotNets is structured to work in synergy with conferences such as SIGCOMM by providing a venue in which innovative work may receive feedback to help it mature into conference papers or otherwise have a long-term impact on the community. To fulfill these goals HotNets calls for short position papers that argue a thoughtful point-of-view rather than full-length conference papers, and maintains a broad and diverse scope.Does theory need a Hot Workshop? A workshop specifically designed for somewhat wacky or not-fully-baked ideas that may turn into long-range research directions?