Wednesday, November 28, 2007

NSF FIND Working Meeting

I'm writing this while stuck at attending an NSF FIND (Future Internet Design) working meeting (Nov 27-28). Part of the FIND program is that there are 3 meetings per year where all the FIND PIs are supposed to get together, to talk about the new network that is the goal of the FIND program. This is my first meeting; I missed the last one. It seems to me that this idea of having PI meetings related to grant proposals is an NSF experiment worth examining, and there are some discussions going on here that may be of broader interest, so I thought I'd report. (If this reads as stream of consciousness, I'm generally typing as things go on.)

One positive for me is the chance to catch up with various networking people who I don't see so regularly. The socializing aspect -- just talking to people, finding out what they're working on, making sure they know Harvard is hiring, all this is good. (Jennifer Rexford and I chatted during a break, and when I sat down after the break, she had already sent me a paper to look at. Which I need to go ask her some questions about... I also finally caught up on some overdue researchy discussion with Dev Shah of MIT; yes, I have to go to DC to meet up with someone from MIT...) In this sense, it's like going to a networking conference, and obviously interacting with peers is one of the pleasures of a conference.

The invited presentations aren't doing much for me, however. They're not highly technical talks -- more high-level talks, connecting network research to network users. (We're hearing two talks right now about issues in the practice of emergency networks in disasters.) It's not that the talks are bad, but it's a bit hard to see their purpose. This is a room full of networking people -- they all have well-developed individual research agendas and know what they think the problems are. I don't think the talks are adding much new. (Let's just say I'm not the only one tapping away at the keyboard when I should be listening to the talk. So again, it's just like a conference.)

Besides invited presentations, there are break-out sessions. I'm sitting in on the "network management" breakout session, which far and away seems the largest. About 25-30 people or so in here. It's actually kind of like a conference panel, only a bit more chaotic with so many people. While the discussion is high-level, and I'm not sure we're headed anyplace specific, it's actually a pretty interesting discussion that highlights the scope of the problems of network management. (So many requirements, so many heterogeneous user classes, so many ill-defined goals, so many ill-defined or at least hard to measure quantities, etc.) Interestingly, the group as a whole seems quite convinced of the importance of what I call theory -- modeling, control, algorithms. (The pro-theory comments seemed to begin from Jennifer Rexford and Anja Feldmann, two incredibly pro-theory and powerful network researchers.) This gives me a good excuse to chime in and further emphasize the importance of theory to network management, which I manage to do repeatedly.

Ellen Zegura is giving a talk about GENI and the GENI Science Council. The GENI Science Council is supposed to come up with a "research plan" for GENI, an experimental facility/infrastructure for the development of the next generation Internet. (Network people rightfully complain it's hard to do experimental research on the current Internet at a large scale, so the plan is to design a research infrastructure for these sorts of problems.) There's a notable shortage of theorists on the council, but they seem to have noticed this and added Joan Feigenbaum and Michael Kearns recently. (From my perspective, if you're going to design a whole new network infrastructure, it seems to me you'd want some theorists to help make sure it's done right.) They're still putting together the research agenda that goes along with the planned facility, and they're still looking for input. Essentially, it seems like they're looking for research examples and research experiments that would utilize or benefit from a GENI-scale system, so that they can explain clearly why building a GENI-scale system is a good idea. So if you have ideas for experiments that you'd like to do on a network-wide scale, you might start looking at what GENI's up to. Overall, GENI seems like such a huge project, and it's still a bit amorphous at this point, so that people are a bit confused. Or maybe that's just me. (I did have to get up at 4 am to catch the early plane down here.)

I talked with Ellen later, and got clarification of the GENI state. I had thought this was well on its way to becoming a possibly multi-hundred-million dollar new NSF project, but apparently it's much less further along than I had thought. If GENI is going to happen, it apparently needs to regain some momentum. And GENI seems a key component of the FIND program; it's hard to develop next-generation network architectures if you don't have a network to try them out on.

Before dinner they have a poster session. It's well done for a poster session -- they have food and a cash bar, and food and drink always seem to be requirements for a poster session where people actually stay and look around. There's a nice mix of work; my favorite is an idea multiple groups seem to be working on that there shouldn't be just one network architecture, but multiple network architectures running over the same network substrate. My interpretation is that you have routers/other infrastructure that can run multiple architectures efficiently in parallel, so running side by side you might have our standard Internet with another network with much stronger (or weaker!) quality of service guarantees, or another network where connections are based on virtual circuits, and so on. This makes sense if you believe that a one-size-fits-all Internet is no longer the best idea.

Day 2 is mostly having the breakout session coalesce their ideas into short presentations. Pablo Rodriguez, of fame for example for his work on Avalanche/network coding at Microsoft, gave a nice talk about the economics of P2P and ISPs; he's left Microsoft to work for Telefonica Barcelona, and has developed some insight from the ISP point of view.

Overall, I ended up with mixed feelings about this working meeting. It's useful to get people together to find out about what everyone is doing and to talk about high-level issues. Conferences already serve some of this purpose, though. Some of the value I got from this meeting derives from the fact I haven't been to a networking conference for a while (I didn't feel like traveling to Japan for SIGCOMM this year...). High-level issues don't necessarily get discussed at conferences, but it's also hard to get 50+ people together to talk about high-level issues and get any sort of agreement that wasn't already obvious. I'm skeptical that 3 such meetings are needed each year. However, the plan seems to be for the next meeting to be a graduate student focused meeting, which may be a good idea -- getting graduate students together to meet and talk like this seems like a potentially very interesting idea.

Going to the meeting will pay off, however, if it leads to any new collaborations. If any FIND people are reading this, and you want to add some theory to your work, send me an e-mail. (Of course, you could also go down the hall and talk to the theorist(s) in your locale, which I'd really strongly recommend, unless I'd be interested, in which case try me first!)

Monday, November 26, 2007

Introducing New Scientific Ideas, and D-Wave

Apropos of my last post on Digital Fountain is Scott's Aaronson's current report on D-Wave researchers giving talks about what's behind the hype of their building of quantum computers. For those who don't know the story, the short summary is that D-Wave Systems has claimed non-trivial advancements in the development of quantum computers, and many (most? all?) academic quantum scientists aren't buying it, based on the very limited evidence they've seen.

Why does this story remind me of Digital Fountain? I recall the early, early days going around giving talks about Tornado codes and their potential uses in networks, and in the beginning, the greatest skeptics were the coding theorists. One talk in particular, I remember, early on a very senior coding theorist insisted repeatedly that highly optimized Reed-Solomon codes could do anything we were claiming. The rest of the talk was a bit more difficult to get through after that argument. (I like to think by the end of the talk he thought that maybe there might be something new there...)

So is this to say that I think the academic crowd is being too hard on D-Wave? Absolutely not. It sounds like they're doing their job -- trying to figure out what's going on and asking tough questions. The point of my parallel is that the coding theorists had the right to be skeptical. (One reason for it was there was a huge language barrier, me not knowing what a "decibel" was, and them not really getting why linear vs. quadratic complexity was such a big deal. The divide being coding theorists and CS theorists has shrunk an awful lot this past decade.) It was our job to convince them that we had the goods. And we provided evidence with equations, papers, talks, and demos. While they took some convincing, the coding theorists listened to the evidence, and rapidly came around.

Here, the story seems to be (my interpretation!) that D-Wave is presenting painfully unconvincing demos and releasing not nearly enough information for experts to see if they have any new ideas. Then they're compounding this by excessive PR overstating what might eventually be possible with quantum computing. (There seem to be some arguments whether they themselves are doing the overstating, or uninformed journalists, in which case D-Wave is just failing to correct the press, but whatever.) Their excuse for this is that they're a private company so they have to keep their secrets. Understood! But they shouldn't then be surprised when knowledgeable academics studying the issue voice their opinion that the emperor has no clothes. I'm quite sure the scientific community has enough open-minded people that if they start producing convincing evidence, or give enough details on what they're doing so people can verify if the path they're taking is productive, they'll be understood and praised. Until then, the feedback they're getting is rightly telling them they have more to do.

Sunday, November 25, 2007

What's Up, Digital Fountain?

Every now and again, I get asked about Digital Fountain, the startup that initially grew out of work I was involved with on Tornado codes, a flavor of low-density parity-check codes. (The current codes being used are apparently based on Amin Shokrollahi's improved variant, Raptor codes -- the page has a link to the Trans. on Inf. Theory paper.) Unfortunately, I've been disconnected from Digital Fountain since the tech bust of 2000 or so, so I'm no longer privy to their internal workings. But the company indeed lives on. Every once in a while, I check for news, and I recently saw this positive piece on ZDNet, complete with a recent talk/demo from the fall Demo conference. Digital Fountain is making a push to get into IPTV.

So maybe someday I'll be watching IPTV using a Digital-Fountain enabled product.

Wednesday, November 21, 2007

Page Limits on Journal Papers

I know of at least two journals that I publish in that have page limits on the length of papers. I am completely puzzled by this. A journal paper should be as long as it needs to be to derive and explain the results the author intends to convey, since the paper should be the final record on that piece of research. Some papers might need 20 pages; others might be too long even at 10. A hard rule of say 12 pages maximum just doesn't make sense.

I can imagine two reasons for instituting page limits on journal papers: paper costs, and reviewer time. Neither seems particularly compelling to me. Are there other reasons I'm missing? (Notice that both of these are more compelling in the case of conferences, where I would agree page limits make more sense. And even in that setting, papers costs is rapidly disappearing as a reason, and at some point, the question of page limits should probably be reconsidered.)

I do admit that there are many papers where the author should be forced to edit things down further, and most authors have a tendency to write too much rather than too little. But it seems the right answer to this is to have good reviewers and a good editor firmly tell an author that the paper is not yet ready and needs further editing. A page limit doesn't adequately solve this problem and raises new, more unpleasant ones.

Anyone want to defend this practice? Or at least suggest reasons for it I haven't thought of?

Sunday, November 18, 2007

David Parkes talk Monday, 4pm

David Parkes will be giving a talk on computational mechanism design in Maxwell-Dworkin G115 at 4pm on the 19th. This is what at Harvard we call the "tenure talk" -- it's a talk he gives giving an overview of his research and research area as he's coming up for tenure. If you're a theorist in the area and want to hear a perhaps slightly different point of view on mechanism design and work at the intersection of economics/cs (David proves things, but he comes from the AI side), you should come. Heck, if you're local, in any area, and interested in the economics/cs interface you should come. It should be a very nice talk, and it would be great to have the room full to bursting as a sign of support of David.

Friday, November 16, 2007

Should IP issues affect our research?

Since I've done some legal consulting, I've become interested in intellectual property (IP) issues, and the question of whether IP issues should affect my research agenda has recently occurred to me.

Let me explain. As an algorithmist, a major interest for me is designing algorithms that people will actually use -- perhaps not in the original form that I describe them in, of course, but that the ideas make it out into the real world. And while it is not my aim that every project I'm involved in should have a clear potential practical impact, it is the case that this goal guides my overall research agenda quite significantly.

IP issues have the potential to limit the practical impact of an algorithm I design. For example, suppose someone has a patent on SuperHashSchemeX (SHSX). (If you want something more specific, suppose Bloom filters were patented.) I might come up to an improvement to SHSX, or a better analysis of SHSX, but if the patent is written well, the patent owner may have rights controlling who could use SHSX, including my improvement. There may really be no hope that anyone else will actually be able to make use of my ideas, until the patent runs out, or unless the user is willing to buy a license. Indeed, unless I go through the process of obtaining a patent, it's possible that my work might simply add value for the initial patent-holder, and I've actually rewarded those who are keeping my work from being used.

Perhaps this wouldn't bother me if my theory blood ran pure, but I like being an "applied theorist". In my mind, this affects the calculus I use in thinking about what I should work on.

Should it? Has anyone actually run into this problem?

Monday, November 12, 2007

An Experiment in Viral Marketing : Buying Tickets Online

(Warning: the following post is highly crass and commercial. If that offends you in any way, please don't read on.)

Thanks to some friends who are working in Web marketing, I have a chance to try a viral marketing experiment right here on the blog. I have been give some discount codes for a new ticket site -- -- which offers tickets for sports/concerts/other events. Anyone can use them, at least until 1/1/08, so you can use them, or tell a friend to use them, or have a friend tell a friend, or whatever. According to that whole 6 degrees of separation thing, it's possible that everyone in the world could learn my discount codes, which by the way are:

MMBLOG1 [$5 off any ticket purchase]
MMBLOG2 [5% off any ticket purchase]
[Yes, you can only use one of them...]

To be clear, I am officially acting as a consultant for the SEM company that is trying to drive traffic to TicketHeroes. I am not getting a direct cut back from sales using these codes. (Though if this experiment is hugely successful, perhaps I'll try to negotiate one!) But I will get back data -- I'll get to see how many sales used the codes. (Don't worry, I won't get your credit card number.) I find this all very interesting. Can this little blog, focusing primarily on academic topics, help sell things? Really? I kind of doubt it, but hey, I remember thinking around the year 1999 that it would be a really impressive thing if Google became a 1 billion dollar company, so clearly I don't know as much as I should about marketing.

I've done my own checks and the site's prices look to be slightly less than or the same as other similar sites, even without the discount. So this should be a win for anyone looking for tickets, although of course you shouldn't trust me, you should compare.

Next post, we'll return to less crass, commercial topics, like how to re-finance your mortgage.

Saturday, November 10, 2007

Thursday, November 08, 2007

Service and the NSF

I recently returned from an NSF (National Science Foundation) review panel. (Of course, I'm not allowed to say when, or for what.) On the way back, at the airport, I ran into a more senior colleague (from an entirely different field -- not EECS) coming back on the same flight. It came up that he had never served on an NSF review panel. He admittedly seemed a bit sheepish about it, but offered the excuse that there were more important things to do, and I will give him the benefit of the doubt and assume that he meant other service activities, not research and such. Essentially, his argument seemed to be that NSF panels were a waste of his time.

Having served on a half dozen or so NSF review panels, I have some sympathy for this point of view. In particular, being anti-travel, I'm not sure why we have to fly in for a two-day meeting to make decisions; it's expensive for the NSF and time-consuming for me. (It's not clear to me that decisions are any better because of the face-to-face meeting. Indeed, arguably, they can be worse, depending on the group interaction. But government process is government process, so unlike for PCs, not having the face-to-face meeting is not currently an option...)

However, despite the time, I've tried to make myself available when the calls for NSF panels go out, because I've always figured that if the NSF is paying my summer salary (which, in large part, they do), they have the right to ask for some of my time. Indeed, my gut reaction (without working through the arguments) is that it's objectionable that they don't actually require people who have received funding to serve on a review panel at some point during the term of their grant, though I suppose the implementation of that rule could become unpleasant. In short, my moral interpretation is that by taking their money, I'm part of their system, so when they come calling, I should respond, even without an explicit quid pro quo.

Even if one does not subscribe to this implicit understanding, I think it's important for us individually and as a community (including non-academics who don't get NSF funding directly) to do our service for the NSF, particularly in the way of review panels. In general, citizens should be keeping an eye out on how the government uses their money, and in this specific case, as scientists, we should be paying especially close attention to how the government is distributing research money to scientists in our own and related fields, and we should be attempting to influence the process to move in the right directions as much as possible. The panels give us some opportunity to influence the process, and these otherwise near-useless face-to-face meetings give us some opportunity to influence the NSF administrators who shape and make the decisions that affect us all.

So, with all due respect to my senior colleague, and any of you other slackers out there, get over yourselves, and go serve on an NSF panel every other year or so.

Monday, November 05, 2007

Graduate Students, Socialize!

The discussion that in my mind connected breadth requirements to social networks reminded me of my time as a graduate student at Berkeley, where there were lots of informal social opportunities for graduate students to meet and talk. The CS grad student association had a weekly bagel/donut hour that was always well attended (mmmm...donuts....); the theory students arranged a Friday afternoon seminar (I believe called TGIF) where theory students would present something (a practice talk, some ongoing research, a paper they liked), no faculty allowed; there was a regular Friday pre-weekend grad student get-together that would be at a semi-randomly chosen local bar (usually upwards of a dozen or so people would show, with a high percentage of theorists, who didn't have to spend the weekend coding or running jobs); a grad student poker game broke out once a month or so (again, with a high percentage of theorists, who'd add complexity to the games).

Besides helping create a more pleasant graduate student atmosphere and experience, these activities let graduate students get to know more about each other and what they were doing. And I believe expanding your own work-related social network pays dividends in the long run.

I don't know what the current state of graduate student life is like these days, but if you don't think there's enough of this sort of stuff where you are, I encourage you, take action! (Don't wait for the department to do it; you'll do it better anyhow.) Try to get a grad student poker game going. Or a biweekly pub night. Or start an informal seminar. Or an open problems session. Or whatever it is you want to do, where graduate students can just hang out together, without faculty, with it being about fun, instead of or combined with work. Besides having more fun, you'll open up more opportunities for the serendipitous events in your work-life that lead to exciting projects.

Friday, November 02, 2007

Breadth Requirements

In our department we're looking at the class requirements for Ph.D. students, and in particular breadth requirements.

I'll state clearly that my opinion is that breadth requirements are a good thing. Breadth requirements for Ph.D. students are like vegetables for kids -- they don't always like them, but they're good for them. When I was at Berkeley the breadth requirements were reasonably onerous and I think they were very positive for me (even if, as contemporaries will suggest, it seemed like I slept through most of them). I especially think it's good for theoretical people to have a background in many areas of computer science outside theory, and classes are arguably the best way to get that background. And certainly I would argue that people in systems need some theory in their background.

I'd enjoy hearing arguments from the other side. You can argue that breadth requirements in general are a bad idea, or specifically that theory people shouldn't have to take classes in other CS areas (they should be allowed to just do all theory!), or that systems people shouldn't have to take a theory class. Any EE readers or readers from other areas should chime in with arguments based on their experiences as well!