Wednesday, April 22, 2009

Two Talks: Savage and Upfal

Today was a busy day with talks.

Stefan Savage of UCSD gave a talk at Harvard's Center for Research on Computation and Society on his work on Spamalytics, which is really about the economics of botnets. Essentially, they "infiltrated" a botnet in order to (mostly passively) monitor its behavior and learn what these networks actually do and what their economic potential is. It's fascinating both from an engineering perspective (how do you do it) and an economics perspective (what do you learn about the behavior of the participants), and should guide both anti-bot technical efforts and policy. While it's not clear to me there's much in the way of "science" in the pure sense of the word in this work, I liked Stefan's analogizing this work to anthropology: the goal here is to study what's going on and learn the relationships among the actors.

I look forward to cornering Stefan sometime and hearing more about the "issues" that arose with this work -- somehow, the FBI kept coming up at points in his talk, but he didn't really have time for the details. (He did seem to state they're more in touch with the FBI now than initially -- to help make sure the FBI doesn't mistake them for the botnet!) What's interesting to my mind is that there must be more projects like this, where the government powers-that-be would both like and benefit from active research into the misuse of computer systems and related computer security problems. How can that cooperation be fostered in a way that maintains academic goals like publication and dissemination of the knowledge learned? I'm not sure it works by government agencies initiating the project; it seems it would have to start the other way around, as this project did. But I don't envy the time Stefan (or his team) must have spent with lawyers making sure they weren't breaking the law, given that they weren't working under government supervision.

[The question I didn't get to ask Stefan: what grant do you use to cover "legal expenses" for projects like this? Can that be an NSF line item, or did the corporate donations cover that part?]

The second talk of the day was Eli Upfal visiting MIT to talk about his work on multi-armed bandit problems (see the paper list here). His variations were all nicely motivated by related problems for search engines, specifically matching ads to web pages. (I recall hearing about these motivations when we were both visiting Yahoo! Research, so they resonated with me.) The variations include when the bandits are mapped to a metric space and their value satisfies a Lipschitz condition, when a bandit's value can change over time (specifically, the mean changes according to a Brownian motion process), and when the useful lifetime of a bandit is given by a stochastic distribution. The talk was at the opposite extreme from Stefan's -- very theoretical, with a focus on both upper and lower bounds and the techniques behind them. I had thought of multi-armed bandits as a fairly well-mined area of research, so it was interesting to see multiple novel, well-motivated examples -- suggesting there are plenty more interesting questions left in this area.
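
For readers who haven't seen the basic problem, here is a minimal Python sketch of the classical stochastic multi-armed bandit, played with the standard UCB1 index rule. It is not any of the variations from the talk (no metric space, no drifting means, no arm lifetimes), and the function name `ucb1`, the arm means, and the Bernoulli "click" rewards are all assumptions made for illustration.

```python
import math
import random


def ucb1(arm_means, rounds, seed=0):
    """Play `rounds` pulls of Bernoulli arms using the UCB1 index rule.

    `arm_means` are hypothetical click-through rates (illustrative only);
    the learner never reads them directly, it only sees sampled rewards.
    Returns (pull counts per arm, total reward collected).
    """
    rng = random.Random(seed)
    k = len(arm_means)
    counts = [0] * k   # number of times each arm has been pulled
    sums = [0.0] * k   # total reward observed from each arm
    total = 0.0

    for t in range(1, rounds + 1):
        if t <= k:
            # Initialization: pull every arm once so each has an estimate.
            arm = t - 1
        else:
            # Pick the arm with the largest empirical mean plus confidence bonus.
            arm = max(
                range(k),
                key=lambda i: sums[i] / counts[i]
                + math.sqrt(2.0 * math.log(t) / counts[i]),
            )
        # Bernoulli reward: 1 with probability arm_means[arm], else 0.
        reward = 1.0 if rng.random() < arm_means[arm] else 0.0
        counts[arm] += 1
        sums[arm] += reward
        total += reward

    return counts, total
```

Calling something like `ucb1([0.1, 0.9], 2000)` should show the pulls concentrating on the better arm, with the weaker arm sampled only occasionally. The variations described above change what "better" means over space or time, which is where a fixed index rule like this one stops being enough.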


Stefan Savage said...

FWIW, we've long collaborated with two lawyers (one at UCSD and one at Berkeley). One of them has always been on our budgets in this space (NSF, DoD, etc) both to make sure we manage legal risk and to help be the conduit for informing legal thought wrt public policy regarding security research.
The other lawyer is a legal scholar who has checked some of our thinking as a favor.

Stefan Savage said...

"While it's not clear to me there's much in the way of 'science' in the pure sense of the word in this work, I liked Stefan's analogizing this work to anthropology: the goal here is to study what's going on and learn the relationships among the actors."

Hey, hey, hey... those be fightin' wurds! You just said we're not science and then defined science :-) Seriously, I think there is frequently a disconnect where people think of science as something that requires heavy formalism and/or studies entirely deterministic processes. This just seems wrong. Indeed, if you go back to traditional definitions of science (e.g., gathering knowledge through the scientific method) our current work is a far better fit to the label than most work in computer science (and certainly most of my work to date, which has been focused on engineering).

Consider this. We have an unknown phenomenon, we develop interactive and observational experiments to measure it, we develop hypotheses about the process generating the phenomenon, and then attempt to generalize, retest and validate.

If we were studying quarks or protein networks it would clearly be science, but we're studying a social phenomenon. What changed about the science?

Michael Mitzenmacher said...

Hey Stefan. I was also going to call the work "soft science", but figured that would be equally offensive to you, and would further offend any soft scientists reading. :)

My differentiation is not on formalism or deterministic processes, as you suggest. My differentiation is between studying abstract mathematical processes, processes found in nature, and human processes. In this respect, many (or, in an extreme view, all) human processes are inherently ephemeral, and subject to change, in ways that both the laws of mathematics and the laws of the universe are not. (With biology, the changes happen on a time scale long enough that I wouldn't call them ephemeral. I do know that many would argue that mathematics itself is also not pure science, and focus only on the laws of nature part, but naturally my bias is that the laws of mathematics are as much the laws of nature as anything tangible in the real world.)

Your work is studying specific behaviors on a system that will probably change dramatically in our lifetime. (20 years ago, you couldn't buy Viagra on the Internet -- there was neither commercial Viagra, nor the commercial Internet as we know it. 20 years from now, will the economics of botnets look the same?) That's not to say that the work is uninteresting -- it's VERY interesting, and can have a huge impact -- nor is it to say that you didn't tackle the problem using scientific methods and principles. My meaning was what I think most people associate with "pure science" -- learning the laws of the universe or of nature -- and that is the sense in which you should read my phrasing. It's not meant to be a positive or negative, but an observation.

Besides, I don't want to start a fight. :)