Monday, July 14, 2008

A BIG Theory Conference

Lance talks about TCS's lack of a big conference after attending GAMES. (I recently talked about something similar with regard to ISIT.)

I'm in favor of a BIG cs theory conference, but I don't see a clear path to it. One way of doing it would be to have a mini-FCRC especially for theory, with 6+ conferences/workshops together over a weeklong period. I'd imagine there'd be some small, hopefully nominal fee to allow access to everything, and lots of administration issues. (A CD with all papers would be nice, for instance.) Some conferences have moved this way-- ICALP, for sure, and even SODA has ALENEX and ANALCO set up before it regularly -- but nobody seems to have tried to set up something like this. One possible advantage of this approach is that it could reduce the "quality control" concerns that generally arise for larger conferences.

So what would you like to see strung together into a theory-super-conference? A summer conference including SPAA, PODC, STOC, SOCG, EC, and a few other things sounds like fun to me...

Sunday, July 06, 2008

ISIT this week

While many theory CS people will be spending the week at ICALP (and
adjacent workshops and such), the information theory people will be
spending the week at ISIT. No wonder these communities don't get
together as much as they should -- conferences are cross-scheduled!
CS theory will be well represented at ISIT, however, as Avi Wigderson
is one of the plenary speakers.

ISIT is bigger than, well, any theory CS conference I know, because it
is the major IT conference each year. CS theory has nothing
like it, with more, smaller conferences throughout the year, and (it
seems to me) many more specialized conferences and workshops. So here are some things I think worth observing for theory CS people, just to think about how we do things, and if we'd want to change:

1) ISIT goes for a week. One day of tutorials, 5 days of talks, with
4 parallel sessions, and a plenary each day. So naturally, most everyone
comes. (Um, no, I'm not going this year. That new baby thing...)

2) There's specific time for the Board of Governors of the IEEE
Information Society to meet and do their business. A benefit of a
conference where most everyone comes is that having "business meetings" like this seems easy to set up. Similarly, there's a large Awards Luncheon, and most people are there to pick up their awards, and see/hear the award-winners.

3) There are several activities especially for students. Roundtable
discussions, a panel led by the student committee, and a panel on
balancing career and personal life (see here and here for recent related posts on Sorelle's blog on that theme). Generally these are done over lunch, and lunch is provided for the students. (Never underestimate how well students respond to free food.)

4) Part of the tradeoff in establishing a large conference is that a higher
percentage of papers are accepted. And there's an understanding that individuals are not supposed to submit large numbers of papers, as large-scale participation is one of the goals. (This used to be explicit in the call -- something about multiple papers from an author being subject to more scrutiny -- but I don't see it in this year's call.)

As a relative outsider, I enjoy the ISIT setup. There's one
conference I know I can send my IT papers too; when I go, there's a
chance to see everyone, though there is sometimes the challenge of
tracking them down and scheduling a meet. There's a clear and strong
sense of community at the conference, despite the size.

I'm not trying to say that CS theory doesn't have a community-feeling.
But it does feel like the CS communities tend to partition themselves
more into loosely overlapping subcommunities. There's no universal
conference, although perhaps SODA (moreso than FOCS/STOC, based on sheer size) comes closest. I wonder, sometimes, what we as a community lose from this.

Friday, July 04, 2008

Zittrain on Colbert

So I'm a little late with this news (hey, I was out of the country), but Jonathan Zittrain was on Colbert plugging his book The Future of the Internet and How to Stop It. (If you Google Zittrain and Colbert, you'll find links...) It's a pretty tech-laden interview for the mainstream, in my opinion.

This is relevant (at least to me) because I've known Zittrain for years -- we interned at Microsoft at the same time back in college, and bumped into each other again at Harvard (where he was a law professor, before moving to Oxford). I'm now officially humbled -- someone I know, approximately sort-of in my field, has appeared on Colbert. (OK, I admit, I'd personally rather go on the Daily Show, but still...)

Wednesday, July 02, 2008

Quick Notes from SWAT

1. I'm at SWAT, but don't have much conference news to report; I arrived right around lunch, and went to my room for a post-red-eye nap.

In sort-of-conference news, the hotel is great, Gothenburg is a beautiful city, the organizers seem on top of everything, and I had a very nice dinner with the PC/Steering committee folks. Topics ranged from the difficulty and cost of running a conference, to the "Bell Curve" and IQ, to why people live in Sweden but work in Denmark. And a few technical research-y ideas as well. A nice day. Boy, I hope I don't mess up my talk tomorrow.

2. A place I consult switched to a VoIP phone system requiring me to install phone software on my Mac and gave me a USB headphone set. The upside -- I can call home from Sweden on the wireless! Nice. Yes, I could also use Skype, but calling on a regular phone has its advantages. Yes, I know I'm late to join the 21st century.

3. Following the complexity blog, let me welcome Sorelle to the list of current bloggers. She's a CS theory graduate student who's just started!

Monday, June 30, 2008

Strange Transactions on Information Theory Policies

The journal where I've published the most now is the IEEE Transactions on Information Theory. (Probably because when I write any journal article in information theory, I send it there, while my other work is more scattered.) So I hope it's clear that, as a journal, I hold it in high regard. That being said, they have some strange policies that I just don't understand. (If you're not familiar with the journal and want to check what I'm saying, their submission guidelines are here.)

First and foremost, they have a distinction between "regular papers" and "correspondence". As they briefly make unclear, "a correspondence item will have one point to make briefly and without adornment, whereas a regular paper will be a more well-rounded treatment of a problem area." In the old days, as far as I can tell, this distinction didn't matter much, and roughly the distinction boiled down to length -- for me, long papers were regular papers, shorter papers were correspondence. Perhaps in the IT community there was more prestige for a regular paper, but I've never been aware of it -- at least in my mind citation count would give a better idea of a paper's importance than its nominal type -- and never cared.

This strangeness didn't become important until recently, when to deal with the apparent problem that the Transactions was becoming too big and thereby using too much paper [yes, I know this shouldn't be a problem, but they still send out lots of paper copies], they limited correspondence items to 5 journal pages.

This, to me, is just strange. First, I dislike the idea of page limits in general for journals. The journal is supposed to be the "permanent record" of a scientific result. If it takes 50 pages to present the result, then that's what it takes. A journal certainly has a right (indeed, perhaps a duty) to have authors properly edit a work so that there's not a gross amount of wasted verbiage, but to set a priori page limits, in my mind, goes against the purported purpose of journal publication.

Second, I don't see how this page limit jibes with this (already strange) regular paper/correspondence distinction. Sometimes, presenting a single point briefly takes more than 5 pages. (Arguably, it often takes this long, if one wants to do a suitable job presenting previous work, or substantial mathematical setup must be done.)

As an example, I submitted a paper to ToIT. I thought it was clearly a correspondence, according to their definition, since it really covered a single thematic issue. Given the policy, I wasted at least a day cutting things from my version to make it a (just over) 5 page submission version. (I had tables and a picture -- it really was a hard exercise to cut that last page.)

The reviews came back, and most of the complaints were questions of the form, "Why don't you talk about Z?", where in fact my discussion of Z was something I took out to get things down to 5 pages. (Duh.) The editor asked the editor-in-chief -- no exceptions to the 5 page rule.

So the editor, smartly, decided that I had a 6-page regular paper. While I appreciated the solution, the whole thing just highlighted the silliness of the policy, and perhaps the strangeness in distinguishing "regular papers" and "correspondence" in the first place.

Another policy I object to is their decision not to publish already-published conference articles unless there is "substantially new" material. (This is not spelled out, but I have been told the rough figure is "30% of the article must be new", whatever that means.) Again, in my mind, this goes against the purported purpose of a journal as the final record of a research result. Conference papers are, in my mind, dramatically more likely to be wrong, and at the very least have generally had little feedback in the reviewing process. A journal is supposed to provide more in terms of reviewing; if a conference paper and journal paper ended up identical, I'd have no problem with it -- and I'd congratulate the author on writing such a solid conference version (assuming I didn't question the reviewing practices of the journal).

I can understand this policy in the context of copyright. An IEEE journal can't (re-)publish something another organization has the copyright to, and an author should understand this. But when the paper was in an IEEE conference -- what's the problem?

These two policies interact in a very strange way in the context of the IEEE International Symposium on Information Theory (ISIT) -- the big IT conference. ISIT papers are -- get this -- 5 pages. One might naturally hope that ISIT papers could be turned right over into Transactions correspondences -- and get the benefit of more complete reviewing and editing for that "final version". But not, it seems, under the current policy that the journal version must have something "new" over the conference version. But I can't add anything new to my 5 page conference version without going over that arbitrary 5 page correspondence limit. Am I the only one that sees something wrong with that picture? (And yes, I have ISIT papers that I won't bother submitting to ToIT, because they're really correspondences that should/would be 6-7 pages once I add that 30% new stuff in.) If they're going to not publish things from conferences, even IEEE conferences, then having a 5 page ISIT limit and a 5 page correspondence limit is, in my mind, just strange.

Monday, June 23, 2008

Yet Another Survey... Deletion Codes

Next week I'll be at SWAT for an invited talk. When I gave them a choice of topics, they went with: A Survey of Results for Deletion Channels and Related Synchronization Channels. I've actually given this talk a few other places, but there's been some new stuff, so it's been updated a bit. (I'll put the latest version of the slides up on my talks page after the talk...)

The invitation gave me an excuse to buckle down and finish something I started over a year ago -- a survey article on results for the capacity of the deletion channel (and related problems) to go with the talk. I must be out of practice writing, because it's taken me most of the month to get this thing done. (Well, OK, there was that whole "baby incident" that took up some time, but still, I've clearly been doing far too much administration this last year, and I need to shake off the rust.) It's probably still a bit rough so anyone interested enough to read it, please feel free to mail corrections/suggestions directly.

Apparently, based on some mild teasing from a colleague and friend, I'm known as someone who writes survey articles. (The teasing was perhaps I'm known only as someone who writes survey articles...) I don't know why I end up writing them -- I can't really say it's a fun exercise to write a survey, though it is often educational -- but somehow I feel it's valuable. In this case, there's definitely the motivation that I've done a lot of work in this area, and I want to both promote the area and my work. The survey has a couple of dozen open problems spelled out right there, so let that be motivation for some of you to read it...

Also, I'm open on suggestions on where to publish it. My last few surveys have nicely been able to go to Internet Mathematics (see here for Bloom filters and here for power laws/lognormal distributions-- although the "uncensored" version of the power law survey from my publications page is more fun) so that they've had a worthwhile permanent home. This survey won't quite fit there, and I think it's good for such things to have a visible outlet (besides, say, arxiv). Ostensibly Transactions on Information Theory will take surveys, although I haven't seen that much in practice; and this survey doesn't really fit in the Foundations and Trends model. (I like the Foundations and Trends model, but it seems to me their surveys are like "mini-textbooks" -- and cost accordingly -- while my surveys are more like "here's some background, big ideas, and pointers to papers" -- and I'd like them to be essentially free.)

Wednesday, June 18, 2008

CS Family Values

One aspect of graduate school at Berkeley I recall quite clearly was the lack of children. Graduate students having children was and is generally quite rare, and I'm not sure that CS at Berkeley was much worse in that regard than anywhere else -- I'd be interested if readers have any pointers to stats on the issue, by field and/or by school. But it struck me even then that it seemed unusual for the faculty to have kids, particularly in the theory group.

I know there's the old advice -- still apparently prevalent in some circles -- not to have children until you get tenure. Though that advice seems less widespread these days, or perhaps just more young faculty are choosing to ignore it. (Or, perhaps, I'm just out of touch -- at Harvard, at least, the common case is for CS junior faculty to have kids.) [And apparently I may also be suffering sex bias; see this summary and this pointer to an NSF report, suggesting that the effect on tenure chances for men having children is small, but is much larger for women.]

Does having kids help or hurt one's work? I don't think there's a clear answer, that it probably depends on the individual (and that the effect, in any case, is probably overestimated). On the other hand, I think a career framework that pushes people to postpone or not have children is ultimately cutting off a substantial supply of raw talent, which can't be a good thing. If academia as a whole is moving away from that 20th (19th?) century mindset, I'm all in favor of it.

(An aside -- Chloe Elizabeth Mitzenmacher arrived last week, prompting some of these thoughts. Which, due to lack of sleep, might be even less coherent than average...)

Friday, June 13, 2008

An Improved Analysis for "Why Simple Hash Functions Work"

I recently had a paper with Salil Vahdan (in SODA 2008) explaining why simple hash functions "work" for so many stream-related problems where data is hashed to be placed in a hash table or a Bloom filter or a similar structure. Here "work" means "behave essentially the way you would predict if the hash function was perfectly random." The basic reason, we suggested, was that if the data in the stream had enough randomness, it would combine with the randomness from the choice of hash function to produce value that looked independent and uniform.

Our analysis of how much randomness the data needed to have seemed a little loose. Salil and his student, Kai-Min Chung, have recently improved the analysis; here's the arxiv link, the paper will be appearing in RANDOM '08. There's some really nice and somewhat subtle arguments that I imagine will find some uses in other areas. While it "only" improves the constant factor (in terms of number of bits of randomness needed), this is certainly a case where constant factors are important.

The result further reinforces my belief that this is the right high-order explanation for why, whenever anyone actually does an experiment involving hashing, you get the right results no matter what hash function you use.

Thursday, June 12, 2008

STOC -- Choosing A Program Committee

It took longer than I expected to put together a PC for STOC, and now that it's over, I thought I'd talk about the process.

First, I downloaded the PCs for FOCS/STOC/SODA for the last three years. I wanted to get an idea of what past PCs have looked like, and I wanted the raw data. Generally, I tried to avoid asking people who have served on those committees twice in the past three years. In my mind, it's good to mix things up and get new people involved in the process.

Then I tried to wrap my head around issues of coverage. I know authors are (reasonably) annoyed when they see nobody on the PC in their BROAD AREA X, leaving them wondering if their paper will get a fair read. Despite (indeed, because of) my readily apparent algorithms bias, I wanted to make sure there were a suitable number of people in complexity/cryptography/quantum etc. Even on the algorithms side, I wanted to make sure major areas like data structures, on-line algorithms, approximation algorithms, etc. were reasonably covered as well. It's hard to cover everything, so I'm sure some people will still be disappointed, but I think that overall there is a nice balance in the committee. (We'll see if it's a suitable balance once the papers come in...)

Besides covering areas, there were other balancing issues that came to mind. The "ages" of the PC members was perhaps the next most important. I wanted some fresh faces, since serving on a PC is both good experience for younger members of our community, and a signal to their departments (or bosses, or future bosses) that they are well thought of. But I also wanted some more experienced hands who could possibly head off my mistakes before I made them and provide appropriate judgment where needed.

Additional balances I tried to make: "Practical" people and "pure" theoreticians. Research lab/industry people and academics. (No, that's not the same as the practical/theoretical balance.) US and non-US participants. I did also consider the number of women on the committee. My practice in this situation was not any sort of direct affirmative action, but just to make a conscious effort to make sure that I wasn't overlooking the numerous well-qualified possible female PC members when sending out invites. I look forward to the day when I don't think it's necessary for me (and others) to make this effort, but I personally don't think our field is there yet. For those who are interested in such matters, the STOC 2009 PC is 25% female. In checking back over the PC lists for the last three years, this seems to be more than average for FOCS/STOC/SODA. I admit to being somewhat surprised (and disappointed) by that, and I leave it to commenters to discuss the importance of this issue to our community.

The biggest difficulty I found in getting the PC finalized was getting senior people to sign on. Roughly speaking, the probability of your willingness to serve on a PC seems to decline monotonically with years of experience (with perhaps a significant step-function drop post-tenure). In this case, I don't mean this as a criticism. I know that the people who declined are indeed quite busy with many other things, including larger-scale efforts to benefit and promote the theory community. Or, as one colleague put it to me when I explained the problem I was having, "Duh!" Anyhow, it took a couple of rounds of trials to find, in particular, more senior colleagues, and I thank them especially for their willingness to volunteer, and/or for just giving in to my begging.

[Come to think of it, my own PCing has declined post-tenure, though I'm still averaging at least a couple of larger conferences per year...]

Anyhow, at some point or another, an official call for papers will go up, but I think it's OK to announce the PC (subject to last-minute issues or changes):

Susanne Albers
Andris Ambainis
Nikhil Bansal
Paul Beame
Andrej Bogdanov
Ran Canetti
David Eppstein
Dmitry Gavinsky
Leslie Goldberg
Shafi Goldwasser
Nicole Immorlica
Anna Karlin
Jonathan Katz
Jonathan Kelner
Subhash Khot
Ravi Kumar
Michael Mitzenmacher
Kamesh Munagala
Rasmus Pagh
Anup Rao
Rocco Servedio
Mikkel Thorup
Chris Umans
Lisa Zhang