The paper's acceptance gives me an excuse to discuss some issues in paper writing, research, conferences, and so on, which I'll do this week. To start, I found it interesting that WSDM had 290 submissions, a 70% increase over 2009. Apparently, Web Search and Data Mining is a healthy research area in terms of the quantity of papers and researchers. They accepted 45, or just about 15.5%. That turns out not to be far off from the first two years, when acceptance rates were also in the 16-17% range. I'm glad I didn't know that ahead of time, or I might not have submitted!
I'm curious: why would a new conference, trying to establish itself and attract a viable, long-term group of researchers who will attend, limit itself to such a low acceptance rate when starting out? Apparently the organizers thought the key to success would be a high quality bar, but I still find the rate surprising. One possibility is that the rate is low because a number of the submissions are very poor; even the very top conferences, I've found, get a non-trivial percentage of junk submitted, and although I have no inside knowledge, I could see how a conference with the words "International" and "Web" in the title might attract a number of obviously subpar submissions. But even if I assume that a third of the submissions were immediate rejects, the acceptance rate on the remaining papers (45 out of roughly 193) is a not particularly large 23.3%.
The topic of low acceptance rates at CS conferences has been a subject of some discussion lately; see, for instance, Birman and Schneider's article in the CACM, Matt Welsh's thoughts, Dan Wallach's thoughts, and Lance Fortnow's article in the CACM. Here we have an interesting case to study: a new conference that starts out with an acceptance rate in the 16% range despite an apparent abundance of submissions. Anyone have thoughts on why that should be? (I'll see if I can get some of the conference organizers to comment.) Or opinions on whether that's the way it should be?
Now for that abstract:
Attributing a dollar value to a keyword is an essential part of running any profitable search engine advertising campaign. When an advertiser has complete control over the interaction with and monetization of each user arriving on a given keyword, the value of that term can be accurately tracked. However, in many instances, the advertiser may monetize arrivals indirectly through one or more third parties. In such cases, it is typical for the third party to provide only coarse-grained reporting: rather than report each monetization event, users are aggregated into larger channels and the third party reports aggregate information such as total daily revenue for each channel. Examples of third parties that use channels include Amazon and Google AdSense.
In such scenarios, the number of channels is generally much smaller than the number of keywords whose value per click (VPC) we wish to learn. However, the advertiser has flexibility as to how to assign keywords to channels over time. We introduce the channelization problem: how do we adaptively assign keywords to channels over the course of multiple days to quickly obtain accurate VPC estimates of all keywords? We relate this problem to classical results in weighing design, devise new adaptive algorithms for this problem, and quantify the performance of these algorithms experimentally. Our results demonstrate that adaptive weighing designs that exploit statistics of term frequency, variability in VPCs across keywords, and flexible channel assignments over time provide the best estimators of keyword VPCs.
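Since the abstract is compact, here is a rough way to picture the estimation machinery it alludes to. The sketch below is my own illustration, not the paper's algorithm: it simulates a simple non-adaptive baseline in which keywords are rotated through channels day by day, and the per-keyword VPCs are then recovered by least squares from the channels' aggregate daily revenue reports. All names, problem sizes, and the noise model are invented for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

n_keywords, n_channels, n_days = 12, 3, 8

# Hidden per-keyword values we want to estimate, and daily click volumes.
true_vpc = rng.uniform(0.1, 2.0, n_keywords)
clicks = rng.poisson(50, (n_days, n_keywords))

rows, obs = [], []
for day in range(n_days):
    # Toy non-adaptive "weighing design": rotate each keyword through the
    # channels over successive days so the resulting linear system has
    # full column rank.
    assignment = (np.arange(n_keywords) + day) % n_channels
    for ch in range(n_channels):
        in_channel = (assignment == ch).astype(float)
        weighted_clicks = in_channel * clicks[day]
        # The third party reports only this aggregate (noisy) daily revenue.
        revenue = weighted_clicks @ true_vpc + rng.normal(0.0, 0.5)
        rows.append(weighted_clicks)
        obs.append(revenue)

# Stack one linear equation per (day, channel) report and solve for the VPCs.
A, y = np.array(rows), np.array(obs)
vpc_hat, *_ = np.linalg.lstsq(A, y, rcond=None)

print("max abs estimation error:", np.abs(vpc_hat - true_vpc).max())
```

The adaptive designs the abstract refers to would instead choose each day's assignment based on the estimates gathered so far, exploiting term frequencies and the variability of VPCs across keywords; the sketch only shows the underlying point that coarse channel totals, combined across days, suffice to pin down per-keyword values.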