Monday, September 10, 2007

The Hiring Problem and Lake Wobegon Strategies

Continuing the "and now a word from our sponsors" riff, I'll describe a paper (to appear at SODA) that's joint with several people at Yahoo! Research, and started on a visit there.

We're looking at a situation similar to the well-known "secretary problem". For those unaware of the secretary problem, it goes like this: you need to hire a new secretary. There are n applicants, and you will interview them in a random order. The result of each interview is the relative rank of the applicant against those you have already interviewed. ("This applicant was the 3rd best so far.") After each interview, you have to decide whether to hire that applicant before the next interview; once you pass on an applicant, they're gone. Your goal is to maximize the probability that you hire the best of the n applicants. What's your optimal strategy? (I won't give away the answer for those who haven't seen the problem; it's easily found online.)

In our setting, we are considering growth strategies for a small startup company. Each applicant has a quality score, which we take to be uniform on [0,1], that we learn in the course of the interview. Unlike the secretary problem setting, we want more than one employee, although there is not a fixed number of employees we are aiming for. We simply want to grow, balancing the tradeoff between the speed of growth and the quality of new employees.

In the paper, we consider the strategies where we choose to hire an applicant only if their quality score is above the average of the current employees. (We study both the mean average and the median average.) We call these strategies Lake Wobegon strategies, after the fictional town of Lake Wobegon, where

all the women are strong, all the men are good looking, and all the children are above average.
Here, everyone we hire is above average (at least when we hire them). Google claims to use this strategy.

I should hasten to emphasize that (despite the Google claim) we feel that the hiring problem is about hiring employees just as much as the secretary problem is about hiring a secretary: very little. Rather, it's an abstraction, leading to problems about random processes and decision-making in the context of incomplete information. I've given the talk about this work a few times, and some people get hung up on setting it up as an optimization problem and determining the optimal strategy. You could, I suppose, but you'd have to be precise about the tradeoff about growth speed vs. quality, and the abstraction is already removed quite a bit from reality. I personally think it's simply interesting to consider how growth processes like this work under Lake Wobegon strategies; does anyone know of any related examples in nature?

If you're a puzzle person, then you might consider the following: suppose we start with a single person, of quality 1/2, and then interview until we hire n people. Which strategy will lead to higher average quality -- hiring above the median, or hiring above the mean? (Or are they asymptotically the same?)

The hiring problem leads to a great deal of curious mathematical fun : lognormal distributions, the Berry-Esseen inequality, the difference between hiring above the median and hiring above the mean (yes, they're very different!), and potentially lots of open problems and further variations to look at. Hopefully, the problem will become at least a fraction as popular as the secretary problem has.

I'm not sure why people at Yahoo were interested in this. In this case, I think they too got caught up in the mathematics of the problem. I'm glad that in the research labs, people still just follow their curiosity from time to time, even if the immediate application isn't obvious.


Anonymous said...

Hie Michael,

This is Caroline from SocialRank.

I am trying to get in touch with you but couldn't find your email address.

We will index your blog posts as part of our content filter. I'd like to send you an invite to a beta preview of our new Web 2.0 site.

Can you get back to me with your email address.

Mine is

Kind regards,


Anonymous said...

We understand how it is. Google "Mitzenmacher" and it's hard to pick out the blogger from among the millions of pages you get back.

Michael Mitzenmacher said...

Anonymous above -- very funny. That's pretty much what I thought when I saw that comment. Does anyone know anything about "SocialRank" or "mathbloggers", or is that just spam?

Anonymous said...

I don't know what the true motive is, but at least you want to remove the domain names from her commet to avoid helping them do SEO.


Unknown said...

Love the problem! It's also interesting that there's suddenly so much interest in modeling hiring processes -- maybe because information about how they are done and what the tradeoffs are is becoming more available and therefore the processes themselves are becoming more transparent? Which, of course, leads to a (shameless?) plug for some of my own related work on a problem with a somewhat different formulation (but that we started looking at from the secretary problem perspective as well!) -- Suppose you're an applicant waiting to hear back from a bunch of places that take sequential decisions on whether or not they want to hire you -- what's the value of being told when you're rejected versus receiving signals only when you get offers, and how does it change under different distributional assumptions, etc? Also a fun problem :)
Here's a link