Sunday, November 25, 2012

Assessing Computer Scientists

My colleague and frequent co-author John Byers knows that I can spend up to several hours a day actively worrying about my ranking as a computer scientist.  While the automated daily Google Scholar updates (any new citations today?) are helpful, it's not always clear how I should interpret them. So John was happy to direct me to a new paper, Assessing Computer Scientists Using Citation Data, to help me and other ranking-obsessed individuals find their place in the world.

The methodology itself is actually quite interesting.  The first question is which version of the h-index to use.  In particular, what external information do you use to judge which version of the h-index appears most accurate?  The method used here is to assume that department rankings accurately represent the quality of the faculty within the departments, and use a regression between the reputation of departments and the mean citation score of the faculty to help determine which versions of the h-index appear most accurate for assessment purposes.  There are several other factors accounted for in the methodology, such as a prediction model for the probability a computer scientist works at a department depending on their quality "mismatch", and how to take into account things like the field and years-since-PhD of individual researchers.  (Theory papers as a group obtain far fewer citations on average;  security and cryptography papers, many more.)  The latter allows one to come up with variations of the h-index score that are field- and age-adjusted to individual researchers.  That is, the paper provides a systematic approach that attempts to correct for some of the known weaknesses of h-index scores.  This sort of analysis is common in econometric papers, and is the same general type of analysis we did in our Groupon paper a while back.
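For readers who have not computed one by hand, here is a minimal sketch of the standard h-index (the largest h such that at least h papers have at least h citations each), together with one possible way to parameterize "versions" of it. The (a, b) parameterization and the function names are assumptions made for illustration; the paper's exact family of variants and its field and age adjustments are not reproduced here.

```python
def generalized_h(citations, a=1.0, b=1.0):
    """Largest h such that at least h papers have >= a * h**b citations.

    The (a, b) parameterization is a hypothetical illustration of how
    h-index "versions" might be generated; it is not taken from the paper.
    """
    ranked = sorted(citations, reverse=True)
    best = 0
    for rank, cites in enumerate(ranked, start=1):
        # The rank-th most cited paper meets the bar exactly when
        # at least `rank` papers do.
        if cites >= a * rank ** b:
            best = rank
    return best


def h_index(citations):
    """Standard h-index: the special case a = 1, b = 1."""
    return generalized_h(citations, a=1.0, b=1.0)


if __name__ == "__main__":
    counts = [10, 8, 5, 4, 3]          # made-up citation counts
    print(h_index(counts))             # standard h-index
    print(generalized_h(counts, a=2))  # stricter variant: h papers with >= 2h citations
```

Raising a or b makes the bar per paper steeper, so the index rewards a few heavily cited papers more than many lightly cited ones. Choosing among such variants is the kind of question the regression against department reputation is meant to answer.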

I'm well aware that many people object to this type of ranking of individuals, some based on arguments of principle (this isn't how scientific research should be judged) and some based on technical arguments (these approaches are fundamentally flawed).  This work doesn't really try to address the first type of argument, but arguably it goes a fair way toward addressing various technical concerns by showing a suitable application of econometric techniques.  How well does it do?  You'll have to look at the tables in the paper and decide for yourself.    

I generally find these types of measurements useful, with the understanding that they're imperfect.  (When asked to write a promotion letter for someone, for instance, I do examine citation counts.)  To the extent that they become "more perfect", the implication seems to me that they will become the standard first-order approximation of quality.  I don't see how such an outcome could reasonably be avoided.  One argument is that it shouldn't be avoided, because such a metric is a useful guide to performance;  if certain people don't perform well in relative terms under that metric, but should be thought of as exceptions to the first-order rule, then exceptional arguments (in letters and such) can be made when needed.  But then not everyone can or will be an exception.



12 comments:

Anonymous said...

One of my arguments against reliance on these kinds of citation metrics is that they disproportionately affect women and minority scientists.

Look, we all know that citations are political. Smaller results by more famous people attract more attention and citations than bigger results by less famous people, and people often cite the more famous person for the same result over the less famous.

I am not saying people do this deliberately; a lot of it is subconscious. And this kind of subconscious bias is particularly exacerbated in the case of people who traditionally don't look like scientists -- aka women and minorities.

Anonymous said...

What would be useful, imo is a "weighted count" where the citation count is normalized by the average citation count of the journal/conference the paper is published in. This way people working in bigger fields will not have undue advantage over those working in smaller fields when the quality of the work is being assessed using citations as a metric.
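A minimal sketch of the "weighted count" described in this comment, assuming each paper is recorded as a (venue, citations) pair and that venue averages are computed from some reference corpus. The function names and the toy data are hypothetical, not taken from the paper or from any citation service.

```python
from collections import defaultdict


def venue_averages(corpus):
    """Mean citation count per venue over a reference corpus of (venue, citations)."""
    totals = defaultdict(lambda: [0, 0])  # venue -> [citation sum, paper count]
    for venue, cites in corpus:
        totals[venue][0] += cites
        totals[venue][1] += 1
    return {venue: total / count for venue, (total, count) in totals.items()}


def weighted_count(author_papers, averages):
    """Sum of each paper's citations divided by its venue's average.

    Papers from venues with no known (or zero) average are skipped
    rather than guessed at.
    """
    score = 0.0
    for venue, cites in author_papers:
        avg = averages.get(venue, 0)
        if avg > 0:
            score += cites / avg
    return score


if __name__ == "__main__":
    # Made-up corpus: venue "A" averages 20 citations, venue "B" averages 4.
    corpus = [("A", 10), ("A", 30), ("B", 2), ("B", 6)]
    averages = venue_averages(corpus)
    print(weighted_count([("A", 20), ("B", 8)], averages))
```

Under this scheme a paper cited at exactly its venue's average contributes 1, so 8 citations in a low-citation venue can count for more than 20 in a high-citation one.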

Michael Mitzenmacher said...

@Anon1: Let us try thinking positively. Perhaps these econometric techniques can be used to determine the effects that being a woman or minority has on citation counts, and allow a corresponding correction. That is what the paper tries to do for other "variables", such as what field the person is in. It may be able to do so for these variables as well.

Indeed, I would like to interpret your comment as a challenge. Reliance on pure citation counts is hypothesized to disproportionately disadvantage women and minority scientists. Can this effect be measured, and if so, can some correction be reasonably introduced?

Michael Mitzenmacher said...

@Anon2: To be clear, the paper did try to provide a "weighted count" normalized by field of the author, as I said in the post. You should see the paper for more details.

One might consider normalizing by citation count to specific conferences/journals, but I'd be worried that would be subject to various concerns, such as possible noise or overfitting. Remember total citations aren't the issue with metrics like h-index, so my intuition is that a field-level correction is pretty reasonable.

Anonymous said...

Hi Michael, We've heard some of your reasons for favoring h-index-type metrics to evaluate scientists.

Perhaps you'd also be willing to say something about the benefits/drawbacks of spending "several hours a day actively worrying about [one's] ranking as a computer scientist"? What impact do you think this has had on your career/life? How do you manage the stress involved? I'm not asking this to criticize or put you on the spot; I'm just curious.

Anonymous said...

@Anon5: the opening line "I can spend up to several hours a day actively worrying about my ranking as a computer scientist" strikes me as a self-deprecating joke -- at least I hope it is.

Michael Mitzenmacher said...

@Anon5: I think you should quickly head to your doctor, and get your sarcasm-detection fluid levels checked. They appear to be low.

I don't actually spend several hours a day (or week, or...) checking my citation count.

That's not to say I don't look at them ever. I do. I think it's useful to know things like the following:
a) What work of mine do others seem to care about?
b) Who has been citing my work, and why?

To me this is a practical way of finding out how people see and understand your work, which can be useful for suggesting new problems, new directions, and new collaborators. In these ways, the impact can be positive instead of stressful or negative.

I'm actually finding the Google service that tracks "interesting papers" for you -- including those that cite your work -- kind of handy. I check in once a month or so to see if anything interesting is there.

I can, however, see how the stats being so readily available would and could be especially stressful pre-tenure, particularly as the tenure case grows near. To reduce the stress in this clearly stressful situation, I'd consider the following.

1) Everyone realizes citation counts are noisy; they're not going to be the final arbiter in your tenure case.
2) The best thing to do for your citation count is just keep trying to do good work. The sorts of things besides that that might affect your citation count will probably have minimal (or possibly negative) effect on your reputation.
3) As I've said above, citation counts can provide you useful information about how others are seeing your work, allowing you to plan and explain your tenure case accordingly. It should be helpful to you and your colleagues. If you think about it as something that can help you, it might not seem so stressful.

Michael Mitzenmacher said...

@Anon6: I hope so too!

Wim van Dam said...

Fellow readers of Michael's blog: can we please, please start posting comments under our name? This anon 1,2,...,6 business is getting silly.

Anon5 again said...

Michael--I certainly detected the presence of exaggeration in your self-description, but not of sarcasm (still don't). I was curious about your own experience (as a successful prof) with stress/status anxiety. But what you've written is also useful.

Wim--I choose to post anonymously in this case, while trying my best to be civil. If Michael doesn't like what I've written he can remove it or change the blog policy.

Michael Mitzenmacher said...

@Anon5. Hello again! I suppose you're right, exaggeration is a better description than sarcasm, but the first paragraph was meant in jest. (I had a more boring opening, but decided to liven it up a bit.) Also, while I of course prefer people to post by name, as long as they're civil, I won't delete anonymous comments, and your comments have certainly been both civil and thought-provoking. Indeed, I hope my first answer showed I took your question seriously.

I'll try to answer your refined question, as I understand it. I do, as I imagine most people do, feel stress and status anxiety with regard to my chosen profession. I would like to be excellent at what I do, and I would like to be recognized for that excellence.

I'm happy to say, though, that the stress and status anxiety seems to reduce with age, at least in my case. To put things in perspective, I received tenure at about the age of 35. In my 20s, the stress and status anxiety ranged from "What am I doing here?" in graduate school -- it took me the first 2 years to figure out how to do research -- to "Who is going to hire me?" -- the job market wasn't great when I graduated.

In my 30s, the stress and status anxiety revolved around tenure -- would I get it? And while there were a few bad mental days in there, mostly once the case went in and I was just waiting, overall it actually wasn't so stressful. At that point, I was comfortable enough with myself and my work to feel that I would get a job elsewhere if I didn't get tenure, and it really wouldn't be that bad. Not my first choice, obviously, but not the end of the world.

Now, the stress and status anxiety is pretty much all self-imposed. I want to do well because I like doing well. But I'm also much more comfortable with myself -- I can be happy in my own accomplishments, without needing to compare myself to others. At least most of the time :).

Of course I'm human. I work in an area -- primarily "academic" computer science -- where there seem to be bunches of really smart and amazing people. I'm sure I'll always have occasional thoughts like, "I wish I was as smart/creative/talented/successful as X, Y, or Z." It's hard not to, with so many smart/creative/talented/successful people around. I'm no psychologist, but my take is that in small doses, that inner competitiveness can probably be a good spur to good work. In large doses, probably not. If you're asking advice, I'd suggest trying to figure out how much stress and status anxiety for you personally is "healthy". As I've aged, I like to think I'm doing better at that.

Glenn Ellison said...

Glad to see people found the paper.

On the question raised by @Anon1: I ran the regression in Table 2 this morning adding a "Female" variable. The estimated coefficient says that female faculty tend to have H(30,1) indexes about 12% below those of their male colleagues. (The standard error is 5.6% so this is not highly precise/significant.)

This does suggest that one would need to apply a 12% correction factor to the indexes of female faculty if one didn't want to worsen underrepresentation.

To be clear, this type of analysis can't say why female faculty end up with fewer citations than their male colleagues -- just like it can't say why the theorists in a given department tend to have fewer citations than those working in security -- all it can say is what correction factor you'd need to apply if you wanted to maintain the status quo distribution of faculty across fields/genders.
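To put the precision of that estimate in perspective, here is a rough sketch of the arithmetic using only the two numbers quoted above (a gap of about 12% with a standard error of 5.6%). The normal approximation and the 1.96 multiplier are assumptions made for illustration; the regression output itself is not reproduced here.

```python
def estimate_to_se_ratio(estimate, std_error):
    """Ratio of a coefficient estimate to its standard error."""
    return estimate / std_error


def approx_interval(estimate, std_error, z=1.96):
    """Approximate 95% interval, assuming a normal sampling distribution."""
    half_width = z * std_error
    return estimate - half_width, estimate + half_width


if __name__ == "__main__":
    gap_pct, se_pct = 12.0, 5.6  # figures quoted in the comment above
    ratio = estimate_to_se_ratio(gap_pct, se_pct)
    low, high = approx_interval(gap_pct, se_pct)
    print(f"estimate / standard error: {ratio:.2f}")
    print(f"approximate 95% interval for the gap: {low:.1f}% to {high:.1f}%")
```

Under those assumptions the estimate is a little more than two standard errors from zero, and the interval runs from roughly 1% to roughly 23%, which is consistent with describing the estimate as not highly precise.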