Friday, July 08, 2011

h-index != impact

Suresh and Daniel Lemire (in Google+ posts) have pointed to the following paragraph from this blog:
The sad thing is that young people have now been terrified by the Impact and H factors, and I can’t give them much hope. When I published my first paper in 1967 (J. Chem. Soc. (now the RSC), Chemical Communications) I did it because I had a piece of science I was excited about and wanted to tell the world about. That ethos has gone. It’s now “I have to publish X first author-papers in Y journals with impact factors great than Z”.
As a service to those young people, I'd like to make clear that, at least at my institution, the “I have to publish X first author-papers in Y journals with impact factors great than Z” approach is not actually suitable, and you should focus on the "I had a piece of science I was excited about and wanted to tell the world about" approach. 

I'm not being naive.  Citation counts certainly arise in promotion and tenure cases.  They're a piece of information, and we look at them.  But just as your GRE score won't get you into a top-tier graduate school, your h-index is not going to get you tenure (or a grant, or an award, or...).

When you come up for promotion, we ask for letters.  Some letters will mention your citation counts or your h-index as a way of providing evidence that you've done interesting and important work, and that's all well and good.  What we look for, though, is an explanation from the scientist writing the letter as to why they think your work is interesting and important.  Arguably, the best way to get your letter-writers to write a good case for why your work is interesting and important is to do work that you're excited about and want to tell the world about.  Because if you are excited and go tell the world, repeatedly and with energy, the word will get out, and get to the ears of those scientists who are going to write your letters.

Of course, even ignoring those employment-related aspects, doing science you're excited about is just more fun.

I wonder if the author of this blog post is correct in the characterization of young people, as the idea is a bit foreign to me.  In theory, of course, we have some great role models;  I don't think Les Valiant, Jon Kleinberg, David Karger, Cynthia Dwork, and so on spend their time worrying about their h-index.  They just want to do cool stuff (and, as far as I've known them, always have -- it's not a "now-that-they're senior" thing).  But just in case, let's make sure the correlation/causation message gets out right:

cool work, excitement, and enthusiasm tend to yield high citation counts and maybe h-indexes
citations are not how we define or even measure cool work, excitement, and enthusiasm


Daniel Lemire said...

I don't know if it was intentional, but this is related to your comments about the number of submissions to SODA.

By setting up very selective conferences, we make it so that a good filter is "the number of papers at SODA". This turns the game into a "get as many papers in SODA as you can".

Once you go down this path, it is really hard to recover.

Science is not the pursuit of prestige. In fact, science ought to be the rejection of prestige as a filter. We had prestigious scholars well before the scientific revolution. What the scientific revolution brought is the freedom to contradict the authorities. This freedom to think differently is what leads us to new ideas, which turned Europe into the master of the Earth for a brief time. New ideas are what this is all about.

Michael Mitzenmacher said...

Hi Daniel.

While I can't say I agree with all of your opinions regarding selective conferences, I would certainly state (and I think you'd agree) that selectivity has, on the whole, gotten out of hand. As I'm sure you know, I've pushed for larger FOCS/STOC conferences for a variety of reasons, and similarly think -- although SODA is already larger than FOCS/STOC -- it would be better if it were larger still.

My question from my last blog post still applies: "(Would SODA be that much different if it accepted, say, 250 papers? Discuss.)"


Anonymous said...

To respond to your question about whether this is a correct characterization of young people, I believe it is. I have had discussions with several fellow grad students, and the pressure to publish is present in all of us (speaking of theory CS folks).

Whether this is a function of fewer new faculty openings, a changing academic community/culture, or something else I can't say, but I am certainly *hoping* your advice is correct.

Anonymous said...

I agree with Michael's advice, but want to point out that it is good advice not because the market is not competitive, but because the market is hyper-competitive. To get into a Ph.D. program in a top department, GREs and GPA in effect don't matter. More precisely, they only matter in one direction. They can weed you out, but there is no GPA or GRE score that will get you in because half the people applying have perfect GPAs and GREs.

Similarly, there is no number of FOCS/STOC/SODA papers that will get you a job at a top department. Not having enough will be a strike against you, but since there are more people with umpty ump FOCS papers on the market than there are jobs by a considerable margin, also having umpty ump papers won't even get you an interview. To get a job at a top place, and that's UCSD as well as Harvard, you have to have the FOCS paper that everyone's talking about, the one that will have six follow-up clones in the next STOC.

And, if you have that paper, the one they're already excited about, the hiring committee will be less fussy about what else you have.

I'm not sure if this is really better. It rewards people for being trendy rather than prolific. But in my experience, that's how it actually works.

Russell Impagliazzo

Anonymous said...

Les Valiant, Jon Kleinberg, David Karger, and Cynthia Dwork are not in theory-hostile departments. If there are tensions between groups (over, for example, the allocation of faculty slots between areas), citation counts (or h-index) *are*, unfortunately, going to be used as an argument (in addition to the other arguments, such as amount of funding, number of students graduated, papers in top conferences, and PR).

David Andersen said...

Hi, Michael -

I guess I count as a "young people" because I don't yet have tenure (hi, CMU's tenure clock! :-). If it offers a datapoint, I *do* know my h-index, and I update it about once a year by ego-surfing Google Scholar. If I'm being honest, it probably does have something to do with the normal academic insecurity, but it has nothing to do with tenure: my department makes it very clear that their criterion is overall research excellence, not some numerical indicator. But I actually find it useful, because it's forced me to ask questions about some of the papers I've written.

Category 1: "Those" papers. You know (or maybe you're lucky enough not to know): The ones you agreed to be on, or even wrote, knowing in your heart that they're sub-par. I have a few of these papers, and reflecting on them strengthened my resolve to be more willing to say "No; I'd rather spend my time only on work I'm really psyched about, even if it cuts my publication count." Which can be hard to do as a jr faculty who feels a vague sense that they have to publish not just great stuff, but lots of stuff. But it's the right decision, I hope.

Category 2: Why isn't this paper cited more? I have a few papers I've written that I quite like, but that have few citations ("Adaptive file transfers for diverse environments" is one example - 6 citations on GS). Reflecting on these has been useful for understanding where I've goofed, either in venue selection, problem selection, or exposition.

tl;dr: There's benefit to a meta-examination of your own work and research process, and to the extent that the h-index or other things help reveal things you might not have seen, they're useful. But don't obsess, and don't try to optimize for them - use them in retrospect!

Michael Mitzenmacher said...

David --

What a great comment! I, too, do know my h-index, and troll Google Scholar regularly. As you suggest, it's perhaps normal academic insecurity, but it's also my way of keeping tabs on what's going on. A great use of Google Scholar, besides just looking at citation counts, is to see who is citing your work; looking at those papers can suggest research problems or collaborations, or just keep you up to date on what's going on.

I think your two categories are indeed helpful. I certainly have Category 1 papers, though I admit, they always seemed like good ideas at the time. (As you say, though, you have to learn how to say "no" and avoid them when possible.) And of course I have Category 2 papers. On a good day, probably 1/2 my papers are Category 2 papers. :)

Anonymous said...

MM, a related question from a systems grad student: how do you know if a topic you are working on is "high impact"? In general, how do you choose a research problem considering "impact"? Is impact proportional to citation potential? They say that you choose a problem that is relatively long-term, high-risk: what does that translate to, in systems research? Finally, if the problem I'm working on has a ton of immediate potential/use, does that count as "impact" (such problems are likely not high-risk and definitely not long-term)?

Anonymous said...

"I don't think Les Valiant, Jon Kleinberg, David Karger, Cynthia Dwork, and so on spend their time worrying about their h-index."

Well, life is easier if you are tenured...

Anonymous said...

In response to anon @ 3:41, I don't know how it is at your institution, but at mine I have to pressure my grad students to publish because if I don't they won't---even if they have really interesting results. The stuff they tend to find interesting is unpublishable, and they tend not to be very good at distinguishing between "interesting to me" and "interesting to the research community".

This is, of course, part of learning to be a researcher. I'd wager that most students would never publish if not pressured to do so by their advisor. That advisor-driven pressure is part of their education.

Vacanze calabria .biz said...

Hello, I wanted to tell you about a plugin to calculate the h-index using Firefox with Google Scholar: h index