Monday, February 16, 2009

Some Notes on Collaboration

[For part of tomorrow's CS 222 class, we'll be talking about
collaboration, inspired by Gowers's Polymath project. It fits in
with one of the themes in the class, which is getting young grad
students or senior undergrads into research mode. While I swear I've
written something like this up before, now I can't find it, so I'm
writing it down now and it will be part of my class "lecture" for
CS222 tomorrow. Think of this as a draft of lecture notes, which --
inspired in part by Luca's online blog notes -- I'm putting online. And
following the theme of collaboration, feel free to add comments,
suggestions, or advice to students that you think is useful.]

In this class, you'll be doing a final research project, most of you
with one or more partners. I'd like to spend a little time talking
about collaboration, some tricks for doing it, and how important it
can be for you as a graduate student.

I think most undergraduates -- especially those who aren't scientists
-- have the wrong impression that science and especially computer
science isn't very collaborative. The common image is of a professor
hiding away in his office, or a coder alone with the terminal. In
contrast, those of you who have been in the terminal room late the
night before a programming assignment probably know how collaborative,
and social, coding can be. And the large majority of my papers, and
most papers written in computer science, have more than one author.

In fact, as a graduate student, collaborating successfully is likely
to be key to your success. Synergy is real. Research is inherently
non-linear; it's not how much time you spend working on a problem,
it's coming up with the right idea. And working with others, for many
people, simply leads to better ideas more quickly. A key insight
could require knowledge that each individual doesn't have alone; a
different perspective can move someone forward when they're stuck; an
idea one might be quick to discard as being the wrong path might prove
fruitful in another's eyes. And beyond that, collaborating is often
fun, and having fun while working on a problem can make people more
productive on its own. So there are reasons House has his staff,
Buffy has her Scooby gang, and even Holmes hangs out with Watson.

Some quick thoughts about collaborating. First, as a beginning
graduate student, you are probably very concerned about who gets the
credit. My advice is to put this aside. When you go into a project,
you don't know who will hit on the right path -- and it's very
possible that the thought you had that broke open the problem might
not have happened if you hadn't been working with those other people.
Think of the long term -- you'll be known for your body of work over
your lifetime, people will see what you can do over a number of
projects. You don't need to focus on specific credit for this
specific project. And you want to foster an environment where you can
collaborate with others easily and naturally. That becomes harder
when you're always worried about who will get the credit. For more
on this, see also Hardy and Littlewood's Four Axioms for Collaboration,
or the end of Fan Chung's notes for graduate students.

Another thought about collaborating -- although, actually, you can use
this idea even when you're working on a project yourself! I've
generally found that when working with others in research, people
implicitly tend to take on naturally paired, contrasting roles.
In fact, I think people taking on these different roles can greatly
enhance the process -- so it may be worthwhile for people to
explicitly take on these roles (and switch off from time to time)!
The sort of thing I'm thinking about includes:

Optimist/Pessimist: One person can be trying to think 3 steps ahead,
making intuitive leaps forward, assuming other details will work out
or as yet unproven lemmas will be proven. And another person can try
to be the skeptic, making sure that the assumptions being made to move
things forward aren't completely out of line, that details eventually
get filled in, and that the proofs don't break.

Writer/Editor: When one person writes, the other should read like a
hypercritical reviewer.

Implementer/Debugger: It can be time-consuming watching over someone's
shoulder as they code trying to catch mistakes on the fly, but it can
be an effective way for two people to code together.

There are other natural pairs of roles people can take on doing research,
and I think it's useful to be aware of them so you and your collaborators
can work together more smoothly.

Now we'll talk a bit about the Polymath project, which represents an
extreme experiment in collaboration -- research via blog. Is this the
future of research, as new communication tools make group research on
this large a scale possible? Does this paradigm seem helpful or
harmful to the research process? What is the right size for a
collaborative group, and why? Let's discuss...


Anonymous said...

Michael, but what do you do when collaborating with systems people? Isn't author-ordering inherently alphabetical in the systems community?

Anonymous said...

Dear Prof. Mitzenmacher,
I read your blogs regularly; they are full of information for a graduate student like me.
I have a request: could you blog about the guideline regarding collaborative research and what goes into one's dissertation?
Most of the work I have done in grad school has been in collaboration with others (other than my advisor), and sometimes I am not sure what I can or cannot put in my dissertation.
Obviously I don't expect you to resolve my individual situation, but I want to know what the general rules are regarding this.

Michael Mitzenmacher said...

Anonymous #1: I think you mean isn't author order NON-alphabetical in the systems community. (In the theory community, alphabetical is the default.)

When I'm on a paper I tell people I would prefer alphabetical order, as that is my standard. If they don't want that, I'm happily at a stage in my career where I can simply not care. (Generally, when this happens, it is because somebody wants a student to be first author, and generally, that's appropriate, and I don't mind obliging.) However, I do encourage everyone to adopt the alphabetical order as the standard rather than fight about ordering.

Anonymous said...

Do you have any suggestions on how to find collaborators (that aren't your advisor) while a grad student?

Michael Mitzenmacher said...

Sorelle --

1) Do internships whenever possible. (Which reminds me, I should be doing my annual post, telling graduate students to do internships whenever possible.)
2) Start a side project with your officemates or another group of graduate students -- make it a "no-professor" project if you can.
3) Take a "final project" class, even if it's not in your area. Projects in many grad classes can turn into papers. (And it's nice to have collaborators and a paper outside your direct area -- it shows you can do "other stuff" when you interview.)
4) Read everything you can, and when you think you have an idea that might improve a paper, contact one of the authors and say, "I have this idea after reading your paper..." Many people won't mind working with a student on a "follow-up" paper if they have a good starting idea.
5) Find a postdoc who looks like they have some time or need some help. Many postdocs don't get enough attention from their busy hosts, and they'd like to visibly "lead a project" before their next round of interviews. Maybe they could make use of an eager student?

In all these cases, best to (eventually) inform your advisor about your additional projects, but most advisors will get it if you make clear this is something you're doing "in addition to" working with them.

Anonymous said...

Thanks Michael! I find the out-of-area collaborations relatively easy to find (and definitely rewarding). It's finding other people interested in more specialized kinetic computational geometry problems that I'm finding tricky. Perhaps this means that I need to go ahead and try #4 on your list, though I worry that it's hard to begin a collaboration when not in the same place.

Anonymous said...

> "However, I do encourage everyone to adopt the alphabetical order as the standard rather than fight about ordering."

Hear, hear! Non-alphabetical order is corrosive. Whenever there is a (sub)cast of authors with about equal credit due, it forces people to split hairs as to how these authors are listed. This does not add anything to the group dynamic, nor does it reflect any actual difference in contribution.

Anonymous said...

Non-alphabetical order is sometimes required when it is obligatory to list the adviser or head of the lab as a co-author even though their only contribution is proofreading the paper before submission.

Michael Mitzenmacher said...

Anon #8: non-alphabetical order is not required in such situations, just as it's not required in any situation. It's a choice, and once you make that choice, you open yourself to arguments about credit. (A student may think an adviser has done nothing but proofread the paper; the adviser, however, might see it differently...)

Michael Mitzenmacher said...

Anonymous #2: My understanding is this can vary from institution to institution and even from committee to committee; my guess is that you should have an open conversation with your advisor/committee about their expectations (and yours).

Anonymous said...

Thanks for an informative article, Michael. I agree that developing collaboration relationships is very important for grad students these days. However, I think that "equal credit" collaborations are often at least as frustrating and damaging to relationships between researchers as the potential awkwardness involved in discussing the relative contributions. In my experience, in papers with 3 or more co-authors, issues with one or more of the co-authors "having other priorities" are very common ... especially when it comes to the mundane and time-consuming tasks like writing up/proof-reading/revising. In general I think that the disconnect between the credit and the effort becomes more and more problematic as the average number of collaborators per paper grows in TCS.
In addition, the absence of any explicit information about relative contributions adds a lot of uncertainty in selection decisions (e.g. for hiring or an award). It is very common in recent years to see a PhD graduate in theory with mostly 3+ authored publications and without a single single-authored one. It is true that insider information is often available (e.g. from recommendation letters or who got to present the result) but such information is often quite incomplete and not necessarily fully reliable.
In my opinion, ordering of names is really not an adequate way to deal with this problem. It is too crude for such delicate matters. On the other hand, including an explicit summary of authors' contributions in works with 3+ authors would certainly do much more good than harm to collaboration practices and overall transparency in our area. If the collaborators decide that they do not want to discuss their contributions, they can always state something in the spirit of "contributed equally". So I see no real downsides or valid excuse not to follow the practice.
I believe that the appropriate way to introduce this practice is by requiring such a summary to be included in STOC/FOCS submissions. This is a standard practice for premier science journals like Science and Nature where multi-author contributions are more common.


Anonymous said...

> "I believe that the appropriate way to introduce this practice is by requiring such a summary to be included in STOC/FOCS submissions. This is a standard practice for premier science journals like Science and Nature where multi author contributions are more common."


I agree with many of your points (especially regarding the difficulty of evaluating candidates, and the difficulty for candidates to prove that their contributions were significant).

However, for theoretical collaborations it is often nearly impossible to pin down who contributed what. We could definitely keep track of who did the writing, but because of the way we (or at least I) do research, there is no bright line separating my part of the paper from anyone else's.


Anonymous said...

A relevant question: can you also comment on the importance of being able to write papers independently (i.e., having papers with one single author) as a theory graduate student? Specifically, which skill do you think should be acquired first: being able to collaborate extensively or being able to work independently?

Anonymous said...

> "for theoretical collaborations it is often nearly impossible to pin down who contributed what..."

I don't think this is the issue -- in this case it would be easy to say that everyone contributed equally. But there are many papers where the contributions are not equal.

I think the real issue is that our community is very liberal with co-authorship. (I know of papers where a co-author's contribution was being in the room at the time the question was posed.) Whether this is good or bad, I don't know. But authors would be reluctant to divulge this information.

Anonymous said...

Hi Michael, I'm a foreign student... and I think it's not so important from where I come from... Anyway, I have just a simple question for you. I agree with everything you wrote and I am sure that collaboration and group-work could improve the efficiency and motivations, but only if you work with the right people. So, if you are allowed to, how to choose the right team for you? I mean, random people probably will not lead to an expected result...
Thank you!

Michael Mitzenmacher said...

VG --

I agree with Adam, or would go even further. I think there could be large disagreements in writing up "who should get what credit" pieces to go with articles, creating unnecessary ill will in the community. I can see the potential of introducing gamesmanship to various proceedings. (Advisors (with tenure) might be incentivized to exaggerate student contributions; advisors (without tenure) might be incentivized to exaggerate their own, and how can the student complain?) And I think its value would be minimal, so I'd see it as a waste of time. I wouldn't voluntarily go in such a direction.

Michael Mitzenmacher said...

Anon 13: Both skills are important. As a graduate student, I think the emphasis is on "producing your own results" -- that tends to be the pressure on students who need to produce a thesis, establish a reputation, and who are probably a bit misguided as to the importance of collaboration. Hence my thoughts on emphasizing the other side.

I'd also recommend all graduate students have one project that's an "on their own" project -- something they can think about or work on in their own time, at their own pace, in their free moments, without pressure from collaborators (or advisors).

Michael Mitzenmacher said...

Mark --

How do you pick people for collaborations? I tend to pick by a few simple criteria (unordered):

1) Do I like working with the person? If so, we can find something to work on; at least it will be fun.
2) Skill set. If I'm faced with a problem where I think -- hey, I could use someone who knows XXX, I'll look for someone I know who knows XXX and talk to them about it.
3) Locality. Local interactions just happen more naturally. Also, these days, lots of my collaborations are working with students of some form or another... but this is also the basis for my "do a project with your officemates" suggestion.
4) Did they seek me out? Happily, I'm now at a point where people also seek me out to collaborate, and I'm open to working with people on interesting projects.

Finally, if you talk to enough people about problems you're interested in, collaborations will probably just happen (as you find they're interested too...)