Wednesday, May 20, 2009

Computer Science 222 : A Discussion

I'm reflecting on my Graduate Course on (vaguely) network algorithms from this year, and particularly the discussion I had with students at the end of the day. I view the class as being geared toward first-year graduate students, with explicit goals of the class being to introduce the students to research and how it's done and to build theory/systems connections.

1. The class really tries to mix theory and practice -- on good days, we discuss two papers on a topic, where I've tried to have one come from the theory side and one from the systems side. At the end of the day, not surprisingly, the theorists like the theory papers more and the systems people like the systems paper more. But it does seem that the course opens the lines of communication, and they have a greater understanding and appreciation for the other side. The payoff, if any, may only be seen in the long term -- but I'm pretty sure there's a payoff there.
2. The class can apparently be a lot of work at times -- the theory people have to pick up things (how does TCP work?), and systems people have to pick up things (what's that notation mean?). I hope that's good training, on both sides.
3. Students seem surprised that all papers we read aren't "blockbusters" (especially since we start with the papers for PageRank and HITS -- two clear blockbusters). That's on purpose. Beginning graduate students have exaggerated expectations -- what they've generally seen are the blockbuster results, and they wonder where their breakthrough will come from. (This matches my recollection of starting graduate school.) They're unaware that research is a skill to be learned, that it's perfectly OK to start out with a "mere" incremental result, especially to get one's feet wet. For typical graduate students, there's a learning curve in figuring out how dissect and analyze problems, organize and present experiments, write clear and compelling text, etc.; but many don't really think of their first year as training in this sense. Also, beginning graduate students tend to underestimate the potential value of innovative follow-on work; lots of incremental results are actually really valuable. (Few papers actually start new fields.) Several students said they were relieved in coming to realize while taking the course that their first papers didn't have to be top-conference breakthroughs.
4. The most popular topic by far (for both systems and theory people) was information theory (particularly compression and related algorithms). Many said it was their first real exposure to information theory -- why hadn't it come up before? Some asked where there was a course on information theory algorithmics. (None other that I know at Harvard -- is there a course to recommend at MIT?) And really, you can't go through the Burrows-Wheeler compression algorithm without being completely floored. If I ever stop teaching undergraduate algorithms and data structures, maybe I should put together an advanced undergraduate course merging (practical) algorithmics with information theory and queueing theory.
5. There's demand to expand these theory/systems connections; while I focused on networks, there was interest in covering related topics in databases, distributed systems, and other areas.
6. I'm hoping that several of the class projects -- with or without my assistance and input -- can be expanded into research papers. A number of them look promising. However, I usually teach this course in the fall, and can use the momentum to work with students in the spring to realize the results more fully. I'm not sure that will work this year (especially with some projects being done by graduating seniors...)

Sadly, I probably won't teach this class again until Fall 2010 or 2011. Something to look forward to. (If anyone from the class reads this, feel free to post comments anonymously or mail me directly...)


Arjun said...

Re. point 4, From my (rather crude) understanding of the undergraduate CS majors at different institutions, Information theory tends to fall within the "EE" branch of the EECS spectrum - thus it gets left out of the pure CS majors that you find in liberal arts curriculum. After all, a lot of the required math is fairly intense: time and frequency domains; analog and digital signaling; convolution, Fourier series and transforms, sampling and discrete-time processing of continuous-time signals, modulation, Laplace and Z-transforms... A cursory glance at MIT Course 6.2 shows that it takes about 2 full courses to get up to speed on all that. I'm a CS major at Williams College (which has a very robust math faculty with a wide range of course offerings) and the classes to learn it all (with the CS bent) simply don't exist. At MIT, they fall under the "core EE courses" section of the 6.2 requirements. I suppose when you have a limited set of courses to offer, information theory falls out under the bottom... Even at the Cambridge University CS Tripos (which gets fairly advanced, as at Cambridge they devote 100% of their 3 years to the "major"), this stuff is all Year 3 electives.

Anonymous said...

At Latrobe in Australia where I go they offer a CS-EE double degree which would cover all this. The 4 year CS based degrees also cover some of the more basic stuff in the area.