Wednesday, January 13, 2010

Algorithms and Data Structures : Course Goals

Richard Lipton has an excellent blog post up right now on "What should a course on complexity theory cover?" where he describes what he is putting in his graduate course and what he expects students to get out of it. It seemed like a great idea, making me think that I should write a similar post for my undergraduate course (Algorithms and Data Structures).

I have a somewhat different take on the problem than Richard, though. He ends up creating a list of things (definitions and theorems) he thinks student should know after his class. I admit I think a bit more in terms of high-level concepts and skills I want students to get out of the course. (I think both approaches can work.) My list includes:

1. Students should understand the basics of algorithm/data structure language and notation; in particular: order notation, what it means (and doesn't mean!), and how to calculate running times of algorithms.
2. Students should learn how to model a problem through proper abstraction. In particular, they should learn how to set problems into the language of graphs, linear programs, etc.
3. Students should learn basic algorithmic techniques: greedy, divide and conquer, dynamic programming, linear programming.
4. Students should learn what is required for a formal and rigorous argument (even if they are not so good at writing one themselves). They should learn how to argue/prove correctness of algorithms and data structures in the contexts above.
5. Students should clearly understand the power of reductions -- not just to show problems are hard, but to show problems are easy, by using the solution of one problem to solve another.
6. Students should obtain some insight into the power of randomness in algorithmic thinking.
7. Students should practice translating algorithmic ideas into working code. It is hoped they will learn the importance of thinking before coding (choosing the right approach), and that issues such as memory requirements and running time can and should be reasoned about before implementation.
8. Students should learn that correctness, like computation time and memory, is just a "resource" that can be traded off with the others, by learning the notion of an approximation algorithm.
9. Students should learn the basic of heuristics and heuristic techniques, so that if asked in practice to solve an NP-hard problem (in the sense of coming up with a very good but not optimal solution), they have some idea of what approaches might give them reasonable answers.
10. Students should see some really cool, unusual, interesting algorithms and data structures, and be forced to work on some really hard, challenging problems, to challenge their thinking and demonstrate the power of theory.

There's probably more I could add, but those seem like ten clear objectives I have for the students who take my class. Perhaps in another post I can elaborate on how I try to see those objective are met.

Like Richard, I should end the post by asking the important open questions:
What have I left out? What should I leave out? What do you think?

13 comments:

Thore Husfeldt said...

A reason to prefer your list to Lipton’s is that you are able to formulate the course goals in terms of skills.

Most of your items could be given in the form “After the course, the student can ———”, where ——— preferably is something they couldn’t do before they took the course. A notable exception is your last point. Maybe, “After the course, the student can nod in recognition at up to three, cool recent algorithms.”

The real challenge for a complexity course, especially if you teach at a department where course goals have to be formulated in terms of Intended Learning Outcomes rather than facts or topics, is indeed to finish that sentence. “After Computational Complexity A–113, the student can———” What?

--

As to the algorithms and data structures outcomes, here’s another good, testable skill: “Construct a counterexample to the correctness of a given algorithm.” Proving stuff incorrect is much more useful than proving stuff correct. It’s also more fun, creative (rather than formal), and can be taught (and examined) with a broader student population.

Michael Mitzenmacher said...

Good addition, Thore! In fact in my exams I usually have a section of the form: "Show the following statement is not true with an example", and I do think finding counter-examples is a useful skill.

Mihai said...

I would assume that at least 50% of your students will ultimately work in software, without becoming theorists. So I think the most important goal of the class is to educate such students that there exist highly non-obvious algorithms that work very well, and they don't know all of them :)

If you manage to convince them of this fact (which is not easy), when they will face a hard problem, they won't just hack something and hope it usually works. They will hopefully start by looking at some literature or just asking somebody.

Daniel Lemire said...

My answer as a blog post:

http://www.daniel-lemire.com/blog/archives/2010/01/13/the-fundamental-properties-of-computing/

Michael Mitzenmacher said...

Mihai -- I think we agree on your goal, though perhaps we're just wording it slightly differently. (Numbers 7 and 10 line up with your goal pretty well...)

Anonymous said...

To me what stands out for its omission is not any particular topic. Rather, what I see in most of this list are very passive verbs, "students should see" and "students should understand". What I don't see is any kind of learning outcome: what specific things should the students be able to do to demonstrate their understanding? How are their abilities changed for having seen what they've seen in the class?

Michael Mitzenmacher said...

Daniel --

Interestingly, the idea that correctness, like time and memory, can be traded off, seems to really shock students when they first hear it. It's one of those things you don't naturally think of -- but once you get used it to it, it's hard to remember what is was like not to think in those terms.

I actually consider it a major philosophical contribution of the class, more important than most people realize.

Michael Mitzenmacher said...

DE -- As I said at the end of the post, perhaps another post will focus on the issue of how the objectives are met -- which is, I think, the answers to your questions.

proaonuiq said...

"“After Computational Complexity A–113, the student can———” What?"

Computational Complexity is theory still under construction. Although the target is taxonomic, i.e. to obtain robust complexity classes (by definitions) and means to classify problems in these classes (trough theorems about the algorithms that solve the problems) IMO we are not yet at this stage. What we have is provisional classes, conditional theorems based on conjetures and the hope to be able to prove these conjetures in the near future (and this makes the whole thing very appealing).

The list in the post seems hard to improve. Maybe iwould include that "find the counterexample" suggestion...

Anonymous said...

It's easy to test whether the objective that a student has merely seen a certain topic is met: just use attendance-based grading.

Thore Husfeldt said...

OK, as an exercise I’ve transformed Michael’s list into intended learning outcomes. Some of the items benefit a lot (it’s a lot clearer exactly where the line is drawn — did we mean 3a or 3b? 4a or 4b), others become awkward.

After Algorithms and Data Structures A-113 the student can

1. state the asymptotic running time of a given algorithm, including recursive algorithms, possibly using amortised analysis

2. model a computational problem in terms of graphs or linear programmes

3a. classify an algorithm by technique: greedy, divide and conquer, dynamic programming, linear programming, or

3b. solve a given problem using algorithmic paradigms such as greedy, divide and conquer, dynamic programming, linear programming,

4a. recognise a formal correctness proof, or

4b. decorate an given algorithm with assertions and invariants to establish its correctness

5. solve a problem by reducing it to another problem, and argue for the hardness of a problem by reducing it from another problem

6. analyse a randomised algorithms, including finding the expected running time using — (fill in whatever you want here: expectation as sum of indicator random variables, bound stuff using concentration of measure… model something as a markov chain… derandomise using the method of conditional … this is a whole blog post in itself)

7. implement and run an algorithm given as pseudocode

8. bound a performance guarantee of a simple approximation algorithm

9. find a (nonoptimal) solution to a hard computational problem instance using local and exhaustive search methods

10. nod in recognition at three really cool, unusual, interesting algorithms and data structures, and brag about having being forced to work on some really hard, challenging problems, to challenge their thinking and demonstrate the power of theory.

--

I’ll add my own:

11. prove an algorithm incorrect by constructing a counterexample

12. refactor an implementation by separating abstract data types from data structure implementations

13. say “I think I’d use a hash function” during an oral exam

Anonymous said...

Nice list. I would be interested in hearing more of what you think about item 4 (formal and rigorous arguments). I don't know that CS departments are handling this correctly.

Kunal said...

One could add something similar to 9:

9.5 Students should learn the basics of exact algorithms (with worst case exponential running time), so that if asked in practice to solve an NP-hard problem (in the sense of coming up with an optimal solution), they have some idea of what approaches might solve small-ish instances. Or at least when they hear of a 100,000 city TSP instance being solved, or of SAT solvers, they have some idea of what's going on.

At the other extreme, one could add
- Students should (really)understand the difference between an nlog n algorithm and an n^3 algorithm, by implementing say Dijkstra and Floyd-Warshall, and running them on instances of slowly increasing sizes.