Thursday, June 25, 2020

Writing Code for a Paper : A Note to Students

This post both relates to some of the stuff I'll be presenting at Friday's STOC workshop on Algorithms with Predictions, but is also for future students in my classes, who sometimes wonder why I have them write code on theory material.  (Somehow this might be a theme for a lecture in my grad class next semester, so maybe these are a first attempt at notes for the lecture.  Comments  and  suggestions welcome.)

Some of the work I'm doing is looking at how queueing systems perform with various sorts of predictions on the times the jobs take.  This particular work I'm doing on my own.  (While most of my research has been and is with collaborators, and it's one of the things I enjoy about computer science -- we're a very collaborative field!, which seems to surprise many people -- I still sometimes like to do research projects on my own.  I've looked at queueing systems since my PhD thesis, and it's a bit outside the research interest of most of my collaborator pool, and it's "fun" sometimes to do my own thing.  The reason why "fun" is in quotes is described below.)

Often in my work in queueing I'm looking at mean-field limits (meant to model infinite systems of queues, which provides a good approximation for large finite systems under reasonable assumptions), where I can derive families of differential equations describing the system behavior.  I can also simulate the large finite system directly, and make sure the results match.  I generally do this for all of these types of papers.

Now the numbers I get from simulating the system directly and from simulating the differential equations should match (say within 1% or so).  If they don't, something is wrong.  In an effort to avoid wrongness, I won't consider the paper ready for outside consumption until I get a match.  Unfortunately, there are three ways things can go wrong.

1.  My simulation code for the queueing system might have bugs.
2.  My code to evaluate the differential equations might have bugs.
3.  My equations themselves might have bugs.

And I find there are two main categories of bugs.  Sometimes the bugs are simple/standard coding mistakes -- I'm off by 1 on an index, or I cut and paste and forget to change an i++ to a j++ in one my double loops, or I type x instead of a y.  Usually it's pretty easy to find these things, although I've had times where a hidden typo took hours to find.  But sometimes the bug is a thinking mistake -- I've forgotten a subcase and so my equations aren't complete (and so my code evaluating the equations won't give the right answer), or I've not handled a subcase correctly in my simulation.  That type usually takes longer.

Usually, the first time through, most all of these types of bugs happen -- my math is off, I've typed some stuff wrong, it can all happen.  And then, like coders everywhere, I go through and fix it.  And it's painful.  Sometimes everything goes right, a quick check or two and everything works.  For more complicated stuff, it's more time figuring out what went wrong than setting up the code to begin with.  And being the personality type to not let things sit, that can mean late nights figuring out what went wrong.

For my talk this week, there was one last problem I wanted to include, which meant finally taking the model and writing the equations and code.  I didn't even need it for the talk, but it's also the last bit before I put a paper draft on arxiv, so taking advantage of a deadline, I figured now was the time.  Which means the last 2 days, I've spent many hours (and a late night) trying to remove the disagreements.

On the plus side, when everything finally works, it's a wonderful feeling.  And it always makes me feel better when I have worked to verify my math this way;  this time, what kept me up well past midnight and took several hours to track down was actually a boundary case I had left out of the equations.  (I had looked at the equations over and over again without noticing I had left out the subcase;  I had to step through numbers from the differential equations one time step at a time to track down what was missing, and then the numbers told me what I had done wrong.)

On the down side, it's work, and debugging is never particularly fun.

For students out there, maybe I'm just explaining that I understand the pain that I am putting you through.  You may wonder why I have you do simulations that take a few hours if you do them well, but days if you don't think through the best approach.  But using programming and theory together can be powerful;  it's helped me countless times in my research.

(Related: on theory and experiments that I've written on before, along with a viewpoint by Jeffrey Ullman.)