My Biased Coin: January 2013

Thursday, January 24, 2013

On My Drive in This Morning...

It would, of course, be completely inappropriate for me to write that if you're interested in illegal prescription drugs, you should go talk to Stefan Savage of UCSD. But it's more appropriate for me to give a shout-out to him for his recent appearance on Planet Money's podcast, in Episode 430: Black Market Pharmacies and the Spam Empire Behind Them, which I listened to on my way into work this morning. It's well worth listening to. The podcast's take, in summary (it matches my experience with other, non-drug businesses built on online advertising, and I think I'm summarizing Stefan correctly): it's actually a pretty boring business. Online pharmacies are just trying to sell something and get their cut, and spam mail is the effective way for them to advertise. (Since selling prescription drugs without a prescription in the US is illegal, they can't exactly advertise on TV.) Stefan mentions in the podcast that in the cases he's examined, they don't appear to be trying to send you placebos or bad product; they're just making their margin, like other retail businesses.

Maybe Stefan will see this and offer some pointers to research in the comments. In any case, I enjoyed listening to his insights and smooth-sounding voice on the drive.

Tuesday, January 22, 2013

Auernheimer Speaks

Auernheimer -- who, as I've discussed on this blog, was convicted of a felony for accessing AT&T servers -- put out a statement yesterday.

Monday, January 21, 2013

Here Comes Another One

And yet another security story -- this one titled "find a bug, get expelled," or something close to that, in various places. I'm going to have to find out where to send an opinionated letter to Dawson College.

Also, if you haven't been reading it, I'd really recommend reading the last week or so of posts on Harry Lewis's blog, which go into a lot of depth on issues related to the Aaron Swartz case, university governance and culture, and the Harvard cheating case.

Sunday, January 13, 2013

Aaron Swartz : Links

If you haven't done so already, today is a good day to reflect upon the life of Aaron Swartz.

If you haven't heard of him or need background, you can always start with his Wikipedia page.

After that, I'd recommend reading the words of Cory Doctorow, Lawrence Lessig, and Glenn Greenwald.

Finally, it's worth reading the family's statement.

I am finding it disturbing that this is my second post involving government prosecution this year. (Here was the first.) This seems like it may be an important ongoing concern, although given the setting, a topic for another day.

Saturday, January 12, 2013

Government 1310 -- What's New?

So, as many of you may remember, Harvard had a rather embarrassing cheating scandal flare up at the start of the academic year, involving over 100 students. To be clear, that's the number apparently involved in investigations; it's not the number punished. That's my point here -- we don't know how many were found to have cheated, or, really, much other information.

A bunch of Crimson articles can be found here. There don't seem to be any after the beginning of October. We originally heard that the students would have their case outcomes determined by November. So, where do we stand?

I might be missing something, but I can't find any updates on the situation. Please inform me if there's more I don't know about. But it seems to me that the faculty -- and, arguably, the public at large -- merit some further information. If the cases are still ongoing, a brief note saying that, along with a statement that the lessons learned will be discussed at an appropriate time, would be fine. But honestly, I'm disappointed and disturbed (albeit not surprised) by the lack of information.

I understand that this can't be a popular topic in administrative circles. But at the time the news broke, there was a great deal of talk about how the incident should open the door to further discussions about cheating and pressure on students. And certainly after the fact there should be greater understanding of what happened in this specific class and how we might prevent it in the future. It seems to me there are basic things the faculty should know -- like whether it turned out that 10 or fewer students were required to withdraw, or more than 50. And ideally, we should know it sooner rather than later. In fact, it seems like the beginning of the new semester is just the right time to get this information out, as faculty share with students their expectations regarding collaborative work for this set of classes.

Perhaps the powers that be are just waiting for the semester to start. But really, I think we're due an update, with some basic analysis of what ended up happening and how we as a community should think about responding to it. I wonder when (if?) we'll see it.

Wednesday, January 09, 2013

Any Good Voronoi Code Out There?

Help!

I've got a wacky idea I'd like to explore (pseudo-preliminary-pre-research stage) which will require some code, namely for Voronoi diagrams. What I'd like seems like it should be available. I think I want the following:

1) I input a list of points -- 2-D is fine, but hey, if the code can handle 3-D, more time-wasting fun.
1a) Update: I suppose what I really want is code that works on the "2-D torus" -- i.e., say, on the unit square [0,1]^2 where the boundaries wrap around, so that there's symmetry. That would be ideal, but from what I understand "torus" versions of the algorithms are harder to find, so maybe I'll just have to deal with regular 2-D and truncate somehow.
2) For the output, all I really want is the area of each cell corresponding to each point. Sure, if more information is available -- things like number of sides of the cell, etc. -- again, more time-wasting fun for me. But for now cell area seems most interesting. (I don't need pretty pictures.)
3) Now, the painful part -- I expect to be doing a lot of incremental inserting and deleting of points. So I'd like to have code that's very fast if, say, I delete a point from my list and insert a new point elsewhere. The code I've seen available seems to (if I'm understanding right) just implement the algorithms where you have a fixed list of points. Some of them use incremental insertion and can therefore probably be modified easily to handle point insertions, but not point deletions. With point sets of thousands or more, re-computing from scratch every time I delete a point seems like it will be way too slow for my explorations. I realize handling deletions is harder, but there exist algorithms for it, so I was surprised I can't easily find relevant code. But maybe I'm just looking in the wrong place.

I suppose that whatever language the code is in, I can figure out how to use/wrap/rewrite it for my needs, so I shouldn't quibble about those sorts of issues.

Strangely, I can't seem to find anything that meets these needs. Feel free to set me straight via comments or mail. Or, if it seems to be something I'll need to implement myself, feel free to point me to the most helpful relevant papers.
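To make items 1a and 2 concrete, here is a minimal brute-force sketch in Python. It is a baseline and not the incremental algorithm asked for in item 3: it recomputes every cell from scratch by clipping a unit square centered on each point against the perpendicular bisectors to the 3x3 periodic copies of every other point, which costs on the order of n^2 clips overall. The function names (`clip_half_plane`, `polygon_area`, `torus_cell_areas`) are made up for illustration, and the sketch assumes distinct points in [0,1)^2.

```python
from itertools import product


def clip_half_plane(poly, p, q):
    """Keep the part of convex polygon `poly` that is at least as close to p as to q."""
    dx, dy = q[0] - p[0], q[1] - p[1]
    # f(v) <= 0 exactly when v lies on p's side of the perpendicular bisector of p and q.
    c = dx * (p[0] + q[0]) / 2 + dy * (p[1] + q[1]) / 2

    def f(v):
        return dx * v[0] + dy * v[1] - c

    out = []
    for i in range(len(poly)):
        a, b = poly[i], poly[(i + 1) % len(poly)]
        fa, fb = f(a), f(b)
        if fa <= 0:
            out.append(a)
        if (fa < 0 < fb) or (fb < 0 < fa):
            # The edge a-b strictly crosses the bisector: add the crossing point.
            t = fa / (fa - fb)
            out.append((a[0] + t * (b[0] - a[0]), a[1] + t * (b[1] - a[1])))
    return out


def polygon_area(poly):
    """Shoelace formula for the area of a simple polygon."""
    s = 0.0
    for i in range(len(poly)):
        x1, y1 = poly[i]
        x2, y2 = poly[(i + 1) % len(poly)]
        s += x1 * y2 - x2 * y1
    return abs(s) / 2


def torus_cell_areas(points):
    """Voronoi cell area of each point on the unit torus (assumes distinct points in [0,1)^2)."""
    shifts = list(product((-1, 0, 1), repeat=2))
    areas = []
    for i, (px, py) in enumerate(points):
        # The point's own wrapped copies already confine its cell to this unit square.
        cell = [(px - 0.5, py - 0.5), (px + 0.5, py - 0.5),
                (px + 0.5, py + 0.5), (px - 0.5, py + 0.5)]
        for j, (qx, qy) in enumerate(points):
            if j == i:
                continue
            # Clip against all nine periodic copies of the other point.
            for sx, sy in shifts:
                cell = clip_half_plane(cell, (px, py), (qx + sx, qy + sy))
        areas.append(polygon_area(cell))
    return areas
```

As a sanity check, the areas returned for any point set should sum to 1, and for the two points (0.25, 0.25) and (0.75, 0.75) each cell has area 0.5 by symmetry. Deleting or moving a point here just means calling `torus_cell_areas` again on the new list, which is exactly the recompute-from-scratch cost the post is hoping to avoid.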

Sunday, January 06, 2013

Security, Disclosure, Legality

I guess I'm late to this news, but I stumbled across the case of Andrew Auernheimer, who was convicted of one count of identity fraud and one count of conspiracy to access a computer without authorization, for posting to Gawker that AT&T had a data leak that allowed anyone to get information about a set of AT&T iPad users. A description of what occurred can be found for example in this Wired article. I also recommend this opinion on the matter by Matt Blaze, or this take by Ed Felten.

This case hit home for me because, as you may recall, last year we had an entirely similar situation with Yelp. We found that they were accidentally leaking personal user information. We collected data to back our claims if needed, brought it to them, and they fixed it. Moreover, when we told them this leak should be made public (after they had fixed it), they agreed to do so; their blog post on the issue remains up. As I mentioned at the time, it was an exemplary experience. Now I feel even more so.

Did we handle it differently, by contacting the vendor first? Yes. And perhaps these two situations exemplify some of the differences between what some people call full disclosure vs. responsible disclosure. But it's very unclear to me why what Mr. Auernheimer did -- finding a flaw and disclosing it in the manner he did -- would be considered illegal. Or, perhaps more to the point, it's not clear to me at this point what the difference is between what he did and what we did, which worries me, because as far as I can see, we should not have been anywhere close to any legal line in how we dealt with a found data leakage.

Certainly we thought our responsible disclosure approach -- contacting Yelp -- increased the likelihood of good will and a good outcome. And, in retrospect, we may have been depending on our status as university researchers. A legal action against us would have led to a lot of negative press for Yelp (I think), and we'd have had a lot of support from the academic community. I should emphasize, though, that I'm not clear that we would have gotten much help from our universities. I contacted Harvard legal, and they were very hands-off. Here are examples of the wording in their response to us. (Note: this was going through another layer, hence the third-person "the researchers" wording -- it's not just legalese.)

If the researchers move ahead to disclose this publicly, as they intend to do, they should understand that the discovery and announcement is something for which they are responsible in their individual capacities (and should not be held out as an activity done by or on behalf of Harvard).

If there is some liability that results from the discovery or their announcement of it, the researchers should understand that they could not look to Harvard to cover that liability.

There are, as I’m sure you know, laws that prohibit certain kinds of hacking. It’s important for the researchers to be very comfortable that they were not engaged in any activity that could be construed as posing under another name, unauthorized breaking into a site, etc.

In the end, I think we were depending on common sense -- we found a leak, we aimed to get it fixed, we wanted it announced afterwards, for the obvious motivations -- credit, and protecting others. Auernheimer didn't go to AT&T first, but what he did does not seem completely outside the realm of common sense to me. (I suppose this is the heart of the full disclosure vs. responsible disclosure debate.) So how as researchers do we protect ourselves from felony charges? How as a practical matter do we improve computer security in this legal environment, or how can we change the legal environment to improve computer security while maintaining researchers' rights?

Auernheimer is due to be sentenced in February, although the articles suggest he will appeal his case.

Friday, January 04, 2013

No Stress

Greg Morrisett points me to this bit of silliness to start my morning:

University professors have a lot less stress than most of us. Unless they teach summer school, they are off between May and September and they enjoy long breaks during the school year, including a month over Christmas and New Year’s and another chunk of time in the spring. Even when school is in session they don’t spend too many hours in the classroom. For tenure-track professors, there is some pressure to publish books and articles, but deadlines are few. Working conditions tend to be cozy and civilized and there are minimal travel demands, except perhaps a non-mandatory conference or two. As for compensation, according to the Bureau of Labor Statistics, the median salary for professors is $62,000, not a huge amount of money but enough to live on, especially in a university town.

Generally, I love my job, and the great flexibility that comes with it. And I do think it's less stressful than many other careers (at least, post-tenure, and in my mind even pre-tenure as well). So the topic sentence is one that is hard for me to argue with. But really, the description in the rest of the paragraph is so far from reality, it makes me giggle. And I imagine the stress levels are significantly higher for professors outside of CS and the Ivy League -- for example, I make significantly more than $62,000, but I think that the way the author blithely ignores that making "enough to live on" may indeed be stressful is just absurd.

Right now, of course, is actually a stressful time of the year, especially as I have to plan for next semester's class. Pre-enrollment numbers are at about 129, suggesting something like a 5-10% rise from last year. I don't have all my TAs in place, I have to revise the schedule/first few lectures to take into account changes in the courses before mine, and I have other administrative duties sucking up time before classes begin.

But yes, I'm well aware I'm under much less stress than my college roommate the cardiac surgeon.

Thursday, January 03, 2013

Answers for Assignments?

One of the smarter students I've known at Harvard, who has been a Teaching Assistant for me the last two years (and who I'm hoping will be back again this year), recently made the following argument to me, encouraging me to hand out written answer keys:

There are two approaches to giving quality feedback on student work: writing good comments for each student, which is O(N) for N students, and releasing answers to the pset problems, which is close to O(1). In classes that release pset answers, TF comment-writing decreases to almost nothing. We tried, last year, to implement an alternative O(1) solution, the "answer review sections" that I ran for each pset. I think that students who had been confused on the pset had trouble following these sections, and I don't think that they were nearly as effective as written solutions.

Here he's cleverly played a novel argument: I should be providing answers not because it helps the students, but because it helps the teaching assistants, who are woefully overworked. He knows I agree on this last point. He also knows that I'm unsympathetic to the argument that it helps students to be given written solutions.

(Parenthetically: Since this comes up often, I feel it necessary to be clear that it's not that I'm unsympathetic to students who have trouble on my apparently very difficult homework assignments; it's just that I don't feel that handing out written solutions is the right response. First, by what I consider necessity, I re-use problems, and experience shows that, given the opportunity, even students who you would not imagine would copy old solutions run into situations where they believe they need to copy old solutions, which generally leads to them getting kicked out for a year at Harvard. I don't like having to turn in students and then seeing them get kicked out for a year. Second, even if you feel that cheating prevention is not a suitable reason not to hand out assignment solutions, I'm not a big believer that written answer sheets help students learn better than other methods, such as getting more individual feedback and working out the problems afterwards from the assignments with TAs or others. Or, if you want to see worked-out examples, buy one of the recommended textbooks and read it as well. The fact that so many students fail to take advantage of opportunities to obtain and/or work through answers after the fact, in the absence of written answers being handed out, remains troubling to me.

Maybe I should just use MOOC techniques and have students turn in numerical solutions electronically that can be script-corrected. And videotape my lectures and just replay them each year so I can stop showing up.)

This year, however, I'm entertaining his proposal... primarily because it's possibly my last year teaching the undergrad algorithms class for some time. (With any luck, I'm on sabbatical next year -- and I'm likely to be passing the course on, after what will be the 15th year in a row teaching it, to one of our newer faculty.) If someone else is teaching it, the calculus changes from my perspective. (Though I suppose I should check that the incoming faculty are OK with the idea.)

What's your take? It seems like an interesting way to change things up.
