I've spent the last couple of days attending the Andreessen Horowitz Academic Round Table. Andreessen Horowitz is a leading venture capital firm, founded by Marc Andreessen and Ben Horowitz. The goal of the meeting as it was described to me was to get a number of academics and entrepreneurs together, along with the VCs, and get to know each other in a non-pressured environment. Andreessen Horowitz seems very interested in promoting the technical CEO (as opposed to the "MBA CEO") as a viable option for startups, and seems interested in establishing long-term connections with universities. They're obviously very well connected at Stanford and Berkeley, but recognize that there is startup potential at other universities that is probably underutilized.

Most of the program was a series of short technical talks, mostly by academic types, including many academics with start-up experience. Some talks were also by people leading start-ups that Andreessen Horowitz has invested in. The talks were uniformly of very high quality. It was like a super-interesting conference, perhaps made more interesting because of the breadth of computer science areas covered. The speaker list included Susan Athey, Ed Felten, Michael Franklin, Michael Kearns, Nick McKeown, Rob Miller, Andrew Ng, Tuomas Sandholm, and Stefan Savage, just to name some. On the company side, the speakers included Ben Milne of Dwolla, who explained a lot of the cruft that has built up in the systems for moving money around, and how he was getting rid of it. Vijay Balasubramaniyan of Pindrop Security described their fraud detection measures for phone-based fraud. Jonathan Downey of Airware discussed developing open architectures for the commercial drone market. (And more.)

Before I go on, the most entertaining talk bit: Rob Miller showed a demo of a crowdsourcing program for shortening your document -- to get you down to that 10-page limit when you're a 1/4 page over, and you're fed up hours before the conference deadline. (Looking online, I think this must be the Soylent project.) I hadn't seen it before. For me, as a CS professor, that's a great example of crowdsourcing.

Beyond the technical talks, there were some interesting discussions about the connections between research and entrepreneurship, about how we might teach our students so that they can be entrepreneurial both in technology and in business, about what VCs like Andreessen Horowitz are looking for in (faculty-based) startups, and about how many universities seem to get in the way rather than help their faculty (or graduate students, or even undergraduate students) start companies. (It seemed like the most controversial topic involved who had the worst, most unhelpful technology licensing office...) In many ways, these discussions felt more interesting than the talks, and I wish we had more time for them -- even though the talks were really good, this was stuff that was new to me, and where the people in the room were experts I wanted to hear from on these topics.

There was much too much fodder at this meeting for a single blog post, so I'll try to cover some of these themes a bit more in depth over the next few days. One thing that might come up in the future -- that I think would be great and recommend (to them and whoever might be able to attend) -- is that they might do another meeting like this but focus on inviting graduate students. After all, professors already have exciting jobs they enjoy and are attached to -- graduate students may be a bit more eager to take the leap and try to build a company out of their ideas.

## Monday, September 30, 2013

## Saturday, September 28, 2013

### Yelp Reviews News

Giorgos Zervas is in the news again (with Michael Luca of Harvard Business School), this time for his work on filtered/fake Yelp reviews. See also this piece in the Wall Street Journal blog. High-level issue: filtered reviews are on the rise, as businesses recognize that social media reviews can be important to their business. The work is quite timely given the recent NY Attorney General's investigation into fake reviews that led to a number of fines.

## Friday, September 27, 2013

### Dina Katabi, MacArthur Genius

It's always nice to see a computer scientist on the list for an award that spans multiple disciplines. This year, we get to congratulate Dina Katabi for earning a MacArthur Fellowship. Dina's work focuses on wireless, but seems to span more areas than you could imagine. She was part of the recent team that came out with the breakthrough work on sparse FFT. She's done some of the most well-known work in network coding (XORs in the Air). And recently, she's been working on using wireless signals to see through walls.

The official citation is here. (Apparently, Dina is the "894th" Fellow.)

## Thursday, September 19, 2013

### Reviewing Bad

A pun title, relating to two quick things.

First, I had the wonderful experience of getting to see (through a special deal set up by Harvard's faculty development office) All the Way at the American Repertory Theater. It's a new play following the history of Lyndon Johnson (and Martin Luther King) covering November 1963 to November 1964 (from when Kennedy was assassinated to when Johnson won the presidential election). I was silly enough to not realize when I got the tickets that Bryan Cranston, of ~~Malcolm in the Middle~~ Breaking Bad fame, was playing Johnson as the lead. It was a really fantastic show. (Hey, the rest of you in Harvard CS who aren't going to these things -- why not? They're awesome! Krzysztof was the only one I saw this time...) Well acted and fascinating history. The cast was amazing (and large -- I think 17 actors total), and I kept recognizing them from TV. My inner gushing fan was set off by Michael McKean -- at one point, some of the actors were out in the audience area, and I excitedly noted McKean was about six feet from me. (I chose not to seek his autograph given the performance was going on at the time.)

[Note -- sadly, the show is already sold out... at least for this run.]

Ah, then the bad news. After being on the executive PC for STOC 2013, I heard from multiple colleagues afterwards who had their papers rejected about what they felt was the low quality of reviewing. (In my defense, I commiserated with several of them at the time.) So, after getting the reviews from the SODA PC (for my rejected papers), I feel obliged to comment. Quality-wise, they're terrible. (Not universally so... but some of them....) I was going to put in specific examples, but after the heat of the moment died down, my cooler head prevailed and determined that was inappropriate. But suffice to say that beyond the usual "we don't understand the motivation" type stuff, there are comments that are factually wrong that betray fundamental misunderstandings, and opinions regarding "what's important" in the paper that are -- in my experience -- just off the mark. I've been through it before -- you suck it up, find the useful comments, rewrite, and re-submit. But it is disturbing (from both sides, as the receiver of reviews and as one helping manage the reviewing process), and worrisome if it's an increasing problem for many submitters.


## Friday, September 06, 2013

### Guest Post by Justin Thaler: A "Mini-Survey" of Practical Verification Algorithms

Justin Thaler, after presenting his most recent work on verifiable computation at Crypto, decided to write up a mini-survey on recent work in the area, and he offered to make it a blog post. It's a somewhat longer-than-average blog post, but well worthwhile. So without further ado, here's Justin.

------

In the last few years there has been an enormous amount of work devoted to the development of "practical" protocols for verifiable computation, with multiple approaches being pursued across several research groups. It has recently become clear to me that the sheer volume of work and the sometimes subtle relationships among the various approaches have made the area difficult to follow. The goal of this post is to provide a reasonably concise description of the various approaches and the pros and cons of each, with the hope that this will serve as a handy reference for those interested in the area, at least until a more thorough and formal survey can take its place.

==============

Some Quick Background:

The field of verifiable computation seeks to develop protocols that allow a computationally weak verifier (such as a cloud computing user) to send its data to a powerful but untrusted prover (such as a cloud computing service). The verifier can then ask the prover to perform some computation on the data, and the protocol should enable the prover to return the output of the computation along with a guarantee that the output is correct. Concretely, the ultimate goal is to start with a program written in a high-level language like C and automatically ensure that the prover faithfully executed the program.

The only background required to understand this post is having encountered *interactive proof systems* and *argument systems*. To review, in an interactive proof, the prover tells the verifier the output of a computation, and then they have a conversation during which the prover tries to convince the verifier that the output is correct. Any interactive proof must satisfy two properties: the first, called completeness, says that an honest prover will convince the verifier to accept, and the second, called soundness, says that a dishonest prover will be caught with high probability even if the dishonest prover is computationally unbounded. An argument system is an interactive proof system that is sound only against polynomial-time provers.

===============

Sources of Overhead:

I will focus on projects that are reasonably general-purpose, require only one prover, and have attempted to refine theory toward implementation. In the general case, all approaches satisfying these criteria work by first turning the high-level computer program into an arithmetic circuit or a set of "constraints", and then using complexity-theoretic or cryptographic machinery (or both) to check that the prover correctly evaluated the circuit or satisfied the set of constraints. I will refer to the "high-level program ==> circuit/constraints" transformation in any system as the 'front-end', and the actual protocol used to check that the prover correctly evaluated the circuit as the 'back-end'.

Thus, there are two primary sources of overhead in existing systems: the overhead in the front-end (as some computer programs that run in T machine steps may only be computed by circuits with far more than T gates), and the overhead in the back-end (i.e., the extra work required for the prover to evaluate the circuit with a guarantee of correctness, relative to evaluating the circuit with no guarantee). The existence of two sources of overhead complicates the comparison among many of the papers that I will discuss, as work on front-ends and back-ends has been interleaved, some back-ends work with more general circuits than other back-ends (and hence can interface with more efficient front-ends), and there has been cross-pollination among systems. I will attempt to clearly delineate these issues below by focusing on contributions to the design of back-ends and front-ends separately, beginning with the former.

=================

Summary of Back-Ends:

Existing back-ends fall into three broad categories: interactive proofs, argument systems with pre-processing, and argument systems without pre-processing. I will describe each of the three approaches below, starting with the most efficient but least general approach and moving toward more general but less efficient approaches.

1) Interactive Proofs: In 2008, Goldwasser, Kalai, and Rothblum (GKR) gave a powerful interactive proof protocol for circuit evaluation. The costs to the verifier in this protocol grow linearly with the circuit depth, so the protocol only achieves an efficient verifier if the circuit has small depth (equivalently, if the function computed by the circuit has an efficient parallel algorithm). Fortunately, many of the computations actually being outsourced do exhibit large amounts of parallelism. In 2012, Cormode, Mitzenmacher, and I implemented the GKR protocol plus refinements -- most notably, we brought the runtime of the prover down from Omega(S^3) in a naive implementation to O(S log S), where S is the number of gates in the circuit. Our implementation demonstrated that the concrete costs to the verifier are very low, but there were two downsides to the implementation: First, despite our refinements, the prover runtime remained a bottleneck (the prover took roughly three orders of magnitude more time than that required to evaluate the circuit with no guarantee of correctness). Second, unless the circuit satisfied a certain 'regularity' condition on its wiring pattern, the verifier required an expensive pre-processing phase to "extract" information about the wiring of the circuit.

Very recently, I addressed these bottlenecks by showing the following. First, for 'regular' circuits, the runtime of the prover can be reduced from O(S log S) to O(S). Concretely, for these circuits, the prover now runs about 10 times slower than what is required to evaluate the circuit gate-by-gate with no guarantee of correctness. Informally, a circuit has a 'regular' wiring pattern if there is a way to label each gate g with a binary string such that the labels of the in-neighbors of g are very simple functions of the label of g (this condition is satisfied by natural circuits computing naive matrix multiplication, pattern matching, FFTs, and several other problems).

Second, for any data parallel computation (i.e., any setting where an arbitrary sub-computation C is applied independently to many pieces of data, before possibly aggregating the results), the cost of the pre-processing stage for the verifier and the overhead for the prover can be made to depend only on the size of the sub-computation C, and not at all on the number of pieces of data to which C is applied. Essentially, this holds because the data-parallel computation is 'regular' at a higher level of abstraction than in the first result -- while the sub-computation C may have very irregular wiring, the various invocations of C do not interact at all.

Third, I gave a simple protocol for matrix multiplication that avoids the circuit representation entirely, allowing the prover to compute the correct answer using an arbitrary algorithm, and then spend O(n^2) extra work proving the answer is correct.

[Aside on Matrix Multiplication: It is worth comparing this new matrix multiplication protocol to classical approaches like Freivalds' algorithm. The latter requires no interaction and *no* extra work for the prover. The advantage of the interactive approach is most evident for computations that invoke matrix multiplication as a means to an end rather than as an end in itself. For example, the best-known algorithms for computing the diameter of a graph work by repeatedly squaring the adjacency matrix. Running Freivalds' algorithm to verify these algorithms would require the prover to send the intermediate matrices to the verifier, which would be terabytes of data even for graphs on 1 million nodes. The interactive approach allows the prover to send only the right answer (which is just a single number), and then spend a few KBs of extra communication and a low-order amount of extra work proving the answer is correct.]
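For concreteness, here is a minimal sketch of Freivalds' check in Python (plain list-of-lists matrices, and repetition to drive down the error probability; this is just an illustration of the classical algorithm, not any of the systems discussed in this post):

```python
import random

def freivalds(A, B, C, k=30):
    """Probabilistically check A @ B == C in O(k * n^2) time.

    Each round multiplies by a random 0/1 vector x and compares
    A(Bx) with Cx; a wrong C survives a round with probability
    at most 1/2, so k rounds give error probability <= 2^-k.
    """
    n = len(C)
    for _ in range(k):
        x = [random.randint(0, 1) for _ in range(n)]
        # Three matrix-vector products instead of one matrix product.
        Bx = [sum(B[i][j] * x[j] for j in range(n)) for i in range(n)]
        ABx = [sum(A[i][j] * Bx[j] for j in range(n)) for i in range(n)]
        Cx = [sum(C[i][j] * x[j] for j in range(n)) for i in range(n)]
        if ABx != Cx:
            return False
    return True
```

Note that the verifier here does only O(n^2) work per round, versus the super-quadratic cost of computing the product itself.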

2) Argument Systems with Pre-Processing: In 2007, Ishai, Kushilevitz, and Ostrovsky gave an argument system that required an expensive pre-processing phase for the verifier, but avoided the use of short PCPs (short PCPs are complicated complexity-theoretic protocols upon which most argument systems were traditionally based -- see the section on Argument Systems without Pre-processing below). Systems with pre-processing are primarily of interest in data-parallel settings, because the verifier can only save work when the set-up costs are amortized over many instances of the same computation (i.e., when the same computation is applied independently to many different inputs).

In 2012, Setty, McPherson, Blumberg, and Walfish began refining IKO's protocol. They called their first system Pepper, and subsequent refinements were described in a follow-up system Ginger. Further follow-up work incorporated an ingenious encoding of computations due to Gennaro, Gentry, Parno, and Raykova (GGPR), and resulted in a system called Zaatar that (mostly) supersedes Pepper and Ginger.

Parno, Gentry, Howell, and Raykova introduced another system called Pinocchio that is also based on the theoretical innovations of GGPR. Also, very recent work by Ben-Sasson, Chiesa, Genkin, Tromer, and Virza (BCGTV) implements a back-end that is closely related to Pinocchio.

The high-level comparison between the Zaatar back-end and the Pinocchio/BCGTV back-end is the following. Zaatar is somewhat more efficient because it uses less cryptographic machinery (roughly, the verifier in Zaatar poses queries in the clear, while Pinocchio/BCGTV require the prover to compute over encrypted queries.) However, Pinocchio/BCGTV provide functionality that Zaatar does not. Specifically, the former supports public verifiability and zero-knowledge. Additionally, it consists of only two messages, one from the verifier to the prover and one from the prover to the verifier. Finally, it allows set-up costs to be amortized over an indefinite number of instances of the same computation -- in contrast, the pre-processing costs of Zaatar can be amortized over many instances of a computation only if all instances are verified at the same time. It is worth mentioning that the Pinocchio/BCGTV back-ends are based on non-standard cryptographic assumptions, while Zaatar is based on standard ones.

3) Argument Systems without Pre-Processing: Ben-Sasson, Chiesa, Genkin, and Tromer (BCGT) have been pursuing argument systems that avoid a pre-processing phase for the verifier. These argument systems are based on short PCPs, and while existing work on this topic is still theoretical in nature (focusing on reducing the concrete costs of existing PCPs), implementation efforts are reportedly underway.

===============

Comparison of Back-Ends:

Comparison of Argument Systems without Pre-Processing to other approaches: The advantage of the short-PCPs approach relative to interactive proofs and argument systems with pre-processing is that the former *never* requires a pre-processing phase for the verifier. The disadvantage is that the overheads to the prover are likely to be substantially higher than they are in argument systems with pre-processing. For example, the short PCPs described by BCGT require sequential repetition of the protocol many times in order to drive down soundness error. Like Pinocchio and Zaatar, they also require performing FFT operations on objects as large as a computation transcript, but unlike Pinocchio and Zaatar, these short PCPs also require working over fields of small characteristic (and hence can only be combined with front-ends that generate circuits over these fields), require the use of non-trivial finite field algorithms, and require circuits that satisfy certain algebraic notions of regularity. Precisely how large these overheads are in practice remains to be seen.

Comparison of Interactive Proofs to Argument Systems with Pre-Processing: The advantages of interactive proofs are three-fold. First, they are secure even against computationally unbounded provers, while argument systems are only secure against polynomial-time provers. Second, interactive proofs can avoid pre-processing entirely for structured computation (e.g. 'regular' circuits) and minimize pre-processing for data-parallel computation, while argument systems with pre-processing (discussed above) inherently require an expensive pre-processing stage for the verifier. Third, interactive proofs achieve unmatched prover efficiency when they are applicable, and in fact can avoid the circuit representation entirely for fundamental primitives like matrix multiplication. Concretely, Zaatar's prover is roughly 3 orders of magnitude slower than evaluating the circuit with no correctness guarantee (see e.g. these slides by Mike Walfish), which is roughly commensurate with the prover overhead in my original GKR implementation with Cormode and Mitzenmacher, but is a couple orders of magnitude slower than what we can now achieve with interactive proofs for "regular" circuits (see above).

The disadvantages of interactive proofs are the following. First, they are only applicable to small-depth circuits (i.e., parallelizable computation). Second, they do not support "non-deterministic" circuits, which can be extremely useful for turning high-level programs into small circuits as described in the section on front-ends below (there is evidence that lack of support for non-determinism is inherent in the use of interactive proofs). This means that certain kinds of computations such as those involving many random accesses to memory or sorting/comparison operations are problematic for interactive proofs. Third, interactive proofs cannot support properties like zero-knowledge and public verifiability that are achieved by Pinocchio/BCGTV. Fourth, interactive proofs require logarithmically many rounds of interaction between prover and verifier, while argument systems typically require just one or two rounds.

(I maintain, however, that the round complexity of interactive proofs isn't a big deal. First, interaction can be generically removed via the Fiat-Shamir heuristic, which is secure in the random oracle model and may be acceptable in practice. Second, any system allowing the verifier to request a specific computation in the first place is likely to involve a remote procedure call and hence incur round-trip delays anyway. Third, browsers already send a separate request for every image and script on a web page, and typically a browser cannot substantially parallelize these requests.)
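As a toy illustration of the Fiat-Shamir idea mentioned above: each verifier challenge is replaced by a hash of the conversation so far, so the prover's messages determine the challenges and no interaction is needed. (The function name and the choice of SHA-256 here are just for illustration; real instantiations need care.)

```python
import hashlib

def fs_challenge(transcript, modulus):
    """Derive the verifier's next challenge from the transcript so far.

    Because the challenge is a deterministic function of the prover's
    earlier messages, the prover cannot pick its messages after seeing
    the challenge -- which is what interaction was enforcing.
    """
    digest = hashlib.sha256(transcript).digest()
    return int.from_bytes(digest, 'big') % modulus

# The prover appends each of its messages to the running transcript and
# computes the next "verifier" challenge itself:
p = 2**61 - 1                      # a field modulus, for illustration
challenge = fs_challenge(b'round-1 prover message', p)
```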

====================

Summary of and Comparison of Front-Ends:

Ginger, Zaatar, and Pinocchio all contain compilers that turn high-level programs into (non-deterministic) circuits (Pinocchio was the first to handle a subset of C, while the others initially handled a different high-level language but have since moved to C). Vu, Setty, Blumberg, and Walfish (VSBW) built a front-end for an implementation of the GKR protocol (this front-end must generate small-depth circuits that are either deterministic or contain a small number of non-deterministic inputs, since the non-deterministic inputs must be sent explicitly to the verifier. It should also be clarified that their compiler does not produce regular circuits, and hence the GKR protocol requires a pre-processing phase when applied to these circuits, and amortizes these costs over many instances of the computation exactly as in Zaatar/Ginger/Pepper). Given a high-level program, the VSBW front-end also automatically determines whether Zaatar or GKR would be most efficient to verify the computation and runs the better of the two protocols, and their system as a whole is called Allspice.

However, none of the front-ends above efficiently supports programs that utilize random access to memory. Two recent front-ends change this. One is the front-end developed by BCGTV mentioned above, which implemented earlier theoretical work by the first four authors. Given a high-level program, their approach to circuit generation roughly works as follows. They first compile the C program into actual machine code for a simple Random Access Machine (RAM). They then generate a circuit which takes an entire transcript (sorted by time) of the RAM computation as a non-deterministic input, and checks that the transcript is valid. This requires checking the transcript for both *time consistency* (i.e., that the claimed state of the machine at time i correctly follows from the machine's claimed state at time i-1) and *memory consistency* (i.e., that every time a value is read from memory location x, the value that is returned is equal to the last value written to that location).

Their circuit checks time-consistency by representing the transition function of the RAM as a small circuit (they have managed to represent the entire transition function as a circuit with ~1000 gates, at least for short RAM computations of < 1 million steps. While this is still a lot of gates, it is a substantial engineering achievement). It then applies this circuit to each entry i of the transcript and checks that the output is equal to entry i+1 of the transcript. Their circuit checks memory consistency by using routing techniques to re-sort the transcript based on memory location (with ties broken by time), at which point it is straightforward to check that every memory read from a given location returns the last value written to that location.
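The two consistency checks can be illustrated outside the circuit model with a toy Python sketch. (To be clear, BCGTV perform these checks *inside* a circuit, via the transition-function sub-circuit and routing networks; the `step` function and log format below are hypothetical stand-ins.)

```python
def check_time_consistency(transcript, step):
    """Each claimed machine state must follow from its predecessor
    under the machine's transition function `step`."""
    return all(step(transcript[i]) == transcript[i + 1]
               for i in range(len(transcript) - 1))

def check_memory_consistency(ops):
    """ops: list of (addr, time, op, value) records, op in {'read','write'}.

    Re-sort by (addr, time) -- the analogue of BCGTV's routing step --
    then every read must return the last value written to that address
    (uninitialized memory is assumed to read as 0 here)."""
    last = {}
    for addr, t, op, val in sorted(ops):
        if op == 'read' and last.get(addr, 0) != val:
            return False
        if op == 'write':
            last[addr] = val
    return True
```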

The second front-end that supports random access to memory is called Pantry, developed by Braun, Feldman, Ren, Setty, Blumberg, and Walfish. They augment Zaatar and Pinocchio's front-end to support random access to memory by using Merkle-hashing techniques -- this offers an alternative approach to the routing-based technique for ensuring memory consistency pursued by BCGTV. (Interestingly, the Merkle-hashing approach was explicitly considered in BCGT's theoretical work, but has not yet been incorporated into the BCGTV system).

Merkle trees allow one to outsource memory maintenance by building a binary tree whose leaves correspond to memory locations. Each leaf stores the value contained in the corresponding memory location, and each internal node stores an evaluation of a collision-resistant hash function applied to its children. As long as the verifier knows the value of the root, every time a memory location (leaf) is accessed, the prover can 'prove' that the returned value is correct by revealing all values along the leaf-to-root 'witness path'. One can show that the only way the prover can lie is by finding a collision in the hash function.
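A minimal sketch of this mechanism (assuming a power-of-two number of leaves, with SHA-256 standing in for the collision-resistant hash):

```python
import hashlib

def H(x):
    return hashlib.sha256(x).digest()

def build_tree(leaves):
    """Build a Merkle tree bottom-up. Returns the list of levels:
    levels[0] holds the hashed leaves, levels[-1] is [root]."""
    level = [H(v) for v in leaves]
    levels = [level]
    while len(level) > 1:
        level = [H(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        levels.append(level)
    return levels

def witness_path(levels, idx):
    """The sibling hashes on the path from leaf idx to the root."""
    path = []
    for level in levels[:-1]:
        path.append(level[idx ^ 1])   # sibling at this level
        idx //= 2
    return path

def verify(root, idx, value, path):
    """Verifier's check: rebuild the leaf-to-root path from the claimed
    value and the siblings, and compare against the trusted root."""
    h = H(value)
    for sib in path:
        h = H(h + sib) if idx % 2 == 0 else H(sib + h)
        idx //= 2
    return h == root
```

The verifier stores only the root; each memory access costs the prover one logarithmic-length witness path, and a lie requires a hash collision.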

Pantry identifies a collision-resistant hash function whose evaluation function can be (relatively) efficiently represented as a circuit or set of constraints. This allows evaluation of the hash function (and hence checking of 'witness paths', and hence support for random access to memory) to be folded into Zaatar's and Pinocchio's verification machinery.

Pantry's use of collision-resistant hash functions is also useful for some applications (such as MapReduce) that do not necessarily require random access to memory -- for these applications, Pantry avoids the use of a full Merkle-hash tree, reducing costs relative to what is required to support general RAMs.

Comparison of Pantry and BCGTV:

Generality: BCGTV can handle arbitrary C programs. Pantry comes close to achieving this but still cannot handle data-dependent loops. BCGTV also has the advantage that it can be modified to generate circuits that satisfy the algebraic regularity conditions required by back-ends based on short PCPs (though implementations of these back-ends remain in progress). One important aspect of Pantry is that it enables the outsourcing of computations that are run over "remote inputs", i.e., inputs that the verifier never sees in full, but is merely provided a digest (typically a hash, or a Merkle-hash) of the input. This is important in several applications, such as verifying MapReduce computations. In principle, this feature could be incorporated into the BCGTV front-end as well, but doing so efficiently would likely require redoing many of the optimizations in Pantry.

Efficiency: BCGTV's primary overhead stems from the fact that the ~1000-gate transition function sub-circuit must be repeated for *every* step of the RAM. Thus, their circuits are at least 1000 times larger than the runtime of the RAM. Pantry on the other hand requires expensive evaluations of a collision-resistant hash function (when supporting general random access to memory, logarithmically many evaluations of the hash function are required per memory access). However, hash function evaluations in Pantry are only required when memory is accessed, not for every step of the RAM. Thus, which of the two approaches (BCGTV vs. Pantry) generates smaller circuits may depend on how memory-intensive the computation is.

================

Summary:

Ideally, this post makes clear that each of the approaches to verifiable computation being pursued thus far achieves a different tradeoff between efficiency, expressiveness, and support for features such as public verifiability and zero knowledge properties. This diversity can only be a good thing as users will be able to choose the approach that best suits their needs.

In more detail, despite substantial progress, the fully general-purpose front-end still generates circuits that are at least three orders of magnitude larger than the runtime of the original RAM computation. Moreover, existing back-ends based on argument systems with pre-processing impose an additional three orders of magnitude overhead for the prover. Arguments based on short PCPs will avoid pre-processing for the verifier, but impose additional overheads on the prover. Interactive proofs can avoid these large overheads for sufficiently structured computation, but require a front-end that generates small-depth deterministic circuits, and do not provide cryptographic properties like public verifiability and support for zero-knowledge achieved by some of the argument systems.

Two attractive directions for future work present themselves. One is to develop protocols and build systems that can verifiably execute general computations, but that automatically leverage structure within computations for efficiency gains. Alternatively, it may be better to develop a restricted programming framework analogous to MapReduce that still allows for the expression of a powerful class of computations and automatically "extracts" the structure necessary to verify the computation efficiently. If this approach is pursued, it will be critical to determine the right balance between the level of generality to support and the amount of structure to force upon computations for efficiency gains.

Acknowledgements: I am indebted to Mike Walfish and Andrew J. Blumberg for extensive conversations over several months that substantially influenced my understanding of the area and this post in particular, and to Mike Walfish and Michael Mitzenmacher for valuable feedback on an early draft of this post. Any errors are entirely my own.

Further resources: Bryan Parno hosted a session on verifiable computation at the Microsoft Research Faculty Summit, where he, Michael Mitzenmacher, and Mike Walfish spoke about several of the approaches discussed in this post. Video of the session is available, and Mike's slides contain a nice comparison of the various approaches.

------

In the last few years there has been an enormous amount of work devoted to the development of "practical" protocols for verifiable computation, with multiple approaches being pursued across several research groups. It has recently become clear to me that the sheer volume of work and the sometimes subtle relationships among the various approaches has made the area difficult to follow. The goal of this post is to provide a reasonably concise description of the various approaches and the pros and cons of each, with the hope that this will serve as a handy reference for those interested in the area, at least until a more thorough and formal survey can take its place.

==============

Some Quick Background:

The field of verifiable computation seeks to develop protocols that allow a computationally weak verifier (such as a cloud computing user) to send its data to a powerful but untrusted prover (such as a cloud computing service). The verifier can then ask the prover to perform some computation on the data, and the protocol should enable the prover to return the output of the computation along with a guarantee that the output is correct. Concretely, the ultimate goal is to start with a program written in a high-level language like C and automatically ensure that the prover faithfully executed the program.

The only background required to understand this post is having encountered *interactive proof systems* and *argument systems*. To review, in an interactive proof, the prover tells the verifier the output of a computation, and then the two have a conversation during which the prover tries to convince the verifier that the output is correct. Any interactive proof must satisfy two properties: the first, called completeness, says that an honest prover will convince the verifier to accept, and the second, called soundness, says that a dishonest prover will be caught with high probability, even if the dishonest prover is computationally unbounded. An argument system is an interactive proof system that is sound only against polynomial-time provers.

===============

Sources of Overhead:

I will focus on projects that are reasonably general-purpose, require only one prover, and have attempted to refine theory toward implementation. In the general case, all approaches satisfying these criteria work by first turning the high-level computer program into an arithmetic circuit or a set of "constraints", and then using complexity-theoretic or cryptographic machinery (or both) to check that the prover correctly evaluated the circuit or satisfied the set of constraints. I will refer to the "high-level program ==> circuit/constraints" transformation in any system as the 'front-end', and the actual protocol used to check that the prover correctly evaluated the circuit as the 'back-end'.

Thus, there are two primary sources of overhead in existing systems: the overhead in the front-end (as some computer programs that run in T machine steps may only be computed by circuits with far more than T gates), and the overhead in the back-end (i.e., the extra work required for the prover to evaluate the circuit with a guarantee of correctness, relative to evaluating the circuit with no guarantee). The existence of two sources of overhead complicates the comparison among many of the papers that I will discuss, as work on front-ends and back-ends has been interleaved, some back-ends work with more general circuits than other back-ends (and hence can interface with more efficient front-ends), and there has been cross-pollination among systems. I will attempt to clearly delineate these issues below by focusing on contributions to the design of back-ends and front-ends separately, beginning with the former.

=================

Summary of Back-Ends:

Existing back-ends fall into three broad categories: interactive proofs, argument systems with pre-processing, and argument systems without pre-processing. I will describe each of the three approaches below, starting with the most efficient but least general approach and moving toward more general but less efficient approaches.

1) Interactive Proofs: In 2008, Goldwasser, Kalai, and Rothblum (GKR) gave a powerful interactive proof protocol for circuit evaluation. The costs to the verifier in this protocol grow linearly with the circuit depth, so the protocol only achieves an efficient verifier if the circuit has small depth (equivalently, if the function computed by the circuit has an efficient parallel algorithm). Fortunately, many of the computations actually being outsourced do exhibit large amounts of parallelism. In 2012, Cormode, Mitzenmacher, and I implemented the GKR protocol plus refinements -- most notably, we brought the runtime of the prover down from Omega(S^3) in a naive implementation to O(S log S), where S is the number of gates in the circuit. Our implementation demonstrated that the concrete costs to the verifier are very low, but there were two downsides to the implementation: First, despite our refinements, the prover runtime remained a bottleneck (the prover took roughly three orders of magnitude more time than that required to evaluate the circuit with no guarantee of correctness). Second, unless the circuit satisfied a certain 'regularity' condition on its wiring pattern, the verifier required an expensive pre-processing phase to "extract" information about the wiring of the circuit.
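
The workhorse underneath the GKR protocol is the classical sum-check protocol (due to Lund, Fortnow, Karloff, and Nisan). As a toy illustration of the kind of interaction involved -- this is a self-contained sketch, not the GKR implementation itself -- here is a sum-check for the sum of a 3-variate polynomial (degree at most 2 in each variable) over the Boolean hypercube, with the prover and verifier roles collapsed into one script:

```python
import random

P = 2**31 - 1  # prime modulus; all arithmetic is over the field F_P
N = 3          # number of variables

def g(x):
    # Example 3-variate polynomial, degree <= 2 in each variable.
    return (x[0] * x[0] * x[1] + 3 * x[1] * x[2] + x[2]) % P

def round_poly(fixed, var_index):
    """Prover: evaluations at t = 0, 1, 2 of the univariate polynomial obtained
    by fixing earlier variables to the verifier's challenges, setting variable
    var_index to t, and summing g over Boolean values of the rest."""
    evals = []
    rest = N - var_index - 1
    for t in range(3):
        total = 0
        for bits in range(2 ** rest):
            point = fixed + [t] + [(bits >> j) & 1 for j in range(rest)]
            total = (total + g(point)) % P
        evals.append(total)
    return evals

def interpolate(evals, r):
    # Lagrange-interpolate the degree-2 polynomial with values `evals`
    # at the points 0, 1, 2, then evaluate it at r.
    l0 = (r - 1) * (r - 2) * pow(2, P - 2, P)   # /((0-1)(0-2)) = /2
    l1 = r * (r - 2) * (P - 1)                  # /((1-0)(1-2)) = /(-1)
    l2 = r * (r - 1) * pow(2, P - 2, P)         # /((2-0)(2-1)) = /2
    return (evals[0] * l0 + evals[1] * l1 + evals[2] * l2) % P

def sum_check():
    claim = sum(g([(b >> j) & 1 for j in range(N)]) for b in range(2 ** N)) % P
    fixed, current = [], claim
    for i in range(N):
        evals = round_poly(fixed, i)                  # prover's message
        assert (evals[0] + evals[1]) % P == current   # verifier's consistency check
        r = random.randrange(P)                       # verifier's random challenge
        current = interpolate(evals, r)
        fixed.append(r)
    assert g(fixed) % P == current  # final check: one evaluation of g
    return claim

print(sum_check())  # 12
```

The verifier's work per round is constant here, while the prover does work proportional to the cube size; GKR layers this idea over the gates of a circuit.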

Very recently, I addressed these bottlenecks by showing the following. First, for 'regular' circuits, the runtime of the prover can be reduced from O(S log S) to O(S). Concretely, for these circuits, the prover now runs about 10 times slower than what is required to evaluate the circuit gate-by-gate with no guarantee of correctness. Informally, a circuit has a 'regular' wiring pattern if there is a way to label each gate g with a binary string such that the labels of the in-neighbors of g are very simple functions of the label of g (this condition is satisfied by natural circuits computing naive matrix multiplication, pattern matching, FFTs, and several other problems).

Second, for any data parallel computation (i.e., any setting where an arbitrary sub-computation C is applied independently to many pieces of data, before possibly aggregating the results), the cost of the pre-processing stage for the verifier and the overhead for the prover can be made to depend only on the size of the sub-computation C, and not at all on the number of pieces of data to which C is applied. Essentially, this holds because the data-parallel computation is 'regular' at a higher level of abstraction than in the first result -- while the sub-computation C may have very irregular wiring, the various invocations of C do not interact at all.

Third, I gave a simple protocol for matrix multiplication that avoids the circuit representation entirely, allowing the prover to compute the correct answer using an arbitrary algorithm, and then spend O(n^2) extra work proving the answer is correct.

[Aside on Matrix Multiplication: It is worth comparing this new matrix multiplication protocol to classical approaches like Freivalds' algorithm. The latter requires no interaction and *no* extra work for the prover. The advantage of the interactive approach is most evident for computations that invoke matrix multiplication as a means to an end rather than as an end in itself. For example, the best-known algorithms for computing the diameter of a graph work by repeatedly squaring the adjacency matrix. Running Freivalds' algorithm to verify these algorithms would require the prover to send the intermediate matrices to the verifier, which would be terabytes of data even for graphs on 1 million nodes. The interactive approach allows the prover to send only the right answer (which is just a single number), and then spend a few KBs of extra communication and a low-order amount of extra work proving the answer is correct.]
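
Freivalds' classical check is simple enough to state in code: to verify a claimed product C = A*B, pick a random vector x and test whether A(Bx) = Cx, which costs O(n^2) per repetition rather than matrix-multiplication time. A minimal sketch (plain Python, 0/1 challenge vectors):

```python
import random

def freivalds_check(A, B, C, reps=30):
    """Probabilistically check that A @ B == C in O(n^2) time per repetition.
    Each repetition catches an incorrect C with probability >= 1/2."""
    n = len(A)
    for _ in range(reps):
        x = [random.randint(0, 1) for _ in range(n)]
        Bx = [sum(B[i][j] * x[j] for j in range(n)) for i in range(n)]
        ABx = [sum(A[i][j] * Bx[j] for j in range(n)) for i in range(n)]
        Cx = [sum(C[i][j] * x[j] for j in range(n)) for i in range(n)]
        if ABx != Cx:
            return False  # definitely incorrect
    return True  # correct except with probability <= 2**(-reps)

A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
C_good = [[19, 22], [43, 50]]
C_bad = [[19, 22], [43, 51]]
print(freivalds_check(A, B, C_good))  # True
print(freivalds_check(A, B, C_bad))
```

Note that the prover does no extra work at all here; the tradeoff described in the aside above is about verifying *chains* of multiplications without shipping intermediate matrices to the verifier.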

2) Argument Systems with Pre-Processing: In 2007, Ishai, Kushilevitz, and Ostrovsky gave an argument that required an expensive pre-processing phase for the verifier, but avoided the use of short PCPs (short PCPs are complicated complexity-theoretic protocols upon which most argument systems were traditionally based -- see the section on Argument Systems without Pre-processing below). Systems with pre-processing are primarily of interest in data-parallel settings, because the verifier can only save work when the set-up costs are amortized over many instances of the same computation (i.e., when the same computation is applied independently to many different inputs).

In 2012, Setty, McPherson, Blumberg, and Walfish began refining IKO's protocol. They called their first system Pepper, and subsequent refinements were described in a follow-up system, Ginger. Further follow-up work incorporated an ingenious encoding of computations due to Gennaro, Gentry, Parno, and Raykova (GGPR), resulting in a system called Zaatar that (mostly) supersedes Pepper and Ginger.

Parno, Gentry, Howell, and Raykova introduced another system called Pinocchio that is also based on the theoretical innovations of GGPR. Also, very recent work by Ben-Sasson, Chiesa, Genkin, Tromer, and Virza (BCGTV) implements a back-end that is closely related to Pinocchio.

The high-level comparison between the Zaatar back-end and the Pinocchio/BCGTV back-end is the following. Zaatar is somewhat more efficient because it uses less cryptographic machinery (roughly, the verifier in Zaatar poses queries in the clear, while Pinocchio/BCGTV require the prover to compute over encrypted queries.) However, Pinocchio/BCGTV provide functionality that Zaatar does not. Specifically, the former supports public verifiability and zero-knowledge. Additionally, it consists of only two messages, one from the verifier to the prover and one from the prover to the verifier. Finally, it allows set-up costs to be amortized over an indefinite number of instances of the same computation -- in contrast, the pre-processing costs of Zaatar can be amortized over many instances of a computation only if all instances are verified at the same time. It is worth mentioning that the Pinocchio/BCGTV back-ends are based on non-standard cryptographic assumptions, while Zaatar is based on standard ones.

3) Argument Systems without Pre-Processing: Ben-Sasson, Chiesa, Genkin, and Tromer (BCGT) have been pursuing argument systems that avoid a pre-processing phase for the verifier. These argument systems are based on short PCPs, and while existing work on this topic is still theoretical in nature (focusing on reducing the concrete costs of existing PCPs), implementation efforts are reportedly underway.

===============

Comparison of Back-Ends:

Comparison of Argument Systems without Pre-Processing to other approaches: The advantage of the short-PCPs approach relative to interactive proofs and argument systems with pre-processing is that the former *never* requires a pre-processing phase for the verifier. The disadvantage is that the overheads to the prover are likely to be substantially higher than they are in argument systems with pre-processing. For example, the short PCPs described by BCGT require sequential repetition of the protocol many times in order to drive down soundness error. Like Pinocchio and Zaatar, they also require performing FFT operations on objects as large as a computation transcript, but unlike Pinocchio and Zaatar, these short PCPs also require working over fields of small characteristic (and hence can only be combined with front-ends that generate circuits over these fields), require the use of non-trivial finite field algorithms, and require circuits that satisfy certain algebraic notions of regularity. Precisely how large these overheads are in practice remains to be seen.

Comparison of Interactive Proofs to Argument Systems with Pre-Processing: The advantages of interactive proofs are three-fold. First, they are secure even against computationally unbounded provers, while argument systems are only secure against polynomial-time provers. Second, interactive proofs can avoid pre-processing entirely for structured computation (e.g. 'regular' circuits) and minimize pre-processing for data-parallel computation, while argument systems with pre-processing (discussed above) inherently require an expensive pre-processing stage for the verifier. Third, interactive proofs achieve unmatched prover efficiency when they are applicable, and in fact can avoid the circuit representation entirely for fundamental primitives like matrix multiplication. Concretely, Zaatar's prover is roughly 3 orders of magnitude slower than evaluating the circuit with no correctness guarantee (see e.g. these slides by Mike Walfish), which is roughly commensurate with the prover overhead in my original GKR implementation with Cormode and Mitzenmacher, but is a couple orders of magnitude slower than what we can now achieve with interactive proofs for "regular" circuits (see above).

The disadvantages of interactive proofs are the following. First, they are only applicable to small-depth circuits (i.e., parallelizable computation). Second, they do not support "non-deterministic" circuits, which can be extremely useful for turning high-level programs into small circuits as described in the section on front-ends below (there is evidence that lack of support for non-determinism is inherent in the use of interactive proofs). This means that certain kinds of computations, such as those involving many random accesses to memory or sorting/comparison operations, are problematic for interactive proofs. Third, interactive proofs cannot support properties like zero-knowledge and public verifiability that are achieved by Pinocchio/BCGTV. Fourth, interactive proofs require logarithmically many rounds of interaction between prover and verifier, while argument systems typically require just one or two rounds.

(I maintain, however, that the round complexity of interactive proofs isn't a big deal. First, interaction can be generically removed via the Fiat-Shamir heuristic, which is secure in the random oracle model and may be acceptable in practice. Second, any system allowing the verifier to request a specific computation in the first place is likely to involve a remote procedure call and hence incur round-trip delays anyway. Third, browsers already send a separate request for every image and script on a web page, and typically a browser cannot substantially parallelize these requests.)
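
To make the first point concrete, here is a schematic sketch of the Fiat-Shamir heuristic (illustrative only -- real systems fix a concrete field, hash, and serialization): each verifier challenge is derived by hashing the transcript so far, so the prover can compute the challenges itself and send a single non-interactive proof.

```python
import hashlib

P = 2**61 - 1  # illustrative prime field modulus

def fiat_shamir_challenge(transcript_bytes):
    """Derive the verifier's 'random' challenge deterministically from the
    transcript so far; the hash stands in for a random oracle."""
    digest = hashlib.sha256(transcript_bytes).digest()
    return int.from_bytes(digest, 'big') % P

# Interactively, the verifier would send a fresh random r after each prover
# message m_i; non-interactively, the prover computes each r itself:
transcript = b""
for message in [b"prover-message-1", b"prover-message-2"]:
    transcript += message
    r = fiat_shamir_challenge(transcript)
    transcript += r.to_bytes(8, 'big')  # challenges also enter the transcript
    print(r)
```

A verifier can later recompute the same challenges from the prover's messages and check that each one was used correctly, removing all rounds of interaction.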

====================

Summary of and Comparison of Front-Ends:

Ginger, Zaatar, and Pinocchio all contain compilers that turn high-level programs into (non-deterministic) circuits. (Pinocchio was the first to handle a subset of C; the others initially handled a different high-level language but have since moved to C.) Vu, Setty, Blumberg, and Walfish (VSBW) built a front-end for an implementation of the GKR protocol. This front-end must generate small-depth circuits that are either deterministic or contain only a small number of non-deterministic inputs, since the non-deterministic inputs must be sent explicitly to the verifier. It should also be clarified that their compiler does not produce regular circuits, so the GKR protocol requires a pre-processing phase when applied to these circuits, amortizing its cost over many instances of the computation exactly as in Zaatar/Ginger/Pepper. Given a high-level program, the VSBW front-end also automatically determines whether Zaatar or GKR would be most efficient for verifying the computation and runs the better of the two protocols; their system as a whole is called Allspice.

However, none of the front-ends above efficiently supports programs that utilize random access to memory. Two recent front-ends change this. One is the front-end developed by BCGTV mentioned above, which implemented earlier theoretical work by the first four authors. Given a high-level program, their approach to circuit generation roughly works as follows. They first compile the C program into actual machine code for a simple Random Access Machine (RAM). They then generate a circuit which takes an entire transcript (sorted by time) of the RAM computation as a non-deterministic input, and checks that the transcript is valid. This requires checking the transcript for both *time consistency* (i.e., that the claimed state of the machine at time i correctly follows from the machine's claimed state at time i-1) and *memory consistency* (i.e., that every time a value is read from memory location x, the value that is returned is equal to the last value written to that location).

Their circuit checks time-consistency by representing the transition function of the RAM as a small circuit (they have managed to represent the entire transition function as a circuit with ~1000 gates, at least for short RAM computations of < 1 million steps. While this is still a lot of gates, it is a substantial engineering achievement). It then applies this circuit to each entry i of the transcript and checks that the output is equal to entry i+1 of the transcript. Their circuit checks memory consistency by using routing techniques to re-sort the transcript based on memory location (with ties broken by time), at which point it is straightforward to check that every memory read from a given location returns the last value written to that location.
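
The memory-consistency check is easy to state outside the circuit model. The following sketch (plain code, not BCGTV's circuit -- their system performs the equivalent sort with a routing network inside the circuit) sorts the transcript by memory location, breaking ties by time, and then checks adjacent entries:

```python
# Assumed transcript format (illustrative): tuples (time, op, address, value)
# with op in {'read', 'write'}.
def memory_consistent(transcript):
    """Check that every read returns the last value written to its address.
    BCGTV's circuit realizes the sort with a routing network and then makes
    the adjacent-pair comparisons below gate by gate."""
    by_address = sorted(transcript, key=lambda e: (e[2], e[0]))
    for prev, cur in zip(by_address, by_address[1:]):
        same_address = prev[2] == cur[2]
        if same_address and cur[1] == 'read' and cur[3] != prev[3]:
            return False  # read did not return the latest value at this address
    return True

ok = [(0, 'write', 5, 42), (1, 'read', 5, 42), (2, 'write', 5, 7), (3, 'read', 5, 7)]
bad = [(0, 'write', 5, 42), (1, 'read', 5, 99)]
print(memory_consistent(ok))   # True
print(memory_consistent(bad))  # False
```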

The second front-end that supports random access to memory is called Pantry, developed by Braun, Feldman, Ren, Setty, Blumberg, and Walfish. They augment Zaatar and Pinocchio's front-end to support random access to memory by using Merkle-hashing techniques -- this offers an alternative approach to the routing-based technique for ensuring memory consistency pursued by BCGTV. (Interestingly, the Merkle-hashing approach was explicitly considered in BCGT's theoretical work, but has not yet been incorporated into the BCGTV system).

Merkle trees allow one to outsource memory maintenance by building a binary tree whose leaves correspond to memory locations. Each leaf stores the value contained in the corresponding memory location, and each internal node stores an evaluation of a collision-resistant hash function applied to its children. As long as the verifier knows the value of the root, every time a memory location (leaf) is accessed, the prover can 'prove' that the returned value is correct by revealing all values along the leaf-to-root 'witness path'. One can show that the only way the prover can lie is by finding a collision in the hash function.
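
A sketch of this machinery (illustrative: SHA-256 stands in for the hash function here, whereas Pantry specifically needs a circuit-friendly hash, as discussed next):

```python
import hashlib

def H(data):
    return hashlib.sha256(data).digest()

def build_tree(leaves):
    """Build a Merkle tree over the leaves (count assumed a power of two);
    returns the list of levels, leaf hashes first, root level last."""
    levels = [[H(leaf) for leaf in leaves]]
    while len(levels[-1]) > 1:
        prev = levels[-1]
        levels.append([H(prev[i] + prev[i + 1]) for i in range(0, len(prev), 2)])
    return levels

def witness_path(levels, index):
    """Sibling hashes from leaf to root: what the prover reveals on an access."""
    path = []
    for level in levels[:-1]:
        path.append(level[index ^ 1])  # sibling at this level
        index //= 2
    return path

def verify(root, leaf, index, path):
    """Verifier: O(log n) hash evaluations to check a claimed memory value."""
    node = H(leaf)
    for sibling in path:
        node = H(node + sibling) if index % 2 == 0 else H(sibling + node)
        index //= 2
    return node == root

leaves = [b"v0", b"v1", b"v2", b"v3"]  # memory contents
levels = build_tree(leaves)
root = levels[-1][0]                   # the only value the verifier keeps
print(verify(root, b"v2", 2, witness_path(levels, 2)))  # True
print(verify(root, b"XX", 2, witness_path(levels, 2)))  # False
```

The verifier stores only the root; each memory access costs the prover one revealed path and the verifier logarithmically many hash evaluations.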

Pantry identifies a collision-resistant hash function whose evaluation function can be (relatively) efficiently represented as a circuit or set of constraints. This allows evaluation of the hash function (and hence checking of 'witness paths', and hence support for random access to memory) to be folded into Zaatar's and Pinocchio's verification machinery.

Pantry's use of collision-resistant hash functions is also useful for some applications (such as MapReduce) that do not necessarily require random access to memory -- for these applications, Pantry avoids the use of a full Merkle-hash tree, reducing costs relative to what is required to support general RAMs.

Comparison of Pantry and BCGTV:

Generality: BCGTV can handle arbitrary C programs. Pantry comes close to achieving this but still cannot handle data-dependent loops. BCGTV also has the advantage that it can be modified to generate circuits that satisfy the algebraic regularity conditions required by back-ends based on short PCPs (though implementations of these back-ends remain in progress). One important aspect of Pantry is that it enables the outsourcing of computations that are run over "remote inputs", i.e., inputs that the verifier never sees in full, but is merely provided a digest (typically a hash, or a Merkle-hash) of the input. This is important in several applications, such as verifying MapReduce computations. In principle, this feature could be incorporated into the BCGTV front-end as well, but doing so efficiently would likely require redoing many of the optimizations in Pantry.

Efficiency: BCGTV's primary overhead stems from the fact that the ~1000-gate transition function sub-circuit must be repeated for *every* step of the RAM. Thus, their circuits are at least 1000 times larger than the runtime of the RAM. Pantry, on the other hand, requires expensive evaluations of a collision-resistant hash function (when supporting general random access to memory, logarithmically many evaluations of the hash function are required per memory access). However, hash function evaluations in Pantry are only required when memory is accessed, not for every step of the RAM. Thus, which of the two approaches (BCGTV vs. Pantry) generates smaller circuits may depend on how memory-intensive the computation is.

================

Summary:

Ideally, this post has made clear that each of the approaches to verifiable computation being pursued thus far achieves a different tradeoff between efficiency, expressiveness, and support for features such as public verifiability and zero-knowledge. This diversity can only be a good thing, as users will be able to choose the approach that best suits their needs.

In more detail, despite substantial progress, the fully general-purpose front-end still generates circuits that are at least three orders of magnitude larger than the runtime of the original RAM computation. Moreover, existing back-ends based on argument systems with pre-processing impose an additional three orders of magnitude overhead for the prover. Arguments based on short PCPs will avoid pre-processing for the verifier, but impose additional overheads on the prover. Interactive proofs can avoid these large overheads for sufficiently structured computation, but require a front-end that generates small-depth deterministic circuits, and do not provide cryptographic properties like public verifiability and support for zero-knowledge achieved by some of the argument systems.

Two attractive directions for future work present themselves. One is to develop protocols and build systems that can verifiably execute general computations, but that automatically leverage structure within computations for efficiency gains. Alternatively, it may be better to develop a restricted programming framework analogous to MapReduce that still allows for the expression of a powerful class of computations and automatically "extracts" the structure necessary to verify the computation efficiently. If this approach is pursued, it will be critical to determine the right balance between the level of generality to support and the amount of structure to force upon computations for efficiency gains.

Acknowledgements: I am indebted to Mike Walfish and Andrew J. Blumberg for extensive conversations over several months that substantially influenced my understanding of the area and this post in particular, and to Mike Walfish and Michael Mitzenmacher for valuable feedback on an early draft of this post. Any errors are entirely my own.

Further resources: Bryan Parno hosted a session on verifiable computation at the Microsoft Research Faculty Summit, where he, Michael Mitzenmacher, and Mike Walfish spoke about several of the approaches discussed in this post. Video of the session is available, and Mike's slides contain a nice comparison of the various approaches.

## Sunday, September 01, 2013

### How Should History Affect Funding Decisions?

As I go through the reviews from my latest NSF proposal, there are general high-level comments related to my past work. This leads to an interesting issue that is perhaps not discussed often enough -- how should the past influence present funding decisions? Mikkel Thorup recently tried to open up discussion on this theme with a viewpoint piece in the Communications of the ACM. (Sorry, it's behind a paywall; if someone has a direct link, please post it in the comments.)

To be clear, I'm not sure what the current NSF policy is on "history", as I haven't served on an NSF panel for a while. In the past, I recall hearing that NSF panels were not supposed to take PI history into account in evaluating proposals, although it seemed to be implied that the program manager would take it into consideration if needed. Another country I do reviews for takes history into account in a fairly direct way -- there are explicit questions regarding whether the applicant has demonstrated the qualifications necessary to carry out the proposed research project. That's a fairly broad phrasing that, at least in my mind, opens the way to discussing the PI's past work. So in that case, there is some weight put on past performance.

I admit that I think past performance is a perfectly reasonable criterion for judging funding proposals. It helps to think of extreme cases. For example, if Les Valiant wrote a proposal and I was somehow on the panel and didn't understand or appreciate it, history (and/or basic probability) would overwhelmingly suggest that the fault lies with me. And even if one were to argue that the proposal wasn't well written, wouldn't a risk/reward calculation based on his track record argue strongly for funding? At the other extreme, one must make sure that new faculty with limited track records obtain enough funding to have the chance to ignite, either through special programs (like CAREER grants) or with some understanding that new faculty should be given an appropriate share of the funding pie.

In the end, my mindset is that there is a difference between writing a compelling-sounding proposal and being able to deliver the research results. Looking at a researcher's past helps calibrate what the final outcome will be. Not taking it into account seems inherently wrong, under standard success metrics. However, it's hard to assign a number as to how much it should affect the overall (and arguably quite subjective) scoring of proposals in any given panel that decides grants. I don't envy people at the NSF the job of trying to untangle this messy situation, particularly in settings where they feel others (say, up in the direction of Congress) are out to second-guess their decisions.

In terms of my current reviews, I thought the panel took into account my past history in mild and reasonable ways that I appreciated. Unsurprisingly my proposal dealt with problems at the intersection of algorithms, networking, and information theory, and I discussed that I had a (successful) history of interacting with all of these communities. The reviewers acknowledged this history and noted that my background, including previous research in hashing, coding theory, and so on would likely be relevant to solving the problems I proposed. I don't know how much it played into the final decision, but I was glad they agreed with the argument that I had the right background. I should note, though, that in the past I've written proposals where I was trying to "branch out" into areas that were newer to me (they did not -- gasp -- involve hashing, or coding theory....), and I believe I experienced the same effect in reverse. So there's certainly an argument the other way...

The other way the past seemed to be taken into account was with regard to some of the "broader impacts". This blog, for instance, seems to be seen as a "positive" in terms of community outreach. Similarly, the fact that I have produced a body of educational materials over the years (class notes*, surveys, a textbook, assignments, etc.) was noted. Again, I think history should be taken into account somehow here. It's easy to say you're going to write a textbook or large-scale survey in a proposal, but it's another thing to have a record of delivering.

How much should past performance count in reviewing proposals? And can we actually influence the funding agencies to take past performance into account in ways we as a community find suitable? I'm curious to see if people have thoughts they'll share in the comments.

* I'm off from teaching Computer Science 124, Algorithms and Data Structures, for now due to sabbatical, and potentially for a while. If anyone teaching this wants my materials, though, let me know; I've often passed them along to faculty starting to teach that type of course.
