Wednesday, September 24, 2008

Allerton : Floating Codes

A couple of decades or so ago, Ron Rivest and Adi Shamir wrote a fun paper on write-once memory. Think of a punch card -- you can turn a 0 into a 1, but you can't turn it back again.
Optical disks were a potential application. Suppose you have a k-bit value that is going to be written t times. How many write-once bits (or "wits") will you need to store it through the t changes?

Fast-forward a few decades, and similar models are coming back into vogue, because of flash memory. My understanding is flash memory uses floating cells; roughly, cells are organized into larger blocks of n cells, and each cell can hold a number of electrons from 0 to q-1. It's easy to add electrons to cells, but not to remove them; to do that, you have to reset the whole block back to 0's first, which is both expensive and wears out the memory, which has a limited lifetime of resets. Suppose you want to store k ell-ary variable values in a block. How many rewrites before you need to reset? Note that a "code" in this setting consists of two things: a mapping from cell states (the state of the entire block) to variable states, and a transition function giving how cell states change when variables change. For more of an introduction, see this paper by Jiang, Bohossian, and Bruck.

In our paper (pdf,talk slides) for Allerton, we introduce (as far as I know) the question of considering floating codes in the "average case"-- the previous work was considering the worst-case number of rewrites before a reset. Given the lifetimes available to flash memory, and the potential to model system behavior before productions, we think analysis of average case performance makes sense here. For our notion of "average case", we assume that the variable values follow a Markov chain, and the goal is to maximize the long-term average time between resets. Given a code, the Markov chain on the variable states induces a Markov chain on the cell states. So the question becomes how to design a code to maximize time between resets on this Markov chain.

Following in the footsteps of the JBB paper, we find some initial interesting results, and leave a lot of open questions -- including the complexity of computing optimal codes for general versions of the problem. I think is yet another potentially interesting coding area where CS ideas should be able to play a strong role. And while I'm not (yet) an expert on the practical implications of these theoretical coding models, the connection to flash memory seems promising.


Anonymous said...

Dr. Mitzenmacher:

Thought you might be interested in knowing that the blog link to your Powerpoint document is broken, apparently because the word "talks" is not capitalized.

I was able to retrieve directly via your website, but others might not be so diligent.

Michael Mitzenmacher said...

Thanks, fixed.