Let us develop an appreciation for how randomness can be exploited to dramatically improve the efficiency of certain algorithms. We consider two parties, Alice and Bob, who trust each other and want to cooperate to jointly compute a certain function of their inputs.
Alice and Bob live across the country from each other. They each hold a very large file, each consisting of $n$ characters (for concreteness, suppose that these are ASCII characters, so there are $128$ possible characters). Let us denote Alice's file as the sequence of characters $(a_1, \dots, a_n)$, and Bob's as $(b_1, \dots, b_n)$. Their goal is to determine whether their files are equal, i.e., whether $a_i = b_i$ for all $i = 1, \dots, n$. Since the files are large, they would like to minimize the communication, i.e., Alice would like to send as little information about her file to Bob as possible.
A trivial solution to this problem is for Alice to send her entire file to Bob, and Bob can check whether $a_i = b_i$ for all $i$. But this requires Alice to send all $n$ characters to Bob, which is prohibitive if $n$ is very large. It turns out that no deterministic procedure can send less information than this trivial solution; this can be proved via the fooling set method from communication complexity. However, we will see that if Alice and Bob are allowed to execute a randomized procedure that might output the wrong answer with some tiny probability, say at most $0.0001$, then they can get away with a much smaller amount of communication.
The rough idea is that Alice is going to pick a hash function $h$ at random from a (small) family of hash functions $\mathcal{H}$. We will think of $h(a)$ as a very short "fingerprint" of $a$. By fingerprint, we mean that $h(a)$ is a "nearly unique identifier" for $a$, in the sense that for any $b \neq a$, the fingerprints of $a$ and $b$ differ with high probability over the random choice of $h$, i.e.,

$$\Pr_{h \in \mathcal{H}}[h(a) = h(b)] \le 0.0001.$$
Rather than sending $a$ to Bob in full, Alice sends $h$ and $h(a)$ to Bob. Bob checks whether $h(a) = h(b)$. If $h(a) \neq h(b)$, then Bob knows that $a \neq b$, while if $h(a) = h(b)$, then Bob can be very confident (but not certain) that $a = b$.
To make the above outline concrete, fix a prime number $p \ge \max\{128, n^2\}$ (i.e., at least as large as both the number of possible characters and $n^2$), and let $\mathbb{F}_p$ denote the set of integers modulo $p$. We assume that all arithmetic is done modulo $p$ without further mention.
For each $r \in \mathbb{F}_p$, define $h_r(a_1, \dots, a_n) = \sum_{i=1}^{n} a_i \cdot r^{i-1}$. The family of hash functions we will consider is

$$\mathcal{H} = \{h_r : r \in \mathbb{F}_p\}.$$
Intuitively, each hash function $h_r$ interprets its input $(a_1, \dots, a_n)$ as the coefficients of a polynomial of degree $n-1$, and outputs that polynomial evaluated at $r$.
That is, in our communication protocol, Alice picks a random element $r$ from $\mathbb{F}_p$, computes $v = h_r(a)$, and sends $v$ and $r$ to Bob. Bob outputs EQUAL if $v = h_r(b)$, and outputs NOT-EQUAL otherwise.
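The protocol is short enough to sketch in full. The following Python sketch is illustrative (the function and variable names, the example files, and the particular prime are my own assumptions, not from any library); it implements $h_r$ and the exchange, assuming the files are held as lists of character codes.

```python
import random

# Hypothetical helper (not a library routine): evaluate the polynomial
# whose coefficients are the characters of the file at the point r, mod p.
def fingerprint(chars, r, p):
    """Compute h_r(a_1, ..., a_n) = sum_i a_i * r^(i-1) mod p."""
    v, power = 0, 1              # power holds r^(i-1)
    for a in chars:
        v = (v + a * power) % p
        power = (power * r) % p
    return v

p = 1_000_003                    # an assumed prime, comfortably larger than n^2 here

alice_file = [ord(c) for c in "the quick brown fox"]
bob_file   = [ord(c) for c in "the quick brown fax"]

r = random.randrange(p)                      # Alice's random evaluation point
v = fingerprint(alice_file, r, p)            # Alice sends (r, v) to Bob
if v == fingerprint(bob_file, r, p):         # Bob's check
    print("EQUAL")                           # wrong with probability <= (n-1)/p
else:
    print("NOT-EQUAL")
```

Note that Alice transmits only the pair $(r, v)$, two field elements, regardless of how long the files are.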
We now prove that this protocol outputs the correct answer with very high probability. In particular:

- If $a_i = b_i$ for all $i$, then Bob outputs EQUAL for every possible choice of $r$.
- If $a_i \neq b_i$ for even one $i$, then Bob outputs NOT-EQUAL with probability at least $1 - (n-1)/p$.
The first property is easy to see: if $a_i = b_i$ for all $i$, then obviously $h_r(a) = h_r(b)$ for every possible choice of $r$.
The second property relies on the following crucial fact.
Fact 1: For any two distinct (i.e., unequal) polynomials $q_1$ and $q_2$ of degree at most $n-1$ with coefficients in $\mathbb{F}_p$, $q_1(x) = q_2(x)$ for at most $n-1$ values of $x$ in $\mathbb{F}_p$.
Proof: Fact 1 follows from the following standard result.

Fact 2: Any nonzero polynomial of degree at most $d$ over any field has at most $d$ roots.

To see that Fact 2 implies Fact 1, observe that if $q_1$ and $q_2$ are distinct polynomials of degree at most $n-1$, and $q_1(x) = q_2(x)$ for more than $n-1$ values of $x$, then $q_1 - q_2$ is a nonzero polynomial of degree at most $n-1$ with more than $n-1$ roots, contradicting Fact 2.
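As a quick sanity check of Fact 1, the sketch below counts the points of a small field at which two distinct polynomials of degree at most $4$ agree. The coefficient lists are arbitrary illustrative choices, not values from the text.

```python
# Evaluate a polynomial mod p via Horner's rule; coefficients are listed
# lowest-degree first.
def poly_eval(coeffs, x, p):
    acc = 0
    for c in reversed(coeffs):
        acc = (acc * x + c) % p
    return acc

p = 101                  # a small prime field F_p
q1 = [3, 1, 4, 1, 5]     # an arbitrary degree-4 polynomial (n - 1 = 4)
q2 = [2, 7, 1, 8, 2]     # a different arbitrary degree-4 polynomial

agreements = sum(1 for x in range(p)
                 if poly_eval(q1, x, p) == poly_eval(q2, x, p))
print(agreements)        # Fact 1 guarantees at most 4 agreements out of 101
assert agreements <= 4
```

Picking a random evaluation point therefore lands on one of these few agreement points only with probability at most $4/101$ in this toy example.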
Let $p_a(x) = \sum_{i=1}^{n} a_i \cdot x^{i-1}$ and similarly $p_b(x) = \sum_{i=1}^{n} b_i \cdot x^{i-1}$. Observe that both $p_a$ and $p_b$ are polynomials in $x$ of degree at most $n-1$. The value $v$ that Alice sends to Bob in the communication protocol is precisely $p_a(r)$, and Bob compares this value to $p_b(r)$.
By Fact 1, if there is even one $i$ such that $a_i \neq b_i$ (so that $p_a$ and $p_b$ are distinct), then there are at most $n-1$ values of $r$ such that $p_a(r) = p_b(r)$. Since $r$ is chosen at random from $\mathbb{F}_p$, the probability that Alice picks such an $r$ is thus at most $(n-1)/p$. Hence, Bob outputs NOT-EQUAL with probability at least $1 - (n-1)/p$ (where the probability is over the random choice of $r$).
Alice sends only two elements of $\mathbb{F}_p$ to Bob in the above protocol, namely $v$ and $r$. In terms of bits, this is $O(\log p) = O(\log n)$ bits, assuming $p \le n^c$ for some constant $c$. This is an exponential improvement over the $\Omega(n)$ bits sent in the trivial deterministic protocol, and an impressive demonstration of the power of randomness.
We refer to the above protocol as Reed-Solomon fingerprinting because $v = p_a(r)$ is actually a random entry in an error-corrected encoding of the vector $(a_1, \dots, a_n)$. The encoding is called the Reed-Solomon encoding.
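To make this connection concrete, one standard convention defines the Reed-Solomon encoding of a message $(a_1, \dots, a_n)$ as the list of evaluations of $p_a$ at every point of $\mathbb{F}_p$; the fingerprint $v$ is then a uniformly random entry of this much longer codeword. The sketch below (the name `rs_encode` is illustrative, not a library routine) follows that convention.

```python
# Illustrative sketch: the Reed-Solomon codeword of a message is the vector
# of evaluations (p_a(0), p_a(1), ..., p_a(p-1)).
def rs_encode(msg, p):
    def eval_at(r):
        acc, power = 0, 1        # power holds r^(i-1)
        for a in msg:
            acc = (acc + a * power) % p
            power = (power * r) % p
        return acc
    return [eval_at(r) for r in range(p)]

p = 13
codeword = rs_encode([5, 0, 7], p)   # encode the 3-symbol message (5, 0, 7)
print(len(codeword))                 # 13 entries: one per point of F_13
```

The codeword is far longer than the message, but any two distinct messages yield codewords that differ in almost every entry, which is exactly why a random entry serves as a fingerprint.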
Suppose we are given as input two $n \times n$ matrices $A$ and $B$ over $\mathbb{F}_p$, where $p > n^2$ is a prime number. Our goal is to compute the product matrix $A \cdot B$.
Asymptotically, the fastest known algorithm for accomplishing this task is very complicated, and runs in roughly $O(n^{2.37})$ time [LG14]. Moreover, the algorithm is not practical.
But for the purposes of this manuscript, the relevant question is not how fast we can multiply two matrices; it is how efficiently one can verify that two matrices were multiplied correctly. In particular, can verifying the output of a matrix multiplication problem be done faster than the fastest known algorithm for actually multiplying the matrices? The answer, given by Freivalds in 1977 [Fre77], is yes.
Formally, suppose someone hands us a matrix $C$, and we want to check whether or not $C = A \cdot B$.
Here is a very simple randomized algorithm that lets us perform this check in $O(n^2)$ time (this is only a constant factor more time than what is required to simply read the matrices $A$, $B$, and $C$).
First, choose a random $r \in \mathbb{F}_p$, and let $x = (r, r^2, r^3, \dots, r^n)$. Then compute $y = Cx$ and $z = A \cdot (Bx)$, outputting YES if $y = z$ and NO otherwise.
We claim that the entire algorithm runs in $O(n^2)$ time. It is easy to see that generating the vector $x$ can be done with $O(n)$ total multiplication operations ($r^2$ can be computed as $r \cdot r$, then $r^3$ can be computed as $r \cdot r^2$, then $r^4$ as $r \cdot r^3$, and so on). Since multiplying an $n \times n$ matrix by an $n$-dimensional vector can be done in $O(n^2)$ time, the remainder of the algorithm runs in $O(n^2)$ time: computing $y$ involves multiplying $C$ by the vector $x$, and computing $z$ involves multiplying $B$ by $x$ to get a vector $w = Bx$, and then multiplying $A$ by $w$ to compute $z = Aw$.
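Putting these steps together, a minimal Python sketch of the check might look as follows; the function names, the $2 \times 2$ example matrices, and the prime are illustrative assumptions, not fixed by the text.

```python
import random

# O(n^2)-time helper: multiply an n x n matrix by an n-vector mod p.
def matvec(M, v, p):
    return [sum(M[i][j] * v[j] for j in range(len(v))) % p
            for i in range(len(M))]

def freivalds(A, B, C, p):
    """Return True (YES) if the check Cx == A(Bx) passes, False (NO) otherwise."""
    n = len(A)
    r = random.randrange(1, p)
    # Build x = (r, r^2, ..., r^n) with O(n) multiplications.
    x, power = [], 1
    for _ in range(n):
        power = (power * r) % p
        x.append(power)
    y = matvec(C, x, p)                 # y = Cx
    z = matvec(A, matvec(B, x, p), p)   # z = A(Bx)
    return y == z

p = 1_000_003                   # an assumed prime with p > n^2
A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
C_good = [[19, 22], [43, 50]]   # equals A*B
C_bad  = [[19, 22], [43, 51]]   # wrong in one entry

print(freivalds(A, B, C_good, p))   # True for every choice of r
print(freivalds(A, B, C_bad, p))    # False with probability at least 1 - n/p
```

The three calls to `matvec` dominate the running time, so the whole check costs $O(n^2)$ operations, never forming the $n \times n$ product itself.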
Let $D = A \cdot B$, so that our goal is to determine whether the claimed product matrix $C$ actually equals the true product matrix $D$. Letting $[n]$ denote the set $\{1, \dots, n\}$, we claim that the above algorithm satisfies the following two conditions:

- If $C_{ij} = D_{ij}$ for all $(i, j) \in [n] \times [n]$, then the algorithm outputs YES for every possible choice of $r$.
- If $C_{ij} \neq D_{ij}$ for even one $(i, j) \in [n] \times [n]$, then the algorithm outputs NO with probability at least $1 - n/p$.
The first property is easy to see: if $C = D$, then clearly $Cx = Dx = A \cdot (Bx)$ for all vectors $x$, so the algorithm will output YES for every choice of $r$.
To see that the second property holds, suppose that $C \neq D$, and let $C_i$ and $D_i$ denote the $i$th row of $C$ and $D$ respectively. Obviously, since $C \neq D$, there is some row $i$ such that $C_i \neq D_i$. Recalling that $x = (r, r^2, \dots, r^n)$, observe that the $i$th entry of $Cx$ is $\sum_{j=1}^{n} C_{ij} \cdot r^j = r \cdot p_{C_i}(r)$, which is ($r$ times) the Reed-Solomon fingerprint of $C_i$ as in the previous section. Similarly, the $i$th entry of $Dx$ is $r \cdot p_{D_i}(r)$. Since $C_i \neq D_i$, the difference $r \cdot (p_{C_i}(r) - p_{D_i}(r))$ is a nonzero polynomial in $r$ of degree at most $n$, and hence by Fact 2 it has at most $n$ roots. The probability that $(Cx)_i \neq (Dx)_i$ is therefore at least $1 - n/p$, and in this event the algorithm outputs NO.
Whereas fingerprinting saved communication compared to a deterministic protocol, Freivalds' algorithm saves runtime compared to the best known deterministic algorithm. We can think of Freivalds' algorithm as our first probabilistic proof system: here, the proof is simply the answer matrix $C$ itself, and the $O(n^2)$-time verification procedure simply checks whether $Cx = A \cdot (Bx)$.
Freivalds actually described his algorithm with a perfectly random vector $x \in \mathbb{F}_p^n$, rather than $x = (r, r^2, \dots, r^n)$ for a random $r \in \mathbb{F}_p$. We chose $x$ of the latter form to ensure that the $i$th entry of $Cx$ is ($r$ times) the Reed-Solomon fingerprint of row $i$ of $C$, thereby allowing us to invoke the analysis from the previous section.