Lair Of The Multimedia Guru

2007-07-08

Reed Solomon codes part 2

Asymptotic complexity of best known (to me ;) ) decoding algorithm

O(n log n + t log2 t) for a (n,k) RS code over GF(n+1) and t=n-k

The proof for this is quite easy, syndrom calculation is just evaluating a polynomial at n-k points, and evaluating a polynomial (in GF(n)) at all points can be done with the GFFT actually evaluation at all points is the GFFT of the polynomial. Multiplying 2 polynomials is just 2xGFFT + componentwise multiplication + IGFFT. Finding the roots of a polynomial can as well be done by just evaluating it at all points. The only non trivial operation left for normal RS decoding is solving the key equation which is equivalent to euclids GCD algorithm as well as schönhages GCD algorithm, later has O(t log2 t) complexity (log2 t == (log t)2 in case thats unclear).

An alternative to GF(2x)

Normally RS codes are build over GF(2x) that way the bits of the elements of an RS codeword have a nice 1:1 mapping to x bits which can then be stored or transmitted, but it has a big disadvantage and that is that the GFFT for GF(y) needed for fast RS decoding is done with y-1 points and so it cannot use the well known power of 2 style FFT algorithms as 2x – 1 is not a multiple of 2. The solution is to use GF(2x+1), though note GF(2x+1) does not exist for all integer values of x, it only exists if 2x+1 is a power of a prime that is pj, 2 obvious choices using fermat primes are GF(28+1) and GF(216+1)

How do you store 2x+1 values in 2x values

Trivial ;)

The data part of our RS code is specified by the user and so it simply doesnt use the 2x+1 th symbol, actually it would be messy to use it. So the only problem left are the n-k parity symbols, which can trivially be transformed to not contain the annoying 2x+1 th symbol while at the same time maintaining the property of being an RS code

Let us assume that we have a symbol (at position y with value yv) in our k input symbols which is guranteed to have a value yv < 2x – n + k that is in practice less than one unused bit. Let p be the RS codeword with all k-1 data symbols 0 and the symbol at position y 1. The next step is to find all the values of the y element in our original codeword which would cause no parity symbol to have that annoying 2x+1 th value, for encoding we simply select the yv th element of this list as new yv element. For decoding we choose the number of elements in the list which are smaller than yv as our new element. As last step we just need to add a scaled version of p so as to actually have the wanted yv element and avoiding the nasty too large elements while also still having an RS code

Filed under: Error Correcting Codes — Michael @ 20:31

Reed Solomon codes

What is a reed solomon code

Lets assume we have k values out of which we want to build a reed solomon code, to do this we imagine that our k values specify the height (=y) of k points with x from 0 to k-1. Now we really have just made a silly graph of our data. Next we find a order k-1 polynomial which goes exactly through these points, this polynomial is unique, no other polynomial of order k-1 will go through these points. And last we just evaluate this polynomial on the points 0 … n-1, these n values are a reed solomon code, simple isnt it? Note the first k values are just our input data values which we already know.

Correcting erasures with RS codes

We now can make RS codes, but how can we correct erasures? Lets assume there are t erasures (erasures are errors where the location of the error is known). That means we know n-t values of our polynomial, and if t≤n-k then we can just find the remaining values by finding the (unique) polynomial which goes through the n-t values. Its also easy to show (just think that you have k-1 of your k data values) that if t>n-k then no code can correct the erasures, so RS codes are optimal in that sense

Correcting errors with RS codes

But what if we dont know where the errors are? Well just try all possible error locations of 0, 1,…,t errors, yes this is not practical but its nice to proof the error correcting capability. Now if we have t actual errors and we guess their locations correctly then we will find our correct polynomial and can correct the errors if we have at least k values left. The only thing now we need to find out is how large t can be so that we cant find a wrong polynomial before we find the correct one. The awnser is trivial actually, a polynomial of order k-1 is uniquely defined by k points so if we have t errors and guess all t error locations wrong then we effectively kill 2t points, and if there are less than k left then we could end up with a wrong polynomial. So we can correct (n-k)/2 errors. More generally reed solomon codes can correct 2*errors + erasures as long as thats ≤ n-k

Hamming distance

n-k+1 proof is trivial (smaller would contradict error correcting capability)

Practice

The above is true if our data and code values are real, rational or integer numbers (and others) but these are quite difficult to handle in reality as they arent bounded. Luckily all the above also works with finite fields so we can just work with polynomials over GF(256) or similar, which has the nice property that you can store such values in bytes while integers and reals can be quite hard to store in finite storage space

Filed under: Error Correcting Codes — Michael @ 00:28

Powered by WordPress