Lair Of The Multimedia Guru


Move to front coder/transform

First a little explanation about what a move to front coder is. Its a revesible transform which remaps symbols to an index into a table which initially contains (0,1,2,3,…) and every time a symbol is used its moved to the front of the table.

example: input: “to be or not to be” initial table: (abcdefghijklmnopqrstuvwxyz), for the first ‘t’ we output 19 as its the 19th entry in the table and move it to the front to build the new table (tabcdefghijklmnopqrsuvwxyz) the complete list of steps looks like:

abcdefghijklmnopqrstuvwxyz  19(t)
tabcdefghijklmnopqrsuvwxyz  15(o)
otabcdefghijklmnpqrsuvwxyz  27( )
 otabcdefghijklmnpqrsuvwxyz 4(b)
b otacdefghijklmnpqrsuvwxyz 7(e)
eb otacdfghijklmnpqrsuvwxyz 2( )
 ebotacdfghijklmnpqrsuvwxyz 3(o)
o ebtacdfghijklmnpqrsuvwxyz 19(r)
ro ebtacdfghijklmnpqsuvwxyz 2( )
 roebtacdfghijklmnpqsuvwxyz 17(n)
n roebtacdfghijklmpqsuvwxyz 3(o)
on rebtacdfghijklmpqsuvwxyz 6(t)
ton rebacdfghijklmpqsuvwxyz 3( )
 tonrebacdfghijklmpqsuvwxyz 1(t)
t onrebacdfghijklmpqsuvwxyz 2(o)
ot nrebacdfghijklmpqsuvwxyz 2( )
ot nrebacdfghijklmpqsuvwxyz 6(b)
bot nreacdfghijklmpqsuvwxyz 6(e)

As you can see the mtf transform will generally favor small numbers on “non random” data, on its own its not too usefull for compression but together with the BWT and a simple order 0 entropy coder its a pretty good general purpose compressor.

Now to the actual reason why i am writing this blog entry, can you use the symbol which the index represents as context for the entropy coder coding the index? You say no because thats like knowing the index together with the table.

Well if you think so, you are wrong, the trick is not to know the table :)

First to compress our data we run an mtf transform over it, we store the mtf table after the whole transform (this is not strictly needed but simplifies things). Next we encode the indexes from the end to the begin each time moving the zero entry of our table to the point pointed to by the index, each time we do know the symbol before the index and so can use it as context. This really is nothing else as the MTF coder in reverse.

An example, we start with the table (ebot nracdfghijklmpqsuvwxyz), the first symbol is ‘e’ the corresponding index is 6 (we know this from the forward MTF pass), so we store 6 with context ‘e’ and move e to the 6th place to find the table (bot nreacdfghijklmpqsuvwxyz), the next steps are all just the same than the MTF example above just in reverse.

Decompression is pretty much the same, read the table (again this isnt needed but its simpler as it avoids a 2nd pass over the data) then take the value from position 0 of the table and use it as context to decode the next index, after that use the index to move the entry from position 0 to its new position and repeat until you are out of input data.

Filed under: Entropy Coding — Michael @ 23:09


  1. Hmm, that differs a bit from canonical explanation.

    What can you make out of Distance Coding and Symbol Ranking?

    Comment by Kostya — 2007-06-17 @ 05:34

  2. > What can you make out of Distance Coding

    (DC)distance coding, also called (IF)inversion frequencies (ignoring minor details)
    simply stores the distances between symbols so that first all positions of all ‘A’ are stored as difference to the previous position of ‘A’, then all ‘A’ are removed from the input and the same is done with ‘B’ then ‘C’ and so on
    Note compression rate depends on the order in which symbols are coded as they are removed after they are coded (not removing them is always worse compression wise)

    and the papers ive found have somewhat obfuscated descriptions of DC/IF

    also funnily ive tested this (without knowing what its called) and many variations of it a while ago and the simple context based MTF coder above performed best
    also the bbb compressor performs VERY well without any second stage/MTF (replacement) …

    > and Symbol Ranking?

    ill look at that later …

    Comment by Michael — 2007-06-17 @ 21:55

  3. Your method can be compared with distance coding, the difference being that DC stores “how many characters between this and the next occurrence of character X” and you store “how many *unique* characters between this and the next occurrence of character X”, counting each kind of character at most once.

    Comment by uau — 2007-06-19 @ 22:15

RSS feed for comments on this post. TrackBack URI

Leave a comment

Powered by WordPress