Small tasks for FFmpeg
The FFmpeg todo list is getting longer and longer and melanson suggested me to try to talk about some small tasks here so as to maybe inspire some reader to contribute to ffmpeg :)
Optimal huffman tables for (M)JPEG encoding
this is pretty trivial, instead of encoding the stuff with the default huffman tables, simply store the data in a temporary buffer, build optimal huffman tables for the data, encode data with them
Building optimal huffman tables
In the unlikely case that the reader doesnt know how to build a optimal huffman table and to safe google from the millions of people looking that up after the post ;) heres how:
- you know the propability of each symbol (simply because you have “encoded” the image and know how often each symbol occured)
- merge the 2 least likely symbols into one which has their propablities sum as its propability, after that you have 2 symbols less and 1 more or simply one symbol less to deal with
- if you have more than 1 symbol left goto 2
- your huffman table/tree is finished, you only have to decide which side of each branch gets 1 and which 0, assigning these in any random or non random way will do and have no effect on the decodeability or optimality of the result. Note, (m)jpeg puts some restrictions on the 1/0 assignment so you cannot do them entirely random
Update 2007-04-26
As a reader has noticed (m)jpeg has some additional constraints which cannot be ignored when building huffman tables. So heres some update to deal with these constraints
Additional constraints for (m)jpeg
No code may have all 1 bits
This is trivial to avoid, simply add an additional dummy symbol with propability lower then all other symbols, such a code would be the longest after building the huffman table and could then simply be given all 1 bits during assinging bits for each branch.
The maximum code length is 16
To avoid this we need to change the symbol propabilities before huffman tree building. The optimal solution is the package merge alorithm
The package merge algorithm
- start with an empty list, lets call it list(0), set i=0
- add 1 entry to list(i) for each symbol we have and give each a score equal to the propability of the respective symbol
- merge the 2 symbols of least score and put them in list(i+1), list(i) will have 2 symbols less, list(i+1) will have 1 element more, the new score will be the sum of the 2 scores
- if there is more than 1 symbol left in the current list(i) then goto 3
- i++
- if i < 16 then goto 2
- select the n-1 elements in the last list with lowest score (n=the number of symbols)
- the length of the huffman code for symbol s will be equal to the number of times the symbol occurs in the select elements
Example: symbol propability A 1 B 2 C 5 D 10 E 21 optimal huffman tree A 1111 B 1110 C 110 D 10 E 0 assume we would rather want max length 3 list(0)= A1 B2 C5 D10 E21 list(1)= A1 B2 AB3 C5 D10 CD15 E21 list(2)= A1 B2 AB3 C5 ABC8 D10 E21 CDD25 list(3)= AB3 ABC8 ABCD18 CDDE46 A 111 B 110 C 101 D 100 E 0
Note: ffmpeg already contains code to build huffman tables from just the their lengths
Be careful, in JPEG there are additional constraints:
* Max code length is 16 bits
* No code can be all 1’s (1’s are reserved for byte-padding at the end; dumb, I know)
The first constraint is present in pretty much every dynamic huffman implementation in use (including zlib [*]) and requires thought. Contrary to popular belief, just 4180 symbols (*) are enough to get 17 bit code lengths (4179 in JPEG due to the second constraint).
(*) With Fibonacci-numbers frequencies, that is, 1, 1, 3, 5, 8, 13…
[*] deflate has a limit length of 15 bits, so with 2583 input symbols you can already have trouble, and that includes the block terminator and the match length codes with share the dictionary with the literal codes.
Comment by Not so easy — 2007-04-26 @ 03:33
18 symbols are enough to end with a optimal huffman table longer then 16 bits
propabilities: 1,2,4,8,16, … will do
Comment by Michael — 2007-04-26 @ 11:55
I was talking about how much input you need to achieve the probabilities, not the number of different symbols (you can’t have 4180 symbols; in JPEG the largest table is the AC one with up to 162 different symbols IIRC). My point is that with just 5K of input you can have trouble. I’ve seen a lot of people assume you need much more input and don’t care about this.
About step 4, you can’t decide which one gets a 0 and which one gets a 1 as the only thing you can specify are the lengths; the standard decides how to assign the actual codewords based on the lengths. Ditto for deflate. So you should really use that ffmpeg code you talk about :)
Comment by Not so easy — 2007-04-26 @ 13:17
> About step 4, you can’t decide which one gets a 0 and which one gets a 1 as the only thing you
> can specify are the lengths; the standard decides how to assign the actual codewords based on
> the lengths. Ditto for deflate. So you should really use that ffmpeg code you talk about
ive meant that you can decide it for the general case, jpeg of course puts some restrictions on it
ive updated the post to make this more clear
also its not true that the standard always decides the 0 vs. 1 it just does so for bits which seperate codewords of differing lengths. bits which seperate codewords of equal length can be selected at will i think
Comment by Michael — 2007-04-26 @ 16:07