mnzip
I always wanted to write my own general-purpose compressor, and a short while ago I actually did. It's based on the BWT + MTF variant described in yesterday's blog post, plus snow's/ffv1's range coder with a state transition table borrowed from bbb/paq8. Source code under the GPL is of course available too.
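For anyone unfamiliar with the first two stages, here is a minimal Python sketch of a BWT followed by plain MTF. It is not mnzip's code: it sorts rotations naively, uses plain MTF rather than the variant from the earlier post, and leaves out the range coder entirely. All function names are made up for illustration.

```python
def bwt(data: bytes) -> tuple[bytes, int]:
    """Naive Burrows-Wheeler transform: sort all rotations, keep the last column.

    Returns the last column plus the row index of the original input, which
    the inverse needs. Quadratic memory/time; real compressors use suffix sorting.
    """
    n = len(data)
    if n == 0:
        return b"", 0
    doubled = data + data  # doubled[i:i + n] is the rotation starting at i
    order = sorted(range(n), key=lambda i: doubled[i:i + n])
    last = bytes(doubled[i + n - 1] for i in order)
    return last, order.index(0)


def inverse_bwt(last: bytes, primary: int) -> bytes:
    """Undo bwt() given the last column and the row of the original input."""
    n = len(last)
    # A stable sort of positions by byte value maps first-column rows
    # to the matching occurrence in the last column.
    order = sorted(range(n), key=lambda i: last[i])
    out = bytearray()
    row = primary
    for _ in range(n):
        row = order[row]
        out.append(last[row])
    return bytes(out)


def mtf_encode(data: bytes) -> list[int]:
    """Plain move-to-front: emit each byte's current rank, then move it to rank 0."""
    table = list(range(256))
    ranks = []
    for b in data:
        idx = table.index(b)
        ranks.append(idx)
        table.insert(0, table.pop(idx))
    return ranks


def mtf_decode(ranks: list[int]) -> bytes:
    """Inverse of mtf_encode()."""
    table = list(range(256))
    out = bytearray()
    for idx in ranks:
        b = table.pop(idx)
        out.append(b)
        table.insert(0, b)
    return bytes(out)
```

On `b"banana"` the BWT output is `b"nnbaaa"` with the original at row 3, and MTF turns that into `[110, 0, 99, 99, 0, 0]`. The repeated bytes become zeros, which is what the entropy coder then gets to exploit.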
Compressed file sizes of the Canterbury Corpus (in bytes)
| | alice29.txt | asyoulik.txt | cp.html | fields.c | grammar.lsp | kennedy.xls | lcet10.txt | plrabn12.txt | ptt5 | sum | xargs.1 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| bzip2 1.0.3 | 43202 | 39569 | 7624 | 3039 | 1283 | 130280 | 107706 | 145577 | 49759 | 12909 | 1762 |
| 7zip 4.43 beta | 48555 | 44606 | 7692 | 3076 | 1350 | 50904 | 119569 | 165462 | 42153 | 9619 | 1860 |
| bbb ver. 1 | 40839 | 37818 | 7736 | 3253 | 1349 | 76523 | 101117 | 135829 | 44816 | 12593 | 1792 |
| mnzip r32 with plain MTF | 42698 | 39286 | 7572 | 2962 | 1227 | 19804 | 105883 | 143634 | 50624 | 12591 | 1698 |
| mnzip r32 | 40950 | 37835 | 7431 | 2983 | 1237 | 19287 | 101140 | 137191 | 45604 | 12428 | 1699 |
Time needed to compress
| | alice29.txt | asyoulik.txt | lcet10.txt | plrabn12.txt | cp.html | fields.c | grammar.lsp | kennedy.xls | ptt5 | sum | xargs.1 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| bzip2 1.0.3 | 0m0.166s | 0m0.133s | 0m0.533s | 0m0.633s | 0m0.047s | 0m0.037s | 0m0.007s | 0m1.062s | 0m0.151s | 0m0.056s | 0m0.006s |
| 7zip 4.43 beta | 0m0.539s | 0m0.417s | 0m1.732s | 0m2.161s | 0m0.070s | 0m0.035s | 0m0.019s | 0m6.048s | 0m1.402s | 0m0.105s | 0m0.022s |
| bbb ver. 1 | 0m2.675s | 0m2.271s | 0m7.455s | 0m8.599s | 0m0.559s | 0m0.344s | 0m0.230s | 0m17.446s | 0m45.407s | 0m0.813s | 0m0.235s |
| mnzip r32 | 0m0.273s | 0m0.206s | 0m0.951s | 0m1.099s | 0m0.031s | 0m0.012s | 0m0.006s | 0m3.545s | 0m1.173s | 0m0.051s | 0m0.006s |
Time needed to decompress
| | alice29.txt | asyoulik.txt | lcet10.txt | plrabn12.txt | cp.html | fields.c | grammar.lsp | kennedy.xls | ptt5 | sum | xargs.1 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| bzip2 1.0.3 | 0m0.063s | 0m0.049s | 0m0.177s | 0m0.222s | 0m0.007s | 0m0.003s | 0m0.002s | 0m0.210s | 0m0.053s | 0m0.009s | 0m0.003s |
| 7zip 4.43 beta | 0m0.033s | 0m0.027s | 0m0.066s | 0m0.085s | 0m0.009s | 0m0.011s | 0m0.007s | 0m0.099s | 0m0.043s | 0m0.016s | 0m0.006s |
| bbb ver. 1 | 0m2.265s | 0m1.918s | 0m6.015s | 0m6.916s | 0m0.511s | 0m0.332s | 0m0.231s | 0m13.492s | 0m6.660s | 0m0.715s | 0m0.237s |
| mnzip r32 | 0m0.073s | 0m0.061s | 0m0.215s | 0m0.261s | 0m0.010s | 0m0.005s | 0m0.003s | 0m0.441s | 0m0.155s | 0m0.017s | 0m0.002s |
Options used were -9 for bzip2, -mx=9 for 7zip, and f for bbb to use its fast but memory-hungry mode (this doesn't affect the compression rate for bbb). The benchmark scores are based on a single run each, not a proper mean, so don't take them too seriously, and I hope I've not messed up the file order ;)
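If someone wants less noisy numbers, a rough sketch of repeated timing could look like the following. It is only an illustration: it times Python's in-process `bz2` module at level 9 as a stand-in rather than any of the command-line tools above, and `time_runs` is a made-up helper name.

```python
import bz2
import statistics
import time


def time_runs(func, data: bytes, runs: int = 5) -> dict:
    """Call func(data) `runs` times and summarize the wall-clock seconds."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        func(data)
        samples.append(time.perf_counter() - start)
    return {
        "mean": statistics.mean(samples),
        "min": min(samples),
        # stdev needs at least two samples
        "stdev": statistics.stdev(samples) if runs > 1 else 0.0,
    }


if __name__ == "__main__":
    # Synthetic input; swap in the bytes of a corpus file to measure real data.
    sample = b"the quick brown fox jumps over the lazy dog\n" * 2000
    stats = time_runs(lambda d: bz2.compress(d, compresslevel=9), sample)
    print(f"mean {stats['mean']:.4f}s  min {stats['min']:.4f}s  stdev {stats['stdev']:.4f}s")
```

Reporting the minimum next to the mean helps, since the minimum is the run least disturbed by whatever else the machine was doing.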
Patches are welcome!