Lair Of The Multimedia Guru

2006-07-21

Division by zero

Have you ever been curious why you can't divide by zero?

The intuitive explanation with division of positive integers

To divide a by b means to remove b from a as often as possible and to assign the number of times you could do that to a/b, with whatever is left over being the remainder of the division.
If b is 0 you can remove it as often or as rarely as you want, nothing will ever change.

The intuitive explanation with division as the inverse of multiplication

If you take a number a, multiply it by b and then divide by b, so a*b/b, then you should get a again if division reverses what multiplication did. But if b is 0 then a*b is a*0, which is 0 no matter what a was, so there's no information left about the value a had and consequently no way to reverse it.

The Analytical explanation

If a is a non-zero constant, then a/b diverges to infinity as b approaches 0, and it converges to x if a=xb.
Note, it's important to remember here that the limit of a/b as b approaches 0 is not the same as a/0.

The Algebraic explanation

All the above really just deals with the common numbers like integers, reals, complex numbers, …
So you might ask yourself whether it isn't maybe possible to invent some other algebra (a set of elements with some operations like addition and multiplication) in which we can divide by zero. Well, the answer is of course you can; the problem is just that you have to give up some of the well known properties of the common numbers.
Let's see which set of properties is incompatible with division by zero.
To speak about division by zero we need at least:

  1. a set of elements S for our algebra A
  2. an additive identity element (x + 0 = 0 + x = x for all x)
    we use “0” here for simplicity even though the element doesn't need to be related to our well known zero
  3. of course a + and a * operation, which again of course don't have to be the same as common addition and multiplication
  4. 0 to have an inverse ((x*0)*0⁻¹ = x for all x)

With that set of rules one can easily find many algebras in which you can divide by zero; for example, just define the * operation to be identical to +, and that identical to the common addition. Division by zero is then just subtraction of zero, totally useless but it works :)
Another property we likely want is a multiplicative identity element (x * 1 = 1 * x = x for all x) and the distributive law ((a+b)*c = a*c + b*c and c*(a+b) = c*a + c*b), but sadly that already starts causing problems:

0*0 + 0   = 0*0    definition of the identity element of +
0*0 + 1*0 = 0*0    definition of the identity element of *
(0 + 1)*0 = 0*0    distributive law
1*0       = 0*0    definition of the identity element of +
1         = 0      inverting the multiplication by 0

so our algebra would need the identity element of + and the identity element of * to be identical, hmm …

and another odd result:

x*1     = x    definition of the identity element of *
x*(1+1) = x    definition of the identity element of + and 1=0
x+x     = x    distributive law

Now the question is, is there actually a non trivial (more than 1 element) algebra left at all?
It seems there is: assume our algebra has 2 elements, “0” and “#”,
and we define addition and multiplication like this:

+/*  0  #
 0   0  #
 #   #  #

It's immediately obvious that the rules for the 2 identity elements hold.
Does the distributive law hold too? It does, which can be proven by simply looking at the 8 possible cases (see the little check below). And an inverse element for multiplication by 0? Well, 0 is its own inverse, obviously.
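
For the skeptical, here is a tiny brute force check of those 8 cases (just an illustrative sketch; representing “#” as 1 is my own choice, with that the +/* table above becomes a bitwise or):

#include <stdio.h>

/* the two elements "0" and "#" encoded as 0 and 1, + and * are the table above */
static int add(int a, int b){ return a | b; }
static int mul(int a, int b){ return a | b; }

int main(void){
    for(int a=0; a<2; a++)
        for(int b=0; b<2; b++)
            for(int c=0; c<2; c++)
                if(mul(add(a,b),c) != add(mul(a,c),mul(b,c)) ||
                   mul(c,add(a,b)) != add(mul(c,a),mul(c,b)))
                    printf("distributive law fails for %d %d %d\n", a, b, c);
    printf("done, no output above means all 8 cases hold\n");
    return 0;
}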

Next, we drop the multiplicative identity element again and instead try to add a unique multiplicative inverse element for every element, not just for zero (a*x=b solvable for all a, b); without that we would either just turn the division by zero problem into a division by foobar problem, or we wouldn't be able to reach some elements. Sadly, only the trivial 1 element algebra is left then:

a*b   = a*(b+0)      definition of the identity element of +
a*b   = a*b + a*0    distributive law
0     = 0 + a*0      choose b so that a*b=0
0     = a*0          definition of the identity element of +
0*0⁻¹ = a            inverting the multiplication by zero

The last line means: every algebra with the properties above will have exactly one element and be useless.
Alternatively, if we replace the property that every element must have a unique multiplicative inverse by requiring addition to be invertible ((a+b) + (-b) = a), then we too end up with the useless 1 element algebra:

a = (a*0)*0⁻¹              definition of the inverse of 0
a = (a*(0+0))*0⁻¹          definition of the identity element of +
a = (a*0 + a*0)*0⁻¹        distributive law
a = (a*0)*0⁻¹ + (a*0)*0⁻¹  distributive law
a = a + a                  substituting from row 1
0 = a                      inverting the addition of a

I guess that's enough math for today, and enough chances to embarrass myself with a trivial typo somewhere …

Filed under: Uncategorized — Michael @ 11:25

2006-07-08

UTF-8

UTF-8 is the most common encoding used to store Unicode; the following table summarizes how it works:

number of bytes  byte1      byte2      byte3      byte4      range of unicode symbols
1                0XXX XXXX                                   0-127
2                110X XXXX  10XX XXXX                        128-2047
3                1110 XXXX  10XX XXXX  10XX XXXX             2048-65535
4                1111 0XXX  10XX XXXX  10XX XXXX  10XX XXXX  65536-2097151

Features

  • 0-127 ASCII is stored as is in UTF-8
  • bytewise substring matching works as is; in other words, one UTF-8 symbol will never match anything but the very same UTF-8 symbol, never just the last byte of another symbol
  • UTF-8 is easy to distinguish from a random byte sequence
  • when starting at a random byte, resynchronization is trivial and always happens at the next symbol

S1D-7

The biggest problem of UTF-8 is that it's not very compact; it needs more bytes than other (non-Unicode) encodings. Can we do better without losing its advantages? Well, apparently yes, which is sad because you'd expect the standards committees to do better …
Why “S1D-7”? Well, 1 “stop” bit to determine whether the byte is the last one and 7 data bits per byte (a small encoder sketch follows after the table)

number of bytes  byte1      byte2      byte3      range of unicode symbols
1                0XXX XXXX                        0-127
2                1XXX XXXX  0XXX XXXX             128-16383
3                1XXX XXXX  1XXX XXXX  0XXX XXXX  16384-2097151
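
Here is a small encoder sketch matching the table above (my own illustration code, the function name is made up); the 7 bit groups are written most significant first, with the high “stop” bit set on every byte except the last:

#include <stdio.h>
#include <stdint.h>

/* encode one symbol (0..2097151) into 1..3 bytes, returns the number of bytes */
static int write_s1d7_symbol(uint8_t *out, unsigned val){
    int n= 1 + (val > 127) + (val > 16383);
    for(int i=0; i<n; i++){
        uint8_t bits= (val >> (7*(n-1-i))) & 127;
        out[i]= bits | (i < n-1 ? 128 : 0);   /* stop bit clear only on the last byte */
    }
    return n;
}

int main(void){
    uint8_t buf[3];
    int n= write_s1d7_symbol(buf, 256);
    for(int i=0; i<n; i++)
        printf("%02X ", buf[i]);              /* prints: 82 00 */
    printf("\n");
    return 0;
}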

Compactness differences

  • In all cases S1D-7 is able to store as many or more symbols per number of bytes
  • where UTF-8 needs up to 4 bytes to store all symbols, S1D-7 never needs more than 3
  • Many languages which need 3 bytes per symbol in UTF-8 can be encoded with 2 bytes per symbol; just look at the Basic Multilingual Plane at Wikipedia, all the pages which are in the range 0x0800 to 0x3FFF need 3 bytes per symbol in UTF-8 but only 2 bytes per symbol in S1D-7

Features

Well, the argument on the Wikipedia page to justify UTF-8's bloatedness is that its “advantages outweigh this concern”, so let's see if our simpler encoding, which requires fewer bytes per symbol and is much more compact, loses any features

0-127 ASCII is stored as is in UTF-8

same, no change here

Substring matching

In UTF-8 we can simply use bytewise substring matching, while in S1D-7 we need to write a slightly different substring matching function, though both are very simple, see below

int match_utf8_or_ascii(uint8_t *heap, uint8_t *needle){
    int i,j;

    for(i=0; heap[i]; i++){
        for(j=0; needle[j]; j++){
            if(heap[i+j] != needle[j])
                break;
        }
        if(!needle[j])
            return i;
    }
    return -1;
}

int match_S1D8(uint8_t *heap, uint8_t *needle){
    int i,j;

    for(i=0; heap[i]; i++){
        for(j=0; needle[j]; j++){
            if(heap[i+j] != needle[j])
                break;
        }
        if(!needle[j] && !(i && heap[i-1]>127))
            return i;
    }
    return -1;
}
Distinguishing S1D-7 from random bytes

Due to the lower overhead there is less to check, so it consequently needs more text to detect reliably, but the check is very simple: just check whether there are any 3 consecutive bytes which are each larger than 127, and if so you know it's not S1D-7. To distinguish it from UTF-8 just run your UTF-8 detection over it; it's very unlikely that S1D-7 will be parsed correctly by a UTF-8 parser.
Not to mention, the encoding should be specified somehow anyway and not guessed by such encoding analysis.
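
A minimal sketch of that check (my own code, the name is made up):

#include <stdint.h>

/* returns 0 if buf cannot be S1D-7, i.e. it contains 3 consecutive bytes > 127 */
int could_be_s1d7(const uint8_t *buf, int len){
    int run= 0;
    for(int i=0; i<len; i++){
        run= buf[i] > 127 ? run+1 : 0;
        if(run >= 3)
            return 0;
    }
    return 1;
}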

resynchronization

is even more trivial than in UTF-8. In UTF-8 you search for a byte < 128 or a byte >= 192; when you find one you know it's the first byte of a symbol.
In S1D-7 you search for a byte < 128; when you find one you know that the next byte is the first byte of a symbol.
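
Both resynchronization rules fit in a couple of lines; a sketch (my own code, names made up):

#include <stdint.h>

/* skip UTF-8 continuation bytes (10XX XXXX), the result points at a symbol start */
const uint8_t *resync_utf8(const uint8_t *p){
    while((*p & 0xC0) == 0x80)
        p++;
    return p;
}

/* skip S1D-7 non-final bytes (high bit set); the byte after the first one < 128
 * starts the next symbol, stop early at the string terminator */
const uint8_t *resync_s1d7(const uint8_t *p){
    while(*p & 0x80)
        p++;
    return *p ? p+1 : p;
}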

parsing

Parsing S1D-7 is much easier than UTF-8, see:

int read_utf8_symbol(uint8_t **b){
    int val= *((*b)++);
    int ones=0;

    while(val&(128>>ones))
        ones++;

    if(ones==1 || ones>4)
        return -1;
    val&= 127>>ones;
    while(--ones > 0){
        int tmp= *((*b)++) - 128;
        if(tmp>>6)
            return -1;
        val= (val<<6) + tmp;
    }
    return val;
}

int read_s1d7_symbol(uint8_t **b){
    int i, val=0;

    for(i=0; i<3; i++){
        int tmp= *((*b)++) - 128;
        val= (val<<7) + tmp;
        if(tmp < 0)
            return val + 128;
    }
    return -1;
}
zero byte occurrences (new 2006-07-21)

This issue has been raised by a comment …
While UTF-8 guarantees that a zero byte in the encoding will occur only in a zero symbol, S1D-7 does not guarantee that; for example the value 256 is encoded as 0x82 0x00. UTF-16 also has this problem.
Zero bytes are problematic if the encoded strings are passed through code which has been written for one byte per symbol zero terminated strings. Of course you should never pass something through code which is designed for a different encoding, but if for whatever odd reason you still do that, then one possible solution is to simply change the value before encoding and after decoding to avoid zero bytes; this is actually very easy:

int avoid_zero(int x){
    x+= x<<7;
    return (x + (x>>14))>>7;
}
int reverse_avoid_zero(int x){
    return x - (x>>7);
}

these transforms have almost no effect on the space efficiency of S1D-7

occurrence of specific bytes (new 2006-07-21)

This issue has been raised by a comment …
While UTF-8 guarantees that no ASCII bytes below 128 will occur in the encoding of symbols larger than 128, S1D-7 does not guarantee that; UTF-16 also has this “problem”.
Again, this “problem” is limited to the case of passing encoded strings through code which expects some other encoding, and that's something you should never ever do; not only will the code not be able to make sense of at least some symbols, it's likely going to lead to bugs and exploits. If strings have to be passed through code which expects another encoding, then you should convert your strings to that encoding and escape the problematic symbols somehow.
S1D-7 can also easily be changed to avoid specific low < 128 bytes in its encoded representation, although that's just for demonstration and shouldn't be done in practice, as it's very bad to pass strings through code which expects another encoding; if your filesystem for example expects filenames in ASCII without specific letters, then that's what the filenames should be encoded with, and letters which cannot be represented should be encoded with some sort of escape sequence.

int avoid_some_bytes(int x){
    if(x>127) x= (x<<1) ^ table1[x&63];
    return x;
}
int reverse_avoid_some_bytes(int x){
    if(x>127) x= (x ^ table2[x&127])>>1;
    return x;
}

With these transforms any 64 low ASCII bytes can be avoided at the cost of 1 bit per symbol; that makes S1D-7 somewhat worse than before, but still much better than UTF-8 in space efficiency.

Filed under: Uncategorized — Michael @ 12:40

2006-06-09

Deinterlacing filters

Below you will find a comparison of various deinterlacing filters; the interlaced source was created with mplayer, using the phase=t and tinterlace=1 filters, from the well known foreman video

All pictures were exported with mplayer -vo pnm and ppmtojpg --optimize --quality 90. Why not -vo jpeg? Well, I didn't have -vo jpeg compiled in, and -vo jpeg only accepts RGB input, so there wouldn't have been any advantage in using it …

Most deinterlacing filters only output one frame for every 2 fields; that, combined with the fact that some always choose the even field, some always the odd and some always the first no matter whether that's even or odd, made this comparison less fun than I originally hoped. Due to this little issue several filters (pp=ci and kerndeint IIRC) had to be fed with vertically flipped video, and tomsmocomp even needed -vf phase=b to output a frame based on the correct field. And yes, I of course did specify the top vs. bottom field first flag correctly, as long as the filter had such an option …

Original non interlaced image
interlaced image mplayer -vf phase=t,tinterlace=1
mplayer -vf pp=lb (linear blend)
mplayer -vf pp=l5 (5tap lowpass filter)
transcode -J smartyuv
mplayer -vf pp=fd
mplayer -vf pp=md (median deinterlacer)
mplayer -vf pp=li (linear interpolate)
mplayer -vf pp=ci (cubic interpolate)
transcode -J dilyuvmmx
mplayer -vf kerndeint (Donald Graft’s adaptive kernel deinterlacer)
transcode -J smartdeinter (VirtualDub’s smart deinterlacer)
transcode -J tomsmocomp (Tom’s Motion Compensation deinterlacing filter)
mplayer -vf yadif=1:1
mplayer -vf yadif=3:1
mplayer -vf yadif=1:1,mcdeint=2:1:10
mplayer -vf yadif=3:1,mcdeint=2:1:10
Filed under: Uncategorized — Michael @ 16:48

2006-06-02

Optimal (A)DPCM encoding

ADPCM in general encodes differences between samples using a small number of possible difference values and adapts which values are allowed depending upon past difference values
( … we all know that)

ADPCM encoders normally work based on the principle that they select the best possible difference between the last sample and the current sample while ignoring both future and past; that's not optimal at all

A Viterbi based encoder would select the optimal sequence of differences which minimizes some distortion measure, like the sum of squares; it achieves this by successively finding the optimal encodings up to a specific state. Or, in other words:
instead of keeping a single sequence of encoded differences of the past samples and then encoding the next difference,
we keep track of the optimal sequence of differences / encoded bitstream up to the current sample for every state (0..88 step index + a few sample values around the ideal one); next we just calculate the optimal bitstreams up to the next sample using the current sample and the optimal ones up to the last sample

Maybe a concrete example will help; we use non adaptive DPCM here, but the principle is the same (a small code sketch follows after the listing) …

Let's say we have the input samples 0,2,6,4,-4
and our example encoder can encode +4,+1,-1,-4 differences.
A conventional encoder would output 0,1,5,4, 0 with distortion=18,
optimal would be 1,2,6,2,-2 with distortion=9,
and if we force the first output to 1 and then use a conventional encoder we get 1,2,6,5, 1 with distortion=27.

viterbi encoder:

1. iteration
-2              distortion 4
-1              distortion 1
 0              distortion 0
 1              distortion 1
 2              distortion 4
2. iteration
 1, 0           distortion 5
 0, 1           distortion 1
 1, 2           distortion 1
-1, 3           distortion 2
 0, 4           distortion 4
3. iteration
-1, 3, 4        distortion 6
 0, 1, 5        distortion 2
 1, 2, 6        distortion 1
-1, 3, 7        distortion 3
 0, 4, 8        distortion 8
4. iteration
 1, 2, 6, 2     distortion 5
-1, 3, 7, 3     distortion 4
 0, 1, 5, 4     distortion 2
 1, 2, 6, 5     distortion 2
 0, 1, 5, 6     distortion 6
5. iteration
not possible,-6
not possible,-5
not possible,-4
not possible,-3
 1, 2, 6, 2,-2  distortion 9
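
Here is a small self-contained sketch of that idea applied to the example above (my own illustration code, not from any real encoder; the ±4 state window and all names are my choice):

#include <stdio.h>
#include <limits.h>

#define N       5
#define WINDOW  4               /* candidate output values: input-4 .. input+4 */
#define STATES  (2*WINDOW+1)

static const int step[4]  = {4, 1, -1, -4};
static const int input[N] = {0, 2, 6, 4, -4};

int main(void){
    int dist[N][STATES];        /* best total distortion ending in each state */
    int prev[N][STATES];        /* backpointer to the state one sample earlier */

    for(int i=0; i<N; i++)
        for(int s=0; s<STATES; s++)
            dist[i][s]= INT_MAX;

    for(int i=0; i<N; i++)
        for(int s=0; s<STATES; s++){
            int value= input[i] - WINDOW + s;
            int d    = (value - input[i]) * (value - input[i]);
            if(i == 0){                      /* first sample has no predecessor */
                dist[0][s]= d;
                prev[0][s]= -1;
            }else{
                for(int ps=0; ps<STATES; ps++){
                    if(dist[i-1][ps] == INT_MAX)
                        continue;
                    int pvalue= input[i-1] - WINDOW + ps;
                    for(int k=0; k<4; k++)
                        if(pvalue + step[k] == value &&
                           dist[i-1][ps] + d < dist[i][s]){
                            dist[i][s]= dist[i-1][ps] + d;
                            prev[i][s]= ps;
                        }
                }
            }
        }

    /* pick the best final state and walk back through the table */
    int best= 0;
    for(int s=1; s<STATES; s++)
        if(dist[N-1][s] < dist[N-1][best])
            best= s;

    int out[N];
    for(int i=N-1, s=best; i>=0; s= prev[i--][s])
        out[i]= input[i] - WINDOW + s;

    printf("distortion %d:", dist[N-1][best]);
    for(int i=0; i<N; i++)
        printf(" %d", out[i]);
    printf("\n");               /* prints: distortion 9: 1 2 6 2 -2 */
    return 0;
}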
Filed under: Uncategorized — Michael @ 20:18

2006-05-23

PT_GNU_STACK

You upgrade your Linux distro and suddenly things are failing with error messages like "error while loading shared libraries: libfoobar.so.123: cannot enable executable stack as shared object requires: Permission denied". What happened?

Well, you have just become yet another witness of the decay of the GNU toolchain. Besides gcc and glibc becoming more and more bloated every day, gcc bug reports being randomly closed without fixing them, and so on, we have for some time had the beautiful PT_GNU_STACK feature in every GNU tool, which tries very hard to make the stack of applications executable. Without grsec your system will be more vulnerable; with grsec stuff will fail hard, as grsec will not give apps an executable stack if they don't need one, and the Red Hat, ehm I mean GNU, libc/loader/… will then simply refuse to continue if the unnecessary executability isn't provided

what exactly is PT_GNU_STACK?

PT_GNU_STACK is an entry in the ELF file format which contains the access rights (read, write, execute) of the stack. So far so good, nothing bad here; the problem is how these access rights are chosen by the tools …

The idea behind PT_GNU_STACK / choosing the access rights

The stack is always readable and writable, obviously that's needed, so we are left with the question of it being executable. When linking objects together into a library, or libraries into an application, a single object requiring an executable stack will cause the final application and all libraries which contain that object to need an executable stack; if no object needs an executable stack, the final application won't need one either

The design and implementation of PT_GNU_STACK

Every tool in the GNU toolchain will, from source over assembly, objects and libraries to the final application, merge the PT_GNU_STACK access rights so that a single object needing an executable stack will cause the final application to need one; in case of doubt (no PT_GNU_STACK entry) the default is that the specific object needs an executable stack

The problem with this / the total failure of the PT_GNU_STACK idea

Undecidability at the compilation and assembly level

Deciding whether a piece of code needs an executable stack is provably not possible; it is indeed equivalent to solving the halting problem.
So gcc, gas and so on guess whether a piece of code needs an executable stack, and if they are unsure, they claim that it does

Undecidability at the global level

For example, when calling a function through a global pointer, the compiler compiling that file can't decide about the stack access rights, as it doesn't know all the code which changes the pointer; the linker later would have to disassemble the object files again just to have the info needed to decide it …

several definitions of executability

If you execute code on the stack, does it need to be executable? If you use Red Hat's Exec Shield then AFAIK yes; if you use grsec it depends upon the code, as grsec can detect and emulate some common and harmless sequences of instructions on the stack (trampolines)

Even if an application needs an executable stack for some feature

There's no need to have an executable stack if that feature isn't used. Actually, applications might explicitly try to enable / disable executability of a specific part of memory depending upon their needs; gcc will in that case almost certainly claim that the app always needs an executable stack …

All these things cause applications to claim that they require an executable stack while they simply do not, which makes your system much more vulnerable to attacks

The alternative

Mark final applications which need an executable stack as such, and make the default not requiring an executable stack

what's the difference?

Well, in practice more than 99% of all applications don't need an executable stack, and if an application needs an executable stack but doesn't have one, then that causes a clear failure which will be noticed immediately and can be fixed by marking that application as needing an executable stack.
OTOH the current PT_GNU_STACK system, with its pass-the-flags-from-source-to-final-app design, ideally needs corrections for each source file which the GNU tools can't deal with automatically, and there are many more source files than applications; not to mention that the error rate at the application level seems to be far in excess of 1%.
Furthermore, marking an app as requiring an executable stack while it actually doesn't won't produce any warning, error or other hint that something is wrong, while security is greatly impaired

Filed under: Off Topic — Michael @ 14:29

2006-05-19

Coding Style / Code Conventions

There are many coding style guides, IMO actually significantly too many … their existence is often only justified by the single hypothesis that having everything in the same style is more readable than having one style per file and letting the original author / maintainer decide it. Now maybe my memory sucks, but I don't remember a single case where a difference in coding style between different parts of a project caused me any difficulty in understanding. It was rather the lack of any indentation, the exclusive use of single letter variables, or simply the complexity of the code, the lack of comments, and my stupidity of course, which caused difficulty understanding something …

Hmm, mphq CVS is still dead :( so let's try to write the ultimate coding style guide and hope mphq works when I'm done, so I can finally commit a few minor bugfixes

The ultimate coding (style) guide

Rule 1

Every rule must be justifiable, and the justification must not apply to a contradicting rule too.
Yes, I'm trying to avoid the “for consistency” justification, which can be applied to literally anything

Rule 2a

Write consistent code / Changes should be approximately in the same style as the surrounding code
Justification: having the coding and indentation style change every few lines makes code somewhat harder to read, while an overly strict style restriction will cause some volunteers (or employees) to leave

Rule 2b

Write indented code (code within a block should be at the same indentation level, code in parent and child blocks should be at different levels); placement and indentation of stuff like {} is not restricted here, just the actual code
Justification: see mplayer.c for what happens without that rule :)

Rule 2c

Avoid constructs which IDEs/editors have difficulty with; common examples are non-ASCII letters, tab/space mixes, differing tab lengths in different files, DOS line endings, …
Justification: obvious

Rule 2d

Avoid non-English (in identifiers and comments)
Justification: not everyone speaks every language; English is understood by most people. Of course, if every programmer you care about understands Hungarian, then it's a perfect choice too

Rule 3

Maximize useful information at minimum size and complexity (simple code preferred)

So a variable name like szMyCompleteName is normally worse than simply name, as the type is probably obvious and “complete” adds no new info … Of course a function which converts zero terminated strings to length + chars style would benefit from having the type in the names; it all depends upon how and where a variable is used.
Comments should not say what's already obvious from the code, like int i=sqrt(v); //i is the square root of v, but instead provide useful info like //i is the largest factor v can contain, so we only need to test integers less than i, but then in that case maybe renaming i to largest_factor would have provided more info at less size …
Justification: the idea is to provide the reader with the info she is interested in, not to duplicate and spread out the important stuff between noise;
also, less code means fewer places where a bug could be.
Note: this rule also forbids code duplication and other bloatedness

Rule 4

Write efficient code
Note: efficient here means little memory, disk, cpu, …
Justification: well this one should be obvious

Rule 5

Write portable code
Justification: well this one should be obvious too

Rule 6

Write localized / internationalized applications (user visible stuff in her native language)
Note: maybe error messages should be dealt with differently, as the developers need to understand them in bug reports …
Justification: well this one should be obvious too

Rule 7

Write compatible code (don't break compatibility of file formats, your last library release, …)
Justification: well this one should be obvious too

Rule 8

Write secure code (don't write over the end of an array, check untrusted input carefully, be careful with malloc(w*h) due to possible overflow of the multiplication, … see the small sketch below)
Justification: well this one should be obvious too
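
A minimal sketch of the malloc(w*h) point (a hypothetical helper, my own code):

#include <stdlib.h>
#include <stdint.h>

/* refuse to allocate if w*h would overflow size_t instead of silently
 * allocating a too small buffer */
void *alloc_image(size_t w, size_t h){
    if(h && w > SIZE_MAX / h)
        return NULL;
    return malloc(w * h);
}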

Rule 9

Write modular code / split code into independent parts
Justification: many dependencies make code very hard to understand and maintain

Rule 10

If some rules contradict, use common sense (which goal is more important in a specific case)

Filed under: Off Topic — Michael @ 19:49

2006-05-15

TV sets

It's been a month since my last blog entry; I'm not sure if that's bad or not. Anyway, here's some more halfway off-topic crap: I want(ed) to buy a new TV set, as the one I have here is _old_ and doesn't really work very well …

First question CRT or LCD?

CRTs are old; many shops don't have any CRT based TV sets anymore, and everyone seems to know they are old and inferior.
LCDs: shiny new technology, flatter and better. Well, wait, why exactly does every LCD TV below 1000€ have a _vastly_ inferior picture quality compared to my mother's el-cheapo noname CRT TV, which probably cost < 200€ … and I'm not saying the ones above 1000€ were better, they were just out of my acceptable price range, and many were showing high resolution HDTV, which made a fair comparison hard …

Size

For CRTs, size seems strongly correlated with quality and features, no 55cm 100Hz TV sets for example; it also seemed like larger isn't really more expensive. For LCDs, of course, larger is much more expensive …

100hz CRT vs 50hz

50Hz flickers, and actually I somehow can't get rid of the feeling that it flickers more strongly than 50Hz TV sets did a few years ago.
100Hz doesn't flicker. Wait, actually I found just one 100Hz TV set < 300€ which didn't have at least part of the image flickering; all others had parts of the TV channel's logo flickering. Also, with 100Hz you can choose between some residual interlacing artifacts or “staircase” artifacts on moving objects; I've not found any 100Hz TV set which didn't show some artifacts. Although in the end I prefer 100Hz over 50Hz, the artifacts are less annoying than the flickering IMHO

Comparing TV sets

Well, you walk into a shop with TV sets, many TV sets … all showing different stuff, and if you are lucky and they show the same thing, then it's a crappy, dark, low quality music video from MTV, completely unsuitable for comparing. But wait, one shop had remote controls “attached” to each TV set so customers could change settings and channels as they liked; sadly, less than half the remotes were working …

Summary, conclusion and personal opinion (yeah, as if the whole blog was anything but subjective personal opinion)

LCDs have improved, more in the written numbers than in reality; CRTs seem to have become worse, but still LCDs are behind CRTs both in quality and in quality per price.
Many TV sets, be it CRT or LCD, show terrible artifacts:
50Hz CRTs: flickering
100Hz CRTs: staircase or interlacing artifacts on moving objects
CRTs in general: some cases of brightness-size dependence, most have oversharpened and too strongly saturated images (yeah, the last 2 are purely an adjustment thing, probably done as most customers are idiots and prefer oversharpened and too strongly saturated images)
LCDs: washed out blurry images, colors more or less wrong, artifacts caused by slow reaction times, not to mention 2-4 times as expensive as CRTs

Filed under: Off Topic — Michael @ 11:56

2006-04-15

Decrypting Nagravision

Now, after breaking VideoCrypt, let's take a look at (old analog) Nagravision

Nagravision

the PRNG seed

Same as VideoCrypt (binary data from the vertical blanking interval is fed into a smartcard which then, among other things, produces a PRNG seed for the Nagravision decoder, but none of this matters to us here)

the PRNG

The PRNG (pseudo random number generator) is fed with a 15 bit seed and a table of 256 5 bit entries, and produces 287 5 bit values; the following code will do exactly that:

#include <stdint.h>
#include <string.h>

extern const uint8_t keyTable[256]; /* the 256 entry 5 bit table mentioned above */

void prng(int seed, int *cle2enc){
  uint8_t keyNdx= seed >> 7;
  int keyInc= ((seed & 0x7F) << 1) + 1;
  int buffer[32];

  for(int i=0; i<32; i++) buffer[i]= i;

  for(int i=0; i<255; i++){
    cle2enc[i]= buffer[ keyTable[keyNdx] ];
    buffer[ keyTable[keyNdx] ]= i + 32;
    keyNdx+= keyInc;
  }

  memcpy(cle2enc+255, buffer, 32*sizeof(int));
}
the actual decryption

Nagravision has a 32 line buffer, each entry of which can contain one line of video. For each of the 287 lines of a field there are 5 bits from the PRNG which are used as an index into that buffer, deciding into which entry to store the next received line and from where to output one; in other words, Nagravision reorders lines using a 32 line temporary buffer (a small sketch of such a reordering decoder follows below)
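
Here is a sketch of that reordering mechanism (my own illustration, not the original 2010 code; the line length, the preloading of the buffer with the first 32 incoming lines and all names are assumptions):

#include <string.h>
#include <stdint.h>

#define LINE_LEN    720         /* samples per line, an arbitrary choice here */
#define FIELD_LINES 287

/* idx[] holds the 287 5 bit PRNG values; each value selects the buffer slot
 * from which a line is output and into which the next received line is stored */
void reorder_field(uint8_t out[FIELD_LINES][LINE_LEN],
                   uint8_t in [FIELD_LINES][LINE_LEN],
                   const uint8_t idx[FIELD_LINES])
{
    uint8_t buffer[32][LINE_LEN];
    int received= 0;

    for(int i=0; i<32; i++)                        /* preload the buffer */
        memcpy(buffer[i], in[received++], LINE_LEN);

    for(int i=0; i<FIELD_LINES; i++){
        memcpy(out[i], buffer[idx[i]], LINE_LEN);  /* output from the selected slot */
        if(received < FIELD_LINES)                 /* … and refill it */
            memcpy(buffer[idx[i]], in[received++], LINE_LEN);
    }
}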

Breaking Nagravision without knowledge of the PRNG

The naive and ideal solution would be to try all 32^287 possibilities and compare the resulting decoded images; obviously that's not realistic.
If OTOH we just use a compare function which compares 2 lines (let's assume it's just simple correlation) and wish to find the global minimum, then we actually didn't improve by that much, as that's equivalent to solving the TSP.
Old versions of 2010 used the following algorithm to find a good permutation of the lines, which resulted in watchable movies.

Step 1 (downsampling)

To speed up the line compare later, the lines can be horizontally downsampled

Step 2 (comparing lines)

All lines which are not near black and which have not yet been merged with 2 lines are compared to all other such lines; the compares are done blockwise so that the number of cache misses is minimized, and for each line a fixed number of best matches is kept

Step 3 (sorting)

all the line pairs are sorted depending upon their similarity

Step 4 (merging)

from best matching line pair to worst the following steps are done

Step 4A (out of pairs)

If we hit the last pair for any line (we just kept a fixed number of best matches for each line), then we go to Step 2

Step 4B (impossible merges 1)

if either of the lines is in the middle of a block or they are the ends of the same block we continue with the next pair at Step 4

Step 4C (impossible merges 2)

We maintain 2 position limits for each block, one for each possible orientation; if a merge would force the block outside the image, we continue with the next pair at Step 4.
The limits are calculated based on the fact that no line can be output before it has been received

Step 4D (impossible merges 3)

We maintain the fullness state of the 32 line reorder buffer; if a merge would require overflowing it, then we again continue with the next pair at Step 4

Step 4E (merge)

we merge the 2 blocks and update the buffer fullness and limits

Step 5 (sort blocks)

All blocks are sorted by their less restrictive position limit (the orientation is chosen so that the limit is less restrictive)

Step 6 (reorder lines)

we reorder the actual lines

Breaking Nagravision with knowledge of the PRNG

As there are just 2^15 possible permutations, naive brute force becomes a lot more realistic, but it is still somewhat expensive for realtime decoding, so 2010 used a faster method

Step 1 (downsampling)

To speed up the line compare later, the lines can be horizontally downsampled

Step 2 (compare)

Every keyline is compared against all lines; the best 2 are kept

Step 3 (find seedlists)

For every line triplet (keyline and the 2 best matches) we assume that they will be consecutive in the decoded image, and we look up a list of seeds for which this is true from a precalculated table or tree structure

Step 4 (find seed)

We find the most often occurring seed in the seed lists, and if its number of occurrences is above some threshold we decode the image; if not we drop it. This ensures practically zero wrongly decoded images

Step 5 (reorder)

after we know the seed we just need to reorder the lines and fix the chroma

Filed under: Cryptanalysis — Michael @ 11:54

2006-04-10

Decrypting VideoCrypt

To complement my last blog entry, here's a description of how VideoCrypt works and how it can be broken with reasonable quality

PAL

First, before we can talk about VideoCrypt, we need to understand how PAL works.
Video information in PAL (and NTSC and SECAM) is stored linewise, top to bottom, and interlaced, so all the even lines and then, in the next field, all the odd lines are stored (313 lines in even fields, 312 in odd ones, of which ~288 lines contain a visible picture for PAL).
Each line has a sync pulse to mark the end of a line, a horizontal blanking interval during which the electron beam in CRTs “moves” back to the left after the end of the previous line at the right, and obviously the visible part, in which brightness = amplitude.
In the case of color, the U and V components of the YUV color value are stored quadrature amplitude modulated at the color subcarrier frequency, and the V component has its sign inverted in every 2nd line of a field; there is also a short reference color burst added in the HBI

VideoCrypt

the PRNG seed

Binary data from the vertical blanking interval is fed into a smartcard, which then among other things produces a PRNG seed for the VideoCrypt decoder, but none of this matters to us here

the actual decryption

The PRNG (pseudo random number generator) outputs a sequence of 8 bit values (cutpoints), one for each line of video. Each of these lines is then split into 4 parts: parts 0 and 3 are outside the visible area and aren't touched by VideoCrypt, parts 1 and 2 are the visible area and are exchanged, and the location which separates 1 and 2 is the cutpoint. VideoCrypt does nothing special with the color information, it just exchanges raw digitally sampled data (a small sketch of the inverse cut and exchange is shown below).
Because this exchange operation cannot be done with zero delay, the VideoCrypt decoder will also output parts 1 and 2 one line later than when they were transmitted
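
A minimal sketch of undoing the cut and exchange for one line, assuming “cut” is the number of samples in part 1 and src/dst cover only the exchanged visible area (my own illustration code):

#include <string.h>
#include <stdint.h>

void descramble_line(uint8_t *dst, const uint8_t *src, int visible_len, int cut){
    /* the encoder transmitted part 2 first and then part 1, so put them back */
    memcpy(dst,       src + visible_len - cut, cut);               /* part 1 */
    memcpy(dst + cut, src,                     visible_len - cut); /* part 2 */
}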

Breaking VideoCrypt

Lacking information about the PRNG, we can only guess the cutpoints, based on the assumption that color/luminance values of pixels which are close together will be strongly correlated, while distant pixels will differ more.
Still, naive brute force would require 256^288 images of w*288 pixels to be compared.
Depending on how we test an image for how good it is, we are able to significantly reduce the complexity. If we use a score function which is the sum of some compare function over consecutive line pairs and their cutpoints, then the dynamic programming / Viterbi algorithm can be used to reduce the complexity to 256*256*288 such 2 line checks. If we now further limit ourselves to the common cross correlation + some edge detector, and only consider a few cutpoint differences close to the best result from the cross correlation, then the complexity would be w*log(w)*288 + 9*w*288, which is what Markus Kuhn's antisky.c did. Though antisky was far from realtime decoding, mainly because it was inefficiently written (floating point code everywhere, at a time when no CPU had a reasonably fast FPU, …)

Note, if you think comparing 256^288 images would be stupid, it's not; think about choosing the image which compresses best with JPEG or another codec, this might (or might not) be better than the linewise method

Step 1 (downsampling)

To speed up the cross correlation and other things, the whole image can optionally be downsampled

Step 2 (luminance cross correlation)

antisky.c used an FFT based cross correlation; this, while asymptotically quite fast, is really not fast at all for finding the best matching cutpoint difference in practice. 2010 uses an adaptive comparison which first tries a few coarsely spaced cutpoint differences and then recursively tries more around the better ones; it's simpler and faster

Step 3 (mismatched line detection)

Lines which simply can't be matched by cross correlation (mean under some threshold, or variance under some threshold, or the best matching score from the cross correlation not much better than the average score) must be marked, otherwise the (random) cross correlation result will trash the surrounding lines

Step 4 (PAL phase detection)

As already described somewhere above, PAL flips the sign of the V component every second line; the first line may or may not be flipped, and if the capture hw doesn't provide us with this info, then we need to find it ourselves if we want to decode color. One way to solve this is to let the user set the flag and then just flip the flag after every frame; a much more robust method is to compare the color of the first and last pixel of the exchanged part of the encrypted line. These are guaranteed to always be adjacent in the decrypted image, so there they will have almost the same color, while in the encrypted image their difference depends upon the PAL phase thingy

Step 5 (finding the Chroma phase difference)

We know the approximate cutpoint difference from the luma cross correlation, so we can simply find the chroma phase/hue difference by calculating the complex correlation coefficient (U and V being the real and imaginary values) (note, a little care needs to be taken here due to various things I don't remember exactly …)

Step 6 (edge detection)

Simply run your favorite edge detector over the 256 possible cutpoints to detect the picture edge, for example log(abs(l[x-2] + l[x-1] - l[x+0] - l[x+1]))

Step 7 (dynamic programming search)

Well, here we simply combine all the previous stuff to find the best cutpoint sequence reaching every cutpoint of line X, assuming we already know the best sequences to all cutpoints of line X-1 (a generic sketch of this search follows below).
Lines which couldn't be matched to their previous line are a special case which is handled like the first line (every cutpoint has the same score). The same is done with yet another special case, namely the one where no cutpoint is possible anymore; that can happen as we restrict ourselves to a few relative cutpoints around what the cross correlation guessed is correct, and there's a small deadzone in which no cutpoints lie, so not every sequence of relative cutpoints is possible, depending upon how you define relative
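
A sketch of that dynamic programming search (my own illustration, not the 2010 code; NCUT, best_cut and the user supplied cost(), which scores how badly line y-1 with candidate p matches line y with candidate c, are all assumptions):

#include <float.h>

#define LINES 288
#define NCUT  9                  /* candidate cutpoints kept per line */

extern double cost(int y, int p, int c);   /* hypothetical, lower is better */

void find_cutpoints(int best_cut[LINES]){
    static double score[LINES][NCUT];
    static int    from [LINES][NCUT];      /* backpointer to the previous line */

    for(int c=0; c<NCUT; c++)
        score[0][c]= 0;                    /* first line: every cutpoint equal */

    for(int y=1; y<LINES; y++)
        for(int c=0; c<NCUT; c++){
            score[y][c]= DBL_MAX;
            for(int p=0; p<NCUT; p++){
                double s= score[y-1][p] + cost(y, p, c);
                if(s < score[y][c]){
                    score[y][c]= s;
                    from [y][c]= p;
                }
            }
        }

    /* pick the best final candidate and walk back; best_cut[] receives the
     * index of the chosen candidate for each line */
    int c= 0;
    for(int i=1; i<NCUT; i++)
        if(score[LINES-1][i] < score[LINES-1][c])
            c= i;
    for(int y=LINES-1; y>=0; y--){
        best_cut[y]= c;
        c= from[y][c];
    }
}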

Step 8 (cutpoint sequence cache)

Over some time period the cutpoint sequences were repeating after an hour or so, so I added a cache to 2010, which led to perfect decoding until the repetitions disappeared …

Step 8a (finding the current sequence)

We select several consecutive relative cutpoint pairs and look them up in a hashtable; we also try adding ±1 to both, so as to compensate for errors in our cutpoint guess.
For each pair we get a list of pointers to cutpoint sequences.
We count the number of occurrences of each pointer and choose the most often occurring one as the correct sequence if its number of occurrences is above some threshold, or return no match if not

Step 8b (merging the current cutpoint estimates)

We add the current relative cutpoints to the sequence so that each line has a list of relative cutpoints; if that list gets too long, the least often occurring one is discarded.
For each line the most often occurring relative cutpoint is chosen and used in later steps; the hashtable is also updated with it

Step 9 (cut and exchange)

This is simple: just 2-3 memcpys and rotating the UV values so as to compensate for the effects of the shift and the 1 line delay on the color

Filed under: Cryptanalysis — Michael @ 14:58

2006-04-07

Decrypting (old analog) PayTv

A really long time ago, 1997 or maybe even a little earlier, I wrote a program to decrypt the analog pay TV systems in use at that time, VideoCrypt and Nagravision. Sadly I was too afraid of legal issues back then, so I never released either source or binary. Well, both systems are AFAIK dead nowadays, the old code hasn't been touched since 2000 and doesn't compile with any compiler I have, so I thought it's the perfect time to finally release that trash

When was the first working version born?

Well, I don't know; the oldest file I could find was a 2do list from 1997 which contains notes about improving VideoCrypt and Nagravision decoding, so these must have been working already …

Does it contain any cryptographically new knowledge about the systems?

No. Before the Nagra PRNG was reverse engineered, I believed that the PRNGs both systems used would be very secure and that there would be no point in “looking” at their output. Afterwards, besides feeling silly, as even a blind man would have noticed how the Nagra PRNG works (it just permutes 256 5 bit values in a trivial 32768 ways, 127 only if we ignore new[x]= old[x+C]),
I tried to look at the VideoCrypt PRNG output but failed to find anything: 2 cutpoint sequences were either different, with their difference random, or they were exactly equal, which happened either commonly (repeating after an hour or so) or never, depending upon when you recorded them. I also failed to find any other relations in the sequences …

Is this clean ANSI C code?

It's ultra messy DJGPP-GCC-DOS C code. I started to clean it up but gave up; a patch with my unfinished cleanup work is included for the insane. With the patch a few files compile under current gcc on Linux

what else does it do? could it make coffee?

There's some teletext decoding code, a half working sync suppression decoder (with unmodified hardware) and probably a few other things I don't remember; after all it doesn't compile, and I don't have my Matrox Meteor card here (which was the only card it supported, as that was what I had)

Did you port it to Java?

Yes, I ported part of it to Java a really long time ago. Why I did that I'm not sure anymore; was I seriously believing the Sun propaganda that Java was a usable general purpose programming language? I guess my excuse is that my previous Java code at that time consisted of just tiny applets & applications where the Java effect isn't so noticeable.
The Java version was, even with hand coded MMX over JNI, about 3 times slower; there were ~1 sec garbage collector pauses every 30 secs or so, and after 15 min the computer needed a hard reset (yeah, that was even without any MMX/JNI)

download this crap\h\h\h\hmasterpiece

2010

license?

GPL but what do you want to do with it?!

Filed under: Cryptanalysis — Michael @ 14:37
