Lair Of The Multimedia Guru

2006-07-21

Division by zero

Have you ever been curious why you can't divide by zero?

The intuitive explanation with division of positive integers

To divide a by b means to remove b from a as often as possible and to assign the number of times you could do that to a/b, with whatever is left over being the remainder of the division.
If b is 0 you can remove it as often or as rarely as you want, nothing will ever change.

The intuitive explanation with division as the inverse of multiplication

If you take a number a, multiply it by b and then divide by b, so a*b/b, then you should get a again if division reverses what multiplication did. But if b is 0 then a*b is a*0, which is 0 no matter what a was, so there's no information left about the value a had and consequently no way to reverse it.

The Analytical explanation

If a is a non-zero constant, then a/b diverges to infinity as b approaches 0, and it converges to x if a=xb.
Note, it's important to remember here that the limit of a/b as b approaches 0 is not the same as a/0.

The Algebraic explanation

All the above really just deals with the common numbers like integers, reals, complex numbers, …
So you might ask yourself whether it isn't maybe possible to invent some other algebra (a set of elements with some operations like addition and multiplication) in which we can divide by zero. Well, the answer is of course you can; the problem is just that you have to give up some of the well known properties of the common numbers.
Let's see which set of properties is incompatible with division by zero.
To speak about division by zero we need at least:

  1. a set of elements S for our algebra A
  2. an additive identity element (x + 0 = 0 + x = x for all x)
    we use “0” here for simplicity even though the element doesn't need to be related to our well known zero
  3. of course a + and a * operation, which again of course don't have to be the same as common addition and multiplication
  4. 0 to have an inverse ((x*0)*0⁻¹ = x for all x)

With that set of rules one can easily find many algebras in which you can divide by zero; for example, just define the * operation to be identical to +, and that identical to the common addition. Division by zero is then just subtraction of zero, totally useless but it works :)
Another property we likely want is a multiplicative identity element (x * 1 = 1 * x = x for all x) and the distributive law ((a+b)*c = a*c + b*c and c*(a+b) = c*a + c*b), but sadly that already starts causing problems:

0*0 + 0   = 0*0    definition of the identity element of +
0*0 + 1*0 = 0*0    definition of the identity element of *
(0 + 1)*0 = 0*0    distributive law
1*0       = 0*0    definition of the identity element of +
1         = 0      inverting the multiplication by 0

so our algebra would need the identity element of + and the identity element of * to be identical, hmm …

and another odd result:

x*1     = x    definition of the identity element of *
x*(1+1) = x    definition of the identity element of + and 1=0
x+x     = x    distributive law

Now the question is, is there actually a non trivial (more than 1 element) algebra left at all?
It seems there is: assume our algebra has 2 elements, “0” and “#”,
and we define addition and multiplication like this:

+/*  0  #
 0   0  #
 #   #  #

It's immediately obvious that the rules for the 2 identity elements hold.
Does the distributive law hold too? It does, which can be proven by simply looking at the 8 possible cases (see the little check below). And an inverse element for multiplication by 0? Well, 0 is its own inverse, obviously.
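
For the skeptical, here is a tiny brute force check of those 8 cases (just an illustrative sketch; representing “#” as 1 is my own choice, with that the +/* table above becomes a bitwise or):

#include <stdio.h>

/* the two elements "0" and "#" encoded as 0 and 1, + and * are the table above */
static int add(int a, int b){ return a | b; }
static int mul(int a, int b){ return a | b; }

int main(void){
    for(int a=0; a<2; a++)
        for(int b=0; b<2; b++)
            for(int c=0; c<2; c++)
                if(mul(add(a,b),c) != add(mul(a,c),mul(b,c)) ||
                   mul(c,add(a,b)) != add(mul(c,a),mul(c,b)))
                    printf("distributive law fails for %d %d %d\n", a, b, c);
    printf("done, no output above means all 8 cases hold\n");
    return 0;
}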

Next, we drop the multiplicative identity element again and instead try to add a unique multiplicative inverse element for every element, not just for zero (a*x=b solvable for all a, b); without that we would either just turn the division by zero problem into a division by foobar problem, or we wouldn't be able to reach some elements. Sadly, only the trivial 1 element algebra is left then:

a*b   = a*(b+0)      definition of the identity element of +
a*b   = a*b + a*0    distributive law
0     = 0 + a*0      choose b so that a*b=0
0     = a*0          definition of the identity element of +
0*0⁻¹ = a            inverting the multiplication by zero

The last line means: every algebra with the properties above will have exactly one element and be useless.
Alternatively, if we replace the property that every element must have a unique multiplicative inverse by requiring addition to be invertible ((a+b) + (-b) = a), then we too end up with the useless 1 element algebra:

a = (a*0)*0⁻¹              definition of the inverse of 0
a = (a*(0+0))*0⁻¹          definition of the identity element of +
a = (a*0 + a*0)*0⁻¹        distributive law
a = (a*0)*0⁻¹ + (a*0)*0⁻¹  distributive law
a = a + a                  substituting from row 1
0 = a                      inverting the addition of a

I guess that's enough math for today, and enough chances to embarrass myself with a trivial typo somewhere …

Filed under: Uncategorized — Michael @ 11:25

2006-07-08

UTF-8

UTF-8 is the most common encoding used to store Unicode; the following table summarizes how it works:

number of bytes  byte1      byte2      byte3      byte4      range of unicode symbols
1                0XXX XXXX                                   0-127
2                110X XXXX  10XX XXXX                        128-2047
3                1110 XXXX  10XX XXXX  10XX XXXX             2048-65535
4                1111 0XXX  10XX XXXX  10XX XXXX  10XX XXXX  65536-2097151

Features

  • 0-127 ASCII is stored as is in UTF-8
  • bytewise substring matching works as is; in other words, one UTF-8 symbol will never match anything but the very same UTF-8 symbol, never just the last byte of another symbol
  • UTF-8 is easy to distinguish from a random byte sequence
  • when starting at a random byte, resynchronization is trivial and always happens at the next symbol

S1D-7

The biggest problem of UTF-8 is that it's not very compact; it needs more bytes than other (non-Unicode) encodings. Can we do better without losing its advantages? Well, apparently yes, which is sad because you'd expect the standards committees to do better …
Why “S1D-7”? Well, 1 “stop” bit to determine whether the byte is the last one and 7 data bits per byte (a small encoder sketch follows after the table)

number of bytes  byte1      byte2      byte3      range of unicode symbols
1                0XXX XXXX                        0-127
2                1XXX XXXX  0XXX XXXX             128-16383
3                1XXX XXXX  1XXX XXXX  0XXX XXXX  16384-2097151
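
Here is a small encoder sketch matching the table above (my own illustration code, the function name is made up); the 7 bit groups are written most significant first, with the high “stop” bit set on every byte except the last:

#include <stdio.h>
#include <stdint.h>

/* encode one symbol (0..2097151) into 1..3 bytes, returns the number of bytes */
static int write_s1d7_symbol(uint8_t *out, unsigned val){
    int n= 1 + (val > 127) + (val > 16383);
    for(int i=0; i<n; i++){
        uint8_t bits= (val >> (7*(n-1-i))) & 127;
        out[i]= bits | (i < n-1 ? 128 : 0);   /* stop bit clear only on the last byte */
    }
    return n;
}

int main(void){
    uint8_t buf[3];
    int n= write_s1d7_symbol(buf, 256);
    for(int i=0; i<n; i++)
        printf("%02X ", buf[i]);              /* prints: 82 00 */
    printf("\n");
    return 0;
}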

Compactness differences

  • In all cases S1D-7 is able to store as many or more symbols per number of bytes
  • where UTF-8 needs up to 4 bytes to store all symbols, S1D-7 never needs more than 3
  • Many languages which need 3 bytes per symbol in UTF-8 can be encoded with 2 bytes per symbol; just look at the Basic Multilingual Plane at Wikipedia, all the pages which are in the range 0x0800 to 0x3FFF need 3 bytes per symbol in UTF-8 but only 2 bytes per symbol in S1D-7

Features

Well, the argument on the Wikipedia page to justify UTF-8's bloatedness is that its “advantages outweigh this concern”, so let's see if our simpler encoding, which requires fewer bytes per symbol and is much more compact, loses any features

0-127 ASCII is stored as is in UTF-8

same, no change here

Substring matching

In UTF-8 we can simply use bytewise substring matching, while in S1D-7 we need to write a slightly different substring matching function, though both are very simple, see below

int match_utf8_or_ascii(uint8_t *heap, uint8_t *needle){
    int i,j;

    for(i=0; heap[i]; i++){
        for(j=0; needle[j]; j++){
            if(heap[i+j] != needle[j])
                break;
        }
        if(!needle[j])
            return i;
    }
    return -1;
}

int match_S1D8(uint8_t *heap, uint8_t *needle){
    int i,j;

    for(i=0; heap[i]; i++){
        for(j=0; needle[j]; j++){
            if(heap[i+j] != needle[j])
                break;
        }
        if(!needle[j] && !(i && heap[i-1]>127))
            return i;
    }
    return -1;
}
Distinguishing S1D-7 from random bytes

Due to the lower overhead there is less to check, so it consequently needs more text to detect reliably, but the check is very simple: just check whether there are any 3 consecutive bytes which are each larger than 127, and if so you know it's not S1D-7. To distinguish it from UTF-8 just run your UTF-8 detection over it; it's very unlikely that S1D-7 will be parsed correctly by a UTF-8 parser.
Not to mention, the encoding should be specified somehow anyway and not guessed by such encoding analysis.
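
A minimal sketch of that check (my own code, the name is made up):

#include <stdint.h>

/* returns 0 if buf cannot be S1D-7, i.e. it contains 3 consecutive bytes > 127 */
int could_be_s1d7(const uint8_t *buf, int len){
    int run= 0;
    for(int i=0; i<len; i++){
        run= buf[i] > 127 ? run+1 : 0;
        if(run >= 3)
            return 0;
    }
    return 1;
}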

resynchronization

is even more trivial than in UTF-8. In UTF-8 you search for a byte < 128 or a byte >= 192; when you find one you know it's the first byte of a symbol.
In S1D-7 you search for a byte < 128; when you find one you know that the next byte is the first byte of a symbol.
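
Both resynchronization rules fit in a couple of lines; a sketch (my own code, names made up):

#include <stdint.h>

/* skip UTF-8 continuation bytes (10XX XXXX), the result points at a symbol start */
const uint8_t *resync_utf8(const uint8_t *p){
    while((*p & 0xC0) == 0x80)
        p++;
    return p;
}

/* skip S1D-7 non-final bytes (high bit set); the byte after the first one < 128
 * starts the next symbol, stop early at the string terminator */
const uint8_t *resync_s1d7(const uint8_t *p){
    while(*p & 0x80)
        p++;
    return *p ? p+1 : p;
}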

parsing

Parsing S1D-7 is much easier than UTF-8, see:

int read_utf8_symbol(uint8_t **b){
    int val= *((*b)++);
    int ones=0;

    while(val&(128>>ones))
        ones++;

    if(ones==1 || ones>4)
        return -1;
    val&= 127>>ones;
    while(--ones > 0){
        int tmp= *((*b)++) - 128;
        if(tmp>>6)
            return -1;
        val= (val<<6) + tmp;
    }
    return val;
}

int read_s1d7_symbol(uint8_t **b){
    int i, val=0;

    for(i=0; i<3; i++){
        int tmp= *((*b)++) - 128;
        val= (val<<7) + tmp;
        if(tmp < 0)
            return val + 128;
    }
    return -1;
}
zero byte occurrences (new 2006-07-21)

This issue has been raised by a comment …
While UTF-8 guarantees that a zero byte in the encoding will occur only in a zero symbol, S1D-7 does not guarantee that; for example the value 256 is encoded as 0x82 0x00. UTF-16 also has this problem.
Zero bytes are problematic if the encoded strings are passed through code which has been written for one byte per symbol zero terminated strings. Of course you should never pass something through code which is designed for a different encoding, but if for whatever odd reason you still do that, then one possible solution is to simply change the value before encoding and after decoding to avoid zero bytes; this is actually very easy:

int avoid_zero(int x){
    x+= x<<7;
    return (x + (x>>14))>>7;
}
int reverse_avoid_zero(int x){
    return x - (x>>7);
}

these transforms have almost no effect on the space efficiency of S1D-7

occurrence of specific bytes (new 2006-07-21)

This issue has been raised by a comment …
While UTF-8 guarantees that no ASCII bytes below 128 will occur in the encoding of symbols larger than 128, S1D-7 does not guarantee that; UTF-16 also has this “problem”.
Again, this “problem” is limited to the case of passing encoded strings through code which expects some other encoding, and that's something you should never ever do; not only will the code not be able to make sense of at least some symbols, it's likely going to lead to bugs and exploits. If strings have to be passed through code which expects another encoding, then you should convert your strings to that encoding and escape the problematic symbols somehow.
S1D-7 can also easily be changed to avoid specific low < 128 bytes in its encoded representation, although that's just for demonstration and shouldn't be done in practice, as it's very bad to pass strings through code which expects another encoding; if your filesystem for example expects filenames in ASCII without specific letters, then that's what the filenames should be encoded with, and letters which cannot be represented should be encoded with some sort of escape sequence.

int avoid_some_bytes(int x){
    if(x>127) x= (x<<1) ^ table1[x&63];
    return x;
}
int reverse_avoid_some_bytes(int x){
    if(x>127) x= (x ^ table2[x&127])>>1;
    return x;
}

With these transforms any 64 low ASCII bytes can be avoided at the cost of 1 bit per symbol; that makes S1D-7 somewhat worse than before, but still much better than UTF-8 in space efficiency.

Filed under: Uncategorized — Michael @ 12:40

2006-06-09

Deinterlacing filters

Below you will find a comparison of various deinterlacing filters; the interlaced source was created with mplayer, using the phase=t and tinterlace=1 filters, from the well known foreman video

All pictures were exported with mplayer -vo pnm and ppmtojpg --optimize --quality 90. Why not -vo jpeg? Well, I didn't have -vo jpeg compiled in, and -vo jpeg only accepts RGB input, so there wouldn't have been any advantage in using it …

Most deinterlacing filters only output one frame for every 2 fields; that, combined with the fact that some always choose the even field, some always the odd and some always the first no matter whether that's even or odd, made this comparison less fun than I originally hoped. Due to this little issue several filters (pp=ci and kerndeint IIRC) had to be fed with vertically flipped video, and tomsmocomp even needed -vf phase=b to output a frame based on the correct field. And yes, I of course did specify the top vs. bottom field first flag correctly, as long as the filter had such an option …

Original non interlaced image
interlaced image mplayer -vf phase=t,tinterlace=1
mplayer -vf pp=lb (linear blend)
mplayer -vf pp=l5 (5tap lowpass filter)
transcode -J smartyuv
mplayer -vf pp=fd
mplayer -vf pp=md (median deinterlacer)
mplayer -vf pp=li (linear interpolate)
mplayer -vf pp=ci (cubic interpolate)
transcode -J dilyuvmmx
mplayer -vf kerndeint (Donald Graft’s adaptive kernel deinterlacer)
transcode -J smartdeinter (VirtualDub’s smart deinterlacer)
transcode -J tomsmocomp (Tom’s Motion Compensation deinterlacing filter)
mplayer -vf yadif=1:1
mplayer -vf yadif=3:1
mplayer -vf yadif=1:1,mcdeint=2:1:10
mplayer -vf yadif=3:1,mcdeint=2:1:10
Filed under: Uncategorized — Michael @ 16:48

2006-06-02

Optimal (A)DPCM encoding

ADPCM in general encodes differences between samples using a small number of possible difference values and adapts which values are allowed depending upon past difference values
( … we all know that)

ADPCM encoders normally work based on the principle that they select the best possible difference between the last sample and the current sample while ignoring both future and past; that's not optimal at all

A Viterbi based encoder would select the optimal sequence of differences which minimizes some distortion measure, like the sum of squares; it achieves this by successively finding the optimal encodings up to a specific state. Or, in other words:
instead of keeping a single sequence of encoded differences of the past samples and then encoding the next difference,
we keep track of the optimal sequence of differences / encoded bitstream up to the current sample for every state (0..88 step index + a few sample values around the ideal one); next we just calculate the optimal bitstreams up to the next sample using the current sample and the optimal ones up to the last sample

Maybe a concrete example will help; we use non adaptive DPCM here, but the principle is the same (a small code sketch follows after the listing) …

Let's say we have the input samples 0,2,6,4,-4
and our example encoder can encode +4,+1,-1,-4 differences.
A conventional encoder would output 0,1,5,4, 0 with distortion=18,
optimal would be 1,2,6,2,-2 with distortion=9,
and if we force the first output to 1 and then use a conventional encoder we get 1,2,6,5, 1 with distortion=27.

viterbi encoder:

1. iteration
-2              distortion 4
-1              distortion 1
 0              distortion 0
 1              distortion 1
 2              distortion 4
2. iteration
 1, 0           distortion 5
 0, 1           distortion 1
 1, 2           distortion 1
-1, 3           distortion 2
 0, 4           distortion 4
3. iteration
-1, 3, 4        distortion 6
 0, 1, 5        distortion 2
 1, 2, 6        distortion 1
-1, 3, 7        distortion 3
 0, 4, 8        distortion 8
4. iteration
 1, 2, 6, 2     distortion 5
-1, 3, 7, 3     distortion 4
 0, 1, 5, 4     distortion 2
 1, 2, 6, 5     distortion 2
 0, 1, 5, 6     distortion 6
5. iteration
not possible,-6
not possible,-5
not possible,-4
not possible,-3
 1, 2, 6, 2,-2  distortion 9
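
Here is a small self-contained sketch of that idea applied to the example above (my own illustration code, not from any real encoder; the ±4 state window and all names are my choice):

#include <stdio.h>
#include <limits.h>

#define N       5
#define WINDOW  4               /* candidate output values: input-4 .. input+4 */
#define STATES  (2*WINDOW+1)

static const int step[4]  = {4, 1, -1, -4};
static const int input[N] = {0, 2, 6, 4, -4};

int main(void){
    int dist[N][STATES];        /* best total distortion ending in each state */
    int prev[N][STATES];        /* backpointer to the state one sample earlier */

    for(int i=0; i<N; i++)
        for(int s=0; s<STATES; s++)
            dist[i][s]= INT_MAX;

    for(int i=0; i<N; i++)
        for(int s=0; s<STATES; s++){
            int value= input[i] - WINDOW + s;
            int d    = (value - input[i]) * (value - input[i]);
            if(i == 0){                      /* first sample has no predecessor */
                dist[0][s]= d;
                prev[0][s]= -1;
            }else{
                for(int ps=0; ps<STATES; ps++){
                    if(dist[i-1][ps] == INT_MAX)
                        continue;
                    int pvalue= input[i-1] - WINDOW + ps;
                    for(int k=0; k<4; k++)
                        if(pvalue + step[k] == value &&
                           dist[i-1][ps] + d < dist[i][s]){
                            dist[i][s]= dist[i-1][ps] + d;
                            prev[i][s]= ps;
                        }
                }
            }
        }

    /* pick the best final state and walk back through the table */
    int best= 0;
    for(int s=1; s<STATES; s++)
        if(dist[N-1][s] < dist[N-1][best])
            best= s;

    int out[N];
    for(int i=N-1, s=best; i>=0; s= prev[i--][s])
        out[i]= input[i] - WINDOW + s;

    printf("distortion %d:", dist[N-1][best]);
    for(int i=0; i<N; i++)
        printf(" %d", out[i]);
    printf("\n");               /* prints: distortion 9: 1 2 6 2 -2 */
    return 0;
}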
Filed under: Uncategorized — Michael @ 20:18

2006-05-23

PT_GNU_STACK

You upgrade your Linux distro and suddenly things are failing with error messages like "error while loading shared libraries: libfoobar.so.123: cannot enable executable stack as shared object requires: Permission denied". What happened?

Well, you have just become yet another witness of the decay of the GNU toolchain. Besides gcc and glibc becoming more and more bloated every day, gcc bug reports being randomly closed without fixing them, and so on, we have for some time had the beautiful PT_GNU_STACK feature in every GNU tool, which tries very hard to make the stack of applications executable. Without grsec your system will be more vulnerable; with grsec stuff will fail hard, as grsec will not give apps an executable stack if they don't need one, and the Red Hat, ehm I mean GNU, libc/loader/… will then simply refuse to continue if the unnecessary executability isn't provided

what exactly is PT_GNU_STACK?

PT_GNU_STACK is an entry in the ELF file format which contains the access rights (read, write, execute) of the stack. So far so good, nothing bad here; the problem is how these access rights are chosen by the tools …

The idea behind PT_GNU_STACK / choosing the access rights

The stack is always readable and writable, obviously that's needed, so we are left with the question of it being executable. When linking objects together into a library, or libraries into an application, a single object requiring an executable stack will cause the final application and all libraries which contain that object to need an executable stack; if no object needs an executable stack, the final application won't need one either

The design and implementation of PT_GNU_STACK

Every tool in the GNU toolchain will, from source over assembly, objects and libraries to the final application, merge the PT_GNU_STACK access rights so that a single object needing an executable stack will cause the final application to need one; in case of doubt (no PT_GNU_STACK entry) the default is that the specific object needs an executable stack

The problem with this / the total failure of the PT_GNU_STACK idea

Undecidability at the compilation and assembly level

Deciding whether a piece of code needs an executable stack is provably not possible; it is indeed equivalent to solving the halting problem.
So gcc, gas and so on guess whether a piece of code needs an executable stack, and if they are unsure, they claim that it does

Undecidability at the global level

For example, when calling a function through a global pointer, the compiler compiling that file can't decide about the stack access rights, as it doesn't know all the code which changes the pointer; the linker later would have to disassemble the object files again just to have the info needed to decide it …

several definitions of executability

If you execute code on the stack, does it need to be executable? If you use Red Hat's Exec Shield then AFAIK yes; if you use grsec it depends upon the code, as grsec can detect and emulate some common and harmless sequences of instructions on the stack (trampolines)

Even if an application needs an executable stack for some feature

There's no need to have an executable stack if that feature isn't used. Actually, applications might explicitly try to enable / disable executability of a specific part of memory depending upon their needs; gcc will in that case almost certainly claim that the app always needs an executable stack …

All these things cause applications to claim that they require an executable stack while they simply do not, which makes your system much more vulnerable to attacks

The alternative

Mark final applications which need an executable stack as such, and make the default not requiring an executable stack

what's the difference?

Well, in practice more than 99% of all applications don't need an executable stack, and if an application needs an executable stack but doesn't have one, then that causes a clear failure which will be noticed immediately and can be fixed by marking that application as needing an executable stack.
OTOH the current PT_GNU_STACK system, with its pass-the-flags-from-source-to-final-app design, ideally needs corrections for each source file which the GNU tools can't deal with automatically, and there are many more source files than applications; not to mention that the error rate at the application level seems to be far in excess of 1%.
Furthermore, marking an app as requiring an executable stack while it actually doesn't won't produce any warning, error or other hint that something is wrong, while security is greatly impaired

Filed under: Off Topic — Michael @ 14:29

2006-05-19

Coding Style / Code Conventions

There are many coding style guides, IMO actually significantly too many … their existence is often only justified by the single hypothesis that having everything in the same style is more readable than having one style per file and letting the original author / maintainer decide it. Now maybe my memory sucks, but I don't remember a single case where a difference in coding style between different parts of a project caused me any difficulty in understanding. It was rather the lack of any indentation, the exclusive use of single letter variables, or simply the complexity of the code, the lack of comments, and my stupidity of course, which caused difficulty understanding something …

Hmm, mphq CVS is still dead :( so let's try to write the ultimate coding style guide and hope mphq works when I'm done, so I can finally commit a few minor bugfixes

The ultimate coding (style) guide

Rule 1

Every rule must be justifiable, and the justification must not apply to a contradicting rule too.
Yes, I'm trying to avoid the “for consistency” justification, which can be applied to literally anything

Rule 2a

Write consistent code / Changes should be approximately in the same style as the surrounding code
Justification: having the coding and indentation style change every few lines makes code somewhat harder to read, while an overly strict style restriction will cause some volunteers (or employees) to leave

Rule 2b

Write indented code (code within a block should be at the same indentation level, code in parent and child blocks should be at different levels); placement and indentation of stuff like {} is not restricted here, just the actual code
Justification: see mplayer.c for what happens without that rule :)

Rule 2c

Avoid constructs which IDEs/editors have difficulty with; common examples are non-ASCII letters, tab/space mixes, differing tab lengths in different files, DOS line endings, …
Justification: obvious

Rule 2d

Avoid non-English (in identifiers and comments)
Justification: not everyone speaks every language; English is understood by most people. Of course, if every programmer you care about understands Hungarian, then it's a perfect choice too

Rule 3

Maximize useful information at minimum size and complexity (simple code preferred)

So a variable name like szMyCompleteName is normally worse than simply name, as the type is probably obvious and “complete” adds no new info … Of course a function which converts zero terminated strings to length + chars style would benefit from having the type in the names; it all depends upon how and where a variable is used.
Comments should not say what's already obvious from the code, like int i=sqrt(v); //i is the square root of v, but instead provide useful info like //i is the largest factor v can contain, so we only need to test integers less than i, but then in that case maybe renaming i to largest_factor would have provided more info at less size …
Justification: the idea is to provide the reader with the info she is interested in, not to duplicate and spread out the important stuff between noise;
also, less code means fewer places where a bug could be.
Note: this rule also forbids code duplication and other bloatedness

Rule 4

Write efficient code
Note: efficient here means little memory, disk, cpu, …
Justification: well this one should be obvious

Rule 5

Write portable code
Justification: well this one should be obvious too

Rule 6

Write localized / internationalized applications (user visible stuff in her native language)
Note: maybe error messages should be dealt with differently, as the developers need to understand them in bug reports …
Justification: well this one should be obvious too

Rule 7

Write compatible code (don't break compatibility of file formats, your last library release, …)
Justification: well this one should be obvious too

Rule 8

Write secure code (don't write over the end of an array, check untrusted input carefully, be careful with malloc(w*h) due to possible overflow of the multiplication, … see the small sketch below)
Justification: well this one should be obvious too
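
A minimal sketch of the malloc(w*h) point (a hypothetical helper, my own code):

#include <stdlib.h>
#include <stdint.h>

/* refuse to allocate if w*h would overflow size_t instead of silently
 * allocating a too small buffer */
void *alloc_image(size_t w, size_t h){
    if(h && w > SIZE_MAX / h)
        return NULL;
    return malloc(w * h);
}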

Rule 9

Write modular code / split code into independent parts
Justification: many dependencies make code very hard to understand and maintain

Rule 10

If some rules contradict, use common sense (which goal is more important in a specific case)

Filed under: Off Topic — Michael @ 19:49

2006-05-15

TV sets

It's been a month since my last blog entry; I'm not sure if that's bad or not. Anyway, here's some more halfway off-topic crap: I want(ed) to buy a new TV set, as the one I have here is _old_ and doesn't really work very well …

First question CRT or LCD?

CRTs are old; many shops don't have any CRT based TV sets anymore, and everyone seems to know they are old and inferior.
LCDs: shiny new technology, flatter and better. Well, wait, why exactly does every LCD TV below 1000€ have a _vastly_ inferior picture quality compared to my mother's el-cheapo noname CRT TV, which probably cost < 200€ … and I'm not saying the ones above 1000€ were better, they were just out of my acceptable price range, and many were showing high resolution HDTV, which made a fair comparison hard …

Size

For CRTs, size seems strongly correlated with quality and features, no 55cm 100Hz TV sets for example; it also seemed like larger isn't really more expensive. For LCDs, of course, larger is much more expensive …

100hz CRT vs 50hz

50Hz flickers, and actually I somehow can't get rid of the feeling that it flickers more strongly than 50Hz TV sets did a few years ago.
100Hz doesn't flicker. Wait, actually I found just one 100Hz TV set < 300€ which didn't have at least part of the image flickering; all others had parts of the TV channel's logo flickering. Also, with 100Hz you can choose between some residual interlacing artifacts or “staircase” artifacts on moving objects; I've not found any 100Hz TV set which didn't show some artifacts. Although in the end I prefer 100Hz over 50Hz, the artifacts are less annoying than the flickering IMHO

Comparing TV sets

Well, you walk into a shop with TV sets, many TV sets … all showing different stuff, and if you are lucky and they show the same thing, then it's a crappy, dark, low quality music video from MTV, completely unsuitable for comparing. But wait, one shop had remote controls “attached” to each TV set so customers could change settings and channels as they liked; sadly, less than half the remotes were working …

Summary, conclusion and personal opinion (yeah, as if the whole blog was anything but subjective personal opinion)

LCDs have improved, more in the written numbers than in reality; CRTs seem to have become worse, but still LCDs are behind CRTs both in quality and in quality per price.
Many TV sets, be it CRT or LCD, show terrible artifacts:
50Hz CRTs: flickering
100Hz CRTs: staircase or interlacing artifacts on moving objects
CRTs in general: some cases of brightness-size dependence, most have oversharpened and too strongly saturated images (yeah, the last 2 are purely an adjustment thing, probably done as most customers are idiots and prefer oversharpened and too strongly saturated images)
LCDs: washed out blurry images, colors more or less wrong, artifacts caused by slow reaction times, not to mention 2-4 times as expensive as CRTs

Filed under: Off Topic — Michael @ 11:56

2006-04-15

Decrypting Nagravision

Now, after breaking VideoCrypt, let's take a look at (old analog) Nagravision

Nagravision

the PRNG seed

Same as VideoCrypt (binary data from the vertical blanking interval is fed into a smartcard which then, among other things, produces a PRNG seed for the Nagravision decoder, but none of this matters to us here)

the PRNG

The PRNG (pseudo random number generator) is fed with a 15 bit seed and a table of 256 5 bit entries, and produces 287 5 bit values; the following code will do exactly that:

#include <stdint.h>
#include <string.h>

extern const uint8_t keyTable[256]; /* the 256 entry 5 bit table mentioned above */

void prng(int seed, int *cle2enc){
  uint8_t keyNdx= seed >> 7;
  int keyInc= ((seed & 0x7F) << 1) + 1;
  int buffer[32];

  for(int i=0; i<32; i++) buffer[i]= i;

  for(int i=0; i<255; i++){
    cle2enc[i]= buffer[ keyTable[keyNdx] ];
    buffer[ keyTable[keyNdx] ]= i + 32;
    keyNdx+= keyInc;
  }

  memcpy(cle2enc+255, buffer, 32*sizeof(int));
}
the actual decryption

Nagravision has a 32 line buffer, each entry of which can contain one line of video. For each of the 287 lines of a field there are 5 bits from the PRNG which are used as an index into that buffer, deciding into which entry to store the next received line and from where to output one; in other words, Nagravision reorders lines using a 32 line temporary buffer (a small sketch of such a reordering decoder follows below)
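
Here is a sketch of that reordering mechanism (my own illustration, not the original 2010 code; the line length, the preloading of the buffer with the first 32 incoming lines and all names are assumptions):

#include <string.h>
#include <stdint.h>

#define LINE_LEN    720         /* samples per line, an arbitrary choice here */
#define FIELD_LINES 287

/* idx[] holds the 287 5 bit PRNG values; each value selects the buffer slot
 * from which a line is output and into which the next received line is stored */
void reorder_field(uint8_t out[FIELD_LINES][LINE_LEN],
                   uint8_t in [FIELD_LINES][LINE_LEN],
                   const uint8_t idx[FIELD_LINES])
{
    uint8_t buffer[32][LINE_LEN];
    int received= 0;

    for(int i=0; i<32; i++)                        /* preload the buffer */
        memcpy(buffer[i], in[received++], LINE_LEN);

    for(int i=0; i<FIELD_LINES; i++){
        memcpy(out[i], buffer[idx[i]], LINE_LEN);  /* output from the selected slot */
        if(received < FIELD_LINES)                 /* … and refill it */
            memcpy(buffer[idx[i]], in[received++], LINE_LEN);
    }
}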

Breaking Nagravision without knowledge of the PRNG

The naive and ideal solution would be to try all 32^287 possibilities and compare the resulting decoded images; obviously that's not realistic.
If OTOH we just use a compare function which compares 2 lines (let's assume it's just simple correlation) and wish to find the global minimum, then we actually didn't improve by that much, as that's equivalent to solving the TSP.
Old versions of 2010 used the following algorithm to find a good permutation of the lines, which resulted in watchable movies.

Step 1 (downsampling)

To speed up the line compare later, the lines can be horizontally downsampled

Step 2 (comparing lines)

All lines which are not near black and which have not yet been merged with 2 lines are compared to all other such lines; the compares are done blockwise so that the number of cache misses is minimized, and for each line a fixed number of best matches is kept

Step 3 (sorting)

all the line pairs are sorted depending upon their similarity

Step 4 (merging)

from best matching line pair to worst the following steps are done

Step 4A (out of pairs)

If we hit the last pair for any line (we just kept a fixed number of best matches for each line), then we go to Step 2

Step 4B (impossible merges 1)

if either of the lines is in the middle of a block or they are the ends of the same block we continue with the next pair at Step 4

Step 4C (impossible merges 2)

We maintain 2 position limits for each block, one for each possible orientation; if a merge would force the block outside the image, we continue with the next pair at Step 4.
The limits are calculated based on the fact that no line can be output before it has been received

Step 4D (impossible merges 3)

We maintain the fullness state of the 32 line reorder buffer; if a merge would require overflowing it, then we again continue with the next pair at Step 4

Step 4E (merge)

we merge the 2 blocks and update the buffer fullness and limits

Step 5 (sort blocks)

All blocks are sorted by their less restrictive position limit (the orientation is chosen so that the limit is less restrictive)

Step 6 (reorder lines)

we reorder the actual lines

Breaking Nagravision with knowledge of the PRNG

As there are just 2^15 possible permutations, naive brute force becomes a lot more realistic, but it is still somewhat expensive for realtime decoding, so 2010 used a faster method

Step 1 (downsampling)

To speed up the line compare later, the lines can be horizontally downsampled

Step 2 (compare)

Every keyline is compared against all lines; the best 2 are kept

Step 3 (find seedlists)

For every line triplet (keyline and the 2 best matches) we assume that they will be consecutive in the decoded image, and we look up a list of seeds for which this is true from a precalculated table or tree structure

Step 4 (find seed)

We find the most often occurring seed in the seed lists, and if its number of occurrences is above some threshold we decode the image; if not we drop it. This ensures practically zero wrongly decoded images

Step 5 (reorder)

after we know the seed we just need to reorder the lines and fix the chroma

Filed under: Cryptanalysis — Michael @ 11:54

2006-04-10

Decrypting VideoCrypt

To complement my last blog entry, here's a description of how VideoCrypt works and how it can be broken with reasonable quality

PAL

First, before we can talk about VideoCrypt, we need to understand how PAL works.
Video information in PAL (and NTSC and SECAM) is stored linewise, top to bottom, and interlaced, so all the even lines and then, in the next field, all the odd lines are stored (313 lines in even fields, 312 in odd ones, of which ~288 lines contain a visible picture for PAL).
Each line has a sync pulse to mark the end of a line, a horizontal blanking interval during which the electron beam in CRTs “moves” back to the left after the end of the previous line at the right, and obviously the visible part, in which brightness = amplitude.
In the case of color, the U and V components of the YUV color value are stored quadrature amplitude modulated at the color subcarrier frequency, and the V component has its sign inverted in every 2nd line of a field; there is also a short reference color burst added in the HBI

VideoCrypt

the PRNG seed

Binary data from the vertical blanking interval is fed into a smartcard, which then among other things produces a PRNG seed for the VideoCrypt decoder, but none of this matters to us here

the actual decryption

The PRNG (pseudo random number generator) outputs a sequence of 8 bit values (cutpoints), one for each line of video. Each of these lines is then split into 4 parts: parts 0 and 3 are outside the visible area and aren't touched by VideoCrypt, parts 1 and 2 are the visible area and are exchanged, and the location which separates 1 and 2 is the cutpoint. VideoCrypt does nothing special with the color information, it just exchanges raw digitally sampled data (a small sketch of the inverse cut and exchange is shown below).
Because this exchange operation cannot be done with zero delay, the VideoCrypt decoder will also output parts 1 and 2 one line later than when they were transmitted
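
A minimal sketch of undoing the cut and exchange for one line, assuming “cut” is the number of samples in part 1 and src/dst cover only the exchanged visible area (my own illustration code):

#include <string.h>
#include <stdint.h>

void descramble_line(uint8_t *dst, const uint8_t *src, int visible_len, int cut){
    /* the encoder transmitted part 2 first and then part 1, so put them back */
    memcpy(dst,       src + visible_len - cut, cut);               /* part 1 */
    memcpy(dst + cut, src,                     visible_len - cut); /* part 2 */
}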

Breaking VideoCrypt

Lacking information about the PRNG, we can only guess the cutpoints, based on the assumption that color/luminance values of pixels which are close together will be strongly correlated, while distant pixels will differ more.
Still, naive brute force would require 256^288 images of w*288 pixels to be compared.
Depending on how we test an image for how good it is, we are able to significantly reduce the complexity. If we use a score function which is the sum of some compare function over consecutive line pairs and their cutpoints, then the dynamic programming / Viterbi algorithm can be used to reduce the complexity to 256*256*288 such 2 line checks. If we now further limit ourselves to the common cross correlation + some edge detector, and only consider a few cutpoint differences close to the best result from the cross correlation, then the complexity would be w*log(w)*288 + 9*w*288, which is what Markus Kuhn's antisky.c did. Though antisky was far from realtime decoding, mainly because it was inefficiently written (floating point code everywhere, at a time when no CPU had a reasonably fast FPU, …)

Note, if you think comparing 256^288 images would be stupid, it's not; think about choosing the image which compresses best with JPEG or another codec, this might (or might not) be better than the linewise method

Step 1 (downsampling)

To speed up the cross correlation and other things, the whole image can optionally be downsampled

Step 2 (luminance cross correlation)

antisky.c used an FFT based cross correlation; this, while asymptotically quite fast, is really not fast at all for finding the best matching cutpoint difference in practice. 2010 uses an adaptive comparison which first tries a few coarsely spaced cutpoint differences and then recursively tries more around the better ones; it's simpler and faster

Step 3 (mismatched line detection)

Lines which simply can't be matched by cross correlation (mean under some threshold, or variance under some threshold, or the best matching score from the cross correlation not much better than the average score) must be marked, otherwise the (random) cross correlation result will trash the surrounding lines

Step 4 (PAL phase detection)

As already described somewhere above, PAL flips the sign of the V component every second line; the first line may or may not be flipped, and if the capture hw doesn't provide us with this info, then we need to find it ourselves if we want to decode color. One way to solve this is to let the user set the flag and then just flip the flag after every frame; a much more robust method is to compare the color of the first and last pixel of the exchanged part of the encrypted line. These are guaranteed to always be adjacent in the decrypted image, so there they will have almost the same color, while in the encrypted image their difference depends upon the PAL phase thingy

Step 5 (finding the Chroma phase difference)

We know the approximate cutpoint difference from the luma cross correlation, so we can simply find the chroma phase/hue difference by calculating the complex correlation coefficient (U and V being the real and imaginary values) (note, a little care needs to be taken here due to various things I don't remember exactly …)

Step 6 (edge detection)

Simply run your favorite edge detector over the 256 possible cutpoints to detect the picture edge, for example log(abs(l[x-2] + l[x-1] - l[x+0] - l[x+1]))

Step 7 (dynamic programming search)

Well, here we simply combine all the previous stuff to find the best cutpoint sequence reaching every cutpoint of line X, assuming we already know the best sequences to all cutpoints of line X-1 (a generic sketch of this search follows below).
Lines which couldn't be matched to their previous line are a special case which is handled like the first line (every cutpoint has the same score). The same is done with yet another special case, namely the one where no cutpoint is possible anymore; that can happen as we restrict ourselves to a few relative cutpoints around what the cross correlation guessed is correct, and there's a small deadzone in which no cutpoints lie, so not every sequence of relative cutpoints is possible, depending upon how you define relative
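
A sketch of that dynamic programming search (my own illustration, not the 2010 code; NCUT, best_cut and the user supplied cost(), which scores how badly line y-1 with candidate p matches line y with candidate c, are all assumptions):

#include <float.h>

#define LINES 288
#define NCUT  9                  /* candidate cutpoints kept per line */

extern double cost(int y, int p, int c);   /* hypothetical, lower is better */

void find_cutpoints(int best_cut[LINES]){
    static double score[LINES][NCUT];
    static int    from [LINES][NCUT];      /* backpointer to the previous line */

    for(int c=0; c<NCUT; c++)
        score[0][c]= 0;                    /* first line: every cutpoint equal */

    for(int y=1; y<LINES; y++)
        for(int c=0; c<NCUT; c++){
            score[y][c]= DBL_MAX;
            for(int p=0; p<NCUT; p++){
                double s= score[y-1][p] + cost(y, p, c);
                if(s < score[y][c]){
                    score[y][c]= s;
                    from [y][c]= p;
                }
            }
        }

    /* pick the best final candidate and walk back; best_cut[] receives the
     * index of the chosen candidate for each line */
    int c= 0;
    for(int i=1; i<NCUT; i++)
        if(score[LINES-1][i] < score[LINES-1][c])
            c= i;
    for(int y=LINES-1; y>=0; y--){
        best_cut[y]= c;
        c= from[y][c];
    }
}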

Step 8 (cutpoint sequence cache)

Over some time period the cutpoint sequences were repeating after an hour or so, so I added a cache to 2010, which led to perfect decoding until the repetitions disappeared …

Step 8a (finding the current sequence)

We select several consecutive relative cutpoint pairs and look them up in a hashtable; we also try adding ±1 to both, so as to compensate for errors in our cutpoint guess.
For each pair we get a list of pointers to cutpoint sequences.
We count the number of occurrences of each pointer and choose the most often occurring one as the correct sequence if its number of occurrences is above some threshold, or return no match if not

Step 8b (merging the current cutpoint estimates)

We add the current relative cutpoints to the sequence so that each line has a list of relative cutpoints; if that list gets too long, the least often occurring one is discarded.
For each line the most often occurring relative cutpoint is chosen and used in later steps; the hashtable is also updated with it

Step 9 (cut and exchange)

This is simple: just 2-3 memcpys and rotating the UV values so as to compensate for the effects of the shift and the 1 line delay on the color

Filed under: Cryptanalysis — Michael @ 14:58

2006-04-07

Decrypting (old analog) PayTv

A really long time ago, 1997 or maybe even a little earlier, I wrote a program to decrypt the analog pay TV systems in use at that time, VideoCrypt and Nagravision. Sadly I was too afraid of legal issues back then, so I never released either source or binary. Well, both systems are AFAIK dead nowadays, the old code hasn't been touched since 2000 and doesn't compile with any compiler I have, so I thought it's the perfect time to finally release that trash

When was the first working version born?

Well, I don't know; the oldest file I could find was a 2do list from 1997 which contains notes about improving VideoCrypt and Nagravision decoding, so these must have been working already …

Does it contain any cryptographically new knowledge about the systems?

No. Before the Nagra PRNG was reverse engineered, I believed that the PRNGs both systems used would be very secure and that there would be no point in “looking” at their output. Afterwards, besides feeling silly, as even a blind man would have noticed how the Nagra PRNG works (it just permutes 256 5 bit values in a trivial 32768 ways, 127 only if we ignore new[x]= old[x+C]),
I tried to look at the VideoCrypt PRNG output but failed to find anything: 2 cutpoint sequences were either different, with their difference random, or they were exactly equal, which happened either commonly (repeating after an hour or so) or never, depending upon when you recorded them. I also failed to find any other relations in the sequences …

Is this clean ANSI C code?

It's ultra messy DJGPP-GCC-DOS C code. I started to clean it up but gave up; a patch with my unfinished cleanup work is included for the insane. With the patch a few files compile under current gcc on Linux

what else does it do? could it make coffee?

There's some teletext decoding code, a half working sync suppression decoder (with unmodified hardware) and probably a few other things I don't remember; after all it doesn't compile, and I don't have my Matrox Meteor card here (which was the only card it supported, as that was what I had)

Did you port it to Java?

Yes, I ported part of it to Java a really long time ago. Why I did that I'm not sure anymore; was I seriously believing the Sun propaganda that Java was a usable general purpose programming language? I guess my excuse is that my previous Java code at that time consisted of just tiny applets & applications where the Java effect isn't so noticeable.
The Java version was, even with hand coded MMX over JNI, about 3 times slower; there were ~1 sec garbage collector pauses every 30 secs or so, and after 15 min the computer needed a hard reset (yeah, that was even without any MMX/JNI)

download this crap\h\h\h\hmasterpiece

2010

license?

GPL but what do you want to do with it?!

Filed under: Cryptanalysis — Michael @ 14:37
