### SIMD without SIMD

You want to put several values in a single int or uint64_t, and either you dont have a cpu which supports SIMD instructions like MMX or the instructions plain dont match the datatype you have (like 2 5:6:5 rgb values in an int)

##### shifting some arbitrary bits left by 1

a += a&mask

note1, this is very usefull for converting betweem rgb15 and rgb16

note2, yes the bit into which you shift should be 0

##### unsiged shift left & right

this is simply a normal shift and then masking out the bits which where shifted out of each component

a>>/<<= shift

a&=mask

##### sign extension

(a+C)^C

00011|00010|00001|00000 10001|10000|01111|01110 + 1110011100111001110 11111|11110|00001|00000 ^ 1110011100111001110

##### signed shift right (signed left shift == unsigned)

unsigned shift + sign extension

##### sum all elements

there are several ways,

if you know nothing will overflow then a simple multiply will do

AAAABBBBCCCCDDDD * 1000100010001 = SSSSXXXXXXXXXXXX

or

AAAABBBBCCCCDDDD + (AAAABBBBCCCCDDDD>>4) = X,a+b,X,c+d

(X,a+b,X,c+d) + ((X,a+b,X,c+d)>>8) = X,X,X,a+b+c+d

if overflows might happen / the width of the types is too small then

a= (x & 1111000011110000)>>4

b= x & 0000111100001111

a+=b

and then either use one of the methods above or recursively continue with this

##### add

1. remove msb, and add the lsbs

2. xor the xored msb back in

x=AAAABBBBCCCCDDDD and y=EEEEFFFFGGGGHHHH are our inputs

m=1000100010001000 and l=0111011101110111

((x&l) + (y&l)) ^ ((x^y)&m)

##### subtract

1. remove msb, and sub the lsbs

2. xor the xored msb back in

x=AAAABBBBCCCCDDDD and y=EEEEFFFFGGGGHHHH are our inputs

m=1000100010001000 and l=0111011101110111

((x|m) – (y&l)) ^ ((x^y^m)&m)

##### negate

x=AAAABBBBCCCCDDDD is our input

m=1000100010001000 and l=0111011101110111

x= ~x

((x&l) + 0001000100010001) ^ (x&m)

##### average round down

x=AAAABBBBCCCCDDDD and y=EEEEFFFFGGGGHHHH are our inputs

(x&y) + (((x^y)&1110111011101110)>>1)

##### average round up

x=AAAABBBBCCCCDDDD and y=EEEEFFFFGGGGHHHH are our inputs

(x|y) – (((x^y)&1110111011101110)>>1)

##### test if any element is zero

x=AAAABBBBCCCCDDDD is our input

(x – 0001000100010001) & (~x) & 1000100010001000

Note, this can trivially be used to test for equality by choosing x=a^b

##### create a bitmask based on zeroness

x=AAAABBBBCCCCDDDD is our input

m=1000100010001000 and l=0111011101110111

(((((x&l) + l) | x) & m) >>3) + l ^ l

Note, this can trivially be used to test for equality by choosing x=a^b