## Avoiding branches/if/conditionals

Why? well because misspredictable branches are very slow on modern CPUs, doing 10 or more simple arithemetic operations can be faster, but also keep in mind that:

- compilers sometimes compile conditional code to very efficient unconditional code, while they might not recognize your branchless code es equivalent
- if the branch is well predictable it will probably be faster then the branchless code

#### Building blocks of branchless code

##### Turning comparissions into masks or 0/1

a<0 | a>>31 |

a!=0 | (a|-a)>>31 |

a<b | (a-b)>>31 |

a>b | (b-a)>>31 |

a≤b | (a-b-1)>>31 |

a≥b | (b-a-1)>>31 |

Note, you might want to replace 31 by the number of bits in whatever type you use and cast to a signed type if you want a bit mask (-1/0) or unsigned type if you want (1/0)

allso keep in mind that (a<b) and such will give you (1/0) too

##### converting bitmasks to 0/1

well, thats trivial, there are many ways, some even flip the 0/1 in the process

mask&1, -mask, mask+1

##### converting 0/1 to masks

-x, x-1

##### flipping the 0/1 value

x^1, 1-x

##### flipping the mask value

x^(-1), ~x

##### switch between 2 variables depening upon a mask

b+((a-b)&mask), b^((a^b)&mask)

##### convert mask to 0/x

mask&x

##### convert mask to -1/x

mask|x

Note: usefull for (-1/0) -> (-1/1)

##### add/sub x depening upon a mask

a += x&mask

Note, dont forget a+=mask / a-=mask for x=1/-1

##### negate depening upon a mask

a = (a^mask) – mask

##### minimum of 2 variables

b+((a-b)&((a-b)>>31))

##### maximum of 2 variables

a-((a-b)&((a-b)>>31))

##### shift by s depending upon a mask

a>>= s&mask, a< <=s&mask, a += ((a>>s)-a)&mask

Note, that shifting by a constant might be faster

##### exchanging 2 values depending upon a mask

a^=b

b^=a&mask

a^=b

Note, the mask doesnt need to be 0/-1 it could have just some bits set

##### logical operations between masks

well, that really should be obvious AND a&b, OR a|b, XOR a^b, NOT ~a