Avoiding branches/if/conditionals
Why? well because misspredictable branches are very slow on modern CPUs, doing 10 or more simple arithemetic operations can be faster, but also keep in mind that:
- compilers sometimes compile conditional code to very efficient unconditional code, while they might not recognize your branchless code es equivalent
- if the branch is well predictable it will probably be faster then the branchless code
Building blocks of branchless code
Turning comparissions into masks or 0/1
a<0 | a>>31 |
a!=0 | (a|-a)>>31 |
a<b | (a-b)>>31 |
a>b | (b-a)>>31 |
a≤b | (a-b-1)>>31 |
a≥b | (b-a-1)>>31 |
Note, you might want to replace 31 by the number of bits in whatever type you use and cast to a signed type if you want a bit mask (-1/0) or unsigned type if you want (1/0)
allso keep in mind that (a<b) and such will give you (1/0) too
converting bitmasks to 0/1
well, thats trivial, there are many ways, some even flip the 0/1 in the process
mask&1, -mask, mask+1
converting 0/1 to masks
-x, x-1
flipping the 0/1 value
x^1, 1-x
flipping the mask value
x^(-1), ~x
switch between 2 variables depening upon a mask
b+((a-b)&mask), b^((a^b)&mask)
convert mask to 0/x
mask&x
convert mask to -1/x
mask|x
Note: usefull for (-1/0) -> (-1/1)
add/sub x depening upon a mask
a += x&mask
Note, dont forget a+=mask / a-=mask for x=1/-1
negate depening upon a mask
a = (a^mask) – mask
minimum of 2 variables
b+((a-b)&((a-b)>>31))
maximum of 2 variables
a-((a-b)&((a-b)>>31))
shift by s depending upon a mask
a>>= s&mask, a< <=s&mask, a += ((a>>s)-a)&mask
Note, that shifting by a constant might be faster
exchanging 2 values depending upon a mask
a^=b
b^=a&mask
a^=b
Note, the mask doesnt need to be 0/-1 it could have just some bits set
logical operations between masks
well, that really should be obvious AND a&b, OR a|b, XOR a^b, NOT ~a