Fast CRC
Ok kids, today we will look at how to quickly calculate a crc checksum, the naive way is to xor the crc polynomial with various shifts into the data to remove all 1 bits except the last 32 (or whatever length our crc polynom has) which from a mathemathical point of view is the data_polynom modulo the crc_polynom over GF(2)
now noone does that except maybe a few very smart or very stupid people :-) everyone else uses a 8bit->32bit look up table which simply contains the 32bit crc for each 8bit data polynom, more precissely:
LUT[i]= (i<<|CRC_POLY|) GF2_MOD CRC_POLY where GF2_MOD is the modulo operation between 2 binary polynoms over GF(2)
the actual code which uses this LUT looks approximately like:
for (i = 0; i < length; i++) crc = LUT[(crc&0xFF) ^ *buffer++] ^ (crc >>8);
a much faster method is to use 4 LUTs so that they gives the crc checksum 0,1,2,3 bytes ahead:
LUT[j][i]= (i<<(|CRC_POLY| + j*8)) GF2_MOD CRC_POLY
with that we can calculate the crc then by:
while(buffer<end-3){ crc ^= le2me_32(*(uint32_t*)buffer); buffer+=4; crc = tab[3][ crc &0xFF] ^tab[2][(crc>>8 )&0xFF] ^tab[1][(crc>>16)&0xFF] ^tab[0][(crc>>24) ]; } while(buffer<end) crc = tab[0][(crc&0xFF) ^ *buffer++] ^ (crc >>8);
actually working code should be in ffmpeg/libavutil soon, if not ill post my messy proof of concept code