March 2006

AVI and b frames, is it allowed or not?

AVI certainly wasn't designed to support b frames, but on the other hand there's nothing which would make b frames in AVI illegal. The biggest problem isn't AVI but the various applications and APIs like VfW which are designed with zero-delay codecs in mind.

PTS (presentation timestamps)

Well, wtf are they and why would we need them? Every (video) frame has a decode timestamp (DTS) and a PTS. The DTS is the time at which a frame is fed into the decoder, the PTS is the time at which the decoded frame will be presented to the user. Codecs which have no b frames / zero delay always have PTS = DTS, and AVI, as it wasn't designed with b frames in mind, has no concept of PTS; there are just DTS, which are simply the frame number divided by the frame rate.
So one could now argue that AVI doesn't support b frames, as it doesn't store PTS and an application which needs to know the PTS (simpler players don't need to know the PTS…) would have to calculate it based upon frame type and DTS. But that argument against AVI + b frames has a critical flaw: MPEG-PS and MPEG-TS don't store PTS for every frame either, but only require it to be stored every 0.5 seconds or so. Which means that the same complicated calculate-the-PTS-from-DTS-and-frame-types code is needed for the official MPEG formats too.
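To make the "calculate the PTS from DTS + frame types" step a bit more concrete, here is a minimal Python sketch. It only covers the simple classic MPEG-style case and assumes that frames are listed in decode (storage) order, that b frames are never used as references, and that DTS is frame number / frame rate as in AVI. The function name pts_from_dts and the string of frame types are made up for illustration; real streams (b-pyramids, field pictures, dropped frames) need more care.

```python
from fractions import Fraction


def pts_from_dts(frame_types, fps):
    """Derive PTS from frame types given in decode order, e.g. 'IPBBPBB'.

    Assumption: b frames are shown as soon as they are decoded, and a
    reference frame (I or P) is shown when the next reference frame is
    decoded. Returns a list of PTS values in seconds, in decode order.
    """
    frame_dur = Fraction(1) / fps
    # AVI-style DTS: frame number divided by the frame rate.
    dts = [n * frame_dur for n in range(len(frame_types))]
    if "B" not in frame_types:
        return dts  # zero-delay stream: PTS == DTS

    pts = [None] * len(frame_types)
    pending_ref = None  # index of the reference frame still waiting for display
    for n, frame_type in enumerate(frame_types):
        if frame_type == "B":
            pts[n] = dts[n]
        else:
            if pending_ref is not None:
                pts[pending_ref] = dts[n]
            pending_ref = n
    if pending_ref is not None:
        # The last reference frame is flushed one frame after the last DTS.
        pts[pending_ref] = dts[-1] + frame_dur
    return pts


if __name__ == "__main__":
    print([int(p * 25) for p in pts_from_dts("IPBBPBB", 25)])
```

For the decode order IPBBPBB this gives PTS of 1, 4, 2, 3, 7, 5, 6 in units of one frame, so the frames come out in the order I B B P B B P.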

Packed bitstream

Packed bitstream is a very ugly hack which puts several frames into a single AVI chunk. This reduces the b frame delay by 1, which avoids some problems with APIs which are not aware of b frames, but causes various problems with APIs which are aware of b frames. The resulting bitstream is also not valid according to the MPEG standards, so lossless transcoding to .mp4 is much more complicated.

Some people love AVI, some hate it, but there's little doubt it is the most common container format for videos, though its usage is declining as superior formats like Matroska, NUT and others gain popularity.

Why is/was AVI so successful?

Well, there's no way to answer this with certainty, we can only guess:

  • allows storing of audio and video encoded with almost any codec without complicated codec-specific hacks (Ogg is probably the most common example which fundamentally failed here, but MPEG-PS/TS has the same issue, and .mp4 is, thanks to the standards committees, also full of codec-specific hacks, which ironically wouldn't be needed at all)
  • simple muxing and demuxing (QuickTime/.mov/.mp4/MPEG-PS fail here)
  • no known software patents (ASF/WMA/WMV fail here)

AVI the format vs. reality, or why some developers hate AVI

The biggest problem with AVI, at least in my humble opinion, is not some technical limitation of AVI but the fact that many encoders (muxers actually, to be precise) generate incorrect AVI files which violate the AVI spec from MS. That again means that players need to be much more complex to deal with all these variations …
Some examples of such variations:

  • putting all the audio in a single chunk, yeah, try to implement seeking …
  • having an index which doesn't match the chunks, and just to be sure, not setting the header flags which indicate that the index must be used
  • putting several variable-sized frames into a single chunk
  • or the very common way of storing MP3 in AVI: simply cut the MP3 frames completely randomly and then scream “AVI doesn't support VBR MP3” (hi VirtualDub and clones)

Common misconceptions about AVI

AVI doesn't support variable bitrate audio
This is simple nonsense. There are 2 ways to store frames (audio/video/…) in AVI. The first is to set sample_size to the size in bytes of a packet and store an integer number of such packets per chunk. The second is to store 1 frame per chunk and set sample_size=0; that's what has to be done for variable-size packets, and it works with audio as well as video. Now what some encoders actually do is set sample_size=1 and then chop the audio stream up at random. Surprisingly this works with CBR MP3 but not VBR MP3, but it's totally wrong even for CBR MP3.
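The two legal layouts can be sketched roughly like this in Python. This is only an illustration of the rule described above, not code from any real muxer: plan_chunks and packets_per_chunk are hypothetical names, and the input is simply a list of packet sizes in bytes.

```python
def plan_chunks(packet_sizes, packets_per_chunk=4):
    """Pick sample_size and the chunk sizes for a list of packet sizes.

    Returns (sample_size, chunk_sizes). Packets are never cut in the
    middle, which is exactly what the broken sample_size=1 muxers do.
    """
    if not packet_sizes:
        return 0, []
    if len(set(packet_sizes)) == 1:
        # Way 1: all packets have the same size, so sample_size is that
        # size and every chunk holds a whole number of packets.
        size = packet_sizes[0]
        chunk_sizes = []
        for start in range(0, len(packet_sizes), packets_per_chunk):
            count = len(packet_sizes[start:start + packets_per_chunk])
            chunk_sizes.append(count * size)
        return size, chunk_sizes
    # Way 2: variable-size packets, so sample_size = 0 and exactly one
    # packet per chunk.
    return 0, list(packet_sizes)


if __name__ == "__main__":
    print(plan_chunks([417] * 10))        # constant-size packets
    print(plan_chunks([104, 417, 626]))   # variable-size packets
```

With ten 417-byte packets and four packets per chunk this yields sample_size 417 and chunks of 1668, 1668 and 834 bytes; with packets of 104, 417 and 626 bytes it yields sample_size 0 and one chunk per packet.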
AVI doesn't support variable framerate
Again this isn't true. Storing variable framerate in AVI is not efficient, but it is possible, it's not a hack and it doesn't stretch the AVI spec; it's fully supported. You just set the framerate to the least common multiple of the rates you want to use, and use 0-byte chunks for “skipped” frames.
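As a small sketch of the least-common-multiple trick, assuming integer frame rates only (fractional rates like 30000/1001 would need the same idea applied to rate/scale pairs), the layout of real and 0-byte chunks could be computed like this. vfr_chunk_layout and the (fps, frame_count) segment list are made-up names for illustration.

```python
from functools import reduce
from math import gcd


def lcm(a, b):
    """Least common multiple of two positive integers."""
    return a * b // gcd(a, b)


def vfr_chunk_layout(segments):
    """segments: list of (fps, frame_count) pairs with integer fps.

    Returns (container_fps, layout) where layout has one entry per AVI
    chunk: True for a chunk with a real frame, False for a 0-byte chunk
    standing in for a "skipped" frame.
    """
    container_fps = reduce(lcm, (fps for fps, _ in segments))
    layout = []
    for fps, frame_count in segments:
        step = container_fps // fps  # container ticks per real frame
        for _ in range(frame_count):
            layout.append(True)
            layout.extend([False] * (step - 1))
    return container_fps, layout


if __name__ == "__main__":
    fps, layout = vfr_chunk_layout([(24, 2), (30, 2)])
    print(fps, len(layout), layout.count(True))
```

Mixing 24 and 30 fps gives a container rate of 120 fps, so every 24 fps frame is followed by four empty chunks and every 30 fps frame by three, which shows nicely why this is legal but not efficient.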
Storing variable samples per packet audio in AVI (Vorbis)
While this is still possible, it stretches the spec somewhat. sample_size must be 0, and there are 2 ways to store things: 1. the one with large overhead, which sets rate/scale to a common timebase (gcd of samples per packet / samplerate) and then adds 0-sized packets to keep the timestamps correct; 2. the one with small overhead, which sets rate/scale to (lcm of samples per packet / samplerate) and packs several packets into each chunk, which isn't really allowed but well …
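Here is a rough Python sketch of variant 1 (the large-overhead one). It reads "gcd of samples per packet / samplerate" as: scale is the gcd of all per-packet sample counts and rate is the sample rate, so one chunk slot lasts scale/rate seconds. That reading, the name vorbis_avi_timebase and the example sample counts are assumptions for illustration only.

```python
from functools import reduce
from math import gcd


def vorbis_avi_timebase(samples_per_packet, sample_rate):
    """Variant 1: common timebase plus 0-sized chunks.

    samples_per_packet: number of samples each packet decodes to.
    Returns (scale, rate, chunks) where chunks lists 'data' for a chunk
    holding one packet and 'empty' for a 0-sized filler chunk.
    """
    scale = reduce(gcd, samples_per_packet)
    rate = sample_rate
    chunks = []
    for samples in samples_per_packet:
        ticks = samples // scale  # how many chunk slots this packet spans
        chunks.append("data")
        chunks.extend(["empty"] * (ticks - 1))
    return scale, rate, chunks


if __name__ == "__main__":
    scale, rate, chunks = vorbis_avi_timebase([128, 576, 1024], 44100)
    print(scale, rate, len(chunks), chunks.count("data"))
```

For packets of 128, 576 and 1024 samples the gcd is 64, so 3 real packets end up spread over 27 chunks, 24 of them empty. That is where the large overhead comes from.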
AVI has 24 bytes per chunk overhead, ODML-AVI has 32 bytes per chunk overhead
Well, ODML-AVI without the old redundant index has just 16 bytes per chunk overhead, and if you stretch the spec like some encoders successfully do by having chunks and index mismatch, then you can get away with 8 bytes per chunk overhead, and yes, you can still seek in this.
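The numbers above add up as follows, assuming the usual structure sizes: an 8-byte chunk header (FourCC + size), a 16-byte entry in the old idx1 index and an 8-byte entry in the ODML standard index. Padding bytes and the fixed headers of the index chunks themselves are ignored. The last case is just one possible reading of "chunks and index mismatch", namely frames that are only reachable through the index and have no chunk header of their own.

```python
CHUNK_HEADER = 8  # FourCC (4 bytes) + chunk size (4 bytes)
IDX1_ENTRY = 16   # old index entry: chunk id, flags, offset, size
ODML_ENTRY = 8    # ODML standard index entry: offset, size


def overhead(chunk_header, idx1, odml):
    """Per-frame overhead in bytes for a given combination of structures."""
    total = 0
    if chunk_header:
        total += CHUNK_HEADER
    if idx1:
        total += IDX1_ENTRY
    if odml:
        total += ODML_ENTRY
    return total


plain_avi = overhead(chunk_header=True, idx1=True, odml=False)
odml_with_old_index = overhead(chunk_header=True, idx1=True, odml=True)
odml_only = overhead(chunk_header=True, idx1=False, odml=True)
index_only = overhead(chunk_header=False, idx1=False, odml=True)

if __name__ == "__main__":
    print(plain_avi, odml_with_old_index, odml_only, index_only)
```

This prints 24 32 16 8, matching the figures in the text.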