What's a patch
A patch can be many things if you look at Wikipedia, but the one I mean here is of course what is generated by the diff program. What diff outputs is simply the differences between 2 text files in a terse form which is both human and machine readable. The machine readable part means that the diff output together with one of the 2 input text files can be used by the patch program to generate the other text file. This makes patches a very convenient method of communicating changes between people, as people can read and understand them and also apply them automatically to a file.
The following is an example patch which corrects a serious mistake of Shakespeare. Sadly Shakespeare did not know about the patch program, so he apparently failed to correct that …
--- mytext 2007-05-19 00:46:37.000000000 +0200
+++ mytext2 2007-05-19 00:47:12.000000000 +0200
@@ -1,3 +1,3 @@
 Shakespear said or wrote or whatever ...
-"To be or not to be,
+"To pee or not to pee,
 that is the question"
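To make the "machine readable" part concrete, here is a minimal sketch in Python (not what diff and patch actually do internally). It uses the standard difflib module to generate a unified diff of the two texts, and a small hypothetical helper, apply_patch, to rebuild the second text from the first text plus the patch. The helper assumes a single-file patch with the two header lines first and does no fuzzy matching, unlike the real patch program.

```python
import difflib
import re

old = [
    "Shakespear said or wrote or whatever ...\n",
    '"To be or not to be,\n',
    'that is the question"\n',
]
new = [
    "Shakespear said or wrote or whatever ...\n",
    '"To pee or not to pee,\n',
    'that is the question"\n',
]

# Generate the patch, much like running `diff -u mytext mytext2`.
patch = list(difflib.unified_diff(old, new, fromfile="mytext", tofile="mytext2"))

# Matches a hunk header such as "@@ -1,3 +1,3 @@" and captures the old start/count.
HUNK_HEADER = re.compile(r"^@@ -(\d+)(?:,(\d+))? \+\d+(?:,\d+)? @@")


def apply_patch(original, patch_lines):
    """Hypothetical helper: apply a single-file unified diff to a list of lines."""
    result = []
    pos = 0  # index of the next line of `original` not yet consumed
    for line in patch_lines[2:]:  # assumption: the first two lines are the ---/+++ header
        header = HUNK_HEADER.match(line)
        if header:
            start = int(header.group(1))
            if header.group(2) != "0":  # a count of 0 means "insert after this line"
                start -= 1
            result.extend(original[pos:start])  # copy untouched lines before the hunk
            pos = start
        elif line.startswith("+"):
            result.append(line[1:])  # added line
        elif line.startswith((" ", "-")):
            if pos >= len(original) or original[pos] != line[1:]:
                raise ValueError(f"patch does not apply at line {pos + 1}")
            if line.startswith(" "):
                result.append(original[pos])  # context line is kept
            pos += 1  # a removed line is simply skipped
    result.extend(original[pos:])  # copy whatever follows the last hunk
    return result


print("".join(patch), end="")
print(apply_patch(old, patch) == new)
```

The printed patch has the same shape as the one above, just without the timestamps, and the helper refuses to apply a patch whose context or removed lines do not match the file.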
What are patches used for
Well, they are used for communicating changes between software developers, at least in the open source world. So for example, if you change the ffmpeg source code and fix a bug or add a feature, then you could make a patch and submit that to us.
Why a patch and not the whole text
Well, just think about the following: 2 people send you a patch, you look at the 2 small patches and, if you are ok with them, you just apply both and have both changes in your text (which may be the source code of a program, or it might be just a love letter, the patch program doesn't care). On the other hand, if the 2 people had sent you the complete files, not only would it be tricky to see what they have changed, integrating the changes would be even less fun. Note that there are of course cases where 2 patches from 2 people conflict with each other so that they cannot both be applied automatically, but in practice this is rare if the patches are clean.
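The idea of combining 2 changes, and of a conflict, can be sketched in a few lines of Python. This is a deliberately simplified illustration: the hypothetical function merge_line_edits assumes both people only changed lines in place (no lines added or removed), whereas the real patch program works on hunks with surrounding context.

```python
def merge_line_edits(base, first, second):
    """Combine two in-place edits of `base`; return (merged lines, conflicting line numbers)."""
    if not (len(base) == len(first) == len(second)):
        raise ValueError("this sketch only handles edits that keep the line count")
    merged, conflicts = [], []
    for number, (b, f, s) in enumerate(zip(base, first, second), start=1):
        if f == s:        # both agree (or neither changed the line)
            merged.append(f)
        elif f == b:      # only the second person changed this line
            merged.append(s)
        elif s == b:      # only the first person changed this line
            merged.append(f)
        else:             # both changed the same line differently
            conflicts.append(number)
            merged.append(b)  # keep the original and let a human decide
    return merged, conflicts


base = ["Dear Juliet,\n", "I like you.\n", "Yours, Romeo\n"]
first = ["Dearest Juliet,\n", "I like you.\n", "Yours, Romeo\n"]
second = ["Dear Juliet,\n", "I love you.\n", "Yours, Romeo\n"]
third = ["Dear Juliet,\n", "I adore you.\n", "Yours, Romeo\n"]

print(merge_line_edits(base, first, second))  # changes touch different lines
print(merge_line_edits(base, second, third))  # both touch line 2
```

In the first call the 2 changes touch different lines and combine cleanly; in the second both touch line 2, which is exactly the case that needs a human.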
Now after this little introduction, let's look at the actual reviewing of patches. People send us patches with their modifications of the ffmpeg source code and we review and comment on them. … Well, it's sadly more I than we, but anyway.
The primary goal of reviewing is to prevent bad, broken, buggy, inefficient, stupid, unneeded, … changes from being applied to the official ffmpeg source, so as to keep the source code small, efficient, simple, fast and working. This might sound like an easy task but it is absolutely not easy; if it were, people would not submit crappy changes in the first place but would rather realize the problems and correct them before submitting.
Fixing problems in patches ourselves vs. forcing the submitter to fix them
Fixing all issues which are found during review is always the job of the person who submitted the patch. The first reason is that it works very well in practice, as the submitter wants the change to be accepted. The second reason is simply that we don't have the time to fix the issues ourselves, considering the number of patches we receive. And applying patches without all issues fixed would cause the quality, stability and maintainability of ffmpeg to decline steeply, so that isn't an option either.
This is the first, simple review level, which checks whether the patch could be applied at all if we wanted to apply it, and whether it works:
- Is it a patch at all?
- Is it undamaged?
- Does it apply to the current latest ffmpeg source?
- Do the regression tests pass? (We have a series of automated tests which check if various things like encoding, decoding and seeking still work as they did before.)
- Does the patch avoid introducing any trailing whitespace or tabs? (We have agreed that these don't do any good and so the ffmpeg source doesn't contain any; this is arguably a philosophical issue, but it's easy to conform to.)
- Does it avoid mixing whitespace/cosmetic changes with functional changes? (Patches which do are very hard to review, so we always require them to be split.)
- Does it do what the author claims it does?
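Some of these checks are mechanical enough to automate. As an illustration only (this is not the script ffmpeg uses, and the function name is made up), the whitespace rule could be checked on the added lines of a unified diff roughly like this in Python:

```python
def find_whitespace_problems(patch_text):
    """Return (patch line number, problem) pairs for lines the patch adds."""
    problems = []
    for number, line in enumerate(patch_text.splitlines(), start=1):
        # Only added lines matter; "+++" is the file header, not an added line.
        # Assumption of this sketch: no added line itself begins with "++".
        if not line.startswith("+") or line.startswith("+++"):
            continue
        added = line[1:]
        if "\t" in added:
            problems.append((number, "tab"))
        if added != added.rstrip(" "):
            problems.append((number, "trailing whitespace"))
    return problems


example = (
    "--- a.c\n"
    "+++ b.c\n"
    "@@ -1,2 +1,3 @@\n"
    " int x;\n"
    "+int y; \n"     # added line with a trailing space
    "+\tint z;\n"    # added line indented with a tab
    "-int w;\t\n"    # removed lines are not checked
)
print(find_whitespace_problems(example))
```

Only lines starting with a single "+" are inspected, so whitespace that already exists in context or removed lines is not blamed on the submitter.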
Low level review
Low level here means reviewing the code line by line / function by function, without thinking too much about how the lines or functions are used.
- Are there any obvious bugs in the code?
- Is the code minimal, that is, can the same thing be done with less code? Less code generally means fewer bugs, faster code and less to read and understand …
- Is the code fast? (Does speed even matter where it is used?)
- Is the indentation consistent?
- Is the code portable, or would it fail on a big endian architecture?
- Does the change not break anything?
- Are all function/struct/… comments Doxygen compatible, so that nice HTML source code documentation can be built automatically?
- Is the code free of duplication? Duplicated code is hard to maintain and wastes space in the source, the object file and in memory at runtime.
- Does the code have proper prefixes on the names of non static functions to avoid namespace issues?
- Does the code avoid breaking the API/ABI? If it does break it, does it update the version numbers properly, and is the change really needed?
- Are all input values properly checked to prevent crashes and security issues with invalid input files?
- Are the types of large arrays minimal, so as to save space?
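Two of these points, portability across byte orders and checking input values, can be shown together in a small Python sketch. The function name read_le32 is made up for this example and the code is not taken from ffmpeg; the point is only that the byte order is stated explicitly instead of depending on the host, and that a truncated buffer is rejected instead of being read past its end.

```python
def read_le32(buf, offset):
    """Read an unsigned 32-bit little endian integer from `buf` at `offset`."""
    # Check the input first: a damaged or malicious file may be shorter than expected.
    if offset < 0 or offset + 4 > len(buf):
        raise ValueError("truncated input")
    # The byte order is given explicitly, so the result is the same on
    # little endian and big endian machines.
    return int.from_bytes(buf[offset:offset + 4], byteorder="little")


data = bytes([0x78, 0x56, 0x34, 0x12, 0xFF])
print(hex(read_le32(data, 0)))

try:
    read_le32(data, 2)  # only 3 bytes are left from offset 2
except ValueError as error:
    print("rejected:", error)
```

In C the same idea is usually expressed by assembling the value from individual bytes with shifts rather than casting a pointer to a wider integer type.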
High level review
- Is the chosen algorithm wisely chosen, or is there a faster/simpler one? Does speed or size matter more in the case where it is used …
- Can the patch be split into separate or incremental changes? Split patches are easier to review, easier to understand, and easier to debug, as they can be separately reverted and applied.
- Does the implementation make sense, or are there serious conceptual mistakes in it, like messing up codec/container separation or doing things that are not really needed?
- Are things properly split into files? (For example, decoder and encoder should be in separate files.)
- Can big and separate features be disabled at compile time?
- Do the advantages of applying the patch outweigh the disadvantages? A few patches add features no one needs and add a lot of complexity in doing so, though that is rare, and often such things can be made optional at compile time.
There are probably more things I forgot, but I am getting tired and this blog entry has become much longer than I wanted it to become …