== Motion-compensated DCT ==

=== Block motion compensation ===
{{See also|Block-matching algorithm}}
'''Block motion compensation''' (BMC), also known as motion-compensated [[discrete cosine transform]] (MC DCT), is the most widely used motion compensation technique.<ref name="Li">{{cite book |last1=Li |first1=Jian Ping |title=Proceedings of the International Computer Conference 2006 on Wavelet Active Media Technology and Information Processing: Chongqing, China, 29-31 August 2006 |date=2006 |publisher=[[World Scientific]] |isbn=9789812709998 |page=847 |url=https://books.google.com/books?id=FZiK3zXdK7sC&pg=PA847}}</ref> In BMC, the frames are partitioned into blocks of pixels (e.g. macroblocks of 16×16 pixels in [[MPEG]]). Each block is predicted from a block of equal size in the reference frame. The blocks are not transformed in any way apart from being shifted to the position of the predicted block. This shift is represented by a ''motion vector''.

To exploit the redundancy between neighboring block vectors (e.g. for a single moving object covered by multiple blocks), it is common to encode only the difference between the current and previous motion vector in the bit-stream. The result of this differencing process is mathematically equivalent to a global motion compensation capable of panning. Further down the encoding pipeline, an [[entropy encoding|entropy coder]] takes advantage of the resulting statistical distribution of the motion vectors around the zero vector to reduce the output size.

It is possible to shift a block by a non-integer number of pixels, which is called ''sub-pixel precision''. The in-between pixels are generated by interpolating neighboring pixels. Commonly, half-pixel or quarter-pixel precision ([[Qpel]], used by H.264 and MPEG-4/ASP) is used. The computational expense of sub-pixel precision is much higher because of the extra processing required for interpolation and, on the encoder side, the much greater number of potential source blocks to be evaluated.

The main disadvantage of block motion compensation is that it introduces discontinuities at the block borders (blocking artifacts). These artifacts appear in the form of sharp horizontal and vertical edges which are easily spotted by the human eye, and they produce false edges and ringing effects (large coefficients in high-frequency sub-bands) due to quantization of the coefficients of the [[List of Fourier-related transforms|Fourier-related transform]] used for [[transform coding]] of the [[residual frame]]s.<ref>Zeng, Kai, et al. "Characterizing perceptual artifacts in compressed video streams." IS&T/SPIE Electronic Imaging. International Society for Optics and Photonics, 2014.</ref>

Block motion compensation divides up the ''current'' frame into non-overlapping blocks, and the motion compensation vector tells where those blocks come ''from'' (a common misconception is that the ''previous frame'' is divided up into non-overlapping blocks, and the motion compensation vectors tell where those blocks move ''to''). The source blocks typically overlap in the source frame.
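As a rough illustration of the block-copying step described above, the following sketch assembles a predicted frame from a reference frame and a set of integer motion vectors, and also shows the differencing of neighboring vectors before entropy coding. The function names, the per-block vector dictionary, and the use of NumPy are illustrative assumptions, not part of any standard; frame dimensions are assumed to be exact multiples of the block size.

<syntaxhighlight lang="python">
import numpy as np

def predict_frame_bmc(reference, motion_vectors, block_size=16):
    """Assemble a motion-compensated prediction of the current frame.

    reference      : 2-D array holding one plane of the reference frame
                     (dimensions assumed to be multiples of block_size)
    motion_vectors : dict mapping (block_row, block_col) -> (dy, dx), the
                     integer offset of the source block in the reference frame
    block_size     : side length of the square blocks (16 for an MPEG macroblock)
    """
    h, w = reference.shape
    prediction = np.empty_like(reference)
    for by in range(0, h, block_size):
        for bx in range(0, w, block_size):
            dy, dx = motion_vectors.get((by // block_size, bx // block_size), (0, 0))
            # Clamp the source position so the copied block stays inside the frame.
            sy = min(max(by + dy, 0), h - block_size)
            sx = min(max(bx + dx, 0), w - block_size)
            prediction[by:by + block_size, bx:bx + block_size] = \
                reference[sy:sy + block_size, sx:sx + block_size]
    return prediction

def difference_motion_vectors(vectors_in_scan_order):
    """Replace each motion vector by its difference to the previous one, so that
    the entropy coder sees values clustered around (0, 0)."""
    diffs, prev = [], (0, 0)
    for dy, dx in vectors_in_scan_order:
        diffs.append((dy - prev[0], dx - prev[1]))
        prev = (dy, dx)
    return diffs
</syntaxhighlight>

The encoder would subtract such a prediction from the actual current frame and transform-code only the residual, together with the differenced motion vectors.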
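The interpolation needed for sub-pixel precision can likewise be sketched with simple bilinear averaging. Actual codecs use longer interpolation filters, so this is only meant to convey the idea; the function name and the half-pel coordinate convention are assumptions made for illustration.

<syntaxhighlight lang="python">
import numpy as np

def sample_half_pel(reference, hp_row, hp_col):
    """Fetch one sample at half-pixel precision from a 2-D reference plane.

    hp_row and hp_col are given in half-pel units, so (7, 3) addresses the
    position (row 3.5, column 1.5).  In-between values are produced by
    bilinear averaging of the four surrounding integer-position pixels.
    """
    r0, c0 = hp_row // 2, hp_col // 2         # integer part of the position
    fr, fc = hp_row % 2, hp_col % 2           # 0 = full-pel, 1 = half-pel offset
    r1 = min(r0 + 1, reference.shape[0] - 1)  # clamp at the frame border
    c1 = min(c0 + 1, reference.shape[1] - 1)
    a, b = reference[r0, c0], reference[r0, c1]
    c, d = reference[r1, c0], reference[r1, c1]
    # Bilinear weights: each half-pel offset splits the weight evenly
    # between the two neighboring integer positions.
    return ((2 - fr) * (2 - fc) * a + (2 - fr) * fc * b +
            fr * (2 - fc) * c + fr * fc * d) / 4.0
</syntaxhighlight>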
Some video compression algorithms assemble the current frame out of pieces of several different previously transmitted frames.

Frames can also be predicted from future frames. The future frames then need to be encoded before the predicted frames, and thus the encoding order does not necessarily match the real frame order. Such frames are usually predicted from two directions, i.e. from the I- or P-frames that immediately precede or follow the predicted frame. These bidirectionally predicted frames are called [[Video compression picture types|''B-frames'']]. A coding scheme could, for instance, be IBBPBBPBBPBB.

Further, the use of triangular tiles has also been proposed for motion compensation. Under this scheme, the frame is tiled with triangles, and the next frame is generated by performing an affine transformation on these triangles.<ref>Aizawa, Kiyoharu, and Thomas S. Huang. "Model-based image coding advanced video coding techniques for very low bit-rate applications." Proceedings of the IEEE 83.2 (1995): 259-271.</ref> Only the affine transformations are recorded/transmitted. This approach can deal with zooming, rotation, translation, etc.

=== Variable block-size motion compensation ===
'''Variable block-size motion compensation''' (VBSMC) is the use of BMC with the ability for the encoder to dynamically select the size of the blocks. When coding video, the use of larger blocks can reduce the number of bits needed to represent the motion vectors, while the use of smaller blocks can result in a smaller amount of prediction residual information to encode. Other areas of work have examined the use of variable-shape feature metrics, beyond block boundaries, from which interframe vectors can be calculated.<ref>{{Cite book|last=Garnham|first=Nigel W.|title=Motion Compensated Video Coding - PhD Thesis|publisher=University of Nottingham|year=1995|url=http://eprints.nottingham.ac.uk/13447/1/thesis.pdf|oclc=59633188}}</ref> Older designs such as [[H.261]] and [[MPEG-1]] video typically use a fixed block size, while newer ones such as [[H.263]], [[MPEG-4 Part 2]], [[H.264/MPEG-4 AVC]], and [[VC-1]] give the encoder the ability to dynamically choose the block size used to represent the motion.

=== Overlapped block motion compensation ===
'''Overlapped block motion compensation''' (OBMC) is a good solution to the problems described above because it not only increases prediction accuracy but also avoids blocking artifacts. When using OBMC, blocks are typically twice as big in each dimension and overlap quadrant-wise with all 8 neighbouring blocks. Thus, each pixel belongs to 4 blocks. In such a scheme, there are 4 predictions for each pixel, which are combined into a weighted mean. For this purpose, blocks are associated with a window function that has the property that the sum of the 4 overlapped windows is equal to 1 everywhere.

Studies of methods for reducing the complexity of OBMC have shown that the contribution to the window function is smallest for the diagonally adjacent block. Reducing the weight for this contribution to zero and increasing the other weights by an equal amount leads to a substantial reduction in complexity without a large penalty in quality. In such a scheme, each pixel then belongs to 3 blocks rather than 4, and rather than using 8 neighboring blocks, only 4 are used for each block to be compensated. Such a scheme is found in the [[H.263]] Annex F Advanced Prediction mode.
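The sum-to-one constraint on the OBMC window function can be demonstrated with a separable triangular window. This is only a sketch under assumed names: codecs that use OBMC, such as H.263 Annex F, specify their own fixed weighting matrices, and the blocks here are assumed to lie entirely inside the frame.

<syntaxhighlight lang="python">
import numpy as np

def obmc_window(n=8):
    """Separable triangular window for 2n x 2n blocks placed on an n-pixel grid.

    In one dimension w[i] + w[i + n] == 1, so the four windows that overlap at
    any pixel (blocks shifted by n horizontally and vertically) sum to 1.
    """
    ramp = (np.arange(n) + 0.5) / n            # rising half of the window
    w1d = np.concatenate([ramp, ramp[::-1]])   # symmetric rise and fall
    return np.outer(w1d, w1d)                  # 2-D window with its peak in the centre

def obmc_blend(block_predictions, frame_shape, n=8):
    """Blend overlapping per-block predictions into one predicted frame.

    block_predictions : dict mapping the top-left corner (row, col) of each
                        2n x 2n predicted block (corners on an n-pixel grid)
                        to that block's pixel values
    frame_shape       : (height, width) of the output frame
    """
    frame = np.zeros(frame_shape)
    window = obmc_window(n)
    for (r, c), block in block_predictions.items():
        # Each interior pixel accumulates contributions from four blocks,
        # whose window weights sum to exactly 1.
        frame[r:r + 2 * n, c:c + 2 * n] += window * block
    return frame
</syntaxhighlight>

The property itself is easy to check: with <code>w = obmc_window(8)</code>, the expression <code>w[:8, :8] + w[:8, 8:] + w[8:, :8] + w[8:, 8:]</code> is an array of ones, i.e. the weights that the four overlapping blocks assign to any one pixel add up to 1.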
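For the triangle-based scheme mentioned earlier, the only motion data recorded per triangle is an affine transformation. A minimal sketch of recovering that transformation from three vertex correspondences follows; the function name and the use of NumPy's linear solver are illustrative assumptions (a degenerate, collinear triangle would make the system unsolvable).

<syntaxhighlight lang="python">
import numpy as np

def affine_from_triangle(src_vertices, dst_vertices):
    """Solve for the affine map  p_dst = A @ p_src + t  that carries the three
    vertices of a triangle in the reference frame onto the corresponding
    vertices in the current frame (six parameters per triangle)."""
    src = np.asarray(src_vertices, dtype=float)   # (3, 2) vertex coordinates
    dst = np.asarray(dst_vertices, dtype=float)   # (3, 2) vertex coordinates
    # Each vertex correspondence contributes two linear equations in the six unknowns.
    m = np.hstack([src, np.ones((3, 1))])         # rows of the form [x, y, 1]
    params = np.linalg.solve(m, dst)              # (3, 2) solution matrix
    A = params[:2].T                              # 2 x 2 part: rotation, scale, shear
    t = params[2]                                 # translation
    return A, t

# Every pixel inside the triangle is then warped with the same parameters:
#   warped_position = A @ np.array([x, y]) + t
</syntaxhighlight>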