This is sort of a follow-up to your previous questions about the difference between imresize in MATLAB and cv::resize in OpenCV when using bicubic interpolation.
I was interested myself in finding out why there's a difference. These are my findings (as I understood the algorithms, please correct me if I make any mistakes).
Think of resizing an image as a planar transformation from an input image of size M-by-N to an output image of size scaledM-by-scaledN.
The problem is that these points do not necessarily fall on the discrete grid, so to obtain the intensity of a pixel in the output image we need to interpolate the values of some of the neighboring samples (this is usually performed in the reverse order: for each output pixel, we find the corresponding non-integer point in the input space and interpolate around it).
This is where interpolation algorithms differ: each chooses a different neighborhood size and different weight coefficients given to each point in that neighborhood. The relationship can be of first or higher order (the variable involved being the distance from the inverse-mapped non-integer sample to the discrete points on the original image grid). Typically you assign higher weights to closer points.
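To make the reverse-mapping idea concrete, here is a minimal sketch (hypothetical code for illustration, not how imresize is actually organized) that inverse-maps every output pixel into the input grid, with nearest-neighbor rounding standing in for the interpolation step:

in = magic(4);                          % example input image
scale = 2;
[M,N] = size(in);
out = zeros(round(M*scale), round(N*scale));
for r = 1:size(out,1)
    for c = 1:size(out,2)
        % inverse-map output coordinates into input space
        % (assuming the usual half-pixel alignment convention)
        u = (r - 0.5)/scale + 0.5;
        v = (c - 0.5)/scale + 0.5;
        % a real implementation would interpolate around (u,v); here
        % we just snap to the nearest sample, clamped to the borders
        out(r,c) = in(min(max(round(u),1),M), min(max(round(v),1),N));
    end
end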
Looking at imresize in MATLAB, here are the weight functions for the linear and cubic kernels:
function f = triangle(x)
    % or simply: 1-abs(x) for x in [-1,1]
    f = (1+x) .* ((-1 <= x) & (x < 0)) + ...
        (1-x) .* ((0 <= x) & (x <= 1));
end

function f = cubic(x)
    absx = abs(x);
    absx2 = absx.^2;
    absx3 = absx.^3;
    f = (1.5*absx3 - 2.5*absx2 + 1) .* (absx <= 1) + ...
        (-0.5*absx3 + 2.5*absx2 - 4*absx + 2) .* ((1 < absx) & (absx <= 2));
end
(These basically return the interpolation weight of a sample based on how far it is from an interpolated point.)
This is what these functions look like:
>> subplot(121), ezplot(@triangle,[-2 2]) % triangle
>> subplot(122), ezplot(@cubic,[-3 3]) % Mexican hat
Note that the linear kernel (piece-wise linear on the intervals [-1,0] and [0,1], zero elsewhere) works on the 2 neighboring points, while the cubic kernel (piece-wise cubic on the intervals [-2,-1], [-1,1], and [1,2], zero elsewhere) works on the 4 neighboring points.
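You can also see the supports directly by evaluating the kernels at a few distances (assuming the two functions above are defined); the weights are exactly zero outside [-1,1] and [-2,2], respectively:

>> triangle([-1.5 -0.5 0.5 1.5])
ans =
         0    0.5000    0.5000         0
>> cubic([-2.5 -1.5 -0.5 0.5 1.5 2.5])
ans =
         0   -0.0625    0.5625    0.5625   -0.0625         0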
Here is an illustration for the 1-dimensional case, showing how the value at a non-integer location x is interpolated from the discrete samples f(x_k) using a cubic kernel.

The kernel function h(x) is centered at x, the location of the point to be interpolated. The interpolated value f(x) is the weighted sum of the discrete neighboring points (2 to the left and 2 to the right), scaled by the value of the interpolation function at those discrete points.
Say the distance from x to the nearest sample on its left (x2 below) is d, with 0 <= d < 1; the interpolated value at location x will then be:

f(x) = f(x1)*h(-d-1) + f(x2)*h(-d) + f(x3)*h(-d+1) + f(x4)*h(-d+2)

where the order of the points is depicted below (note that x(k+1)-x(k) = 1, and that h is symmetric, so h(-d) = h(d) and so on):
x1       x2  x    x3       x4
o--------o---+----o--------o
         \___/
       distance d
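As a quick sanity check, here is that formula evaluated with the cubic function from above for an arbitrary offset (the sample values are made up for the example; note that the four weights sum to 1, so constant signals are reproduced exactly):

fk = [1 2 4 3];                      % example samples f(x1)..f(x4)
d  = 0.3;                            % distance from x2 to x
w  = cubic([-d-1, -d, -d+1, -d+2]);  % the four kernel weights
fx = fk * w(:)                       % f(x), the weighted sum
sum(w)                               % always 1 for this kernel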
Now since the points are discrete and sampled at uniform intervals, and the kernel width is usually small, the interpolation can be formulated concisely as a convolution operation:

f(x) = Σ_k f(x_k) * h(x - x_k)

where the sum effectively runs only over the few samples x_k that fall inside the kernel's support.
The concept extends to 2 dimensions simply by first interpolating along one dimension, and then interpolating across the other dimension using the results of the previous step.
Here is an example of bilinear interpolation, which in 2D considers the 4 neighboring points.

Bicubic interpolation in 2D uses 16 neighboring points. First we interpolate along the rows (the red points) using the 16 grid samples (pink), then we interpolate along the other dimension (the red line) using the interpolated points from the previous step. In each step a regular 1D interpolation is performed; the fully expanded equations are too long and complicated for me to work out by hand!
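Even though the expanded equations are unwieldy, the two-pass procedure itself is short. Here is a sketch of it (reusing the cubic function from above; the 4x4 neighborhood and the offsets are made-up example values), interpolating one point from its 4x4 neighborhood F at fractional offsets (dx,dy):

F  = magic(4);         % example 4x4 neighborhood of samples
dx = 0.4; dy = 0.7;    % fractional offsets, 0 <= dx,dy < 1
wx = cubic([-dx-1, -dx, -dx+1, -dx+2]);  % weights along a row
wy = cubic([-dy-1, -dy, -dy+1, -dy+2]);  % weights along a column
rows  = F * wx(:);     % pass 1: interpolate each of the 4 rows at dx
value = wy * rows      % pass 2: interpolate across the rows at dy

(Equivalently, in one expression: value = wy * F * wx(:).)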
Now if we go back to the cubic function in MATLAB, it actually matches the definition of the convolution kernel shown in the reference paper as equation (4). Here is the same thing taken from Wikipedia:

h(x) = (a+2)*|x|^3 - (a+3)*|x|^2 + 1        for |x| <= 1
       a*|x|^3 - 5*a*|x|^2 + 8*a*|x| - 4*a  for 1 < |x| < 2
       0                                    otherwise
You can see that in the above definition, MATLAB chose a value of a = -0.5.

Now, the difference between the implementation in MATLAB and the one in OpenCV is that OpenCV chose a value of a = -0.75:
static inline void interpolateCubic( float x, float* coeffs )
{
    const float A = -0.75f;

    // the four kernel weights h(x+1), h(x), h(1-x), h(2-x); the last
    // one exploits the fact that the weights always sum to 1
    coeffs[0] = ((A*(x + 1) - 5*A)*(x + 1) + 8*A)*(x + 1) - 4*A;
    coeffs[1] = ((A + 2)*x - (A + 3))*x*x + 1;
    coeffs[2] = ((A + 2)*(1 - x) - (A + 3))*(1 - x)*(1 - x) + 1;
    coeffs[3] = 1.f - coeffs[0] - coeffs[1] - coeffs[2];
}
This might not be obvious right away, but the code does compute the four terms of the cubic convolution function (they are listed right after equation (25) in the paper). We can verify that with the help of the Symbolic Math Toolbox:
A = -0.5;    % MATLAB's choice of the free parameter
syms x
c0 = ((A*(x + 1) - 5*A)*(x + 1) + 8*A)*(x + 1) - 4*A;
c1 = ((A + 2)*x - (A + 3))*x*x + 1;
c2 = ((A + 2)*(1 - x) - (A + 3))*(1 - x)*(1 - x) + 1;
c3 = 1 - c0 - c1 - c2;
Those expressions can be rewritten as:
>> expand([c0;c1;c2;c3])
ans =
- x^3/2 + x^2 - x/2
(3*x^3)/2 - (5*x^2)/2 + 1
- (3*x^3)/2 + 2*x^2 + x/2
x^3/2 - x^2/2
which match the four pieces of the kernel h defined above, evaluated at x+1, x, 1-x, and 2-x with a = -0.5.
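As an extra numeric check (again assuming the cubic function from earlier), the four coefficients computed OpenCV-style with A = -0.5 coincide with the MATLAB kernel evaluated at the four neighbor distances:

A = -0.5; d = 0.3;
c = zeros(1,4);
c(1) = ((A*(d + 1) - 5*A)*(d + 1) + 8*A)*(d + 1) - 4*A;
c(2) = ((A + 2)*d - (A + 3))*d^2 + 1;
c(3) = ((A + 2)*(1 - d) - (A + 3))*(1 - d)^2 + 1;
c(4) = 1 - c(1) - c(2) - c(3);
max(abs(c - cubic([-d-1, -d, -d+1, -d+2])))   % ~0, up to round-off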
Obviously, the difference between MATLAB and OpenCV boils down to using a different value for the free parameter a. According to the authors of the paper, a value of a = -0.5 is the preferred choice, because it implies better properties for the approximation error than any other choice for a.
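If you want the two to produce the same output, one option (a sketch relying on imresize's documented support for custom kernels passed as a {f,w} cell array) is to hand imresize a cubic kernel built with a = -0.75; when shrinking you will likely also want 'Antialiasing',false, since cv::resize does not prefilter:

a = -0.75;   % OpenCV's choice of the free parameter
cubicCV = @(x) ((a+2)*abs(x).^3 - (a+3)*abs(x).^2 + 1) .* (abs(x) <= 1) + ...
    (a*abs(x).^3 - 5*a*abs(x).^2 + 8*a*abs(x) - 4*a) .* ...
    ((1 < abs(x)) & (abs(x) <= 2));
img = im2double(imread('cameraman.tif'));   % any example image
out = imresize(img, 0.5, {cubicCV, 4}, 'Antialiasing', false);

Even then, exact agreement at the image borders is not guaranteed, since the two libraries may pad the edges differently.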