Imagine Kernel Internals

Saturday, June 15th, 2013

Imagine is an image processing framework that I began writing in 2009. The framework makes it easy to load, manipulate, and analyze images and perform lots of common tasks. I've used imagine in several of my projects involving graphics, games, computer vision, and artificial intelligence. There are a lot of components in the framework that we could cover in this post, but I'd like to focus on one particular area: image resampling.


Image resampling is the process of resizing an image by sampling it at discrete intervals. This process typically produces a higher quality result than basic scaling because it uses more sophisticated processing and may perform sub-pixel interpolation to improve visual quality.

Imagine offers a number of different options for resampling, each with their own benefits and trade-offs. For the remainder of this post we'll walk through the supported sampler kernels in detail. Also, don't forget to check out the accompanying source code for this post.


Nearest Kernel

   NearestKernel simply selects the requested pixel closed to the specified coordinates from
   the source image and places it within the output buffer.

   More formally, our 1D kernel is defined as:
      
            {  image( floor( x ) )        for x - floor(x) < 0.5
      f(x): { 
            {  image( floor( x ) + 1 )    otherwise  

   Pros:

      o: Speed -- this is one of the fastest kernels in the library

      o: Preservation of noise distribution
 
   Cons: 

      o: Results in obvious blocking artifacts due to strong discontinuities created when 
         resizing or rotating the image

      o: No sub-pixel estimation


Average Kernel

   AverageKernel performs a simple *unweighted* averaging of a source image based on a 
   provided kernel radius.

   Pros:

      o: Much faster than the cubic kernels, with performance comparable to the bilinear
         kernel.

      o: Useful if a simple blurring operation is needed.

   Cons:

      o: Generally lower quality than almost any other kernel.


Bilinear Kernel

   BilinearKernel performs three weighted average computations in order to interpolate four 
   pixels from the source image and produce the output pixel value. 

   The 2D kernel is defined as:

     x1 = floor( x )                f11 = image( x1, y1 )
     x2 = x1 + 1                    f12 = image( x1, y2 )

     y1 = floor( y )                f21 = image( x2, y1 )
     y2 = y1 + 1                    f22 = image( x2, y2 )

     fy1 = f11 + [ ( f21 - f11 ) / ( x2 - x1 ) ] * ( x - x1 )
     fy2 = f12 + [ ( f22 - f12 ) / ( x2 - x1 ) ] * ( x - x1 )

     f( x, y ) = fy1 + [ ( fy2 - fy1 ) / ( y2 - y1 ) ] * ( y - y1 )

   Pros:

      o: Noticable improvement over NearestKernel

      o: Reasonable quality when resizing to 4x or 1/4th the resolution. For this reason,
         it is implemented in most modern GPUs and used for real-time bilinear texture 
         sampling.

   Cons:

      o: Poor preservation of image detail (may blur the image)

      o: Strong aliasing artifacts when resizing or rotating an image


Coverage Kernel

   CoverageKernel attempts to balance the benefits of the bilinear kernel with that of the 
   nearest kernel, while also providing a quality improvement over both.
   
   The kernel attempts to calculate, for each destination pixel, the set of source pixels 
   that should contribute to it. For each of these source pixels, a weight is determined 
   based on the percentage of the destination pixel that is covered by the source pixel.

   For minification operations, this kernel effectively performs a distance based weighted
   average over the set of contributing source pixels. For magnification, the kernel performs 
   a nearest sampling if the destination pixel is fully covered by a single source pixel, 
   and (bi)linear interpolation otherwise.

   This kernel is designed for image resizing operations and generally performs better than
   the bilinear kernel for minimization.

   Note that when filtering 2D images with this kernel the processing should occur in two
   passes whenever possible as it will generate a higher quality result than a single combined
   pass. 

   Pros:

      o: Much faster than the more expensive cubic kernels, with speed roughly equivalent to
         that of the bilinear kernel.

      o: The kernel will copy source data to a destination image if no resizing is requested.

      o: For minification, the kernel typically reduces jagged edges better than nearest
         and bilinear kernels, and may not discard as much (high frequency) information.

      o: For magnification, the kernel decides whether to directly sample a source pixel 
         (if the kernel deems its coverage to be 100%), or to interpolate 

   Cons:
      
      o: During minification, the kernel may overly blur the image.         


Bicubic (Generic)

   BicubicKernel samples the surrounding sixteen pixels in an image and computes a 
   weighted average using a 2D cubic function. The shape of the cubic curve has a 
   dramatic effect on the quality of the kernel. 

   The kernel is sampled with the following:
 
              {  6x^3 - 12x^2 + 6                              for 0 <= abs( x ) < 1
      f(x) =  { -6x^3 + 30x^2 -48x + 24                        for 1 <= abs( x ) < 2
              { 0                                              otherwise
   
   Pros:

     o: Better reconstruction than bilinear for greater scale changes

   Cons: 

     o: Significantly slower than the linear kernels

     o: Produces halo and ringing artifacts

     o: Selection of this kernel requires careful testing and comparison
        with each particular operator and image. 


Catmull-Rom Bicubic Kernel

   Similar to BicubicKernel, this kernel interpolates 16 source pixels to produce
   the output pixel. The cubic function used for this kernel is similar to the following:

              {  9x^3 - 15x^2 + 6                              for 0 <= abs( x ) < 1
      f(x) =  { -3x^3 + 15x^2 -24x + 12                        for 1 <= abs( x ) < 2
              { 0                                              otherwise
   
   Pros:

     o: Better reconstruction than bilinear for greater scale changes

     o: Sharper results than generic bicubic and bicubic b-spline

     o: Fewer halo artifacts than generic bicubic

   Cons: 

     o: B-spline may produce smoother results for certain images
     
     o: Selection of this kernel requires careful testing and comparison
        with each particular operator and image. 


Mitchell-Netravali Bicubic Kernel

   Similar to BicubicKernel, this kernel interpolates 16 source pixels to produce
   the output pixel. The cubic function used for this kernel is similar to the following:

              {  7x^3 - 12x^2 + 5.33                           for 0 <= abs( x ) < 1
      f(x) =  { -2.33x^3 + 12x^2 -20x + 10.66                  for 1 <= abs( x ) < 2
              { 0                                              otherwise
   
   Pros:

     o: Softer results than Catmull-Rom

     o: Crisper results than cubic b-spline

   Cons: 

     o: Selection of this kernel requires careful testing and comparison
        with each particular operator and image. 


Cubic B-Spline Bicubic Kernel

   Similar to BicubicKernel, this kernel interpolates 16 source pixels to produce
   the output pixel. The cubic function used for this kernel is similar to the following:

              {  3x^3 - 6x^2 + 4                               for 0 <= abs( x ) < 1
      f(x) =  { -x^3 + 6x^2 -12x + 8                           for 1 <= abs( x ) < 2
              { 0                                              otherwise
   
   Pros:

     o: Quality improvement over linear

   Cons: 

     o: Often results in a significant amount of blurring

     o: Selection of this kernel requires careful testing and comparison
        with each particular operator and image. 


Cubic Spline Bicubic Kernel

   Similar to BicubicKernel, this kernel interpolates 16 source pixels to produce
   the output pixel. The cubic function used for this kernel is similar to the following:

              {  1.5x^3 - 2.5x^2 + 1                           for 0 <= abs( x ) < 1
      f(x) =  { -0.5^3 + 2.5x^2 - 4x + 2                       for 1 <= abs( x ) < 2
              { 0                                              otherwise
   
   Pros:

     o: Usually a safe bet for resizing operations

   Cons: 

     o: Non-separable filter (although this may be overridden by applying
        two 1D kernels in succession, but the results may deviate from the
        intended quality of this kernel.

     o: Selection of this kernel requires careful testing and comparison
        with each particular operator and image. 


Lanczos Kernel

   This kernel interpolates (2n)^2 source pixels to produce the output kernel, where
   n is the sampling radius. The sampling function used for this kernel is similar to 
   the following:

              { 1                                              for x = 0
      sinc(x) {
              { sin( pi * x ) / ( pi * x )                     otherwise

              {  sinc(x) * sinc( x / n )                       for abs( x ) <= n
      f(x) =  { 
              { 0                                              otherwise
   
   Pros:

     o: Usually a safe bet for resizing operations

   Cons: 

     o: Non-separable filter (although this may be overridden by applying
        two 1D kernels in succession, but the results may deviate from the
        intended quality of this kernel.

     o: Selection of this kernel requires careful testing and comparison
        with each particular operator and image.