According to the article "IIR Gaussian Blur Filter Implementation using Intel® Advanced Vector Extensions",
the AVX implementation should be faster than the SSE one. However, my performance measurements show the opposite:
The computer supports AVX
number CPU in the system = 4
IIR Gaussian Filter Coefficients are:
a0 = 0.021175, a1 = -0.017807, a2 = 0.021103, a3 = -0.017875, b1 = -1.837578, b2 = 0.844174, cprev = 0.510583, cnext = 0.489409
image width = 1024, height = 1024
Running multi threaded SSE code
Running multi threaded AVX code
SSE and AVX Implementation matches
Performance Measurement:
SSE horizontal Pass min: 4.94052 max: 109.795 avg: 6.97836
SSE vertical Pass min: 3.32723 max: 89.6741 avg: 4.52679
AVX horizontal Pass min: 33.0741 max: 159.732 avg: 43.4993
AVX vertical Pass min: 9.69314 max: 162.726 avg: 14.5814
My OS is Windows 7 64-bit.
My CPU is an Intel(R) Core(TM) i5-3230M CPU @ 2.6GHz.
My IDE is VS2013, with the OpenMP option enabled.
Why is my AVX code so slow?
Can anyone help me understand this?
Thank you very much.