Hello.
I've recently recompiled an old C program which mainly uses integers, with VC 2013 and ICC 13.1 compilers and was initially compiled targeting SSE2 architecture using VC 2010 compiler.
I targeted AVX (version 1.0) architecture in the compiler options for my SandyBridge and I saw a massive speedup of 68% with both compilers (VC 2013 and ICC 13.1) compared to old SSE2 optimizations.
I recompiled the C code targeting this time AVX2 for my Haswell but the speedup was the same like AVX for Haswell for both compilers.
My questions:
1) How is it possible for an integer based C code, both VC 2013 and ICC 13.1 compilers to produce a huge speedup of approximately 70% using AVX1 target architecture ?
2) Is there a way for a compiler to see the 256 bit AVX units as integer SIMD units and almost double the performance of integer code compared to SSE2 ? Or is it the SSE2 code running on 256 bit execution units ? Or is it something else ?
3) Why there was no speedup (1%) for AVX2 compared to AVX optimizations for integer code using Haswell processor ?
Thanks!