Channel: Intel® Software - Intel ISA Extensions

Confusion about the behavior of the _mm256_load_ps and _mm256_loadu_ps intrinsics

Hi all,

I ran a quick test to understand the behavior of the _mm256_load_ps and _mm256_loadu_ps SIMD intrinsics, and the result is quite unexpected.

I am wondering whether this could be a bug.

When I load a register from an unaligned address with _mm256_load_ps, I expect to hit a general-protection exception, whereas _mm256_loadu_ps should accept unaligned addresses without faulting.
However, I see no exception when using the aligned-load intrinsic. For instance, in the code below I would expect an exception on the second iteration: the index advances by one float (4 bytes), so &a[1] cannot be 32-byte aligned if &a[0] is.

for (i = 0; i < size; i += 1)
{
    t0 = _mm256_load_ps(&a[i]);
    t1 = _mm256_load_ps(&b[i]);
    t2 = _mm256_add_ps(t0, t1);
    _mm256_store_ps(&c[i], t2);
}

This seems to be the case irrespective of whether the arrays a, b, and c are aligned or unaligned.
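For reference, here is a minimal sketch of the alignment arithmetic behind that expectation. It is not the attached file and it issues no AVX instructions; it only checks which element addresses fall on a 32-byte boundary. The helper names is_aligned_32, alloc_floats_32, and count_aligned_32 are hypothetical, and the sketch assumes a 4-byte float and a C11 library that provides aligned_alloc.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: nonzero if p lies on a 32-byte boundary, the
   alignment the aligned 256-bit load/store intrinsics are documented
   to require. */
static int is_aligned_32(const void *p)
{
    return ((uintptr_t)p % 32u) == 0;
}

/* Hypothetical helper: allocate n floats starting on a 32-byte boundary.
   C11 aligned_alloc expects the size to be a multiple of the alignment,
   so the byte count is rounded up. Release the result with free(). */
static float *alloc_floats_32(size_t n)
{
    size_t bytes = n * sizeof(float);
    size_t rounded = (bytes + 31u) / 32u * 32u;
    return aligned_alloc(32, rounded);
}

/* Count how many element addresses &base[i], for i in [0, n),
   are 32-byte aligned. */
static size_t count_aligned_32(const float *base, size_t n)
{
    assert(base != NULL);
    size_t hits = 0;
    for (size_t i = 0; i < n; i++) {
        if (is_aligned_32(&base[i])) {
            hits++;
        }
    }
    return hits;
}
```

With a buffer obtained from alloc_floats_32, only every eighth element address passes is_aligned_32, so a loop that advances by one float hands a misaligned address to the aligned intrinsics on seven out of every eight iterations.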

Is there any documentation I could refer to that explains this behavior and the performance implications of such unaligned accesses?

The full code is attached below.

Thanks,

Aketh

Attachment: SIMD_intrinsics.c (text/x-csrc, 851 bytes)
