Intel® Software - Intel ISA Extensions

Is it OK to create an array of __m256i?

May 10, 2015, 7:07 am

Hi all! I am parallelizing a certain dynamic programming problem using AVX2. In the main iteration of my calculation, I compute a column of a matrix in which each cell is an AVX2 register (__m256i). I...

Alignment requirements for _mm256_maskload_pd

May 11, 2015, 12:41 am

Hi, Are there any alignment requirements (beyond 8 bytes) for _mm256_maskload_pd, and likewise for _mm256_maskstore_pd? Thanks

Extract non-zero byte from __m128i

May 16, 2015, 4:14 pm

Hi, I have 4 __m128i 64byte elements which can contain 0 or non-zero (positive or negative) values. I want to extract the non-zero values from them. I looked at _mm_extract_epi8/_mm_extract_epi16 but the syntax is int...

_mm256_shuffle_epi8

May 25, 2015, 7:14 am

Hi, I am going through the documentation for _mm256_shuffle_epi8 https://software.intel.com/sites/products/documentation/doclib/iss/2013/... The pseudo code shows only up to 16 bytes... for (i = 0; i < 16;...
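
As a hedged aside on the question above: the 256-bit form of this shuffle is commonly described as two independent 16-byte shuffles, one per 128-bit lane, which would explain a pseudo code loop that covers only 16 bytes. The following is a minimal plain-Python model of that per-lane behavior, not the intrinsic itself. The function name shuffle_epi8_256 and the list-of-bytes representation are assumptions made for illustration.

```python
def shuffle_epi8_256(a, b):
    """Model of a per-lane byte shuffle on two 32-byte sequences.

    a: source bytes, b: control bytes. Hypothetical helper for illustration.
    """
    if len(a) != 32 or len(b) != 32:
        raise ValueError("both operands must be 32 bytes long")
    dst = [0] * 32
    for lane in (0, 16):              # low 128-bit lane, then high lane
        for i in range(16):
            ctrl = b[lane + i]
            if ctrl & 0x80:           # high bit of control set: zero the byte
                dst[lane + i] = 0
            else:                     # low 4 bits select a byte in the SAME lane
                dst[lane + i] = a[lane + (ctrl & 0x0F)]
    return dst


if __name__ == "__main__":
    src = list(range(32))
    reverse_each_lane = [15 - i for i in range(16)] * 2
    print(shuffle_epi8_256(src, reverse_each_lane))
```

In this model a control byte can never pull data across the 128-bit boundary, so reversing all 32 bytes would need an additional cross-lane step.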

Alignment requirement for pcmpistri

June 2, 2015, 4:38 am

Hi, I'm testing a custom implementation of strcmp() which involves SSE4.2 and this instruction in particular: pcmpistri $0x18,(%rsi,%rax,1),%xmm1. I've made a test that passes unaligned pointers to the...

Why is my AVX slower than SSE?

June 2, 2015, 9:18 am

According to the article "IIR Gaussian Blur Filter Implementation using Intel® Advanced Vector Extensions", AVX should be faster than SSE. But my performance measurements are as follows: The...

SGX EGETKEY clarification?

June 11, 2015, 9:52 am

I've been looking at a variety of things with SGX, and while looking into the EGETKEY description, I think I've found an inconsistency in the October 2014 spec. Specifically: Table 5-43 says that the...

Dynamic Shift

June 25, 2015, 3:59 am

Hello, I am trying to achieve a dynamic shift. Well, let me explain the task. I process data with SSE, AVX. Data gets loaded, worked with and later results are stored. To support arbitrary lengths, I...

Issue with the APIC dropping MSI-X interrupts

June 28, 2015, 6:27 pm

Hello, I have a difficult problem. The scenario is as follows: the hardware environment is an Intel(R) Xeon(R) CPU E5-2609 v2 @ 2.50GHz and an Altera FPGA board. The OS is Linux debian-rss 3.16.7-ckt7. The FPGA creates 32 DMA...

Guaranteed atomic operation clarification

June 29, 2015, 8:52 pm

Hello, I'm trying to understand a line in the Intel Architecture manual. It's a description of a memory operation that is guaranteed to be atomic. The line is at Chapter 8, Section 8.1.1 "Guaranteed...

small typo in Intel® 64 and IA-32 Architectures Software Developer’s Manual

June 30, 2015, 2:31 am

Hi, It seems that there is a small typo in the Intel® 64 and IA-32 Architectures Software Developer’s Manual (Order Number: 253665-054US April 2015), page 3-149 (cmpss instruction): 128-bit Legacy SSE...

MPX instructions not in the Appendix A opcode map

July 1, 2015, 2:28 pm

Hi, In the latest release (55) of the Intel® 64 and IA-32 Architectures Software Developer’s Manual, in Vol. 2C, A-11, we can't see the MPX instructions. In fact, I usually use opcode maps to find instructions...

Ooops - wrong instruction description in volume 2 of the SDM

July 2, 2015, 11:39 am

Looking at the new version of Volume 2 of the SDM (document 325383-055), I just noticed that the "Description" field for the VINSERTF128 instruction (page 4-514) is incorrect. It appears to have been...

Processor Trace decoding support library for Atom

July 6, 2015, 11:26 pm

Dear Intel guru, Could I ask whether libipt on GitHub will support decoding small-core (Atom) Processor Trace packets (PT pkt)? Or is this already supported in another commercial product like PAL (Platform Analysis...

PCI Legacy Mode - Why does it use subtractive decoding?

July 10, 2015, 5:41 pm

Hello, Most modern Intel boards have a feature called 'PCI Legacy Mode' that allows users to add old PCI cards. The datasheets say: "PCI functionality is not supported on new generation of...

Encodings for instructions with {sae} are unclear in the doc

July 22, 2015, 12:23 pm

Chapter 4.6 indicates that EVEX.L'L is encoded for the vector length, and that {sae} is supported for all vector lengths. However, the various instruction pages, such as VCMPPD, only show {sae} for...

New extension needed for Maps and Sets

July 23, 2015, 1:10 am

Idea: In current software, every app spends a lot of time walking Maps and Sets (besides arrays, those are the most often used data structures). I think this is a place where the CPU can provide enormous acceleration...

IRET Pseudo-code Bug

July 23, 2015, 7:46 am

Hi, I believe that there is a documentation bug in the pseudo-code for the IRET instruction in the current edition of Volume 2A of the Architectures Software Developers' Manual. The case we're looking at...

What is syntax for broadcast decorator?

July 26, 2015, 5:51 pm

The ISE doc only describes the decorator syntax with the single example {1to16} (document 319433-022 page 7). I would assume that generally you write {1ton} where n = the full vector size / the single...

Wrong memory size for VGATHERQPS (?)

July 29, 2015, 1:52 pm

My version of the document, 319433-022, page 350, shows: EVEX.128.66.0F38.W0 93 /vsib VGATHERQPS xmm1 {k1}, vm64x. I think this should be vm32x, not vm64x, since the operands are single-precision...
