Channel: Intel® Software - Intel ISA Extensions
Browsing all 685 articles

Is it ok to create an array of __m256i

Hi all! I am parallelizing a certain dynamic programming problem using AVX2. In the main iteration of my calculation, I calculate a column in a matrix where each cell is an AVX2 register -> __m256i. I...



Alignment requirements for _mm256_maskload_pd

Hi, Are there any alignment requirements (beyond 8 bytes) for _mm256_maskload_pd and likewise for _mm256_maskstore_pd? Thanks



Extract non-zero byte from __m128i

Hi, I have 4 __m128i 64-byte elements which can contain 0 or non-zero (+ve, -ve) values. I want to extract non-zero values from them. I looked at _mm_extract_epi8/_mm_extract_epi16 but the syntax is int...


_mm256_shuffle_epi8

Hi, I am going through the documentation for _mm256_shuffle_epi8 https://software.intel.com/sites/products/documentation/doclib/iss/2013/... pseudo code shows only up to 16 bytes... for (i = 0; i < 16;...


Alignment requirement for pcmpistri

Hi, I'm testing a custom implementation of strcmp() which involves SSE4.2 and this instruction in particular: pcmpistri $0x18,(%rsi,%rax,1),%xmm1 I've made a test that passes unaligned pointers to the...



Why is my AVX slower than SSE?

According to the article "IIR Gaussian Blur Filter Implementation using Intel® Advanced Vector Extensions", AVX should be faster than SSE, but my performance measurements are as follows: The...


SGX EGETKEY clarification?

I've been looking at a variety of things with SGX, and while looking into the EGETKEY description, I think I've found an inconsistency in the October 2014 spec. Specifically: Table 5-43 says that the...


Dynamic Shift

Hello, I am trying to achieve a dynamic shift. Well, let me explain the task. I process data with SSE, AVX. Data gets loaded, worked with and later results are stored. To support arbitrary lengths, I...



Issue with the APIC dropping MSI-X interrupts

Hello, I have a difficult problem. The scenario is as follows: the hardware environment is an Intel(R) Xeon(R) CPU E5-2609 v2 @ 2.50GHz and an Altera FPGA board. The OS is Linux debian-rss 3.16.7-ckt7. The FPGA creates 32 DMA...



Guaranteed atomic operation clarification

Hello, I'm trying to understand a line in the Intel Architecture manual. It's a description of a memory operation that is guaranteed to be atomic. The line is at Chapter 8, Section 8.1.1 "Guaranteed...


small typo in Intel® 64 and IA-32 Architectures Software Developer’s Manual

Hi, It seems that there is a small typo in the Intel® 64 and IA-32 Architectures Software Developer’s Manual (Order Number: 253665-054US April 2015), page 3-149 (cmpss instruction): 128-bit Legacy SSE...


MPX instructions not in the Appendix A opcode map

Hi, In the latest release (55) of the Intel® 64 and IA-32 Architectures Software Developer’s Manual, in Vol 2C A-11, we can't see MPX instructions. In fact, I usually use opcode maps to find instructions...


Ooops - wrong instruction description in volume 2 of the SDM

Looking at the new version of Volume 2 of the SDM (document 325383-055), I just noticed that the "Description" field for the VINSERTF128 instruction (page 4-514) is incorrect. It appears to have been...



Processor Trace decoding support library for Atom

Dear Intel gurus, could I ask whether libipt on GitHub will support decoding small-core (Atom) processor trace packets (pt pkt)? Or is this already supported in another commercial product like PAL (Platform Analysis...


PCI Legacy Mode - Why does it use subtractive decoding?

Hello, most modern Intel boards have a feature called 'PCI Legacy Mode' that allows users to add old PCI cards. The datasheets say - "PCI functionality is not supported on new generation of...



Encodings for instructions with {sae} are unclear in the doc

Chapter 4.6 indicates that EVEX.L'L is encoded for the vector length, and that {sae} is supported for all vector lengths. However, the various instruction pages, such as VCMPPD, only show {sae} for...


New extension needed for Maps and Sets

Idea: In current software, every app spends a lot of time walking Maps and Sets (besides arrays, these are the most often used data structures). I think this is a place where the CPU can provide enormous acceleration...



IRET Pseudo-code Bug

Hi, I believe that there is a documentation bug in the pseudo-code for the IRET instruction in the current edition of Volume 2A of the Architectures Software Developers' Manual. The case we're looking at...


What is syntax for broadcast decorator?

The ISE doc only describes the decorator syntax with the single example {1to16} (document 319433-022 page 7).I would assume that generally you write {1ton} where n = the full vector size / the single...


Wrong memory size for VGATHERQPS (?)

My version of the document, 319433-022, page 350 shows EVEX.128.66.0F38.W0 93 /vsib VGATHERQPS xmm1 {k1}, vm64x. I think this should be vm32x, not vm64x, since the operands are single-precision...


