SDE: ssc-marks and multiple mix output
Dear developers, I would like to use the ssc-marks as follows: for (…){ //ssc_mark_start … //ssc_mark_end } Is there a way to get the statistics for each iteration of the loop? Last time I used SDE, the...
Updated ISE doc, rev 25 & PCOMMIT change blog
Revision 25 of the instruction set extension document was released, now including a change-log. http:///www.intel.com/software/isa A blog post describes the background for the change of the proposed...
MOV CR8 not serializing?
Hello, I have a question about the MOV CR8 instruction. The Intel Software Developer's Manual states that "MOV CR* instructions, except for MOV CR8, are serializing instructions. MOV CR8 is not...
How to detect New Instruction support in the Haswell/Broadwell generation...
Dear Support, I have used the code at the following link for a while to detect support for new instructions across the board with different compilers (Intel, GCC, PGI). What I am looking...
SSE2 to AVX2 performance question
I've rewritten SSE2 code to AVX2, but performance is only 30-40% better. The number of instructions has halved. What can be the problem? How can I find the running time of each instruction? Is there any...
How is the sign bit represented in memory and in the CPU?
I'm trying to use bitfields in my software, and I heard that Intel stores its sign bit in bit 7 (aka the 8th bit) regardless of the size of the integer (so uint8_t and uint64_t would both store their...
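For reference on the question above: x86 uses two's complement, and the exact-width signed types in <stdint.h> keep the sign in the most significant bit of the type, so bit 7 for int8_t and bit 63 for int64_t. Unsigned types such as uint8_t and uint64_t have no sign bit at all. The following is a minimal sketch, not taken from the thread; the helper names sign_bit8, sign_bit64, and bit7_of64 are hypothetical and chosen only for illustration.

```c
#include <stdint.h>

/* Hypothetical helpers (not from the thread): each returns 0 or 1. */

/* Sign bit of an 8-bit signed integer: bit 7, the most significant bit. */
int sign_bit8(int8_t v)
{
    return (int)(((uint8_t)v >> 7) & 1u);
}

/* Sign bit of a 64-bit signed integer: bit 63, the most significant bit. */
int sign_bit64(int64_t v)
{
    return (int)(((uint64_t)v >> 63) & 1u);
}

/* Bit 7 of a 64-bit integer is an ordinary value bit (weight 128). */
int bit7_of64(int64_t v)
{
    return (int)(((uint64_t)v >> 7) & 1u);
}
```

With these helpers, the value 128 stored in an int64_t has bit 7 set but is still positive, while -1 has bit 63 set.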
SDE RTM emulation - small issue
A small discrepancy between the behavior of the SDE and the behavior of a real (Skylake) processor with regard to RTM emulation (with -rtm-mode full). For access violations such as writing to...
Intel® Xeon Phi™ x200 series (KNL) Ring 3 Monitor/MWait
We are glad to announce a model specific feature of the Intel® Xeon Phi™ x200 series (formerly known as Knights Landing (KNL)) which allows the MONITOR and MWAIT instructions to be executed in user...
popcount emulated for core2quads
Hi all, I'm not a programmer; I've registered just to ask a simple question: Is it possible to emulate the "popcount" instruction on Core 2 Quads? I have a Core 2 Quad Q9550 and I want to be able to play the...
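As background to the question above, the operation that the POPCNT instruction performs (counting the set bits in a word) can be computed in plain software on CPUs that lack it. The sketch below is illustrative only and is not from the thread; the function name popcount64_soft is hypothetical. It does not make an existing binary that issues POPCNT run on such a CPU; it only shows what a software fallback computes.

```c
#include <stdint.h>

/* Hypothetical software fallback: count the set bits in a 64-bit word
 * without using the POPCNT instruction (parallel bit-summing method). */
unsigned popcount64_soft(uint64_t x)
{
    /* Sum adjacent bits into 2-bit fields. */
    x = x - ((x >> 1) & 0x5555555555555555ULL);
    /* Sum adjacent 2-bit fields into 4-bit fields. */
    x = (x & 0x3333333333333333ULL) + ((x >> 2) & 0x3333333333333333ULL);
    /* Sum adjacent 4-bit fields into per-byte counts. */
    x = (x + (x >> 4)) & 0x0F0F0F0F0F0F0F0FULL;
    /* Add all byte counts together; the total lands in the top byte. */
    return (unsigned)((x * 0x0101010101010101ULL) >> 56);
}
```

A program built from source could call a routine like this when CPUID reports that POPCNT is unavailable.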
How to convert two __m256d to one __m512d using intrinsics
Hi, I need to convert two __m256d variables to one __m512d variable. For example, __m256d vA holds {0,1,2,3} and __m256d vB holds {4,5,6,7}, then I want to convert vA and vB to __m512d vC which holds...
Retrieving/querying Intrinsic Guide
Hello, Is it possible to query the IntrinsicsGuide to retrieve the function definition, synopsis, description, and operation as text? Regards, -Rashawn Knapp
SDE fails to run a process
Hello, I'm running SDE 7.49 on a machine with the following characteristics # uname -a Linux login3 3.0.101-0.35-default #1 SMP Mon Oct 31 15:43:41 CET 2016 x86_64 x86_64 x86_64 GNU/Linux # grep...
AVX2 optimized code execution time deviation
When running a benchmark which compares SSE optimized code with AVX2 optimized code I'm getting results for the AVX2 optimized code with a very strong deviation: Run on (1 X 2300 MHz CPU ) 11/16/16...
What is the status of VZEROUPPER use?
The problem with VZEROUPPER comes up again now that the recommendation for the Knights Landing processor is the opposite of previous processors. The history is this: The extension of vector registers...
AVX-512 in graph process applications
Hello, everyone! I'm currently researching possibilities of graph algorithm implementations using vector instructions (working in KNL and using AVX-512) and need some help and advice. The first...
Supported processors for PTWRITE instruction?
I have an i7-6700K processor which does not support the PTWRITE instruction, even though it supports Intel Processor Trace. I checked using the CPUID instruction:...
AVX512 On Xeon Phi KNL using Intel Intrinsics
Hi, I am a newbie to AVX512 intrinsics. I tried this simple test code on Intel Xeon Phi 7210. I compiled using xMIC_AVX512. I get an illegal instruction. This is the piece of code I am using: __m512d src...
_mm_clmulepi64_si128 and pclmulqdq doc error
The operation pseudo code in the intrinsics guide (https://software.intel.com/sites/landingpage/IntrinsicsGuide/#=undefined&expand=3728,636&text=carry) and in the 64-ia-32-architectures guide...
Intel(R) Parallel Studio XE 2017 emulator for linux (SDE)
Hi all, I hope I am at the correct forum. I am a newbie to Parallel Studio. I am trying to understand if there is an emulator that I can download and use that emulates the use of Xeon Phi Coprocessors. As far...
shuffles on load ports
For various algorithms that require a significant number of SIMD shuffles to be performed, a performance penalty can occur on both Sandy Bridge and Haswell uarchs as far as I am aware with it would...