Go to file
Sanjay Patel 12a561fa1b [x86] use psubus for more vsetcc lowering (PR39859)
Circling back to a leftover bit from PR39859:
https://bugs.llvm.org/show_bug.cgi?id=39859#c1

...we have this counter-intuitive (based on the test diffs) opportunity to use 'psubus'.
This appears to be the better perf option for both Haswell and Jaguar based on llvm-mca.
We already do this transform for the SETULT predicate, so this makes the code more
symmetrical too. If we have pminub/pminuw, we prefer those, so this should not affect
anything but pre-SSE4.1 subtargets.

  $ cat before.s
	movdqa	-16(%rip), %xmm2    ## xmm2 = [32768,32768,32768,32768,32768,32768,32768,32768]
	pxor	%xmm0, %xmm2
	pcmpgtw	-32(%rip), %xmm2 ## xmm2 = [255,255,255,255,255,255,255,255]
	pand	%xmm2, %xmm0
	pandn	%xmm1, %xmm2
	por	%xmm2, %xmm0

  $ cat after.s
	movdqa	-16(%rip), %xmm2    ## xmm2 = [256,256,256,256,256,256,256,256]
	psubusw	%xmm0, %xmm2
	pxor	%xmm3, %xmm3
	pcmpeqw	%xmm2, %xmm3
	pand	%xmm3, %xmm0
	pandn	%xmm1, %xmm3
	por	%xmm3, %xmm0

  $ llvm-mca before.s -mcpu=haswell
  Iterations:        100
  Instructions:      600
  Total Cycles:      909
  Total uOps:        700

  Dispatch Width:    4
  uOps Per Cycle:    0.77
  IPC:               0.66
  Block RThroughput: 1.8

  $ llvm-mca after.s -mcpu=haswell
  Iterations:        100
  Instructions:      700
  Total Cycles:      409
  Total uOps:        700

  Dispatch Width:    4
  uOps Per Cycle:    1.71
  IPC:               1.71
  Block RThroughput: 1.8

Differential Revision: https://reviews.llvm.org/D60838

llvm-svn: 358999
2019-04-23 15:20:17 +00:00
clang Fix "-Wimplicit-fallthrough" warning. NFCI. 2019-04-23 11:45:28 +00:00
clang-tools-extra [clangd] Support dependent bases in type hierarchy 2019-04-22 01:38:53 +00:00
compiler-rt [TSan] Support fiber API on macOS 2019-04-20 00:18:44 +00:00
debuginfo-tests Set config.lit_tools_dir, which is needed by lit.llvm.initialize. 2018-11-06 21:54:27 +00:00
libclc travis: Add LLVM-8 build 2019-03-27 21:28:31 +00:00
libcxx [libc++] Remove redundant conditionals for Apple platforms 2019-04-23 14:05:04 +00:00
libcxxabi [libc++abi] Don't use a .sh.cpp test for uncaught_exception 2019-04-23 00:03:34 +00:00
libunwind [NFC] Fix typo in debug log 2019-04-22 15:40:50 +00:00
lld [WebAssembly] Fix typo in relocation checking 2019-04-23 14:49:38 +00:00
lldb Move postfix expression code out of the NativePDB plugin 2019-04-23 11:50:07 +00:00
llgo IR: Support parsing numeric block ids, and emit them in textual output. 2019-03-22 18:27:13 +00:00
llvm [x86] use psubus for more vsetcc lowering (PR39859) 2019-04-23 15:20:17 +00:00
openmp Use correct way to test for MIPS arch after rOMP355687 2019-04-22 19:20:46 +00:00
parallel-libs Fix typos throughout the license files that somehow I and my reviewers 2019-01-21 09:52:34 +00:00
polly Apply include-what-you-use #include removal suggestions. NFC. 2019-03-28 20:19:49 +00:00
pstl [pstl] Add a serial backend for the PSTL 2019-04-18 18:20:19 +00:00
.arcconfig Update monorepo .arcconfig with new project callsign. 2019-01-31 14:34:59 +00:00
.clang-format Add .clang-tidy and .clang-format files to the toplevel of the 2019-01-29 16:43:16 +00:00
.clang-tidy Disable tidy checks with too many hits 2019-02-01 11:20:13 +00:00
.gitignore Add a reduced copy of the llvm .gitignore 2019-04-09 00:52:49 +00:00
README.md

README.md

The LLVM Compiler Infrastructure

This directory and its subdirectories contain source code for LLVM, a toolkit for the construction of highly optimized compilers, optimizers, and runtime environments.