Summary:
When SimplifySetCC sees a setcc node that compares the result of a
value extension operation with a constant, it tries to simplify the
setcc node by eliminating the extension and shrinking the constant.
If shrinking the inputs to setcc is deemed not desirable by the target
(e.g. the target does not want a setcc comparing i1 values), then it
is still possible to optimize this sequence in some cases.
This patch adds the following combines to SimplifySetCC when shrinking setcc
inputs is not desirable:
(setcc ([sz]ext (setcc x, y, cc)), 0, setne) -> (setcc (x, y, cc))
(setcc ([sz]ext (setcc x, y, cc)), 0, seteq) -> (setcc (x, Y, !cc))
There are no tests for this yet, but once AMDGPU correctly implements
TargetLowering::isTypeDesirableForOp(), this new combine will be
exercised by the existing CodeGen/AMDGPU/setcc-opt.ll test.
Reviewers: resistor, arsenm
Subscribers: jroelofs, arsenm, llvm-commits
Differential Revision: http://reviews.llvm.org/D15034
llvm-svn: 258067
When the shift immediate is zero, PKHTB is an alias for PKHBT, but the order of
the input operands needs to be swapped.
Differential Revision: http://reviews.llvm.org/D16288
llvm-svn: 258044
The symtab is logically referenced beyond the call to the create
method. This changes makes sure its lifetime matches that of
the reader.
llvm-svn: 258036
The feature flag is for VPERMB,VPERMI2B,VPERMT2B and VPMULTISHIFTQB instructions.
More about the instruction can be found in:
hattps://software.intel.com/sites/default/files/managed/07/b7/319433-023.pdf
Differential Revision: http://reviews.llvm.org/D16190
llvm-svn: 258012
Added support for the extraction of the upper 128-bit subvectors for lower/upper half undef shuffles if it would reduce the number of extractions/insertions or avoid loads of AVX2 permps/permd shuffle masks.
Minor follow up to D15477.
llvm-svn: 258000
%RBP can't be handled explicitly. We generate the following code:
pushq %rbp
movq %rsp, %rbp
...
movq %rbx, (%rbp) ## 8-byte Spill
where %rbp will be overwritten by the spilled value.
The fix is to let PEI handle %RBP.
PR26136
llvm-svn: 257997
FIXME: Add more targets to use emutls into clang/test/Driver/emulated-tls.cpp.
FIXME: Add cygwin tests into llvm/test/CodeGen/X86. Working in progress.
llvm-svn: 257984
Summary:
Later in DWARF emission we check that DebugLocEntries have
non-overlapping pieces, so we should create any such entries
by merging here.
Fixes PR26163.
Reviewers: aprantl
Differential Revision: http://reviews.llvm.org/D16249
llvm-svn: 257979
This is part of a new statistics gathering feature for the sanitizers.
See clang/docs/SanitizerStats.rst for further info and docs.
Differential Revision: http://reviews.llvm.org/D16174
llvm-svn: 257970
The only functional change would be that we might emit multiple filename
segments on code like this:
void f() {
#include "p1/../t.h"
#include "p2/../t.h"
}
I believe these get separate DIFile metadata nodes, but will have the
same canonicalized absolute path. Previously by computing the path up
front and comparing it we would merge the line info segments.
llvm-svn: 257966
WebAssembly's stack will never be executable by default, so it isn't
necessary to declare .note.GNU-stack sections to request a non-executable
stack.
Differential Revision: http://reviews.llvm.org/D15969
llvm-svn: 257962
The cases of this switch are all perfectly regular (except for the first case).
A macro is more readable here.
Thanks to Dave Blaikie for the suggestion.
llvm-svn: 257951
It looks nicer and improves the compiletime of a typical
clang -O3 -emit-llvm run by ~0.6% for me.
Differential Revision: http://reviews.llvm.org/D16205
llvm-svn: 257944