The typo got unnoticed because we were testing only on Dwarf 2. Add a
Dwarf4 test that exercises the code path, and also tests some newer
FORMs that the other test doesn't cover.
llvm-svn: 232191
This patch adds initial support for vector instructions to the reassociation
pass. It enables most parts of the pass to work with vectors but to keep the
size of the patch small, optimization of Xor trees, canonicalization of
negative constants and converting shifts to muls, etc., have been left out.
This will be handled in later patches.
The patch is based on an initial patch by Chad Rosier.
Differential Revision: http://reviews.llvm.org/D7566
llvm-svn: 232190
Summary:
ScalarEvolutionExpander assumes that the header block of a loop is a
legal place to have a use for a phi node. This is true only for phis
that are either in the header or dominate the header block, but it is
not true for phi nodes that are strictly internal to the loop body.
This change teaches ScalarEvolutionExpander to place uses of PHI nodes
in the basic block the PHI nodes belong to. This is always legal, and
`hoistIVInc` ensures that the said position dominates `IsomorphicInc`.
Reviewers: atrick
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D8311
llvm-svn: 232189
Similar to gep (r230786) and load (r230794) changes.
Similar migration script can be used to update test cases, which
successfully migrated all of LLVM and Polly, but about 4 test cases
needed manually changes in Clang.
(this script will read the contents of stdin and massage it into stdout
- wrap it in the 'apply.sh' script shown in previous commits + xargs to
apply it over a large set of test cases)
import fileinput
import sys
import re
rep = re.compile(r"(getelementptr(?:\s+inbounds)?\s*\()((<\d*\s+x\s+)?([^@]*?)(|\s*addrspace\(\d+\))\s*\*(?(3)>)\s*)(?=$|%|@|null|undef|blockaddress|getelementptr|addrspacecast|bitcast|inttoptr|zeroinitializer|<|\[\[[a-zA-Z]|\{\{)", re.MULTILINE | re.DOTALL)
def conv(match):
line = match.group(1)
line += match.group(4)
line += ", "
line += match.group(2)
return line
line = sys.stdin.read()
off = 0
for match in re.finditer(rep, line):
sys.stdout.write(line[off:match.start()])
sys.stdout.write(conv(match))
off = match.end()
sys.stdout.write(line[off:])
llvm-svn: 232184
using numeric values and not their symbolic constant names.
The routines that print Mach-O stuff already had a verbose parameter and this
change is just changing the passing true to passing !NonVerbose. With just a
couple of fixes and a bunch of test case updates.
llvm-svn: 232182
This patch fixes a bug in the shuffle lowering logic implemented by function
'lowerV2X128VectorShuffle'.
The are few cases where function 'lowerV2X128VectorShuffle' wrongly expands a
shuffle of two v4X64 vectors into a CONCAT_VECTORS of two EXTRACT_SUBVECTOR
nodes. The problematic expansion only occurs when the shuffle mask M has an
'undef' element at position 2, and M is equivalent to mask <0,1,4,5>.
In that case, the algorithm propagates the wrong vector to one of the two
new EXTRACT_SUBVECTOR nodes.
Example:
;;
define <4 x double> @test(<4 x double> %A, <4 x double> %B) {
entry:
%0 = shufflevector <4 x double> %A, <4 x double> %B, <4 x i32><i32 undef, i32 1, i32 undef, i32 5>
ret <4 x double> %0
}
;;
Before this patch, llc (-mattr=+avx) generated:
vinsertf128 $1, %xmm0, %ymm0, %ymm0
With this patch, llc correctly generates:
vinsertf128 $1, %xmm1, %ymm0, %ymm0
Added test lower-vec-shuffle-bug.ll
Differential Revision: http://reviews.llvm.org/D8259
llvm-svn: 232179
Constant folding for shift IR instructions ignores all bits above 32 of
second argument (shift amount).
Because of that, some undef results are not recognized and APInt can
raise an assert failure if second argument has more than 64 bits.
Patch by Paweł Bylica!
Differential Revision: http://reviews.llvm.org/D7701
llvm-svn: 232176
%Q5_Q6<def> = COPY %Q2_Q3
%D5<def> =
%D3<def> =
%D3<def> = COPY %D6 // Incorrectly removed in MachineCopyPropagation
Using of %D3 results in incorrect result ...
Reviewed in http://reviews.llvm.org/D8242
llvm-svn: 232142
There's a missed optimization opportunity where we could look at the full chain of computation and take the intersection of the flags instead of only looking one instruction deep.
llvm-svn: 232134
This should complete the job started in r231794 and continued in r232045:
We want to replace as much custom x86 shuffling via intrinsics
as possible because pushing the code down the generic shuffle
optimization path allows for better codegen and less complexity
in LLVM.
AVX2 introduced proper integer variants of the hacked integer insert/extract
C intrinsics that were created for this same functionality with AVX1.
This should complete the removal of insert/extract128 intrinsics.
The Clang precursor patch for this change was checked in at r232109.
llvm-svn: 232120
Instead print them as part of the $dst operand. The AsmMatcher
requires the 32-bit and 64-bit encodings have the same mnemonic in
order to parse them correctly.
llvm-svn: 232105
The permps and permd instructions have their operands swapped compared to the
intrinsic definition. Therefore, they do not fall into the INTR_TYPE_2OP
category.
I did not create a new category for those two, as they are the only one AFAICT
in that case.
<rdar://problem/20108262>
llvm-svn: 232085
Part of the folding logic implemented by function 'PerformISDSETCCCombine'
only worked under the assumption that the condition code in input could have
been either SETNE or SETEQ.
Unfortunately that assumption was incorrect, and in some cases the algorithm
ended up incorrectly folding SETCC nodes.
The incorrect folding only affected SETCC dag nodes where:
- one of the operands was a build_vector of all zeroes;
- the other operand was a SIGN_EXTEND from a vector of MVT:i1 elements;
- the condition code was neither SETNE nor SETEQ.
Example:
(setcc (v4i32 (sign_extend v4i1:%A)), (v4i32 VectorOfAllZeroes), setge)
Before this patch, the entire dag node sequence from the example was
incorrectly folded to node %A.
With this patch, the dag node sequence is folded to a
(xor %A, (v4i1 VectorOfAllOnes)).
Added test setcc-combine.ll.
Thanks to Greg Bedwell for spotting this issue.
llvm-svn: 232046
Now that we've replaced the vinsertf128 intrinsics,
do the same for their extract twins.
This is very much like D8086 (checked in at r231794):
We want to replace as much custom x86 shuffling via intrinsics
as possible because pushing the code down the generic shuffle
optimization path allows for better codegen and less complexity
in LLVM.
This is also the LLVM sibling to the cfe D8275 patch.
Differential Revision: http://reviews.llvm.org/D8276
llvm-svn: 232045
It's firstly committed at r231630, and reverted at r231635.
Function pass InstructionSimplifier is inserted as barrier to
make sure loop unroll pass won't affect on LICM pass.
llvm-svn: 232011
Instead, run both EH preparation passes, and have them both ignore
functions with unrecognized EH personalities. Pass delegation involved
some hacky code for creating an AnalysisResolver that we don't need now.
llvm-svn: 231995
CodeGen incorrectly ignores (assert from APInt) constant index bigger
than 2^64 in getelementptr instruction. This is a test and fix for that.
Patch by Paweł Bylica!
Reviewed By: rnk
Subscribers: majnemer, rnk, mcrosier, resistor, llvm-commits
Differential Revision: http://reviews.llvm.org/D8219
llvm-svn: 231984
If a function is going in an unique section (because of -ffunction-sections
for example), putting a jump table in .rodata will keep .rodata alive and
that will keep alive any other function that also has a jump table.
Instead, put the jump table in a unique section that is associated with the
function.
llvm-svn: 231961
The main issue being fixed here is that APCS targets handling a "byval align N"
parameter with N > 4 were miscounting what objects were where on the stack,
leading to FrameLowering setting the frame pointer incorrectly and clobbering
the stack.
But byval handling had grown over many years, and had multiple layers of cruft
trying to compensate for each other and calculate padding correctly. This only
really needs to be done once, in the HandleByVal function. Elsewhere should
just do what it's told by that call.
I also stripped out unnecessary APCS/AAPCS distinctions (now that Clang emits
byvals with the correct C ABI alignment), which simplified HandleByVal.
rdar://20095672
llvm-svn: 231959
DW_AT_low_pc on functions is taken care of by the relocation processing, but
DW_AT_high_pc and DW_AT_low_pc on other lexical scopes need special handling.
llvm-svn: 231955
This is a follow-up to r231182. This adds the "vbroadcasti128" instruction
back, but without the intrinsic mapping. Also add a test to check the
instriction encoding.
This is related to rdar://problem/18742778.
llvm-svn: 231945
Summary:
The generic ELF TargetObjectFile defaults to .ctors, but Linux's
defaults to .init_array by calling InitializeELF with the value of
UseInitArray from TargetMachine. Make NaCl's behavior match.
Reviewers: jvoung
Differential Revision: http://reviews.llvm.org/D8240
llvm-svn: 231934
The CallGraphNode function "addCalledFunction()" asserts that edges are not to intrinsics.
This patch makes sure that the Inliner does not add such an edge to the callgraph.
Fix for clang crash by assertion: https://llvm.org/bugs/show_bug.cgi?id=22857
Differential Revision: http://reviews.llvm.org/D8231
llvm-svn: 231927
As of r231908, the test I added in r231902 actually gets run - but I'd
checked in a stale version of the input so it didn't pass. Fix the
input and un-xfail the test.
llvm-svn: 231911
This causes a crash if the referenced intrinsic was malformed. In this case, we
would already have reported an error on the referenced intrinsic, but then
crashed on the second one when it tried to introspect the first without
error checking.
llvm-svn: 231910
There were also errors in the CHECK line which I fixed and the test
doesn't actually pass as the "100" is in the wrong line. Not sure
whether this is a test failure or a coverage failure so making the test
XFAIL for now.
llvm-svn: 231908