EarlyCSE is giving up on the current instruction immediately when it recognizes that the current instruction makes a previous store trivially dead. There's no reason to do this. Once the previous store has been deleted, it's perfectly legal to remember the value of the current store (for value forwarding) and the fact the store occurred (it could be dead too!).
Reviewed by: Hal
Differential Revision: http://reviews.llvm.org/D6301
llvm-svn: 222241
It is impossible for (x & INT_MAX) == 0 && x == INT_MAX to ever be true.
While this sort of reasoning should normally live in InstSimplify,
the machinery that derives this result is not trivial to split out.
llvm-svn: 222230
- Make CallGraphSCCPass's paragraph about doFinalization refer to
runOnSCC instead of runOnFunction, since that's what it's about.
- Fix a reference in the FunctionPass paragraph.
llvm-svn: 222222
Usually global variables are in a retain list and instanciated before
any call to constructImportedEntityDIE is made. This isn't true for
forward declarations though.
The testcase for this change is generated by a clang patched to emit
such forward declarations (patch at http://reviews.llvm.org/D6173
which will land soon). The updated testcase tests more than just
global variables, it now tests every type of 'using' clause we
support.
llvm-svn: 222217
I added a pessimization in r217102 to prevent miscompiles when the
incremented induction variable was used in a comparison; it would be
poison.
Try to use the incremented induction variable more often when we can be
sure that the increment won't end in poison.
Differential Revision: http://reviews.llvm.org/D6222
llvm-svn: 222213
Having the operands at the back prevents subclasses from safely adding
fields. Move them to the front.
Instead of replicating the custom `malloc()`, `free()` and `DestroyFlag`
logic that was there before, overload `new` and `delete`.
I added calls to a new `GenericMDNode::dropAllReferences()` in
`LLVMContextImpl::~LLVMContextImpl()`. There's a maze of callbacks
happening during teardown, and this resolves them before we enter
the destructors.
Part of PR21532.
llvm-svn: 222211
Split `MDNode` into two classes:
- `GenericMDNode`, which is uniquable (and for now, always starts
uniqued). Once `Metadata` is split from the `Value` hierarchy, this
class will lose the ability to RAUW itself.
- `MDNodeFwdDecl`, which is used for the "temporary" interface, is
never uniqued, and isn't managed by `LLVMContext` at all.
I've left most of the guts in `MDNode` for now, but I'll incrementally
move things to the right places (or delete the functionality, as
appropriate).
Part of PR21532.
llvm-svn: 222205
use DIScopeRef.
A paired commit at clang will follow to show cases where we will use an
identifer for the context of a global variable.
rdar://18958417
llvm-svn: 222195
Change uniquing from a `FoldingSet` to a `DenseSet` with custom
`DenseMapInfo`. Unfortunately, this doesn't save any memory, since
`DenseSet<T>` is a simple wrapper for `DenseMap<T, char>`, but I'll come
back to fix that later.
I used the name `GenericDenseMapInfo` to the custom `DenseMapInfo` since
I'll be splitting `MDNode` into two classes soon: `MDNodeFwdDecl` for
temporaries, and `GenericMDNode` for everything else.
I also added a non-debug-info reduced version of a type-uniquing test
that started failing on an earlier draft of this patch.
Part of PR21532.
llvm-svn: 222191
This reverts commit r222183.
Broke on the MSVC buildbots due to MSVC not producing default move
operations - I'd fix it immediately but just broke my build system a
bit, so backing out until I have a chance to get everything going again.
llvm-svn: 222187
The next step is to actually use unique_ptr in TreePatternNode's
Children vector. That will be more intrusive, and may not work,
depending on exactly how these things are handled (I have a bad
suspicion things are shared more than they should be, making this more
DAG than tree - but if it's really a tree, unique_ptr should suffice)
llvm-svn: 222183
This was resulting in use of a register after a kill.
For some reason this showed up as a problem in many tests
when moving the SIFixSGPRCopies pass closer to instruction
selection.
llvm-svn: 222175
When converting a switch to a lookup table we might have to generate a bitmaks
to encode and check for holes in the original switch statement.
The type of this mask depends on the number of switch statements, which can
result in illegal types for pretty much all architectures.
To avoid unnecessary type legalization and help FastISel this commit increases
the size of the bitmask to next power-of-2 value when necessary.
This fixes rdar://problem/18984639.
llvm-svn: 222168
They were producing the wrong result if NumBits == BitsInWord. The old mask
produced -1, the new mask 0.
This should fix the 32 bit bots.
llvm-svn: 222166
The specializations were broken. For example,
void foo(const CallGraph *G) {
auto I = GraphTraits<const CallGraph *>::nodes_begin(G);
auto K = I++;
...
}
or
void bar(const CallGraphNode *N) {
auto I = GraphTraits<const CallGraphNode *>::nodes_begin(G);
auto K = I++;
....
}
would not compile.
Patch by Speziale Ettore!
llvm-svn: 222149
The triple parser should only accept existing architecture names
when the triple starts with armv, armebv, thumbv or thumbebv.
Patch by Gabor Ballabas.
llvm-svn: 222129
SCEVDivision::divide constructed an object of SCEVDivision<Derived>
instead of Derived. divide would call visit which would cast the
SCEVDivision<Derived> to type Derived. As it happens,
SCEVDivision<Derived> and Derived currently have the same layout but
this is fragile and grounds for UB.
Instead, just construct Derived. No functional change intended.
llvm-svn: 222126
This was motivated by a bug which caused code like this to be
miscompiled:
declare void @take_ptr(i8*)
define void @test() {
%addr1.32 = alloca i8
%addr2.32 = alloca i32, i32 1028
call void @take_ptr(i8* %addr1)
ret void
}
This was emitting the following assembly to get the value of %addr1:
add r0, sp, #1020
add r0, r0, #8
However, "add r0, r0, #8" is not a valid Thumb1 instruction, and this
could not be assembled. The generated object file contained this,
resulting in r0 holding SP+8 rather tha SP+1028:
add r0, sp, #1020
add r0, sp, #8
This function looked like it could have caused miscompilations for
other combinations of registers and offsets (though I don't think it is
currently called with these), and the heuristic it used did not match
the emitted code in all cases.
llvm-svn: 222125
We were a little lax in a few areas:
- We pretended that import libraries were like any old COFF file, they
are not. In fact, they aren't really COFF files at all, we should
probably grow some specialized functionality to handle them smarter.
- Our symbol iterators were more than happy to attempt to go past the
end of the symbol table if you had a symbol with a bad list of
auxiliary symbols.
llvm-svn: 222124
Some optimisations in DAGCombiner cause miscompilations for targets that use
TargetLowering::UndefinedBooleanContent, because they assume that the results
of a SELECT_CC node are boolean values, and can be safely ANDed, ORed and
XORed. These optimisations are only valid for targets that use
ZeroOrOneBooleanContent or ZeroOrNegativeOneBooleanContent.
This is a follow-up to D6210/r221693.
llvm-svn: 222123
This is a simple optimization for switch table lookup:
It computes the output value directly with an (optional) mul and add if there is a linear mapping between index and output.
Example:
int f1(int x) {
switch (x) {
case 0: return 10;
case 1: return 11;
case 2: return 12;
case 3: return 13;
}
return 0;
}
generates:
define i32 @f1(i32 %x) #0 {
entry:
%0 = icmp ult i32 %x, 4
br i1 %0, label %switch.lookup, label %return
switch.lookup:
%switch.offset = add i32 %x, 10
ret i32 %switch.offset
return:
ret i32 0
}
llvm-svn: 222121
Indices into the table are stored in each MCRegisterClass instead of a pointer. A new method, getRegClassName, is added to MCRegisterInfo and TargetRegisterInfo to lookup the string in the table.
llvm-svn: 222118
This adds back r222061, but now calls initializePAEvalPass from the correct
library to avoid link problems.
Original message:
Don't make assumptions about the name of private global variables.
Private variables are can be renamed, so it is not reliable to make
decisions on the name.
The name is also dropped by the assembler before getting to the
linker, so using the name causes a disconnect between how llvm makes a
decision (var name) and how the linker makes a decision (section it is
in).
This patch changes one case where we were looking at the variable name to use
the section instead.
Test tuning by Michael Gottesman.
llvm-svn: 222117
It turns out that not all users of SCEVDivision want the same
signedness. Let the users determine which operation they'd like by
explicitly choosing SCEVUDivision or SCEVSDivision.
findArrayDimensions and computeAccessFunctions will use SCEVSDivision
while HowFarToZero will use SCEVUDivision.
llvm-svn: 222104
Summary:
Several places in DependenceAnalysis assumes both SCEVs in a subscript pair
share the same integer type. For instance, isKnownPredicate calls
SE->getMinusSCEV(X, Y) which asserts X and Y share the same type. However,
DependenceAnalysis fails to ensure this assumption when producing a subscript
pair, causing tests such as NonCanonicalizedSubscript to crash. With this
patch, DependenceAnalysis runs unifySubscriptType before producing any
subscript pair, ensuring the assumption.
Test Plan:
Added NonCanonicalizedSubscript.ll on which DependenceAnalysis before the fix
crashed because subscripts have different types.
Reviewers: spop, sebpop, jingyue
Reviewed By: jingyue
Subscribers: eliben, meheff, llvm-commits
Differential Revision: http://reviews.llvm.org/D6289
llvm-svn: 222100
HowFarToZero was supposed to use unsigned division in order to calculate
the backedge taken count. However, SCEVDivision::divide performs signed
division. Unless I am mistaken, no users of SCEVDivision actually want
signed arithmetic: switch to udiv and urem.
This fixes PR21578.
llvm-svn: 222093
A few things:
- computeKnownBits is relatively expensive, let's delay its use as long
as we can.
- Don't create two APInt values just to run computeKnownBits on a
ConstantInt, we already know the exact value!
- Avoid creating a temporary APInt value in order to calculate unary
negation.
llvm-svn: 222092
This patch teaches the DAGCombiner how to combine shuffles according to rules:
shuffle(shuffle(A, Undef, M0), B, M1) -> shuffle(B, A, M2)
shuffle(shuffle(A, B, M0), B, M1) -> shuffle(B, A, M2)
shuffle(shuffle(A, B, M0), A, M1) -> shuffle(B, A, M2)
llvm-svn: 222090
Updated X86TargetLowering::isShuffleMaskLegal to match SHUFP masks with commuted inputs and PSHUFD masks that reference the second input.
As part of this I've refactored isPSHUFDMask to work in a more general manner and allow it to match against either the first or second input vector.
Differential Revision: http://reviews.llvm.org/D6287
llvm-svn: 222087
This gets the correct NaN behavior based on the compare type
the hardware uses. This now passes the new piglit test I have
for this on SI.
Add stricter tests for the operand order.
llvm-svn: 222079
While this program worked correctly with small example programs, larger
ones tickled this bug. I'm working on a reduction because my program is
quite large.
llvm-svn: 222078
This is so it could potentially be used by SI. However, the current
implementation does not always produce correct results, so the
IntegerDivisionPass is being used instead.
llvm-svn: 222072
Make explicit the requirement that most IR values in `DIBuilder` are
`Constant`. This requires a follow-up change in clang.
Part of PR21532.
llvm-svn: 222070
Now that `MDString` and `MDNode` have a common base class, use it. Note
that it's not useful to assume subclasses of `Metadata` must be one or
the other since we'll be adding more subclasses soon enough.
Part of PR21532.
llvm-svn: 222064
Summary:
The current "WinEH" exception handling type is more about Itanium-style
LSDA tables layered on top of the Windows native unwind info format
instead of .eh_frame tables or EHABI unwind info. Use the name
"ItaniumWinEH" to better reflect the hybrid nature of the design.
Also rename isExceptionHandlingDWARF to usesItaniumLSDAForExceptions,
since the LSDA is part of the Itanium C++ ABI document, and not the
DWARF standard.
Reviewers: echristo
Subscribers: llvm-commits, compnerd
Differential Revision: http://reviews.llvm.org/D6279
llvm-svn: 222062
Private variables are can be renamed, so it is not reliable to make
decisions on the name.
The name is also dropped by the assembler before getting to the
linker, so using the name causes a disconnect between how llvm makes a
decision (var name) and how the linker makes a decision (section it is
in).
This patch changes one case where we were looking at the variable name to use
the section instead.
Test tuning by Michael Gottesman.
llvm-svn: 222061
We use to track quite a few "adjusted" offsets through the FrameLowering code
to account for changes in the prologue instructions as we went and allow the
emission of correct CFA annotations. However, we were missing a couple of cases
and the code was almost impenetrable.
It's easier to just add any stack-adjusting instruction to a list and emit them
together.
llvm-svn: 222057
When we folded the DPR alignment gap into a push, we weren't noting the extra
distance from the beginning of the push to the FP, and so FP ended up pointing
at an incorrect offset.
The .cfi_def_cfa_offset directives are still wrong in this case, but I think
that can be improved by refactoring.
llvm-svn: 222056
The test's DWARF stubs were there just to trigger the emission of .cfi
directives. Fortunately, the NetBSD ABI already demands proper DWARF unwind
info, so it's easier to just use that triple.
llvm-svn: 222055
FYI, removed the unused MCInstrAnalysis as it does not exist for 64-bit ARM and
was causing a “couldn't initialize disassembler for target” error.
llvm-svn: 222045
We would attempt to replace a fptrunc of an frem with an identical
fptrunc. This would cause the new fptrunc to be added to the worklist.
Of course, this results in an infinite loop because we will keep
visiting the newly created fptruncs.
This fixes PR21576.
llvm-svn: 222040
doing Load PRE"
This commit updates the failing test in
Analysis/TypeBasedAliasAnalysis/gvn-nonlocal-type-mismatch.ll
The failing test is sensitive to the order in which we process loads. This
version turns on the RPO traversal instead of the while DT traversal in GVN.
The new test code is functionally same just the order of loads that are
eliminated is swapped.
This new version also fixes an issue where GVN splits a critical edge and
potentially invalidate the RPO/DT iterator.
llvm-svn: 222039
based on instruction complexity
The order that tablegen fast-isel instruction code is generated is
currently based on the text of the predicate (using string
less-than). This patch changes this to instead use the instruction
complexity. Because the complexities are not unique a C++ multimap is
used instead of a map.
This fixes the problem where code with no predicate always comes out
first (the empty string always compares as less than all other
strings) thus making the code with predicates dead code. See the FMUL
code in PPCFastISel.cpp for an example. It also more closely matches
the normal codegen ordering. Some error checking in the tablegen
fast-isel code is fixed as well.
Patch by Bill Seurer.
llvm-svn: 222038
If we have spilled the value of the m0 register, then we need to restore
it with v_readlane_b32 to a regular sgpr, because v_readlane_b32 can't
write to m0.
v_readlane_b32 can't write to m0, so
llvm-svn: 222036
This allows COFF targets to emit accelerator tables
when requested by -dwarf-accel-tables=Enable instead
of aborting. The test DebugInfo/cross-cu-inlining.ll
covers this on COFF platforms.
llvm-svn: 222034
ELF targets (and maybe COFF) use relocations when referring
to strings in the .debug_str section. Handle that in the
accelerator table dumper. This commit restores the
test/DebugInfo/cross-cu-inlining.ll test to its expected
platform independant form, validating that the fix works
(this test failed on linux boxes).
llvm-svn: 222029
If this workaround gets the bots green, then we have to find out
why the -dwarf-accel-tables=Enable option doesn't work as
expected on non-darwin platforms.
llvm-svn: 222007
Prior to this commit fmul and fadd binary operators were being canonicalized for
both scalar and vector versions. We now canonicalize add, mul, and, or, and xor
vector instructions.
llvm-svn: 222006
This reverts commit r221842 which was a revert of r221836 and of the
test parts of r221837.
This new version fixes an UB bug pointed out by David (along with
addressing some other review comments), makes some dumping more
resilient to broken input data and forces the accelerator tables
to be dumped in the tests where we use them (this decision is
platform specific otherwise).
llvm-svn: 222003
This patch adds builtin support for xvdivdp and xvdivsp, along with a
test case. Straightforward stuff.
There's a companion patch for Clang.
llvm-svn: 221983
In support of serializing executables, obj2yaml now records the virtual address
and size of sections. It also serializes whatever we strictly need from
the PE header, it expects that it can reconstitute everything else via
inference.
yaml2obj can reconstitute a fully linked executable.
In order to get executables correctly serialized/deserialized, other
bugs were fixed as a circumstance. We now properly respect file and
section alignments. We also avoid writing out string tables unless they
are strictly necessary.
llvm-svn: 221975
This matches std::vector and is more efficient as it avoids
truncations.
With this the text segment of opt goes from 19705442 bytes
to 19703930 bytes.
llvm-svn: 221973
This teaches CoverageMapping::getCoveredFunctions to filter to a
particular file and uses that to replace most of the logic found in
llvm-cov report.
llvm-svn: 221962
getTargetConstant should only be used when you can guarantee the instruction
selected will be able to cope with the raw value. BUILD_VECTOR is rather too
generic for this so we should use getConstant instead. In that case, an
instruction can still consume the constant, but if it doesn't it'll be
materialised through its own round of ISel.
Should fix PR21352.
llvm-svn: 221961
Stop using `Value::getName()` to get the string behind an `MDString`.
Switch to `StringMapEntry<MDString>` so that we can find the string by
its coallocation.
This is part of PR21532.
llvm-svn: 221960
When "MBB->Insert(It, ...)" is called, we want It to be pointing inside the
correct basic block. No actual failures at the moment, but it's caused problems
before.
llvm-svn: 221953
Hide the fact that `MDString`'s string is stored in `Value::Name` --
that's going to change soon. Update the only in-tree client that was
using it instead of `Value::getString()`.
Part of PR21532.
llvm-svn: 221951
Summary:
This has most of what is needed for mips fast-isel call lowering for O32.
What is missing I will add on the next patch because this patch is already too large.
It should not be doing anything wrong but it will punt on some cases that it is basically
capable of doing.
The mechanism is there for parameters to be passed on the stack but I have not enabled it because it serves as a way for now to prevent some of the strange cases of O32 register passing that I have not fully checked yet and have some issues.
The Mips O32 abi rules are very complicated as far how data is passed in floating and integer registers.
However there is a way to think about this all very simply and this implementation reflects that.
Basically, the ABI rules are written as if everything is passed on the stack and aligned as such.
Once that is conceptually done, it is nearly trivial to reassign those locations to registers and
then all the complexity disappears.
So I have told tablegen that all the data is passed on the stack and during the lowering I fix
this by assigning to registers as per the ABI doc.
This has been my approach and you can line up what I did with the ABI document and see 1 to 1 what
is going on.
Test Plan: callabi.ll
Reviewers: dsanders
Reviewed By: dsanders
Subscribers: jholewinski, echristo, ahatanak, llvm-commits, rfuhler
Differential Revision: http://reviews.llvm.org/D5714
llvm-svn: 221948
Fix for LLI failure on Windows\X86: http://llvm.org/PR5053
LLI.exe crashes on Windows\X86 when single precession floating point
intrinsics like the following are used: acos, asin, atan, atan2, ceil,
copysign, cos, cosh, exp, floor, fmin, fmax, fmod, log, pow, sin, sinh,
sqrt, tan, tanh
The above intrinsics are defined as inline-expansions in math.h, and are
not exported by msvcr120.dll (Win32 API GetProcAddress returns null).
For an FREM instruction, the JIT compiler generates a call to a stub for
the fmodf() intrinsic, and adds a relocation to fixup at load time. The
loader searches the libraries for the function, but fails because the
symbol is not exported. So, the call target remains NULL and the
execution crashes.
Since the math functions are loaded at JIT/runtime, the JIT can patch
CALL instruction directly instead of the searching the libraries'
exported symbols. However, this fix caused build failures due to
unresolved symbols like _fmodf at link time.
Therefore, the current fix defines helper functions in the Runtime
link/load library to perform the above operations. The address of these
helper functions are used to patch up the CALL instruction at load time.
Reviewers: lhames, rnk
Reviewed By: rnk
Differential Revision: http://reviews.llvm.org/D5387
Patch by Swaroop Sridhar!
llvm-svn: 221947
Windows defines NULL to 0, which when used as an argument to a variadic
function, is not a null pointer constant. As a result, Clang's
-Wsentinel fires on this code. Using '0' would be wrong on most 64-bit
platforms, but both MSVC and Clang make it work on Windows. Sidestep the
issue with nullptr.
llvm-svn: 221940
in-lane shuffles that aren't always handled well by the current vector
shuffle lowering.
No functionality change yet, that will follow in a subsequent commit.
llvm-svn: 221938
The generic FastISel code would bail, because it can't emit a sign-extend for
AArch64. This copies the code over and uses AArch64 specific emit functions.
This is not ideal and 'computeAddress' should handles this, so it can fold the
address computation into the memory operation.
I plan to clean up 'computeAddress' anyways, so I will add that in a future
commit.
Related to rdar://problem/18962471.
llvm-svn: 221923
These were directly using the old base instruction
class, and specifying the wrong register classes
for operands. The operands can be the other special
inputs besides SGPRs. The op name was also being
directly used for the asm string, so this was printed
without any operands.
llvm-svn: 221921
If a function is just an unreachable, this would hit a
"this is not a MachO target" assertion because of setting
HasSubsectionViaSymbols.
llvm-svn: 221920
e.g. v_mad_f32 a, b, c -> v_mad_f32 b, a, c
This simplifies matching v_madmk_f32.
This looks somewhat surprising, but it appears to be
OK to do this. We can commute src0 and src1 in all
of these instructions, and that's all that appears
to matter.
llvm-svn: 221910