"global-init", "global-init-src" and "global-init-type" were originally
used to blacklist entities in ASan init-order checker. However, they
were never documented, and later were replaced by "=init" category.
Old blacklist entries should be converted as follows:
* global-init:foo -> global:foo=init
* global-init-src:bar -> src:bar=init
* global-init-type:baz -> type:baz=init
llvm-svn: 222401
This reverts commit r222142. This is causing/exposing an execution-time regression
in spec2006/gcc and coremark on AArch64/A57/Ofast.
Conflicts:
test/Transforms/Reassociate/optional-flags.ll
llvm-svn: 222398
This allows the logic to work with Git, and also uses the variable names
to match what Clang is actually looking for.
This is a re-application of r190556 and r190808. This changes the interface
of GetSVN.cmake. Clang change to follow.
llvm-svn: 222391
- Show "Considering..." message after flipping so you actually see the final
destination vreg as destination.
- Add a message on final join, so you can grep for "Success" messages to obtain
a list of which register got merged with which.
llvm-svn: 222382
This patch improves the lowering of v4f32 and v4i32 build_vector dag nodes
that are known to have at least two non-zero elements.
With this patch, a build_vector that performs a blend with zero is
converted into a shuffle. This is done to let the shuffle legalizer expand
the dag node in a optimal way. For example, if we know that a build_vector
performs a blend with zero, we can try to lower it as a movq/blend instead of
always selecting an insertps.
This patch also improves the logic that lowers a build_vector into a insertps
with zero masking. See for example the extra test cases added to test sse41.ll.
Differential Revision: http://reviews.llvm.org/D6311
llvm-svn: 222375
As detailed at http://llvm.org/PR20728, due to an internal overflow in
APFloat::multiplySignificand the APFloat::fusedMultiplyAdd method can return
incorrect results for x87DoubleExtended (x86_fp80) values. This commonly
manifests as incorrect constant folding of libm fmal calls on x86. E.g.
fmal(1.0L, 1.0L, 3.0L) == 0.0L (should be 4.0L)
This patch fixes PR20728 by adding an extra bit to the significand for
intermediate results of APFloat::multiplySignificand, avoiding the overflow.
llvm-svn: 222374
A register operand that has a common sub-class with its instruction's
defined register class is not always legal. For example,
SReg_32 and M0Reg both have a common sub-class, but we can't
use an SReg_32 in instructions that expect a M0Reg.
This prevents the llvm.SI.sendmsg.ll test from failing when the fold
operand pass is added.
llvm-svn: 222368
When the BasicBlock containing the return instrution has a PHI with 2
incoming values, FoldReturnIntoUncondBranch will remove the no longer
used incoming value and remove the no longer needed phi as well. This
leaves us with a BB that no longer has a PHI, but the subsequent call
to FoldReturnIntoUncondBranch from FoldReturnAndProcessPred will not
remove the return instruction (which still uses the result of the call
instruction). This prevents EliminateRecursiveTailCall to remove
the value, as it is still being used in a basicblock which has no
predecessors.
The basicblock can not be erased on the spot, because its iterator is
still being used in runTRE.
This issue was exposed when removing the threshold on size for lifetime
marker insertion for named temporaries in clang. The testcase is a much
reduced version of peelOffOuterExpr(const Expr*, const ExplodedNode *)
from clang/lib/StaticAnalyzer/Core/BugReporterVisitors.cpp.
llvm-svn: 222354
This change makes use of the new "job pool" capability in cmake 3.0
with ninja generator to allow limiting the number of concurrent jobs
of a certain type.
llvm-svn: 222341
This patch builds on http://reviews.llvm.org/D5598 to perform byte rotation shuffles (lowerVectorShuffleAsByteRotate) on pre-SSSE3 (palignr) targets - pre-SSSE3 is only enabled on i8 and i16 vector targets where it is a more definite performance gain.
I've also added a separate byte shift shuffle (lowerVectorShuffleAsByteShift) that makes use of the ability of the SLLDQ/SRLDQ instructions to implicitly shift in zero bytes to avoid the need to create a zero register if we had used palignr.
Differential Revision: http://reviews.llvm.org/D5699
llvm-svn: 222340
AliasSetTracker::addUnknown may create an AliasSet devoid of pointers
just to contain an instruction if no suitable AliasSet already exists.
It will then AliasSet::addUnknownInst and we will be done.
However, it's possible for addUnknown to choose an existing AliasSet to
addUnknownInst.
If this were to occur, we are in a bit of a pickle: removing pointers
from the AliasSet can cause the entire AliasSet to become destroyed,
taking our unknown instructions out with them.
Instead, keep track whether or not our AliasSet has any unknown
instructions.
This fixes PR21582.
llvm-svn: 222338
This is to be consistent with StringSet and ultimately with the standard
library's associative container insert function.
This lead to updating SmallSet::insert to return pair<iterator, bool>,
and then to update SmallPtrSet::insert to return pair<iterator, bool>,
and then to update all the existing users of those functions...
llvm-svn: 222334
Using AA during CodeGen is very useful for in-order cores. It is less useful for ooo cores. Also I find
enabling useAA for Cortex-A57 may generate worse code for some test cases. If useAA in codegen is improved
and benefical for ooo cores, we can enable it again.
llvm-svn: 222333
SeparateConstOffsetFromGEP can gives more optimizaiton opportunities related to GEPs, which benefits EarlyCSE
and LICM. By enabling these passes we can have better address calculations and generate a better addressing
mode. Some SPEC 2006 benchmarks (astar, gobmk, namd) have obvious improvements on Cortex-A57.
Reviewed in http://reviews.llvm.org/D5864.
llvm-svn: 222331
If LowerGEP is enabled, it can lower a GEP with multiple indices into GEPs with a single index
or arithmetic operations. Lowering GEPs can always extract structure indices. Lowering GEPs can
also give use more optimization opportunities. It can benefit passes like CSE, LICM and CGP.
Reviewed in http://reviews.llvm.org/D5864
llvm-svn: 222328
Having two ways to do this doesn't seem terribly helpful and
consistently using the insert version (which we already has) seems like
it'll make the code easier to understand to anyone working with standard
data structures. (I also updated many references to the Entry's
key and value to use first() and second instead of getKey{Data,Length,}
and get/setValue - for similar consistency)
Also removes the GetOrCreateValue functions so there's less surface area
to StringMap to fix/improve/change/accommodate move semantics, etc.
llvm-svn: 222319
StringSet is still a bit dodgy in that it exposes the raw iterator of
the StringMap parent, which exposes the weird detail that StringSet
actually has a 'value'... but anyway, this is useful for a handful of
clients that want to reference the newly inserted/persistent string data
in the StringSet/Map/Entry/thing.
llvm-svn: 222302
It printed out base relocation table header as table entry.
This patch also makes llvm-readobj to not skip ABSOLUTE entries
becuase it was confusing.
llvm-svn: 222299
The other option would be to do something like
if (that.isSingleWord())
VAL = that.VAL;
else
pVal = that.pVal
This bug was causing 86TTI::getIntImmCost to be miscompiled in a LTO
bootstrap in stage2, causing the build of stage3 to fail.
LLVM is getting quiet good at exploiting this. Not sure if there is anything
a sanitizer could do to help
llvm-svn: 222294
Summary:
move the code from BreakCriticalEdges::runOnFunction()
into a separate utility function llvm::SplitAllCriticalEdges()
so that it can be used independently.
No functionality change intended.
Test Plan: check-llvm
Reviewers: nlewycky
Reviewed By: nlewycky
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D6313
llvm-svn: 222288
This partially makes up for not having address spaces
used for alias analysis in some simple cases.
This is not yet enabled by default so shouldn't change anything yet.
llvm-svn: 222286
Assuming unmodeled side effects interferes with some scheduling
opportunities.
Don't put it in the base class of DS instructions since there
are a few weird effecting, non load/store instructions there.
llvm-svn: 222285
Under many circumstances the stack is not 32-byte aligned, resulting in the use of the vmovups/vmovupd/vmovdqu instructions when inserting ymm reloads/spills.
This minor patch adds these instructions to the isFrameLoadOpcode/isFrameStoreOpcode helpers so that they can be correctly identified and not be treated as folded reloads/spills.
This has also been noticed by http://llvm.org/bugs/show_bug.cgi?id=18846 where it was causing redundant spills - I've added a reduced test case at test/CodeGen/X86/pr18846.ll
Differential Revision: http://reviews.llvm.org/D6252
llvm-svn: 222281
shift-right for booleans (i1).
Arithmetic shift-right immediate with sign-/zero-extensions also works for
boolean values. Update the assert and the test cases to reflect that fact.
llvm-svn: 222272
shift-right for booleans (i1).
Logical shift-right immediate with sign-/zero-extensions also works for boolean
values. Update the assert and the test cases to reflect that fact.
llvm-svn: 222270
We would attempt to replace an frem's operand with the same operand.
This would cause InstCombine to think real work was done, causing
InstCombine to enter an infinite loop.
This fixes the second part of PR21576.
llvm-svn: 222265
Summary: This will help in testing libc++ and libc++abi with tsan.
Reviewers: samsonov
Reviewed By: samsonov
Subscribers: samsonov, llvm-commits
Differential Revision: http://reviews.llvm.org/D6283
llvm-svn: 222258
Shifts also perform sign-/zero-extends to larger types, which requires us to emit
an integer extend instead of a simple COPY.
Related to PR21594.
llvm-svn: 222257
This should expose more of the actually used VALU
instructions to the machine optimization passes.
This also should help getting i1 handling into a better state.
For not entirly understood reasons, this fixes the split-scalar-i64-add.ll
test where a 64-bit add would only partially be moved to the VALU
resulting in use of undefined VCC.
llvm-svn: 222256
"optimizeCompareInstr" converts compares (cmp/cmn) into plain sub/add
instructions when the flags are not used anymore. This conversion is valid for
most instructions, but not all. Some instructions that don't set the flags
(e.g. sub with immediate) can set the SP, whereas the flag setting version uses
the same encoding for the "zero" register.
Update the code to also check for the return register before performing the
optimization to make sure that a cmp doesn't suddenly turn into a sub that sets
the stack pointer.
I don't have a test case for this, because it isn't easy to trigger.
llvm-svn: 222255
This change emits a COPY for a shift-immediate with a "zero" shift value.
This fixes PR21594 where we emitted a shift instruction with an incorrect
immediate operand.
llvm-svn: 222247
EarlyCSE is giving up on the current instruction immediately when it recognizes that the current instruction makes a previous store trivially dead. There's no reason to do this. Once the previous store has been deleted, it's perfectly legal to remember the value of the current store (for value forwarding) and the fact the store occurred (it could be dead too!).
Reviewed by: Hal
Differential Revision: http://reviews.llvm.org/D6301
llvm-svn: 222241
It is impossible for (x & INT_MAX) == 0 && x == INT_MAX to ever be true.
While this sort of reasoning should normally live in InstSimplify,
the machinery that derives this result is not trivial to split out.
llvm-svn: 222230
- Make CallGraphSCCPass's paragraph about doFinalization refer to
runOnSCC instead of runOnFunction, since that's what it's about.
- Fix a reference in the FunctionPass paragraph.
llvm-svn: 222222
Usually global variables are in a retain list and instanciated before
any call to constructImportedEntityDIE is made. This isn't true for
forward declarations though.
The testcase for this change is generated by a clang patched to emit
such forward declarations (patch at http://reviews.llvm.org/D6173
which will land soon). The updated testcase tests more than just
global variables, it now tests every type of 'using' clause we
support.
llvm-svn: 222217
I added a pessimization in r217102 to prevent miscompiles when the
incremented induction variable was used in a comparison; it would be
poison.
Try to use the incremented induction variable more often when we can be
sure that the increment won't end in poison.
Differential Revision: http://reviews.llvm.org/D6222
llvm-svn: 222213
Having the operands at the back prevents subclasses from safely adding
fields. Move them to the front.
Instead of replicating the custom `malloc()`, `free()` and `DestroyFlag`
logic that was there before, overload `new` and `delete`.
I added calls to a new `GenericMDNode::dropAllReferences()` in
`LLVMContextImpl::~LLVMContextImpl()`. There's a maze of callbacks
happening during teardown, and this resolves them before we enter
the destructors.
Part of PR21532.
llvm-svn: 222211
Split `MDNode` into two classes:
- `GenericMDNode`, which is uniquable (and for now, always starts
uniqued). Once `Metadata` is split from the `Value` hierarchy, this
class will lose the ability to RAUW itself.
- `MDNodeFwdDecl`, which is used for the "temporary" interface, is
never uniqued, and isn't managed by `LLVMContext` at all.
I've left most of the guts in `MDNode` for now, but I'll incrementally
move things to the right places (or delete the functionality, as
appropriate).
Part of PR21532.
llvm-svn: 222205
use DIScopeRef.
A paired commit at clang will follow to show cases where we will use an
identifer for the context of a global variable.
rdar://18958417
llvm-svn: 222195
Change uniquing from a `FoldingSet` to a `DenseSet` with custom
`DenseMapInfo`. Unfortunately, this doesn't save any memory, since
`DenseSet<T>` is a simple wrapper for `DenseMap<T, char>`, but I'll come
back to fix that later.
I used the name `GenericDenseMapInfo` to the custom `DenseMapInfo` since
I'll be splitting `MDNode` into two classes soon: `MDNodeFwdDecl` for
temporaries, and `GenericMDNode` for everything else.
I also added a non-debug-info reduced version of a type-uniquing test
that started failing on an earlier draft of this patch.
Part of PR21532.
llvm-svn: 222191
This reverts commit r222183.
Broke on the MSVC buildbots due to MSVC not producing default move
operations - I'd fix it immediately but just broke my build system a
bit, so backing out until I have a chance to get everything going again.
llvm-svn: 222187
The next step is to actually use unique_ptr in TreePatternNode's
Children vector. That will be more intrusive, and may not work,
depending on exactly how these things are handled (I have a bad
suspicion things are shared more than they should be, making this more
DAG than tree - but if it's really a tree, unique_ptr should suffice)
llvm-svn: 222183
This was resulting in use of a register after a kill.
For some reason this showed up as a problem in many tests
when moving the SIFixSGPRCopies pass closer to instruction
selection.
llvm-svn: 222175
When converting a switch to a lookup table we might have to generate a bitmaks
to encode and check for holes in the original switch statement.
The type of this mask depends on the number of switch statements, which can
result in illegal types for pretty much all architectures.
To avoid unnecessary type legalization and help FastISel this commit increases
the size of the bitmask to next power-of-2 value when necessary.
This fixes rdar://problem/18984639.
llvm-svn: 222168
They were producing the wrong result if NumBits == BitsInWord. The old mask
produced -1, the new mask 0.
This should fix the 32 bit bots.
llvm-svn: 222166
The specializations were broken. For example,
void foo(const CallGraph *G) {
auto I = GraphTraits<const CallGraph *>::nodes_begin(G);
auto K = I++;
...
}
or
void bar(const CallGraphNode *N) {
auto I = GraphTraits<const CallGraphNode *>::nodes_begin(G);
auto K = I++;
....
}
would not compile.
Patch by Speziale Ettore!
llvm-svn: 222149
The triple parser should only accept existing architecture names
when the triple starts with armv, armebv, thumbv or thumbebv.
Patch by Gabor Ballabas.
llvm-svn: 222129
SCEVDivision::divide constructed an object of SCEVDivision<Derived>
instead of Derived. divide would call visit which would cast the
SCEVDivision<Derived> to type Derived. As it happens,
SCEVDivision<Derived> and Derived currently have the same layout but
this is fragile and grounds for UB.
Instead, just construct Derived. No functional change intended.
llvm-svn: 222126
This was motivated by a bug which caused code like this to be
miscompiled:
declare void @take_ptr(i8*)
define void @test() {
%addr1.32 = alloca i8
%addr2.32 = alloca i32, i32 1028
call void @take_ptr(i8* %addr1)
ret void
}
This was emitting the following assembly to get the value of %addr1:
add r0, sp, #1020
add r0, r0, #8
However, "add r0, r0, #8" is not a valid Thumb1 instruction, and this
could not be assembled. The generated object file contained this,
resulting in r0 holding SP+8 rather tha SP+1028:
add r0, sp, #1020
add r0, sp, #8
This function looked like it could have caused miscompilations for
other combinations of registers and offsets (though I don't think it is
currently called with these), and the heuristic it used did not match
the emitted code in all cases.
llvm-svn: 222125
We were a little lax in a few areas:
- We pretended that import libraries were like any old COFF file, they
are not. In fact, they aren't really COFF files at all, we should
probably grow some specialized functionality to handle them smarter.
- Our symbol iterators were more than happy to attempt to go past the
end of the symbol table if you had a symbol with a bad list of
auxiliary symbols.
llvm-svn: 222124
Some optimisations in DAGCombiner cause miscompilations for targets that use
TargetLowering::UndefinedBooleanContent, because they assume that the results
of a SELECT_CC node are boolean values, and can be safely ANDed, ORed and
XORed. These optimisations are only valid for targets that use
ZeroOrOneBooleanContent or ZeroOrNegativeOneBooleanContent.
This is a follow-up to D6210/r221693.
llvm-svn: 222123
This is a simple optimization for switch table lookup:
It computes the output value directly with an (optional) mul and add if there is a linear mapping between index and output.
Example:
int f1(int x) {
switch (x) {
case 0: return 10;
case 1: return 11;
case 2: return 12;
case 3: return 13;
}
return 0;
}
generates:
define i32 @f1(i32 %x) #0 {
entry:
%0 = icmp ult i32 %x, 4
br i1 %0, label %switch.lookup, label %return
switch.lookup:
%switch.offset = add i32 %x, 10
ret i32 %switch.offset
return:
ret i32 0
}
llvm-svn: 222121
Indices into the table are stored in each MCRegisterClass instead of a pointer. A new method, getRegClassName, is added to MCRegisterInfo and TargetRegisterInfo to lookup the string in the table.
llvm-svn: 222118
This adds back r222061, but now calls initializePAEvalPass from the correct
library to avoid link problems.
Original message:
Don't make assumptions about the name of private global variables.
Private variables are can be renamed, so it is not reliable to make
decisions on the name.
The name is also dropped by the assembler before getting to the
linker, so using the name causes a disconnect between how llvm makes a
decision (var name) and how the linker makes a decision (section it is
in).
This patch changes one case where we were looking at the variable name to use
the section instead.
Test tuning by Michael Gottesman.
llvm-svn: 222117
It turns out that not all users of SCEVDivision want the same
signedness. Let the users determine which operation they'd like by
explicitly choosing SCEVUDivision or SCEVSDivision.
findArrayDimensions and computeAccessFunctions will use SCEVSDivision
while HowFarToZero will use SCEVUDivision.
llvm-svn: 222104
Summary:
Several places in DependenceAnalysis assumes both SCEVs in a subscript pair
share the same integer type. For instance, isKnownPredicate calls
SE->getMinusSCEV(X, Y) which asserts X and Y share the same type. However,
DependenceAnalysis fails to ensure this assumption when producing a subscript
pair, causing tests such as NonCanonicalizedSubscript to crash. With this
patch, DependenceAnalysis runs unifySubscriptType before producing any
subscript pair, ensuring the assumption.
Test Plan:
Added NonCanonicalizedSubscript.ll on which DependenceAnalysis before the fix
crashed because subscripts have different types.
Reviewers: spop, sebpop, jingyue
Reviewed By: jingyue
Subscribers: eliben, meheff, llvm-commits
Differential Revision: http://reviews.llvm.org/D6289
llvm-svn: 222100
HowFarToZero was supposed to use unsigned division in order to calculate
the backedge taken count. However, SCEVDivision::divide performs signed
division. Unless I am mistaken, no users of SCEVDivision actually want
signed arithmetic: switch to udiv and urem.
This fixes PR21578.
llvm-svn: 222093
A few things:
- computeKnownBits is relatively expensive, let's delay its use as long
as we can.
- Don't create two APInt values just to run computeKnownBits on a
ConstantInt, we already know the exact value!
- Avoid creating a temporary APInt value in order to calculate unary
negation.
llvm-svn: 222092
This patch teaches the DAGCombiner how to combine shuffles according to rules:
shuffle(shuffle(A, Undef, M0), B, M1) -> shuffle(B, A, M2)
shuffle(shuffle(A, B, M0), B, M1) -> shuffle(B, A, M2)
shuffle(shuffle(A, B, M0), A, M1) -> shuffle(B, A, M2)
llvm-svn: 222090
Updated X86TargetLowering::isShuffleMaskLegal to match SHUFP masks with commuted inputs and PSHUFD masks that reference the second input.
As part of this I've refactored isPSHUFDMask to work in a more general manner and allow it to match against either the first or second input vector.
Differential Revision: http://reviews.llvm.org/D6287
llvm-svn: 222087
This gets the correct NaN behavior based on the compare type
the hardware uses. This now passes the new piglit test I have
for this on SI.
Add stricter tests for the operand order.
llvm-svn: 222079
While this program worked correctly with small example programs, larger
ones tickled this bug. I'm working on a reduction because my program is
quite large.
llvm-svn: 222078
This is so it could potentially be used by SI. However, the current
implementation does not always produce correct results, so the
IntegerDivisionPass is being used instead.
llvm-svn: 222072
Make explicit the requirement that most IR values in `DIBuilder` are
`Constant`. This requires a follow-up change in clang.
Part of PR21532.
llvm-svn: 222070
Now that `MDString` and `MDNode` have a common base class, use it. Note
that it's not useful to assume subclasses of `Metadata` must be one or
the other since we'll be adding more subclasses soon enough.
Part of PR21532.
llvm-svn: 222064
Summary:
The current "WinEH" exception handling type is more about Itanium-style
LSDA tables layered on top of the Windows native unwind info format
instead of .eh_frame tables or EHABI unwind info. Use the name
"ItaniumWinEH" to better reflect the hybrid nature of the design.
Also rename isExceptionHandlingDWARF to usesItaniumLSDAForExceptions,
since the LSDA is part of the Itanium C++ ABI document, and not the
DWARF standard.
Reviewers: echristo
Subscribers: llvm-commits, compnerd
Differential Revision: http://reviews.llvm.org/D6279
llvm-svn: 222062
Private variables are can be renamed, so it is not reliable to make
decisions on the name.
The name is also dropped by the assembler before getting to the
linker, so using the name causes a disconnect between how llvm makes a
decision (var name) and how the linker makes a decision (section it is
in).
This patch changes one case where we were looking at the variable name to use
the section instead.
Test tuning by Michael Gottesman.
llvm-svn: 222061
We use to track quite a few "adjusted" offsets through the FrameLowering code
to account for changes in the prologue instructions as we went and allow the
emission of correct CFA annotations. However, we were missing a couple of cases
and the code was almost impenetrable.
It's easier to just add any stack-adjusting instruction to a list and emit them
together.
llvm-svn: 222057
When we folded the DPR alignment gap into a push, we weren't noting the extra
distance from the beginning of the push to the FP, and so FP ended up pointing
at an incorrect offset.
The .cfi_def_cfa_offset directives are still wrong in this case, but I think
that can be improved by refactoring.
llvm-svn: 222056
The test's DWARF stubs were there just to trigger the emission of .cfi
directives. Fortunately, the NetBSD ABI already demands proper DWARF unwind
info, so it's easier to just use that triple.
llvm-svn: 222055
FYI, removed the unused MCInstrAnalysis as it does not exist for 64-bit ARM and
was causing a “couldn't initialize disassembler for target” error.
llvm-svn: 222045
We would attempt to replace a fptrunc of an frem with an identical
fptrunc. This would cause the new fptrunc to be added to the worklist.
Of course, this results in an infinite loop because we will keep
visiting the newly created fptruncs.
This fixes PR21576.
llvm-svn: 222040
doing Load PRE"
This commit updates the failing test in
Analysis/TypeBasedAliasAnalysis/gvn-nonlocal-type-mismatch.ll
The failing test is sensitive to the order in which we process loads. This
version turns on the RPO traversal instead of the while DT traversal in GVN.
The new test code is functionally same just the order of loads that are
eliminated is swapped.
This new version also fixes an issue where GVN splits a critical edge and
potentially invalidate the RPO/DT iterator.
llvm-svn: 222039
based on instruction complexity
The order that tablegen fast-isel instruction code is generated is
currently based on the text of the predicate (using string
less-than). This patch changes this to instead use the instruction
complexity. Because the complexities are not unique a C++ multimap is
used instead of a map.
This fixes the problem where code with no predicate always comes out
first (the empty string always compares as less than all other
strings) thus making the code with predicates dead code. See the FMUL
code in PPCFastISel.cpp for an example. It also more closely matches
the normal codegen ordering. Some error checking in the tablegen
fast-isel code is fixed as well.
Patch by Bill Seurer.
llvm-svn: 222038
If we have spilled the value of the m0 register, then we need to restore
it with v_readlane_b32 to a regular sgpr, because v_readlane_b32 can't
write to m0.
v_readlane_b32 can't write to m0, so
llvm-svn: 222036
This allows COFF targets to emit accelerator tables
when requested by -dwarf-accel-tables=Enable instead
of aborting. The test DebugInfo/cross-cu-inlining.ll
covers this on COFF platforms.
llvm-svn: 222034
ELF targets (and maybe COFF) use relocations when referring
to strings in the .debug_str section. Handle that in the
accelerator table dumper. This commit restores the
test/DebugInfo/cross-cu-inlining.ll test to its expected
platform independant form, validating that the fix works
(this test failed on linux boxes).
llvm-svn: 222029
If this workaround gets the bots green, then we have to find out
why the -dwarf-accel-tables=Enable option doesn't work as
expected on non-darwin platforms.
llvm-svn: 222007
Prior to this commit fmul and fadd binary operators were being canonicalized for
both scalar and vector versions. We now canonicalize add, mul, and, or, and xor
vector instructions.
llvm-svn: 222006
This reverts commit r221842 which was a revert of r221836 and of the
test parts of r221837.
This new version fixes an UB bug pointed out by David (along with
addressing some other review comments), makes some dumping more
resilient to broken input data and forces the accelerator tables
to be dumped in the tests where we use them (this decision is
platform specific otherwise).
llvm-svn: 222003
This patch adds builtin support for xvdivdp and xvdivsp, along with a
test case. Straightforward stuff.
There's a companion patch for Clang.
llvm-svn: 221983
In support of serializing executables, obj2yaml now records the virtual address
and size of sections. It also serializes whatever we strictly need from
the PE header, it expects that it can reconstitute everything else via
inference.
yaml2obj can reconstitute a fully linked executable.
In order to get executables correctly serialized/deserialized, other
bugs were fixed as a circumstance. We now properly respect file and
section alignments. We also avoid writing out string tables unless they
are strictly necessary.
llvm-svn: 221975
This matches std::vector and is more efficient as it avoids
truncations.
With this the text segment of opt goes from 19705442 bytes
to 19703930 bytes.
llvm-svn: 221973
This teaches CoverageMapping::getCoveredFunctions to filter to a
particular file and uses that to replace most of the logic found in
llvm-cov report.
llvm-svn: 221962
getTargetConstant should only be used when you can guarantee the instruction
selected will be able to cope with the raw value. BUILD_VECTOR is rather too
generic for this so we should use getConstant instead. In that case, an
instruction can still consume the constant, but if it doesn't it'll be
materialised through its own round of ISel.
Should fix PR21352.
llvm-svn: 221961
Stop using `Value::getName()` to get the string behind an `MDString`.
Switch to `StringMapEntry<MDString>` so that we can find the string by
its coallocation.
This is part of PR21532.
llvm-svn: 221960
When "MBB->Insert(It, ...)" is called, we want It to be pointing inside the
correct basic block. No actual failures at the moment, but it's caused problems
before.
llvm-svn: 221953
Hide the fact that `MDString`'s string is stored in `Value::Name` --
that's going to change soon. Update the only in-tree client that was
using it instead of `Value::getString()`.
Part of PR21532.
llvm-svn: 221951
Summary:
This has most of what is needed for mips fast-isel call lowering for O32.
What is missing I will add on the next patch because this patch is already too large.
It should not be doing anything wrong but it will punt on some cases that it is basically
capable of doing.
The mechanism is there for parameters to be passed on the stack but I have not enabled it because it serves as a way for now to prevent some of the strange cases of O32 register passing that I have not fully checked yet and have some issues.
The Mips O32 abi rules are very complicated as far how data is passed in floating and integer registers.
However there is a way to think about this all very simply and this implementation reflects that.
Basically, the ABI rules are written as if everything is passed on the stack and aligned as such.
Once that is conceptually done, it is nearly trivial to reassign those locations to registers and
then all the complexity disappears.
So I have told tablegen that all the data is passed on the stack and during the lowering I fix
this by assigning to registers as per the ABI doc.
This has been my approach and you can line up what I did with the ABI document and see 1 to 1 what
is going on.
Test Plan: callabi.ll
Reviewers: dsanders
Reviewed By: dsanders
Subscribers: jholewinski, echristo, ahatanak, llvm-commits, rfuhler
Differential Revision: http://reviews.llvm.org/D5714
llvm-svn: 221948
Fix for LLI failure on Windows\X86: http://llvm.org/PR5053
LLI.exe crashes on Windows\X86 when single precession floating point
intrinsics like the following are used: acos, asin, atan, atan2, ceil,
copysign, cos, cosh, exp, floor, fmin, fmax, fmod, log, pow, sin, sinh,
sqrt, tan, tanh
The above intrinsics are defined as inline-expansions in math.h, and are
not exported by msvcr120.dll (Win32 API GetProcAddress returns null).
For an FREM instruction, the JIT compiler generates a call to a stub for
the fmodf() intrinsic, and adds a relocation to fixup at load time. The
loader searches the libraries for the function, but fails because the
symbol is not exported. So, the call target remains NULL and the
execution crashes.
Since the math functions are loaded at JIT/runtime, the JIT can patch
CALL instruction directly instead of the searching the libraries'
exported symbols. However, this fix caused build failures due to
unresolved symbols like _fmodf at link time.
Therefore, the current fix defines helper functions in the Runtime
link/load library to perform the above operations. The address of these
helper functions are used to patch up the CALL instruction at load time.
Reviewers: lhames, rnk
Reviewed By: rnk
Differential Revision: http://reviews.llvm.org/D5387
Patch by Swaroop Sridhar!
llvm-svn: 221947
Windows defines NULL to 0, which when used as an argument to a variadic
function, is not a null pointer constant. As a result, Clang's
-Wsentinel fires on this code. Using '0' would be wrong on most 64-bit
platforms, but both MSVC and Clang make it work on Windows. Sidestep the
issue with nullptr.
llvm-svn: 221940
in-lane shuffles that aren't always handled well by the current vector
shuffle lowering.
No functionality change yet, that will follow in a subsequent commit.
llvm-svn: 221938
The generic FastISel code would bail, because it can't emit a sign-extend for
AArch64. This copies the code over and uses AArch64 specific emit functions.
This is not ideal and 'computeAddress' should handles this, so it can fold the
address computation into the memory operation.
I plan to clean up 'computeAddress' anyways, so I will add that in a future
commit.
Related to rdar://problem/18962471.
llvm-svn: 221923
These were directly using the old base instruction
class, and specifying the wrong register classes
for operands. The operands can be the other special
inputs besides SGPRs. The op name was also being
directly used for the asm string, so this was printed
without any operands.
llvm-svn: 221921
If a function is just an unreachable, this would hit a
"this is not a MachO target" assertion because of setting
HasSubsectionViaSymbols.
llvm-svn: 221920
e.g. v_mad_f32 a, b, c -> v_mad_f32 b, a, c
This simplifies matching v_madmk_f32.
This looks somewhat surprising, but it appears to be
OK to do this. We can commute src0 and src1 in all
of these instructions, and that's all that appears
to matter.
llvm-svn: 221910
Normally entries can only move to a lower address, but when that wasn't viable,
the user's block was considered anyway. Unfortunately, it went via
createNewWater which wasn't designed to handle the case where there's already
an island after the block.
Unfortunately, the test we have is slow and fragile, and I couldn't reduce it
to anything sane even with the @llvm.arm.space intrinsic. The test change here
is recreating the previous one after the change.
rdar://problem/18545506
llvm-svn: 221905
We were using a naive heuristic to determine whether a basic block already had
an unconditional branch at the end. This mostly corresponded to reality
(assuming branches got optimised) because there's not much point in a branch to
the next block, but could go wrong.
llvm-svn: 221904
Creating tests for the ConstantIslands pass is very difficult, since it depends
on precise layout details. Having the ability to precisely inject a number of
bytes into the stream helps greatly.
llvm-svn: 221903
This will become the root of a new class hierarchy separate from
`Value`. As a first step, stick it between `Value` and `MDNode`.
This is part of PR21532.
llvm-svn: 221886
Let's try this again...
This reverts r219432, plus a bug fix.
Description of the bug in r219432 (by Nick):
The bug was using AllPositive to break out of the loop; if the loop break
condition i != e is changed to i != e && AllPositive then the
test_modulo_analysis_with_global test I've added will fail as the Modulo will
be calculated incorrectly (as the last loop iteration is skipped, so Modulo
isn't updated with its Scale).
Nick also adds this comment:
ComputeSignBit is safe to use in loops as it takes into account phi nodes, and
the == EK_ZeroEx check is safe in loops as, no matter how the variable changes
between iterations, zero-extensions will always guarantee a zero sign bit. The
isValueEqualInPotentialCycles check is therefore definitely not needed as all
the variable analysis holds no matter how the variables change between loop
iterations.
And this patch also adds another enhancement to GetLinearExpression - basically
to convert ConstantInts to Offsets (see test_const_eval and
test_const_eval_scaled for the situations this improves).
Original commit message:
This reverts r218944, which reverted r218714, plus a bug fix.
Description of the bug in r218714 (by Nick):
The original patch forgot to check if the Scale in VariableGEPIndex flipped the
sign of the variable. The BasicAA pass iterates over the instructions in the
order they appear in the function, and so BasicAliasAnalysis::aliasGEP is
called with the variable it first comes across as parameter GEP1. Adding a
%reorder label puts the definition of %a after %b so aliasGEP is called with %b
as the first parameter and %a as the second. aliasGEP later calculates that %a
== %b + 1 - %idxprom where %idxprom >= 0 (if %a was passed as the first
parameter it would calculate %b == %a - 1 + %idxprom where %idxprom >= 0) -
ignoring that %idxprom is scaled by -1 here lead the patch to incorrectly
conclude that %a > %b.
Revised patch by Nick White, thanks! Thanks to Lang to isolating the bug.
Slightly modified by me to add an early exit from the loop and avoid
unnecessary, but expensive, function calls.
Original commit message:
Two related things:
1. Fixes a bug when calculating the offset in GetLinearExpression. The code
previously used zext to extend the offset, so negative offsets were converted
to large positive ones.
2. Enhance aliasGEP to deduce that, if the difference between two GEP
allocations is positive and all the variables that govern the offset are also
positive (i.e. the offset is strictly after the higher base pointer), then
locations that fit in the gap between the two base pointers are NoAlias.
Patch by Nick White!
llvm-svn: 221876
Split getObject's smarts into checkOffset, use this to replace the
handwritten check in getSectionContents. Similarly, replace checks in
section_rel_begin/section_rel_end with getNumberOfRelocations.
No functionality change intended.
llvm-svn: 221873
On error conditions, relocAddressLess might claim that a value is less
than itself. Instead, abort llvm-readobj. No functionality change
intended.
llvm-svn: 221872
lib/Object is supposed to be robust to malformed object files. Don't
assert if we don't have a symbol table. I'll try to come up with a test
case later.
llvm-svn: 221870
getObject didn't consider the case where a pointer came before the start
of the object file. No test is included, trying to come up with
something reasonable.
llvm-svn: 221868