Change builtin function name and signature ( add third parameter - rounding mode ).
Added tests for intrinsics.
Differential Revision: http://reviews.llvm.org/D10473
llvm-svn: 239888
This is now living in MemoryLocation, which is what it pertains to. It
is also an enum there rather than a static data member which is left
never defined.
llvm-svn: 239886
that it is its own entity in the form of MemoryLocation, and update all
the callers.
This is an entirely mechanical change. References to "Location" within
AA subclases become "MemoryLocation", and elsewhere
"AliasAnalysis::Location" becomes "MemoryLocation". Hope that helps
out-of-tree folks update.
llvm-svn: 239885
The original change broke clang side tests. I will be submitting those momentarily. This change includes post commit feedback on the original change from from Pete Cooper.
Original Submission comments:
If a parameter to a function is known non-null, use the existing parameter attributes to record that fact at the call site. This has no optimization benefit by itself - that I know of - but is an enabling change for http://reviews.llvm.org/D9129.
Differential Revision: http://reviews.llvm.org/D9132
llvm-svn: 239849
A reduction is a special kind of recurrence. In the loop vectorizer we currently
identify basic reductions. Future patches will extend this to identifying basic
recurrences.
llvm-svn: 239835
This change is hopefully NFC. The only tricky part is that I changed the context instruction being used to the branch rather than the comparison. I believe both to be correct, but the branch is strictly more powerful. With the moved code, using the branch instruction is required for the basic block comparison test to return the same result. The previous code was able to directly access both the branch and the comparison where the revised code is not.
Differential Revision: http://reviews.llvm.org/D9652
llvm-svn: 239797
`LLVM_ENABLE_MODULES` builds sometimes fail because `Intrinsics.td`
needs to regenerate `Instrinsics.h` before anyone can include anything
from the LLVM_IR module. Represent the dependency explicitly to prevent
that.
llvm-svn: 239796
If a parameter to a function is known non-null, use the existing parameter attributes to record that fact at the call site. This has no optimization benefit by itself - that I know of - but is an enabling change for http://reviews.llvm.org/D9129.
Differential Revision: http://reviews.llvm.org/D9132
llvm-svn: 239795
This patch adds the safe stack instrumentation pass to LLVM, which separates
the program stack into a safe stack, which stores return addresses, register
spills, and local variables that are statically verified to be accessed
in a safe way, and the unsafe stack, which stores everything else. Such
separation makes it much harder for an attacker to corrupt objects on the
safe stack, including function pointers stored in spilled registers and
return addresses. You can find more information about the safe stack, as
well as other parts of or control-flow hijack protection technique in our
OSDI paper on code-pointer integrity (http://dslab.epfl.ch/pubs/cpi.pdf)
and our project website (http://levee.epfl.ch).
The overhead of our implementation of the safe stack is very close to zero
(0.01% on the Phoronix benchmarks). This is lower than the overhead of
stack cookies, which are supported by LLVM and are commonly used today,
yet the security guarantees of the safe stack are strictly stronger than
stack cookies. In some cases, the safe stack improves performance due to
better cache locality.
Our current implementation of the safe stack is stable and robust, we
used it to recompile multiple projects on Linux including Chromium, and
we also recompiled the entire FreeBSD user-space system and more than 100
packages. We ran unit tests on the FreeBSD system and many of the packages
and observed no errors caused by the safe stack. The safe stack is also fully
binary compatible with non-instrumented code and can be applied to parts of
a program selectively.
This patch is our implementation of the safe stack on top of LLVM. The
patches make the following changes:
- Add the safestack function attribute, similar to the ssp, sspstrong and
sspreq attributes.
- Add the SafeStack instrumentation pass that applies the safe stack to all
functions that have the safestack attribute. This pass moves all unsafe local
variables to the unsafe stack with a separate stack pointer, whereas all
safe variables remain on the regular stack that is managed by LLVM as usual.
- Invoke the pass as the last stage before code generation (at the same time
the existing cookie-based stack protector pass is invoked).
- Add unit tests for the safe stack.
Original patch by Volodymyr Kuznetsov and others at the Dependable Systems
Lab at EPFL; updates and upstreaming by myself.
Differential Revision: http://reviews.llvm.org/D6094
llvm-svn: 239761
Summary:
Scalarizer has two data structures that hold information about changes
to the function, Gathered and Scattered. These are cleared in finish()
at the end of runOnFunction() if finish() detects any changes to the
function.
However, finish() was checking for changes by only checking if
Gathered was non-empty. The function visitStore() only modifies
Scattered without touching Gathered. As a result, Scattered could have
ended up having stale data if Scalarizer only scalarized store
instructions. Since the data in Scattered is used during the execution
of the pass, this introduced dangling pointer errors.
The fix is to check whether both Scattered and Gathered are empty
before deciding what to do in finish().
Reviewers: srhines
Reviewed By: srhines
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D10422
llvm-svn: 239644
It is valid for globals to be unnamed, but aliases must have a name. To avoid
creating invalid IR, we need to assign names to any aliases we create that
point to unnamed objects that have been moved into combined globals.
llvm-svn: 239590
DebugLoc::getFnDebugLoc() should soon be removed. Also,
getDISubprogram() might become more effective soon and wouldn't need to
scan debug locations at all, if function-level metadata would be emitted
by Clang.
llvm-svn: 239586
Summary:
A side effect of this change is that it IRBuilder now automatically
created debug info locations for new instructions, which is the
same as debug location of insertion point. This is fine for the
functions in questions (GetStoreValueForLoad and
GetMemInstValueForLoad), as they are used in two situations:
* GVN::processLoad, which tries to eliminate a load. In this case
new instructions would have the same debug location as the load they
eventually replace;
* MaterializeAdjustedValue, which adds new instructions to the end
of the basic blocks, which could later be used to replace the load
definition. In this case we don't yet know the way the load would
be eventually replaced (either by assembling the precomputed values
via PHI, or by using them directly), so just using the basic block
strategy seems to be reasonable. There is also a special case
in the code that *would* adjust the location of the last
instruction replacing the load definition to the location of the
load.
Test Plan: regression test suite
Reviewers: echristo, dberlin, dblaikie
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D10405
llvm-svn: 239585
Use IRBuilder::Create(Cond)?Br instead of constructing instructions
manually with BranchInst::Create(). It's consistent with other
uses of IRBuilder in this pass, and has an additional important
benefit:
Using IRBuilder will ensure that new branch instruction will get
the same debug location as original terminator instruction it will
eventually replace.
For now I'm not adding a testcase, as currently original terminator
instruction also lack debug location due to missing debug location
propagation in BasicBlock::splitBasicBlock. That is, the testcase
will accompany the fix for the latter I'm going to mail soon.
llvm-svn: 239550
This only updates one of the uses. The other is used in cases
that may never touch memory, so I'm not sure why this is even
calling it. Perhaps there should be a new, similar hook for such
cases or pass -1 for unknown address space.
llvm-svn: 239540
If the first argument to a function is a 'this' argument and the second
has the sret attribute, the ArgumentPromotion pass may promote the 'this'
argument to more than one argument, violating the IR constraint that 'sret'
may only be applied to the first or second argument.
Although this IR constraint is arguably unnecessary, it highlighted the fact
that ArgPromotion does not need to preserve this attribute. Dropping the
attribute reduces register pressure in the backend by avoiding the register
copy required by sret. Because sret implies noalias, we also replace the
former with the latter.
Differential Revision: http://reviews.llvm.org/D10353
llvm-svn: 239488
O2 compiles just before GlobalDCE, unless we are preparing for LTO.
This pass eliminates available externally globals (turning them into
declarations), regardless of whether they are dead/unreferenced, since
we are guaranteed to have a copy available elsewhere at link time.
This enables additional opportunities for GlobalDCE.
If we are preparing for LTO (e.g. a -flto -c compile), the pass is not
included as we want to preserve available externally functions for possible
link time inlining. The FE indicates whether we are doing an -flto compile
via the new PrepareForLTO flag on the PassManagerBuilder.
llvm-svn: 239480
Determining proper debug locations for instructions created in
PHITransAddr is tricky. We use a simple approach here and simply copy
debug locations from instructions computing load address to
"corresponding" instructions re-creating the address computation
in predecessor basic blocks.
This may not always be correct, given all the rearrangement and
simplification going on, and debug locations may jump around a lot,
as the basic blocks we copy locations between may be very far from
each other.
Still, this would work good in most simple cases (e.g. when chain
of address computing instruction is short, or our mapping turns out
to be 1-to-1), and we desire to have *some* reasonable debug locations
associated with newly inserted instructions.
See http://reviews.llvm.org/D10351 review thread for more details.
Test Plan: regression test suite
Reviewers: spatel, dblaikie
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D10351
llvm-svn: 239479
Test Plan: regression test suite
Reviewers: eugenis, dblaikie
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D10343
llvm-svn: 239438
that was resetting it.
Remove the uses of DisableTailCalls in subclasses of TargetLowering and use
the value of function attribute "disable-tail-calls" instead. Also,
unconditionally add pass TailCallElim to the pipeline and check the function
attribute at the start of runOnFunction to disable the pass on a per-function
basis.
This is part of the work to remove TargetMachine::resetTargetOptions, and since
DisableTailCalls was the last non-fast-math option that was being reset in that
function, we should be able to remove the function entirely after the work to
propagate IR-level fast-math flags to DAG nodes is completed.
Out-of-tree users should remove the uses of DisableTailCalls and make changes
to attach attribute "disable-tail-calls"="true" or "false" to the functions in
the IR.
rdar://problem/13752163
Differential Revision: http://reviews.llvm.org/D10099
llvm-svn: 239427
The following code triggers a fatal error in the compiler instrumentation
of ASan on Darwin because we place the attribute into llvm.metadata section,
which does not have the proper MachO section name.
void foo() __attribute__((annotate("custom")));
void foo() {;}
This commit reorders the checks so that we skip everything in llvm.metadata
first. It also removes the hard failure in case the section name does not
parse. That check will be done lower in the compilation pipeline anyway.
(Reviewed in http://reviews.llvm.org/D9093.)
llvm-svn: 239379
We don't want to replace function A by Function B in one module and Function B
by Function A in another module.
If these functions are marked with linkonce_odr we would end up with a function
stub calling B in one module and a function stub calling A in another module. If
the linker decides to pick these two we will have two stubs calling each other.
rdar://21265586
llvm-svn: 239367
on a per-function basis.
Previously some of the passes were conditionally added to ARM's pass pipeline
based on the target machine's subtarget. This patch makes changes to add those
passes unconditionally and execute them conditonally based on the predicate
functor passed to the pass constructors. This enables running different sets of
passes for different functions in the module.
rdar://problem/20542263
Differential Revision: http://reviews.llvm.org/D8717
llvm-svn: 239325
Interleaved memory accesses are grouped and vectorized into vector load/store and shufflevector.
E.g. for (i = 0; i < N; i+=2) {
a = A[i]; // load of even element
b = A[i+1]; // load of odd element
... // operations on a, b, c, d
A[i] = c; // store of even element
A[i+1] = d; // store of odd element
}
The loads of even and odd elements are identified as an interleave load group, which will be transfered into vectorized IRs like:
%wide.vec = load <8 x i32>, <8 x i32>* %ptr
%vec.even = shufflevector <8 x i32> %wide.vec, <8 x i32> undef, <4 x i32> <i32 0, i32 2, i32 4, i32 6>
%vec.odd = shufflevector <8 x i32> %wide.vec, <8 x i32> undef, <4 x i32> <i32 1, i32 3, i32 5, i32 7>
The stores of even and odd elements are identified as an interleave store group, which will be transfered into vectorized IRs like:
%interleaved.vec = shufflevector <4 x i32> %vec.even, %vec.odd, <8 x i32> <i32 0, i32 4, i32 1, i32 5, i32 2, i32 6, i32 3, i32 7>
store <8 x i32> %interleaved.vec, <8 x i32>* %ptr
This optimization is currently disabled by defaut. To try it by adding '-enable-interleaved-mem-accesses=true'.
llvm-svn: 239291
Summary:
Using some SCEV functionality helped to entirely remove SCEVCache class and FindConstantPointers SCEV visitor.
Also, this makes the code more universal - I'll take advandate of it in next patches where I start handling additional types of instructions.
Test Plan: Tests would be submitted in subsequent patches.
Reviewers: atrick, chandlerc
Reviewed By: atrick, chandlerc
Subscribers: atrick, llvm-commits
Differential Revision: http://reviews.llvm.org/D10205
llvm-svn: 239282
There were several SelectInst combines that always returned an existing
instruction instead of modifying an old one or creating a new one.
These are prime candidates for moving to InstSimplify.
llvm-svn: 239229
Summary:
canUnrollCompletely takes `unsigned` values for `UnrolledCost` and
`RolledDynamicCost` but is passed in `uint64_t`s that are silently
truncated. Because of this, when `UnrolledSize` is a large integer
that has a small remainder with UINT32_MAX, LLVM tries to completely
unroll loops with high trip counts.
Reviewers: mzolotukhin, chandlerc
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D10293
llvm-svn: 239218
CVP wants to analyze the condition operand of a select along an edge.
It succeeds in getting back a Constant but not a ConstantInt. Instead,
it gets a ConstantExpr. It then assumes that the Constant must be equal
to false because it isn't equal to true.
Instead, perform an additional comparison.
This fixes PR23752.
llvm-svn: 239217
If we have (select a, b, c), it is sometimes valid to simplify this to a
single select operand. However, doing so is only valid if the
computation doesn't inject poison into the computation.
It might be helpful to consider the following example:
(select (icmp ne %i, INT_MAX), (add nsw %i, 1), INT_MIN)
The select is equivalent to (add %i, 1) but not (add nsw %i, 1).
Self hosting on x86_64 revealed that this occurs very, very rarely so
bailing out is hopefully pretty reasonable.
llvm-svn: 239215
This reverts commit r239141. This commit was an attempt to reintroduce
a previous patch that broke many self-hosting bots with clang timeouts,
but it still has slowdown issues, at least on ARM, increasing the
compilation time (stage 2, clang's) by 5x.
llvm-svn: 239175
This change is NFC because both the ``break;`` and the fall through end
up returning immediately. However, this helps clarify intent and also
ensures correctness in case more ``case`` blocks are added later.
llvm-svn: 239172