Summary: SampleProfile pass needs to be performed after InstructionCombiningPass, which helps eliminate un-inlinable function calls.
Reviewers: davidxl, dnovillo
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D17742
llvm-svn: 262419
ARC ownership-convention function type modifications.
According to the Itanium ABI, vendor extended qualifiers are
supposed to be mangled in reverse-alphabetical order before
any CVR qualifiers. The ARC function type conventions are
plausibly order-significant (they are associated with the
function type), which permits us to ignore the need to correctly
inter-order them with any other vendor qualifiers on the parameter
and return types.
Implementing these rules correctly is technically an ABI break.
Apple is comfortable with the risk of incompatibility here for
the ARC features, and I believe that address-space qualification
is still uncommon enough to allow us to adopt the conforming
rule without serious risk. Still, targets which make heavy
use of address space qualification may want to revert to the
non-conforming order.
llvm-svn: 262414
Summary: Recent changes to the expression parser broke function name resolution when using the IR interpreter instead of JIT. This patch changes the IRMemoryMap ivar in InterpreterStackFrame to an IRExecutionUnitSP (which is a subclass), allowing InterpreterStackFrame::ResolveConstantValue() to call FindSymbol() on the name of the Value when it's a FunctionVal. It also changes IRExecutionUnit::FindInSymbols() to call GetFileAddress() on the symball if ResolveCallableAddress() fails and there is no valid Process.
Reviewers: spyffe
Subscribers: lldb-commits
Differential Revision: http://reviews.llvm.org/D17745
llvm-svn: 262407
Polly recognizes affine loops that ScalarEvolution does not, in
particular those with loop conditions that depend on hoisted invariant
loads. Check for SCEVAddRec dependencies on such loops and do not
consider their exit values as synthesizable because SCEVExpander would
generate them as expressions that depend on the original induction
variables. These are not available in generated code.
llvm-svn: 262404
On AMDGPU where operations i64 operations are often bitcasted to v2i32
and back, this pattern shows up regularly where it breaks some
expected combines on i64, such as load width reducing.
This fixes some test failures in a future commit when i64 loads
are changed to promote.
llvm-svn: 262397
This adds some missing generic schedule info definitions, enables
completeness checking for cyclone and fixes a typo uncovered by that.
Differential Revision: http://reviews.llvm.org/D17748
llvm-svn: 262393
This isn't quite NFC because some of the SDLocs may change which could
cause scheduling differences. But no regression tests are affected and
there is no functional change intended.
llvm-svn: 262391
Revert r262248 in an attempt to fix the clang-native-aarch64-full
bot and to investigate a performance regression in
SingleSource/Benchmarks/CoyoteBench/huffbench
llvm-svn: 262388
This reverts commit r262316.
It seems that my change breaks an out-of-tree chromium buildbot, so
I'm reverting this in order to investigate the situation further.
llvm-svn: 262387
The doxygen comments are automatically generated based on Sony's intrinsics documentation.
Differential Revision: http://reviews.llvm.org/D17550
llvm-svn: 262385
TableGen checks at compiletime that for scheduling models with
"CompleteModel = 1" one of the following holds:
- Is marked with the hasNoSchedulingInfo flag
- The instruction is a subclass of Sched
- There are InstRW definitions in the scheduling model
Typical steps necessary to complete a model:
- Ensure all pseudo instructions that are expanded before machine
scheduling (usually everything handled with EmitYYY() functions in
XXXTargetLowering).
- If a CPU does not support some instructions mark the corresponding
resource unsupported: "WriteRes<WriteXXX, []> { let Unsupported = 1; }".
- Add missing scheduling information.
Differential Revision: http://reviews.llvm.org/D17747
llvm-svn: 262384
This introduces a new flag that indicates that a specific instruction
will never be present when the MachineScheduler runs and therefore needs
no scheduling information.
This is in preparation for an upcoming commit which checks completeness
of a scheduling model when tablegen runs.
Differential Revision: http://reviews.llvm.org/D17728
llvm-svn: 262383
Summary:
Tablegen was unable to determine that param loads/stores were actually
reading or writing from memory. I think this isn't a problem in
practice for param stores, because those occur in a block right before
we make our call. But param loads don't have to at the very beginning
of a function, so should be annotated as mayLoad so we don't incorrectly
optimize them.
Reviewers: jholewinski
Subscribers: jholewinski, llvm-commits
Differential Revision: http://reviews.llvm.org/D17471
llvm-svn: 262381
Summary: Looks like this was caused by a typo.
Reviewers: jholewinski
Subscribers: jholewinski, llvm-commits, tra
Differential Revision: http://reviews.llvm.org/D17357
llvm-svn: 262380
We'd lose track of the parent CodeGenFunction, leading us to get
confused with regard to which function a nested finally belonged to.
Differential Revision: http://reviews.llvm.org/D17752
llvm-svn: 262379
Summary:
Calls sometimes need to be convergent. This is already handled at the
LLVM IR level, but it also needs to be handled at the MI level.
Ideally we'd propagate convergence from instructions, down through the
selection DAG, and into MIs. But this is Hard, and would affect
optimizations in the SDNs -- right now only SDNs with two operands have
any flags at all.
Instead, here's a much simpler hack: Add new opcodes for NVPTX for
convergent calls, and generate these when lowering convergent LLVM
calls.
Reviewers: jholewinski
Subscribers: jholewinski, chandlerc, joker.eph, jhen, tra, llvm-commits
Differential Revision: http://reviews.llvm.org/D17423
llvm-svn: 262373
Summary:
Also simplify some of the embedded C++ logic.
No functional changes.
Reviewers: jholewinski
Subscribers: llvm-commits, tra, jholewinski
Differential Revision: http://reviews.llvm.org/D17354
llvm-svn: 262371
The _chkstk function is called by the compiler to probe the stack in an
order consistent with Windows' expectations. However, it is possible to
elide the call to _chkstk and manually adjust the stack pointer if we
can prove that the allocation is fixed size and smaller than the probe
size.
This shrinks chrome.dll, chrome_child.dll and chrome.exe by a
cummulative ~133 KB.
Differential Revision: http://reviews.llvm.org/D17679
llvm-svn: 262370