Commit Graph

262235 Commits

Author SHA1 Message Date
Rafael Espindola 88ab9fb163 Detemplate the got.
This is a bit hackish, but allows for a lot of followup cleanups.

llvm-svn: 302845
2017-05-11 23:26:03 +00:00
Teresa Johnson 2a6b7991d4 Restrict call metadata based hotness detection to Sample PGO mode
Summary:
Don't use the metadata on call instructions for determining hotness
unless we are in sample PGO mode, where it is needed because profile
counts are not accurate. In instrumentation mode this is not necessary
and does more harm than good when calls have VP metadata that hasn't
been properly scaled after transformations or dropped after constant
prop based devirtualization (both should be fixed, but we don't need
to do this in the first place for instrumentation PGO).

This required adjusting a number of tests to distinguish between sample
and instrumentation PGO handling, and to add in profile summary metadata
so that getProfileCount can get the summary.

Reviewers: davidxl, danielcdh

Subscribers: aemerson, rengolin, mehdi_amini, Prazek, llvm-commits

Differential Revision: https://reviews.llvm.org/D32877

llvm-svn: 302844
2017-05-11 23:18:05 +00:00
Rafael Espindola 5ab1989584 Reduce templating. NFC.
llvm-svn: 302843
2017-05-11 23:16:43 +00:00
Richard Smith 858e0e0a42 Remove unnecessary mapping from SourceLocation to Module.
When we parse a redefinition of an entity for which we have a hidden existing
declaration, make it visible in the current module instead of mapping the
current source location to its containing module.

llvm-svn: 302842
2017-05-11 23:11:16 +00:00
Eric Fiselier bb78837e4d Fix XFAIL to reflect recent fixes in GCC
llvm-svn: 302841
2017-05-11 23:04:04 +00:00
Adrian Prantl d88705587f Module Debug Info: Emit namespaced C++ forward decls in the correct module.
The AST merges NamespaceDecls, but for module debug info it is
important to put a namespace decl (or rather its children) into the
correct (sub-)module, so we need to use the parent module of the decl
that triggered this namespace to be serialized as a second key when
looking up DINamespace nodes.

rdar://problem/29339538

llvm-svn: 302840
2017-05-11 22:59:19 +00:00
Michael Kruse d644ec7647 [DeLICM] Use input access heuristic for mapped PHI WRITEs.
As with the scalar operand of the initial StoreInst, also use input
accesses when searching for new opportunities after mapping a
PHI write.

The same rational applies here: After LICM has been applied, the
promoted value will either be an instruction in the same statement
(in which case we fall back to try every scalar access of the
statement), or in another statement such that there will be such
an input access. In the latter case other scalars cannot have
originated from the same register promotion, at least not by LICM.

This mostly helps to decrease compilation time and makes debugging
easier by not pursuing unpromising routes. In some circumstances,
it may change the compiler's output.

llvm-svn: 302839
2017-05-11 22:56:59 +00:00
Michael Kruse 4c27643398 [DeLICM] Lookup input accesses.
Previous to this patch, we used VirtualUse to determine the input
access of an llvm::Value in a statement. The input access is the
READ MemoryAccess that makes a value available in that statement,
which can either be a READ of a MemoryKind::Value or the
MemoryKind::PHI for a PHINode in the statement. DeLICM uses the input
access to heuristically find a candidate to map without searching all
possible values.

This might modify the behaviour in that previously PHI accesses were
not considered input accesses before. This was unintentially lost when
"VirtualUse" was extracted from the "Known Knowledge" patch.

llvm-svn: 302838
2017-05-11 22:56:46 +00:00
Michael Kruse bfaa1857b3 [VirtualInstruction] Do a lookup instead of a linear search. NFC.
llvm-svn: 302837
2017-05-11 22:56:27 +00:00
Michael Kruse e60eca7316 [ScopInfo] Keep scalar acceess dictionaries up-to-data. NFC.
When removing a MemoryAccess, also remove it from maps pointing to it.
This was already done for InstructionToAccess, but not yet for
ValueReads, ValueWrites and PHIWrites as those were only used during
the ScopBuilder phase. Keeping them updated allows us to use them
later as well.

llvm-svn: 302836
2017-05-11 22:56:12 +00:00
Reid Kleckner 43bbeb4c9f Issue diagnostics when returning FP values on x86_64 without SSE1/2
Avoid using report_fatal_error, because it will ask the user to file a
bug. If the user attempts to disable SSE on x86_64 and them use floating
point, that's a bug in their code, not a bug in the compiler.

This is just a start. There are other ways to crash the backend in this
configuration, but they should be updated to follow this pattern.

Differential Revision: https://reviews.llvm.org/D27522

llvm-svn: 302835
2017-05-11 22:43:02 +00:00
Guozhi Wei 22e7da9597 [PPC] Change the register constraint of the first source operand of instruction mtvsrdd to g8rc_nox0
According to Power ISA V3.0 document, the first source operand of mtvsrdd is constant 0 if r0 is specified. So the corresponding register constraint should be g8rc_nox0.

This bug caused wrong output generated by 401.bzip2 when -mcpu=power9 and fdo are specified.

Differential Revision: https://reviews.llvm.org/D32880

llvm-svn: 302834
2017-05-11 22:17:35 +00:00
Sean Callanan 09e91ac6ab [DWARF parser] Produce correct template parameter packs
Templates can end in parameter packs, like this

template <class T...> struct MyStruct 
  { /*...*/ };

LLDB does not currently support these parameter packs; 
it does not emit them into the template argument list
at all. This causes problems when you specialize, e.g.:

template <> struct MyStruct<int> 
  { /*...*/ };
template <> struct MyStruct<int, int> : MyStruct<int> 
  { /*...*/ };

LLDB generates two template specializations, each with 
no template arguments, and then when they are imported 
by the ASTImporter into a parser's AST context we get a 
single specialization that inherits from itself, 
causing Clang's record layout mechanism to smash its
stack.

This patch fixes the problem for classes and adds
tests. The tests for functions fail because Clang's
ASTImporter can't import them at the moment, so I've
xfailed that test.

Differential Revision: https://reviews.llvm.org/D33025

llvm-svn: 302833
2017-05-11 22:08:05 +00:00
Rafael Espindola 895aea6d15 Reduce template usage. NFC.
llvm-svn: 302832
2017-05-11 22:02:41 +00:00
Aditya Nandakumar fd484c443f [GISel]: Remove unused lambda captures. NFC
https://reviews.llvm.org/D33085

llvm-svn: 302831
2017-05-11 21:56:51 +00:00
Kostya Kortchinsky 01a66fc928 [scudo] Use our own combined allocator
Summary:
The reasoning behind this change is twofold:
- the current combined allocator (sanitizer_allocator_combined.h) implements
  features that are not relevant for Scudo, making some code redundant, and
  some restrictions not pertinent (alignments for example). This forced us to
  do some weird things between the frontend and our secondary to make things
  work;
- we have enough information to be able to know if a chunk will be serviced by
  the Primary or Secondary, allowing us to avoid extraneous calls to functions
  such as `PointerIsMine` or `CanAllocate`.

As a result, the new scudo-specific combined allocator is very straightforward,
and allows us to remove some now unnecessary code both in the frontend and the
secondary. Unused functions have been left in as unimplemented for now.

It turns out to also be a sizeable performance gain (3% faster in some Android
memory_replay benchmarks, doing some more on other platforms).

Reviewers: alekseyshl, kcc, dvyukov

Reviewed By: alekseyshl

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D33007

llvm-svn: 302830
2017-05-11 21:40:45 +00:00
Easwaran Raman c103ef89ee Decrease inlinecold-threshold to 45
I ran the test-suite (including SPEC 2006) in PGO mode comparing cold
thresholds of 225 and 45. Here are some stats on the text size:

Out of 904 tests that ran, 197 see a change in text size. The average
text size reduction (of all the 904 binaries) is 1.07%. Of the 197
binaries, 19 see a text size increase, as high as 18%, but most of them
are small single source benchmarks. There are 3 multisource benchmarks
with a >0.5% size increase (0.7, 1.3 and 2.1 are their % increases). On
the other side of the spectrum, 31 benchmarks see >10% size reduction
and 6 of them are MultiSource.

I haven't run the test-suite with other values of inlinecold-threshold.
Since we have a cold callsite threshold of 45, I picked this value.

Differential revision: https://reviews.llvm.org/D33106

llvm-svn: 302829
2017-05-11 21:36:28 +00:00
Rafael Espindola b3aa2c9b9e Reduce template usage. NFC.
llvm-svn: 302828
2017-05-11 21:33:30 +00:00
Reid Kleckner 45a13e1b54 De-virtualize TerminatorInst successor accessors
Use the same switch technique to eliminate virtual successor accessors
from TerminatorInst. Extracted from D31261.

NFC

llvm-svn: 302827
2017-05-11 21:26:55 +00:00
Rafael Espindola 4b1c3696e3 Reduce template usage. NFC.
llvm-svn: 302826
2017-05-11 21:23:38 +00:00
Richard Smith 74df05471e XFAIL this test for Hexagon.
It's failing due to Hexagon calling convention lowering being broken (empty
structs are not passed even if they have nontrivial destructors / copy ctors).

llvm-svn: 302825
2017-05-11 21:18:27 +00:00
Martell Malone 53877bc5b9 [Libcxxabi]: Support using compiler-rt for MinGW64
Reviewers: EricWF

Differential Revision: https://reviews.llvm.org/D33098

llvm-svn: 302824
2017-05-11 21:16:29 +00:00
Reid Kleckner e7c7854cb1 De-virtualize GlobalValue
The erase/remove from parent methods now use a switch table to remove
themselves from their appropriate parent ilist.

The copyAttributesFrom method is now completely non-virtual, since we
only ever copy attributes from a global of the appropriate type.

Pre-requisite to de-virtualizing Value to save a vptr
(https://reviews.llvm.org/D31261).

NFC

llvm-svn: 302823
2017-05-11 21:14:29 +00:00
Chad Rosier aeffffdb44 [AArch64][MachineCombine] Fold FNMUL+FSUB -> FNMADD.
Differential Revision: http://reviews.llvm.org/D33101.

llvm-svn: 302822
2017-05-11 20:07:24 +00:00
Davide Italiano 0dcc015a81 [AMDGPU] Placate unused variable warning in release builds.
llvm-svn: 302821
2017-05-11 19:58:52 +00:00
Vadzim Dambrouski 38e30197c3 [MSP430] Generate EABI-compliant libcalls
Updates the MSP430 target to generate EABI-compatible libcall names.
As a byproduct, adjusts the hardware multiplier options available in
the MSP430 target, adds support for promotion of the ISD::MUL operation
for 8-bit integers, and correctly marks R11 as used by call instructions.

Patch by Andrew Wygle.

Differential Revision: https://reviews.llvm.org/D32676

llvm-svn: 302820
2017-05-11 19:56:14 +00:00
Davide Italiano 36acbc716d [LiveVariables] Switch Kill/Defs sets to be DenseSet(s).
The testcase in PR32984 shows a non linear compile time increase
after a change that made the LoopUnroll pass more aggressive
(increasing the threshold).

My profiling shows all the time of PHI elimination goes to
llvm::LiveVariables::addNewBlock. This is because we keep
Defs/Kills registers in a SmallSet and vfind(const T &V); is O(N).

Switching to a DenseSet reduces the time spent in the pass from
297 seconds to 97 seconds. Profiling still shows a lot of time is
spent iterating the data structure, so I guess there's room for
improvement.

Dan tells me GCC uses real set operations for live registers and
it takes no-time on this testcase. Matthias points out we might
want to switch all this to LiveIntervalAnalysis so it's not entirely
sure if a rewrite is worth it.

Differential Revision:  https://reviews.llvm.org/D33088

llvm-svn: 302819
2017-05-11 19:37:43 +00:00
Richard Smith 2cbd1f6c9f Work around different -std= default for PS4 target.
llvm-svn: 302818
2017-05-11 19:17:54 +00:00
Richard Smith 722363727d PR22877: When constructing an array via a constructor with a default argument
in list-initialization, run cleanups for the default argument after each
iteration of the initialization loop.

We previously only ran the destructor for any temporary once, at the end of the
complete loop, rather than once per iteration!

Re-commit of r302750, reverted in r302776.

llvm-svn: 302817
2017-05-11 18:58:24 +00:00
Craig Topper dbd6219f81 [APInt] Remove an APInt copy from the return of APInt::multiplicativeInverse.
llvm-svn: 302816
2017-05-11 18:40:53 +00:00
Craig Topper 3fbecadab6 [APInt] Fix typo in comment. NFC
llvm-svn: 302815
2017-05-11 17:57:43 +00:00
Matt Arsenault 47ccafe787 AMDGPU: Remove tfe bit from flat instruction definitions
We don't use it and it was removed in gfx9, and the encoding
bit repurposed.

Additionally actually using it requires changing the output register
class, which wasn't done anyway.

llvm-svn: 302814
2017-05-11 17:38:33 +00:00
Matt Arsenault bf5482e4bb AMDGPU: Pull fneg out of extract_vector_elt
This allows folding source modifiers in more f16 cases.
Makes it easier to select per-component packed neg modifiers.

llvm-svn: 302813
2017-05-11 17:26:25 +00:00
Stanislav Mekhanoshin 33a97ec4ed [AMDGPU] Fix incorrect register pressure calculation
Earlier fix D32572 introduced a bug where live-ins were calculated
for basic block instead of scheduling region. This change fixes it.

Differential Revision: https://reviews.llvm.org/D33086

llvm-svn: 302812
2017-05-11 17:16:55 +00:00
Adam Nemet 0aca09fc6c [SLP] Emit optimization remarks
The approach I followed was to emit the remark after getTreeCost concludes
that SLP is profitable.  I initially tried emitting them after the
vectorizeRootInstruction calls in vectorizeChainsInBlock but I vaguely
remember missing a few cases for example in HorizontalReduction::tryToReduce.

ORE is placed in BoUpSLP so that it's available from everywhere (notably
HorizontalReduction::tryToReduce).

We use the first instruction in the root bundle as the locator for the remark.
In order to get a sense how far the tree is spanning I've include the size of
the tree in the remark.  This is not perfect of course but it gives you at
least a rough idea about the tree.  Then you can follow up with -view-slp-tree
to really see the actual tree.

llvm-svn: 302811
2017-05-11 17:06:17 +00:00
Nemanja Ivanovic 96c3d626a2 [PowerPC] Eliminate integer compare instructions - vol. 1
This patch is the first in a series of patches to provide code gen for
doing compares in GPRs when the compare result is required in a GPR.

It adds the infrastructure to select GPR sequences for i1->i32 and i1->i64
extensions. This first patch handles equality comparison on i32 operands with
the result sign or zero extended.

Differential Revision: https://reviews.llvm.org/D31847

llvm-svn: 302810
2017-05-11 16:54:23 +00:00
Adrian Prantl 40b201c326 Add a test that local submodule visibility has no effect on debug info
rdar://problem/27876262

llvm-svn: 302809
2017-05-11 16:40:48 +00:00
Simon Pilgrim 6faddcbd07 [DAGCombine] Use SelectionDAG::getAnyExtOrTrunc helper. NFCI.
llvm-svn: 302808
2017-05-11 16:40:44 +00:00
Pierre Gousseau 9ce59db426 [asan] Test 'strndup_oob_test.cc' added in r302781 fails on the clang-cmake-thumbv7-a15-full-sh bot.
Marking as unsupported on armv7l-unknown-linux-gnueabihf, same as strdup_oob_test.cc

llvm-svn: 302807
2017-05-11 16:26:50 +00:00
Hans Wennborg 905da7458b Fix -DLLVM_ENABLE_THREADS=OFF build after r302748
llvm-svn: 302806
2017-05-11 15:32:47 +00:00
Michael Kruse 07e315e780 [Simplify] Remove identical scalar writes.
After DeLICM, it is possible to have two writes of the same value to
the same location in the same statement when it determined that those
writes do not conflict (write the same value).

Teach -polly-simplify to remove one of the writes. It interferes with
the pattern matching of matrix-multiplication kernels and also seem
to not be optimized away by LLVM.

The algorthm is simple, has O(n^2) behaviour (n = max number of
MemoryAccesses in a statement) and only matches the most obvious cases,
but seem to be enough to pattern-match Boost ublas gemm.

Not handled cases include:
- StoreInst instructions (a.k.a. explicit writes), since the value might
  be loaded or overwritten between the two stores.
- PHINode, especially LCSSA, when the PHI value matches with on other's.
- Partial writes (in preparation)

llvm-svn: 302805
2017-05-11 15:07:38 +00:00
Simon Pilgrim e2c055b8c5 [X86][AVX] Added zeroall/zeroupper scheduler tests
Missing on SandyBridge and Btver2 models

llvm-svn: 302804
2017-05-11 15:02:49 +00:00
Tim Northover a424117501 Modules: fix modules build.
A recent commit made GlobalVariable.h depend on intrinsics generation, so (I
think) it needs to be in the lower-level module. I'll confirm with others, but
this should fix the bots.

llvm-svn: 302803
2017-05-11 14:51:43 +00:00
Marshall Clow 35f62e3228 Mark LWG#2782 as complete. No functionality change; we already do this. Just added a few more tests.
llvm-svn: 302802
2017-05-11 14:25:45 +00:00
Benjamin Kramer 71ed2e6457 Renumber test line number expectations after r302783.
Also remove a confused stable-runtimes requirement.

llvm-svn: 302801
2017-05-11 14:04:23 +00:00
Marshall Clow 7e154cdca7 Replace a nested namespace used for overload resolution with a struct. Richard Smith says that using the namespace results in an ODR violation, but I disagree. Nevertheless, the struct works just as well.
llvm-svn: 302800
2017-05-11 14:00:54 +00:00
Marshall Clow afda4a9af9 Mark LWG#2850 as complete. No functionality change; we had tests that covered it already. Just added comments to the tests. Thanks to K-ballo for the heads up.
llvm-svn: 302799
2017-05-11 13:55:20 +00:00
Marshall Clow 9630f46dde Mark LWG#2796 as complete. No functionality change; we had tests that covered it already. Just added comments to the tests
llvm-svn: 302798
2017-05-11 13:51:09 +00:00
Alex Lorenz e6afa397c6 [CodeCompletion] Provide member completions for dependent expressions whose
type is a TemplateSpecializationType or InjectedClassNameType

Fixes PR30847. Partially fixes PR20973 (first position only).

PR17614 is still not working, its expression has the dependent
builtin type. We'll have to teach the completion engine how to "resolve"
dependent expressions to fix it.

rdar://29818301

llvm-svn: 302797
2017-05-11 13:48:57 +00:00
Alex Lorenz 0fe0d98557 [CodeCompletion] NFC, extract a function that generates member
completion results for records

llvm-svn: 302796
2017-05-11 13:41:00 +00:00