Commit Graph

269963 Commits

Author SHA1 Message Date
Teresa Johnson 73305f82e9 [ThinLTO] Fix ThinLTO crash
Summary:
Follow up to fix in r311023, which fixed the case where the combined
index is written to disk. The same samplePGO logic exists for the
in-memory index when computing imports, so we need to filter out
GlobalVariable summaries there too.

Reviewers: davidxl

Subscribers: inglorion, llvm-commits

Differential Revision: https://reviews.llvm.org/D36919

llvm-svn: 311254
2017-08-19 18:04:25 +00:00
Craig Topper 6e70f7cd33 [X86] Remove an unnecessary alignment restriction from MOVDDUP pattern.
The SSE MOVDDUP instruction only loads 64-bits with no alignment restriction.

llvm-svn: 311253
2017-08-19 18:02:28 +00:00
Jatin Bhateja 66f7958e91 Revert rL311247 : To rectify commit message.
Summary: This reverts commit rL311247.

Differential Revision: https://reviews.llvm.org/D36927

llvm-svn: 311252
2017-08-19 17:59:58 +00:00
Johannes Altmanninger 51321aef8e [clang-diff] Simplify mapping
Summary:
Until we find a decent heuristic on how to choose between multiple
identical trees, there is no point in supporting multiple mappings.

This also enables matching of nodes with parents of different types,
because there are many instances where this is appropriate.  For
example for and foreach statements; functions in the global or
other namespaces.

Reviewers: arphaman

Subscribers: klimek

Differential Revision: https://reviews.llvm.org/D36183

llvm-svn: 311251
2017-08-19 17:53:01 +00:00
Johannes Altmanninger 4933a6a2b2 Add clang-diff to tool_patterns in test/lit.cfg
llvm-svn: 311250
2017-08-19 17:52:29 +00:00
Johannes Altmanninger e1a89fbf6d [clang-diff] Fix compiler warning
llvm-svn: 311249
2017-08-19 17:12:25 +00:00
Philipp Schaad 50139f0f38 [PPCG] Only add Kernel argument sizes for OpenCL, not CUDA runtime
Kernel argument sizes now only get appended to the kernel launch parameter list if the OpenCL runtime is selected, not if CUDA runtime is chosen.

Differential revision: D36925

llvm-svn: 311248
2017-08-19 17:04:57 +00:00
Jatin Bhateja 6f0d0d23b0 Merge branch 'arcpatch-D35788'
llvm-svn: 311247
2017-08-19 17:00:04 +00:00
Jatin Bhateja 1c56863739 Revert rL311242 "Extension of shuffle vector pattern detection, updating post rebase."
Summary:

This reverts commit rL311242.

Differential Revision: https://reviews.llvm.org/D36924

llvm-svn: 311246
2017-08-19 16:40:06 +00:00
Davide Italiano 7e930d12df [Plugins/ObjC] Remove more semicolons to placate -Wpedantic. NFCI.
llvm-svn: 311245
2017-08-19 16:32:19 +00:00
Davide Italiano c8bee2ced2 [Plugins/ObjC] Remove unneded semicolon(s) to placate GCC -Wpedantic.
llvm-svn: 311244
2017-08-19 16:30:47 +00:00
Tobias Grosser 9f2eb24c06 Clarify the intend of the run-time check
llvm-svn: 311243
2017-08-19 16:26:39 +00:00
Jatin Bhateja 313f97dd84 Extension of shuffle vector pattern detection, updating post rebase.
llvm-svn: 311242
2017-08-19 15:58:36 +00:00
Johannes Altmanninger a29d6aeced [clang-diff] Add HTML side-by-side diff output
Reviewers: arphaman

Subscribers: mgorny

Differential Revision: https://reviews.llvm.org/D36182

llvm-svn: 311241
2017-08-19 15:40:45 +00:00
Johannes Altmanninger 14be1839e8 [clang-diff] Fix warning about useless comparison
llvm-svn: 311240
2017-08-19 13:29:44 +00:00
Tobias Grosser 43df2020e7 [GPGPU] Collect parameter dimension used in MemoryAccesses
When using -polly-ignore-integer-wrapping and -polly-acc-codegen-managed-memory
we add parameter dimensions lazily to the domains, which results in PPCG not
including parameter dimensions that are only used in memory accesses in the
kernel space. To make sure these parameters are still passed to the kernel, we
collect these parameter dimensions and align the kernel's parameter space
before code-generating it.

llvm-svn: 311239
2017-08-19 12:58:28 +00:00
Victor Leschuk ee7d232a41 revert failing test
llvm-svn: 311238
2017-08-19 12:24:41 +00:00
Johannes Altmanninger 683876ca6d [clang-diff] Make printing of matches optional
Reviewers: arphaman

Subscribers: klimek

Differential Revision: https://reviews.llvm.org/D36181

llvm-svn: 311237
2017-08-19 12:04:04 +00:00
Victor Leschuk ba0954c4e2 Add temporary test to verify that win10 builder hangs on error
llvm-svn: 311236
2017-08-19 12:02:39 +00:00
Peter Szecsi 999a25ff72 [CFG] Add LoopExit information to CFG
This patch introduces a new CFG element CFGLoopExit that indicate when a loop
ends. It does not deal with returnStmts yet (left it as a TODO).
It hidden behind a new analyzer-config flag called cfg-loopexit (false by
default).
Test cases added.

The main purpose of this patch right know is to make loop unrolling and loop
widening easier and more efficient. However, this information can be useful for
future improvements in the StaticAnalyzer core too.

Differential Revision: https://reviews.llvm.org/D35668

llvm-svn: 311235
2017-08-19 11:19:16 +00:00
Peter Szecsi 8de103f2f0 [StaticAnalyzer] LoopUnrolling: Exclude cases where the counter is escaped before the loop
Adding escape check for the counter variable of the loop.
It is achieved by jumping back on the ExplodedGraph to its declStmt.

Differential Revision: https://reviews.llvm.org/D35657

llvm-svn: 311234
2017-08-19 10:24:52 +00:00
Johannes Altmanninger c58ac104d1 [clang-diff] Fix test
llvm-svn: 311233
2017-08-19 10:05:24 +00:00
Johannes Altmanninger a1d2b5d54f [clang-diff] Add option to dump the AST, one node per line
Summary:
This is done with -ast-dump; the JSON variant has been renamed to
-ast-dump-json.

Reviewers: arphaman

Differential Revision: https://reviews.llvm.org/D36180

llvm-svn: 311232
2017-08-19 09:36:14 +00:00
Tobias Grosser d5f1fad77c [Polly] Run early cse + memory SSA to remove redundancies in the input code
This allows us to get rid of many identical loads as they commonly appear in
Fortran code.

llvm-svn: 311231
2017-08-19 08:44:46 +00:00
Victor Leschuk 59dc64f3af Temporary mark lit :: shtest-format as unsupported on windows
When run manually it fails, but when run under buildbot it causes hang.

llvm-svn: 311230
2017-08-19 07:58:07 +00:00
Chandler Carruth 4f3aa29a46 [Inliner] Fix a nasty bug when inlining a non-recursive trace of
a function into itself.

We tried to fix this before in r306495 but that got reverted as the
assert was actually hit.

This fixes the original bug (which we seem to have lost track of with
the revert) by blocking a second remapping when the function being
inlined is also the caller and the remapping could succeed but
erroneously.

The included test case would actually load from an inlined copy of the
alloca before this change, failing to load the stored value and
miscompiling.

Many thanks to Richard Smith for diagnosing a user miscompile to this
bug, and to Kyle for the first attempt and initial analysis and David Li
for remembering the issue and how to fix it and suggesting the patch.
I'm just stitching it together and landing it. =]

llvm-svn: 311229
2017-08-19 06:56:11 +00:00
Chandler Carruth 2a80fddf67 [Inliner] Clean up a test case a bit to make it more clear what is being
tested and why.

llvm-svn: 311228
2017-08-19 06:06:44 +00:00
Chandler Carruth 1f8212597d [SLP] Fix an unused variable warning in non-asserts builds.
llvm-svn: 311227
2017-08-19 05:06:23 +00:00
Chandler Carruth 93a645525c [x86] Teach the cmov converter to aggressively convert cmovs with memory
operands into control flow.

We have seen periodically performance problems with cmov where one
operand comes from memory. On modern x86 processors with strong branch
predictors and speculative execution, this tends to be much better done
with a branch than cmov. We routinely see cmov stalling while the load
is completed rather than continuing, and if there are subsequent
branches, they cannot be speculated in turn.

Also, in many (even simple) cases, macro fusion causes the control flow
version to be fewer uops.

Consider the IACA output for the initial sequence of code in a very hot
function in one of our internal benchmarks that motivates this, and notice the
micro-op reduction provided.
Before, SNB:
```
Throughput Analysis Report
--------------------------
Block Throughput: 2.20 Cycles       Throughput Bottleneck: Port1

| Num Of |              Ports pressure in cycles               |    |
|  Uops  |  0  - DV  |  1  |  2  -  D  |  3  -  D  |  4  |  5  |    |
---------------------------------------------------------------------
|   1    |           | 1.0 |           |           |     |     | CP | mov rcx, rdi
|   0*   |           |     |           |           |     |     |    | xor edi, edi
|   2^   | 0.1       | 0.6 | 0.5   0.5 | 0.5   0.5 |     | 0.4 | CP | cmp byte ptr [rsi+0xf], 0xf
|   1    |           |     | 0.5   0.5 | 0.5   0.5 |     |     |    | mov rax, qword ptr [rsi]
|   3    | 1.8       | 0.6 |           |           |     | 0.6 | CP | cmovbe rax, rdi
|   2^   |           |     | 0.5   0.5 | 0.5   0.5 |     | 1.0 |    | cmp byte ptr [rcx+0xf], 0x10
|   0F   |           |     |           |           |     |     |    | jb 0xf
Total Num Of Uops: 9
```
After, SNB:
```
Throughput Analysis Report
--------------------------
Block Throughput: 2.00 Cycles       Throughput Bottleneck: Port5

| Num Of |              Ports pressure in cycles               |    |
|  Uops  |  0  - DV  |  1  |  2  -  D  |  3  -  D  |  4  |  5  |    |
---------------------------------------------------------------------
|   1    | 0.5       | 0.5 |           |           |     |     |    | mov rax, rdi
|   0*   |           |     |           |           |     |     |    | xor edi, edi
|   2^   | 0.5       | 0.5 | 1.0   1.0 |           |     |     |    | cmp byte ptr [rsi+0xf], 0xf
|   1    | 0.5       | 0.5 |           |           |     |     |    | mov ecx, 0x0
|   1    |           |     |           |           |     | 1.0 | CP | jnbe 0x39
|   2^   |           |     |           | 1.0   1.0 |     | 1.0 | CP | cmp byte ptr [rax+0xf], 0x10
|   0F   |           |     |           |           |     |     |    | jnb 0x3c
Total Num Of Uops: 7
```
The difference even manifests in a throughput cycle rate difference on Haswell.
Before, HSW:
```
Throughput Analysis Report
--------------------------
Block Throughput: 2.00 Cycles       Throughput Bottleneck: FrontEnd

| Num Of |                    Ports pressure in cycles                     |    |
|  Uops  |  0  - DV  |  1  |  2  -  D  |  3  -  D  |  4  |  5  |  6  |  7  |    |
---------------------------------------------------------------------------------
|   0*   |           |     |           |           |     |     |     |     |    | mov rcx, rdi
|   0*   |           |     |           |           |     |     |     |     |    | xor edi, edi
|   2^   |           |     | 0.5   0.5 | 0.5   0.5 |     | 1.0 |     |     |    | cmp byte ptr [rsi+0xf], 0xf
|   1    |           |     | 0.5   0.5 | 0.5   0.5 |     |     |     |     |    | mov rax, qword ptr [rsi]
|   3    | 1.0       | 1.0 |           |           |     |     | 1.0 |     |    | cmovbe rax, rdi
|   2^   | 0.5       |     | 0.5   0.5 | 0.5   0.5 |     |     | 0.5 |     |    | cmp byte ptr [rcx+0xf], 0x10
|   0F   |           |     |           |           |     |     |     |     |    | jb 0xf
Total Num Of Uops: 8
```
After, HSW:
```
Throughput Analysis Report
--------------------------
Block Throughput: 1.50 Cycles       Throughput Bottleneck: FrontEnd

| Num Of |                    Ports pressure in cycles                     |    |
|  Uops  |  0  - DV  |  1  |  2  -  D  |  3  -  D  |  4  |  5  |  6  |  7  |    |
---------------------------------------------------------------------------------
|   0*   |           |     |           |           |     |     |     |     |    | mov rax, rdi
|   0*   |           |     |           |           |     |     |     |     |    | xor edi, edi
|   2^   |           |     | 1.0   1.0 |           |     | 1.0 |     |     |    | cmp byte ptr [rsi+0xf], 0xf
|   1    |           | 1.0 |           |           |     |     |     |     |    | mov ecx, 0x0
|   1    |           |     |           |           |     |     | 1.0 |     |    | jnbe 0x39
|   2^   | 1.0       |     |           | 1.0   1.0 |     |     |     |     |    | cmp byte ptr [rax+0xf], 0x10
|   0F   |           |     |           |           |     |     |     |     |    | jnb 0x3c
Total Num Of Uops: 6
```

Note that this cannot be usefully restricted to inner loops. Much of the
hot code we see hitting this is not in an inner loop or not in a loop at
all. The optimization still remains effective and indeed critical for
some of our code.

I have run a suite of internal benchmarks with this change. I saw a few
very significant improvements and a very few minor regressions,
but overall this change rarely has a significant effect. However, the
improvements were very significant, and in quite important routines
responsible for a great deal of our C++ CPU cycles. The gains pretty
clealy outweigh the regressions for us.

I also ran the test-suite and SPEC2006. Only 11 binaries changed at all
and none of them showed any regressions.

Amjad Aboud at Intel also ran this over their benchmarks and saw no
regressions.

Differential Revision: https://reviews.llvm.org/D36858

llvm-svn: 311226
2017-08-19 05:01:19 +00:00
Chandler Carruth e3b3547e9f [x86] Refactor the CMOV conversion pass to be more flexible.
The primary thing that this accomplishes is to allow future re-use of
these routines in more contexts and clarify the behavior w.r.t. loops.
For example, if handling outer loops is desirable, doing so in
a inside-out order becomes straight forward because it walks the loop
nest itself (rather than walking the function's basic blocks) and
de-couples the CMOV rewriting from the loop structure as there isn't
actually anything loop-specific about this transformation.

This patch should be essentially a no-op. It potentially changes the
order in which we visit the inner loops, but otherwise should merely set
the stage for subsequent changes.

Differential Revision: https://reviews.llvm.org/D36783

llvm-svn: 311225
2017-08-19 04:28:20 +00:00
Faisal Vali 8194a3e975 [c++2a] Implement P0409R2 - Allow lambda capture [=,this] (by hamzasood)
This patch, by hamzasood, implements P0409R2, and allows [=, this] pre-C++2a as an extension (with appropriate warnings) for consistency.

https://reviews.llvm.org/D36572

Thanks Hamza!

llvm-svn: 311224
2017-08-19 03:43:07 +00:00
Dinar Temirbulatov 7aff8cfa55 [SLPVectorizer] Tighten up VLeft, VRight declaration, remove unnecessary testcase test/Transforms/SLPVectorizer/X86/reorder.ll, NFCI.
llvm-svn: 311223
2017-08-19 03:15:07 +00:00
Johannes Altmanninger e0fe5cd4f3 Revert "Revert "[clang-diff] Move printing of matches and changes to clang-diff""
Fix build by renaming ChangeKind -> Change

This reverts commit 0c78c5729f29315d7945988efd048c0cb86c07ce.

llvm-svn: 311222
2017-08-19 02:56:35 +00:00
Dinar Temirbulatov e3ce1b455e [SLPVectorizer] Add opcode parameter to reorderAltShuffleOperands, reorderInputsAccordingToOpcode functions.
Reviewers: mkuper, RKSimon, ABataev, mzolotukhin, spatel, filcab

Subscribers: llvm-commits, rengolin

Differential Revision: https://reviews.llvm.org/D36766

llvm-svn: 311221
2017-08-19 02:54:20 +00:00
Johannes Altmanninger b2d26c303e [clang-diff] Fix test for python 3
llvm-svn: 311220
2017-08-19 01:34:24 +00:00
Matthias Braun 91bd3ad128 ARMRegsiterInfo: Define more ssub indexes; NFC
This doesn't really change anything as Tablegen would have inferred
those indices anyway; defining them gives us shorter names that are
easier to read while debugging (i.e. "ssub_4" rather than
"dsub2_then_ssub_0")

llvm-svn: 311218
2017-08-19 01:21:11 +00:00
Adrian Prantl 2116dd360a Filter out non-constant DIGlobalVariableExpressions reachable via the CU
They won't affect the DWARF output, but they will mess with the
sorting of the fragments. This fixes the crash reported in PR34159.

https://bugs.llvm.org/show_bug.cgi?id=34159

llvm-svn: 311217
2017-08-19 01:15:06 +00:00
Johannes Altmanninger 0da12c841a Revert "Revert "[clang-diff] Move the JSON export function to clang-diff""
This reverts commit eac4c13ac9ea8f12bc049e040c7b9c8a517f54e7, the
original commit *should* not have caused the build failure.

llvm-svn: 311216
2017-08-19 00:57:38 +00:00
Eric Beckmann 91d8af5386 llvm-mt: Merge manifest namespaces.
mt.exe performs a tree merge where certain element nodes are combined
into one.  This introduces the possibility of xml namespaces conflicting
with each other.  The original mt.exe has a hierarchy whereby certain
namespace names can override others, and nodes that would then end up in
ambigious namespaces have their namespaces explicitly defined.  This
namespace handles this merging process.

llvm-svn: 311215
2017-08-19 00:37:41 +00:00
Rui Ueyama 42479e02ca Rename {Lazy,}ObjectKind -> {Lazy,}ObjKind.
I renamed corresponding classes in r309199 but forgot to rename enums
at the moment. Rename them now to make them consistent.

llvm-svn: 311214
2017-08-19 00:13:54 +00:00
Eugene Zelenko be709f2c19 [Analysis] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC).
llvm-svn: 311212
2017-08-18 23:51:26 +00:00
Vlad Tsyrklevich deb2a2adc8 Revert "[clang-diff] Move the JSON export function to clang-diff"
This reverts commit r311199, it was causing widespread build failures.

llvm-svn: 311211
2017-08-18 23:21:11 +00:00
Vlad Tsyrklevich ebcb773f7d Revert "[clang-diff] Move printing of matches and changes to clang-diff"
This reverts commit r311200, it was causing widespread build failures.

llvm-svn: 311210
2017-08-18 23:21:10 +00:00
Xinliang David Li 0d07f9d68a Fix comment /NFC
llvm-svn: 311209
2017-08-18 23:08:50 +00:00
Xinliang David Li 709ffe178e [Profile] backward propagate profile info in JumpThreading
Differential Revsion: http://reviews.llvm.org/D36864

llvm-svn: 311208
2017-08-18 23:00:05 +00:00
Jason Molenda fba547d7e7 Commiting Christopher Brook's patch for
"Prevent negative chars from being sign-extended into isprint and isspace which take and int and crash if the int is negative"

https://reviews.llvm.org/D36620

llvm-svn: 311207
2017-08-18 22:57:59 +00:00
Amjad Aboud 88ffa3afe2 [InstCombine] Teach ComputeNumSignBitsImpl to handle integer multiply instruction.
Differential Revision: https://reviews.llvm.org/D36679

llvm-svn: 311206
2017-08-18 22:56:55 +00:00
Max Kazantsev 0aaf8c16ac [IRCE] Fix buggy behavior in Clamp
Clamp function was too optimistic when choosing signed or unsigned min/max function for calculations.
In fact, `!IsSignedPredicate` guarantees us that `Smallest` and `Greatest` can be compared safely using unsigned
predicates, but we did not check this for `S` which can in theory be negative.

This patch makes Clamp use signed min/max for cases when it fails to prove `S` being non-negative,
and it adds a test where such situation may lead to incorrect conditions calculation.

Differential Revision: https://reviews.llvm.org/D36873

llvm-svn: 311205
2017-08-18 22:50:29 +00:00
Justin Bogner b29bebe47b IR: Make stripDebugInfo robust against (invalid) empty basic blocks
Since stripDebugInfo runs before the verifier when reading IR, we can
end up in a situation where we read some invalid IR but don't know its
invalid yet. Before this patch we would crash in stripDebugInfo when
given IR with a completely empty basic block, and after we get a nice
error from the verifier instead.

llvm-svn: 311202
2017-08-18 21:38:03 +00:00
Jonas Devlieghere a2faf7b60f [llvm-dwarfdump] Hide .debug_str and DIE reference offsets in brief mode
This patch hides the .debug_str offset and DIE reference offsets into
the CU when llvm-dwarfdump is invoked with -brief.

Differential Revision: https://reviews.llvm.org/D36835

llvm-svn: 311201
2017-08-18 21:35:44 +00:00