Commit Graph

153375 Commits

Author SHA1 Message Date
Elena Demikhovsky f58f838495 Changed basic cost of store operation on X86
Store operation takes 2 UOps on X86 processors. The exact cost calculation affects several optimization passes including loop unroling.
This change compensates performance degradation caused by https://reviews.llvm.org/D34458 and shows improvements on some benchmarks.

Differential Revision: https://reviews.llvm.org/D35888

llvm-svn: 311285
2017-08-20 12:34:29 +00:00
Aditya Kumar a525fffd07 [Loop Vectorize] Added a separate metadata
Added a separate metadata to indicate when the loop
has already been vectorized instead of setting width and count to 1.

Patch written by Divya Shanmughan and Aditya Kumar

Differential Revision: https://reviews.llvm.org/D36220

llvm-svn: 311281
2017-08-20 10:32:41 +00:00
Igor Breger 88a3d5c855 [GlobalISel][X86] Support call ABI.
Summary: Support call ABI. For now only Linux C and X86_64_SysV calling conventions supported. Variadic function not supported.

Reviewers: zvi, guyblank, oren_ben_simhon

Reviewed By: oren_ben_simhon

Subscribers: rovka, kristof.beyls, llvm-commits

Differential Revision: https://reviews.llvm.org/D34602

llvm-svn: 311279
2017-08-20 09:25:22 +00:00
Igor Breger b3a860a5e8 [GlobalISel][X86] Support asimetric copy from/to GPR physical register.
Usually this case generated by ABI lowering, it requare to performe trancate/anyext.

llvm-svn: 311278
2017-08-20 07:14:40 +00:00
Alex Bradbury 21c8adf50b [RISCV] Trivial whitespace fix in RISCVInstPrinter
llvm-svn: 311277
2017-08-20 06:58:43 +00:00
Alex Bradbury e45186d43f [RISCV] Fix two abuses of llvm_unreachable
Replace with report_fatal_error.

llvm-svn: 311276
2017-08-20 06:57:27 +00:00
Alex Bradbury dd83484ab4 [RISCV] Set HasRelocationAddend for RISCVELFObjectWriter
llvm-svn: 311275
2017-08-20 06:55:14 +00:00
Sam Elliott 7fe0aaa140 Revert "Emit only A Single Opt Remark When Inlining"
Reverting due to clang build failure

llvm-svn: 311274
2017-08-20 06:55:10 +00:00
Sam Elliott 785dd75369 Emit only A Single Opt Remark When Inlining
Summary:
This updates the Inliner to only add a single Optimization
Remark when Inlining, rather than an Analysis Remark and an
Optimization Remark.

Fixes https://bugs.llvm.org/show_bug.cgi?id=33786

Reviewers: anemet, davidxl, chandlerc

Reviewed By: anemet

Subscribers: haicheng, fhahn, mehdi_amini, dblaikie, llvm-commits, eraman

Differential Revision: https://reviews.llvm.org/D36054

llvm-svn: 311273
2017-08-20 06:43:34 +00:00
Igor Breger d88dfd32f8 [GlobalIsel] Fix undefined behavior if Action not set (release), it aslo crashing in debug mode.
Differential Revision: https://reviews.llvm.org/D34978

llvm-svn: 311272
2017-08-20 06:26:22 +00:00
Sam Elliott b0c9753691 Keep Optimization Remark Yaml in NewPM
Summary:
The New Pass Manager infrastructure was forgetting to keep around the optimization remark yaml file that the compiler might have been producing. This meant setting the option to '-' for stdout worked, but setting it to a filename didn't give file output (presumably it was deleted because compilation didn't explicitly keep it). This change just ensures that the file is kept if compilation succeeds.

So far I have updated one of the optimization remark output tests to add a version with the new pass manager. It is my intention for this patch to also include changes to all tests that use `-opt-remark-output=` but I wanted to get the code patch ready for review while I was making all those changes.

Fixes https://bugs.llvm.org/show_bug.cgi?id=33951

Reviewers: anemet, chandlerc

Reviewed By: anemet, chandlerc

Subscribers: javed.absar, chandlerc, fhahn, llvm-commits

Differential Revision: https://reviews.llvm.org/D36906

llvm-svn: 311271
2017-08-20 01:30:45 +00:00
Chandler Carruth 9ef881efab [x86] Fix an even stranger corner case where we have multiple levels of
cmov self-refrencing.

Pointed out by Amjad Aboud in code review, test case minorly simplified
from the one he posted.

llvm-svn: 311267
2017-08-19 23:35:50 +00:00
Craig Topper 4de6f583da [X86] Merge all of the vecload and alignedload predicates into single predicates.
We can load the memory VT and check for natural alignment. This also adds a new preferNonTemporalLoad helper that checks the correct subtarget feature based on the load size.

This shrinks the isel table by at least 5000 bytes by allowing more reordering and combining to occur.

llvm-svn: 311266
2017-08-19 23:21:22 +00:00
Craig Topper afa69eecbb [X86] Converge alignedstore/alignedstore256/alignedstore512 to a single predicate.
We can read the memoryVT and get its store size directly from the SDNode to check its alignment.

llvm-svn: 311265
2017-08-19 23:21:21 +00:00
Craig Topper a0319bb434 [AVX512] Use alignedstore256 in a pattern that's emitting a 256-bit movaps from an extract subvector operation.
llvm-svn: 311263
2017-08-19 22:02:02 +00:00
Victor Leschuk ee3292c5e4 Set init value for ScalarEvolution::BackedgeTakenInfo::MaxOrZero
Otherwise it can be used uninitialized in move ctor.

llvm-svn: 311262
2017-08-19 21:05:08 +00:00
Martin Storsjo d606d2a643 [ARM] Factorize the calculation of WhichResult in isV*Mask. NFC.
Differential Revision: https://reviews.llvm.org/D36930

llvm-svn: 311260
2017-08-19 20:26:51 +00:00
Martin Storsjo 91522ffa12 [ARM] Check the right order for halves of VZIP/VUZP if both parts are used
This is the exact same fix as in SVN r247254. In that commit, the fix was
applied only for isVTRNMask and isVTRN_v_undef_Mask, but the same issue
is present for VZIP/VUZP as well.

This fixes PR33921.

Differential Revision: https://reviews.llvm.org/D36899

llvm-svn: 311258
2017-08-19 19:47:48 +00:00
Teresa Johnson b225ad05af Fix bot failures by requiring x86 target
The tests added in r311254 require a target triple since they are
running through code generation. Fix bot failures by requiring
an x86 target.

llvm-svn: 311257
2017-08-19 19:15:04 +00:00
Konstantin Zhuravlyov 89377c440c AMDGPU/NFC: Reorder functions in SIMemoryLegalizer:
- Move *load* functions before *atomic* functions
  - Move *store* functions before *atomic* functions

llvm-svn: 311256
2017-08-19 18:44:27 +00:00
Jatin Bhateja 6b4c205685 [DAGCombiner] Extending pattern detection for vector shuffle.
Summary:
    If all the operands of a BUILD_VECTOR extract elements from same vector then split the
    vector efficiently based on the maximum vector access index.

    Reviewers: zvi, delena, RKSimon, thakis

    Reviewed By: RKSimon

    Subscribers: chandlerc, eladcohen, llvm-commits

    Differential Revision: https://reviews.llvm.org/D35788

llvm-svn: 311255
2017-08-19 18:08:59 +00:00
Teresa Johnson 73305f82e9 [ThinLTO] Fix ThinLTO crash
Summary:
Follow up to fix in r311023, which fixed the case where the combined
index is written to disk. The same samplePGO logic exists for the
in-memory index when computing imports, so we need to filter out
GlobalVariable summaries there too.

Reviewers: davidxl

Subscribers: inglorion, llvm-commits

Differential Revision: https://reviews.llvm.org/D36919

llvm-svn: 311254
2017-08-19 18:04:25 +00:00
Craig Topper 6e70f7cd33 [X86] Remove an unnecessary alignment restriction from MOVDDUP pattern.
The SSE MOVDDUP instruction only loads 64-bits with no alignment restriction.

llvm-svn: 311253
2017-08-19 18:02:28 +00:00
Jatin Bhateja 66f7958e91 Revert rL311247 : To rectify commit message.
Summary: This reverts commit rL311247.

Differential Revision: https://reviews.llvm.org/D36927

llvm-svn: 311252
2017-08-19 17:59:58 +00:00
Jatin Bhateja 6f0d0d23b0 Merge branch 'arcpatch-D35788'
llvm-svn: 311247
2017-08-19 17:00:04 +00:00
Jatin Bhateja 1c56863739 Revert rL311242 "Extension of shuffle vector pattern detection, updating post rebase."
Summary:

This reverts commit rL311242.

Differential Revision: https://reviews.llvm.org/D36924

llvm-svn: 311246
2017-08-19 16:40:06 +00:00
Jatin Bhateja 313f97dd84 Extension of shuffle vector pattern detection, updating post rebase.
llvm-svn: 311242
2017-08-19 15:58:36 +00:00
Victor Leschuk ee7d232a41 revert failing test
llvm-svn: 311238
2017-08-19 12:24:41 +00:00
Victor Leschuk ba0954c4e2 Add temporary test to verify that win10 builder hangs on error
llvm-svn: 311236
2017-08-19 12:02:39 +00:00
Victor Leschuk 59dc64f3af Temporary mark lit :: shtest-format as unsupported on windows
When run manually it fails, but when run under buildbot it causes hang.

llvm-svn: 311230
2017-08-19 07:58:07 +00:00
Chandler Carruth 4f3aa29a46 [Inliner] Fix a nasty bug when inlining a non-recursive trace of
a function into itself.

We tried to fix this before in r306495 but that got reverted as the
assert was actually hit.

This fixes the original bug (which we seem to have lost track of with
the revert) by blocking a second remapping when the function being
inlined is also the caller and the remapping could succeed but
erroneously.

The included test case would actually load from an inlined copy of the
alloca before this change, failing to load the stored value and
miscompiling.

Many thanks to Richard Smith for diagnosing a user miscompile to this
bug, and to Kyle for the first attempt and initial analysis and David Li
for remembering the issue and how to fix it and suggesting the patch.
I'm just stitching it together and landing it. =]

llvm-svn: 311229
2017-08-19 06:56:11 +00:00
Chandler Carruth 2a80fddf67 [Inliner] Clean up a test case a bit to make it more clear what is being
tested and why.

llvm-svn: 311228
2017-08-19 06:06:44 +00:00
Chandler Carruth 1f8212597d [SLP] Fix an unused variable warning in non-asserts builds.
llvm-svn: 311227
2017-08-19 05:06:23 +00:00
Chandler Carruth 93a645525c [x86] Teach the cmov converter to aggressively convert cmovs with memory
operands into control flow.

We have seen periodically performance problems with cmov where one
operand comes from memory. On modern x86 processors with strong branch
predictors and speculative execution, this tends to be much better done
with a branch than cmov. We routinely see cmov stalling while the load
is completed rather than continuing, and if there are subsequent
branches, they cannot be speculated in turn.

Also, in many (even simple) cases, macro fusion causes the control flow
version to be fewer uops.

Consider the IACA output for the initial sequence of code in a very hot
function in one of our internal benchmarks that motivates this, and notice the
micro-op reduction provided.
Before, SNB:
```
Throughput Analysis Report
--------------------------
Block Throughput: 2.20 Cycles       Throughput Bottleneck: Port1

| Num Of |              Ports pressure in cycles               |    |
|  Uops  |  0  - DV  |  1  |  2  -  D  |  3  -  D  |  4  |  5  |    |
---------------------------------------------------------------------
|   1    |           | 1.0 |           |           |     |     | CP | mov rcx, rdi
|   0*   |           |     |           |           |     |     |    | xor edi, edi
|   2^   | 0.1       | 0.6 | 0.5   0.5 | 0.5   0.5 |     | 0.4 | CP | cmp byte ptr [rsi+0xf], 0xf
|   1    |           |     | 0.5   0.5 | 0.5   0.5 |     |     |    | mov rax, qword ptr [rsi]
|   3    | 1.8       | 0.6 |           |           |     | 0.6 | CP | cmovbe rax, rdi
|   2^   |           |     | 0.5   0.5 | 0.5   0.5 |     | 1.0 |    | cmp byte ptr [rcx+0xf], 0x10
|   0F   |           |     |           |           |     |     |    | jb 0xf
Total Num Of Uops: 9
```
After, SNB:
```
Throughput Analysis Report
--------------------------
Block Throughput: 2.00 Cycles       Throughput Bottleneck: Port5

| Num Of |              Ports pressure in cycles               |    |
|  Uops  |  0  - DV  |  1  |  2  -  D  |  3  -  D  |  4  |  5  |    |
---------------------------------------------------------------------
|   1    | 0.5       | 0.5 |           |           |     |     |    | mov rax, rdi
|   0*   |           |     |           |           |     |     |    | xor edi, edi
|   2^   | 0.5       | 0.5 | 1.0   1.0 |           |     |     |    | cmp byte ptr [rsi+0xf], 0xf
|   1    | 0.5       | 0.5 |           |           |     |     |    | mov ecx, 0x0
|   1    |           |     |           |           |     | 1.0 | CP | jnbe 0x39
|   2^   |           |     |           | 1.0   1.0 |     | 1.0 | CP | cmp byte ptr [rax+0xf], 0x10
|   0F   |           |     |           |           |     |     |    | jnb 0x3c
Total Num Of Uops: 7
```
The difference even manifests in a throughput cycle rate difference on Haswell.
Before, HSW:
```
Throughput Analysis Report
--------------------------
Block Throughput: 2.00 Cycles       Throughput Bottleneck: FrontEnd

| Num Of |                    Ports pressure in cycles                     |    |
|  Uops  |  0  - DV  |  1  |  2  -  D  |  3  -  D  |  4  |  5  |  6  |  7  |    |
---------------------------------------------------------------------------------
|   0*   |           |     |           |           |     |     |     |     |    | mov rcx, rdi
|   0*   |           |     |           |           |     |     |     |     |    | xor edi, edi
|   2^   |           |     | 0.5   0.5 | 0.5   0.5 |     | 1.0 |     |     |    | cmp byte ptr [rsi+0xf], 0xf
|   1    |           |     | 0.5   0.5 | 0.5   0.5 |     |     |     |     |    | mov rax, qword ptr [rsi]
|   3    | 1.0       | 1.0 |           |           |     |     | 1.0 |     |    | cmovbe rax, rdi
|   2^   | 0.5       |     | 0.5   0.5 | 0.5   0.5 |     |     | 0.5 |     |    | cmp byte ptr [rcx+0xf], 0x10
|   0F   |           |     |           |           |     |     |     |     |    | jb 0xf
Total Num Of Uops: 8
```
After, HSW:
```
Throughput Analysis Report
--------------------------
Block Throughput: 1.50 Cycles       Throughput Bottleneck: FrontEnd

| Num Of |                    Ports pressure in cycles                     |    |
|  Uops  |  0  - DV  |  1  |  2  -  D  |  3  -  D  |  4  |  5  |  6  |  7  |    |
---------------------------------------------------------------------------------
|   0*   |           |     |           |           |     |     |     |     |    | mov rax, rdi
|   0*   |           |     |           |           |     |     |     |     |    | xor edi, edi
|   2^   |           |     | 1.0   1.0 |           |     | 1.0 |     |     |    | cmp byte ptr [rsi+0xf], 0xf
|   1    |           | 1.0 |           |           |     |     |     |     |    | mov ecx, 0x0
|   1    |           |     |           |           |     |     | 1.0 |     |    | jnbe 0x39
|   2^   | 1.0       |     |           | 1.0   1.0 |     |     |     |     |    | cmp byte ptr [rax+0xf], 0x10
|   0F   |           |     |           |           |     |     |     |     |    | jnb 0x3c
Total Num Of Uops: 6
```

Note that this cannot be usefully restricted to inner loops. Much of the
hot code we see hitting this is not in an inner loop or not in a loop at
all. The optimization still remains effective and indeed critical for
some of our code.

I have run a suite of internal benchmarks with this change. I saw a few
very significant improvements and a very few minor regressions,
but overall this change rarely has a significant effect. However, the
improvements were very significant, and in quite important routines
responsible for a great deal of our C++ CPU cycles. The gains pretty
clealy outweigh the regressions for us.

I also ran the test-suite and SPEC2006. Only 11 binaries changed at all
and none of them showed any regressions.

Amjad Aboud at Intel also ran this over their benchmarks and saw no
regressions.

Differential Revision: https://reviews.llvm.org/D36858

llvm-svn: 311226
2017-08-19 05:01:19 +00:00
Chandler Carruth e3b3547e9f [x86] Refactor the CMOV conversion pass to be more flexible.
The primary thing that this accomplishes is to allow future re-use of
these routines in more contexts and clarify the behavior w.r.t. loops.
For example, if handling outer loops is desirable, doing so in
a inside-out order becomes straight forward because it walks the loop
nest itself (rather than walking the function's basic blocks) and
de-couples the CMOV rewriting from the loop structure as there isn't
actually anything loop-specific about this transformation.

This patch should be essentially a no-op. It potentially changes the
order in which we visit the inner loops, but otherwise should merely set
the stage for subsequent changes.

Differential Revision: https://reviews.llvm.org/D36783

llvm-svn: 311225
2017-08-19 04:28:20 +00:00
Dinar Temirbulatov 7aff8cfa55 [SLPVectorizer] Tighten up VLeft, VRight declaration, remove unnecessary testcase test/Transforms/SLPVectorizer/X86/reorder.ll, NFCI.
llvm-svn: 311223
2017-08-19 03:15:07 +00:00
Dinar Temirbulatov e3ce1b455e [SLPVectorizer] Add opcode parameter to reorderAltShuffleOperands, reorderInputsAccordingToOpcode functions.
Reviewers: mkuper, RKSimon, ABataev, mzolotukhin, spatel, filcab

Subscribers: llvm-commits, rengolin

Differential Revision: https://reviews.llvm.org/D36766

llvm-svn: 311221
2017-08-19 02:54:20 +00:00
Matthias Braun 91bd3ad128 ARMRegsiterInfo: Define more ssub indexes; NFC
This doesn't really change anything as Tablegen would have inferred
those indices anyway; defining them gives us shorter names that are
easier to read while debugging (i.e. "ssub_4" rather than
"dsub2_then_ssub_0")

llvm-svn: 311218
2017-08-19 01:21:11 +00:00
Adrian Prantl 2116dd360a Filter out non-constant DIGlobalVariableExpressions reachable via the CU
They won't affect the DWARF output, but they will mess with the
sorting of the fragments. This fixes the crash reported in PR34159.

https://bugs.llvm.org/show_bug.cgi?id=34159

llvm-svn: 311217
2017-08-19 01:15:06 +00:00
Eric Beckmann 91d8af5386 llvm-mt: Merge manifest namespaces.
mt.exe performs a tree merge where certain element nodes are combined
into one.  This introduces the possibility of xml namespaces conflicting
with each other.  The original mt.exe has a hierarchy whereby certain
namespace names can override others, and nodes that would then end up in
ambigious namespaces have their namespaces explicitly defined.  This
namespace handles this merging process.

llvm-svn: 311215
2017-08-19 00:37:41 +00:00
Eugene Zelenko be709f2c19 [Analysis] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC).
llvm-svn: 311212
2017-08-18 23:51:26 +00:00
Xinliang David Li 0d07f9d68a Fix comment /NFC
llvm-svn: 311209
2017-08-18 23:08:50 +00:00
Xinliang David Li 709ffe178e [Profile] backward propagate profile info in JumpThreading
Differential Revsion: http://reviews.llvm.org/D36864

llvm-svn: 311208
2017-08-18 23:00:05 +00:00
Amjad Aboud 88ffa3afe2 [InstCombine] Teach ComputeNumSignBitsImpl to handle integer multiply instruction.
Differential Revision: https://reviews.llvm.org/D36679

llvm-svn: 311206
2017-08-18 22:56:55 +00:00
Max Kazantsev 0aaf8c16ac [IRCE] Fix buggy behavior in Clamp
Clamp function was too optimistic when choosing signed or unsigned min/max function for calculations.
In fact, `!IsSignedPredicate` guarantees us that `Smallest` and `Greatest` can be compared safely using unsigned
predicates, but we did not check this for `S` which can in theory be negative.

This patch makes Clamp use signed min/max for cases when it fails to prove `S` being non-negative,
and it adds a test where such situation may lead to incorrect conditions calculation.

Differential Revision: https://reviews.llvm.org/D36873

llvm-svn: 311205
2017-08-18 22:50:29 +00:00
Justin Bogner b29bebe47b IR: Make stripDebugInfo robust against (invalid) empty basic blocks
Since stripDebugInfo runs before the verifier when reading IR, we can
end up in a situation where we read some invalid IR but don't know its
invalid yet. Before this patch we would crash in stripDebugInfo when
given IR with a completely empty basic block, and after we get a nice
error from the verifier instead.

llvm-svn: 311202
2017-08-18 21:38:03 +00:00
Jonas Devlieghere a2faf7b60f [llvm-dwarfdump] Hide .debug_str and DIE reference offsets in brief mode
This patch hides the .debug_str offset and DIE reference offsets into
the CU when llvm-dwarfdump is invoked with -brief.

Differential Revision: https://reviews.llvm.org/D36835

llvm-svn: 311201
2017-08-18 21:35:44 +00:00
Simon Pilgrim f36cca88fb [X86][ADX] Regenerate ADX intrinsics tests
llvm-svn: 311198
2017-08-18 21:21:14 +00:00
Sanjay Patel 7046cbd691 fix typos in comments; NFC
llvm-svn: 311193
2017-08-18 20:27:47 +00:00
Ana Pazos 6210f27dfc [PGO] Fixed assertion due to mismatched memcpy size type.
Summary:
Memcpy intrinsics have size argument of any integer type, like i32 or i64.
Fixed size type along with its value when cloning the intrinsic.

Reviewers: davidxl, xur

Reviewed By: davidxl

Subscribers: mcrosier, llvm-commits

Differential Revision: https://reviews.llvm.org/D36844

llvm-svn: 311188
2017-08-18 19:17:08 +00:00
Tim Northover 14302fcb24 ARM: use an external relocation for calls from MachO ARM mode.
The internal (__text-relative) relocation risks the offset not being encodable
if the destination is Thumb.

llvm-svn: 311187
2017-08-18 19:13:56 +00:00
Matt Morehouse 5c7fc76983 [SanitizerCoverage] Add stack depth tracing instrumentation.
Summary:
Augment SanitizerCoverage to insert maximum stack depth tracing for
use by libFuzzer.  The new instrumentation is enabled by the flag
-fsanitize-coverage=stack-depth and is compatible with the existing
trace-pc-guard coverage.  The user must also declare the following
global variable in their code:
  thread_local uintptr_t __sancov_lowest_stack

https://bugs.llvm.org/show_bug.cgi?id=33857

Reviewers: vitalybuka, kcc

Reviewed By: vitalybuka

Subscribers: kubamracek, hiraditya, cfe-commits, llvm-commits

Differential Revision: https://reviews.llvm.org/D36839

llvm-svn: 311186
2017-08-18 18:43:30 +00:00
Marek Sokolowski 5cd3d5c8d6 Reapply: [llvm-rc] Add basic RC scripts parsing ability.
As for now, the parser supports a limited set of statements and
resources. This will be extended in the following patches.

Thanks to Nico Weber (thakis) for his original work in this area.

This patch was originally submitted as r311175 and got reverted
in r311177 because of the problems with compilation under gcc.

Differential Revision: https://reviews.llvm.org/D36340

llvm-svn: 311184
2017-08-18 18:24:17 +00:00
Jonas Devlieghere e101b07a1d [Debug info] Transfer DI to fragment expressions for split integer values.
This patch teaches the SDag type legalizer how to split up debug info for
integer values that are split into a hi and lo part.

(re-commit)

Differential Revision: https://reviews.llvm.org/D36805

llvm-svn: 311181
2017-08-18 18:07:00 +00:00
Ben Dunbobbin 7b76de2dcb [lit] support unsetting env variables (again!)
This is an updated version of https://reviews.llvm.org/D22144 by @jlpeyton.

The patch was accepted but not landed.

This is useful functionality and I would like to use this to enable lit tests for environment variable behaviour.

Differential Revision: https://reviews.llvm.org/D36403

llvm-svn: 311180
2017-08-18 17:32:57 +00:00
Konstantin Zhuravlyov f5d826a294 AMDGPU/NFC: Rename few things in SIMemoryLegalizer:
- AtomicInfo -> MemOpInfo
  - getAtomicLoadInfo -> getLoadInfo
  - getAtomicStoreInfo -> getStoreInfo
  - expandAtomicLoad -> expandLoad
  - expandAtomicStore -> expandStore

Differential Revision: https://reviews.llvm.org/D36861

llvm-svn: 311179
2017-08-18 17:30:02 +00:00
Marek Sokolowski f276f52014 Revert "[llvm-rc] Add basic RC scripts parsing ability."
This reverts commit r311175.

This failed some buildbots compilation.

llvm-svn: 311177
2017-08-18 17:25:55 +00:00
Jakub Kuderski 756c09a58f [Dominators] Don't print the whole tree when running with -debug
As the incremental API is now used in several transforms, printing
the whole dominator tree creates a lot of noise when running with
the `-debug` flag. This patch fixes that.

llvm-svn: 311176
2017-08-18 17:06:37 +00:00
Marek Sokolowski dbc16476c1 [llvm-rc] Add basic RC scripts parsing ability.
As for now, the parser supports a limited set of statements and
resources. This will be extended in the following patches.

Thanks to Nico Weber (thakis) for his original work in this area.

Differential Revision: https://reviews.llvm.org/D36340

llvm-svn: 311175
2017-08-18 17:05:47 +00:00
Ben Dunbobbin ac6a5aab45 [Support] env vars with empty values on windows
An environment variable can be in one of three states:

1. undefined.
2. defined with a non-empty value.
3. defined but with an empty value.

The windows implementation did not support case 3
(it was not handling errors). The Linux implementation
is already correct.

Differential Revision: https://reviews.llvm.org/D36394

llvm-svn: 311174
2017-08-18 16:55:44 +00:00
Simon Pilgrim 879ce046ad [X86][BMI2] Added scheduling test for RORX/SARX/SHLX/SHRX instructions
llvm-svn: 311171
2017-08-18 16:26:39 +00:00
Brian Gesiak 8953a7c544 [Lexicon] Add "GEP"
Summary:
`getelementptr` is frequently abbreviated as "GEP", often in source files that
do not ever reference the full name of the instruction. Add it to the Lexicon,
in case readers go to look for what it means there.

Test plan:
1. `ninja sphinx`
2. Confirm that the rendered docs HTML contains the new "GEP" entry

llvm-svn: 311168
2017-08-18 15:35:53 +00:00
Simon Pilgrim 358aeae7b8 [X86][AES] Add scheduling latency/throughput tests for AES instructions
llvm-svn: 311167
2017-08-18 15:26:51 +00:00
Simon Pilgrim 9eb0869e91 [X86][PCLMUL] Add scheduling latency/throughput test for PCLMULQDQ instruction
Added it to the SSE42 tests as targets seem to always have both

llvm-svn: 311166
2017-08-18 15:08:30 +00:00
Simon Pilgrim ccaec26175 [X86][SHA] Add scheduling latency/throughput tests for SHA instructions
llvm-svn: 311164
2017-08-18 14:55:50 +00:00
Simon Pilgrim 7f506f7d72 [X86][MOVBE] Add scheduling latency/throughput tests for MOVBE instructions
llvm-svn: 311163
2017-08-18 14:44:31 +00:00
Sam Parker 04a7db5915 [ARM] Add PostRAScheduler option
This patch adds the option to allow also using the PostRA scheduler,
which brings the ARM backend inline with AArch64 targets. The
SchedModel can also set 'PostRAScheduler', as the R52 does, so also
query this property in the overridden function.

Differential Revision: https://reviews.llvm.org/D36866

llvm-svn: 311162
2017-08-18 14:27:51 +00:00
Simon Dardis 02c9a3dfc3 [mips] Follow up comments on r310460
Use dblaikie's suggestion of cast<> instead of a seperate assert.

llvm-svn: 311160
2017-08-18 13:27:02 +00:00
Simon Pilgrim 320f89782a [X86][BMI2] Added scheduling test for MULX instructions
llvm-svn: 311159
2017-08-18 13:22:18 +00:00
Sjoerd Meijer ec9581e5e0 [AArch64] Do not promote f16 when subtarget HasFullFP16
Armv8.2-A adds FP16 support, i.e. f16 is not only a storage-only type, but it
also supports performing data processing on 16-bit floating-point quantities.
All the necessary (tablegen) groundwork of adding the ARMv8.2-A FP16 (scalar)
instructions was done in D15014. To take advantage of this, this patch avoids
promotion of f16 to f32 types when the subtarget supports FullFP16, which
enables instruction selection of these FP16 instructions.

Differential Revision: https://reviews.llvm.org/D36396

llvm-svn: 311154
2017-08-18 10:51:14 +00:00
Renato Golin 6fd16d37ae [Triple] Define OS Check for Haiku
This adds the OS check for the Haiku operating system, as it was
missing in the Triple class. Tests for x86_64-unknown-haiku and
i586-pc-haiku were also added.

These patches only affect Haiku and are completely harmless for
other platforms.

Patch by Calvin Hill <calvin@hakobaito.co.uk>

llvm-svn: 311153
2017-08-18 10:35:42 +00:00
Ilya Biryukov 827c8acc21 Addressed some security issues in Dockerfiles.
Summary:
- Removed --trust-server-cert from `svn checkout` invocations.
  Installing 'ca-certificates' package on ubuntu adds required CAs to
  the system and svn can do proper checkout using https.

- Added checksum verification when installing cmake from cmake.org.

Reviewers: mehdi_amini, klimek

Reviewed By: mehdi_amini

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D36673

llvm-svn: 311152
2017-08-18 09:37:23 +00:00
Diana Picus 42ea77d5c2 Revert "GlobalISel (AArch64): fix ABI at border between GPRs and SP."
This reverts commit e8fd20964798ca6d46d2729dd3a789707a6416da in an
attempt to appease the GlobalISel buildbot, which fails in the
test-suite with errors like
fpcmp: files differ without tolerance allowance

llvm-svn: 311151
2017-08-18 09:31:21 +00:00
Sam Parker 25efe769c0 [AArch64] Fix for buildbots, unused function
Removing function declaration, my previous commit broke the bots.

llvm-svn: 311150
2017-08-18 09:08:05 +00:00
Victor Leschuk 091da14423 Remove useless default case in switch
llvm-svn: 311149
2017-08-18 09:02:06 +00:00
Sam Parker 96f8959cfd [AArch64] Remove DecodeAuthLoadWriteback
The BaseAuthLoad instruction class was incorrectly passing an empty
constraint string to its parent, so I have corrected this. This makes
the DecodeAuthLoadWriteback function redundant, so I've also removed
it.

Differential Revision: https://reviews.llvm.org/D36741

llvm-svn: 311148
2017-08-18 08:39:54 +00:00
Alex Bradbury f698a29a51 Refine report_fatal_error guidance after post-commit review
Use text suggested by Justin Bogner in post-commit review of r311146 
<http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20170814/479898.html>, 
which makes it clear that report_fatal_error shouldn't be used when there is a 
practicable alternative. Also make this clearer in CodingStandards.

llvm-svn: 311147
2017-08-18 06:45:34 +00:00
Alex Bradbury 7182440fbd Give guidance on report_fatal_error in CodingStandards.rst and ProgrammersManual.rst
The current ProgrammersManual.rst document has a lot of well-written 
documentation on error handling thanks to @lhames. It suggests errors can be 
split cleanly into "programmatic" and "recoverable" errors. However, the 
reality in current LLVM seems to be there are a number of cases where a 
non-programmatic error is not easily recoverable. Therefore, add a note to 
indicate the existence of report_fatal_error for these cases. I've also added 
a reminder to CodingStandards.rst in the section on assertions, to indicate 
that llvm_unreachable and assertions should not be relied upon to report 
errors triggered by user input.

The ProgrammersManual is also silent on the use of LLVMContext::diagnose, 
which is used in BPF+WebAssembly+AMDGPU to report some errors during 
instruction selection. I don't address that in this patch, as it's not quite 
clear how to fit in to the current error handling story

Differential Revision: https://reviews.llvm.org/D36826

llvm-svn: 311146
2017-08-18 05:29:21 +00:00
Craig Topper e3edd9c9be [DAGCombiner] Fix bad comment that had immediate values swapped from the code and what they need to be to make sense. NFC
llvm-svn: 311144
2017-08-18 04:52:46 +00:00
Jatin Bhateja e739fc7d11 Test commit access
Summary: Adding a blank line.

Differential Revision: https://reviews.llvm.org/D36859

llvm-svn: 311143
2017-08-18 02:39:28 +00:00
Geoff Berry bd47e8a4f7 Revert "[MachineCopyPropagation] Extend pass to do COPY source forwarding" round 2
This reverts commit r311135.

sanitizer-x86_64-linux-android buildbot is timing out with just this
patch applied.

llvm-svn: 311142
2017-08-18 01:43:11 +00:00
Richard Smith c0541dfa3e Increase tail dup threshold for -O3 from 3 to 4.
We see a modest performance improvement from this slightly higher tail dup threshold.

Differential Revision: https://reviews.llvm.org/D36775

llvm-svn: 311139
2017-08-17 23:38:41 +00:00
Craig Topper 1fae3ae6f0 [X86] Remove SSE/AVX patterns for AND/XOR/OR/ANDN that checked for the inputs being bitcasted from floating point types.
There's really no reason to do this we should just let isel pick the integer version and let the execution dependency fixing pass take care of moving to FP if necessary.

It's not very reliable to look for bitcasts at the edges of patterns. If for some reason one input was bitcasted and the other wasn't, or if one was a v4f32 bitcast and one was a v2f64 bitcast, we would have fallen back to the integer pattern anyway.

llvm-svn: 311138
2017-08-17 23:20:57 +00:00
Tim Northover 48fff995d6 GlobalISel (AArch64): fix ABI at border between GPRs and SP.
If a struct would end up half in GPRs and half on SP the ABI says it should
actually go entirely on the stack. We were getting this wrong in GlobalISel
before, causing compatibility issues.

llvm-svn: 311137
2017-08-17 23:14:01 +00:00
Geoff Berry 51f52c4fca Re-enable "[MachineCopyPropagation] Extend pass to do COPY source forwarding"
Two issues identified by buildbots were addressed:
    - The pass no longer forwards COPYs to physical register uses, since
      doing so can break code that implicitly relies on the physical
      register number of the use.
    - The pass no longer forwards COPYs to undef uses, since doing so
      can break the machine verifier by creating LiveRanges that don't
      end on a use (since the undef operand is not considered a use).

    [MachineCopyPropagation] Extend pass to do COPY source forwarding

    This change extends MachineCopyPropagation to do COPY source forwarding.

    This change also extends the MachineCopyPropagation pass to be able to
    be run during register allocation, after physical registers have been
    assigned, but before the virtual registers have been re-written, which
    allows it to remove virtual register COPY LiveIntervals that become dead
    through the forwarding of all of their uses.

    Reviewers: qcolombet, javed.absar, MatzeB, jonpa

    Subscribers: jyknight, nemanjai, llvm-commits, nhaehnle, mcrosier, mgorny

    Differential Revision: https://reviews.llvm.org/D30751

llvm-svn: 311135
2017-08-17 23:06:55 +00:00
Zachary Turner 4c432b202f Fix warning about covered switch default.
llvm-svn: 311129
2017-08-17 22:20:15 +00:00
Tom Stellard a096b12628 AMDGPU: Add R600InstPrinter class
Summary:
This is step towards separating the GCN and R600 tablegen'd code.

This is a little awkward for now, because the R600 functions won't have the
MCSubtargetInfo parameter, so we need to have AMDMGPUInstPrinter
delegate to R600InstPrinter, but once the tablegen'd code is split,
we will be able to drop the delegation and use R600InstPrinter directly.

Reviewers: arsenm

Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, llvm-commits

Differential Revision: https://reviews.llvm.org/D36444

llvm-svn: 311128
2017-08-17 22:20:04 +00:00
Jakub Kuderski e608ef7635 [LoopRotate][Dominators] Use the incremental API to update DomTree
Summary: This patch teaches LoopRotate to use the new incremental API to update the DominatorTree.

Reviewers: dberlin, davide, grosser, sanjoy

Reviewed By: dberlin, davide

Subscribers: hiraditya, llvm-commits, mzolotukhin

Differential Revision: https://reviews.llvm.org/D35581

llvm-svn: 311125
2017-08-17 21:48:19 +00:00
Eugene Zelenko 6e07bfd0d9 [CodeGen] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC).
llvm-svn: 311124
2017-08-17 21:26:39 +00:00
Zachary Turner 197bba0028 Remove unused variable.
llvm-svn: 311119
2017-08-17 20:18:36 +00:00
Zachary Turner 96bcd6a37a [llvm-pdbutil] Fix some dumping issues.
When dumping, we were treating the S_INLINESITESYM as referring
to a type record, when it actually refers to an id record.  We
had this correct in TypeIndexDiscovery, so our merging algorithm
should be fine, but we had it wrong in the dumper, which means it
would appear to work most of the time, unless the index was out
of bounds in the type stream, when it would fail.  Fixed this, and
audited a few other cases to make them match the behavior in
TypeIndexDiscovery.

Also, I've now observed a new symbol record with kind 0x1168 which
I have no clue what it is, so to avoid crashing we have to just
print "Unknown Symbol Kind".

llvm-svn: 311117
2017-08-17 20:04:51 +00:00
Zachary Turner f401e1102d Fix a few minor issues when dumping symbols.
1) We weren't handling symbol types that weren't able to parse,
   even if we knew what the leaf type was.  This was triggering
   when trying to dump /DEBUG:FASTLINK PDBs, where we expect a
   certain symbol to show up, but we just don't know how to parse
   it.
2) We lost the code for dumping record bytes, so this was added
   back.

llvm-svn: 311116
2017-08-17 20:04:31 +00:00
Lang Hames df1e59b640 [docs] Tweak phrasing of the varargs explanation in the command section of the
CMake primer.

This moves the introduction of the ARGV/ARGN variables up to immmediately follow
the introduction of the concept of variable argument functions, and explicitly
connects this concept to C varargs functions.

llvm-svn: 311113
2017-08-17 18:21:53 +00:00
Lang Hames 3a6a2d26c6 [docs] Fix typo and tweak wording of special variable handling in CMake primer.
llvm-svn: 311112
2017-08-17 18:00:28 +00:00
Jonas Devlieghere 30756da212 Revert "[Debug info] Transfer DI to fragment expressions for split integer values."
This reverts commit r311102.

llvm-svn: 311111
2017-08-17 17:58:33 +00:00
Alexey Bataev 84ad9ae032 [SimplifyCFG] Add a test for preserve store alignment, NFC.
llvm-svn: 311106
2017-08-17 17:26:52 +00:00
Sanjay Patel f2d67f7ecc [x86] add tests for vector select-of-constants; NFC
We've discussed canonicalizing to this form in IR, so the backend
should be prepared to lower these in ways better than what we see
here in most cases.

llvm-svn: 311103
2017-08-17 17:07:37 +00:00
Jonas Devlieghere 622fedc001 [Debug info] Transfer DI to fragment expressions for split integer values.
This patch teaches the SDag type legalizer how to split up debug info for
integer values that are split into a hi and lo part.

Differential Revision: https://reviews.llvm.org/D36805

llvm-svn: 311102
2017-08-17 17:06:48 +00:00
Sanjay Patel 18424e1581 [PowerPC] add tests for vector select-of-constants; NFC
We've discussed canonicalizing to this form in IR, so the backend
should be prepared to lower these in ways better than what we see
here.

llvm-svn: 311099
2017-08-17 17:03:11 +00:00
Adrian Prantl 6a57daad81 Improve line debug info when translating a CaseBlock to SDNodes.
The SelectionDAGBuilder translates various conditional branches into
CaseBlocks which are then translated into SDNodes. If a conditional
branch results in multiple CaseBlocks only the first CaseBlock is
translated into SDNodes immediately, the rest of the CaseBlocks are
put in a queue and processed when all LLVM IR instructions in the
basic block have been processed.

When a CaseBlock is transformed into SDNodes the SelectionDAGBuilder
is queried for the current LLVM IR instruction and the resulting
SDNodes are annotated with the debug info of the current
instruction (if it exists and has debug metadata).

When the deferred CaseBlocks are processed, the SelectionDAGBuilder
does not have a current LLVM IR instruction, and the resulting SDNodes
will not have any debuginfo. As DwarfDebug::beginInstruction() outputs
a .loc directive for the first instruction in a labeled
block (typically the case for something coming from a CaseBlock) this
tends to produce a line-0 directive.

This patch changes the handling of CaseBlocks to store the current
instruction's debug info into the CaseBlock when it is created (and the
SelectionDAGBuilder knows the current instruction) and to always use
the stored debug info when translating a CaseBlock to SDNodes.

Patch by Frej Drejhammar!

Differential Revision: https://reviews.llvm.org/D36671

llvm-svn: 311097
2017-08-17 16:57:13 +00:00
Jakub Kuderski e35a449140 [Dominators] Teach LoopUnswitch to use the incremental API
Summary:
This patch makes LoopUnswitch use new incremental API for updating dominators.
It also updates SplitCriticalEdge, as it is called in LoopUnswitch.

There doesn't seem to be any noticeable performance difference when bootstrapping clang with this patch.

Reviewers: dberlin, davide, sanjoy, grosser, chandlerc

Reviewed By: davide, grosser

Subscribers: mzolotukhin, llvm-commits

Differential Revision: https://reviews.llvm.org/D35528

llvm-svn: 311093
2017-08-17 16:45:35 +00:00
Craig Topper 3a622a14f9 [AVX512] Don't switch unmasked subvector insert/extract instructions when AVX512DQI is enabled.
There's no reason to switch instructions with and without DQI. It just creates extra isel patterns and test divergences.

There is however value in enabling the masked version of the instructions with DQI.

This required introducing some new multiclasses to enabling this splitting.

Differential Revision: https://reviews.llvm.org/D36661

llvm-svn: 311091
2017-08-17 15:40:25 +00:00
Craig Topper 5960848060 [X86] Remove memopmmx pattern fragment
Summary: Just like the FIXME says, there is no alignment requirement for MMX.

Reviewers: RKSimon, zvi, igorb

Reviewed By: RKSimon

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D36815

llvm-svn: 311090
2017-08-17 15:25:05 +00:00
Victor Leschuk bcfd8e28c8 Mark Verifier/invalid-eh.ll as unsupported on windows
Mark this unsupported for now as it causes tests hangs on buildbot.
Will place it back when the problem is debugged.

llvm-svn: 311089
2017-08-17 15:07:03 +00:00
Simon Dardis b5205c69d2 [dfsan] Add explicit zero extensions for shadow parameters in function wrappers.
In the case where dfsan provides a custom wrapper for a function,
shadow parameters are added for each parameter of the function.
These parameters are i16s. For targets which do not consider this
a legal type, the lack of sign extension information would cause
LLVM to generate anyexts around their usage with phi variables
and calling convention logic.

Address this by introducing zero exts for each shadow parameter.

Reviewers: pcc, slthakur

Differential Revision: https://reviews.llvm.org/D33349

llvm-svn: 311087
2017-08-17 14:14:25 +00:00
Daniel Sanders 032e7f2cad [globalisel][tablegen] Generate TypeObject table. NFC
Summary:
Generate the type table from the types used by a target rather than hard-coding
the union of types used by all targets.

Depends on D36084

Reviewers: ab, t.p.northover, qcolombet, rovka, aditya_nandakumar

Reviewed By: rovka

Subscribers: kristof.beyls, igorb, llvm-commits

Differential Revision: https://reviews.llvm.org/D36085

llvm-svn: 311084
2017-08-17 13:18:35 +00:00
Simon Pilgrim 8be9f4af4f [DAGCombiner] Add support for non-uniform constant vectors to (mul x, (1 << c)) -> x << c
llvm-svn: 311083
2017-08-17 13:03:34 +00:00
Amjad Aboud 19f15843ab [X86] Refactoring of X86TargetLowering::EmitLoweredSelect. NFC.
Authored by aivchenk
Differential Revision: https://reviews.llvm.org/D35685

llvm-svn: 311082
2017-08-17 12:12:30 +00:00
Davide Italiano 903fd3ea4e [Verifier] Avoid visiting DIGlobalVariables twice.
We currently visit them twice.
Once, through `visitMDNode()` -> (the code generated by)
  `../include/llvm/IR/Metadata.def:109` -> `visitDIGlobalVariable()`
Then, through `visitMDNode()` -> `visitDIGlobalVariableExpression()`
  -> `visitDIGlobalVariable()`

This results in verification failures printed twice, e.g.:

  $ ./opt -verify ../../test/DebugInfo/pr34186.ll
  missing global variable type
  !4 = distinct !DIGlobalVariable(name: "pat", scope: !0,
    file: !1, line: 27, isLocal: true, isDefinition: true)
  missing global variable type
  !4 = distinct !DIGlobalVariable(name: "pat", scope: !0,
    file: !1, line: 27, isLocal: true, isDefinition: true)
  ./opt: ../../test/DebugInfo/pr34186.ll: error: input module is broken!

The patch removes one call so we ensure each GV is visited exactly once.

Differential Revision:  https://reviews.llvm.org/D36797

llvm-svn: 311081
2017-08-17 11:32:21 +00:00
Ayal Zaks 6627883369 [LV] Using VPlan to model the vectorized code and drive its transformation
VPlan is an ongoing effort to refactor and extend the Loop Vectorizer. This
patch introduces the VPlan model into LV and uses it to represent the vectorized
code and drive the generation of vectorized IR.

In this patch VPlan models the vectorized loop body: the vectorized control-flow
is represented using VPlan's Hierarchical CFG, with predication refactored from
being a post-vectorization-step into a vectorization planning step modeling
if-then VPRegionBlocks, and generating code inline with non-predicated code. The
vectorized code within each VPBasicBlock is represented as a sequence of
Recipes, each responsible for modelling and generating a sequence of IR
instructions. To keep the size of this commit manageable the Recipes in this
patch are coarse-grained and capture large chunks of LV's code-generation logic.
The constructed VPlans are dumped in dot format under -debug.

This commit retains current vectorizer output, except for minor instruction
reorderings; see associated modifications to lit tests.

For further details on the VPlan model see docs/Proposals/VectorizationPlan.rst
and its references.

Authors: Gil Rapaport and Ayal Zaks

Differential Revision: https://reviews.llvm.org/D32871

llvm-svn: 311077
2017-08-17 09:29:59 +00:00
Daniel Sanders edd0784be6 Re-commit: [globalisel][tablegen] Support zero-instruction emission.
Summary:
Support the case where an operand of a pattern is also the whole of the
result pattern. In this case the original result and all its uses must be
replaced by the operand. However, register class restrictions can require
a COPY. This patch handles both cases by always emitting the copy and
leaving it for the register allocator to optimize.

The previous commit failed on Windows machines due to a flaw in the sort
predicate which allowed both A < B < C and B == C to be satisfied
simultaneously. The cause of this was some sloppiness in the priority order of
G_CONSTANT instructions compared to other instructions. These had equal priority
because it makes no difference, however there were operands had higher priority
than G_CONSTANT but lower priority than any other instruction. As a result, a
priority order between G_CONSTANT and other instructions must be enforced to
ensure the predicate defines a strict weak order.

Reviewers: ab, t.p.northover, qcolombet, rovka, aditya_nandakumar

Subscribers: javed.absar, kristof.beyls, igorb, llvm-commits

Differential Revision: https://reviews.llvm.org/D36084

llvm-svn: 311076
2017-08-17 09:26:14 +00:00
Jonas Paulsson 593d49c0d9 [SystemZ] Also wrap TII with #ifndef NDEBUG in constructor initilizer list.
TII needs to be wrapped with #ifndef NDEBUG to silece compiler warnings.

llvm-svn: 311075
2017-08-17 09:18:02 +00:00
Jonas Paulsson d346924a0e [SystemZ] Add a wrapping with #ifndef NDEBUG to silence warning.
SystemZHazardRecognizer::TII is only used for debug output, so it needs
also to be wrapped with #ifndef NDEBUG.

llvm-svn: 311074
2017-08-17 08:56:09 +00:00
Jonas Paulsson 57a705d9d0 [SystemZ, MachineScheduler] Improve post-RA scheduling.
The idea of this patch is to continue the scheduler state over an MBB boundary
in the case where the successor block has only one predecessor. This means
that the scheduler will continue in the successor block (after emitting any
branch instructions) with e.g. maintained processor resource counters.
Benchmarks have been confirmed to benefit from this.

The algorithm in MachineScheduler.cpp that extracts scheduling regions of an
MBB has been extended so that the strategy may optionally reverse the order
of processing the regions themselves. This is controlled by a new method
doMBBSchedRegionsTopDown(), which defaults to false.

Handling the top-most region of an MBB first also means that a top-down
scheduler can continue the scheduler state across any scheduling boundary
between to regions inside MBB.

Review: Ulrich Weigand, Matthias Braun, Andy Trick.
https://reviews.llvm.org/D35053

llvm-svn: 311072
2017-08-17 08:33:44 +00:00
Elad Cohen 124d32829c [SelectionDAG] Teach the vector-types operand scalarizer about SETCC
When v1i1 is legal (e.g. AVX512) the legalizer can reach
a case where a v1i1 SETCC with an illgeal vector type operand
wasn't scalarized (since v1i1 is legal) but its operands does
have to be scalarized. This used to assert because SETCC was
missing from the vector operand scalarizer.

This patch attemps to teach the legalizer to handle these cases
by scalazring the operands, converting the node into a scalar
SETCC node.

Differential revision: https://reviews.llvm.org/D36651

llvm-svn: 311071
2017-08-17 08:06:36 +00:00
Martin Storsjo caff3268a1 [llvm-dlltool] Improve an error message when unable to open files. NFC.
Differential Revision: https://reviews.llvm.org/D36818

llvm-svn: 311069
2017-08-17 06:26:42 +00:00
Martin Storsjo 9d8ecb4333 [llvm-dlltool] Don't crash if no def file is provided or it can't be opened
Differential Revision: https://reviews.llvm.org/D36780

llvm-svn: 311068
2017-08-17 05:58:27 +00:00
Serguei Katkov 9e5604dbe1 [CGP] Fix the rematerialization of gc.relocates
If we want to substitute the relocation of derived pointer with gep of base then
we must ensure that relocation of base dominates the relocation of derived pointer.

Currently only check for basic block is present. However it is possible that both
relocation are in the same basic block but relocation of derived pointer is defined
earlier.

The patch moves the relocation of base pointer right before relocation of derived
pointer in this case.

Reviewers: sanjoy,artagnon,igor-laevsky,reames
Reviewed By: reames
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D36462

llvm-svn: 311067
2017-08-17 05:48:30 +00:00
Geoff Berry 4e38e02e6f Revert "[MachineCopyPropagation] Extend pass to do COPY source forwarding"
This reverts commit r311038.

Several buildbots are breaking, and at least one appears to be due to
the forwarding of physical regs enabled by this change.  Reverting while
I investigate further.

llvm-svn: 311062
2017-08-17 04:04:11 +00:00
Saleem Abdulrasool dd8c16b58e ARM: mark CPSR as clobbered for Windows VLAs
When lowering a VLA, we emit a __chstk call.  However, this call can
internally clobber CPSR.  We did not mark this register as an ImpDef,
which could potentially allow a comparison to be hoisted above the call
to `__chkstk`.  In such a case, the CPSR could be clobbered, and the
check invalidated.  When the support was initially added, it seemed that
the call would take care of preventing CPSR from being clobbered, but
this is not the case.  Mark the register as clobbered to fix a possible
state corruption.

llvm-svn: 311061
2017-08-17 02:42:24 +00:00
Craig Topper 2f9743d2ea [X86] Exchange the memory op predicate for PALIGNR/VPALIGNR. I accidentally swapped them.
llvm-svn: 311060
2017-08-17 02:34:35 +00:00
Craig Topper 5357526ce8 [X86] Cleanup multiclasses for SSE/AVX2 PALIGNR. Add missing load patterns.
We used to have a separate multiclass for AVX2 and SSE/AVX. Now we have one multiclass and pass the relevant differences.

We were also missing load patterns, though we had them for the AVX-512 version.

llvm-svn: 311059
2017-08-17 01:48:03 +00:00
Craig Topper bbe3e46bb9 [X86] Remove patterns for PALIGNR with non-vXi8 types.
llvm-svn: 311058
2017-08-17 01:48:00 +00:00
Jakub Kuderski fd5c5c9144 Reapply: [ADCE][Dominators] Teach ADCE to preserve dominators
Summary:
This patch teaches ADCE to preserve both DominatorTrees and PostDominatorTrees.

I didn't notice any performance impact when bootstrapping clang with this patch.

The patch was originally committed in r311039 and reverted in r311049.
This revision fixes the problem with not adding a dependency on the
DominatorTreeWrapperPass for the LegacyPassManager.

Reviewers: dberlin, chandlerc, sanjoy, davide, grosser, brzycki

Reviewed By: davide

Subscribers: grandinj, zhendongsu, llvm-commits, david2050

Differential Revision: https://reviews.llvm.org/D35869

llvm-svn: 311057
2017-08-17 01:41:49 +00:00
Craig Topper 42a535351e [X86] Put multiclass closer to its use and simplify slightly. NFC
llvm-svn: 311055
2017-08-16 23:38:25 +00:00
Craig Topper 9025579e8a [X86] Use a static array instead of a SmallVector for a small fixed size array. NFC
llvm-svn: 311054
2017-08-16 23:16:43 +00:00
Sanjay Patel 4abc3f6036 [x86] add cmov promotion tests for D36711; NFC
This way we can see what the current codegen looks like.
I've also explicitly added/removed the cmov attribute from the RUN lines,
so we know exactly what we're checking in the runs.

llvm-svn: 311052
2017-08-16 22:50:11 +00:00
Amjad Aboud 86111c6696 [InstCombine] Teach canEvaluateTruncated to handle arithmetic shift (including those with vector splat shift amount)
Differential Revision: https://reviews.llvm.org/D36784

llvm-svn: 311050
2017-08-16 22:42:38 +00:00
Jakub Kuderski cbcffb173c Revert "[ADCE][Dominators] Teach ADCE to preserve dominators"
This reverts commit r311039. The patch caused the
`test/Bindings/OCaml/Output/scalar_opts.ml` to fail.

llvm-svn: 311049
2017-08-16 22:10:53 +00:00
Eugene Zelenko bb1b2d09cf [Analysis] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC).
llvm-svn: 311048
2017-08-16 22:07:40 +00:00
Craig Topper 882f29630b [InstCombine] Make folding (X >s -1) ? C1 : C2 --> ((X >>s 31) & (C2 - C1)) + C1 support splat vectors
This also uses decomposeBitTestICmp to decode the compare.

Differential Revision: https://reviews.llvm.org/D36781

llvm-svn: 311044
2017-08-16 21:52:07 +00:00
Jakub Kuderski 4552e9de9f [ADCE][Dominators] Teach ADCE to preserve dominators
Summary:
This patch teaches ADCE to preserve both DominatorTrees and PostDominatorTrees.

I didn't notice any performance impact when bootstrapping clang with this patch.

Reviewers: dberlin, chandlerc, sanjoy, davide, grosser, brzycki

Reviewed By: davide

Subscribers: grandinj, zhendongsu, llvm-commits, david2050

Differential Revision: https://reviews.llvm.org/D35869

llvm-svn: 311039
2017-08-16 20:50:23 +00:00
Geoff Berry 87f8d25150 [MachineCopyPropagation] Extend pass to do COPY source forwarding
This change extends MachineCopyPropagation to do COPY source forwarding.

This change also extends the MachineCopyPropagation pass to be able to
be run during register allocation, after physical registers have been
assigned, but before the virtual registers have been re-written, which
allows it to remove virtual register COPY LiveIntervals that become dead
through the forwarding of all of their uses.

Reviewers: qcolombet, javed.absar, MatzeB, jonpa

Subscribers: jyknight, nemanjai, llvm-commits, nhaehnle, mcrosier, mgorny

Differential Revision: https://reviews.llvm.org/D30751

llvm-svn: 311038
2017-08-16 20:50:01 +00:00
Petr Hosek ad00bd603e [CMake][runtimes] Support for building target variants
This can be used to build non-sanitized and sanitized versions of
runtimes, where sanitized versions use the just built sanitizer
which in turn may use the non-sanitized version.

Differential Revision: https://reviews.llvm.org/D36348

llvm-svn: 311036
2017-08-16 19:13:45 +00:00
Geoff Berry 40549ad1ac [LoopDataPrefetch][AArch64FalkorHWPFFix] Preserve ScalarEvolution
Summary:
Mark LoopDataPrefetch and AArch64FalkorHWPFFix passes as preserving
ScalarEvolution since they do not alter loop structure and should not
alter any SCEV values (though LoopDataPrefetch may introduce new
instructions that won't have cached SCEV values yet).

This can result in slight code differences, mainly w.r.t. nsw/nuw flags
on SCEVs, since these are computed somewhat lazily when a zext/sext
instruction is encountered.  As a result, passes after the modified
passes may see SCEVs with more nsw/nuw flags present.

Reviewers: sanjoy, anemet

Subscribers: aemerson, rengolin, mzolotukhin, javed.absar, kristof.beyls, mcrosier, llvm-commits

Differential Revision: https://reviews.llvm.org/D36716

llvm-svn: 311032
2017-08-16 19:03:16 +00:00
Simon Atanasyan cb833076ac [mips] Handle R_MIPS_TLS_DTPREL32/64 relocations in the RelocVisitor
Debug information for TLS variables on MIPS might have R_MIPS_TLS_DTPREL32
or R_MIPS_TLS_DTPREL64 relocations. This patch adds a support for such
relocations in the `RelocVisitor`.

llvm-svn: 311031
2017-08-16 19:01:22 +00:00
Adrian Prantl 3d523a657a Add a convenience overload of DWARFDie::dump() for debugging purposes.
llvm-svn: 311026
2017-08-16 17:43:01 +00:00
Xinliang David Li 5a57b842cf Add more comment
llvm-svn: 311025
2017-08-16 17:33:43 +00:00
Xinliang David Li 71ecaa19ff [PGO] Fix ThinLTO crash
Differential Revsion: http://reviews.llvm.org/D36640

llvm-svn: 311023
2017-08-16 17:18:01 +00:00
Evgeny Mankov bf9751760a [AMDGPU] NFC: test commit
llvm-svn: 311019
2017-08-16 16:47:29 +00:00
Konstantin Zhuravlyov d3d89efa3e AMDGPU/NFC: Sort files in CMakeLists.txt alphabetically
llvm-svn: 311017
2017-08-16 16:23:32 +00:00
Simon Pilgrim 38e8a023fa [X86] Regenerate immediate store merging tests
llvm-svn: 311016
2017-08-16 16:22:19 +00:00
Jakub Kuderski 624463a003 [Dominators] Introduce batch updates
Summary:
This patch introduces a way of informing the (Post)DominatorTree about multiple CFG updates that happened since the last tree update. This makes performing tree updates much easier, as it internally takes care of applying the updates in lockstep with the (virtual) updates to the CFG, which is done by reverse-applying future CFG updates.

The batch updater is able to remove redundant updates that cancel each other out. In the future, it should be also possible to reorder updates to reduce the amount of work needed to perform the updates.

Reviewers: dberlin, sanjoy, grosser, davide, brzycki

Reviewed By: brzycki

Subscribers: mgorny, llvm-commits

Differential Revision: https://reviews.llvm.org/D36167

llvm-svn: 311015
2017-08-16 16:12:52 +00:00
Hal Finkel 9e54b7093a [BDCE] Don't check demanded bits on unsized types
To clear assumptions that are potentially invalid after trivialization, we need
to walk the use/def chain. Normally, the only way to reach an instruction with
an unsized type is via an instruction that has side effects (or otherwise will
demand its input bits). That would stop the walk. However, if we have a
readnone function that returns an unsized type (e.g., void), we must avoid
asking for the demanded bits of the function call's return value. A
void-returning readnone function is always dead (and so we can stop walking the
use/def chain here), but the check is necessary to avoid asserting.

Fixes PR34211.

llvm-svn: 311014
2017-08-16 16:09:22 +00:00
Davide Italiano cd21378ff6 [Verifier] Reject globals without a type associated.
llvm-svn: 311012
2017-08-16 15:16:33 +00:00
Dmitry Preobrazhensky b865ef534a [AMDGPU][MC][GFX9] Added op_sel support for v_mad_*16, v_fma_f16, v_div_fixup_f16
This change implements features postponed in https://reviews.llvm.org/D35424 because of a dependency on https://reviews.llvm.org/D36322

Reviewers: SamWot, artem.tamazov, arsenm

Differential Revision: https://reviews.llvm.org/D36694

llvm-svn: 311011
2017-08-16 15:16:32 +00:00
Sanjay Patel 042a53624c [DemandedBits] simplify call; NFC
llvm-svn: 311009
2017-08-16 14:28:23 +00:00
Balaram Makam c5698befb6 Revert "MachineInstr: Reason locally about some memory objects before going to AA."
r310825 caused the clang-ppc64le-linux-lnt bot to go red
(http://lab.llvm.org:8011/builders/clang-ppc64le-linux-lnt/builds/5712)
because of a test-suite failure of
SingleSource/UnitTests/2003-07-09-SignedArgs

This reverts commit 0028f6a87224fb595a1c19c544cde9b003035996.

llvm-svn: 311008
2017-08-16 14:17:43 +00:00
Dmitry Preobrazhensky ff64aa514b [AMDGPU][MC][GFX9] Added integer clamping support for VOP3 opcodes
See Bug 34152: https://bugs.llvm.org//show_bug.cgi?id=34152

Reviewers: SamWot, artem.tamazov, arsenm

Differential Revision: https://reviews.llvm.org/D36674

llvm-svn: 311006
2017-08-16 13:51:56 +00:00
Simon Pilgrim c63f93a197 [CostModel][X86][XOP] Improve costs for XOP shuffles
VPPERM/VPERMIL2PD/VPERMIL2PS all provide more effective 2-input shuffles than regular AVX instructions

llvm-svn: 311005
2017-08-16 13:50:20 +00:00
Davide Italiano 75ebc568e2 [DI] Every DIGlobalVariable should have a type.
I'll make this a verifier check to catch other violations. This
commit fixes the tests already in tree.

llvm-svn: 311004
2017-08-16 13:39:07 +00:00
Simon Dardis 9c5d64b901 [mips] Handle variables with an explicit section and interactions with .sdata, .sbss
If a variable has an explicit section such as .sdata or .sbss, it is placed
in that section and accessed in a gp relative manner. This overrides the global
-G setting.

Otherwise if a variable has a explicit section attached to it, such as '.rodata'
or '.mysection', it is not placed in the small data section. This also overrides
the global -G setting.

Reviewers: atanasyan, nitesh.jain

Differential Revision: https://reviews.llvm.org/D36616

llvm-svn: 311001
2017-08-16 12:18:04 +00:00
Sam Parker 84fd0c3bf2 [ARM] Improve loop unrolling for Cortex-M
- Set the default runtime unroll count to 4 and use the newly added
  UnrollRemainder option.
- Create loop cost and force unroll for a cost less than 12.
- Disable unrolling on Thumb1 only targets.

Differential Revision: https://reviews.llvm.org/D36134

llvm-svn: 310997
2017-08-16 07:42:44 +00:00
Igor Breger ce5ea38135 [GlobalISel][X86] Fix mir tests, use correct physical register.NFC.
llvm-svn: 310996
2017-08-16 07:25:51 +00:00
Martin Storsjo e1f120bbcb [COFF] Make the weak aliases optional
When creating an import library from lld, the cases with
Name != ExtName shouldn't end up as a weak alias, but as a real
export of the new name, which is what actually is exported from
the DLL.

This restores the behaviour of renamed exports to what it was in
4.0.

The other half of this commit, including test, goes into lld.

Differential Revision: https://reviews.llvm.org/D36633

llvm-svn: 310991
2017-08-16 05:22:49 +00:00
Martin Storsjo 58c9527eaf [llvm-dlltool] Fix creating stdcall/fastcall import libraries for i386
Hook up the -k option (that in the original GNU dlltool removes the
@n suffix from the symbol that the final executable ends up linked to).

In llvm-dlltool, make sure that functions end up with the undecorate
name type if this option is set and they are decorated. In mingw, when
creating import libraries from def files instead of creating an import
library as a side effect of linking a DLL, the symbol names in the def
contain the stdcall/fastcall decoration (but no leading underscore).

By setting the undecorate name type, a linker linking to the import
library will omit the decoration from the DLL import entry.

With this in place, mingw-w64 for i386 built with llvm-dlltool/clang
produces import libraries that actually work.

Differential Revision: https://reviews.llvm.org/D36548

llvm-svn: 310990
2017-08-16 05:18:36 +00:00
Martin Storsjo a238b20e23 [COFF] Add SymbolName as a distinct field in COFFImportFile
The previous Name and ExtName aren't enough to convey all the nuances
between weak aliases and stdcall decorated function names.

A test for this will be added in LLD.

Differential Revision: https://reviews.llvm.org/D36544

llvm-svn: 310988
2017-08-16 05:13:16 +00:00
Stanislav Mekhanoshin a9487d92d7 [AMDGPU] Eliminate no effect instructions before s_endpgm
Differential Revision: https://reviews.llvm.org/D36585

llvm-svn: 310987
2017-08-16 04:43:49 +00:00
Dehao Chen 84d412035a Merge debug info when hoist then-else code to if.
Summary: When we move then-else code to if, we need to merge its debug info, otherwise the hoisted instruction may have inaccurate debug info attached.

Reviewers: aprantl, probinson, dblaikie, echristo, loladiro

Reviewed By: aprantl

Subscribers: sanjoy, llvm-commits

Differential Revision: https://reviews.llvm.org/D36778

llvm-svn: 310985
2017-08-16 01:55:26 +00:00
Derek Schuff 8e71359561 [WebAssembly] Remove infinite loop from reg-stackify test
r310940 exposed reverse-unreachable code to some optimizers,
which caused some of the code in this test to be sunk, changing
the input to the pass and breaking the exptectations.

Since that change is irrelevant to this particular test, this change
just adds an exit node to work around the problem; the
test should really be more robust (or be an MIR test?) but this preserves
the existing test intent.

llvm-svn: 310981
2017-08-16 00:49:44 +00:00
Quentin Colombet 647b482c1e [VirtRegRewriter] Properly model the register liveness on undef subreg definition
Undef subreg definition means that the content of the super register
doesn't matter at this point. While that's true for virtual registers,
this may not hold when replacing them with actual physical registers.
Indeed, some part of the physical register may be coalesced with the
related virtual register and thus, the values for those parts matter and
must be live.

The fix consists in checking whether or not subregs of the physical register
being assigned to an undef subreg definition are live through that def and
insert an implicit use if they are. Doing so, will keep them alive until
that point like they should be.

E.g., let vreg14 being assigned to R0_R1 then
%vreg14:gsub_0<def,read-undef> = COPY %R0 ; <-- R1 is still live here
%vreg14:gsub_1<def> = COPY %R1

Before this changes, the rewriter would change the code into:
%R0<def> = KILL %R0, %R0_R1<imp-def> ; <-- this tells R1 is redefined
%R1<def> = KILL %R1, %R0_R1<imp-def>, %R0_R1<imp-use> ; this value of this R1
                                                      ; is believed to come
                                                      ; from the previous
                                                      ; instruction

Because of this invalid liveness, later pass could make wrong choices and in
particular clobber live register as it happened with the register scavenger in
llvm.org/PR34107

Now we would generate:
%R0<def> = KILL %R0, %R0_R1<imp-def>, %R0_R1<imp-use> ; This tells R1 needs to
                                                      ; reach this point
%R1<def> = KILL %R1, %R0_R1<imp-def>, %R0_R1<imp-use>

The bug has been here forever, it got exposed recently because the register
scavenger got smarter.

Fixes llvm.org/PR34107

llvm-svn: 310979
2017-08-16 00:17:05 +00:00
Kuba Mracek a1adbe6ca1 Revert archive-* tests from r310953, there were test failures.
llvm-svn: 310974
2017-08-15 23:41:34 +00:00
Craig Topper 0a1a276d91 [InstCombine] Teach canEvaluateZExtd and canEvaluateTruncated to handle vector shifts with splat shift amount
We were only allowing ConstantInt before. This patch allows splat of ConstantInt too.

Differential Revision: https://reviews.llvm.org/D36763

llvm-svn: 310970
2017-08-15 22:48:41 +00:00
Quentin Colombet 61d71a138b Reapply "[GlobalISel] Remove the GISelAccessor API."
This reverts commit r310425, thus reapplying r310335 with a fix for link
issue of the AArch64 unittests on Linux bots when BUILD_SHARED_LIBS is ON.

Original commit message:
[GlobalISel] Remove the GISelAccessor API.

Its sole purpose was to avoid spreading around ifdefs related to
building global-isel. Since r309990, GlobalISel is not optional anymore,
thus, we can get rid of this mechanism all together.

NFC.

----
The fix for the link issue consists in adding the GlobalISel library in
the list of dependencies for the AArch64 unittests. This dependency
comes from the use of AArch64Subtarget that needs to know how
to destruct the GISel related APIs when being detroyed.

Thanks to Bill Seurer and Ahmed Bougacha for helping me reproducing and
understand the problem.

llvm-svn: 310969
2017-08-15 22:31:51 +00:00
Charles Saternos 55d93e79df [ThinLTO] Fix ThinLTO crash while destroying context
Fix for PR32763

An assert that checks if a Ref was untracked fails during ThinLTO context cleanup. The issue is because lazy loading temporary nodes didn't properly track ValueAsMetadata nodes. This patch ensures that the temporary nodes are properly tracked when they're replaced with the value.

llvm-svn: 310967
2017-08-15 22:23:44 +00:00
Kuba Mracek c0301b2f53 Revert changes in r310953 for llvm-symbolizer.test. The change causes a test failure.
llvm-svn: 310956
2017-08-15 21:02:17 +00:00
Tony Tye 46d3576c24 Update AMDGPUUsage.rst documentation:
1. Correct description of the kernel initial state for FLAT_SCRATCH_INIT.
    2. Add link to GFX9 architecture documentation.
    3. Update product names.
    4. Rename note record from NT_AMD_AMDGPU_METADATA to NT_AMD_AMDGPU_HSA_METADATA and move description to the AMDHSA coding convention section.
    5. Minor typo corrections.

Differential Revision: https://reviews.llvm.org/D36549

llvm-svn: 310954
2017-08-15 20:47:41 +00:00
Kuba Mracek 17ee427ef3 [llvm] Get rid of "%T" expansions
The %T lit expansion expands to a common directory shared between all the tests in the same directory, which is unexpected and unintuitive, and more importantly, it's been a source of subtle race conditions and flaky tests. In https://reviews.llvm.org/D35396, it was agreed that it would be best to simply ban %T and only keep %t, which is unique to each test. When a test needs a temporary directory, it can just create one using mkdir %t.

This patch removes %T in llvm.

Differential Revision: https://reviews.llvm.org/D36495

llvm-svn: 310953
2017-08-15 20:29:24 +00:00
Amjad Aboud 0464c5d958 [InstCombine] Added support for (X >>s C) << C --> X & (-1 << C)
Differential Revision: https://reviews.llvm.org/D36743

llvm-svn: 310949
2017-08-15 19:33:14 +00:00
Lang Hames e815bf3cd8 [ORC][Kaleidoscope] Update Chapter 1 of BuildingAJIT to incorporate recent ORC
API changes.

llvm-svn: 310947
2017-08-15 19:20:10 +00:00
Sanjay Patel f69b7d5c93 [InstCombine] sink sext after ashr
Narrow ops are better for bit-tracking, and in the case of vectors,
may enable better codegen.

As the trunc test shows, this can allow follow-on simplifications.

There's a block of code in visitTrunc that deals with shifted ops
with FIXME comments. It may be possible to remove some of that now,
but I want to make sure there are no problems with this step first.

http://rise4fun.com/Alive/Y3a

Name: hoist_ashr_ahead_of_sext_1
  %s = sext i8 %x to i32
  %r = ashr i32 %s, 3  ; shift value is < than source bit width
  =>
  %a = ashr i8 %x, 3
  %r = sext i8 %a to i32
  
Name: hoist_ashr_ahead_of_sext_2
  %s = sext i8 %x to i32
  %r = ashr i32 %s, 8  ; shift value is >= than source bit width
  =>
  %a = ashr i8 %x, 7   ; so clamp this shift value
  %r = sext i8 %a to i32
  
Name: junc_the_trunc  
  %a = sext i16 %v to i32
  %s = ashr i32 %a, 18
  %t = trunc i32 %s to i16
  =>
  %t = ashr i16 %v, 15
llvm-svn: 310942
2017-08-15 18:25:52 +00:00
Jakub Kuderski 638c085d07 [Dominators] Include infinite loops in PostDominatorTree
Summary:
This patch teaches PostDominatorTree about infinite loops. It is built on top of D29705 by @dberlin which includes a very detailed motivation for this change.

What's new is that the patch also teaches the incremental updater how to deal with reverse-unreachable regions and how to properly maintain and verify tree roots. Before that, the incremental algorithm sometimes ended up preserving reverse-unreachable regions after updates that wouldn't appear in the tree if it was constructed from scratch on the same CFG.

This patch makes the following assumptions:
- A sequence of updates should produce the same tree as a recalculating it.
- Any sequence of the same updates should lead to the same tree.
- Siblings and roots are unordered.

The last two properties are essential to efficiently perform batch updates in the future.
When it comes to the first one, we can decide later that the consistency between freshly built tree and an updated one doesn't matter match, as there are many correct ways to pick roots in infinite loops, and to relax this assumption. That should enable us to recalculate postdominators less frequently.

This patch is pretty conservative when it comes to incremental updates on reverse-unreachable regions and ends up recalculating the whole tree in many cases. It should be possible to improve the performance in many cases, if we decide that it's important enough.
That being said, my experiments showed that reverse-unreachable are very rare in the IR emitted by clang when bootstrapping  clang. Here are the statistics I collected by analyzing IR between passes and after each removePredecessor call:

```
# functions:  52283
# samples:  337609
# reverse unreachable BBs:  216022
# BBs:  247840796
Percent reverse-unreachable:  0.08716159869015269 %
Max(PercRevUnreachable) in a function:  87.58620689655172 %
# > 25 % samples:  471 ( 0.1395104988314885 % samples )
... in 145 ( 0.27733680163724345 % functions )
```

Most of the reverse-unreachable regions come from invalid IR where it wouldn't be possible to construct a PostDomTree anyway.

I would like to commit this patch in the next week in order to be able to complete the work that depends on it before the end of my internship, so please don't wait long to voice your concerns :).

Reviewers: dberlin, sanjoy, grosser, brzycki, davide, chandlerc, hfinkel

Reviewed By: dberlin

Subscribers: nhaehnle, javed.absar, kparzysz, uabelho, jlebar, hiraditya, llvm-commits, dberlin, david2050

Differential Revision: https://reviews.llvm.org/D35851

llvm-svn: 310940
2017-08-15 18:14:57 +00:00
Tom Stellard 590a974e10 test-release.sh: Move test-suite setup to beginning of the script
Summary:
We want to catch failures early before do the full 3 stage build.

The goal here is to avoid running through the whole build process and have
it fail at the end (and not create the binary packages), just because
some prerequisites failed to install.

Reviewers: rovka, hans

Reviewed By: hans

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D36422

llvm-svn: 310939
2017-08-15 18:11:56 +00:00
Lang Hames 359983bdd3 [ORC] Add case statements for AArch64 to the local stub and callback manager
creation functions.

This should allow lli to lazily execute code using OrcLazyJIT on AArch64.

llvm-svn: 310938
2017-08-15 18:10:19 +00:00
Sanjay Patel d52997a296 [InstCombine] add tests for sext+ashr; NFC
llvm-svn: 310935
2017-08-15 17:41:31 +00:00
Rui Ueyama 4a17955030 Fix -Wunused-lambda-capture for Release build.
`I` and `this` are used only in assert or DEBUG, so they are unused
in Release build.

llvm-svn: 310934
2017-08-15 17:39:35 +00:00
George Rimar e5269439cd [llvm-dwarfdump] - Attemp to fix BB after r310915.
Now MIPS one is unhappy:
http://lab.llvm.org:8011/builders/llvm-mips-linux/builds/2221

llvm-svn: 310928
2017-08-15 16:42:21 +00:00
Steven Wu 86a511e836 [Doc] Update LangRef for new Module Flag Behavior
Summary:
Add the documentation for the new module flag behavior. The new
ModFlagBehavior is added in r303590.

Reviewers: tejohnson

Reviewed By: tejohnson

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D36557

llvm-svn: 310926
2017-08-15 16:16:33 +00:00
George Rimar e1c30f74f7 [llvm-dwarfdump] - Refactor section name/uniqueness gathering.
As was requested in D36313 thread,

with this patch section names and uniqueness calculated once,
and not every time when a range is dumped.

Differential revision: https://reviews.llvm.org/D36740

llvm-svn: 310923
2017-08-15 15:54:43 +00:00
Daniel Sanders eb2f5f3256 Revert r310919 - [globalisel][tablegen] Support zero-instruction emission.
As expected, this failed on the windows bots but the instrumentation showed
something interesting. The ADD8ri and INC8r rules are never directly compared
on the windows machines. That implies that the issue lies in transitivity of
the Compare predicate. I believe I've already verified that but maybe I missed
something.

llvm-svn: 310922
2017-08-15 15:10:31 +00:00
Daniel Sanders 16e6dd3cd6 Re-commit with some instrumentation: [globalisel][tablegen] Support zero-instruction emission.
Summary:
Support the case where an operand of a pattern is also the whole of the
result pattern. In this case the original result and all its uses must be
replaced by the operand. However, register class restrictions can require
a COPY. This patch handles both cases by always emitting the copy and
leaving it for the register allocator to optimize.

The previous commit failed on the windows bots and this one is likely to fail
on those same bots. However, the added instrumentation should reveal a particular
isHigherPriorityThan() evaluation which I'm expecting to expose that
these machines are weighing priority of two rules differently from the
non-windows machines.

Reviewers: ab, t.p.northover, qcolombet, rovka, aditya_nandakumar

Subscribers: javed.absar, kristof.beyls, igorb, llvm-commits

Differential Revision: https://reviews.llvm.org/D36084

llvm-svn: 310919
2017-08-15 13:50:09 +00:00
George Rimar 36f4d8044b [DebugInfo] - Attemp to fix BB after r310915.
Not sure what BB does not like.

While building module 'LLVM_DebugInfo_DWARF' imported from /home/buildbot/modules-slave-2/clang-x86_64-linux-selfhost-modules-2/llvm.src/lib/DebugInfo/DWARF/DWARFAbbreviationDeclaration.cpp:10:
In file included from <module-includes>:7:
In file included from /home/buildbot/modules-slave-2/clang-x86_64-linux-selfhost-modules-2/llvm.src/include/llvm/DebugInfo/DWARF/DWARFContext.h:29:
/home/buildbot/modules-slave-2/clang-x86_64-linux-selfhost-modules-2/llvm.src/include/llvm/DebugInfo/DWARF/DWARFObject.h:30:17: error: declaration of 'object' must be imported from module 'LLVM_Object.Decompressor' before it is required
  virtual const object::ObjectFile *getFile() const { return nullptr; }
                ^
/home/buildbot/modules-slave-2/clang-x86_64-linux-selfhost-modules-2/llvm.src/include/llvm/Object/Decompressor.h:18:11: note: previous declaration is here
namespace object {

http://lab.llvm.org:8011/builders/clang-x86_64-linux-selfhost-modules-2/builds/10766

llvm-svn: 310918
2017-08-15 13:26:12 +00:00
Alex Bradbury 2fee9ead7e [RISCV] Add RISCVInstPrinter and basic MC assembler tests
With the addition of RISCVInstPrinter, it is now possible to test the basic 
operation of the RISCV MC layer.

Differential Revision: https://reviews.llvm.org/D23564

llvm-svn: 310917
2017-08-15 13:08:29 +00:00
George Rimar 6957ab5b7b [llvm-dwarfdump] - Print section name and index when dumping .debug_info ranges
Teaches llvm-dwarfdump to print section index and name of range
when it dumps .debug_info.

Differential revision: https://reviews.llvm.org/D36313

llvm-svn: 310915
2017-08-15 12:32:54 +00:00
Alex Bradbury 67820e015f [RISCV] Recognize new relocation types
This patch adds all RISC-V relocation types, as of binutils 2.29. Note that 
R_RISCV32_PCREL is not currently documented in the RISC-V ELF PSABI.

Differential Revision: https://reviews.llvm.org/D36455

Patch by Chih-Mao Chen (@PkmX)

llvm-svn: 310914
2017-08-15 12:11:10 +00:00
Ayal Zaks 25e2800e20 [LV] Minor savings to Sink casts to unravel first order recurrence
Two minor savings: avoid copying the SinkAfter map and avoid moving a cast if it
is not needed.

Differential Revision: https://reviews.llvm.org/D36408

llvm-svn: 310910
2017-08-15 08:32:59 +00:00
Frederich Munch 7a3da86823 Propagate error in LazyEmittingLayer::removeModule.
Summary:
Besides being the better thing to do, not doing so will triggers an assert with LLVM_ENABLE_ABI_BREAKING_CHECKS.

Reviewers: lhames

Reviewed By: lhames

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D36700

llvm-svn: 310906
2017-08-15 02:25:36 +00:00
Dinar Temirbulatov 9e43d6e7b2 [SLPVectorizer] Replace VL[0] to VL0 with assert, add propagateIRFlags extra parameter VL0,
replace E->Scalars[0] to VL0, NFCI.

llvm-svn: 310904
2017-08-15 00:31:49 +00:00
Petr Hosek ec20fd7731 [CMake] Add install target for LLVMFuzzer
This allows including LLVMFuzzer as distribution component.

Differential Revision: https://reviews.llvm.org/D36540

llvm-svn: 310897
2017-08-14 23:37:31 +00:00
Dehao Chen 45847d3612 Add missing dependency in ICP. (NFC)
llvm-svn: 310896
2017-08-14 23:25:21 +00:00
Jessica Paquette 95c1107f4c [MachineOutliner] Only outline candidates of length >= 2
Since we don't factor in instruction lengths into outlining calculations
right now, it's never the case that a candidate could have length < 2.

Thus, we should quit early when we see such candidates.

llvm-svn: 310894
2017-08-14 22:57:41 +00:00
Craig Topper b1e4b1a070 [InstSimplify] Teach decomposeBitTestICmp to handle non-canonical compares
This adds support non-canonical compare predicates. InstSimplify can't rely on canonicalization to have occurred.

Differential Revision: https://reviews.llvm.org/D36646

llvm-svn: 310893
2017-08-14 22:11:43 +00:00
Reid Kleckner 18728822d2 Remove checks for debug info intrinsics in use lists, NFC
These haven't done anything since debug info intrinsics stopped
appearing in Value use lists in 2014.

llvm-svn: 310892
2017-08-14 22:10:54 +00:00
John Baldwin 1255b165bf [MIPS] Implement support for -mstack-alignment.
Summary:
This is modeled on the implementation for x86 which stores the command line
option in a 'StackAlignOverride' field in MipsSubtarget and then uses this
to compute a 'stackAlignment' value in
MipsSubtarget::initializeSubtargetDependencies.

The stackAlignment() method in MipsSubTarget is renamed to getStackAlignment()
and returns the computed 'stackAlignment'.

Reviewers: sdardis

Reviewed By: sdardis

Subscribers: llvm-commits, arichardson

Differential Revision: https://reviews.llvm.org/D35874

llvm-svn: 310891
2017-08-14 21:49:38 +00:00
Craig Topper 0aa3a19512 Recommit r310869, "[InstSimplify][InstCombine] Modify the interface of decomposeBitTestICmp and use it in the InstSimplify"
This recommits r310869, with the moved files and no extra changes.

Original commit message:

This addresses a fixme in InstSimplify about using decomposeBitTest. This also fixes InstSimplify to handle ugt and ult compares too.

I've modified the interface a little to return only the APInt version of the mask that InstSimplify needs. InstCombine now has a small wrapper routine to create a Constant out of it. I've also dropped the returning of 0 since InstSimplify doesn't need that. So InstCombine creates a zero constant itself.

I also had to make decomposeBitTest support vectors since InstSimplify needs that.

As InstSimplify can't use something from the Transforms library, I've moved the CmpInstAnalysis code to the Analysis library.

Differential Revision: https://reviews.llvm.org/D36593

llvm-svn: 310889
2017-08-14 21:39:51 +00:00
Chandler Carruth bba762a13f [InlineCost] Refactor the checks for different analyses to be a bit more
localized to the code that uses those analyses.

Technically, this can change behavior as we no longer require the
existence of the ProfileSummaryInfo analysis to use local profile
information via BFI. We didn't actually require the PSI to have an
interesting profile though, so this only really impacts the behavior in
non-default pass pipelines.

IMO, this makes it substantially less surprising how everything works --
before an analysis that wasn't actually used had to exist to trigger
*any* profile aware inlining. I think the new organization makes it more
obvious where various checks for profile signals happen.

Differential Revision: https://reviews.llvm.org/D36710

llvm-svn: 310888
2017-08-14 21:25:00 +00:00
Andrew Kaylor 53a5fbb45f Add strictfp attribute to prevent unwanted optimizations of libm calls
Differential Revision: https://reviews.llvm.org/D34163

llvm-svn: 310885
2017-08-14 21:15:13 +00:00
Kostya Serebryany e3cb3c519f [libFuzzer] try to use less RAM while processing the initial corpus
llvm-svn: 310881
2017-08-14 20:34:35 +00:00
Kostya Serebryany 47cb4856d4 [libFuzzer] explicitly use -fsanitize-coverage=trace-pc-guard in test/dump_coverage.test; mark print_coverage/dump_coverage as To-be-deprecated
llvm-svn: 310877
2017-08-14 19:55:23 +00:00
Matt Arsenault 81da0d45f8 IPRA: Allow target to enable IPRA by default
llvm-svn: 310876
2017-08-14 19:54:47 +00:00
Matt Arsenault f9273c81d6 IPRA: Run RegUsageInfoPropagate much later
This was running immediately after isel, before
isel pseudos were even expanded which is really
unreasonable. Move this to before pre-reglloc
passes in case some other pre-regalloc pass wants to
use the updated regmask info.

Fixes one of the reasons IPRA doesn't do anything on
AMDGPU currently. Tests will be included with future
patch after a few more are fixed.

llvm-svn: 310875
2017-08-14 19:54:45 +00:00
Craig Topper 69fa8e0d99 Revert r310869 "[InstSimplify][InstCombine] Modify the interface of decomposeBitTestICmp and use it in the InstSimplify"
Failed to add the two files that moved. And then added an extra change I didn't mean to while trying to fix that. Reverting everything.

llvm-svn: 310873
2017-08-14 19:09:32 +00:00
Craig Topper 9c7b881677 Revert r310870 "[InstCombine][InstSimplify] 'git add' two files that moved in r310869."
An extra change crept in here.

llvm-svn: 310872
2017-08-14 19:09:28 +00:00
Craig Topper 914c836842 [InstCombine][InstSimplify] 'git add' two files that moved in r310869.
llvm-svn: 310870
2017-08-14 19:01:32 +00:00
Craig Topper 2f0b450666 [InstSimplify][InstCombine] Modify the interface of decomposeBitTestICmp and use it in the InstSimplify
This addresses a fixme in InstSimplify about using decomposeBitTest. This also fixes InstSimplify to handle ugt and ult compares too.

I've modified the interface a little to return only the APInt version of the mask that InstSimplify needs. InstCombine now has a small wrapper routine to create a Constant out of it. I've also dropped the returning of 0 since InstSimplify doesn't need that. So InstCombine creates a zero constant itself.

I also had to make decomposeBitTest support vectors since InstSimplify needs that.

As InstSimplify can't use something from the Transforms library, I've moved the CmpInstAnalysis code to the Analysis library.

Differential Revision: https://reviews.llvm.org/D36593

llvm-svn: 310869
2017-08-14 18:49:42 +00:00
Craig Topper 58def1e1f2 [InstSimplify] Add some tests cases for selects with bittests hidden in ugt/ult/uge/ule compares. NFC
llvm-svn: 310868
2017-08-14 18:49:39 +00:00
Lei Huang 451ef4adcd [PowerPC] Add codegen for VSX word extract convert to FP
Add codegen for VSX word extract conversion from signed/unsigned to single/double
precision.

For UINT_TO_FP:
Extract word unsigned and convert to float was implemented in https://reviews.llvm.org/D20239.
Here we will add the missing extract integer and conversion to double. This
utilizes the new P9 instruction xxextractuw to extracting an integer element
when the result will be converted to double thereby saving 2 direct moves
(VSR <-> GPR).

For SINT_TO_FP:
We will implement the following sequence which will also reduce the number of
instructions by saving 2 direct moves.

v4i32->f32:
        xxspltw
        xvcvsxwsp
        xscvspdpn

v4i32->f64:
        xxspltw
        xvcvsxwdp

Differential Revision: https://reviews.llvm.org/D35859

llvm-svn: 310866
2017-08-14 18:09:29 +00:00
Aditya Nandakumar 86021a2345 [GISel]: Add some helper constructors to MIRBuilder
https://reviews.llvm.org/D36636

llvm-svn: 310860
2017-08-14 17:25:11 +00:00
Hal Finkel b03dd4be70 [ValueTracking] Don't delete assumes of side-effectful instructions
ValueTracking has to strike a balance when attempting to propagate information
backwards from assumes, because if the information is trivially propagated
backwards, it can appear to LLVM that the assumption is known to be true, and
therefore can be removed.

This is sound (because an assumption has no semantic effect except for causing
UB), but prevents the assume from allowing further optimizations.

The isEphemeralValueOf check exists to try and prevent this issue by not
removing the source of an assumption. This tries to make it a little bit more
general to handle the case of side-effectful instructions, such as in

  %0 = call i1 @get_val()
  %1 = xor i1 %0, true
  call void @llvm.assume(i1 %1)

Patch by Ariel Ben-Yehuda, thanks!

Differential Revision: https://reviews.llvm.org/D36590

llvm-svn: 310859
2017-08-14 17:11:43 +00:00
Simon Dardis c3f6b2806f Revert "Reland "[mips][mt][6/7] Add support for mftr, mttr instructions.""
This reverts r310834. It didn't pacify the buildbot, FileCheck is still
crashing.

llvm-svn: 310854
2017-08-14 16:20:33 +00:00
Sanjay Patel 92653865e6 [x86] fold the mask op on 8- and 16-bit rotates
Ref the post-commit thread for r310770:
http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20170807/478507.html

The motivating cases as 'C' source examples can look like this:

unsigned char rotate_right_8(unsigned char v, int shift) {
  // shift &= 7;
  v = ( v >> shift ) | ( v << ( 8 - shift ) );
  return v;
}

https://godbolt.org/g/K6rc1A

Notice that the source doesn't contain UB-safe masked shift amounts, but instcombine created those 
in order to produce narrow rotate patterns. This should be the last step needed to resolve PR34046:
https://bugs.llvm.org/show_bug.cgi?id=34046

Differential Revision: https://reviews.llvm.org/D36644

llvm-svn: 310849
2017-08-14 15:55:43 +00:00
Dinar Temirbulatov 7b78f5e52d [SLPVectorizer] Schedule bundle with different opcodes.
This change let us schedule a bundle with different opcodes in it, for example : [ load, add, add, add ]

Reviewers: mkuper, RKSimon, ABataev, mzolotukhin, spatel, filcab

Subscribers: llvm-commits, rengolin

Differential Revision: https://reviews.llvm.org/D36518

llvm-svn: 310847
2017-08-14 15:40:16 +00:00
Craig Topper 411f29de69 [X86] Fix a place that was mishandling X86ISD::UMUL.
According to the X86ISelLowering.h, UMUL results are low, high, and flags. But this place was treating result 1 or 2 as flags.

Differential Revision: https://reviews.llvm.org/D36654

llvm-svn: 310846
2017-08-14 15:32:40 +00:00
Craig Topper c0471829b1 [X86] Remove flag setting ISD nodes from computeKnownBitsForTargetNode
Summary:
The flag result is an i32 type. But its only really used for connectivity. I don't think anything even assumes a particular format. We don't ever do any real operations on it. So known bits don't help us optimize anything.

My main motivation is that the UMUL behavior is actually wrong. I was going to fix this in D36654, but then realized there was just no reason for it to be here.

Reviewers: RKSimon, zvi, spatel

Reviewed By: RKSimon

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D36657

llvm-svn: 310845
2017-08-14 15:28:49 +00:00
Craig Topper b9e3e111ad [AVX512] Make the itinerary parameter actually pass through the the AVX512_maskable_common multiclass
Summary: This looks to have been disconnected about 3 years ago in r219358.

Reviewers: gadi.haber, RKSimon, zvi

Reviewed By: gadi.haber

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D36658

llvm-svn: 310844
2017-08-14 15:28:48 +00:00
Craig Topper 2374de420b [AVX512] Remove leftover code for when i1 was a legal type from the fast isel load/store code.
Summary:
I don't think we need this code anymore. It only existed because i1 used to be legal.

There's probably more unneeded code in fast isel still.

Reviewers: guyblank, zvi

Reviewed By: guyblank

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D36652

llvm-svn: 310843
2017-08-14 15:28:47 +00:00
Sanjay Patel a1067d9bae [BDCE] reduce scope of an assert (PR34179)
The assert was added with r310779 and is usually correct,
but as the test shows, not always. The 'volatile' on the
load is needed to expose the faulty path because without
it, DemandedBits would return that the load is just dead
rather than not demanded, and so we wouldn't hit the
bogus assert.

Also, since the lambda is just a single-line now, get rid 
of it and inline the DB.isAllOnesValue() calls. 

This should fix (prevent execution of a faulty assert):
https://bugs.llvm.org/show_bug.cgi?id=34179

llvm-svn: 310842
2017-08-14 15:13:46 +00:00
Simon Dardis cbf55deaa1 Reland "[mips][mt][6/7] Add support for mftr, mttr instructions."
This adjusts the tests to hopfully pacify the llvm-clang-x86_64-expensive-checks-win
buildbot.

Unlike many other instructions, these instructions have aliases which
take coprocessor registers, gpr register, accumulator (and dsp accumulator)
registers, floating point registers, floating point control registers and
coprocessor 2 data and control operands.

For the moment, these aliases are treated as pseudo instructions which are
expanded into the underlying instruction. As a result, disassembling these
instructions shows the underlying instruction and not the alias.

Reviewers: slthakur, atanasyan

Differential Revision: https://reviews.llvm.org/D35253

llvm-svn: 310834
2017-08-14 12:28:00 +00:00
Amaury Sechet 9c529b6be3 [DAGCombine] Do not try to deduplicate commutative operations if both operand are the same.
Summary: It is creating useless work as the commuted nodes is the same as the node we are working on in that case.

Reviewers: jyknight, nemanjai, mkuper, spatel, RKSimon, zvi, bkramer

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D33840

llvm-svn: 310832
2017-08-14 11:44:03 +00:00
Elad Cohen 6a9edda356 [SelectionDAG] combine vextract (v1iX extract_subvector(vNiX, Idx))
into vextract(vNiX,Idx) when creating vextract with getNode().
This case appeared in AVX512 after fixing pr33349 in r310552.

Differential revision: https://reviews.llvm.org/D36571

llvm-svn: 310828
2017-08-14 10:49:45 +00:00
Sean Eveson 9edfeac9ea [llvm-cov] Add an option which maps the location of source directories on another machine to your local copies
Summary:
This patch adds the -path-equivalence option (example: llvm-cov show -path-equivalence=/origin/path,/local/path) which maps the source code path from one machine to another when using `llvm-cov show`. This is similar to the -filename-equivalence option, but doesn't require you to specify all the source files on the command line.

This allows you to generate the coverage data on one machine (e.g. in a CI system), and then use llvm-cov on another machine where you have the same code base on a different path.

Reviewers: vsk

Reviewed By: vsk

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D36391

llvm-svn: 310827
2017-08-14 10:20:12 +00:00
Balaram Makam d9f53414de MachineInstr: Reason locally about some memory objects before going to AA.
This addresses a FIXME in MachineInstr::mayAlias.

llvm-svn: 310825
2017-08-14 09:41:40 +00:00
Sam Parker 718c8a6a2a [LoopUnroll] Enable option to peel remainder loop
On some targets, the penalty of executing runtime unrolling checks
and then not the unrolled loop can be significantly detrimental to
performance. This results in the need to be more conservative with
the unroll count, keeping a trip count of 2 reduces the overhead as
well as increasing the chance of the unrolled body being executed. But
being conservative leaves performance gains on the table.

This patch enables the unrolling of the remainder loop introduced by
runtime unrolling. This can help reduce the overhead of misunrolled
loops because the cost of non-taken branches is much less than the
cost of the backedge that would normally be executed in the remainder
loop. This allows larger unroll factors to be used without suffering
performance loses with smaller iteration counts.

Differential Revision: https://reviews.llvm.org/D36309

llvm-svn: 310824
2017-08-14 09:25:26 +00:00
Sam Parker 647cce82a3 [AArch64] Remove unused MC function
An unused function warning was raised in
https://bugs.llvm.org/show_bug.cgi?id=34178.

The offending function, in AArch64MCCodeEmitter.cpp, was committed by
me last week.

Differential Revision: https://reviews.llvm.org/D36665

llvm-svn: 310823
2017-08-14 09:16:13 +00:00
Elad Cohen 3a90a0c10d Revert "[DAGCombiner] Extending pattern detection for vector shuffle (REAPPLIED)"
This reverts commit r310782.

llvm-svn: 310822
2017-08-14 09:06:00 +00:00
Chandler Carruth 37c7b08710 [ValueTracking] Revert r310583 which enabled functionality that still is
causing compile time issues.

Moreover, the patch *deleted* the flag in addition to changing the
default, and links to a code review that doesn't even discuss the flag
and just has an update to a Clang test case.

I've followed up on the commit thread to ask for numbers on compile time
at this point, leaving the flag in place until things stabilize, and
pointing at specific code that seems to exhibit excessive compile time
with this patch.

Original commit message for r310583:
"""
[ValueTracking] Enabling ValueTracking patch by default (recommit). Part 2.

The original patch was an improvement to IR ValueTracking on
non-negative integers. It has been checked in to trunk (D18777,
r284022). But was disabled by default due to performance regressions.
Perf impact has improved. The patch would be enabled by default.
""""

llvm-svn: 310816
2017-08-14 07:03:24 +00:00
Craig Topper 508aa97b25 [AVX-512] Add hasSideEffects = 0 to the 8-bit and 16-bit register broadcasts.
llvm-svn: 310813
2017-08-14 05:09:34 +00:00
Craig Topper ca98bb9c94 [X86] Remove unused argument from the vextract_for_size multiclass. NFC
llvm-svn: 310812
2017-08-14 05:09:33 +00:00
Craig Topper 2c8cb2fa87 [AVX512] Remove comment I should have removed in r310808. NFC
llvm-svn: 310811
2017-08-14 05:09:31 +00:00
Brian Gesiak 60a3185940 [opt-viewer] Listify `dict_items` for Py3 indexing
Summary:
In Python 2, calling `dict.items()` returns an indexable `list`, whereas
on Python 3 it returns a set-like `dict_items` object, which cannot be
indexed. Explicitly onvert the `dict_items` object so that it can be
indexed when using Python 3.

In combination with D36622, D36623, and D36624, this change allows
`opt-viewer.py` to exit successfully when run with Python 3.4.

Test Plan:
Run `opt-viewer.py` using Python 3.4 and confirm it does not encounter a
runtime error when when indexing into `dict.items()`.

Reviewers: anemet

Reviewed By: anemet

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D36630

llvm-svn: 310810
2017-08-14 04:16:43 +00:00
Chandler Carruth fe6b509f83 [PowerPC] Revert r310346 (and followups r310356 & r310424) which
introduce a miscompile bug.

There appears to be a bug where the generated code to extract the sign
bit doesn't work correctly for 32-bit inputs. I've replied to the
original commit pointing out the problem. I think I see by inspection
(and reading the manual for PPC) how to fix this, but I can't be 100%
confident and I also don't know what the best way to test this is.
Currently it seems nearly impossible to get the backend to hit this code
path, but the patch autohr is likely in a better position to craft such
test cases than I am, and based on where the bug is it should be easily
done.

Original commit message for r310346:
"""
[PowerPC] Eliminate compares - add i32 sext/zext handling for SETLE/SETGE

Adds handling for SETLE/SETGE comparisons on i32 values. Furthermore, it
adds the handling for the special case where RHS == 0.

Differential Revision: https://reviews.llvm.org/D34048
"""

llvm-svn: 310809
2017-08-14 03:41:00 +00:00
Craig Topper aadec7078e [AVX512] Simplify the instruction defintion for VEXTRACT. NFCI
The comment about why we couldn't use avx512_maskable appears to have been incorrect.

llvm-svn: 310808
2017-08-14 01:53:10 +00:00
Javed Absar 37b2286564 [ARM] Tidy-up Cortex-A15 DPR-SPR optimizer implementation
Modernise the code with range-loops etc

Reviewed by: @fhahn, @rovka
Differential Revision: https://reviews.llvm.org/D36502

llvm-svn: 310807
2017-08-14 01:38:01 +00:00
Craig Topper f720099007 [InstCombine] Simplify and inline FoldOrWithConstants/FoldXorWithConstants
Summary:
These functions were overly complicated. The body of this function was rechecking for an And operation to find the constant, but we already knew we were looking at two Ands ORed together and the pieces are in variables. We already had earlier nearby code that checked for ConstantInts. So just inline the remaining parts into the earlier code.

Next step is to use m_APInt instead of ConstantInt.

Reviewers: spatel, efriedma, davide, majnemer

Reviewed By: spatel

Subscribers: zzheng, llvm-commits

Differential Revision: https://reviews.llvm.org/D36439

llvm-svn: 310806
2017-08-14 00:04:21 +00:00
Simon Pilgrim 8d9e6e607a [X86][BMI] Add BEXTR demanded bits test cases (PR34042)
llvm-svn: 310802
2017-08-13 20:35:38 +00:00
Craig Topper 5b59176abb [X86] Fix typo from r310794. Index = 0 should have been Index == 0.
llvm-svn: 310801
2017-08-13 20:21:12 +00:00
Craig Topper 2dfc7889fd [X86] Remove unused pattern fragment that referenced MVT::i1. NFC
llvm-svn: 310799
2017-08-13 20:04:05 +00:00
Martin Storsjo 2341319564 [COFF, ARM64] Use '//' as comment character in assembly files in GNU environments
This allows using semicolons for bundling up more than one
statement per line. This is used within the mingw-w64 project in some
assembly files that contain code for multiple architectures.

Differential Revision: https://reviews.llvm.org/D36366

llvm-svn: 310797
2017-08-13 19:42:05 +00:00
Alex Bradbury f1af56c26a Remove RISCV from LLVM_ALL_TARGETS in CMakeLists.txt
It was mistakenly added to that list in D23560 (committed in rL285712). RISCV 
is an experimental backend and should never have been in that list, I 
mistakenly interpreted LLVM_ALL_TARGETS as a list of all targets rather than 
targets to build by default. Unfortunately, because of this the RISCV backend 
has been building by default when it shouldn't be.

This commet adds a description comment, which should help to avoid such 
mistakes in the future.

See my message to llvm-dev for more information and analysis 
<http://lists.llvm.org/pipermail/llvm-dev/2017-August/116347.html>.

Differential Revision: https://reviews.llvm.org/D36538

llvm-svn: 310796
2017-08-13 18:49:33 +00:00
Craig Topper 43e3b788f4 [AVX512] Correct isExtractSubvectorCheap so that it will return the correct answers for extracting 128-bits from a 512-bit vector and for mask registers.
Previously it would not return true for extracting either of the upper quarters of a 512-bit registers.

For mask registers we support extracting anything from index 0. And otherwise we only support extracting the upper half of a register.

Differential Revision: https://reviews.llvm.org/D36638

llvm-svn: 310794
2017-08-13 17:40:02 +00:00
Craig Topper 2251ef95a3 [X86][ARM][TargetLowering] Add SrcVT to isExtractSubvectorCheap
Summary:
Without the SrcVT its hard to know what is really being asked for. For example if your target has 128, 256, and 512 bit vectors. Maybe extracting 128 from 256 is cheap, but maybe extracting 128 from 512 is not.

For x86 we do support extracting a quarter of a 512-bit register. But for i1 vectors we don't have isel patterns for extracting arbitrary pieces. So we need this to have a correct implementation of isExtractSubvectorCheap for mask vectors.

Reviewers: RKSimon, zvi, efriedma

Reviewed By: RKSimon

Subscribers: aemerson, javed.absar, kristof.beyls, llvm-commits

Differential Revision: https://reviews.llvm.org/D36649

llvm-svn: 310793
2017-08-13 17:29:07 +00:00
Gadi Haber bed2c50607 [X86][SandyBridge] Additional updates to the SNB instructions scheduling information
This is a continuation patch for commit r307529 which completely replaces the scheduling information for the SandyBridge architecture target by modifying the file X86SchedSandyBridge.td located under the X86 Target (see also https://reviews.llvm.org/D35019).

In this patch we added the scheduling information of additional SNB instructions that were missing from the patch commit r307529, fixed the scheduling of several resource groups that include only port0 instead of port05 (i.e., port0 OR port5) and fixed several incorrect instructions' scheduling in the r307529 commit.

The patch also includes the X87 instructions which were missing in previous patch commit r307529 as reported in bugzilla bug 34080.

Reviewers: zvi, RKSimon, chandlerc, igorb, m_zuckerman, craig.topper, aymanmus, dim

Differential Revision: https://reviews.llvm.org/D36388

llvm-svn: 310792
2017-08-13 13:59:24 +00:00
Simon Pilgrim 631991fdde [X86][AVX512] Added additional shuffle+trunc test case.
An existing test should have covered this but a typo caused it to fail. I've kept both as the codegen for the typo case needs addressing as well. 

llvm-svn: 310791
2017-08-13 12:30:36 +00:00
Simon Pilgrim ad1c5566a7 [X86][TBM] Add tests showing failure to fold RFLAGS result into TBM instructions.
And fails to select TBM instructions at all.

llvm-svn: 310790
2017-08-13 12:16:00 +00:00
Coby Tayree 799fa2c76e [X86][AsmParser][AVX512] Error appropriately when K0 is tried as a write-mask
K0 isn't expected as a write-mask, so provide a detailed error here, instead of the more generic one (invalid op for insn)
Conforms with gas

Differential Revision: https://reviews.llvm.org/D36570

llvm-svn: 310789
2017-08-13 12:03:00 +00:00
Simon Pilgrim 808ce12878 [X86][TBM] Regenerate bextri intrinsics tests. NFCI.
llvm-svn: 310788
2017-08-13 11:56:15 +00:00
Guy Blank de425ae753 [X86][AVX512] Add combine for TESTM
Add an X86 combine for TESTM when one of the operands is a BUILD_VECTOR(0,0,...).

TESTM op0, BUILD_VECTOR(0,0,...) -> BUILD_VECTOR(0,0,...)
TESTM BUILD_VECTOR(0,0,...), op1 -> BUILD_VECTOR(0,0,...)

Differential Revision:
https://reviews.llvm.org/D36536

llvm-svn: 310787
2017-08-13 08:03:37 +00:00
Craig Topper 77dd140786 [X86] Early out of combineInsertSubvector for mask vectors.
The combines here shouldn't be done for mask vectors, but it wasn't clear anything was preventing that.

llvm-svn: 310786
2017-08-12 22:33:58 +00:00
Craig Topper dbca6d47f3 [X86] Fix bad comment. NFC
llvm-svn: 310785
2017-08-12 22:33:57 +00:00
Craig Topper 44cb1ffb6a [X86] When handling addcarry intrinsic, create the flag result with the correct type so we don't crash if we use a memory instruction
Summary:
Previously we were creating the flag result with MVT::Other which is interpretted as a Chain node. If we used a memory form of the instruction we would end up with a copyToReg that consumed the chain result of the adcx instruction instead of the flag result.

Pretty sure we should be using MVT::i32 here, that's what we do other places we create these node types.

We should probably consider this for 5.0 as well.

Reviewers: RKSimon, zvi, spatel

Reviewed By: RKSimon

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D36645

llvm-svn: 310784
2017-08-12 20:19:44 +00:00