Commit Graph

171015 Commits

Author SHA1 Message Date
Simon Pilgrim 7575c6d01b [X86] Use existing pulled out VT variables. NFCI.
llvm-svn: 345388
2018-10-26 14:39:28 +00:00
Max Kazantsev 619a83463f [SimpleLoopUnswitch] Unswitch by experimental.guard intrinsics
This patch adds support of `llvm.experimental.guard` intrinsics to non-trivial
simple loop unswitching. These intrinsics represent implicit control flow which
has pretty much the same semantics as usual conditional branches. The
algorithm of dealing with them is following:

- Consider guards as unswitching candidates;
- If a guard is considered the best candidate, turn it into a branch;
- Apply normal unswitching algorithm on this branch.

The patch has no compile time effect on code that does not contain any guards.

Differential Revision: https://reviews.llvm.org/D53744
Reviewed By: chandlerc

llvm-svn: 345387
2018-10-26 14:20:11 +00:00
Sjoerd Meijer 56f336e2c9 [ARM] Fix ARMCodeGenPrepare test cases
While working on FileCheck producing better diagnostics in D53710, I noticed
that our test case is broken in a few different ways. The test was running, but
results were not checked as prefix CHECK-COMMON wasn't defined (which is what
FileCheck should warn about). Also, the output was different in 2 cases because
of recent changes in ARMCodeGenPrepare.

Differential Revision: https://reviews.llvm.org/D53746

llvm-svn: 345386
2018-10-26 14:19:57 +00:00
Francis Visoiu Mistrih 08d321c9f9 [CodeGen] Remove out operands from PATCHABLE_OP
The current model requires 1 out operand, but it is not used nor created.

This fixed an x86 machine verifier issue.

Part of PR27481.

llvm-svn: 345384
2018-10-26 13:37:25 +00:00
Owen Reynolds c443e7ef55 [llvm-ar] Access ADDLIB in llvm-ar via command line
ADDLIB is called to add the contents of an archive to another archive. 
Previously this was only accessible through the use of an MRI script.

With the use of a new "L" modifier, archive files can treated in the 
manner above when using quick append.

llvm-svn: 345383
2018-10-26 13:34:38 +00:00
Scott Linder 11ef7984b0 [AMDGPU] Add a pass to promote bitcast calls
AMDGPU currently only supports direct calls, but at lower optimisation levels it
fails to lower statically direct calls which appear indirect due to a bitcast.

Add a pass to visit all CallSites and use CallPromotionUtils to "devirtualize"
calls.

Differential Revision: https://reviews.llvm.org/D52741

llvm-svn: 345382
2018-10-26 13:18:36 +00:00
Simon Pilgrim 11c01f402f Regenerate test
llvm-svn: 345379
2018-10-26 12:33:56 +00:00
Sam McCall 0739ffc161 [llvm-mca] Fix -wreorder and -Wunused-private-field after r345376. NFC
llvm-svn: 345378
2018-10-26 12:19:48 +00:00
George Rimar 088d96b43d [Codegen] - Implement basic .debug_loclists section emission (DWARF5).
.debug_loclists is the DWARF 5 version of the .debug_loc.
With that patch, it will be emitted when DWARF 5 is used.

Differential revision: https://reviews.llvm.org/D53365

llvm-svn: 345377
2018-10-26 11:25:12 +00:00
Andrea Di Biagio 84d0051310 [llvm-mca] Removed dependency on mca::SourcMgr in some Views. NFC
llvm-svn: 345376
2018-10-26 10:48:04 +00:00
Max Kazantsev bde31000b1 [SimpleLoopUnswitch] Make all checks before actual non-trivial unswitch
We should be able to make all relevant checks before we actually start the non-trivial
unswitching, so that we could guarantee that once we have started the transform,
it will always succeed.

Reviewed By: chandlerc
Differential Revision: https://reviews.llvm.org/D53747

llvm-svn: 345375
2018-10-26 09:52:58 +00:00
Fangrui Song 065c3610ad [SystemZ] Fix -Wcovered-switch-default as coding standard regulates
llvm-svn: 345369
2018-10-26 06:59:08 +00:00
Kristina Brooks 820aeb214a [NFC] Add periods to CREDITS.txt (testing git-llvm)
NFC commit to test git-llvm bridge for current GitHub monorepo.

llvm-svn: 345368
2018-10-26 06:57:02 +00:00
Fangrui Song 50aaaffedd [llvm-nm] Simplify. NFC
Change a \t to spaces
Change some zero-filling memcpy to aggregate initialization
Delete redundant ArchiveName.clear() after declaration

llvm-svn: 345367
2018-10-26 06:56:51 +00:00
Li Jia He f6fb752fe8 [PowerPC] Fix some missed optimization opportunities in combineSetCC
For both operands are bool, short, int, long, long long, add the following optimization.
1. 0-x == y --> x+y ==0
2. 0-x != y --> x+y != 0

Review: nemanjai
Differential Revision: https://reviews.llvm.org/D53360

llvm-svn: 345366
2018-10-26 06:48:53 +00:00
Li Jia He 9521467318 [PowerPC][NFC] Add tests for some missed optimization opportunities in combineSetCC
For both operands are bool, short, int, long, long long, add the following optimization test case.
1. 0-x == y --> x+y ==0
2. 0-x != y --> x+y != 0

Review: nemanjai
Differential Revision: https://reviews.llvm.org/D53358

llvm-svn: 345365
2018-10-26 05:02:10 +00:00
Li Jia He 15e6b10fa9 This reverts commit r345357, It is wrong to create a new directory and put the test file into it. I am sorry for this.
llvm-svn: 345364
2018-10-26 04:54:56 +00:00
Nemanja Ivanovic fce57f586d [NFC] Fix the regular expression for BE PPC in update_llc_test_checks.py
Currently, the regular expression that matches the lines of assembly for PPC LE
(ELFv2) does not work for the assembly for BE (ELFv1). This patch fixes it.

Differential revision: https://reviews.llvm.org/D53059

llvm-svn: 345363
2018-10-26 03:30:28 +00:00
Nemanja Ivanovic 6a74bfba20 [PowerPC] Keep vector int to fp conversions in vector domain
At present a v2i16 -> v2f64 convert is implemented by extracts to scalar,
scalar converts, and merge back into a vector. Use vector converts instead,
with the int data permuted into the proper position and extended if necessary.

Patch by RolandF.

Differential revision: https://reviews.llvm.org/D53346

llvm-svn: 345361
2018-10-26 03:19:13 +00:00
Fangrui Song 5300c2e0ea [Pipeliner] Mark swp-art-deps-rec.ll as REQUIRES: asserts after rL345319
llvm-svn: 345359
2018-10-26 03:15:56 +00:00
Fangrui Song 61ea8dae2e Add dependency from SystemZAsmParser to SystemZAsmPrinter after rL345349
This fixes -DBUILD_SHARED_LIBS=on build. The dependency is similar to that of X86's.

llvm-svn: 345358
2018-10-26 03:04:54 +00:00
Li Jia He 1ad356dfb3 [PowerPC][NFC] Add tests for some missed optimization opportunities in combineSetCC
For both operands are bool, short, int, long, long long, add the following optimization test case.
1. 0-x == y --> x+y ==0
2. 0-x != y --> x+y != 0

Review: nemanjai
Differential Revision: https://reviews.llvm.org/D53358

llvm-svn: 345357
2018-10-26 02:34:57 +00:00
Vlad Tsyrklevich 21beeb29ea Revert "[AArch64] Create proper memoperand for multi-vector stores"
This reverts commit r345315, it was causing test failures on
sanitizer-x86_64-linux-fast.

llvm-svn: 345356
2018-10-26 02:00:14 +00:00
Li Jia He 63aca02f08 add myself to the CREDITS.TXT
llvm-svn: 345355
2018-10-26 01:58:23 +00:00
Chijun Sima 32fd196cbf Teach the DominatorTree fallback to recalculation when applying updates to speedup JT (PR37929)
Summary:
This patch makes the dominatortree recalculate when applying updates with the size of the update vector larger than a threshold. Directly applying updates is usually slower than recalculating the whole domtree in this case. This patch fixes an issue which causes JT running slowly on some inputs.

In bug 37929, the dominator tree is trying to apply 19,000+ updates several times, which takes several minutes.

After this patch, the time used by DT.applyUpdates:

| Input | Before (s) | After (s) | Speedup |
| the 2nd Reproducer in 37929 | 297 | 0.15 | 1980x |
| clang-5.0.0.0.bc | 9.7 | 4.3 | 2.26x |
| clang-5.0.0.4.bc | 11.6 | 2.6 | 4.46x |

Reviewers: kuhar, brzycki, trentxintong, davide, dmgreen, grosser

Reviewed By: kuhar, brzycki

Subscribers: kristina, llvm-commits

Differential Revision: https://reviews.llvm.org/D53245

llvm-svn: 345353
2018-10-26 01:28:36 +00:00
Jonas Paulsson dda46307c2 [SystemZ] Implement SystemZOperand::print()
SystemZAsmParser can now handle -debug by printing the operands neatly to the
output stream. Before this patch this lead to an llvm_unreachable().

It seems that now '-mllvm -debug' does not cause any crashes anywhere (at
least not on SPEC).

Review: Ulrich Weigand
https://reviews.llvm.org/D53328

llvm-svn: 345349
2018-10-26 00:36:00 +00:00
Zachary Turner ed2597e909 Dump public symbol records in pdb2yaml mode
llvm-svn: 345348
2018-10-26 00:17:31 +00:00
Jonas Paulsson e2c5cbc164 [SystemZ] Pass the DAG pointer from SystemZAddressingMode::dump().
In order to print the IR slot number for the memory operand, the DAG pointer
must be passed to SDNode::dump().

The isel-debug.ll test updated to also check for the IR Value reference being
printed correctly.

Review: Ulrich Weigand
https://reviews.llvm.org/D53333

llvm-svn: 345347
2018-10-26 00:02:33 +00:00
Heejin Ahn 24faf859e5 Reland "[WebAssembly] LSDA info generation"
Summary:
This adds support for LSDA (exception table) generation for wasm EH.
Wasm EH mostly follows the structure of Itanium-style exception tables,
with one exception: a call site table entry in wasm EH corresponds to
not a call site but a landing pad.

In wasm EH, the VM is responsible for stack unwinding. After an
exception occurs and the stack is unwound, the control flow is
transferred to wasm 'catch' instruction by the VM, after which the
personality function is called from the compiler-generated code. (Refer
to WasmEHPrepare pass for more information on this part.)

This patch:
- Changes wasm.landingpad.index intrinsic to take a token argument, to
make this 1:1 match with a catchpad instruction
- Stores landingpad index info and catch type info MachineFunction in
before instruction selection
- Lowers wasm.lsda intrinsic to an MCSymbol pointing to the start of an
exception table
- Adds WasmException class with overridden methods for table generation
- Adds support for LSDA section in Wasm object writer

Reviewers: dschuff, sbc100, rnk

Subscribers: mgorny, jgravelle-google, sunfish, llvm-commits

Differential Revision: https://reviews.llvm.org/D52748

llvm-svn: 345345
2018-10-25 23:55:10 +00:00
Heejin Ahn 3103d3dcd1 [WebAssembly] Support EH instructions in InstPrinter
Summary: This adds support for exception handling instructions to InstPrinter.

Reviewers: dschuff, aardappel

Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits

Differential Revision: https://reviews.llvm.org/D53634

llvm-svn: 345343
2018-10-25 23:45:48 +00:00
Jonas Paulsson f213f81d9c Fix in MachineOperand::printIRValueReference().
Handle the case where getCurrentFunction() returns nullptr by passing -1 to
printIRSlotNumber(). This will result in <badref> being printed instead of an
assertion failure.

Review: Francis Visoiu Mistrih
https://reviews.llvm.org/D53333

llvm-svn: 345342
2018-10-25 23:39:07 +00:00
Bryan Chan f0923f16f8 [AArch64] Implement FP16FML intrinsics
Add LLVM intrinsics for the ARMv8.2-A FP16FML vector-form instructions. Add a
DAG pattern to define the indexed-form intrinsics in terms of the vector-form
ones, similarly to how the Dot Product intrinsics were implemented.

Based on a patch by Gao Yiling.

Differential Revision: https://reviews.llvm.org/D53632

llvm-svn: 345337
2018-10-25 23:36:41 +00:00
Heejin Ahn 8370a95d0d Delete test case. Assertions can't be tested.
llvm-svn: 345336
2018-10-25 23:35:15 +00:00
Heejin Ahn cc719ba0dd Tidy up test case
llvm-svn: 345335
2018-10-25 23:35:15 +00:00
Heejin Ahn 1d13e6be37 Address comments
- Add llvm-mc test case (and delete the old one)
- Change report_fatal_error to assertions

llvm-svn: 345334
2018-10-25 23:35:14 +00:00
Heejin Ahn 1147d91402 [WebAssembly] Error out when block/loop markers mismatch
Summary:
Currently InstPrinter ignores if there are mismatches between block/loop
and end markers by skipping the case if ControlFlowStack is empty. I
guess it is better to explicitly error out in this case, because this
signals invalid input.

Reviewers: aardappel

Subscribers: dschuff, sbc100, jgravelle-google, sunfish, llvm-commits

Differential Revision: https://reviews.llvm.org/D53620

llvm-svn: 345333
2018-10-25 23:35:13 +00:00
Jonas Paulsson 2b280ea604 [SystemZ] NFC reformatting in SystemZTargetTransformInfo.cpp
Some lines more than 80 characters long reformatted.

llvm-svn: 345331
2018-10-25 22:53:27 +00:00
Jonas Paulsson b7caa809e1 [SystemZ] Improve getMemoryOpCost() to find foldable loads that are converted.
The SystemZ backend can do arithmetic of memory by loading and then extending
one of the operands. Similarly, a load + truncate can be folded into an
operand.

This patch improves the SystemZ TTI cost function to recognize this.

Review: Ulrich Weigand
https://reviews.llvm.org/D52692

llvm-svn: 345327
2018-10-25 22:28:25 +00:00
David Blaikie 73c2f197d2 DebugInfo: Explain why DW_LLE_(GNU_)startx_length is used
This isn't the most object-size efficient encoding, but it's the only
one GDB supports for the pre-standard fission format. I've written fixes
for this twice now... - so perhaps this comment will help me remember
why neither of these have been committed and why I shouldn't try to
write a third fix another year from now...

llvm-svn: 345326
2018-10-25 22:26:25 +00:00
Sanjay Patel c14aafdacc [x86] add tests for missed load folding; NFC
llvm-svn: 345325
2018-10-25 22:23:27 +00:00
Jonas Paulsson 4645711a8d [SystemZ] Improve handling and cost estimates of vector integer div/rem
Enable the DAG optimization that converts vector div/rem with constants into
multiply+shifts sequences by expanding them early. This is needed since
ISD::SMUL_LOHI is 'Custom' lowered on SystemZ, and will therefore not be
available to BuildSDIV after legalization.

Better cost values for these instructions based on how they will be
implemented (a constant divisor is cheaper).

Review: Ulrich Weigand
https://reviews.llvm.org/D53196

llvm-svn: 345321
2018-10-25 21:47:22 +00:00
David Blaikie 2f9c42c994 llvm-dwarfdump: loclists: Don't expect an (albeit empty) expression for LLE_base_address
llvm-svn: 345320
2018-10-25 21:35:59 +00:00
Sumanth Gundapaneni ada0f511ba [Pipeliner] Ignore Artificial dependences while computing recurrences.
The artificial dependencies are not real dependencies. In some cases, they
form circuits with bigger MII. However, they are used to schedule instructions
better.

Differential Revision: https://reviews.llvm.org/D53450

llvm-svn: 345319
2018-10-25 21:27:08 +00:00
Sumanth Gundapaneni dfdbc716e4 [Pipeliner] Remove the unneeded include header(NFC).
Differential Revision: https://reviews.llvm.org/D53451

llvm-svn: 345318
2018-10-25 21:25:30 +00:00
Craig Topper 813064bf4d [X86] Change X86 backend to look for 'min-legal-vector-width' attribute instead of 'required-vector-width' when determining whether 512-bit vectors should be legal.
The required-vector-width attribute was only used for backend testing and has never been generated by clang.

I believe clang is now generating min-legal-vector-width for vector uses in user code.

With this I believe passing -mprefer-vector-width=256 to clang should prevent use of zmm registers in the generated assembly unless the user used a 512-bit intrinsic in their source code.

llvm-svn: 345317
2018-10-25 21:16:06 +00:00
Francis Visoiu Mistrih 5be9e6de89 [CodeGen] Remove operands from FENTRY_CALL
FENTRY_CALL is actually not taking any input / output operands. The
machine verifier complains now because the target description says that:

* It needs 1 unknown output
* It needs 1 or more variable inputs

llvm-svn: 345316
2018-10-25 21:12:15 +00:00
David Greene 53e869da7d [AArch64] Create proper memoperand for multi-vector stores
Include all of the store's source vector operands when creating the
MachineMemOperand. Previously, we were missing the first operand,
making the store size seem smaller than it really is.

Differential Revision: https://reviews.llvm.org/D52816

llvm-svn: 345315
2018-10-25 21:10:39 +00:00
Volkan Keles f28e81f6aa [AArch64][GlobalISel] Simplify a legalizer test. NFC.
llvm-svn: 345307
2018-10-25 20:01:19 +00:00
Thomas Lively 0aad98fd07 [WebAssembly] Use target-independent saturating add
Reviewers: aheejin, dschuff

Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits

Differential Revision: https://reviews.llvm.org/D53721

llvm-svn: 345299
2018-10-25 19:06:13 +00:00
Craig Topper 4a825e7b29 [X86] Add some non-AVX512VL command lines to the *vl-vec-test-testn.ll tests.
This will expose some regressions in the WIP and/or/xor promotion removal patch.

llvm-svn: 345297
2018-10-25 18:23:48 +00:00
Cameron McInally 384a74b0e6 [FPEnv] Last BinaryOperator::isFNeg(...) to m_FNeg(...) changes
Replacing BinaryOperator::isFNeg(...) to avoid regressions when we
separate FNeg from the FSub IR instruction.

Differential Revision: https://reviews.llvm.org/D53650

llvm-svn: 345295
2018-10-25 18:09:33 +00:00
Craig Topper ce0bc3814b [X86] Add KNL command lines to movmsk-cmp.ll.
Some of this code looks pretty bad and we should probably still be using movmskb more with avx512f.

llvm-svn: 345293
2018-10-25 18:06:25 +00:00
Volkan Keles 60c6affcb0 [GlobalISel] LegalizerHelper: Fix the incorrect alignment when splitting loads/stores in narrowScalar
Reviewers: dsanders, bogner, jpaquette, aemerson, ab, paquette

Reviewed By: dsanders

Subscribers: rovka, kristof.beyls, javed.absar, llvm-commits

Differential Revision: https://reviews.llvm.org/D53664

llvm-svn: 345292
2018-10-25 17:52:19 +00:00
Simon Pilgrim f02c0f8af6 [LegalizeDAG] Remove dead SINT_TO_FP legalization code
As noticed on D52965, the SINT_TO_FP i64 to f32 legalization code has been dead for years - protected by an assert.

Differential Revision: https://reviews.llvm.org/D53703

llvm-svn: 345290
2018-10-25 17:43:36 +00:00
Volkan Keles f87473fe1c [GISel] LegalizerInfo: Rename MemDesc::Size to SizeInBits to make the value clearer
Requested in D53679.

llvm-svn: 345288
2018-10-25 17:37:07 +00:00
Craig Topper c10de9a37a [X86] Remove ProcIntelKNL and replace with a SlowPMADDWD flag to use in the one place it was checked.
llvm-svn: 345286
2018-10-25 17:29:00 +00:00
Craig Topper 5d787ac4be [X86] Remove some uarch tuning flags from KNL that look to have been inherited from SNB/IVB incorrectly
KNL is based on a modified Silvermont core so I don't think these features apply. I think the LEA flag is probably also wrong, but I'm less sure as I barely understand the 3 LEA flags we have currently.

Differential Revision: https://reviews.llvm.org/D53671

llvm-svn: 345285
2018-10-25 17:28:57 +00:00
Volkan Keles 3a103b1d25 [AArch64][GlobalISel] Fix the LegalityPredicate for lowerIf for G_LOAD/G_STORE
Summary:
Currently, Legalizer is trying to lower G_LOAD with a vector type
that has more than two elements due to the incorrect LegalityPredicate.

This patch fixes the issue by removing the multiplication by 8
as `MemDesc.Size` already contains the size in bits.

Reviewers: dsanders, aemerson

Reviewed By: dsanders

Subscribers: rovka, javed.absar, kristof.beyls, llvm-commits

Differential Revision: https://reviews.llvm.org/D53679

llvm-svn: 345282
2018-10-25 17:23:25 +00:00
Andrea Di Biagio 1e6d0aad7e [llvm-mca] Introduce a new base class for mca::Instruction, and change how read/write information is stored.
This patch introduces a new base class for Instruction named InstructionBase.
Class InstructionBase is responsible for tracking data dependencies with the
help of ReadState and WriteState objects.  Class Instruction now derives from
InstructionBase, and adds extra information related to the `InstrStage` as well
as the `RCUTokenID`.

ReadState and WriteState objects are no longer unique pointers. This avoids
extra heap allocation and pointer checks that weren't really needed.  Now, those
objects are simply stored into SmallVectors.  We use a SmallVector instead of a
std::vector because we expect most instructions to only have a very small number
of reads and writes.  By using a simple SmallVector we also avoid extra heap
allocations most of the time.
In a debug build, this improves the performance of llvm-mca by roughly 10% (I
still have to verify the impact in performance on a release build).

llvm-svn: 345280
2018-10-25 17:03:51 +00:00
Evandro Menezes b53cf99388 [AArch64] Refactor Exynos feature sets (NFC)
llvm-svn: 345279
2018-10-25 16:45:46 +00:00
Simon Pilgrim 8f11ddc397 [ARM] Regenerate vdup tests
llvm-svn: 345276
2018-10-25 15:33:47 +00:00
John Brawn 958865202d [AArch64] Add EXT patterns for 64-bit EXT of a subvector of a 128-bit vector
If we have a 64-bit EXT where one of the operands is a subvector of a 128-bit
vector then in some cases we can eliminate an extract_subvector by converting
to a 128-bit EXT of the 128-bit vector.

Differential Revision: https://reviews.llvm.org/D53582

llvm-svn: 345275
2018-10-25 15:31:51 +00:00
Sam Parker a16667e79b [ARM] Use Cortex-A57 sched model for Cortex-A72
This mirrors what we already do for AArch64 as the cores are similar.
As discussed in the review, enabling the machine scheduler causes
more variations in performance changes so it is not enabled for now.
This patch improves LNT scores by a geomean of 1.57% at -O3.

Differential Revision: https://reviews.llvm.org/D53562

llvm-svn: 345272
2018-10-25 15:08:29 +00:00
John Brawn b8e7887f33 [AArch64] Refactor definition of EXT patterns to use a multiclass
Using a multiclass reduces duplication, and makes it easier to add new patterns
later. This refactoring does add some new patterns, but as far as I can tell
there's no IR that will end up triggering them so this is effectively NFC.

Differential Revision: https://reviews.llvm.org/D53580

llvm-svn: 345271
2018-10-25 15:00:10 +00:00
John Brawn 49e61d90ca [AArch64] Do 64-bit vector move of 0 and -1 by extracting from the 128-bit move
Currently a vector move of 0 or -1 will use different instructions depending on
the size of the vector. Using a single instruction (the 128-bit one) for both
gives more opportunity for Machine CSE to eliminate instructions.

Differential Revision: https://reviews.llvm.org/D53579

llvm-svn: 345270
2018-10-25 14:56:48 +00:00
Alexey Bataev 0f2fe4f135 [DEBUG_INFO][NVPTX]Fix processing of DBG_VALUES.
Summary:
If the instruction in the eliminateFrameIndex function is a DBG_VALUE
instruction, it requires special processing. The frame register is set
to VRFrame and the offset is based on the object offset.
The code is similar to the code used in
lib/CodeGen/PrologEpilogInserter.cpp.

Reviewers: tra

Subscribers: jholewinski, llvm-commits

Differential Revision: https://reviews.llvm.org/D53657

llvm-svn: 345269
2018-10-25 14:27:27 +00:00
Francis Visoiu Mistrih 7d55dd673b [X86] Fix llc invocation on MIR test case
The current state of the llc invocation is:

* Running all the passes from dwarfehprepare to stack coloring
(included)
* It runs it from the LLVM IR included in the file
* It *ADDS* the generated MI from ISel to the MI in the MIR file
* The machine verifier doesn't like it.

Differential Revision: https://reviews.llvm.org/D53698

llvm-svn: 345266
2018-10-25 14:11:07 +00:00
Amara Emerson cbd86d8429 [GlobalISel] Use the target preferred type for G_EXTRACT_VECTOR_ELT index.
Allows for better imported pattern re-use.

llvm-svn: 345265
2018-10-25 14:04:54 +00:00
Krasimir Georgiev 142919bc23 IR: Optimize StructType::get to perform one hash lookup instead of two, NFCI
Summary:
This function was performing two hash lookups when a new struct type was requested: first checking if it exists and second to insert it. This patch updates the function to perform a single hash lookup in this case by updating the value in the hash table in-place in case the struct type was not there before.

Similar to r345151.

Reviewers: bkramer

Reviewed By: bkramer

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D53689

llvm-svn: 345264
2018-10-25 13:38:07 +00:00
Simon Pilgrim 53e8e145e9 [CostModel][X86] Add realistic vXi64 uitofp vXf64 costs
Match codegen improvements from D53649/rL345256

llvm-svn: 345263
2018-10-25 13:06:20 +00:00
Alex Bradbury 74d4931da2 [RISCV] Use PatFrags for variable shift patterns
This follows SystemZ and I think is cleaner vs the multiclass.

llvm-svn: 345262
2018-10-25 12:45:20 +00:00
Simon Pilgrim 0573b8d8b6 [CostModel][X86] Add realistic i64 uitofp f64 scalar costs
llvm-svn: 345261
2018-10-25 12:42:10 +00:00
Andrea Di Biagio 77c26aebda [llvm-mca] Removed a couple of redundant method declarations, and simplified code in ResourcePressureView. NFC
llvm-svn: 345259
2018-10-25 11:51:34 +00:00
Simon Pilgrim 49d79a864c Missing semicolon.
llvm-svn: 345257
2018-10-25 11:38:17 +00:00
Simon Pilgrim 838eb24014 [TargetLowering] Improve vXi64 UINT_TO_FP vXf64 support (P38226)
As suggested on D52965, this patch moves the i64 to f64 UINT_TO_FP expansion code from LegalizeDAG into TargetLowering and makes it available to LegalizeVectorOps as well.

Not only does this help perform X86 lowering as a true vectorization instead of (partially vectorized) scalar conversions, it avoids the HADDPD op from the scalar code which can be slow on most targets.

The AVX512F does have the vcvtusi2sdq scalar operation but we don't unroll to use it as it seems to only help for the v2f64 case - otherwise the unrolling cost will certainly be too high. My feeling is that we should leave it to the vectorizers - and if it generates the vector UINT_TO_FP we should use it.

Differential Revision: https://reviews.llvm.org/D53649

llvm-svn: 345256
2018-10-25 11:15:57 +00:00
George Rimar 581fc63dc0 [llvm-dwarfdump] - Fix incorrect parsing of the DW_LLE_startx_length
As was already mentioned in comments for D53364, DWARF 5
spec says about DW_LLE_startx_length:

"This is a form of bounded location description that has two unsigned ULEB operands.
The first value is an address index (into the .debug_addr section) that indicates the beginning of the address range
over which the location is valid. The second value is the length of the range. ")

Currently, the length is always parsed as U32.
Patch change the behavior to parse DW_LLE_startx_length as ULEB128 for DWARF 5
and keeps it as U32 for DWARF4+(pre-DWARF5) for compatibility.

Differential revision: https://reviews.llvm.org/D53564

llvm-svn: 345254
2018-10-25 10:56:44 +00:00
Simon Pilgrim 071e82218f [TTI] Add generic SK_Broadcast shuffle costs
I noticed while fixing PR39368 that we don't have generic shuffle costs for broadcast style shuffles.

This patch adds SK_BROADCAST handling, but exposes ARM/AARCH64 lack of handling of this type, which I've added a fix for at the same time.

Differential Revision: https://reviews.llvm.org/D53570

llvm-svn: 345253
2018-10-25 10:52:36 +00:00
Simon Pilgrim 2a9c728088 Fix MSVC llvm-exegesis build. NFCI.
MSVC is a bit funny about is_pod.....

llvm-svn: 345252
2018-10-25 10:45:38 +00:00
Carlos Alberto Enciso 9a24e1a7cd [DebugInfo][Dexter] Unreachable line stepped onto after SimplifyCFG.
When SimplifyCFG changes the PHI node into a select instruction, the debug line records becomes ambiguous. It causes the debugger to display unreachable source lines.

Differential Revision: https://reviews.llvm.org/D53287

llvm-svn: 345250
2018-10-25 09:58:59 +00:00
Gabor Buella 1f6ca0ba15 Add -instcombine-code-sinking option
Reviewers: craig.topper, andrew.w.kaylor, efriedma

Reviewed By: craig.topper, andrew.w.kaylor, efriedma

Differential Revision: https://reviews.llvm.org/D52709

llvm-svn: 345248
2018-10-25 08:32:29 +00:00
Clement Courbet b4b6ec01c6 [llvm-exegesis] Add missing initializer.
This is a better fix than rL345245.

llvm-svn: 345246
2018-10-25 08:11:35 +00:00
Clement Courbet fa99b36e4d [llvm-exegesis] Fix VC build of r345243.
"const members cannot be default initialized unless their type has a user defined default constructor"

Make members non-const.

llvm-svn: 345245
2018-10-25 08:08:58 +00:00
Clement Courbet 8902c885d6 [llvm-exegesis] Fix warning in r345243.
warning C4099: 'llvm::exegesis::PfmCountersInfo': type name first seen using 'class' now seen using 'struct'

llvm-svn: 345244
2018-10-25 08:06:35 +00:00
Clement Courbet 41c8af3924 [MCSched] Bind PFM Counters to the CPUs instead of the SchedModel.
Summary:
The pfm counters are now in the ExegesisTarget rather than the
MCSchedModel (PR39165).

This also compresses the pfm counter tables (PR37068).

Reviewers: RKSimon, gchatelet

Subscribers: mgrang, llvm-commits

Differential Revision: https://reviews.llvm.org/D52932

llvm-svn: 345243
2018-10-25 07:44:01 +00:00
Craig Topper 7ae43cad65 [X86] Don't use the OriginalDemandedBits to calculate the DemandedMask for PMULUDQ/PMULDQ inputs.
Multiply a is complex operation so just because some bit of the output isn't used doesn't mean that bit of the input isn't used.

We might able to bound it, but it will require some more thought.

llvm-svn: 345241
2018-10-25 07:00:09 +00:00
Simon Atanasyan 1993254509 [llvm-readobj] Print ELF header flags names in GNU output
GNU readelf tool prints hex value of the ELF header flags field and the
flags names. This change adds the same functionality to llvm-readobj.
Now llvm-readobj can print MIPS and RISCV flags.

New GNUStyle::printFlags() method is a copy of ScopedPrinter::printFlags()
routine. Probably we can escape code duplication and / or simplify the
printFlags() method. But it's a task for separate commit.

Differential revision: https://reviews.llvm.org/D52027

llvm-svn: 345238
2018-10-25 05:39:27 +00:00
Craig Topper eaa1cf5b57 [X86] Fix typo in comment. NFC
llvm-svn: 345236
2018-10-25 05:00:20 +00:00
Thomas Lively 325c9c5e84 [WebAssembly] Set LoadExt and TruncStore actions for SIMD types
Summary: Fixes part of the problem reported in bug 39275.

Reviewers: aheejin, dschuff

Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits, alexcrichton

Differential Revision: https://reviews.llvm.org/D53542

llvm-svn: 345230
2018-10-25 01:46:07 +00:00
Reid Kleckner a6c6698217 [X86] Adjust MIR test case to pacify machine verifier
llvm-svn: 345227
2018-10-24 23:52:33 +00:00
Reid Kleckner 24d12c28e7 [X86] Fix pipeline tests when enabling MIR verification, NFC
llvm-svn: 345226
2018-10-24 23:52:22 +00:00
David Blaikie 60fddac907 DebugInfo: Reuse common addresses for rnglist base address selections
This makes the offsets larger (since they are further from the base
address) but those are in the .dwo - and allows removing addresses and
relocations from the .o file.

This could be built into the AddressPool more fundamentally, perhaps -
when you ask for an AddressPool entry you could say "or give me some
other entry and an offset I need to use" - though what to do about
situations where the first use of an address in a section is not the
earliest address in that section... is tricky.

At least with range addresses we can be fairly sure we've seen the
earliest address first because we see the start address for the
function.

llvm-svn: 345224
2018-10-24 23:36:29 +00:00
Heejin Ahn ac764aa88e [WebAssembly] Fix immediate of rethrow when throwing to caller
Summary:
Currently when assigning depths 'rethrow' does not take the whole
control flow stack into accounts but only considers EH pad stacks. When
assigning depth immmediates to rethrows, in normal cases it is done
correctly but when a rethrow instruction throws up to a caller, i.e., we
convert a pseudo RETHROW_TO_CALLER instruction to a rethrow, it
mistakenly compute the whole stack depth.

Reviewers: dschuff

Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits

Differential Revision: https://reviews.llvm.org/D53619

llvm-svn: 345223
2018-10-24 23:31:24 +00:00
Thomas Lively ed9513472c [WebAssembly] Retain shuffle types during custom lowering
Summary:
Changing the node type in lowering was violating assumptions made in
the DAG combiner, so don't change the node type any more. This fixes
one of the issues reported in bug 39275.

Reviewers: aheejin, dschuff

Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits, alexcrichton

Differential Revision: https://reviews.llvm.org/D53537

llvm-svn: 345221
2018-10-24 23:27:40 +00:00
Thomas Lively 22602a4980 Make fminimum/fmaximum SDNodes commutative and associative
Reviewers: aheejin, dschuff

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D53680

llvm-svn: 345220
2018-10-24 23:14:59 +00:00
Reid Kleckner 49a24278ba [ELF] Fix large code model MIR verifier errors
Instead of using the MOVGOT64r pseudo, use the existing
MO_PIC_BASE_OFFSET support on symbol operands. Now I don't have to
create a "scratch register operand" for the pseudo to use, and the
register allocator can make better decisions.

Fixes some X86 verifier errors tracked in PR27481.

llvm-svn: 345219
2018-10-24 22:57:28 +00:00
Thomas Lively 30f1d69115 [NFC] Rename minnan and maxnan to minimum and maximum
Summary:
Changes all uses of minnan/maxnan to minimum/maximum
globally. These names emphasize that the semantic difference between
these operations is more than just NaN-propagation.

Reviewers: arsenm, aheejin, dschuff, javed.absar

Subscribers: jholewinski, sdardis, wdng, sbc100, jgravelle-google, jrtc27, atanasyan, llvm-commits

Differential Revision: https://reviews.llvm.org/D53112

llvm-svn: 345218
2018-10-24 22:49:55 +00:00
Alexander Shaposhnikov 654d3a9577 [llvm-objcopy] Introduce dispatch mechanism based on the input
In this diff we introduce dispatch mechanism based on 
the type of the input (archive, object file, raw binary) 
and the format (coff, elf, macho).
We also move the ELF-specific code into the namespace llvm::objcopy::elf.

Test plan: make check-all

Differential revision: https://reviews.llvm.org/D53311

llvm-svn: 345217
2018-10-24 22:49:06 +00:00
Alina Sbirlea ad4d018202 Update MemorySSA in LoopRotate.
Summary:
Teach LoopRotate to preserve MemorySSA.
Enable tests for correctness, dependency disabled by default.

Subscribers: sanjoy, jlebar, Prazek, george.burgess.iv, llvm-commits

Differential Revision: https://reviews.llvm.org/D51718

llvm-svn: 345216
2018-10-24 22:46:45 +00:00
David Blaikie c8ae096739 llvm-dwarfdump: Account for skeleton addr_base when dumping addresses in split unit in the same file
llvm-svn: 345215
2018-10-24 22:44:54 +00:00
Volodymyr Sapsai 7faf7ae0ad [VFS] Remove 'ignore-non-existent-contents' attribute for YAML-based VFS.
'ignore-non-existent-contents' stopped working after r342232 in a way
that the actual attribute value isn't used and it works as if it is
always `true`.

Common use case for VFS iteration is iterating through files in umbrella
directories for modules. Ability to detect if some VFS entries point to
non-existing files is nice but non-critical. Instead of adding back
support for `'ignore-non-existent-contents': false` I am removing the
attribute, because such scenario isn't used widely enough and stricter
checks don't provide enough value to justify the maintenance.

Change is done both in LLVM and Clang, corresponding Clang commit is r345212.

rdar://problem/45176119

Reviewers: bruno

Reviewed By: bruno

Subscribers: hiraditya, dexonsmith, sammccall, cfe-commits

Differential Revision: https://reviews.llvm.org/D53228

llvm-svn: 345213
2018-10-24 22:40:54 +00:00
Thomas Lively 43bc46207a [SelectionDAG] DAG combiner for fminnan and fmaxnan
Summary: Depends on D52765.

Reviewers: aheejin, dschuff

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D52768

llvm-svn: 345210
2018-10-24 22:18:54 +00:00
Vedant Kumar c299006879 [HotColdSplitting] Identify larger cold regions using domtree queries
The current splitting algorithm works in three stages:

  1) Identify cold blocks, then
  2) Use forward/backward propagation to mark hot blocks, then
  3) Grow a SESE region of blocks *outside* of the set of hot blocks and
  start outlining.

While testing this pass on Apple internal frameworks I noticed that some
kinds of control flow (e.g. loops) are never outlined, even though they
unconditionally lead to / follow cold blocks. I noticed two other issues
related to how cold regions are identified:

  - An inconsistency can arise in the internal state of the hotness
  propagation stage, as a block may end up in both the ColdBlocks set
  and the HotBlocks set. Further inconsistencies can arise as these sets
  do not match what's in ProfileSummaryInfo.

  - It isn't necessary to limit outlining to single-exit regions.

This patch teaches the splitting algorithm to identify maximal cold
regions and outline them. A maximal cold region is defined as the set of
blocks post-dominated by a cold sink block, or dominated by that sink
block. This approach can successfully outline loops in the cold path. As
a side benefit, it maintains less internal state than the current
approach.

Due to a limitation in CodeExtractor, blocks within the maximal cold
region which aren't dominated by a single entry point (a so-called "max
ancestor") are filtered out.

Results:
  - X86 (LNT + -Os + externals): 134KB of TEXT were outlined compared to
  47KB pre-patch, or a ~3x improvement. Did not see a performance impact
  across two runs.
  - AArch64 (LNT + -Os + externals + Apple-internal benchmarks): 149KB
  of TEXT were outlined. Ditto re: performance impact.
  - Outlining results improve marginally in the internal frameworks I
  tested.

Follow-ups:
  - Outline more than once per function, outline large single basic
  blocks, & try to remove unconditional branches in outlined functions.

Differential Revision: https://reviews.llvm.org/D53627

llvm-svn: 345209
2018-10-24 22:15:41 +00:00
Sanjay Patel e9f9a2a29c [InstCombine] add test for fptrunc with vector with undef elt; NFC
This should be fixed with D53650.

llvm-svn: 345206
2018-10-24 22:02:05 +00:00
Paul Robinson 73766ccfda Make llvm-dwarfdump -name work on type units.
Differential Revision: https://reviews.llvm.org/D53672

llvm-svn: 345203
2018-10-24 21:51:55 +00:00
Joel E. Denny 3e66509f6c [SourceMgr][FileCheck] Obey -color by extending WithColor
(Relands r344930, reverted in r344935, and now hopefully fixed for
Windows.)

While this change specifically targets FileCheck, it affects any tool
using the same SourceMgr facilities.

Previously, -color was documented in FileCheck's -help output, but
-color had no effect.  Now, -color obeys its documentation: it forces
colors to be used in FileCheck diagnostics even when stderr is not a
terminal.

-color is especially helpful when combined with FileCheck's -v, which
can produce a long series of diagnostics that you might wish to pipe
to a pager, such as less -R.  The WithColor extensions here will also
help to clean up color usage in FileCheck's annotated dump of input,
which is proposed in D52999.

Reviewed By: JDevlieghere, zturner

Differential Revision: https://reviews.llvm.org/D53419

llvm-svn: 345202
2018-10-24 21:46:42 +00:00
Evandro Menezes 096e2497b5 [AArch64] Refactor Exynos machine model
Effectively, NFC.

llvm-svn: 345201
2018-10-24 21:40:43 +00:00
Tim Northover 05fe8f918b [DAG] check more operands for cycles when merging stores.
Until now, we've only checked whether merging stores would cause a cycle via
the value argument, but the address and indexed offset arguments are also
capable of creating cycles in some situations.

The addresses are all base+offset with notionally the same base, but the base
SDNode may still be different (e.g. via an indexed load in one case, and an
ISD::ADD elsewhere). This allows cycles to creep in if one of these sources
depends on another.

The indexed offset is usually undef (representing a non-indexed store), but on
some architectures (e.g. 32-bit ARM-mode ARM) it can be an arbitrary value,
again allowing dependency cycles to creep in.

llvm-svn: 345200
2018-10-24 21:36:34 +00:00
Reid Kleckner 9c5bda652c [X86] Add *SP to tailcall register class to fix verifier error
It's possible to do a tail call to a stack argument. LLVM already
calculates the right stack offset to call through.

Fixes the sibcall* and musttail* verifier failures tracked at PR27481.

llvm-svn: 345197
2018-10-24 21:09:34 +00:00
Sanjin Sijaric 625d08eea1 [MIR] Add hasWinCFI field
Adding hasWinCFI field so that I can add MIR test cases to
https://reviews.llvm.org/D50166.

Differential Revision: https://reviews.llvm.org/D51201

llvm-svn: 345196
2018-10-24 21:07:38 +00:00
Lang Hames a0d18b6314 [ExecutionEngine] Remove some dead code from JITEventListener.h.
llvm-svn: 345195
2018-10-24 20:37:40 +00:00
Matt Davis b5d5debdbc [llvm-mca] Replace InstRef::isValid with operator bool. NFC.
llvm-svn: 345190
2018-10-24 20:27:47 +00:00
Reid Kleckner 953bdce68d [MC] Separate masm integer literal lexer support from inline asm
Summary:
This renames the IsParsingMSInlineAsm member variable of AsmLexer to
LexMasmIntegers and moves it up to MCAsmLexer. This is the only behavior
controlled by that variable. I added a public setter, so that it can be
set from outside or from the llvm-mc command line. We may need to
arrange things so that users can get this behavior from clang, but
that's future work.

I also put additional hex literal lexing functionality under this flag
to fix PR32973. It appears that this hex literal parsing wasn't intended
to be enabled in non-masm-style blocks.

Now, masm integers (0b1101 and 0ABCh) work in __asm blocks from clang,
but 0b label references work when using .intel_syntax in standalone .s
files.

However, 0b label references will *not* work from __asm blocks in clang.
They will work from GCC inline asm blocks, which it sounds like is
important for Crypto++ as mentioned in PR36144.

Essentially, we only lex masm literals for inline asm blobs that use
intel syntax. If the .intel_syntax directive is used inside a gnu-style
inline asm statement, masm literals will not be lexed, which is
compatible with gas and llvm-mc standalone .s assembly.

This fixes PR36144 and PR32973.

Reviewers: Gerolf, avt77

Subscribers: eraman, hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D53535

llvm-svn: 345189
2018-10-24 20:23:57 +00:00
Tim Northover 1c353419ab AArch64: add a pass to compress jump-table entries when possible.
llvm-svn: 345188
2018-10-24 20:19:09 +00:00
Evandro Menezes 769d4cebad [AArch64] Refactor Exynos machine model (NFC)
llvm-svn: 345187
2018-10-24 20:03:24 +00:00
Evandro Menezes 80bc136732 [AArch64] Fix overlapping instructions
Fix overlapping instruction descriptions in the machine model for Exynos M3.
Effectively, NFC.

llvm-svn: 345186
2018-10-24 20:03:20 +00:00
Andrea Di Biagio cd4deea1c4 [llvm-mca] Simplify the logic in FetchStage. NFCI
Only method 'getNextInstruction()' needs to interact with the SourceMgr.

llvm-svn: 345185
2018-10-24 19:37:45 +00:00
Craig Topper 7bb8c2e6e5 [X86] Explicitly list all KNL features of inheriting from IVB. NFC
I'm not sure all the microarchitectural tuning flags that have been added to IVBFeatures are relevant for KNL. Separating will allow us to see and audit them. There might even be some simplification opportunities in the Sandy Bridge through Icelake inheritance line without KNL using the same chain.

llvm-svn: 345183
2018-10-24 19:24:44 +00:00
Simon Pilgrim c5bb362b13 [X86][SSE] Add SimplifyDemandedBitsForTargetNode PMULDQ/PMULUDQ handling
Add X86 SimplifyDemandedBitsForTargetNode and use it to simplify PMULDQ/PMULUDQ target nodes.

This enables us to repeatedly simplify the node's arguments after the previous approach had to be reverted due to PR39398.

Differential Revision: https://reviews.llvm.org/D53643

llvm-svn: 345182
2018-10-24 19:11:28 +00:00
Simon Pilgrim 6f53b38fd4 [TargetLowering] Add SimplifyDemandedBitsForTargetNode callback
Add a SimplifyDemandedBitsForTargetNode callback to handle target nodes.

Differential Revision: https://reviews.llvm.org/D53643

llvm-svn: 345179
2018-10-24 19:00:56 +00:00
Teresa Johnson c8dba682bb [hot-cold-split] Name split functions with ".cold" suffix
Summary:
The current default of appending "_"+entry block label to the new
extracted cold function breaks demangling. Change the deliminator from
"_" to "." to enable demangling. Because the header block label will
be empty for release compile code, use "extracted" after the "." when
the label is empty.

Additionally, add a mechanism for the client to pass in an alternate
suffix applied after the ".", and have the hot cold split pass use
"cold."+Count, where the Count is currently 1 but can be used to
uniquely number multiple cold functions split out from the same function
with D53588.

Reviewers: sebpop, hiraditya

Subscribers: llvm-commits, erik.pilkington

Differential Revision: https://reviews.llvm.org/D53534

llvm-svn: 345178
2018-10-24 18:53:47 +00:00
Simon Pilgrim ac84005841 [CostModel][X86] Add vXi8 vector division by constants costs.
ISD::MULHS/ISD::MULHU lowering of vXi8 types means we expand these in TargetLowering BuildSDIV/BuildUDIV.

llvm-svn: 345175
2018-10-24 18:44:12 +00:00
Peter Collingbourne 4bb928c110 ARM: Use BKPT instead of TRAP to implement llvm.debugtrap.
The BKPT instruction is specified to cause a software breakpoint,
and at least on Linux results in a SIGTRAP. This makes it more
suitable for implementing debugtrap than TRAP (aka UDF #254), which
is specified to cause an undefined instruction exception and results
in a SIGILL on Linux.

Moreover, BKPT is not marked as a terminator, which is not only
consistent with the IR instruction but allows the analyzeBlock
function to correctly analyze a basic block containing the instruction,
which fixes an assertion failure in the machine block placement pass
previously triggered by the included test case.

Because BKPT is only supported starting with ARMv5T, we continue to
use UDF #254 when targeting v4T.

Differential Revision: https://reviews.llvm.org/D53614

llvm-svn: 345171
2018-10-24 18:10:38 +00:00
Krzysztof Parzyszek 57b5ac1431 [Hexagon] Flip hexagon-autohvx to be true by default
This will allow other generators of LLVM IR to use the auto-vectorizer
without having to change that flag.

Note: on its own, this patch will enable auto-vectorization on Hexagon
in all cases, regardless of the -fvectorize flag. There is a companion
clang patch that together with this one forms an NFC for clang users.

llvm-svn: 345169
2018-10-24 17:55:13 +00:00
Michael Kruse c342c8b87e [docs] Add rawspeed to test-suite proposals.
rawspeed was suggested by Simon Pilgrim and Roman Lebedev in
llvm.org/PR34216 and reviews.llvm.org/D46714.

llvm-svn: 345166
2018-10-24 17:35:35 +00:00
Craig Topper 2417273255 [X86] Bring back the MOV64r0 pseudo instruction
This patch brings back the MOV64r0 pseudo instruction for zeroing a 64-bit register. This replaces the SUBREG_TO_REG MOV32r0 sequence we use today. Post register allocation we will rewrite the MOV64r0 to a 32-bit xor with an implicit def of the 64-bit register similar to what we do for the various XMM/YMM/ZMM zeroing pseudos.

My main motivation is to enable the spill optimization in foldMemoryOperandImpl. As we were seeing some code that repeatedly did "xor eax, eax; store eax;" to spill several registers with a new xor for each store. With this optimization enabled we get a store of a 0 immediate instead of an xor. Though I admit the ideal solution would be one xor where there are multiple spills. I don't believe we have a test case that shows this optimization in here. I'll see if I can try to reduce one from the code were looking at.

There's definitely some other machine CSE(and maybe other passes) behavior changes exposed by this patch. So it seems like there might be some other deficiencies in SUBREG_TO_REG handling.

Differential Revision: https://reviews.llvm.org/D52757

llvm-svn: 345165
2018-10-24 17:32:09 +00:00
Simon Pilgrim 2cce074e8c [CostModel][X86] Enable non-uniform vector division by constants costs.
Non-uniform division/remainder handling was added back at D49248/D50765 - so share the 'mul+sub' costs that already exist for uniform cases.

llvm-svn: 345164
2018-10-24 17:30:29 +00:00
Robert Lougher 18bfb3a5ec [CodeGen] skip lifetime end marker in isInTailCallPosition
A lifetime end intrinsic between a tail call and the return should not
prevent the call from being tail call optimized.

Differential Revision: https://reviews.llvm.org/D53519

llvm-svn: 345163
2018-10-24 17:03:19 +00:00
Sanjay Patel d1fe437cf1 [InstCombine] add test for ComputeNumSignBits with shuffle; NFC
llvm-svn: 345162
2018-10-24 17:01:42 +00:00
Andrea Di Biagio 65c77d7283 [llvm-mca] Remove dependency from InstrBuilder in class InstructionTables.
Also, removed the initialization of vectors used for processor resource masks.
Support function 'computeProcResourceMasks()' already calls method resize on
those vectors.
No functional change intended.

llvm-svn: 345161
2018-10-24 16:56:43 +00:00
Simon Pilgrim c8c7451063 [LegalizeDAG] ExpandLegalINT_TO_FP - cleanup UINT_TO_FP i64 -> f32 expansion.
Use SrcVT/DestVT types and correct shift type.

Part of prep work for D52965

llvm-svn: 345158
2018-10-24 16:35:01 +00:00
Sanjay Patel 2169b9c976 [InstCombine] add test for select with shuffled condition (PR37549); NFC
llvm-svn: 345156
2018-10-24 16:21:23 +00:00
Krasimir Georgiev 09ea204964 IR: Optimize FunctionType::get to perform one hash lookup instead of two, NFCI
Summary: This function was performing two hash lookups when a new function type was requested: first checking if it exists and second to insert it. This patch updates the function to perform a single hash lookup in this case by updating the value in the hash table in-place in case the function type was not there before.

Reviewers: bkramer

Reviewed By: bkramer

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D53471

llvm-svn: 345151
2018-10-24 15:18:51 +00:00
Sanjay Patel 3b206305fd [InstCombine] try harder to form select from logic ops (2nd try)
The original patch was committed here:
rL344609
...and reverted:
rL344612
...because it did not properly check/test data types before calling
ComputeNumSignBits(). 

The tests that caused bot failures for the previous commit are 
over-reaching front-end tests that run the entire -O optimizer 
pipeline: 
    Clang :: CodeGen/builtins-systemz-zvector.c
    Clang :: CodeGen/builtins-systemz-zvector2.c

I've added a negative test here to ensure coverage for that case.
The new early exit check also tests the type of the 'B' parameter,
so we don't waste time on matching if either value is unsuitable.

Original commit message:

This is part of solving PR37549:
https://bugs.llvm.org/show_bug.cgi?id=37549

The patterns shown here are a special case of something
that we already convert to select. Using ComputeNumSignBits()
catches that case (but not the more complicated motivating
patterns yet).

The backend has hooks/logic to convert back to logic ops
if that's better for the target.

llvm-svn: 345149
2018-10-24 15:17:56 +00:00
Andrea Di Biagio 7be45b0f85 [llvm-mca] Refactor class SourceMgr. NFCI
Added begin()/end() methods to allow the usage of SourceMgr in foreach loops.
With this change, method getMCInstFromIndex() (as well as a couple of other
methods) are now redundant, and can be removed from the public interface.

llvm-svn: 345147
2018-10-24 15:06:27 +00:00
Cameron McInally 678f43f666 [FPEnv] Convert more BinaryOperator::isFNeg(...) to m_FNeg(...)
This work is to avoid regressions when we seperate FNeg from the FSub IR instruction. 

Differential Revision: https://reviews.llvm.org/D53205

llvm-svn: 345146
2018-10-24 14:45:18 +00:00
Alexey Bataev c15c853c3a [DEBUGINFO, NVPTX] Try to pack bytes data into a single string.
Summary:
If the target does not support `.asciz` and `.ascii` directives, the
strings are represented as bytes and each byte is placed on the new line
as a separate byte directive `.b8 <data>`. NVPTX target allows to
represent the vector of the data of the same type as a vector, where
values are separated using `,` symbol: `.b8 <data1>,<data2>,...`. This
allows to reduce the size of the final PTX file. Ptxas tool includes ptx
files into the resulting binary object, so reducing the size of the PTX
file is important.

Reviewers: tra, jlebar, echristo

Subscribers: jholewinski, llvm-commits

Differential Revision: https://reviews.llvm.org/D45822

llvm-svn: 345142
2018-10-24 14:04:00 +00:00
James Henderson 5b2e968264 Fix llvm-strings crash for negative char values
On Windows at least, llvm-strings was crashing if it encountered bytes
that mapped to negative chars, as it was passing these into
std::isgraph and std::isblank functions, resulting in undefined
behaviour. On debug builds using MSVC, these functions verfiy that the
value passed in is representable as an unsigned char. Since the char is
promoted to an int, a value greater than 127 would turn into a negative
integer value, and fail the check. Using the llvm::isPrint function is
sufficient to solve the issue.

Reviewed by: ruiu, mstorsjo

Differential Revision: https://reviews.llvm.org/D53509

llvm-svn: 345137
2018-10-24 13:16:16 +00:00
Simon Pilgrim 84cc110732 [X86][SSE] Update PMULDQ schedule tests to survive more aggressive SimplifyDemandedBits
llvm-svn: 345136
2018-10-24 13:13:36 +00:00
Martin Storsjo c4a995c8e0 [MinGW] Enable large file for mingw-w64
64-bit mingw doesn't define _FILE_OFFSET_BITS=64 by default.

Differential Revision: https://reviews.llvm.org/D53569

llvm-svn: 345131
2018-10-24 12:22:12 +00:00
Guillaume Chatelet da11b85606 [llvm-exegesis] Implements a cache of Instruction objects.
llvm-svn: 345130
2018-10-24 11:55:06 +00:00
Andrea Di Biagio 083addf751 [llvm-mca] [llvm-mca] Improved error handling and error reporting from class InstrBuilder.
A new class named InstructionError has been added to Support.h in order to
improve the error reporting from class InstrBuilder.
The llvm-mca driver is responsible for handling InstructionError objects, and
printing them out to stderr.

The goal of this patch is to remove all the remaining error handling logic from
the library code.
In particular, this allows us to:
 - Simplify the logic in InstrBuilder by removing a needless dependency from
MCInstrPrinter.
 - Centralize all the error halding logic in a new function named 'runPipeline'
(see llvm-mca.cpp).

This is also a first step towards generalizing class InstrBuilder, so that in
future, we will be able to reuse its logic to also "lower" MachineInstr to
mca::Instruction objects.

Differential Revision: https://reviews.llvm.org/D53585

llvm-svn: 345129
2018-10-24 10:56:47 +00:00
Eugene Leviant 9465a1a580 [ThinLTO] Change parameter type. NFC
Change destination module type for consistency with r345118

llvm-svn: 345124
2018-10-24 08:59:58 +00:00
Gil Rapaport c523036fd2 Revert r345114
Investigating fails.

llvm-svn: 345123
2018-10-24 08:41:22 +00:00
Tim Renouf 2a1b1d94b6 [AMDGPU] Defined gfx909 Raven Ridge 2
Differential Revision: https://reviews.llvm.org/D53418

Change-Id: Ie3d054f2e956c2768988c0f4c0ffd29a47294eef
llvm-svn: 345120
2018-10-24 08:14:07 +00:00
Eugene Leviant 1f54500af0 [ThinLTO] Fix dot dumper for regular LTO modules
Regular LTO module identifier is (unsigned)-1. This patch emits correct
module identifier while printing edges with source summary in regular
LTO module.

Differential revision: https://reviews.llvm.org/D53583

llvm-svn: 345118
2018-10-24 07:48:32 +00:00
Dorit Nuzman 5114390e48 [LV] Don't have fold-tail under optsize invalidate interleave-groups when
masked-interleaving is enabled

Enable interleave-groups under fold-tail scenario for Opt for size compilation;
D50480 added support for vectorizing loops of arbitrary trip-count without a
remiander, which in turn makes everything in the loop conditional, including
interleave-groups if any. It therefore invalidated all interleave-groups
because we didn't have support for vectorizing predicated interleaved-groups
at the time. In the meantime, D53011 introduced this support, so we don't
have to invalidate interleave-groups when masked-interleaved support is enabled.

Reviewers: Ayal, hsaito, dcaballe, fhahn

Reviewed By: hsaito

Differential Revision: https://reviews.llvm.org/D53559

llvm-svn: 345115
2018-10-24 07:11:38 +00:00
Gil Rapaport 5012e7f6ac [LSR] Combine unfolded offset into invariant register
LSR reassociates constants as unfolded offsets when the constants fit as
immediate add operands, which currently prevents such constants from being
combined later with loop invariant registers.
This patch modifies GenerateCombinations() to generate a second formula which
includes the unfolded offset in the combined loop-invariant register.

Differential Revision: https://reviews.llvm.org/D51861

llvm-svn: 345114
2018-10-24 07:08:38 +00:00
Craig Topper da54bbf52a [X86] Correct a bad isel predicate. Though I don't think it can be exposed.
This B/W VPTEST instructions are only available with AVX512BW. But lowering should prevent any byte or word elements from getting to isel so this can't be exposed.

llvm-svn: 345112
2018-10-24 06:13:36 +00:00
Sanjin Sijaric cd41638292 [ARM64][Windows] Add unwind support to llvm-readobj
This patch adds support for dumping the unwind info from ARM64 COFF object
files.

Differential Revision: https://reviews.llvm.org/D53264

llvm-svn: 345108
2018-10-24 00:03:34 +00:00
Saleem Abdulrasool 4005f9a860 ARM: handle checking aliases with out-of-bounds GEPs
A global alias may use indices which are not considered in bounds.  In
such a case, accessing the base object will fail as it only peers
through inbounds accesses.  This pattern is used by the swift compiler
to create references to preceeding members in the type metadata.  This
would cause the code generation to fail when targeting a platform that
used ELF as the object file format.  Be conservative and fail the
read-only check if we run into an alias that we cannot peer through.

llvm-svn: 345107
2018-10-24 00:00:52 +00:00
Reid Kleckner 5fa1e35bcc Commit missing comment edit and use correct cast to fix std::min overload
llvm-svn: 345105
2018-10-23 23:44:44 +00:00
Reid Kleckner 1500effacd [hurd] Make getMainExecutable get the real binary path
On GNU/Hurd, llvm-config is returning bogus value, such as:

$ llvm-config-6.0 --includedir
/usr/include

while it should be:
$ llvm-config-6.0 --includedir
/usr/lib/llvm-6.0/include

This is because getMainExecutable does not get the actual installation
path. On GNU/Hurd, /proc/self/exe is indeed a symlink to the path that
was used to start the program, and not the eventual binary file. Llvm's
getMainExecutable thus needs to run realpath over it to get the actual
place where llvm was installed (/usr/lib/llvm-6.0/bin/llvm-config), and
not /usr/bin/llvm-config-6.0. This will not change the result on Linux,
where /proc/self/exe already points to the eventual file.

Patch by Samuel Thibault!

While making changes here, I reformatted this block a bit to reduce
indentation and match 2 space indent style.

Differential Revision: https://reviews.llvm.org/D53557

llvm-svn: 345104
2018-10-23 23:35:43 +00:00
Wei Mi 80a0c97e07 [PM] keeping history when original SCC split and then merge into itself
in the same round of SCC update.

In https://reviews.llvm.org/rL309784, inline history is added to prevent
infinite inlining across multiple run of inliner and SCC update, but the
history will only be kept when new SCC is actually generated during SCC update.

We found a case that SCC can be split and then merge into itself in the same
round of SCC update, so the same SCC will be pop out from UR.CWorklist and
then added back immediately, without any new SCC generated, that is why the
existing patch cannot catch the infinite inline case.

What the patch does is even if no new SCC is generated, if only the current
SCC appears in UR.CWorklist again, then keep the inline history.

Differential Revision: https://reviews.llvm.org/D52915

llvm-svn: 345103
2018-10-23 23:29:45 +00:00
Matthias Braun 4f82406c46 SelectionDAG: Reuse bigger sized constants in memset expansion.
When implementing memset's today we often see this pattern:
$x0 = MOV 0xXYXYXYXYXYXYXYXY
store $x0, ...
$w1 = MOV 0xXYXYXYXY
store $w1, ...

We first create a 64bit constant in a 64bit register with all bytes the
same and then create a 32bit constant with all bytes the same in a 32bit
register. In many targets we could just access the lower byte of the
64bit register instead.

- Ideally this would be handled by the ConstantHoist pass but it runs
  too early when memset isn't expanded yet.
- The memset expansion code already had this optimization implemented,
  however SelectionDAG constantfolding would constantfold the
  "trunc(bigconstnat)" pattern to "smallconstant".
- This patch makes the memset expansion mark the constant as Opaque and
  stop DAGCombiner from constant folding in this situation. (Similar to
  how ConstantHoisting marks things as Opaque to avoid folding
  ADD/SUB/etc.)

Differential Revision: https://reviews.llvm.org/D53181

llvm-svn: 345102
2018-10-23 23:19:23 +00:00
Lang Hames 23cb2e7f77 [ORC] Re-apply r345077 with fixes to remove ambiguity in lookup calls.
llvm-svn: 345098
2018-10-23 23:01:39 +00:00
Teresa Johnson 7c6344a64f Revert "[ThinLTO] Fix a crash in lazy loading of Metadata"
This reverts commit r345095. It was accidentally committed.

llvm-svn: 345097
2018-10-23 23:00:29 +00:00
Teresa Johnson d725335bd1 [hot-cold-split] Only perform splitting in ThinLTO backend post-link
Summary:
Fix the new PM to only perform hot cold splitting once during ThinLTO,
by skipping it in the pre-link phase.

This was already fixed in the old PM by the move of the hot cold split
pass later (after the early return when PrepareForThinLTO) by r344869.

Reviewers: vsk, sebpop, hiraditya

Subscribers: mehdi_amini, inglorion, eraman, steven_wu, dexonsmith, llvm-commits

Differential Revision: https://reviews.llvm.org/D53611

llvm-svn: 345096
2018-10-23 22:57:40 +00:00
Teresa Johnson 3513dc245e [ThinLTO] Fix a crash in lazy loading of Metadata
Summary:
This is a revised version of D41474.

When the debug location is parsed in BitcodeReader::parseFunction, the
scope and inlinedAt MDNodes are obtained via MDLoader->getMDNodeFwdRefOrNull(),
which will create a forward ref if they were not yet loaded.
Specifically, if one of these MDNodes is in the module level metadata
block, and this is during ThinLTO importing, that metadata block is
lazily loaded.

Most places in that invoke getMDNodeFwdRefOrNull have a corresponding call
to resolveForwardRefsAndPlaceholders which will take care of resolving them.
E.g. places that call getMetadataFwdRefOrLoad, or at the end of parsing a
function-level metadata block, or at the end of the initial lazy load of
module level metadata in order to handle invocations of getMDNodeFwdRefOrNull
for named metadata and global object attachments. However, the calls for
the scope/inlinedAt of debug locations are not backed by any such call to
resolveForwardRefsAndPlaceholders.

To fix this, change the scope and inlinedAt parsing to instead use
getMetadataFwdRefOrLoad, which will ensure the forward refs to lazily
loaded metadata are resolved.

Fixes PR35472.

Reviewers: dexonsmith, Sunil_Srivastava, vsk

Subscribers: inglorion, eraman, steven_wu, sebpop, mehdi_amini, dmikulin, vsk, hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D53596

llvm-svn: 345095
2018-10-23 22:57:21 +00:00
Fangrui Song fa735b0eab Actually fix test from r345085 REQUIRE: asserts
llvm-svn: 345090
2018-10-23 22:07:34 +00:00
Fangrui Song 54b825cafe Fix test after r345085
llvm-svn: 345089
2018-10-23 22:04:33 +00:00
Craig Topper e01d516ac7 [X86] Autogenerate comple checks. NFC
llvm-svn: 345087
2018-10-23 21:58:49 +00:00
Zhizhou Yang 13f76f84bc Print out DebugCounter info with -print-debug-counter
Summary:
This patch will print out {Counter, Skip, StopAfter} info of all passes which have DebugCounter set at destruction.

It can be used to monitor how many times does certain transformation happen in a pass, and also help check if -debug-counter option is set correctly.

Please refer to this [[ http://lists.llvm.org/pipermail/llvm-dev/2018-July/124722.html  | thread ]] for motivation.

Reviewers: george.burgess.iv, davide, greened

Reviewed By: greened

Subscribers: kristina, llozano, mgorny, llvm-commits, mgrang

Differential Revision: https://reviews.llvm.org/D50031

llvm-svn: 345085
2018-10-23 21:51:56 +00:00
Jonas Devlieghere 3ef53e10d3 [dwarfdump] Make incompatibility between -diff and -verbose explicit.
Using -diff and -verbose together doesn't work today. We should audit
where these two options interact and fix them. In the meantime we error
out when the user try to specify both.

llvm-svn: 345084
2018-10-23 21:51:44 +00:00
Matt Arsenault 9ef8e51cec Fix typo in verifier error message
llvm-svn: 345083
2018-10-23 21:23:52 +00:00
Peter Collingbourne abd820a92b CGP: Clear data structures at the end of a loop iteration instead of the beginning.
Clearing LargeOffsetGEPMap at the end fixes a bug where if a large
offset GEP is in a dead basic block, we fail an assertion when trying
to delete the block due to the asserting VH in LargeOffsetGEPMap.

Differential Revision: https://reviews.llvm.org/D53464

llvm-svn: 345082
2018-10-23 21:23:18 +00:00
Jordan Rupprecht ab9f662651 [llvm-objcopy] Fix use-after-move clang-tidy warning
llvm-svn: 345079
2018-10-23 20:54:51 +00:00
Reid Kleckner db367e952e Revert r345077 "[ORC] Change how non-exported symbols are matched during lookup."
Doesn't build on Windows. The call to 'lookup' is ambiguous. Clang and
MSVC agree, anyway.

http://lab.llvm.org:8011/builders/clang-x64-windows-msvc/builds/787
C:\b\slave\clang-x64-windows-msvc\build\llvm.src\unittests\ExecutionEngine\Orc\CoreAPIsTest.cpp(315): error C2668: 'llvm::orc::ExecutionSession::lookup': ambiguous call to overloaded function
C:\b\slave\clang-x64-windows-msvc\build\llvm.src\include\llvm/ExecutionEngine/Orc/Core.h(823): note: could be 'llvm::Expected<llvm::JITEvaluatedSymbol> llvm::orc::ExecutionSession::lookup(llvm::ArrayRef<llvm::orc::JITDylib *>,llvm::orc::SymbolStringPtr)'
C:\b\slave\clang-x64-windows-msvc\build\llvm.src\include\llvm/ExecutionEngine/Orc/Core.h(817): note: or       'llvm::Expected<llvm::JITEvaluatedSymbol> llvm::orc::ExecutionSession::lookup(const llvm::orc::JITDylibSearchList &,llvm::orc::SymbolStringPtr)'
C:\b\slave\clang-x64-windows-msvc\build\llvm.src\unittests\ExecutionEngine\Orc\CoreAPIsTest.cpp(315): note: while trying to match the argument list '(initializer list, llvm::orc::SymbolStringPtr)'

llvm-svn: 345078
2018-10-23 20:54:43 +00:00
Lang Hames 841796decd [ORC] Change how non-exported symbols are matched during lookup.
In the new scheme the client passes a list of (JITDylib&, bool) pairs, rather
than a list of JITDylibs. For each JITDylib the boolean indicates whether or not
to match against non-exported symbols (true means that they should be found,
false means that they should not). The MatchNonExportedInJD and MatchNonExported
parameters on lookup are removed.

The new scheme is more flexible, and easier to understand.

This patch also updates JITDylib search orders to be lists of (JITDylib&, bool)
pairs to match the new lookup scheme. Error handling is also plumbed through
the LLJIT class to allow regression tests to fail predictably when a lookup from
a lazy call-through fails.

llvm-svn: 345077
2018-10-23 20:20:22 +00:00
Michael Kruse 53c722df0b [test-suite/doc] Add list of programs we might add.
Add a list of benchmarks, applications and algorithms which are under
discussion to be added to the test-suite.

The initial list includes the the benchmarks mentioned at
https://llvm.org/PR34216, missing SPEC benchmarks, some image processing
algorithms and a few others. The bug tracker only allows adding to the
discussion, not removing, commenting, adding details to individual
benchmarks.

The first proposal was to add these benchmark into the test-suite
repository, but after a discussion, adding it to llvm/docs/Proposals
seem more appropriate. One advantage is that llvm.org will have a
browsable web page with these suggestions.

Suggested-by: Hal Finkel

Differential Revision: https://reviews.llvm.org/D46714

llvm-svn: 345074
2018-10-23 19:46:29 +00:00
Vedant Kumar 503154615d [HotColdSplitting] Attach MinSize to outlined code
Outlined code is cold by assumption, so it makes sense to optimize it
for minimal code size rather than performance.

After r344869 moved the splitting pass to the end of the IR pipeline,
this does not result in much of a code size reduction. This is probably
because a comparatively small number backend transforms make use of the
MinSize hint.

Running LNT on x86_64, I see that 33/1020 binaries shrink for a total of
919 bytes of TEXT reduction. I didn't measure a significant performance
impact.

Differential Revision: https://reviews.llvm.org/D53518

llvm-svn: 345072
2018-10-23 19:41:12 +00:00
Simon Pilgrim b6c57075c0 [X86][SSE] Revert rL343922 combinePMULDQ AddToWorklist (PR39398)
We can't add the MULDQ node back to the worklist after the demanded bits change has been committed in case the node has been removed entirely. This will have to wait until we have SimplifyDemandedBitsForTargetNode.

llvm-svn: 345070
2018-10-23 19:07:53 +00:00
Jordan Rupprecht aaeaa0a8b3 [llvm-strip] Support -s alias for --strip-all. Make both strip and objcopy case sensitive to support both -s (--strip-all) and -S (--strip-debug).
Summary:
GNU strip supports both `-s` and `-S` as aliases for `--strip-all` and `--strip-debug`, respectfully.

As part of this, it turns out that strip/objcopy were accepting case insensitive command line args. I'm not sure if there was an explicit reason for this. The only others uses of this are llvm-cvtres/llvm-mt/llvm-lib, which are all tools specific for windows support. Forcing case sensitivity allows both aliases to exist, but seems like a good idea anyway.

And as a surprise test case adjustment, the llvm-strip unit test was running with `-keep=unavailable_symbol`, despite `keep` not be a valid flag for strip. This is because there is a flag `-K` which, when case insensitivity is permitted, allows it to be interpreted as `-K` = `eep=unavailable_symbol` (e.g. to allow `-Kfoo` == `--keep-symbol=foo`).

Reviewers: jakehehrlich, jhenderson, alexshap

Reviewed By: jakehehrlich

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D53163

llvm-svn: 345068
2018-10-23 18:46:33 +00:00
Simon Pilgrim 8c4796deb4 [LegalizeDAG] Share Vector/Scalar CTPOP Expansion
As suggested on D53258, this patch move the CTPOP expansion code from SelectionDAGLegalize to TargetLowering to allow it to be reused by the VectorLegalizer.

Proper vector support will be added by D53258.

llvm-svn: 345066
2018-10-23 18:28:24 +00:00
Roman Lebedev 2fae985793 X86DAGToDAGISel::matchBitExtract(): lambdas can't have default arguments.
As reported by ctopper.
That is a gcc-only warning at the moment.

llvm-svn: 345065
2018-10-23 18:27:10 +00:00
Simon Pilgrim d705ba97dd [LegalizeDAG] Share Vector/Scalar CTLZ Expansion
As suggested on D53258, this patch shares common CTLZ expansion code between VectorLegalizer and SelectionDAGLegalize by putting it in TargetLowering.

Extension to D53474

llvm-svn: 345060
2018-10-23 17:48:30 +00:00
Daniel Sanders d0ef689830 Fix MSVC build by correcting placement of declspec after r345056
Going by the MSVC toolchains at godbolt.org, declspec comes after the template<...>.

llvm-svn: 345059
2018-10-23 17:41:39 +00:00
Fangrui Song 531e3d0cd9 [IR] Fix -Wunused-function after r345052
llvm-svn: 345057
2018-10-23 17:24:15 +00:00
Daniel Sanders d300ba1ed7 [tblgen] Allow FixedLenDecoderEmitter to use APInt-like objects as InsnType
Summary:
Some targets have very long encodings and uint64_t isn't sufficient. uint128_t
isn't portable so such targets need to use an object instead.

There is one catch with this at the moment, no string of bits extracted
from the encoding may exceeed 64-bits. Fields are still permitted to
exceed 64-bits so long as they aren't one contiguous string of bits. If
this proves to be a problem then we can modify the generation of
fieldFromInstruction() calls to account for it but for now I've added an
assertion for this.

InsnType must either be integral or an APInt-like object that must:
* Have a static const max_size_in_bits equal to the number of bits in the encoding.
* be default-constructible and copy-constructible
* be constructible from a uint64_t (this is the key area the interface deviates
  from APInt since this constructor does not take the bit width)
* be constructible from an APInt (this can be private)
* be convertible to uint64_t
* Support the ~, &,, ==, !=, and |= operators with other objects of the same type
* Support shift (<<, >>) with signed and unsigned integers on the RHS
* Support put (<<) to raw_ostream&

Reviewers: bogner, charukcs

Subscribers: nhaehnle, llvm-commits

Differential Revision: https://reviews.llvm.org/D52100

llvm-svn: 345056
2018-10-23 17:23:31 +00:00
Reid Kleckner 075897292f [PDB] Fix -Wunused-private-field in DIA
llvm-svn: 345054
2018-10-23 17:20:16 +00:00
Stefan Pintilie 927e8bf316 [Power9] Add __float128 support in the backend for bitcast to a i128
Add support to allow bit-casting from f128 to i128 and then
extracting 64 bits from the result.

Differential Revision: https://reviews.llvm.org/D49507

llvm-svn: 345053
2018-10-23 17:11:36 +00:00
Sanjay Patel 07076cfdf6 [IR] remove fake binop queries for not/neg
The initial motivation is that we want to remove the
fneg API because that would silently fail if we add
an actual fneg instruction to IR. The same would be
true for the integer ops, so we might as well get rid 
of these too.

We have a newer 'match' API that makes checking for
these patterns simpler. It also works with vectors
that may include undef elements in constants.

If any out-of-tree users need updating, they can model
their code changes on these commits:
rL345050
rL345043
rL345042
rL345041
rL345036
rL345030

llvm-svn: 345052
2018-10-23 17:06:03 +00:00
Sanjay Patel 95790c546f [InstCombine] use 'match' to simplify code
There's probably some vector-with-undef-element pattern
that shows an improvement, so this is probably not quite
'NFC'.

This is the last step towards removing the fake binop 
queries for not/neg. Ie, there are no more uses of those
functions in trunk. Fneg should follow.

llvm-svn: 345050
2018-10-23 16:54:28 +00:00
Simon Pilgrim f04a04c2b6 [TTI][X86] Treat SK_Transpose shuffles as SK_PermuteTwoSrc - there's no difference in lowering.
llvm-svn: 345048
2018-10-23 16:45:26 +00:00
Jordan Rupprecht 2fed6ac186 [DebugInfo][GlobalOpt] Fix -debugify for globalopt shrinking globals to booleans.
Summary:
TryToShrinkGlobalToBoolean, when possible, will split store <value> + load <value> into store <bool> + select <bool ? value : 0>. This preserves DebugLoc during that pass.

Fixes PR37959. The test case here is the simplified .ll for:

```
static int foo;
int bar() {
  foo = 5;
  return foo;
}
```

Reviewers: dblaikie, gbedwell, aprantl

Reviewed By: dblaikie

Subscribers: mehdi_amini, JDevlieghere, dexonsmith, llvm-commits

Tags: #debug-info

Differential Revision: https://reviews.llvm.org/D53531

llvm-svn: 345046
2018-10-23 16:35:51 +00:00
Simon Pilgrim f1d8b7c49e [CostModel][X86] Add transpose shuffle cost tests
llvm-svn: 345045
2018-10-23 16:27:14 +00:00
Sanjay Patel 47a52a0521 [WebAssembly] use 'match' to simplify code; NFC
Vector types are not possible here because this code explicitly
checks for a scalar type, but this is another step towards 
completely removing the fake binop queries for not/neg/fneg.

llvm-svn: 345043
2018-10-23 16:05:09 +00:00
Sanjay Patel 5b6b090cf2 [Reassociate] replace fake binop queries with 'match' API
We need to update this code before introducing an 'fneg' instruction in IR,
so we might as well kill off the integer neg/not queries too.

This is no-functional-change-intended for scalar code and most vector code. 
For vectors, we can see that the 'match' API allows for undef elements in 
constants, so we optimize those cases better.

Ideally, there would be a test for each code diff, but I don't see evidence
of that for the existing code, so I didn't try very hard to come up with new 
vector tests for each code change.

Differential Revision: https://reviews.llvm.org/D53533

llvm-svn: 345042
2018-10-23 15:55:06 +00:00
Sanjay Patel ad12df829c [SelectionDAG] use 'match' to simplify code; NFC
Vector types are not possible here because this code only starts
matching from the scalar bool value of a conditional branch, but
this is another step towards completely removing the fake binop
queries for not/neg/fneg.

llvm-svn: 345041
2018-10-23 15:46:10 +00:00
Benjamin Kramer 1e212e8abb [LegalizeDAG] Remove unused variable
llvm-svn: 345040
2018-10-23 15:43:36 +00:00
Simon Pilgrim b975ff4700 [LegalizeDAG] Share Vector/Scalar CTTZ Expansion
As suggested on D53258, this patch demonstrates sharing common CTTZ expansion code between VectorLegalizer and SelectionDAGLegalize by putting it in TargetLowering.

I intend to move CTLZ and (scalar) CTPOP over as well and then update D53258 accordingly.

Differential Revision: https://reviews.llvm.org/D53474

llvm-svn: 345039
2018-10-23 15:37:19 +00:00
Simon Pilgrim 532a0f122e [SLPVectorizer] Add basic support for mul/and/or/xor horizontal reductions
Expand arithmetic reduction to include mul/and/or/xor instructions.

This patch just fixes the SLPVectorizer - the effective reduction costs for AVX1+ are still poor (see rL344846) and will need to be improved before SLP sees this as a valid transform - but we can already see the effect on SSE2 tests.

This partially helps PR37731, but doesn't fix it all as it still falls over on the extraction/reduction order for some reason.

Differential Revision: https://reviews.llvm.org/D53473

llvm-svn: 345037
2018-10-23 15:13:09 +00:00
Sanjay Patel 747feb28e4 [InstCombine] use 'match' to handle vectors and simplify code
This is another step towards completely removing the fake 
binop queries for not/neg/fneg.

llvm-svn: 345036
2018-10-23 15:05:12 +00:00
Sanjay Patel ad76c682c7 [InstCombine] swap select profile metadata when swapping select ops
llvm-svn: 345034
2018-10-23 14:43:31 +00:00
Sanjay Patel 54da0057b9 [InstCombine] add/move tests for select with inverted condition; NFC
The transform is broken in 2 ways - it doesn't correct metadata (or even drop it),
and it doesn't work with vectors with undef elements.

llvm-svn: 345033
2018-10-23 14:37:29 +00:00
Aleksandr Urakov 00d4c38668 Revert "[MachinePipeliner] Split MachinePipeliner code into header and cpp files"
This reverts commit 40760b733d9eef841c897338af5e9d81b12551bf.
It seems that the commit is a cuse of the build failure.

llvm-svn: 345032
2018-10-23 14:27:45 +00:00
Sanjay Patel 5141435d23 [SLSR] use 'match' to simplify code; NFC
This pass could probably be modified slightly to allow
vector splat transforms for practically no cost, but
it only works on scalars for now. So the use of the
newer 'match' API should make no functional difference. 

llvm-svn: 345030
2018-10-23 14:07:39 +00:00
Sanjay Patel d3d2a0b591 [SLSR] auto-generate full test assertions; NFC
llvm-svn: 345028
2018-10-23 13:39:40 +00:00
Roman Lebedev 06e4db07af Experimental re-land of [X86][BMI1] X86DAGToDAGISel: select BEXTR from x << (32 - y) >> (32 - y) pattern
This initially landed in rL345014, but was reverted in rL345017
due to sanitizer-x86_64-linux-fast buildbot failure in
check-lld (ELF/relocatable-versioned.s) test.

While i'm not yet quite sure what is the problem, one obvious
thing here is that extra truncation roundtrip.
Maybe that's it? If not, will re-revert.

Differential Revision: https://reviews.llvm.org/D53521

llvm-svn: 345027
2018-10-23 13:19:31 +00:00
Simon Pilgrim 9a2ee1ea2d Add BROADCAST shuffle cost tests.
Part of a lot of cleanup necessary before PR39368.

llvm-svn: 345025
2018-10-23 13:14:54 +00:00
Simon Pilgrim 051feee5c2 Add BROADCAST shuffle cost tests.
Part of a lot of cleanup necessary before PR39368.

llvm-svn: 345023
2018-10-23 13:00:22 +00:00
Dorit Nuzman da5dc13355 Leftover bits from https://reviews.llvm.org/D53420 that were accidentally left
out of revision 344883

llvm-svn: 345021
2018-10-23 11:51:55 +00:00
Greg Bedwell 98b5f6d159 [lit] Only return a found bash executable on Windows if it can understand Windows paths
Some versions of bash.exe, for example WSL's version expect paths in the form
/mnt/c/path/to/dir rather than c:\\path\\to\\dir so will cause failures
for any tests that require an external shell if used by lit.  If we're on
Windows and looking for an external shell, check that the found version
of bash is able to parse a native path before returning that version.

This patch also partially reverts the behaviour of r228221 by
restoring the warning if bash cannot be found.  This shouldn't pollute
the lit stderr anymore as we're now using internal shell by default on
Windows.  If someone is explicitly specifying to use an external shell, it's
probably worth alerting them to the fact that bash could not be found.

Differential Revision: https://reviews.llvm.org/D52831

llvm-svn: 345019
2018-10-23 11:34:04 +00:00
Simon Pilgrim f85ee9f8b4 [X86][SSE] Update raw mask shuffle decoders to handle UNDEF mask elts
Matches the approach taken in the constant pool shuffle decoders, and uses an UndefElts mask instead of uint64_t(-1) raw mask values, which doesn't work safely for i32/i64 shuffle mask sizes (as the -1 value is legal).

This allows us to remove the constant pool shuffle decoders from most of the getTargetShuffleMask variable shuffle cases (X86ISD::VPERMV3 will be handled in a future commit).

llvm-svn: 345018
2018-10-23 11:33:38 +00:00
Roman Lebedev c29dbbdb10 Revert "[X86][BMI1] X86DAGToDAGISel: select BEXTR from x << (32 - y) >> (32 - y) pattern"
*Seems* to be breaking sanitizer-x86_64-linux-fast buildbot,
the ELF/relocatable-versioned.s test:

==17758==MemorySanitizer CHECK failed: /b/sanitizer-x86_64-linux-fast/build/llvm/projects/compiler-rt/lib/sanitizer_common/sanitizer_allocator.cc:191 "((kBlockMagic)) == ((((u64*)addr)[0]))" (0x6a6cb03abcebc041, 0x0)
    #0 0x59716b in MsanCheckFailed(char const*, int, char const*, unsigned long long, unsigned long long) /b/sanitizer-x86_64-linux-fast/build/llvm/projects/compiler-rt/lib/msan/msan.cc:393
    #1 0x586635 in __sanitizer::CheckFailed(char const*, int, char const*, unsigned long long, unsigned long long) /b/sanitizer-x86_64-linux-fast/build/llvm/projects/compiler-rt/lib/sanitizer_common/sanitizer_termination.cc:79
    #2 0x57d5ff in __sanitizer::InternalFree(void*, __sanitizer::SizeClassAllocatorLocalCache<__sanitizer::SizeClassAllocator32<__sanitizer::AP32> >*) /b/sanitizer-x86_64-linux-fast/build/llvm/projects/compiler-rt/lib/sanitizer_common/sanitizer_allocator.cc:191
    #3 0x7fc21b24193f  (/lib/x86_64-linux-gnu/libc.so.6+0x3593f)
    #4 0x7fc21b241999 in exit (/lib/x86_64-linux-gnu/libc.so.6+0x35999)
    #5 0x7fc21b22c2e7 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x202e7)
    #6 0x57c039 in _start (/b/sanitizer-x86_64-linux-fast/build/llvm_build_msan/bin/lld+0x57c039)

This reverts commit r345014.

llvm-svn: 345017
2018-10-23 10:34:57 +00:00
Simon Pilgrim 816e57be35 [TTI] Add generic cost handling of SK_Reverse shuffles
These can be treated as a general permute.

This required a fix for missing reverse patterns on ARM

llvm-svn: 345015
2018-10-23 09:42:10 +00:00
Roman Lebedev 1c95b2f779 [X86][BMI1] X86DAGToDAGISel: select BEXTR from x << (32 - y) >> (32 - y) pattern
Summary:
Continuation of D52348.

We also get the `c) x &  (-1 >> (32 - y))` pattern here, because of the D48768.
I will add extra-uses into those tests and follow-up with a patch to handle those patterns too.

Reviewers: RKSimon, craig.topper

Reviewed By: craig.topper

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D53521

llvm-svn: 345014
2018-10-23 09:08:44 +00:00
Aleksandr Urakov afe33a2725 Fix non-Windows build for D53324
llvm-svn: 345011
2018-10-23 08:15:00 +00:00
Aleksandr Urakov c43e086c74 Revert "Revert "[PDB] Extend IPDBSession's interface to retrieve frame data""
This reverts commit 466ce67d6ec444962e5cc0136243c16a453190c0.

llvm-svn: 345010
2018-10-23 08:14:53 +00:00
Lama Saba 7d9b3a682e [MachinePipeliner] Split MachinePipeliner code into header and cpp files
Split MachinePipeliner code into header and cpp files to allow inheritance from SwingSchedulerDAG

Differential Revision: https://reviews.llvm.org/D53477

llvm-svn: 345008
2018-10-23 07:58:41 +00:00
Sylvestre Ledru f4719c4d2c Add support for GNU Hurd in Path.inc and other places
Summary: Patch by Svante Signell & myself

Reviewers: rnk, JDevlieghere, efriedma

Reviewed By: efriedma

Subscribers: efriedma, JDevlieghere, krytarowski, llvm-commits, kristina

Differential Revision: https://reviews.llvm.org/D53409

llvm-svn: 345007
2018-10-23 07:13:47 +00:00
Craig Topper f50f086743 [X86] Regenerate test checks to show fma comments. NFC
llvm-svn: 344999
2018-10-23 04:18:08 +00:00
Lang Hames 776f1d50c8 [RuntimeDyld][COFF] Skip non-loaded sections when calculating ImageBase.
Non-loaded sections (whose unused load-address defaults to zero) should not
be taken into account when calculating ImageBase, or ImageBase will be
incorrectly set to 0.

Patch by Andrew Scheidecker. Thanks Andrew!

https://reviews.llvm.org/D51343

+        // The Sections list may contain sections that weren't loaded for
+        // whatever reason: they may be debug sections, and ProcessAllSections
+        // is false, or they may be sections that contain 0 bytes. If the
+        // section isn't loaded, the load address will be 0, and it should not
+        // be included in the ImageBase calculation.

llvm-svn: 344995
2018-10-23 01:36:33 +00:00
Lang Hames 3d16af69cf [ORC] Show JITDylib search order in JITDylib::dump.
This can be helpful in debugging search-order related failures.

llvm-svn: 344994
2018-10-23 01:36:32 +00:00
Lang Hames 1aa3292a43 [ORC] Dump flags for JITDylib symbol table entries.
This can help when debugging flag-specific symbol table issues.

llvm-svn: 344993
2018-10-23 01:36:31 +00:00
Kostya Serebryany af95597c3c [hwasan] add stack frame descriptions.
Summary:
At compile-time, create an array of {PC,HumanReadableStackFrameDescription}
for every function that has an instrumented frame, and pass this array
to the run-time at the module-init time.
Similar to how we handle pc-table in SanitizerCoverage.
The run-time is dummy, will add the actual logic in later commits.

Reviewers: morehouse, eugenis

Reviewed By: eugenis

Subscribers: srhines, llvm-commits, kubamracek

Differential Revision: https://reviews.llvm.org/D53227

llvm-svn: 344985
2018-10-23 00:50:40 +00:00
Jonas Devlieghere 18028f9d43 [dsymutil] Improve error reporting when we cannot create output file.
Before this patch we were returning an empty string in case we couldn't
create the output file. Now we return an expected string so we can
return and print the proper issue. We now return errors instead of bools
and defer printing to the call site.

llvm-svn: 344983
2018-10-23 00:32:22 +00:00
Heejin Ahn a40303aa03 [WebAssembly] Fix assembly printing of br_table
Summary: In `br_table's stack version asm string, \t was missing.

Reviewers: aardappel

Subscribers: dschuff, sbc100, jgravelle-google, sunfish, llvm-commits

Differential Revision: https://reviews.llvm.org/D53516

llvm-svn: 344981
2018-10-23 00:28:14 +00:00
Wouter van Oortmerssen a569c20587 [WebAssembly] Added test for inline assembly roundtrip.
Summary:
Due to previous work to make WebAssembly MC by default stack-only
inline assembly now "just works" (previously it didn't since it had
no way to know types of registers), so no further work required.

So far we only have tests (in inline-asm.ll) which test with
non-existing instructions, so this adds a test that roundtrips
both the inline assembly and its surrounding code thru the assembler.

Reviewers: dschuff, sunfish

Subscribers: sbc100, jgravelle-google, eraman, aheejin, llvm-commits

Differential Revision: https://reviews.llvm.org/D52914

llvm-svn: 344977
2018-10-23 00:12:49 +00:00
Saleem Abdulrasool 96cd3cc312 X86: fix a comment copy-paste issue (NFC)
The comment was copy-pasted but not updated.  NFC.

llvm-svn: 344973
2018-10-22 23:34:24 +00:00
Craig Topper 96889b8b96 [X86] Remove unused entries from the X86ProcFamily enum. Add a note to discourage creation of new enum entries.
As we've learned multiple times, a coarse grained enum like this is not scalable and we should be migrating away from it.

llvm-svn: 344972
2018-10-22 23:14:55 +00:00
Leonard Chan 0acfc6be38 [Intrinsic] Unigned Saturation Addition Intrinsic
Add an intrinsic that takes 2 integers and perform unsigned saturation
addition on them.

This is a part of implementing fixed point arithmetic in clang where some of
the more complex operations will be implemented as intrinsics.

Differential Revision: https://reviews.llvm.org/D53340

llvm-svn: 344971
2018-10-22 23:08:40 +00:00
Matthias Braun a0beeffeed X86: Do not optimize branches with undef eflags inputs
analyzeBranch()/insertBranch() etc. do not properly deal with an undef
flag on the eflags input and used to produce invalid MIR.  I don't see
this ever affecting real world inputs (I don't think it is possible to
produce undef flags with llvm IR), so I simply changed the code to bail
out in this case.

rdar://42122367

llvm-svn: 344970
2018-10-22 22:52:23 +00:00
Sanjay Patel 767625400d [Reassociate] remove bogus tests; NFC
I was trying to provide test coverage for D53533 
with rL344964, but these don't do it...and I don't
think they add any value, so deleting.

llvm-svn: 344969
2018-10-22 22:50:27 +00:00
Reid Kleckner 3d5c2e648c [MC] Shrink MCAsmParser by grouping bools, add const, NFC
I was considering adding another boolean here. I standardized on bools
since they allow default member initializers in the class definition.
This makes ShowParsedOperands protected instead of private, but that's
probably fine.

Reduce the SmallVector size while we're at it, since the common case is
that there is never a pending error.

llvm-svn: 344967
2018-10-22 22:29:09 +00:00
Simon Pilgrim 8c3d87b8cf [ARM] Regenerate reverse shuffle costs
Came about while cleaning up general shuffle costs for PR39368

llvm-svn: 344966
2018-10-22 22:26:00 +00:00
Craig Topper c8e183f9ee Recommit r344877 "[X86] Stop promoting integer loads to vXi64"
I've included a fix to DAGCombiner::ForwardStoreValueToDirectLoad that I believe will prevent the previous miscompile.

Original commit message:

Theoretically this was done to simplify the amount of isel patterns that were needed. But it also meant a substantial number of our isel patterns have to match an explicit bitcast. By making the vXi32/vXi16/vXi8 types legal for loads, DAG combiner should be able to change the load type to rem

I had to add some additional plain load instruction patterns and a few other special cases, but overall the isel table has reduced in size by ~12000 bytes. So it looks like this promotion was hurting us more than helping.

I still have one crash in vector-trunc.ll that I'm hoping @RKSimon can help with. It seems to relate to using getTargetConstantFromNode on a load that was shrunk due to an extract_subvector combine after the constant pool entry was created. So we end up decoding more mask elements than the lo

I'm hoping this patch will simplify the number of patterns needed to remove the and/or/xor promotion.

Reviewers: RKSimon, spatel

Reviewed By: RKSimon

Subscribers: llvm-commits, RKSimon

Differential Revision: https://reviews.llvm.org/D53306

llvm-svn: 344965
2018-10-22 22:14:05 +00:00
Sanjay Patel 21a62e23d8 [Reassociate] add vector tests with undef elements; NFC
Also, regenerate checks for these files. We should do better
on the vector tests by using the PatternMatch API instead of
BinaryOperator::isNot/isNeg.

llvm-svn: 344964
2018-10-22 22:04:13 +00:00
Thomas Lively c63b5fcb2a [WebAssembly][NFC] Remove WebAssemblyStackifier TableGen backend
Summary:
Replace its functionality with a TableGen InstrInfo relational
instruction mapping. Although arguably more complex than the TableGen
backend, the relational mapping is a smaller maintenance burden than a
TableGen backend.

Reviewers: aardappel, aheejin, dschuff

Subscribers: mgorny, sbc100, jgravelle-google, sunfish, llvm-commits

Differential Revision: https://reviews.llvm.org/D53307

llvm-svn: 344962
2018-10-22 21:55:26 +00:00
Vedant Kumar 74533bd3b8 [DWARF] Use a function-local offset for AT_call_return_pc
Logs provided by @stella.stamenova indicate that on Linux, lldb adds a
spurious slide offset to the return PC it loads from AT_call_return_pc
attributes (see the list thread: "[PATCH] D50478: Add support for
artificial tail call frames").

This patch side-steps the issue by getting rid of the load address
calculation in lldb's CallEdge::GetReturnPCAddress.

The idea is to have the DWARF writer emit function-local offsets to the
instruction after a call. I.e. return-pc = label-after-call-insn -
function-entry. LLDB can simply add this offset to the base address of a
function to get the return PC.

Differential Revision: https://reviews.llvm.org/D53469

llvm-svn: 344960
2018-10-22 21:44:21 +00:00
Sanjay Patel dd1c3df72d [Reassociate] add 'using namespace' to reduce bloat; NFC
llvm-svn: 344959
2018-10-22 21:37:02 +00:00
Lang Hames 95abadec0b [ORC] Guard access to the MemMgrs vector in RTDyldObjectLinkingLayer.
Otherwise we can end up with a data-race when linking concurrently.

This should fix an intermittent failure in the multiple-compile-threads-basic.ll
testcase.

llvm-svn: 344956
2018-10-22 21:17:56 +00:00
Sanjay Patel fb41544af8 [x86] add test for PR25498 and complete checks; NFC
Might as well test the actual codegen instead of just the absence of crashing.

llvm-svn: 344955
2018-10-22 21:11:15 +00:00
Tim Northover a23c12a627 X86: add alias for pushfw/popfw in Intel mode
A while ago we changed pushf and popf in Intel mode to generate pushfq
and popfq. Unfortunately that left us with no way to get the 16-bit
encoding in Intel mode so this patch adds pushfw and popfw as aliases
there.

llvm-svn: 344949
2018-10-22 20:38:13 +00:00
Justin Bogner 912adfba7e Reapply "[MachineCopyPropagation] Reimplement CopyTracker in terms of register units"
Recommits r342942, which was reverted in r343189, with a fix for an
issue where we would propagate unsafely if we defined only the upper
part of a register.

Original message:

  Change the copy tracker to keep a single map of register units
  instead of 3 maps of registers. This gives a very significant
  compile time performance improvement to the pass. I measured a
  30-40% decrease in time spent in MCP on x86 and AArch64 and much
  more significant improvements on out of tree targets with more
  registers.

llvm-svn: 344942
2018-10-22 19:51:31 +00:00
Teresa Johnson f431a2f261 [hot-cold-split] Add opt remark on success
Summary: Emit optimization remark on successful hot cold split.

Reviewers: sebpop, hiraditya

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D53512

llvm-svn: 344938
2018-10-22 19:06:42 +00:00
Simon Pilgrim 3b91e9676b Revert rL344931 from llvm/trunk: [X86][SSE] getTargetShuffleMaskIndices - allow opt-in support for whole undef shuffle mask elements
We can't safely assume that certain RawMask entries are UNDEF as most variable shuffles ignore non-index bits - PSHUFB only works on i8 elts so it'd be safe to use but I'm intending to come up with an alternative approach that works for all.
........
Enable this for PSHUFB constant mask decoding and remove the ConstantPool DecodePSHUFBMask

llvm-svn: 344937
2018-10-22 19:01:25 +00:00
Simon Pilgrim 794f85cd93 Revert rL344933 from llvm/trunk: [X86][SSE] Tidyup DecodeVPERMILPMask shuffle mask decoding
We can't safely assume that certain RawMask entries are UNDEF as most variable shuffles ignore non-index bits.
........
Add support for UNDEF raw mask elements and remove the ConstantPool DecodeVPERMILPMask usage in X86ISelLowering.cpp

llvm-svn: 344936
2018-10-22 18:58:32 +00:00
Aaron Ballman 036e54d32e Revert r344930 as it broke some of the bots on Windows.
http://lab.llvm.org:8011/builders/clang-x64-windows-msvc/builds/739

llvm-svn: 344935
2018-10-22 18:51:29 +00:00
Simon Pilgrim 476c9f42fc [X86][SSE] Tidyup DecodeVPERMILPMask shuffle mask decoding
Add support for UNDEF raw mask elements and remove the ConstantPool DecodeVPERMILPMask usage in X86ISelLowering.cpp

llvm-svn: 344933
2018-10-22 18:35:13 +00:00
Simon Pilgrim 3521367ff3 [X86][SSE] getTargetShuffleMaskIndices - allow opt-in support for whole undef shuffle mask elements
Enable this for PSHUFB constant mask decoding and remove the ConstantPool DecodePSHUFBMask

llvm-svn: 344931
2018-10-22 18:09:02 +00:00
Joel E. Denny 279f8a44a1 [SourceMgr][FileCheck] Obey -color by extending WithColor
While this change specifically targets FileCheck, it affects any tool
using the same SourceMgr facilities.

Previously, -color was documented in FileCheck's -help output, but
-color had no effect.  Now, -color obeys its documentation: it forces
colors to be used in FileCheck diagnostics even when stderr is not a
terminal.

-color is especially helpful when combined with FileCheck's -v, which
can produce a long series of diagnostics that you might wish to pipe
to a pager, such as less -R.  The WithColor extensions here will also
help to clean up color usage in FileCheck's annotated dump of input,
which is proposed in D52999.

Reviewed By: JDevlieghere

Differential Revision: https://reviews.llvm.org/D53419

llvm-svn: 344930
2018-10-22 18:00:49 +00:00
Teresa Johnson 16ce43a2e1 [hot-cold-split] Add missing FileCheck invocations
Summary:
r344558 added some CHECK statements to split-cold-2.ll, but didn't add
any invocations of FileCheck. Add those here.

Reviewers: sebpop

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D53505

llvm-svn: 344928
2018-10-22 17:57:02 +00:00
Fangrui Song a342834b24 [llvm-exegesis] Fix name lookup ambiguity in MSVC after 344922
llvm-svn: 344927
2018-10-22 17:52:31 +00:00
Simon Pilgrim 5dff767c25 [X86] getTargetConstantBitsFromNode - handle extraction from larger constant pool entries
First step towards removing X86ShuffleDecodeConstantPool usage from X86ISelLowering.cpp

llvm-svn: 344924
2018-10-22 17:43:33 +00:00
Fangrui Song 32401afd8c [llvm-exegesis] Move namespace exegesis inside llvm::
Summary:
This allows simplifying references of llvm::foo with foo when the needs
come in the future.

Reviewers: courbet, gchatelet

Reviewed By: gchatelet

Subscribers: javed.absar, tschuett, llvm-commits

Differential Revision: https://reviews.llvm.org/D53455

llvm-svn: 344922
2018-10-22 17:10:47 +00:00
Craig Topper 8d8dcfe690 Revert r344877 "[X86] Stop promoting integer loads to vXi64"
Sam McCall reported miscompiles in some tensorflow code. Reverting while I try to figure out.

llvm-svn: 344921
2018-10-22 16:59:24 +00:00
Vedant Kumar ba88ad35ec [test] Relax test/Other/opt-hot-cold-split.ll
On some ARM bots, 'Target Pass Configuration' does not run after 'Target
Transform Info'. Relax this pipeline test to allow that.

This is the same fix as in r328167.

Bot URL: http://lab.llvm.org:8011/builders/clang-cmake-armv7-quick/builds/4611

llvm-svn: 344919
2018-10-22 16:50:24 +00:00
Andrea Di Biagio db158be646 [llvm-mca] Remove a couple of using directives and a bunch of redundant namespace llvm prefixes. NFC
llvm-svn: 344916
2018-10-22 16:28:07 +00:00
Matt Arsenault 687ec75d10 DAG: Change behavior of fminnum/fmaxnum nodes
Introduce new versions that follow the IEEE semantics
to help with legalization that may need quieted inputs.

There are some regressions from inserting unnecessary
canonicalizes when these are matched from fast math
fcmp + select which should be fixed in a future commit.

llvm-svn: 344914
2018-10-22 16:27:27 +00:00
Zachary Turner b96181c2bf Some cleanups to the native pdb plugin [NFC].
This is mostly some cleanup done in the process of implementing
some basic support for types.  I tried to split up the patch a
bit to get some of the NFC portion of the patch out into a separate
commit, and this is the result of that.  It moves some code around,
deletes some spurious namespace qualifications, removes some
unnecessary header includes, forward declarations, etc.

llvm-svn: 344913
2018-10-22 16:19:07 +00:00