llvm-project

Commit Graph

Author	SHA1	Message	Date
Saleem Abdulrasool	aff96d907b	X86: treat SwiftCC as Win64_CC on Win64 The Swift CC is identical to Win64 CC with the exception of swift error being passed in r12 which is a CSR. However, since this calling convention is only used in swift -> swift code, it does not impact interoperability and can be treated entirely as Win64 CC. We would previously incorrectly lower the frame setup as we did not treat the frame as conforming to Win64 specifications. llvm-svn: 313813	2017-09-20 21:00:40 +00:00
Matt Arsenault	76935122cc	AMDGPU: Start selecting v_mad_mixlo_f16 Also add some tests that should be able to use v_mad_mixhi_f16, but do not yet. This is trickier because we don't really model the partial update of the register done by 16-bit instructions. llvm-svn: 313806	2017-09-20 20:28:39 +00:00
Matt Arsenault	644883ff07	AMDGPU: Fix encoding of op_sel for mad_mix* opcodes llvm-svn: 313797	2017-09-20 19:09:28 +00:00
Sam Clegg	d95ed959d8	Reland "[WebAssembly] Add support for naming wasm data segments" Add adds support for naming data segments. This is useful useful linkers so that they can merge similar sections. Differential Revision: https://reviews.llvm.org/D37886 llvm-svn: 313795	2017-09-20 19:03:35 +00:00
Saleem Abdulrasool	432b88e5f4	CodeGen: support SwiftError SwiftCC on Windows x64 Add support for passing SwiftError through a register on the Windows x64 calling convention. This allows the use of swifterror attributes on parameters which is used by the swift front end for the `Error` parameter. This partially enables building the swift standard library for Windows x86_64. llvm-svn: 313791	2017-09-20 18:40:59 +00:00
Marek Sokolowski	c2189b8311	[llvm-readobj] Teach readobj to dump .res files (WindowsResource). This enables readobj to output Windows resource files (.res). This way, we'll be able to test .res outputs without comparing them byte-by-byte with "magic binary files" generated by MS toolchain. Differential Revision: https://reviews.llvm.org/D38058 llvm-svn: 313790	2017-09-20 18:33:35 +00:00
Reid Kleckner	4e04028791	Re-land "[DebugInfo] Insert DW_OP_deref when spilling indirect DBG_VALUEs" After r313775, it's easier to maintain a parallel BitVector of spilled locations indexed by location number. I wasn't able to build a good reduced test case for this iteration of the bug, but I added a more direct assertion that spilled values must use frame index locations. If this bug reappears, it won't only fire on the NEON vector code that we detected it on, but on medium-sized integer-only programs as well. llvm-svn: 313786	2017-09-20 18:19:08 +00:00
Hans Wennborg	57c3341ada	Revert r313771 "[SLP] Vectorize jumbled memory loads." This broke the buildbots, e.g. http://bb.pgr.jp/builders/test-llvm-i686-linux-RA/builds/391 > Summary: > This patch tries to vectorize loads of consecutive memory accesses, accessed > in non-consecutive or jumbled way. An earlier attempt was made with patch D26905 > which was reverted back due to some basic issue with representing the 'use mask' > jumbled accesses. > > This patch fixes the mask representation by recording the 'use mask' in the usertree entry. > > Change-Id: I9fe7f5045f065d84c126fa307ef6ebe0787296df > > Subscribers: mzolotukhin > > Reviewed By: ayal > > Differential Revision: https://reviews.llvm.org/D36130 > > Review comments updated accordingly > > Change-Id: I22ab0a8a9bac9d49d74baa81a08e1e486f5e75f0 > > Added a TODO for sortLoadAccesses API > > Change-Id: I3c679bf1865422d1b45e17ea28f1992bca660b58 > > Modified the TODO for sortLoadAccesses API > > Change-Id: Ie64a66cb5f9e2a7610438abb0e750c6e090f9565 > > Review comment update for using OpdNum to insert the mask in respective location > > Change-Id: I016d0c1b29874e979efc0205bbf078991f92edce > > Fixes '-Wsign-compare warning' in LoopAccessAnalysis.cpp and code rebase > > Change-Id: I64b2ea5e68c1d7b6a028f5ef8251c5a97333f89b llvm-svn: 313781	2017-09-20 18:00:03 +00:00
Adrian Prantl	d3f9f2138d	llvm-dwarfdump: implement --recurse-depth=<N> This patch implements the Darwin dwarfdump option --recurse-depth=<N>, which limits the recursion depth when selectively printing DIEs at an offset. Differential Revision: https://reviews.llvm.org/D38064 llvm-svn: 313778	2017-09-20 17:44:00 +00:00
Quentin Colombet	aa103b3d86	[InstCombine] Add select simplifications In these cases, two selects have constant selectable operands for both the true and false components and have the same conditional expression. We then create two arithmetic operations of the same type and feed a final select operation using the result of the true arithmetic for the true operand and the result of the false arithmetic for the false operand and reuse the original conditionl expression. The arithmetic operations are naturally folded as a consequence, leaving only the newly formed select to replace the old arithmetic operation. Patch by: Michael Berg <michael_c_berg@apple.com> Differential Revision: https://reviews.llvm.org/D37019 llvm-svn: 313774	2017-09-20 17:32:16 +00:00
Jake Ehrlich	a45afd50d4	Reland "[llvm-objcopy] Add support for .dynamic, .dynsym, and .dynstr" I did not upload two binaries that I reference in tests. This change adds support for sections involved in dynamic loading such as SHT_DYNAMIC, SHT_DYNSYM, and allocated string tables. The two added binaries used for tests can be downloaded here and here Differential Revision: https://reviews.llvm.org/D36560 llvm-svn: 313772	2017-09-20 17:22:06 +00:00
Mohammad Shahid	2b281de576	[SLP] Vectorize jumbled memory loads. Summary: This patch tries to vectorize loads of consecutive memory accesses, accessed in non-consecutive or jumbled way. An earlier attempt was made with patch D26905 which was reverted back due to some basic issue with representing the 'use mask' jumbled accesses. This patch fixes the mask representation by recording the 'use mask' in the usertree entry. Change-Id: I9fe7f5045f065d84c126fa307ef6ebe0787296df Subscribers: mzolotukhin Reviewed By: ayal Differential Revision: https://reviews.llvm.org/D36130 Review comments updated accordingly Change-Id: I22ab0a8a9bac9d49d74baa81a08e1e486f5e75f0 Added a TODO for sortLoadAccesses API Change-Id: I3c679bf1865422d1b45e17ea28f1992bca660b58 Modified the TODO for sortLoadAccesses API Change-Id: Ie64a66cb5f9e2a7610438abb0e750c6e090f9565 Review comment update for using OpdNum to insert the mask in respective location Change-Id: I016d0c1b29874e979efc0205bbf078991f92edce Fixes '-Wsign-compare warning' in LoopAccessAnalysis.cpp and code rebase Change-Id: I64b2ea5e68c1d7b6a028f5ef8251c5a97333f89b llvm-svn: 313771	2017-09-20 17:19:57 +00:00
Jake Ehrlich	e5d424b8dc	Reland "[llvm-objcopy] Add support for .dynamic, .dynsym, and .dynstr" I overzealously landed this before I was sure that another change wouldn't break the build that this change depends on. This change adds support for sections involved in dynamic loading such as SHT_DYNAMIC, SHT_DYNSYM, and allocated string tables. The two added binaries used for tests can be downloaded here and here Differential Revision: https://reviews.llvm.org/D36560 llvm-svn: 313767	2017-09-20 17:11:58 +00:00
Teresa Johnson	f625118ec7	[ThinLTO] Fix dead stripping analysis for SamplePGO Summary: The fix for dead stripping analysis in the case of SamplePGO indirect calls to local functions (r313151) introduced the possibility of an infinite loop. Make sure we check for the value being already live after we update it for SamplePGO indirect call handling. Reviewers: danielcdh Subscribers: mehdi_amini, inglorion, llvm-commits, eraman Differential Revision: https://reviews.llvm.org/D38086 llvm-svn: 313766	2017-09-20 17:09:47 +00:00
David Blaikie	1d5d44ff05	DebugInfo: Remove unneeded attributes from test/DebugInfo/Generic/imported-name-inlined.ll Remove unneeded attributes from test/DebugInfo/Generic/imported-name-inlined.ll because it was causing failures on pure MIPS builds. Patch by Miloš Stojanović! Differential Revision: https://reviews.llvm.org/D38079 llvm-svn: 313762	2017-09-20 15:59:57 +00:00
Simon Atanasyan	8bdbb29524	[mips] Add a valid test case to check the reason of the recent build-bot failure. NFC llvm-svn: 313761	2017-09-20 15:57:25 +00:00
Alexander Kornienko	6a140234ed	Revert r313736: "[SLP] Vectorize jumbled memory loads." The revision breaks buildbots: http://lab.llvm.org:8011/builders/clang-x86_64-debian-fast/builds/6694/steps/test/logs/stdio llvm-svn: 313758	2017-09-20 14:53:07 +00:00
Simon Pilgrim	d202ad15c1	[X86][SSE] Add PR22415 test case llvm-svn: 313755	2017-09-20 13:49:52 +00:00
Florian Hahn	ceb4494786	Recommit [MachineCombiner] Update instruction depths incrementally for large BBs. This version of the patch fixes an off-by-one error causing PR34596. We do not need to use std::next(BlockIter) when calling updateDepths, as BlockIter already points to the next element. Original commit message: > For large basic blocks with lots of combinable instructions, the > MachineTraceMetrics computations in MachineCombiner can dominate the compile > time, as computing the trace information is quadratic in the number of > instructions in a BB and it's relevant successors/predecessors. > In most cases, knowing the instruction depth should be enough to make > combination decisions. As we already iterate over all instructions in a basic > block, the instruction depth can be computed incrementally. This reduces the > cost of machine-combine drastically in cases where lots of instructions > are combined. The major drawback is that AFAIK, computing the critical path > length cannot be done incrementally. Therefore we only compute > instruction depths incrementally, for basic blocks with more > instructions than inc_threshold. The -machine-combiner-inc-threshold > option can be used to set the threshold and allows for easier > experimenting and checking if using incremental updates for all basic > blocks has any impact on the performance. > > Reviewers: sanjoy, Gerolf, MatzeB, efriedma, fhahn > > Reviewed By: fhahn > > Subscribers: kiranchandramohan, javed.absar, efriedma, llvm-commits > > Differential Revision: https://reviews.llvm.org/D36619 llvm-svn: 313751	2017-09-20 11:54:37 +00:00
George Rimar	0eb2f30b0b	Revert r313746 "[yaml2obj] - Don't crash on invalid document." It broke BB: http://lab.llvm.org:8011/builders/llvm-hexagon-elf/builds/9781 llvm-svn: 313748	2017-09-20 10:24:37 +00:00
George Rimar	cefe7e1142	[yaml2obj] - Don't crash on invalid document. Previously jaml2obj would segfault on empty document. (without yaml description). Patch fixes the issue. Differential revision: https://reviews.llvm.org/D38036 llvm-svn: 313746	2017-09-20 09:57:11 +00:00
Mikael Holmen	06064d1bac	[IfConversion] Add testcases [NFC] These tests should have been included in r310697 / D34099 but apparently I missed them. llvm-svn: 313737	2017-09-20 08:23:29 +00:00
Mohammad Shahid	f8db9bd857	[SLP] Vectorize jumbled memory loads. Summary: This patch tries to vectorize loads of consecutive memory accesses, accessed in non-consecutive or jumbled way. An earlier attempt was made with patch D26905 which was reverted back due to some basic issue with representing the 'use mask' of jumbled accesses. This patch fixes the mask representation by recording the 'use mask' in the usertree entry. Change-Id: I9fe7f5045f065d84c126fa307ef6ebe0787296df Reviewers: mkuper, loladiro, Ayal, zvi, danielcdh Reviewed By: Ayal Subscribers: mzolotukhin Differential Revision: https://reviews.llvm.org/D36130 Commit after rebase for patch D36130 Change-Id: I8add1c265455669ef288d880f870a9522c8c08ab llvm-svn: 313736	2017-09-20 08:18:28 +00:00
Andrew V. Tischenko	92980ce6aa	'into' instruction should not be decoded as a valid instr in 64-bit mode llvm-svn: 313735	2017-09-20 08:17:17 +00:00
Matt Arsenault	b81495dccb	AMDGPU: Match load d16 hi instructions Also starts selecting global loads for constant address in some cases. Some end up selecting to mubuf still, which requires investigation. We still get sub-optimal regalloc and extra waitcnts inserted due to not really tracking the liveness of the separate register halves. llvm-svn: 313716	2017-09-20 05:01:53 +00:00
Stanislav Mekhanoshin	5670e6d482	[AMDGPU] Port of HSAIL inliner Differential Revision: https://reviews.llvm.org/D36849 llvm-svn: 313714	2017-09-20 04:25:58 +00:00
Matt Arsenault	fcc213fab7	AMDGPU: Match store d16_hi instructions llvm-svn: 313712	2017-09-20 03:20:09 +00:00
Sanjoy Das	09613b122e	Tighten the invariants around LoopBase::invalidate Summary: With this change: - Methods in LoopBase trip an assert if the receiver has been invalidated - LoopBase::clear frees up the memory held the LoopBase instance This change also shuffles things around as necessary to work with this stricter invariant. Reviewers: chandlerc Subscribers: mehdi_amini, mcrosier, llvm-commits Differential Revision: https://reviews.llvm.org/D38055 llvm-svn: 313708	2017-09-20 02:31:57 +00:00
Mike Edwards	b487bf45f0	Reverting due to Green Dragon bot failure. http://green.lab.llvm.org/green/job/clang-stage1-cmake-RA-incremental/42594/ llvm-svn: 313706	2017-09-20 01:21:02 +00:00
Quentin Colombet	d652aeb144	[MIRPrinter] Print empty successor lists when they cannot be guessed This re-applies commit r313685, this time with the proper updates to the test cases. Original commit message: Unreachable blocks in the machine instr representation are these weird empty blocks with no successors. The MIR printer used to not print empty lists of successors. However, the MIR parser now treats non-printed list of successors as "please guess it for me". As a result, the parser tries to guess the list of successors and given the block is empty, just assumes it falls through the next block (if any). For instance, the following test case used to fail the verifier. The MIR printer would print entry / \ true (def) false (no list of successors) \| split.true (use) The MIR parser would understand this: entry / \ true (def) false \| / <-- invalid edge split.true (use) Because of the invalid edge, we get the "def does not dominate all uses" error. The fix consists in printing empty successor lists, so that the parser knows what to do for unreachable blocks. rdar://problem/34022159 llvm-svn: 313696	2017-09-19 23:34:12 +00:00
Sam Clegg	b292c25966	[WebAssembly] Add support for naming wasm data segments Add adds support for naming data segments. This is useful useful linkers so that they can merge similar sections. Differential Revision: https://reviews.llvm.org/D37886 llvm-svn: 313692	2017-09-19 23:00:57 +00:00
Quentin Colombet	6888dbcda7	Revert "[MIRPrinter] Print empty successor lists when they cannot be guessed" This reverts commit r313685. I thought I had ran ninja check, but apparently I didn't... Need to update a bunch of mir tests. llvm-svn: 313686	2017-09-19 22:03:50 +00:00
Quentin Colombet	7fdaa5e641	[MIRPrinter] Print empty successor lists when they cannot be guessed Unreachable blocks in the machine instr representation are these weird empty blocks with no successors. The MIR printer used to not print empty lists of successors. However, the MIR parser now treats non-printed list of successors as "please guess it for me". As a result, the parser tries to guess the list of successors and given the block is empty, just assumes it falls through the next block (if any). For instance, the following test case used to fail the verifier. The MIR printer would print entry / \ true (def) false (no list of successors) \| split.true (use) The MIR parser would understand this: entry / \ true (def) false \| / <-- invalid edge split.true (use) Because of the invalid edge, we get the "def does not dominate all uses" error. The fix consists in printing empty successor lists, so that the parser knows what to do for unreachable blocks. rdar://problem/34022159 llvm-svn: 313685	2017-09-19 21:55:51 +00:00
Jake Ehrlich	d246b0a284	Reland "[llvm-objcopy] Add support for nested and overlapping segments" I didn't initialize a pointer to be nullptr that I needed to. This change adds support for nested and even overlapping segments. This means that PT_PHDR, PT_GNU_RELRO, PT_TLS, and PT_DYNAMIC can be supported properly. Differential Revision: https://reviews.llvm.org/D36558 llvm-svn: 313682	2017-09-19 21:37:35 +00:00
Jonathan Roelofs	85908aa84b	[ARM] Relax 'cpsie'/'cpsid' flag parsing. The ARM docs suggest in examples that the flags can have either case, and there are applications in the wild that (libopencm3, for example) that expect to be able to use the uppercase spelling. https://reviews.llvm.org/D37953 llvm-svn: 313680	2017-09-19 21:23:19 +00:00
Reid Kleckner	ffdf087499	Revert "[DebugInfo] Insert DW_OP_deref when spilling indirect DBG_VALUEs" This reverts r313640, originally r313400, one more time for essentially the same issue. My BitVector of spilled location numbers isn't working because we coalesce identical DBG_VALUE locations as we rewrite them, invalidating the location numbers used to index the BitVector. llvm-svn: 313679	2017-09-19 21:18:32 +00:00
Dehao Chen	62b9c33e1e	Import all inlined indirect call targets for SamplePGO. Summary: In the ThinLTO compilation, if a function is inlined in the profiling binary, we need to inline it before annotation. If the callee is not available in the primary module, a first step is needed to import that callee function. For the current implementation, if the call is an indirect call, which has been promoted to >1 targets and inlined, SamplePGO will only import one target with the largest sample count. This patch fixed the bug to import all targets instead. Reviewers: tejohnson, davidxl Reviewed By: tejohnson Subscribers: sanjoy, llvm-commits, mehdi_amini Differential Revision: https://reviews.llvm.org/D36637 llvm-svn: 313678	2017-09-19 21:18:14 +00:00
Adrian Prantl	ccc21c459b	llvm-dwarfdump: un-hide more command line options llvm-svn: 313673	2017-09-19 20:58:57 +00:00
Adrian Prantl	b780d909fe	Move test into non-target-specific directory. llvm-svn: 313672	2017-09-19 20:58:56 +00:00
Stanislav Mekhanoshin	d4ae470d2e	[AMDGPU] Prevent post-RA scheduler from breaking memory clauses The pre-RA scheduler does load/store clustering, but post-RA scheduler undoes it. Add mutation to prevent it. Differential Revision: https://reviews.llvm.org/D38014 llvm-svn: 313670	2017-09-19 20:54:38 +00:00
Ulrich Weigand	59a01a958a	[SystemZ] Fix truncstore + bswap codegen bug SystemZTargetLowering::combineSTORE contains code to transform a combination of STORE + BSWAP into a STRV type instruction. This transformation is correct for regular stores, but not for truncating stores. The routine neglected to check for that case. Fixes a miscompilation of llvm-objcopy with clang, which caused test suite failures in the SystemZ multistage build bot. llvm-svn: 313669	2017-09-19 20:50:05 +00:00
Saleem Abdulrasool	5c37d57e15	Revert "ExecutionEngine: add R_AARCH64_ABS{16,32}" This reverts commit SVN r313654. Seems that it is triggering an assertion on Windows specifically. Revert until I can build on Windows and look into what is happening there. llvm-svn: 313668	2017-09-19 20:35:25 +00:00
Jake Ehrlich	317782122c	Revert "[llvm-objcopy] Add support for .dynamic, .dynsym, and .dynstr" This reverts commit r313663. Broken because overlapping-sections was reverted. llvm-svn: 313665	2017-09-19 20:00:04 +00:00
Jake Ehrlich	8f108248ba	Revert "[llvm-objcopy] Add support for nested and overlapping segments" This reverts commit r313656. Appears to be broken on Windows. llvm-svn: 313664	2017-09-19 19:52:09 +00:00
Jake Ehrlich	f20c3f4333	[llvm-objcopy] Add support for .dynamic, .dynsym, and .dynstr This change adds support for sections involved in dynamic loading such as SHT_DYNAMIC, SHT_DYNSYM, and allocated string tables. The two added binaries used for tests can be downloaded [[ https://drive.google.com/file/d/0B3gtIAmiMwZXOXE3T0RobFg4ZTg/view?usp=sharing \| here ]] and [[ https://drive.google.com/file/d/0B3gtIAmiMwZXTFJSQUJZMGxNSXc/view?usp=sharing \| here ]] Differential Revision: https://reviews.llvm.org/D36560 llvm-svn: 313663	2017-09-19 19:21:09 +00:00
David Blaikie	610e03a009	Fix test to not depend on another subdirectories Input directory Inputs should be placed local to the test (or possibly in a common parent? I think we do that in some places - but the only common parent between these two directories is 'test' which seems a bit overly broad). llvm-svn: 313662	2017-09-19 19:20:08 +00:00
Jake Ehrlich	523560ef57	[llvm-objcopy] Add test to check that architecture specific values are not used on wrong architecture. This change adds a test that checks the an error is produced when a hexagon specific reserved section index is used but e_machine is not EM_HEXAGON. Differential Revision: https://reviews.llvm.org/D38017 llvm-svn: 313661	2017-09-19 19:05:15 +00:00
Dehao Chen	b6e60c8b80	Handle profile mismatch correctly for SamplePGO. Summary: Fix the bug when promoted call return type mismatches with the promoted function, we should not try to inline it. Otherwise it may lead to compiler crash. Reviewers: davidxl, tejohnson, eraman Reviewed By: tejohnson Subscribers: llvm-commits, sanjoy Differential Revision: https://reviews.llvm.org/D38018 llvm-svn: 313658	2017-09-19 18:26:54 +00:00
Reid Kleckner	26fa1bf4da	Re-land "Fix Bug 30978 by emitting cv file checksums." This reverts r313431 and brings back r313374 with a fix to write checksums as binary data and not ASCII hex strings. llvm-svn: 313657	2017-09-19 18:14:45 +00:00
Jake Ehrlich	0a84b1ac80	[llvm-objcopy] Add support for nested and overlapping segments This change adds support for nested and even overlapping segments. This means that PT_PHDR, PT_GNU_RELRO, PT_TLS, and PT_DYNAMIC can be supported properly. Differential Revision: https://reviews.llvm.org/D36558 llvm-svn: 313656	2017-09-19 18:14:03 +00:00
Saleem Abdulrasool	6567ecd741	ExecutionEngine: add R_AARCH64_ABS{16,32} Add support for the R_AARCH64_ABS{16,32} relocations in the execution engine. This is primarily used for DWARF debug information relocations and needed by the LLVM JIT to support JITing for lldb. Patch by Alex Langford! llvm-svn: 313654	2017-09-19 18:00:50 +00:00
Reid Kleckner	86c74dd791	Re-land r313400 "[DebugInfo] Insert DW_OP_deref when spilling indirect DBG_VALUEs" I forgot to zero out the BitVector when reusing it between UserValues. Later uses of the same location number for a different UserValue would falsely indicate that they were spilled. Usually this would lead to incorrect debug info, but in some cases they would indicate something nonsensical like a memory location based on a vector register (Q8 on ARM). llvm-svn: 313640	2017-09-19 16:32:15 +00:00
Tony Jiang	2d9c5f3b8b	[PowerPC Peephole] Constants into a join add, use ADDI over LI/ADD. Two blocks prior to the join each perform an li and the the join block has an add using the initialized register. Optimize each predecessor block to instead use addi and delete the li's and add. Differential Revision: https://reviews.llvm.org/D36734 llvm-svn: 313639	2017-09-19 16:14:37 +00:00
Evandro Menezes	0a98abc67c	[AArch64] Extend tests of loads and stores of register pairs Include instances of FP register pairs. llvm-svn: 313638	2017-09-19 15:46:35 +00:00
Tony Jiang	425071eff3	[Power9] Add missing Power9 instructions. The following 8 instructions are implemented in this patch. addpcis(subpcis, lnia), darn, maddhd, maddhdu, maddld, setb llvm-svn: 313636	2017-09-19 15:22:36 +00:00
Daniel Sanders	83e23d1398	[globalisel] Add a G_BSWAP instruction and support bswap using it. llvm-svn: 313633	2017-09-19 14:25:15 +00:00
Simon Pilgrim	d5e2878252	[X86][SSE] Add 'redundant pand' test case from PR34620 llvm-svn: 313632	2017-09-19 14:02:16 +00:00
Sanjay Patel	bd7958d7ca	[x86] regenerate checks; NFC llvm-svn: 313631	2017-09-19 13:43:09 +00:00
Alexey Bataev	10ad32b76d	[SLP] Reduce test, NFC. llvm-svn: 313630	2017-09-19 13:38:56 +00:00
Daniel Sanders	000327742f	[globalisel] Add support for intrinsic_void llvm-svn: 313629	2017-09-19 13:23:01 +00:00
Daniel Sanders	28887fe548	[globalisel] Add support for intrinsic_w_chain. This maps directly to G_INTRINSIC_W_SIDE_EFFECTS. llvm-svn: 313627	2017-09-19 12:56:36 +00:00
Jina Nahias	ccfb8d4fe8	[x86] Lowering Mask Set1 intrinsics to LLVM IR This patch, together with a matching clang patch (https://reviews.llvm.org/D37668), implements the lowering of X86 mask set1 intrinsics to IR. Differential Revision: https://reviews.llvm.org/D37669 llvm-svn: 313625	2017-09-19 11:03:06 +00:00
Roger Ferrer Ibanez	8d0180c955	[ARM] Use ADDCARRY / SUBCARRY This is a preparatory step for D34515. This change: - makes nodes ISD::ADDCARRY and ISD::SUBCARRY legal for i32 - lowering is done by first converting the boolean value into the carry flag using (_, C) ← (ARMISD::ADDC R, -1) and converted back to an integer value using (R, _) ← (ARMISD::ADDE 0, 0, C). An ARMISD::ADDE between the two operations does the actual addition. - for subtraction, given that ISD::SUBCARRY second result is actually a borrow, we need to invert the value of the second operand and result before and after using ARMISD::SUBE. We need to invert the carry result of ARMISD::SUBE to preserve the semantics. - given that the generic combiner may lower ISD::ADDCARRY and ISD::SUBCARRYinto ISD::UADDO and ISD::USUBO we need to update their lowering as well otherwise i64 operations now would require branches. This implies updating the corresponding test for unsigned. - add new combiner to remove the redundant conversions from/to carry flags to/from boolean values (ARMISD::ADDC (ARMISD::ADDE 0, 0, C), -1) → C - fixes PR34045 - fixes PR34564 Differential Revision: https://reviews.llvm.org/D35192 llvm-svn: 313618	2017-09-19 09:05:39 +00:00
Andrei Elovikov	142516b456	Test commit. llvm-svn: 313617	2017-09-19 07:56:20 +00:00
Matt Arsenault	e745d9963e	AMDGPU: Run internalize symbols at -O0 The relocations used for externally visible functions aren't supported, so the direct call emitted ends up hitting a linker error. llvm-svn: 313616	2017-09-19 07:40:11 +00:00
Gadi Haber	6f8fbf4b86	[X86][Skylake] Adding the scheduling information for the SkylakeClient target This patch adds the instruction scheduling information for the SkylakeClient (SKL) architecture target by adding the file X86SchedSkylakeClient.td located under the X86 Target. We used the scheduling information retrieved from the Skylake architects in order to create the file. The scheduling information includes latency, number of micro-Ops and used ports by each SKL instruction. The patch continues the scheduling replacement and insertion effort started with the SNB target in r307529 and r310792 and for HSW in r311879. Please expect some performance fluctuations due to code alignment effects. Reviewers: craig.topper, zvi, chandlerc, igorb, aymanmus, RKSimon, delena Differential Revision: https://reviews.llvm.org/D37294 llvm-svn: 313613	2017-09-19 06:19:27 +00:00
Craig Topper	a80949feb5	[X86] Add VPERMPD/VPERMQ and VPERMPS/VPERMD to the execution domain fixing table. llvm-svn: 313610	2017-09-19 04:39:55 +00:00
Vedant Kumar	b7fdaf2cd4	[llvm-cov] Make report metrics agree with line exec counts, fixes PR34615 Use the same logic as the line-oriented coverage view to determine the number of covered lines in a function. Fixes llvm.org/PR34615. llvm-svn: 313604	2017-09-19 02:00:12 +00:00
Vedant Kumar	ad8f637bd8	[Coverage] Use gap regions to select better line exec counts After clang started emitting deferred regions (r312818), llvm-cov has had a hard time picking reasonable line execuction counts. There have been one or two generic improvements in this area (e.g r310012), but line counts can still report coverage for whitespace instead of code (llvm.org/PR34612). To fix the problem: * Introduce a new region kind so that frontends can explicitly label gap areas. This is done by changing the encoding of the columnEnd field of MappingRegion. This doesn't substantially increase binary size, and makes it easy to maintain backwards-compatibility. * Don't set the line count to a count from a gap area, unless the count comes from a wrapped segment. * Don't highlight gap areas as uncovered. Fixes llvm.org/PR34612. llvm-svn: 313597	2017-09-18 23:37:28 +00:00
Vedant Kumar	27d1b14ab0	[llvm-cov] Repair a test. NFC. The checks with the MARKER prefix were not being run over the right input, because stderr was not redirected properly. llvm-svn: 313596	2017-09-18 23:37:27 +00:00
Yonghong Song	9ef85f0677	bpf: add inline-asm support Signed-off-by: Yonghong Song <yhs@fb.com> Acked-by: Alexei Starovoitov <ast@kernel.org> llvm-svn: 313593	2017-09-18 23:29:36 +00:00
Yi Kong	bb4b4eef61	[ThinLTO/gold] Implement ThinLTO cache pruning support Differential Revision: https://reviews.llvm.org/D37993 llvm-svn: 313592	2017-09-18 23:24:55 +00:00
Hans Wennborg	4de620ab3d	Revert r313400 "[DebugInfo] Insert DW_OP_deref when spilling indirect DBG_VALUEs" This caused asserts in Chromium. See http://crbug.com/766261 > Summary: > This comes up in optimized debug info for C++ programs that pass and > return objects indirectly by address. In these programs, > llvm.dbg.declare survives optimization, which causes us to emit indirect > DBG_VALUE instructions. The fast register allocator knows to insert > DW_OP_deref when spilling indirect DBG_VALUE instructions, but the > LiveDebugVariables did not until this change. > > This fixes part of PR34513. I need to look into why this doesn't work at > -O0 and I'll send follow up patches to handle that. > > Reviewers: aprantl, dblaikie, probinson > > Subscribers: qcolombet, hiraditya, llvm-commits > > Differential Revision: https://reviews.llvm.org/D37911 llvm-svn: 313589	2017-09-18 23:08:42 +00:00
Zachary Turner	d4401d354a	[lit] Update clang and lld to use new config helpers. NFC intended here, this only updates clang and lld's lit configs to use some helper functionality in the lit.llvm submodule. llvm-svn: 313579	2017-09-18 22:26:48 +00:00
Sanjay Patel	f31b1a00ea	[DAGCombiner] fold assertzexts separated by trunc If we have an AssertZext of a truncated value that has already been AssertZext'ed, we can assert on the wider source op to improve the zext-y knowledge: assert (trunc (assert X, i8) to iN), i1 --> trunc (assert X, i1) to iN This moves a fold from being Mips-specific to general combining, and x86 shows improvements. Differential Revision: https://reviews.llvm.org/D37017 llvm-svn: 313577	2017-09-18 22:05:35 +00:00
Sanjay Patel	709a804a0a	[InstCombine] auto-generate complete checks; NFC The code responsible for these transforms has the potential to add 2 instructions and break min/max patterns (PR33301). llvm-svn: 313575	2017-09-18 21:57:56 +00:00
Adrian Prantl	c2bc717028	llvm-dwarfdump: add a --show-parents options when selectively dumping DIEs. llvm-svn: 313567	2017-09-18 21:27:44 +00:00
Adrian Prantl	f077b5b885	Fix typo in testcase. llvm-svn: 313566	2017-09-18 21:27:42 +00:00
Konstantin Zhuravlyov	ca8946a376	AMDGPU: Start selecting s_xnor_{b32, b64} Differential Revision: https://reviews.llvm.org/D37981 llvm-svn: 313565	2017-09-18 21:22:45 +00:00
Sanjay Patel	7765c93be2	[DAG, x86] allow store merging before and after legalization (PR34217) rL310710 allowed store merging to occur after legalization to catch stores that are created late, but this exposes a logic hole seen in PR34217: https://bugs.llvm.org/show_bug.cgi?id=34217 We will miss merging stores if the target lowers vector extracts into target-specific operations. This patch allows store merging to occur both before and after legalization if the target chooses to get maximum merging. I don't think the potential regressions in the other tests are relevant. The tests are for correctness of weird IR constructs rather than perf tests, and I think those are still correct. Differential Revision: https://reviews.llvm.org/D37987 llvm-svn: 313564	2017-09-18 20:54:26 +00:00
Craig Topper	39cdb84560	[X86] Make sure we still emit zext for GR32 to GR64 when the source of the zext is AssertZext The AssertZext we might see in this case is only giving information about the lower 32 bits. It isn't providing information about the upper 32 bits. So we should emit a zext. This fixes PR28540. Differential Revision: https://reviews.llvm.org/D37729 llvm-svn: 313563	2017-09-18 20:49:13 +00:00
Alexey Bataev	286fe620ef	[SLP] Add a test for PR34635, NFC. llvm-svn: 313559	2017-09-18 19:33:30 +00:00
Sanjay Patel	74d12b5697	[x86] add tests for PR34217; NFC llvm-svn: 313548	2017-09-18 18:07:50 +00:00
Simon Pilgrim	4aa28b9730	[X86][AVX] Improve (i8 bitcast (v8i1 x)) handling for 256-bit vector compare results. As commented on D37849, AVX1 targets were missing a chance to use vmovmskps for v8f32/v8i32 results for bool vector bitcasts llvm-svn: 313547	2017-09-18 17:58:31 +00:00
Sanjay Patel	078d5d978c	[x86] regenerate checks; NFC llvm-svn: 313545	2017-09-18 17:33:47 +00:00
Manoj Gupta	7476f629ed	[LoopVectorizer] Add more testcases for PR33804. Summary: Add test cases when float <-> pointer types conversion is triggered in presence of load instructions. Reviewers: Ayal, srhines, mkuper, rengolin Reviewed By: rengolin Subscribers: javed.absar, llvm-commits Differential Revision: https://reviews.llvm.org/D37967 llvm-svn: 313544	2017-09-18 17:28:15 +00:00
Simon Pilgrim	0b21ef1fa3	[SelectionDAG] Add BITCAST handling to ComputeNumSignBits for splatted sign bits. For cases where we are BITCASTing to vectors of smaller elements, then if the entire source was a splatted sign (src's NumSignBits == SrcBitWidth) we can say that the dst's NumSignBit == DstBitWidth, as we're just splitting those sign bits across multiple elements. We could generalize this but at the moment the only use case I have is to peek through bitcasts to vector comparison results. Differential Revision: https://reviews.llvm.org/D37849 llvm-svn: 313543	2017-09-18 16:45:05 +00:00
Craig Topper	77d7f331dd	[X86] Fix two more places to prefer VPERMQ/PD over VPERM2X128 when AVX2 is enabled The shuffle combining and lowerVectorShuffleAsLanePermuteAndBlend were both still trying to use VPERM2XF128 for unary shuffles when AVX2 is enabled. VPERM2X128 takes two inputs meaning when we use it for a unary shuffle one of those inputs is left undefined creating a false dependency on whatever register gets allocated there. If we have VPERMQ/PD we should prefer those since they only have a single input. Differential Revision: https://reviews.llvm.org/D37947 llvm-svn: 313542	2017-09-18 16:39:49 +00:00
Sam Parker	3fa0ccffc6	[AArch64] Add V8_2aOps feature to Cortex-A55 and 75 Add the missing hardware features the ProcA55 and ProcA75 feature. These are already enabled via the target parser, but I had missed them in the backend. Differential Revision: https://reviews.llvm.org/D37974 llvm-svn: 313535	2017-09-18 14:46:14 +00:00
Sam Parker	71efbe4c68	[ARM] Implement isTruncateFree Implement the isTruncateFree hooks, lifted from AArch64, that are used by TargetTransformInfo. This allows simplifycfg to reduce the test case into a single basic block. Differential Revision: https://reviews.llvm.org/D37516 llvm-svn: 313533	2017-09-18 14:28:51 +00:00
Simon Pilgrim	00161c9961	[X86][SSE] Improve support for vselect(Cond, 0, X) -> ANDN(Cond, X) As discussed on PR28925 and D37849. Differential Revision: https://reviews.llvm.org/D37975 llvm-svn: 313532	2017-09-18 14:23:23 +00:00
Sjoerd Meijer	4e6df15962	[ARM] Fix for indexed dot product instruction descriptions The indexed dot product instructions only accept the lower 16 D-registers as the indexed register, but we were e.g. incorrectly accepting: vudot.u8 d16,d16,d18[0] Differential Revision: https://reviews.llvm.org/D37968 llvm-svn: 313531	2017-09-18 14:17:57 +00:00
Jonas Devlieghere	c0a758d8ab	[dwarfdump] Make .eh_frame an alias for .debug_frame This patch makes the `.eh_frame` extension an alias for `.debug_frame`. Up till now it was only possible to dump the section using objdump, but not with dwarfdump. Since the two are essentially interchangeable, we dump whichever of the two is present. As a workaround, this patch also adds parsing for 3 currently unimplemented CFA instructions: `DW_CFA_def_cfa_expression`, `DW_CFA_expression`, and `DW_CFA_val_expression`. Because I lack the required knowledge, I just parse the fields without actually creating the instructions. Finally, this also fixes the typo in the `.debug_frame` section name which incorrectly contained a trailing `s`. Differential revision: https://reviews.llvm.org/D37852 llvm-svn: 313530	2017-09-18 14:15:57 +00:00
Simon Pilgrim	360629d170	[X86][SSE] Add vselect with zero tests (PR28925) llvm-svn: 313529	2017-09-18 13:32:33 +00:00
Nikolai Bozhenov	84af99b3b1	[X86FixupBWInsts] More precise register liveness if no <imp-use> on MOVs. Summary: Subregister liveness tracking is not implemented for X86 backend, so sometimes the whole super register is said to be live, when only a subregister is really live. That might happen if the def and the use are located in different MBBs, see added fixup-bw-isnt.mir test. However, using knowledge of the specific instructions handled by the bw-fixup-pass we can get more precise liveness information which this change does. Reviewers: MatzeB, DavidKreitzer, ab, andrew.w.kaylor, craig.topper Reviewed By: craig.topper Subscribers: n.bozhenov, myatsina, llvm-commits, hiraditya Patch by Andrei Elovikov <andrei.elovikov@intel.com> Differential Revision: https://reviews.llvm.org/D37559 llvm-svn: 313524	2017-09-18 10:17:59 +00:00
Mohammed Agabaria	77cb080c2d	[X86][Codegen] adding masked gathers tests for avx2 related to patch: https://reviews.llvm.org/D35772 adding llvm gathers test before gathers codegen support. Differential Revision: https://reviews.llvm.org/D37800 llvm-svn: 313516	2017-09-18 06:49:54 +00:00
Dean Michael Berris	0f84a7d355	[XRay][tools] Support tail-call exits before we write them in the runtime Summary: This change adds support for explicit tail-exit records to be written by the XRay runtime. This lets us differentiate the tail exit records/events in the log, and allows us to treat those exit events especially in the future. For now we allow printing those out in YAML (and reading them in). Reviewers: kpw, pelikan Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D37964 llvm-svn: 313514	2017-09-18 06:08:46 +00:00
Craig Topper	a6054328e8	[X86] Teach the execution domain fixing tables to use movlhps inplace of unpcklpd for the packed single domain. MOVLHPS has a smaller encoding than UNPCKLPD in the legacy encodings. With VEX and EVEX encodings it doesn't matter. llvm-svn: 313509	2017-09-18 04:40:58 +00:00
Craig Topper	87f7381edf	[X86] Teach execution domain fixing to convert between FP and int unpack instructions. llvm-svn: 313508	2017-09-18 03:29:54 +00:00
Craig Topper	d4341920d5	[X86] Teach execution domain fixing to convert between VPERMILPS and VPSHUFD. llvm-svn: 313507	2017-09-18 03:29:47 +00:00
Craig Topper	ee6646d7de	[X86] Teach shuffle lowering to use MOVLHPS/MOVHLPS for lowering v4f32 unary shuffles with SSE1 only. llvm-svn: 313504	2017-09-17 22:36:41 +00:00
Craig Topper	6c221690a3	[X86] Add a couple more unary shuffles to the sse1 shuffle test. These can be implemented with movlhps and movhlps. llvm-svn: 313503	2017-09-17 22:36:39 +00:00
Jatin Bhateja	356e3e2c1d	Adding test cases for PR34629 & PR34634. Differential Revision: https://reviews.llvm.org/D37962 llvm-svn: 313490	2017-09-17 18:16:26 +00:00
Alex Bradbury	8ab4a9696a	[RISCV] Add support for disassembly This Disassembly support allows for 'round-trip' testing, and rv32i-valid.s has been updated appropriately. Differential Revision: https://reviews.llvm.org/D23567 llvm-svn: 313486	2017-09-17 14:36:28 +00:00
Alex Bradbury	6758ecb98c	[RISCV] Add support for all RV32I instructions This patch supports all RV32I instructions as described in the RISC-V manual. A future patch will add support for pseudoinstructions and other instruction expansions (e.g. 0-arg fence -> fence iorw, iorw). Differential Revision: https://reviews.llvm.org/D23566 llvm-svn: 313485	2017-09-17 14:27:35 +00:00
Igor Breger	f1d388a5c5	[GlobalISel][X86] Legalize i1 G_ADD/G_SUB/G_MUL/G_XOR/G_OR/G_AND instructions. llvm-svn: 313483	2017-09-17 11:34:17 +00:00
Igor Breger	0f382ccb68	[GlobalISel][X86] Use correct physical register in mir tests.NFC. llvm-svn: 313479	2017-09-17 08:30:42 +00:00
Igor Breger	21200ed7af	[GlobalISel][X86] G_FCONSTANT support. Summary: G_FCONSTANT support, port the implementation from X86FastIsel. Reviewers: zvi, delena, guyblank Reviewed By: delena Subscribers: rovka, llvm-commits, kristof.beyls Differential Revision: https://reviews.llvm.org/D37734 llvm-svn: 313478	2017-09-17 08:08:13 +00:00
Zachary Turner	6326d56195	[llvm-symbolizer] Fix coff-dwarf.test This was a bug in the test that was only exposed as a result of refactoring some code in lit configuration files. Previously, llvm's lit configuration would only set the target-windows feature if the system was also windows. Since cross-compilation is a thing, this isn't correct. target-windows should be set independently of system-windows. Adding to that bug, this particular test then checked for target-windows when it really meant "can I call a certain API on the host machine", which is what system-windows is for. Ultimately, this test only works if both the target and host are Windows, so I've updated the test to reflect that. llvm-svn: 313468	2017-09-16 19:01:04 +00:00
Zachary Turner	c3023d1bd1	Resubmit "Add a shared llvm.lit module that all test suites can use." There were some issues surrounding Py2 / Py3 compatibility, but I've now tested with both Py2 and Py3 and everything seems to work. llvm-svn: 313467	2017-09-16 18:46:21 +00:00
Adrian Prantl	597aa48d11	llvm-dwarfdump: support a --show-children option This will print all children of a DIE when selectively printing only one DIE at a given offset. llvm-svn: 313464	2017-09-16 17:28:00 +00:00
Adrian Prantl	099d7e452a	llvm-dwarfdump: Add support for -debug-types=<offset>. llvm-svn: 313463	2017-09-16 16:58:18 +00:00
George Rimar	762abff698	[llvm-readobj] - Teach tool to report error if some section is in multiple COMDAT groups at once. readelf tool reports an error when output contains the same section in multiple COMDAT groups. That can be useful. Path teaches llvm-readobj to do the same. Differential revision: https://reviews.llvm.org/D37567 llvm-svn: 313459	2017-09-16 14:29:51 +00:00
Sanjay Patel	65d6780703	[x86] enable storeOfVectorConstantIsCheap() target hook This allows vector-sized store merging of constants in DAGCombiner using the existing code in MergeConsecutiveStores(). All of the twisted logic that decides exactly what vector operations are legal and fast for each particular CPU are handled separately in there using the appropriate hooks. For the motivating tests in merge-store-constants.ll, we already produce the same vector code in IR via the SLP vectorizer. So this is just providing a backend backstop for code that doesn't go through that pass (-O1). More details in PR24449: https://bugs.llvm.org/show_bug.cgi?id=24449 (this change should be the last step to resolve that bug) Differential Revision: https://reviews.llvm.org/D37451 llvm-svn: 313458	2017-09-16 13:29:12 +00:00
Craig Topper	23f78c1662	[X86] Add isel patterns to be able to fold loads into VPERM2F128 even when the load is on the first input to the SDNode. We just need to toggle bits 1 and 5 of the immediate and swap the sources. The peephole pass could trigger commuting/folding for this later, but its easy enough to fix in isel. Disable the peephole pass on the main vperm2x128 test so we know we're doing this through isel. llvm-svn: 313455	2017-09-16 09:16:48 +00:00
Craig Topper	0d1b519f78	[X86] Remove unused check lines that got left behind when I moved tests to the instrinsic upgrade file and regenerated. llvm-svn: 313454	2017-09-16 09:16:46 +00:00
Craig Topper	8374ffde08	[X86] Remove the vperm2f128 test file I just added in r313450. I missed the we already had a pretty thorough test file for these instructions. llvm-svn: 313451	2017-09-16 07:51:01 +00:00
Craig Topper	f264fcc704	[X86] Remove VPERM2F128/VPERM2I128 intrinsics and autoupgrade to native shuffles. I've moved the test cases from the InstCombine optimizations to the backend to keep the coverage we had there. It covered every possible immediate so I've preserved the resulting shuffle mask for each of those immediates. llvm-svn: 313450	2017-09-16 07:36:14 +00:00
Craig Topper	aa499c1cb2	[X86] Fix some FileCheck lines that use the wrong prefix. Assume they were moved during autoupgrading and not changed. llvm-svn: 313448	2017-09-16 07:13:39 +00:00
Craig Topper	950b19515a	[X86] Don't set reserved bits in the immediate in the test cases for vperm2f128. I'm going to autoupgrade these intrinsics in a future commit. This bit will never be set in the resulting output so pre-removing the bit. llvm-svn: 313434	2017-09-16 02:11:21 +00:00
Craig Topper	9313df747d	[X86] Remove slash in front of a CHECK line in a test. llvm-svn: 313433	2017-09-16 01:43:21 +00:00
Eric Beckmann	913213c8ae	Revert "Fix Bug 30978 by emitting cv file checksums." This reverts commit 6389e7aa724ea7671d096f4770f016c3d86b0d54. There is a bug in this implementation where the string value of the checksum is outputted, instead of the actual hex bytes. Therefore the checksum is incorrect, and this prevent pdbs from being loaded by visual studio. Revert this until the checksum is emitted correctly. llvm-svn: 313431	2017-09-16 01:14:36 +00:00
Zachary Turner	525b09d347	Revert lit changes related to lit.llvm module. It looks like this is going to be non-trivial to get working in both Py2 and Py3, so for now I'm reverting until I have time to fully test it under Python 3. llvm-svn: 313429	2017-09-16 00:52:49 +00:00
Zachary Turner	2aa5a92bdb	Resubmit "[lit] Add a lit.llvm module that all llvm projects can use" This was reverted alongside the revert of the lit/llvm-lit refactor, but now that that has re-landed, I'm relanding this as well. llvm-svn: 313426	2017-09-16 00:25:58 +00:00
Craig Topper	d02179cd9c	[X86] Remove usages of vperm2f intrinsics from fast isel tests to match what clang generates after r313418. llvm-svn: 313424	2017-09-15 23:53:43 +00:00
Adrian Prantl	057d336c0d	llvm-dwarfdump: Add support for -debug-info=<offset>. This is the first of many commits that enable selectively dumping just one record from the debug info. This reapplies r313412 with some extra qualification to appease GCC and MSVC. llvm-svn: 313419	2017-09-15 23:04:04 +00:00
Vedant Kumar	51d8f887db	[llvm-cov] Avoid over-counting covered lines and regions * Fix an unsigned integer overflow in the logic that computes the number of uncovered lines in a function. * When aggregating region and line coverage summaries, take into account that different instantiations may have a different number of regions. The new test case provides test coverage for both bugs. I also verified this change by preparing a coverage report for a stage2 build of llc -- the new assertions should detect any outstanding over-counting bugs. Fixes PR34613. llvm-svn: 313417	2017-09-15 23:00:02 +00:00
Adrian Prantl	b5abcc558d	Revert "llvm-dwarfdump: Add support for -debug-info=<offset>." This reverts commit r313412 because of a g++ incompatibility. llvm-svn: 313413	2017-09-15 22:47:16 +00:00
Adrian Prantl	fb5d284e97	llvm-dwarfdump: Add support for -debug-info=<offset>. This is the first of many commits that enable selectively dumping just one record from the debug info. llvm-svn: 313412	2017-09-15 22:37:56 +00:00
Guozhi Wei	3d1305f6da	[TargetTransformInfo] Static alloca has 0 cost Static alloca usually doesn't generate any machine instructions, so it has 0 cost. Differential Revision: https://reviews.llvm.org/D37879 llvm-svn: 313410	2017-09-15 22:28:12 +00:00
Chandler Carruth	beb22b5437	[SLP] Revert r312791 and other necessary commits, except for TTI and CostModel. The original patch added support for horizontal min/max reductions to the SLP vectorizer. This patch causes LLVM to miscompile fairly simple signed min reductions. I have attached a test progrom to http://llvm.org/PR34635 that shows the behavior change after this patch. We found this in a test for the open source Eigen library, but also in other code. Unfortunately, the revert is moderately challenging. It required reverting: r313042: [SLP] Test with multiple uses of conditional op and wrong parent. r312853: [SLP] Fix buildbots, NFC. r312793: [SLP] Fix the warning about paths not returning the value, NFC. r312791: [SLP] Support for horizontal min/max reduction. And even then, I had to completely skip reverting the changes to TTI and CostModel because r312832 rewrote so much of this code. Plus, the cost modeling changes aren implicated in the miscompile, so they should be fine and will just not be used until this gets re-introduced. llvm-svn: 313409	2017-09-15 22:23:27 +00:00
Zachary Turner	ce92db13ea	Resubmit "[lit] Force site configs to run before source-tree configs" This is a resubmission of r313270. It broke standalone builds of compiler-rt because we were not correctly generating the llvm-lit script in the standalone build directory. The fixes incorporated here attempt to find llvm/utils/llvm-lit from the source tree returned by llvm-config. If present, it will generate llvm-lit into the output directory. Regardless, the user can specify -DLLVM_EXTERNAL_LIT to point to a specific lit.py on their file system. This supports the use case of someone installing lit via a package manager. If it cannot find a source tree, and -DLLVM_EXTERNAL_LIT is either unspecified or invalid, then we print a warning that tests will not be able to run. Differential Revision: https://reviews.llvm.org/D37756 llvm-svn: 313407	2017-09-15 22:10:46 +00:00
Reid Kleckner	3a66c1cb58	[DebugInfo] Insert DW_OP_deref when spilling indirect DBG_VALUEs Summary: This comes up in optimized debug info for C++ programs that pass and return objects indirectly by address. In these programs, llvm.dbg.declare survives optimization, which causes us to emit indirect DBG_VALUE instructions. The fast register allocator knows to insert DW_OP_deref when spilling indirect DBG_VALUE instructions, but the LiveDebugVariables did not until this change. This fixes part of PR34513. I need to look into why this doesn't work at -O0 and I'll send follow up patches to handle that. Reviewers: aprantl, dblaikie, probinson Subscribers: qcolombet, hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D37911 llvm-svn: 313400	2017-09-15 21:54:38 +00:00
Reid Kleckner	9e6c309ef3	[DebugInfo] Add missing DW_OP_deref when an NRVO pointer is spilled Summary: Fixes PR34513. Indirect DBG_VALUEs typically come from dbg.declares of non-trivially copyable C++ objects that must be passed by address. We were already handling the case where the virtual register gets allocated to a physical register and is later spilled. That's what usually happens for normal parameters that aren't NRVO variables: they usually appear in physical register parameters, and are spilled later in the function, which would correctly add deref. NRVO variables are different because the dbg.declare can come much later after earlier instructions cause the incoming virtual register to be spilled. Also, clean up this code. We only need to look at the first operand of a DBG_VALUE, which eliminates the operand loop. Reviewers: aprantl, dblaikie, probinson Subscribers: MatzeB, qcolombet, llvm-commits, hiraditya Differential Revision: https://reviews.llvm.org/D37929 llvm-svn: 313399	2017-09-15 21:49:56 +00:00
Steven Wu	ab211df5de	[AutoUpgrade] Fix a compatibility issue with module flag Summary: After r304661, module flag to record objective-c image info section is encoded without whitespaces after comma. The new name is equivalent to the old one, except that when LTO a module built by old compiler and a module built by a new compiler, it will fail with conflicting values. Fix the issue by removing whitespaces in bitcode upgrade path. rdar://problem/34416934 Reviewers: compnerd Reviewed By: compnerd Subscribers: mehdi_amini, hans, llvm-commits Differential Revision: https://reviews.llvm.org/D37909 llvm-svn: 313398	2017-09-15 21:12:14 +00:00
Sam Clegg	759631c77b	[WebAssembly] MC: Create wasm data segments based on MCSections This means that we can honor -fdata-sections rather than always creating a segment for each symbol. It also allows for a followup change to add .init_array and friends. Differential Revision: https://reviews.llvm.org/D37876 llvm-svn: 313395	2017-09-15 20:54:59 +00:00
Davide Italiano	dee018c51f	[ConstantFold] Return the correct type when folding a GEP with vector indices. As Eli pointed out (and I got wrong in the first place), langref says: "The getelementptr returns a vector of pointers, instead of a single address, when one or more of its arguments is a vector. In such cases, all vector arguments should have the same number of elements, and every scalar argument will be effectively broadcast into a vector during address calculation." Costantfold for gep doesn't really take in account this paragraph, returning a pointer instead of a vector of pointer which triggers an assertion in RAUW, as we're trying to replace values with mistmatching types. Differential Revision: https://reviews.llvm.org/D37928 llvm-svn: 313394	2017-09-15 20:53:05 +00:00
Vivek Pandya	b5ab895e2a	This patch fixes https://bugs.llvm.org/show_bug.cgi?id=32352 It enables OptimizationRemarkEmitter::allowExtraAnalysis and MachineOptimizationRemarkEmitter::allowExtraAnalysis to return true not only for -fsave-optimization-record but when specific remarks are requested with command line options. The diagnostic handler used to be callback now this patch adds a class DiagnosticHandler. It has virtual method to provide custom diagnostic handler and methods to control which particular remarks are enabled. However LLVM-C API users can still provide callback function for diagnostic handler. llvm-svn: 313390	2017-09-15 20:10:09 +00:00
Vivek Pandya	df8598dcc4	This reverts r313381 llvm-svn: 313387	2017-09-15 19:53:54 +00:00
Vivek Pandya	00d887447b	This patch fixes https://bugs.llvm.org/show_bug.cgi?id=32352 It enables OptimizationRemarkEmitter::allowExtraAnalysis and MachineOptimizationRemarkEmitter::allowExtraAnalysis to return true not only for -fsave-optimization-record but when specific remarks are requested with command line options. The diagnostic handler used to be callback now this patch adds a class DiagnosticHandler. It has virtual method to provide custom diagnostic handler and methods to control which particular remarks are enabled. However LLVM-C API users can still provide callback function for diagnostic handler. llvm-svn: 313382	2017-09-15 19:30:59 +00:00
Sam Clegg	aff1c4df25	[WebAssembly] MC: Fix crash in getProvitionalValue on weak references - Create helper function for resolving weak references. - Add test that preproduces the crash. Differential Revision: https://reviews.llvm.org/D37916 llvm-svn: 313381	2017-09-15 19:22:01 +00:00
Hans Wennborg	534bfbd3ba	Revert r313343 "[X86] PR32755 : Improvement in CodeGen instruction selection for LEAs." This caused PR34629: asserts firing when building Chromium. It also broke some buildbots building test-suite as reported on the commit thread. > Summary: > 1/ Operand folding during complex pattern matching for LEAs has been > extended, such that it promotes Scale to accommodate similar operand > appearing in the DAG. > e.g. > T1 = A + B > T2 = T1 + 10 > T3 = T2 + A > For above DAG rooted at T3, X86AddressMode will no look like > Base = B , Index = A , Scale = 2 , Disp = 10 > > 2/ During OptimizeLEAPass down the pipeline factorization is now performed over LEAs > so that if there is an opportunity then complex LEAs (having 3 operands) > could be factored out. > e.g. > leal 1(%rax,%rcx,1), %rdx > leal 1(%rax,%rcx,2), %rcx > will be factored as following > leal 1(%rax,%rcx,1), %rdx > leal (%rdx,%rcx) , %edx > > 3/ Aggressive operand folding for AM based selection for LEAs is sensitive to loops, > thus avoiding creation of any complex LEAs within a loop. > > Reviewers: lsaba, RKSimon, craig.topper, qcolombet > > Reviewed By: lsaba > > Subscribers: spatel, igorb, llvm-commits > > Differential Revision: https://reviews.llvm.org/D35014 llvm-svn: 313376	2017-09-15 18:40:26 +00:00
Eric Beckmann	349746f044	Fix Bug 30978 by emitting cv file checksums. Summary: The checksums had already been placed in the IR, this patch allows MCCodeView to actually write it out to an MCStreamer. Subscribers: llvm-commits, hiraditya Differential Revision: https://reviews.llvm.org/D37157 llvm-svn: 313374	2017-09-15 18:20:28 +00:00
Craig Topper	7a183e2760	[X86] Prefer VPERMQ over VPERM2F128 for any unary shuffle, not just the ones that can be done with a insertf128 The early out for AVX2 in lowerV2X128VectorShuffle is positioned in a weird spot below some shuffle mask equivalency checks. But I think we want to allow VPERMQ for any unary shuffle. Differential Revision: https://reviews.llvm.org/D37893 llvm-svn: 313373	2017-09-15 18:11:13 +00:00
Craig Topper	e0d724cf51	[X86] Don't create i64 constants on 32-bit targets when lowering v64i1 constant build vectors When handling a v64i1 build vector of constants on 32-bit targets we were creating an illegal i64 constant that we then bitcasted back to v64i1. We need to instead create two 32-bit constants, bitcast them to v32i1 and concat the result. We should also take care to handle the halves being all zeros/ones after the split. This patch splits the build vector and then recursively lowers the two pieces. This allows us to handle the all ones and all zeros cases with minimal effort. Ideally we'd just do the split and concat, and let lowering get called again on the new nodes, but getNode has special handling for CONCAT_VECTORS that reassembles the pieces back into a single BUILD_VECTOR. Hopefully the two temporary BUILD_VECTORS we had to create to do this that don't get returned don't cause any issues. Fixes PR34605. Differential Revision: https://reviews.llvm.org/D37858 llvm-svn: 313366	2017-09-15 17:09:03 +00:00
Craig Topper	143797eb89	[X86] Add isel pattern infrastructure to begin recognizing when we're inserting 0s into the upper portions of a vector register and the producing instruction as already produced the zeros. Currently if we're inserting 0s into the upper elements of a vector register we insert an explicit move of the smaller register to implicitly zero the upper bits. But if we can prove that they are already zero we can skip that. This is based on a similar idea of what we do to avoid emitting explicit zero extends for GR32->GR64. Unfortunately, this is harder for vector registers because there are several opcodes that don't have VEX equivalent instructions, but can write to XMM registers. Among these are SHA instructions and a MMX->XMM move. Bitcasts can also get in the way. So for now I'm starting with explicitly allowing only VPMADDWD because we emit zeros in combineLoopMAddPattern. So that is placing extra instruction into the reduction loop. I'd like to allow PSADBW as well after D37453, but that's currently blocked by a bitcast. We either need to peek through bitcasts or canonicalize insert_subvectors with zeros to remove bitcasts on the value being inserted. Longer term we should probably have a cleanup pass that removes superfluous zeroing moves even when the producer is in another basic block which is something these isel tricks can't do. See PR32544. Differential Revision: https://reviews.llvm.org/D37653 llvm-svn: 313365	2017-09-15 17:09:00 +00:00
Anna Thomas	f34537dff8	[RuntimeUnroll] Add heuristic for unrolling multi-exit loop Add a profitability heuristic to enable runtime unrolling of multi-exit loop: There can be atmost two unique exit blocks for the loop and the second exit block should be a deoptimizing block. Also, there can be one other exiting block other than the latch exiting block. The reason for the latter is so that we limit the number of branches in the unrolled code to being at most the unroll factor. Deoptimizing blocks are rarely taken so these additional number of branches created due to the unrolling are predictable, since one of their target is the deopt block. Reviewers: apilipenko, reames, evstupac, mkuper Subscribers: llvm-commits Reviewed by: reames Differential Revision: https://reviews.llvm.org/D35380 llvm-svn: 313363	2017-09-15 15:56:05 +00:00
Anna Thomas	512dde77ba	[RuntimeUnrolling] Populate the VMap entry correctly when default generated through lookup During runtime unrolling on loops with multiple exits, we update the exit blocks with the correct phi values from both original and remainder loop. In this process, we lookup the VMap for the mapped incoming phi values, but did not update the VMap if a default entry was generated in the VMap during the lookup. This default value is generated when constants or values outside the current loop are looked up. This patch fixes the assertion failure when null entries are present in the VMap because of this lookup. Added a testcase that showcases the problem. llvm-svn: 313358	2017-09-15 13:29:33 +00:00
Simon Pilgrim	905e79c4dc	[X86][SSE] Add test cases vector for integer multiplies Mainly inspired by PR34474 / D37896 llvm-svn: 313353	2017-09-15 11:17:42 +00:00
Ilya Biryukov	d23faa843e	Revert "[SLPVectorizer] Failure to beneficially vectorize 'copyable' elements in integer binary ops." This reverts commit r313348. Reason: it caused buildbot failures. llvm-svn: 313352	2017-09-15 10:15:00 +00:00
Sjoerd Meijer	0c5ba21cbf	[AArch64] allow v8f16 types when FullFP16 is supported This adds support for allowing v8f16 vector types, thus avoiding conversions from/to single precision for these types. This is a follow up patch of commits r311154 and r312104, which added support for scalars and v4f16 types, respectively. Differential Revision: https://reviews.llvm.org/D37802 llvm-svn: 313351	2017-09-15 09:24:48 +00:00
Dinar Temirbulatov	e2358b53bc	[SLPVectorizer] Failure to beneficially vectorize 'copyable' elements in integer binary ops. Patch tries to improve vectorization of the following code: void add1(int * __restrict dst, const int * __restrict src) { dst++ = src++; dst++ = src++ + 1; dst++ = src++ + 2; dst++ = src++ + 3; } Allows to vectorize even if the very first operation is not a binary add, but just a load. Reviewers: spatel, mzolotukhin, mkuper, hfinkel, RKSimon, filcab, ABataev, davide Subscribers: llvm-commits, RKSimon Differential Revision: https://reviews.llvm.org/D28907 llvm-svn: 313348	2017-09-15 06:56:39 +00:00
Jatin Bhateja	908c8b37c2	[X86] PR32755 : Improvement in CodeGen instruction selection for LEAs. Summary: 1/ Operand folding during complex pattern matching for LEAs has been extended, such that it promotes Scale to accommodate similar operand appearing in the DAG. e.g. T1 = A + B T2 = T1 + 10 T3 = T2 + A For above DAG rooted at T3, X86AddressMode will no look like Base = B , Index = A , Scale = 2 , Disp = 10 2/ During OptimizeLEAPass down the pipeline factorization is now performed over LEAs so that if there is an opportunity then complex LEAs (having 3 operands) could be factored out. e.g. leal 1(%rax,%rcx,1), %rdx leal 1(%rax,%rcx,2), %rcx will be factored as following leal 1(%rax,%rcx,1), %rdx leal (%rdx,%rcx) , %edx 3/ Aggressive operand folding for AM based selection for LEAs is sensitive to loops, thus avoiding creation of any complex LEAs within a loop. Reviewers: lsaba, RKSimon, craig.topper, qcolombet Reviewed By: lsaba Subscribers: spatel, igorb, llvm-commits Differential Revision: https://reviews.llvm.org/D35014 llvm-svn: 313343	2017-09-15 05:29:51 +00:00
Zachary Turner	83dcb68468	Revert "[lit] Force site configs to run before source-tree configs" This patch is still breaking several multi-stage compiler-rt bots. I already know what the fix is, but I want to get the bots green for now and then try re-applying in the morning. llvm-svn: 313335	2017-09-15 02:56:40 +00:00
Reid Kleckner	87288b98b6	[codeview] Use a type index of zero for static method "this" types Otherwise VS won't show anything in the autos or watch window of static methods. llvm-svn: 313329	2017-09-15 00:59:07 +00:00
Zachary Turner	7f9c0ce1bb	[lit] Revert "Add a lit.llvm module that all llvm projects can use" This is breaking due to some changes I forgot to merge in, so I'm temporarily reverting them until I can re-test that this works. llvm-svn: 313328	2017-09-15 00:56:08 +00:00
Zachary Turner	e5412b882a	[lit] Add a lit.llvm module that all test suites can use. To further reduce duplicate code, this patch introduces a module that configs can simply import and get access to a lot of useful functionality such as setting up paths, adding features that are useful across all projects, and other utility-type functions. For now this only updates llvm's suite to use this new library, but subsequent patches will update other projects. Differential Revision: https://reviews.llvm.org/D37778 llvm-svn: 313325	2017-09-15 00:34:00 +00:00
Sam Clegg	7c39594357	[WebAssembly] Use a separate wasm data segment for each global symbol This is stepping stone towards honoring -fdata-sections and letting the assembler decide how many wasm data segments to create. Differential Revision: https://reviews.llvm.org/D37834 llvm-svn: 313313	2017-09-14 23:07:53 +00:00
Matt Arsenault	c317287fde	AMDGPU: Fix violating constant bus restriction You can't use madmk/madmk if it already uses an SGPR input. llvm-svn: 313298	2017-09-14 20:54:29 +00:00
Guozhi Wei	21f8fad909	[TargetTransformInfo] Detect 0 latency instructions For instructions that unlikely generate machine instructions, they should also have 0 latency. Differential Revision: https://reviews.llvm.org/D37833 llvm-svn: 313288	2017-09-14 19:20:02 +00:00
Matt Arsenault	37ab4cf8b8	AMDGPU: Fix assert on alloca of array of struct llvm-svn: 313282	2017-09-14 18:02:29 +00:00
Simon Dardis	6819a7a1cb	[bpf] Fix test to always use little endian. r313055 broke the big endian buildbots as the CHECK lines contained little endian data but -triple bpf uses the host endian. llvm-svn: 313281	2017-09-14 17:55:50 +00:00
Matt Arsenault	defe371771	AMDGPU: Stop modifying SP in call sequences Because the stack growth direction and addressing is done in the same direction, modifying SP at the beginning of the call sequence was incorrect. If we had a stack passed argument, we would end up skipping that number of bytes before pushing arguments, leaving unused/inconsistent space. The callee creates fixed stack objects in its frame, so the space necessary for these is already logically allocated in the callee, so we just let the callee increment SP if it really requires it. llvm-svn: 313279	2017-09-14 17:37:40 +00:00
Dehao Chen	3a81f84d9a	Invoke GetInlineCost for legality check before inline functions in SampleProfileLoader. Summary: SampleProfileLoader inlines hot functions if it is inlined in the profiled binary. However, the inline needs to be guarded by legality check, otherwise it could lead to correctness issues. Reviewers: eraman, davidxl Reviewed By: eraman Subscribers: vitalybuka, sanjoy, llvm-commits Differential Revision: https://reviews.llvm.org/D37779 llvm-svn: 313277	2017-09-14 17:29:56 +00:00
Simon Dardis	55e446737f	[mips] Implement the 'dext' aliases and it's disassembly alias. The other members of the dext family of instructions (dextm, dextu) are traditionally handled by the assembler selecting the right variant of 'dext' depending on the values of the position and size operands. When these instructions are disassembled, rather than reporting the actual instruction, an equivalent aliased form of 'dext' is generated and is reported. This is to mimic the behaviour of binutils. Reviewers: slthakur, nitesh.jain, atanasyan Differential Revision: https://reviews.llvm.org/D34887 llvm-svn: 313276	2017-09-14 17:27:53 +00:00
Adrian Prantl	82959cd5d8	Adapt more testcases for llvm-dwarfdump changes. llvm-svn: 313275	2017-09-14 17:27:03 +00:00
Matt Arsenault	6efd082c01	AMDGPU: Make frame register caller preserved Using SplitCSR for the frame register was very broken. Often the copies in the prolog and epilog were optimized out, in addition to them being inserted after the true prolog where the FP was clobbered. I have a hacky solution which works that continues to use split CSR, but for now this is simpler and will get to working programs. llvm-svn: 313274	2017-09-14 17:14:57 +00:00
Adrian Prantl	5fd3d49bc4	llvm-dwarfdump: support dumping static archives. llvm-svn: 313272	2017-09-14 17:01:53 +00:00
Krzysztof Parzyszek	779d98e1c0	TableGen support for parameterized register class information This replaces TableGen's type inference to operate on parameterized types instead of MVTs, and as a consequence, some interfaces have changed: - Uses of MVTs are replaced by ValueTypeByHwMode. - EEVT::TypeSet is replaced by TypeSetByHwMode. This affects the way that types and type sets are printed, and the tests relying on that have been updated. There are certain users of the inferred types outside of TableGen itself, namely FastISel and GlobalISel. For those users, the way that the types are accessed have changed. For typical scenarios, these replacements can be used: - TreePatternNode::getType(ResNo) -> getSimpleType(ResNo) - TreePatternNode::hasTypeSet(ResNo) -> hasConcreteType(ResNo) - TypeSet::isConcrete -> TypeSetByHwMode::isValueTypeByHwMode(false) For more information, please refer to the review page. Differential Revision: https://reviews.llvm.org/D31951 llvm-svn: 313271	2017-09-14 16:56:21 +00:00
Zachary Turner	a0e55b6403	[lit] Force site configs to be run before source-tree configs This patch simplifies LLVM's lit infrastructure by enforcing an ordering that a site config is always run before a source-tree config. A significant amount of the complexity from lit config files arises from the fact that inside of a source-tree config file, we don't yet know if the site config has been run. However it is always required to run a site config first, because it passes various variables down through CMake that the main config depends on. As a result, every config file has to do a bunch of magic to try to reverse-engineer the location of the site config file if they detect (heuristically) that the site config file has not yet been run. This patch solves the problem by emitting a mapping from source tree config file to binary tree site config file in llvm-lit.py. Then, during discovery when we find a config file, we check to see if we have a target mapping for it, and if so we use that instead. This mechanism is generic enough that it does not affect external users of lit. They will just not have a config mapping defined, and everything will work as normal. On the other hand, for us it allows us to make many simplifications: * We are guaranteed that a site config will be executed first * Inside of a main config, we no longer have to assume that attributes might not be present and use getattr everywhere. * We no longer have to pass parameters such as --param llvm_site_config=<path> on the command line. * It is future-proof, meaning you don't have to edit llvm-lit.in to add support for new projects. * All of the duplicated logic of trying various fallback mechanisms of finding a site config from the main config are now gone. One potentially noteworthy thing that was required to implement this change is that whereas the ninja check targets previously used the first method to spawn lit, they now use the second. In particular, you can no longer run lit.py against the source tree while specifying the various `foo_site_config=<path>` parameters. Instead, you need to run llvm-lit.py. Differential Revision: https://reviews.llvm.org/D37756 llvm-svn: 313270	2017-09-14 16:47:58 +00:00
Krzysztof Parzyszek	6ca02b25a7	[IfConversion] More simple, correct dead/kill liveness handling Patch by Jesper Antonsson. Differential Revision: https://reviews.llvm.org/D37611 llvm-svn: 313268	2017-09-14 15:53:11 +00:00
Simon Dardis	6f83ae38a3	[mips] Implement the 'dins' aliases. Traditionally GAS has provided automatic selection between dins, dinsm and dinsu. Binutils also disassembles all instructions in that family as 'dins' rather than the actual instruction. Reviewers: slthakur Differential Revision: https://reviews.llvm.org/D34877 llvm-svn: 313267	2017-09-14 15:17:50 +00:00
Sanjay Patel	0d4fd5b668	[InstSimplify] fold sdiv/srem based on compare of dividend and divisor This should bring signed div/rem analysis up to the same level as unsigned. We use icmp simplification to determine when the divisor is known greater than the dividend. Each positive test is followed by a negative test to show that we're not overstepping the boundaries of the known bits. There are extra tests for the signed-min-value special cases. Alive proofs: http://rise4fun.com/Alive/WI5 Differential Revision: https://reviews.llvm.org/D37713 llvm-svn: 313264	2017-09-14 14:59:07 +00:00
Chad Rosier	4d6d74e236	Add newline to end of test file. NFC. llvm-svn: 313263	2017-09-14 14:48:59 +00:00
Simon Pilgrim	0b220c7524	[X86] Regenerate test. NFCI. llvm-svn: 313259	2017-09-14 13:00:27 +00:00
Simon Pilgrim	47d8f62472	Regenerate test (broadcast comment). NFCI. llvm-svn: 313258	2017-09-14 12:41:19 +00:00
Ayman Musa	ab68449c53	[X86] When applying the shuffle-to-zero-extend transformation on floating point, bitcast to integer first. Fix issue described in PR34577. Differential Revision: https://reviews.llvm.org/D37803 llvm-svn: 313256	2017-09-14 12:06:38 +00:00
Simon Dardis	28365b33ad	[mips] Pick the right variant of DINS upfront and enable target instruction verification This patch complements D16810 "[mips] Make isel select the correct DEXT variant up front.". Now ISel picks the right variant of DINS, so now there is no need to replace DINS with the appropriate variant during MipsMCCodeEmitter::encodeInstruction(). This patch also enables target specific instruction verification for ins, dins, dinsm, dinsu, ext, dext, dextm, dextu. These instructions have constraints that are checked when generating MipsISD::Ins and MipsISD::Ext nodes, but these constraints are not checked during instruction selection. Adding machine verification should catch outstanding cases. Finally, correct a bug that instruction verification uncovered, where the position operand of a DINSU generated during lowering was being silently and accidently corrected to the correct value. Reviewers: slthakur Differential Revision: https://reviews.llvm.org/D34809 llvm-svn: 313254	2017-09-14 10:58:00 +00:00
Simon Pilgrim	8bd2d8780a	[DAGCombine] (shl (or x, c1), c2) -> (or (shl x, c2), c1 << c2) We already have a combine for this pattern when the input to shl is add, so we just need to enable the transformation when the input is or. Original patch by @tstellar Differential Revision: https://reviews.llvm.org/D19325 llvm-svn: 313251	2017-09-14 10:38:30 +00:00
Simon Pilgrim	337b2d007a	Fix line endings. NFCI. llvm-svn: 313247	2017-09-14 10:30:54 +00:00
Simon Pilgrim	11e2969a35	Fix line endings. NFCI. llvm-svn: 313246	2017-09-14 10:30:22 +00:00
Dean Michael Berris	d7b3add7c4	[XRay][DebugInfo] Update the test to use a specific target Follow-up to D37791. llvm-svn: 313243	2017-09-14 09:58:25 +00:00
Chandler Carruth	7376ae88eb	[PM/CGSCC] Teach the CGSCC pass manager components to gracefully handle invalidated SCCs even when we do not have an updated SCC to redirect towards. This comes up in a fairly subtle and surprising circumstance: we need to have a connected but internal node in the call graph which later becomes a disconnected island, and then gets deleted. All of this needs to happen mid-CGSCC walk. Because it is disconnected, we have no way of computing a new "current" SCC when it gets deleted. Instead, we need to explicitly check for a deleted "current" SCC and bail out of the current CGSCC step. This will bubble all the way up to the post-order walk and then resume correctly. I've included minimal tests for this bug. The specific behavior matches something we've seen in the wild with the new PM combined with ThinLTO and sample PGO, but I've not yet confirmed whether this is the only issue there. llvm-svn: 313242	2017-09-14 08:33:57 +00:00
Dean Michael Berris	b21f580ddc	[XRay][DebugInfo] Remove -debug-compile from test invocation of llc This breaks bootstrap builds, and is actually unnecessary. Tested locally and it seems we can remove -debug-comile just fine. Follow-up to D37791. llvm-svn: 313238	2017-09-14 07:54:54 +00:00
Alon Kom	682cfc1d4c	[LV] Fix maximum legal VF calculation This patch fixes pr34283, which exposed that the computation of maximum legal width for vectorization was wrong, because it relied on MaxInterleaveFactor to obtain the maximum stride used in the loop, however not all strided accesses in the loop have an interleave-group associated with them. Instead of recording the maximum stride in the loop, which can be over conservative (e.g. if the access with the maximum stride is not involved in the dependence limitation), this patch tracks the actual maximum legal width imposed by accesses that are involved in dependencies. Differential Revision: https://reviews.llvm.org/D37507 llvm-svn: 313237	2017-09-14 07:40:02 +00:00
Dean Michael Berris	01fd7c8bd4	[XRay][CodeGen] Use the current function symbol as the associated symbol for the instrumentation map Summary: XRay had been assuming that the previous section is the "text" section of the function when lowering the instrumentation map. Unfortunately this is not a safe assumption, because we may be coming from lowering debug type information for the function being lowered. This fixes an issue with combining -gsplit-dwarf, -generate-type-units, -debug-compile and -fxray-instrument for sole member functions. When the split dwarf section is stripped, we're left with references from the xray_instr_map to the debug section. The change now uses the function's symbol instead of the previous section's start symbol. We found the bug while attempting to strip the split debug sections off an XRay-instrumented object file, which had a peculiar edge-case for single-function classes where the single function is being lowered. Because XRay had assocaited the instrumentation map for a function to the debug types section instead of the function's section, the objcopy call will fail due to the misplaced reference from the xray_instr_map section. Reviewers: pcc, dblaikie, echristo Subscribers: llvm-commits, aprantl Differential Revision: https://reviews.llvm.org/D37791 llvm-svn: 313233	2017-09-14 07:08:23 +00:00
Vitaly Buka	48624d327a	Revert "Invoke GetInlineCost for legality check before inline functions in SampleProfileLoader." Patch introduced uninitialized value. This reverts commit r313195. llvm-svn: 313230	2017-09-14 05:40:33 +00:00
Peter Collingbourne	cfbd089237	Reland r313157, "ThinLTO: Correctly follow aliasee references when dead stripping." which was reverted in r313222. This reland includes a fix for the LowerTypeTests pass so that it looks past aliases when determining which type identifiers are live. Differential Revision: https://reviews.llvm.org/D37842 llvm-svn: 313229	2017-09-14 05:02:59 +00:00
Hans Wennborg	ae050afeb9	Revert r313157 "ThinLTO: Correctly follow aliasee references when dead stripping." This broke Chromium's CFI build; see crbug.com/765004. > We were previously handling aliases during dead stripping by adding > the aliased global's "original name" GUID to the worklist. This will > lead to incorrect behaviour if the global has local linkage because > the original name GUID will not correspond to the global's GUID in > the summary. > > Because an alias is just another name for the global that it > references, there is no need to mark the referenced global as used, > or to follow references from any other copies of the global. So all > we need to do is to follow references from the aliasee's summary > instead of the alias. > > Differential Revision: https://reviews.llvm.org/D37789 llvm-svn: 313222	2017-09-14 00:40:14 +00:00
NAKAMURA Takumi	38fac5905e	Move llvm/test/CodeGen/X86/clear-liverange-spillreg.mir to SystemZ. It was in wrong place. llvm-svn: 313218	2017-09-14 00:03:23 +00:00
Matt Arsenault	ecb43ef1bc	AMDGPU: Don't spill SP reg like a normal CSR llvm-svn: 313217	2017-09-13 23:47:01 +00:00
Hans Wennborg	06e2a384c2	Revert r312719 "[MachineCombiner] Update instruction depths incrementally for large BBs." This caused PR34596. > [MachineCombiner] Update instruction depths incrementally for large BBs. > > Summary: > For large basic blocks with lots of combinable instructions, the > MachineTraceMetrics computations in MachineCombiner can dominate the compile > time, as computing the trace information is quadratic in the number of > instructions in a BB and it's relevant successors/predecessors. > > In most cases, knowing the instruction depth should be enough to make > combination decisions. As we already iterate over all instructions in a basic > block, the instruction depth can be computed incrementally. This reduces the > cost of machine-combine drastically in cases where lots of instructions > are combined. The major drawback is that AFAIK, computing the critical path > length cannot be done incrementally. Therefore we only compute > instruction depths incrementally, for basic blocks with more > instructions than inc_threshold. The -machine-combiner-inc-threshold > option can be used to set the threshold and allows for easier > experimenting and checking if using incremental updates for all basic > blocks has any impact on the performance. > > Reviewers: sanjoy, Gerolf, MatzeB, efriedma, fhahn > > Reviewed By: fhahn > > Subscribers: kiranchandramohan, javed.absar, efriedma, llvm-commits > > Differential Revision: https://reviews.llvm.org/D36619 llvm-svn: 313213	2017-09-13 23:23:09 +00:00
Adrian Prantl	c113801489	Update testcase that was XFAILed on Darwin for llvm-dwarfdump change. llvm-svn: 313209	2017-09-13 22:30:43 +00:00
Stanislav Mekhanoshin	7fe9a5d9b4	Allow target to decide when to cluster loads/stores in misched MachineScheduler when clustering loads or stores checks if base pointers point to the same memory. This check is done through comparison of base registers of two memory instructions. This works fine when instructions have separate offset operand. If they require a full calculated pointer such instructions can never be clustered according to such logic. Changed shouldClusterMemOps to accept base registers as well and let it decide what to do about it. Differential Revision: https://reviews.llvm.org/D37698 llvm-svn: 313208	2017-09-13 22:20:47 +00:00
Adrian Prantl	3ae35eb56b	llvm-dwarfdump: automatically dump both regular and .dwo variant of sections Since users typically don't really care about the .dwo / non.dwo distinction, this patch makes it so dwarfdump --debug-<info,...> dumps .debug_info and (if available) also .debug_info.dwo. This simplifies the command line interface (I've removed all dwo-specific dump options) and makes the tool friendlier to use. Differential Revision: https://reviews.llvm.org/D37771 llvm-svn: 313207	2017-09-13 22:09:01 +00:00
Reid Kleckner	89af112cf5	[codeview] VLAs and unsized arrays should use a size of zero Previously we used a size of '1' for VLAs because we weren't sure what MSVC did. However, MSVC does support declaring an array without a size, for which it emits an array type with a size of zero. Clang emits the same DI metadata for VLAs and arrays without bound, so we would describe arrays without bound as having one element. This lead to Microsoft debuggers only printing a single element. Emitting a size of zero appears to cause these debuggers to search the symbol information to find a definition of the variable with accurate array bounds. Fixes http://crbug.com/763580 llvm-svn: 313203	2017-09-13 21:54:20 +00:00
Dehao Chen	be707f2930	Update the early_inline test to explicitly add attribute for all functions. (NFC) llvm-svn: 313202	2017-09-13 21:49:36 +00:00
Wei Mi	a2a135a01c	Add a comment for the test. NFC. llvm-svn: 313199	2017-09-13 21:47:13 +00:00
Wei Mi	c0d066468e	[RegAlloc] Keep a copy of live interval for the spilled vregs in HoistSpillHelper. This is to fix PR34502. After rL311401, the live range of spilled vreg will be cleared. HoistSpill need to use the live range of the original vreg before splitting to know the moving range of the spills. The patch saves a copy of live interval for the spilled vreg inside of HoistSpillHelper. Differential Revision: https://reviews.llvm.org/D37578 llvm-svn: 313197	2017-09-13 21:41:30 +00:00
Vedant Kumar	4c76c45fed	[Bitcode] Add a compatibility test for 5.0.0 bitcode llvm-svn: 313196	2017-09-13 21:40:59 +00:00

... 2 3 4 5 6 ...

47767 Commits