llvm-project

Commit Graph

Author	SHA1	Message	Date
Marek Olsak	ffadcb744b	AMDGPU: Fold immediate offset into BUFFER_LOAD_DWORD lowered from SMEM Summary: -5.3% code size in affected shaders. Changed stats only: 48486 shaders in 30489 tests Totals: SGPRS: 2086406 -> 2072430 (-0.67 %) VGPRS: 1626872 -> 1627960 (0.07 %) Spilled SGPRs: 7865 -> 7912 (0.60 %) Code Size: 60978060 -> 60188764 (-1.29 %) bytes Max Waves: 374530 -> 374342 (-0.05 %) Totals from affected shaders: SGPRS: 299664 -> 285688 (-4.66 %) VGPRS: 233844 -> 234932 (0.47 %) Spilled SGPRs: 3959 -> 4006 (1.19 %) Code Size: 14905272 -> 14115976 (-5.30 %) bytes Max Waves: 46202 -> 46014 (-0.41 %) Reviewers: arsenm, nhaehnle Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, llvm-commits, t-tye Differential Revision: https://reviews.llvm.org/D38915 llvm-svn: 317750	2017-11-09 01:52:17 +00:00
Craig Topper	93e27d2ecc	[X86] Make sure we don't read too many operands from X86ISD::FMADDS1/FMADDS3 nodes when doing FNEG combine. r317453 added new ISD nodes without rounding modes that were added to an existing if/else chain. But all the previous nodes handled there included a rounding mode. The final code after this if/else chain expected an extra operand that isn't present for the new nodes. llvm-svn: 317748	2017-11-09 01:06:47 +00:00
Petr Hosek	cf1fee2d59	[CMake] Passthrough CMAKE_SYSROOT to external projects Differential Revision: https://reviews.llvm.org/D39029 llvm-svn: 317744	2017-11-09 00:21:29 +00:00
Mitch Phillips	d64af52585	[cfi-verify] Adds blacklist blame behaviour to cfi-verify. Adds the blacklist behaviour to llvm-cfi-verify. Now will calculate which lines caused expected failures in the blacklist and reports the number of affected indirect CF instructions for each blacklist entry. Also moved DWARF checking after instruction analysis to improve performance significantly - unrolling the inlining stack is expensive. Reviewers: vlad.tsyrklevich Subscribers: aprantl, pcc, kcc, llvm-commits Differential Revision: https://reviews.llvm.org/D39750 llvm-svn: 317743	2017-11-09 00:18:31 +00:00
Petr Hosek	fb5ef73460	[CMake][runtimes] Fix the variable name This typo causes the llvm-lit path resolution to fail. Differential Revision: https://reviews.llvm.org/D39811 llvm-svn: 317742	2017-11-08 23:44:27 +00:00
Rui Ueyama	f6490e047a	[FileOutputBuffer] Move factory methods out of their classes. InMemoryBuffer and OnDiskBuffer classes have both factory methods and public constructors, and that looks a bit odd. This patch makes factory methods non-member function to fix it. Differential Revision: https://reviews.llvm.org/D39693 llvm-svn: 317739	2017-11-08 22:57:48 +00:00
Craig Topper	cfd510678f	[X86] X86MaskedGatherSDNode shouldn't inherit from MaskedGatherScatterSDNode The classof implementation in MaskedGatherScatterSDNode doesn't consider X86MaskedGatherSDNode so it's misleading. llvm-svn: 317733	2017-11-08 22:26:41 +00:00
Craig Topper	61f81f9637	[X86] Preserve memory refs when folding loads into divides. This is similar to what we already do for multiplies. Without this we can't unfold and hoist an invariant load. llvm-svn: 317732	2017-11-08 22:26:39 +00:00
Craig Topper	55029d811f	[X86] Remove an if check on the result of a cast. NFC cast takes a non-null input and produces a non-null output. So this if can never fail. llvm-svn: 317731	2017-11-08 22:26:37 +00:00
Adrian Prantl	a8e56458e6	Let replaceVTableHolder accept any type. In Rust, a trait can be implemented for any type, and if a trait object pointer is used for the type, then a virtual table will be emitted for that trait/type combination. We would like debuggers to be able to inspect trait objects, which requires finding the concrete type associated with a given vtable. This patch changes LLVM so that any type can be passed to replaceVTableHolder. This allows the Rust compiler to emit the needed debug info -- associating a vtable with the concrete type for which it was emitted. This is a DWARF extension: DWARF only specifies the meaning of DW_AT_containing_type in one specific situation. This style of DWARF extension is routine, though, and LLVM already has one such case for DW_AT_containing_type. Patch by Tom Tromey! Differential Revision: https://reviews.llvm.org/D39503 llvm-svn: 317730	2017-11-08 22:04:43 +00:00
Dan Gohman	2c74fe977d	Add an @llvm.sideeffect intrinsic This patch implements Chandler's idea [0] for supporting languages that require support for infinite loops with side effects, such as Rust, providing part of a solution to bug 965 [1]. Specifically, it adds an `llvm.sideeffect()` intrinsic, which has no actual effect, but which appears to optimization passes to have obscure side effects, such that they don't optimize away loops containing it. It also teaches several optimization passes to ignore this intrinsic, so that it doesn't significantly impact optimization in most cases. As discussed on llvm-dev [2], this patch is the first of two major parts. The second part, to change LLVM's semantics to have defined behavior on infinite loops by default, with a function attribute for opting into potential-undefined-behavior, will be implemented and posted for review in a separate patch. [0] http://lists.llvm.org/pipermail/llvm-dev/2015-July/088103.html [1] https://bugs.llvm.org/show_bug.cgi?id=965 [2] http://lists.llvm.org/pipermail/llvm-dev/2017-October/118632.html Differential Revision: https://reviews.llvm.org/D38336 llvm-svn: 317729	2017-11-08 21:59:51 +00:00
Reid Kleckner	7adb2fdbba	Revert "Correct dwarf unwind information in function epilogue for X86" This reverts r317579, originally committed as r317100. There is a design issue with marking CFI instructions duplicatable. Not all targets support the CFIInstrInserter pass, and targets like Darwin can't cope with duplicated prologue setup CFI instructions. The compact unwind info emission fails. When the following code is compiled for arm64 on Mac at -O3, the CFI instructions end up getting tail duplicated, which causes compact unwind info emission to fail: int a, c, d, e, f, g, h, i, j, k, l, m; void n(int o, int b) { if (g) f = 0; for (; f < o; f++) { m = a; if (l > j k > i) j = i = k = d; h = b[c] - e; } } We get assembly that looks like this: ; BB#1: ; %if.then Lloh3: adrp x9, _f@GOTPAGE Lloh4: ldr x9, [x9, _f@GOTPAGEOFF] mov w8, wzr Lloh5: str wzr, [x9] stp x20, x19, [sp, #-16]! ; 8-byte Folded Spill .cfi_def_cfa_offset 16 .cfi_offset w19, -8 .cfi_offset w20, -16 cmp w8, w0 b.lt LBB0_3 b LBB0_7 LBB0_2: ; %entry.if.end_crit_edge Lloh6: adrp x8, _f@GOTPAGE Lloh7: ldr x8, [x8, _f@GOTPAGEOFF] Lloh8: ldr w8, [x8] stp x20, x19, [sp, #-16]! ; 8-byte Folded Spill .cfi_def_cfa_offset 16 .cfi_offset w19, -8 .cfi_offset w20, -16 cmp w8, w0 b.ge LBB0_7 LBB0_3: ; %for.body.lr.ph Note the multiple .cfi_def* directives. Compact unwind info emission can't handle that. llvm-svn: 317726	2017-11-08 21:31:14 +00:00
Vedant Kumar	a702fa17f3	[cmake] Allow LLVM_BUILD_INSTRUMENTED to be set to IR or Frontend - This deprecates LLVM_ENABLE_IR_PGO but keeps it around for now. - Errors out when LLVM_BUILD_INSTRUMENTED and LLVM_BUILD_INSTRUMENTED_COVERAGE are both set. Motivated by bogner's post-commit review of r313770. llvm-svn: 317725	2017-11-08 21:26:40 +00:00
Rafael Espindola	85593c2398	Make sure an error is always handled. llvm-svn: 317724	2017-11-08 21:15:21 +00:00
Alex Bradbury	fa18b9e73c	Set hasSideEffects=0 for PHI and fix affected passes Previously, hasSideEffects was ? for TargetOpcode::PHI and would be inferred as 1. D37065 sets the previously inferred properties explicitly. This patch sets hasSideEffects=0 for PHI, as it is for G_PHI. MachineInstr::isSafeToMove has been updated so it still returns false for PHI. Additionally, HexagonBitSimplify relied on a PHI node having the hasUnmodeledSideEffects property. This patch fixes that assumption. Differential Revision: https://reviews.llvm.org/D37097 llvm-svn: 317721	2017-11-08 20:19:16 +00:00
Craig Topper	78a770402a	[X86] Correct the implementation of BEXTR load folding to use the shift as the parent node and pass a separate root. We were calling tryFoldLoad with the 'and' node as both the root and parent node of the load. But the parent of the load should be the shift that precedes the and, while the and node is correctly the root node. To fix this I had to make tryFoldLoad take a separate use and root input. I've added a convenience version with the old signature to avoid updating the other call sites. llvm-svn: 317720	2017-11-08 20:17:33 +00:00
Sam Clegg	6368442fb7	[WebAssembly] Update test expectations I believe these were fixed in rL317707 Differential Revision: https://reviews.llvm.org/D39813 llvm-svn: 317718	2017-11-08 20:14:06 +00:00
Teresa Johnson	07ec7d59c2	[ThinLTO] Ensure sanitizer passes are run Summary: In ThinLTO compilation, we exit populateModulePassManager early and were not adding PM extension passes meant to run at the end of the pipeline. This includes sanitizer passes. Add these passes before the early exit. A test will be added to projects/compiler-rt. Reviewers: pcc Subscribers: mehdi_amini, inglorion, llvm-commits Differential Revision: https://reviews.llvm.org/D39565 llvm-svn: 317714	2017-11-08 19:45:52 +00:00
Craig Topper	e6094f9bd9	[X86] Don't call validateInstruction from MatchAndEmitInstruction when MatchingInlineAsm is set. The MCInst won't be populated. Without this we can't parse gather instructions in ms inline asm blocks. The validateInstruction function was introduced in r316700 to check gather constraints. llvm-svn: 317713	2017-11-08 19:38:48 +00:00
Craig Topper	81d772c28a	[ValueTracking] Use APInt::isNullValue/isOneValue which are more efficient for large APInts. llvm-svn: 317712	2017-11-08 19:38:45 +00:00
Dan Gohman	7726026061	[WebAssembly] Add a test for inline-asm "m" constraints. llvm-svn: 317711	2017-11-08 19:37:24 +00:00
Dan Gohman	0828ba1e1e	[WebAssembly] Call signExtend to get sign extended register Patch by Jatin Bhateja! Differential Revision: https://reviews.llvm.org/D39529 llvm-svn: 317710	2017-11-08 19:24:21 +00:00
Adrian Prantl	ff2073a51f	Un-XFAIL a test after the bugfix in r317702. llvm-svn: 317708	2017-11-08 19:18:20 +00:00
Dan Gohman	b465aa0504	[WebAssembly] Revise the strategy for inline asm. Previously, an "r" constraint would mean the compiler provides a value on WebAssembly's operand stack. This was tricky to use properly, particularly since it isn't possible to declare a new local from within an inline asm string. With this patch, "r" provides the value in a WebAssembly local, and the local index is provided to the inline asm string. This requires inline asm to use get_local and set_local to read the register. This does potentially result in larger code size, however inline asm should hopefully be quite rare in WebAssembly. This also means that the "m" constraint can no longer be supported, as WebAssembly has nothing like a "memory operand" that includes an implicit get_local. This fixes PR34599 for the wasm32-unknown-unknown-wasm target (though not for the ELF target). llvm-svn: 317707	2017-11-08 19:18:08 +00:00
Adrian McCarthy	75248a7ade	NFC: Rename MCSafeSEHFragment to MCSymbolIdFragment Summary: This fragment emits a symbol ID and will be useful for more than just Safe SEH tables (e.g., I plan to re-use it for Control Flow Guard tables). This is simply a rename refactor. Reviewers: rnk Subscribers: llvm-commits, hiraditya Differential Revision: https://reviews.llvm.org/D39770 llvm-svn: 317703	2017-11-08 18:57:02 +00:00
Adrian Prantl	93faeecd8f	Handle inlined variables in SelectionDAGBuilder::EmitFuncArgumentDbgValue(). In 2010 a commit with no testcase and no further explanation explicitly disabled the handling of inlined variables in EmitFuncArgumentDbgValue(). I don't think there is a good reason for this any more and re-enabling this adds debug locations for variables associated with an LLVM function argument in functions that are inlined into the first basic block. The only downside of doing this is that we may insert a DBG_VALUE before the inlined scope, but (1) this could be filtered out later, and (2) LiveDebugValues will not propagate it into subsequent basic blocks if they don't dominate the variable's lexical scope, so this seems like a small price to pay. rdar://problem/26228128 llvm-svn: 317702	2017-11-08 18:27:13 +00:00
Simon Pilgrim	1c1fd15959	[X86] Add some initial scheduling tests for generic x86 instructions These will be using inline asm to ensure we have coverage that we're unlikely to get from lowering of basic ir. Currently waiting for D39728 to land to add support for scheduler comments for inline asm. llvm-svn: 317698	2017-11-08 16:35:42 +00:00
Jonas Hahnfeld	0a41be758c	[CMake] Remove target to build native tablegen This was once needed so that multiple tablegen binaries don't compile the library concurrently. However, this isn't needed anymore since adding USES_TERMINAL to the custom_command. This is supported by the fact that the target was only building LLVMSupport since some cleanups a year ago. If this dependency had really been needed, we would have seen complaints. Differential Revision: https://reviews.llvm.org/D39299 llvm-svn: 317695	2017-11-08 14:31:54 +00:00
Jonas Hahnfeld	5db0ae4d38	[CMake] Add custom target to create build directory CMake does a poor job in tracking dependencies on files and directories directly. Create custom target similar to the configuration step. On my system, this avoids the reconfiguration on each build. Differential Revision: https://reviews.llvm.org/D39298 llvm-svn: 317694	2017-11-08 14:31:51 +00:00
Alex Bradbury	86f971ccd6	[utils] Add RISC-V support to update_llc_test_checks.py This should be a trivial change, and I've started using it for generating all tests at https://github.com/lowrisc/riscv-llvm (i.e. it's been tested in action quite a lot). Note that the regex does not attempt to match .cfi_startproc, as I want to ensure compatibility with functions that have the nounwind attribute. Differential Revision: https://reviews.llvm.org/D39789 llvm-svn: 317693	2017-11-08 14:24:42 +00:00
Alex Bradbury	a337675cdb	[RISCV] Initial support for function calls Note that this is just enough for simple function call examples to generate working code. Support for varargs etc follows in future patches. Differential Revision: https://reviews.llvm.org/D29936 llvm-svn: 317691	2017-11-08 13:41:21 +00:00
Alex Bradbury	74913e1c70	[RISCV] Codegen for conditional branches A good portion of this patch is the extra functions that needed to be implemented to support the test case. e.g. storeRegToStackSlot, loadRegFromStackSlot, eliminateFrameIndex. Setting ISD::BR_CC to Expand may appear non-obvious on an architecture with branch+cmp instructions. However, I found it much easier to deal with matching the expanded form. I had to change simm13_lsb0 and simm21_lsb0 to inherit from the Operand<OtherVT> class rather than Operand<i32> in order to keep tablegen happy. This isn't a big deal, but it does seem a shame to lose the uniformity across immediate types when there's not an obvious benefit (I'm hoping a tablegen expert will educate me on what I'm missing here!). Differential Revision: https://reviews.llvm.org/D29935 llvm-svn: 317690	2017-11-08 13:31:40 +00:00
Alex Bradbury	ec8aa91305	[RISCV] Codegen support for memory operations on global addresses Differential Revision: https://reviews.llvm.org/D39103 llvm-svn: 317688	2017-11-08 13:24:21 +00:00
Alex Bradbury	cfa6291bb1	[RISCV] Codegen support for memory operations This required the implementation of RISCVTargetInstrInfo::copyPhysReg. Support for lowering global addresses follow in the next patch. Differential Revision: https://reviews.llvm.org/D29934 llvm-svn: 317685	2017-11-08 12:20:01 +00:00
Alex Bradbury	0f0e1b54f0	[RISCV] Codegen support for materializing constants Differential Revision: https://reviews.llvm.org/D39101 llvm-svn: 317684	2017-11-08 12:02:22 +00:00
Ivan A. Kosarev	d60a3cc395	[Analysis] Fix merging TBAA tags with different final access types There are cases when we have to merge TBAA access tags with the same base access type, but different final access types. For example, accesses to different members of the same structure may be vectorized into a single load or store instruction. Since we currently assume that the tags to merge always share the same final access type, we incorrectly return a tag that describes an access to one of the original final access types as the generic tag. This patch fixes that by producing generic tags for the common type and not the final access types of the original tags. Resolves: PR35225: Wrong tbaa metadata after load store vectorizer due to recent change https://bugs.llvm.org/show_bug.cgi?id=35225 Differential Revision: https://reviews.llvm.org/D39732 llvm-svn: 317682	2017-11-08 11:42:21 +00:00
Simon Dardis	789f7ca265	[mips] Guard indirect and tailcall pseudo instructions correctly. Previously these pseudo instructions were not guarded by ISA, so their selection was dependent on the ordering of the entries in the DAG matcher. Reviewers: atanasyan Differential Revision: https://reviews.llvm.org/D39723 llvm-svn: 317681	2017-11-08 11:13:44 +00:00
Nuno Lopes	17921d9e21	BasicAA: fix bug where we would return partialalias instead of noalias My fix is conservative and will make us return may-alias instead. The test case is: check(gep(x, 0), n, gep(x, n), -1) with n == sizeof(x) Here, the first value accesses the whole object, but the second access doesn't access anything. The semantics of -1 is read until the end of the object, which in this case means read nothing. No test case, since it isn't trivial to exploit this one, but I've proved it correct. llvm-svn: 317680	2017-11-08 10:59:00 +00:00
Alex Bradbury	cc988415fe	[NFCI] Ensure TargetOpcode::* are compatible with guessInstructionProperties=0 rL162640 introduced CodeGenTarget::guessInstructionProperties. If a target sets guessInstructionProperties=0 in its FooInstrInfo, tablegen will error if it has to guess properties from patterns. Unfortunately, guessInstructionProperties=0 can't be used with current upstream LLVM as instructions in the TargetOpcode namespace are always included and sometimes have inferred properties for mayLoad, mayStore, and hasSideEffects. This patch provides the simplest possible fix to this problem, setting default values for these fields in the TargetOpcode scope. There is no intended functional change, as the explicitly set properties should match what was previously inferred. A number of the instructions had hasSideEffects=1 inferred unintentionally. This patch makes it explicit, while future patches (such as D37097) correct the property. Differential Revision: https://reviews.llvm.org/D37065 llvm-svn: 317674	2017-11-08 09:26:06 +00:00
Matt Arsenault	f6ee94c1c6	DAG: Add computeKnownBitsForFrameIndex Some of the AMDGPU stack addressing modes require knowing the sign bit is zero. We used to accomplish this by custom lowering frame indexes, and then putting an AssertZext around a TargetFrameIndex. This required specifically looking for the AssertZext + frame index pattern which was moderately disgusting. The same could probably be accomplished with a target specific node, but would still require special handling of frame indexes. llvm-svn: 317671	2017-11-08 08:52:31 +00:00
Serguei Katkov	3664aa8658	Revert "[CGP] Enable extending scope of optimizeMemoryInst" Revert the patch r317665 causing buildbot failures. llvm-svn: 317667	2017-11-08 05:38:54 +00:00
Serguei Katkov	ee892325bf	[CGP] Enable extending scope of optimizeMemoryInst This patch enables the folding of address computation into a memory instruction when the address is represented by a Phi node. The inputs of the Phi node might differ in base register. Differential Revision: https://reviews.llvm.org/D36073 llvm-svn: 317665	2017-11-08 05:02:51 +00:00
Craig Topper	65e6d0b758	[X86] Add patterns to fold EVEX store with EVEX encoded vcvtps2ph instructions. Remove bad pattern that had vf432 vcvtps2ph storing 128-bits. llvm-svn: 317662	2017-11-08 04:00:31 +00:00
Craig Topper	b832ee68b4	[X86] Allow legacy vcvtps2ph intrinsics to select EVEX encoded instructions. Rely on EVEX->VEX to convert back. Missed store folding opportunities will be fixed in a subsequent commit. llvm-svn: 317661	2017-11-08 04:00:30 +00:00
Rafael Espindola	0d7a38a81d	Convert FileOutputBuffer::commit to Error. llvm-svn: 317656	2017-11-08 01:50:29 +00:00
Dave Lee	48db01b980	Revert "Reapply: Allow yaml2obj to order implicit sections for ELF" This reverts commit r317646. llvm-svn: 317654	2017-11-08 01:31:20 +00:00
Rafael Espindola	81ca0df8f4	Update unittest too. llvm-svn: 317651	2017-11-08 01:10:05 +00:00
Rafael Espindola	e0df357dbd	Convert FileOutputBuffer to Expected. NFC. llvm-svn: 317649	2017-11-08 01:05:44 +00:00
David Blaikie	3f833edc7c	Target/TargetInstrInfo.h -> CodeGen/TargetInstrInfo.h to match layering This header includes CodeGen headers, and is not, itself, included by any Target headers, so move it into CodeGen to match the layering of its implementation. llvm-svn: 317647	2017-11-08 01:01:31 +00:00
Dave Lee	7db10de5e6	Reapply: Allow yaml2obj to order implicit sections for ELF Summary: This change allows yaml input to control the order of implicitly added sections (`.symtab`, `.strtab`, `.shstrtab`). The order is controlled by adding a placeholder section of the given name to the Sections field. This change is to support changes in D39582, where it is desirable to control the location of the `.dynsym` section. This reapplied version fixes: 1. use of a function call within an assert 2. failing lld test which has an unnamed section Additionally, one more test to cover the unnamed section failure. Reviewers: compnerd, jakehehrlich Reviewed By: jakehehrlich Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D39749 llvm-svn: 317646	2017-11-08 00:58:50 +00:00
Matt Arsenault	4709ab9124	AMDGPU: Set correct sched model on v_mad_u64_u32 llvm-svn: 317645	2017-11-08 00:48:25 +00:00
Mitch Phillips	0222224da6	Revert rL317618 The implemented pass fails and is breaking a large number of unit tests. Example: http://lab.llvm.org:8011/builders/clang-with-lto-ubuntu/builds/5777/steps/build-stage3-compiler/logs/stdio This reverts commit rL317618 llvm-svn: 317641	2017-11-08 00:20:53 +00:00
Sriraman Tallam	056b3fd6fb	Attribute nonlazybind should not affect calls to functions with hidden visibility. Differential Revision: https://reviews.llvm.org/D39625 llvm-svn: 317639	2017-11-08 00:01:05 +00:00
Paul Robinson	63efdd32dd	Reapply r317609 with a simpler sed script, thanks to Justin Bogner! llvm-svn: 317634	2017-11-07 23:17:43 +00:00
Dave Lee	dcce03300d	Revert "Allow yaml2obj to order implicit sections for ELF" Also, revert "Fix build bots after r317622" This reverts commit r317622, r317626. llvm-svn: 317630	2017-11-07 22:51:27 +00:00
Paul Robinson	9737f54f1d	Revert r317609, test fails on one bot llvm-svn: 317628	2017-11-07 22:39:12 +00:00
Dave Lee	c227512ce4	Fix build bots after r317622 Example build failure: http://lab.llvm.org:8011/builders/lld-x86_64-darwin13/builds/14660 TIL that the warning flags for local builds are loose compared to what build servers use. llvm-svn: 317626	2017-11-07 22:33:07 +00:00
Justin Lebar	da9e0bd3a2	[NVPTX] Implement __nvvm_atom_add_gen_d builtin. Summary: This just seems to have been an oversight. We already supported the f64 atomic add with an explicit scope (e.g. "cta"), but not the scopeless version. Reviewers: tra Subscribers: jholewinski, sanjoy, cfe-commits, llvm-commits, hiraditya Differential Revision: https://reviews.llvm.org/D39638 llvm-svn: 317623	2017-11-07 22:10:54 +00:00
Dave Lee	3ae8dfda06	Allow yaml2obj to order implicit sections for ELF Summary: This change allows yaml input to control the order of implicitly added sections (`.symtab`, `.strtab`, `.shstrtab`). The order is controlled by adding a placeholder section of the given name to the Sections field. This change is to support changes in D39582, where it is desirable to control the location of the `.dynsym` section. Reviewers: compnerd, jakehehrlich Reviewed By: jakehehrlich Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D39749 llvm-svn: 317622	2017-11-07 22:05:24 +00:00
Dinar Temirbulatov	b9a2832874	[SLPVectorizer] Failure to beneficially vectorize 'copyable' elements in integer binary ops. Patch tries to improve vectorization of the following code: void add1(int * __restrict dst, const int * __restrict src) { *dst++ = *src++; *dst++ = *src++ + 1; *dst++ = *src++ + 2; *dst++ = *src++ + 3; } Allows to vectorize even if the very first operation is not a binary add, but just a load. Fixed PR34619 and other issues related to previous commit. Reviewers: spatel, mzolotukhin, mkuper, hfinkel, RKSimon, filcab, ABataev Reviewed By: ABataev, RKSimon Subscribers: llvm-commits, RKSimon Differential Revision: https://reviews.llvm.org/D28907 llvm-svn: 317618	2017-11-07 21:25:34 +00:00
Mitch Phillips	40d6663367	Extend SpecialCaseList to allow users to blame matches on entries in the file. Summary: Extends SCL functionality to allow users to find the line number in the file the SCL is built from through SpecialCaseList::inSectionBlame(...). Also removes the need to compile the SCL before use. As the matcher now contains a list of regexes to test against instead of a single regex, the regexes can be individually built on each insertion rather than one large compilation at the end of construction. This change also fixes a bug where blank lines would cause the parser to become out-of-sync with the line number. An error on line `k` was being reported as being on line `k - num_blank_lines_before_k`. Note: This change has a cyclical dependency on D39486. Both these changes must be submitted at the same time to avoid a build breakage. Reviewers: vlad.tsyrklevich Reviewed By: vlad.tsyrklevich Subscribers: kcc, pcc, llvm-commits Differential Revision: https://reviews.llvm.org/D39485 llvm-svn: 317617	2017-11-07 21:16:46 +00:00
Craig Topper	87e715fbac	[CodeGenPrepare] Fix typo in comment. NFC llvm-svn: 317614	2017-11-07 20:56:17 +00:00
Graham Yiu	5cd044e8c8	Use new vector insert half-word and byte instructions when we see insertelement on '8 x i16' and '16 x i8' types. Also extended existing lit testcase to cover these cases. Differential Revision: https://reviews.llvm.org/D34630 llvm-svn: 317613	2017-11-07 20:55:43 +00:00
Paul Robinson	64b047fcc1	Convert a dwarfdump test from checked-in binary to assembler source. llvm-svn: 317612	2017-11-07 20:35:44 +00:00
Paul Robinson	c58fbe22ea	[DWARFv5] Add new test for previous commit. llvm-svn: 317609	2017-11-07 20:12:58 +00:00
Paul Robinson	e5400f8a6e	[DWARFv5] Support DW_FORM_strp in the .debug_line header. Supporting this form in .debug_line.dwo will be done as a follow-up. Differential Revision: https://reviews.llvm.org/D33155 llvm-svn: 317607	2017-11-07 19:57:12 +00:00
Craig Topper	7dd4d32431	Recommit r317510 "[InstCombine] Pull shifts through a select plus binop with constant" The hexagon test should be fixed now. Original commit message: This pulls shifts through a select+binop with a constant where the select conditionally executes the binop. We already do this for just the binop, but not with the select. This can allow us to get the select closer to other selects to enable removing one. Differential Revision: https://reviews.llvm.org/D39222 llvm-svn: 317600	2017-11-07 18:47:24 +00:00
Craig Topper	386fc2516c	[InstCombine] Update stale comment. NFC Datalayout is no longer optional so the comment didn't match what the code currently does. llvm-svn: 317594	2017-11-07 17:37:32 +00:00
Krzysztof Parzyszek	385a4e0489	[Hexagon] Make a test more flexible in HexagonLoopIdiomRecognition An "or" that sets the sign-bit can be replaced with a "xor", if the sign-bit was known to be clear before. With some changes to instruction combining, the simple sign-bit check was failing. Replace it with a more flexible one to catch more cases. llvm-svn: 317592	2017-11-07 17:05:54 +00:00
Florian Hahn	b936810833	[AArch64][SVE] Asm: Add support for (ADD|SUB)_ZZZ Patch [5/5] in a series to add assembler/disassembler support for AArch64 SVE unpredicated ADD/SUB instructions. Patch by Sander De Smalen. Reviewed by: rengolin Differential Revision: https://reviews.llvm.org/D39091 llvm-svn: 317591	2017-11-07 16:58:13 +00:00
Florian Hahn	91f11e5ad1	[AArch64][SVE] Asm: Add SVE (Z) Register definitions and parsing support Patch [3/5] in a series to add assembler/disassembler support for AArch64 SVE unpredicated ADD/SUB instructions. To summarise, this patch adds: * SVE register definitions * Methods to parse SVE register operands * Methods to print SVE register operands * RegKind SVEDataVector to distinguish it from other data types like scalar register or Neon vector. * k_SVEDataRegister and SVEDataRegOp to describe SVE registers (which will be extended by further patches with e.g. ElementWidth and the shift-extend type). Patch by Sander De Smalen. Reviewed by: rengolin Differential Revision: https://reviews.llvm.org/D39089 llvm-svn: 317590	2017-11-07 16:45:48 +00:00
Craig Topper	41bf240726	[SelectionDAG] Fix typo in comment. NFC llvm-svn: 317588	2017-11-07 16:32:31 +00:00
Florian Hahn	d825bbdc41	[AArch64][SVE] Asm: Set SVE as unsupported feature for existing scheduler models. Patch [4/5] in a series to add assembler/disassembler support for AArch64 SVE unpredicated ADD/SUB instructions. We add SVE as unsupported feature for CPUs that don't have SVE to prevent errors from scheduler models saying it lacks information for these instructions. Patch by Sander De Smalen. Reviewed by: rengolin Differential Revision: https://reviews.llvm.org/D39090 llvm-svn: 317582	2017-11-07 15:03:11 +00:00
Petar Jovanovic	e2a585dddc	Reland "Correct dwarf unwind information in function epilogue for X86" Reland r317100 with minor fix regarding ComputeCommonTailLength function in BranchFolding.cpp. Skipping top CFI instructions block needs to be executed on several more return points in ComputeCommonTailLength(). Original r317100 message: "Correct dwarf unwind information in function epilogue for X86" This patch aims to provide correct dwarf unwind information in function epilogue for X86. It consists of two parts. The first part inserts CFI instructions that set appropriate cfa offset and cfa register in emitEpilogue() in X86FrameLowering. This part is X86 specific. The second part is platform independent and ensures that: - CFI instructions do not affect code generation - Unwind information remains correct when a function is modified by different passes. This is done in a late pass by analyzing information about cfa offset and cfa register in BBs and inserting additional CFI directives where necessary. Changed CFI instructions so that they: - are duplicable - are not counted as instructions when tail duplicating or tail merging - can be compared as equal Added CFIInstrInserter pass: - analyzes each basic block to determine cfa offset and register valid at its entry and exit - verifies that outgoing cfa offset and register of predecessor blocks match incoming values of their successors - inserts additional CFI directives at basic block beginning to correct the rule for calculating CFA Having CFI instructions in function epilogue can cause incorrect CFA calculation rule for some basic blocks. This can happen if, due to basic block reordering, or the existence of multiple epilogue blocks, some of the blocks have wrong cfa offset and register values set by the epilogue block above them. CFIInstrInserter is currently run only on X86, but can be used by any target that implements support for adding CFI instructions in epilogue. Patch by Violeta Vukobrat. llvm-svn: 317579	2017-11-07 14:40:27 +00:00
Kristof Beyls	d5b7a00fd5	Silence MSVC error C2398 Reported by http://lab.llvm.org:8011/builders/llvm-clang-x86_64-expensive-checks-win/builds/6000/steps/build-unified-tree/logs/stdio The error messages were all similar to: llvm\unittests\CodeGen\GlobalISel\LegalizerInfoTest.cpp(54): error C2398: Element '1': conversion from '' to 'unsigned int' requires a narrowing conversion llvm-svn: 317578	2017-11-07 14:37:01 +00:00
Alexey Bataev	e25a6fd390	[SLP] Fix PR35047: Fix default cost model for cast op in X86. Summary: The cost calculation for the default case on the X86 target does not always follow the correct way because of a missing 4th argument in the `BaseT::getCastInstrCost()` call. Added this missing parameter. Reviewers: hfinkel, mkuper, RKSimon, spatel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D39687 llvm-svn: 317576	2017-11-07 14:23:44 +00:00
Kristof Beyls	78aa4b28a3	Mark intentional fall-through with LLVM_FALLTHROUGH. ... to silence gcc 7's default -Wimplicit-fallthrough. llvm-svn: 317573	2017-11-07 13:31:52 +00:00
Alexander Richardson	46e1fd6102	Add a -D flag to FileCheck to define variables Summary: This makes it very easy to test files that only differ in a constant value somewhere in the test case. Reviewers: jlebar, hfinkel, chandlerc, probinson Reviewed By: probinson Subscribers: probinson, llvm-commits Differential Revision: https://reviews.llvm.org/D39629 llvm-svn: 317572	2017-11-07 13:24:44 +00:00
Simon Pilgrim	9a6b720f4f	[X86] Regenerate select tests llvm-svn: 317571	2017-11-07 13:21:02 +00:00
Florian Hahn	c4422247b3	[AArch64][SVE] Asm: Replace 'IsVector' by 'RegKind' in AArch64AsmParser (NFC) Patch [2/5] in a series to add assembler/disassembler support for AArch64 SVE unpredicated ADD/SUB instructions. This change is a non functional change that adds RegKind as an alternative to 'isVector' to prepare it for newer types (SVE data vectors and predicate vectors) that will be added in next patches (where the SVE data vector is added as part of this patch set) Patch by Sander De Smalen. Reviewed by: rengolin Differential Revision: https://reviews.llvm.org/D39088 llvm-svn: 317569	2017-11-07 13:07:50 +00:00
Kristof Beyls	178818ba20	Silence C4715 warning from MSVC (NFC). The warning started triggering after r317560. This commit silences it in the same way as previously done in a similar situation, see http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20140915/236088.html llvm-svn: 317568	2017-11-07 11:54:00 +00:00
Florian Hahn	603c6455d2	[AArch64][SVE] Asm: Extend EnforceVectorSubVectorTypeIs to distinguish Scalable Vectors Patch [1/5] in a series to add assembler/disassembler support for AArch64 SVE unpredicated ADD/SUB instructions. Patch by Sander De Smalen. Reviewed by: rengolin Differential Revision: https://reviews.llvm.org/D39087 llvm-svn: 317564	2017-11-07 10:43:56 +00:00
Kristof Beyls	af9814a1fc	[GlobalISel] Enable legalizing non-power-of-2 sized types. This changes the interface of how targets describe how to legalize, see the below description. 1. Interface for targets to describe how to legalize. In GlobalISel, the API in the LegalizerInfo class is the main interface for targets to specify which types are legal for which operations, and what to do to turn illegal type/operation combinations into legal ones. For each operation the type sizes that can be legalized without having to change the size of the type are specified with a call to setAction. This isn't different to how GlobalISel worked before. For example, for a target that supports 32 and 64 bit adds natively: for (auto Ty : {s32, s64}) setAction({G_ADD, 0, Ty}, Legal); or for a target that needs a library call for a 32 bit division: setAction({G_SDIV, s32}, Libcall); The main conceptual change to the LegalizerInfo API is in specifying how to legalize the type sizes for which a change of size is needed. For example, in the above example, how to specify how all types from i1 to i8388607 (apart from s32 and s64 which are legal) need to be legalized and expressed in terms of operations on the available legal sizes (again, i32 and i64 in this case). Before, the implementation only allowed specifying power-of-2-sized types (e.g. setAction({G_ADD, 0, s128}, NarrowScalar)). A worse limitation was that if you'd wanted to specify how to legalize all the sized types as allowed by the LLVM-IR LangRef, i1 to i8388607, you'd have to call setAction 8388607-3 times and probably would need a lot of memory to store all of these specifications. Instead, the legalization actions that need to change the size of the type are specified now using a "SizeChangeStrategy". 
For example: setLegalizeScalarToDifferentSizeStrategy( G_ADD, 0, widenToLargerAndNarrowToLargest); This example indicates that for type sizes for which there is a larger size that can be legalized towards, do it by Widening the size. For example, G_ADD on s17 will be legalized by first doing WidenScalar to make it s32, after which it's legal. The "NarrowToLargest" indicates what to do if there is no larger size that can be legalized towards. E.g. G_ADD on s92 will be legalized by doing NarrowScalar to s64. Another example, taken from the ARM backend is: for (unsigned Op : {G_SDIV, G_UDIV}) { setLegalizeScalarToDifferentSizeStrategy(Op, 0, widenToLargerTypesUnsupportedOtherwise); if (ST.hasDivideInARMMode()) setAction({Op, s32}, Legal); else setAction({Op, s32}, Libcall); } For this example, G_SDIV on s8, on a target without a divide instruction, would be legalized by first doing action (WidenScalar, s32), followed by (Libcall, s32). The same principle is also followed for when the number of vector lanes on vector data types need to be changed, e.g.: setAction({G_ADD, LLT::vector(8, 8)}, LegalizerInfo::Legal); setAction({G_ADD, LLT::vector(16, 8)}, LegalizerInfo::Legal); setAction({G_ADD, LLT::vector(4, 16)}, LegalizerInfo::Legal); setAction({G_ADD, LLT::vector(8, 16)}, LegalizerInfo::Legal); setAction({G_ADD, LLT::vector(2, 32)}, LegalizerInfo::Legal); setAction({G_ADD, LLT::vector(4, 32)}, LegalizerInfo::Legal); setLegalizeVectorElementToDifferentSizeStrategy( G_ADD, 0, widenToLargerTypesUnsupportedOtherwise); As currently implemented here, vector types are legalized by first making the vector element size legal, followed by then making the number of lanes legal. The strategy to follow in the first step is set by a call to setLegalizeVectorElementToDifferentSizeStrategy, see example above. 
The strategy followed in the second step is "moreToWiderTypesAndLessToWidest" (see code for its definition), indicating that vectors are widened to more elements so they map to natively supported vector widths, or when there isn't a legal wider vector, split the vector to map it to the widest vector supported. Therefore, for the above specification, some example legalizations are: * getAction({G_ADD, LLT::vector(3, 3)}) returns {WidenScalar, LLT::vector(3, 8)} * getAction({G_ADD, LLT::vector(3, 8)}) then returns {MoreElements, LLT::vector(8, 8)} * getAction({G_ADD, LLT::vector(20, 8)}) returns {FewerElements, LLT::vector(16, 8)} 2. Key implementation aspects. How to legalize a specific (operation, type index, size) tuple is represented by mapping intervals of integers representing a range of size types to an action to take, e.g.: setScalarAction({G_ADD, LLT::scalar(1)}, {{1, WidenScalar}, // bit sizes [ 1, 32[ {32, Legal}, // bit sizes [32, 33[ {33, WidenScalar}, // bit sizes [33, 64[ {64, Legal}, // bit sizes [64, 65[ {65, NarrowScalar} // bit sizes [65, +inf[ }); Please note that most of the code to do the actual lowering of non-power-of-2 sized types is currently missing, this is just trying to make it possible for targets to specify what is legal, and how non-legal types should be legalized. Probably quite a bit of further work is needed in the actual legalizing and the other passes in GlobalISel to support non-power-of-2 sized types. I hope the documentation in LegalizerInfo.h and the examples provided in the various {Target}LegalizerInfo.cpp and LegalizerInfoTest.cpp explain well enough how this is meant to be used. This drops the need for LLT::{half,double}...Size(). Differential Revision: https://reviews.llvm.org/D30529 llvm-svn: 317560	2017-11-07 10:34:34 +00:00
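The interval table described in "Key implementation aspects" can be illustrated with a small standalone sketch. This is not the LegalizerInfo code from the commit: the `Action` enum, `SizeAndAction`, `addTable`, and `findAction` are hypothetical names chosen for illustration, and the table simply mirrors the G_ADD example above (each entry gives the first bit size of a half-open interval).

```cpp
#include <algorithm>
#include <cstdint>
#include <iterator>
#include <utility>
#include <vector>

// Hypothetical stand-ins for the legalization actions named in the message.
enum class Action { Legal, WidenScalar, NarrowScalar, Unsupported };

// Each entry is (first bit size of the interval, action for that interval).
using SizeAndAction = std::pair<uint32_t, Action>;

// Mirrors the G_ADD example: s32 and s64 are legal, everything else changes size.
std::vector<SizeAndAction> addTable() {
  return {{1, Action::WidenScalar},    // bit sizes [ 1, 32[
          {32, Action::Legal},         // bit sizes [32, 33[
          {33, Action::WidenScalar},   // bit sizes [33, 64[
          {64, Action::Legal},         // bit sizes [64, 65[
          {65, Action::NarrowScalar}}; // bit sizes [65, +inf[
}

// Looks up the action for a bit size. The table must be sorted by interval start.
Action findAction(const std::vector<SizeAndAction> &Table, uint32_t Size) {
  // Find the first interval that starts after Size; the previous one covers it.
  auto It = std::upper_bound(
      Table.begin(), Table.end(), Size,
      [](uint32_t S, const SizeAndAction &E) { return S < E.first; });
  if (It == Table.begin())
    return Action::Unsupported; // Size lies below the first interval.
  return std::prev(It)->second;
}
```

With this table, `findAction(addTable(), 17)` yields `WidenScalar` and `findAction(addTable(), 92)` yields `NarrowScalar`, matching the s17 and s92 examples in the message. Storing only interval starts keeps the table at five entries instead of one entry per possible bit size.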
Serguei Katkov	365200295a	[CGP] Disable Select instruction handling in optimizeMemoryInst. NFC This patch disables the handling of selects in the optimization that extends the scope of optimizeMemoryInst. The optimization itself is disabled by default. The idea here is just to switch on the optimization step by step. Specifically, the optimization will first be enabled only for Phi nodes, then select instructions will be added. If someone complains about performance, it will be easier to detect which part of the optimization is responsible for that. Differential Revision: https://reviews.llvm.org/D36073 llvm-svn: 317555	2017-11-07 09:43:08 +00:00
Peter Smith	7411da557f	[docs][ARM] Add HowTo for cross compiling and testing compiler-rt builtins This document contains information on how to cross-compile the compiler-rt builtins library for several flavours of Arm target and how to test the libraries using qemu. Differential Revision: https://reviews.llvm.org/D39600 llvm-svn: 317554	2017-11-07 09:40:05 +00:00
Bjorn Steinbrink	c02b237e46	[X86] Don't clobber reserved registers with stack adjustments Summary: Calls using invoke in funclet based functions are assumed to clobber all registers, which causes the stack adjustment using pops to consider all registers not defined by the call to be undefined, which can unfortunately include the base pointer, if one is needed. To prevent this (and possibly other hazards), skip reserved registers when looking for candidate registers. This fixes issue #45034 in the Rust compiler. Reviewers: mkuper Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D39636 llvm-svn: 317551	2017-11-07 08:50:21 +00:00
Craig Topper	e7fb300226	[X86] Add patterns to fold a 64-bit load into the EVEX vcvtph2ps instructions. llvm-svn: 317548	2017-11-07 07:13:07 +00:00
Craig Topper	0231b1d445	[X86] Add patterns for folding a v16i8 with the VEX vcvtph2ps intrinsics. Disable the peephole pass to prove that the pattern is working. llvm-svn: 317547	2017-11-07 07:13:06 +00:00
Craig Topper	65fc53320b	[X86] Add a test for a 128-bit vector load feeding a cvtph2ps intrinsic. The instruction only loads 64-bits, but we should be able to fold a wider load and let it be narrowed. llvm-svn: 317546	2017-11-07 07:13:05 +00:00
Craig Topper	8942b33f84	[X86] Remove alignment from a load in the f16c intrinsic test. The alignment shouldn't be required for load folding. llvm-svn: 317545	2017-11-07 07:13:04 +00:00
Craig Topper	cf8e6d0a76	[X86] Add support for using EVEX instructions for the legacy vcvtph2ps intrinsics. Looks like there's some missed load folding opportunities for i64 loads. llvm-svn: 317544	2017-11-07 07:13:03 +00:00
Craig Topper	75510dd6f7	[X86] Add AVX512VL command line to f16c intrinsic test to show missed EVEX opportunities for the legacy intrinsics. llvm-svn: 317543	2017-11-07 07:13:01 +00:00
Craig Topper	afc3c8206e	[X86] Use IMPLICIT_DEF in VEX/EVEX vcvtss2sd/vcvtsd2ss patterns instead of a COPY_TO_REGCLASS. ExeDepsFix pass should take care of making the registers match. llvm-svn: 317542	2017-11-07 04:44:22 +00:00
Craig Topper	4ad81b51ed	[X86] Remove 'Requires' from instructions with no patterns. NFC llvm-svn: 317541	2017-11-07 04:44:21 +00:00
Davide Italiano	1f465aa64a	[Support/UNIX] posix_fallocate() can fail with EINVAL. According to the docs on opengroup.org, the function can return EINVAL if: The len argument is less than zero, or the offset argument is less than zero, or the underlying file system does not support this operation. I'd say it's a peculiar choice (when EOPNOTSUPP is right there), but let's keep POSIX happy for now. This was independently discovered by Mark Millard (on FreeBSD/ZFS). Quickly ack'ed by Rui on IRC. llvm-svn: 317535	2017-11-07 00:47:04 +00:00
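A minimal sketch of the kind of fallback this message implies, assuming a POSIX system that provides `posix_fallocate()` and `ftruncate()`. It is not the code from the commit; `isUnsupportedFallocateError` and `resizeFile` are hypothetical helper names, and retrying with `ftruncate()` is an assumption about how a caller might react to EINVAL.

```cpp
#include <cerrno>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

// Error values after which retrying with ftruncate() is reasonable, because
// they may only mean "the file system cannot preallocate".
bool isUnsupportedFallocateError(int Err) {
  return Err == EINVAL || Err == EOPNOTSUPP;
}

// Resizes the file behind FD to Size bytes.
// Returns 0 on success or an errno value on failure.
int resizeFile(int FD, off_t Size) {
  // posix_fallocate() returns the error number directly; it does not set errno.
  int Err = ::posix_fallocate(FD, 0, Size);
  if (Err == 0)
    return 0;
  if (!isUnsupportedFallocateError(Err))
    return Err; // A genuine failure such as ENOSPC or EBADF.
  // Fall back to plain truncation, which does not reserve disk blocks.
  if (::ftruncate(FD, Size) == -1)
    return errno;
  return 0;
}
```

The trade-off is that the fallback path no longer guarantees that the space is actually reserved, so a later write can still fail when the disk is full.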
Adrian Prantl	25a09dd408	Make DIExpression::createFragmentExpression() return an Optional. We can't safely split arithmetic into multiple fragments because we can't express carry-over between fragments. llvm-svn: 317534	2017-11-07 00:45:34 +00:00
Keith Wyss	424279958d	[XRay] Minimal tool to convert xray traces to Chrome's Trace Event Format. Summary: Make use of Chrome Trace Event format's Duration events and stack frame dict to produce JSON files that chrome://tracing can visualize from xray function call traces. Trace Event format is more robust and has several features like argument logging, function categorization, multi process traces, etc. that we can add as needed. Duration events cover an important base case. Part of this change is rearranging the code so that the TrieNode data structure can be used from multiple tools and can carry parameterized baggage on the nodes. I put the actual behavior changes in llvm-xray convert exclusively. Exploring the trace of instrumented llc was pretty nifty if overwhelming. I can envision this being very useful for analyzing contention scenarios or tuning parameters like batch sizes in a producer consumer queue. For more targeted traces like this, let's talk about how we want to approach trace pruning. Reviewers: dberris, pelikan Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D39362 llvm-svn: 317531	2017-11-07 00:28:28 +00:00
Davide Italiano	1a46affb45	[IPO/LowerTypesTest] Skip blockaddress(es) when replacing uses. Blockaddresses refer to the function itself, therefore replacing them would cause an assertion in doRAUW. Fixes https://bugs.llvm.org/show_bug.cgi?id=35201 This was found when trying CFI on a proprietary kernel by Dmitry Mikulin. Differential Revision: https://reviews.llvm.org/D39695 llvm-svn: 317527	2017-11-07 00:09:25 +00:00
Matt Arsenault	6119f80034	AMDGPU: Remove redundant combine This combine was already done in two places. The generic combiner already has done this since r217610, for adds (with a single use). This one was added in r303641, and added support for handling or as well. r313251 later added support to the generic combine for or. It also turns out the isOrEquivalentToAdd check is not necessary for this combine. Additionally, we already reproduce this combine in yet another place in the backend, although in that version multiple uses of the add are still folded if it will allow a fold into the addressing mode. That version needs to be improved to understand ors though, as well as the correct legal offsets for private. llvm-svn: 317526	2017-11-07 00:06:32 +00:00
Vedant Kumar	2b881f567f	[DebugInfo] Unify logic to merge DILocations. NFC. This makes DILocation::getMergedLocation() do what its comment says it does when merging locations for an Instruction: set the common inlineAt scope. This simplifies Instruction::applyMergedLocation() a bit. Testing: check-llvm, check-clang Differential Revision: https://reviews.llvm.org/D39628 llvm-svn: 317524	2017-11-06 23:15:21 +00:00
Simon Dardis	8bdbff37fe	[Support][Chrono] Use explicit cast of text output of time values. rL316419 exposed a platform specific issue where the type of the values passed to llvm::format could be different to the format string. Debian unstable for mips uses long long int for std::chrono::duration, while x86_64 uses long int. For mips, this resulted in the value being corrupted when rendered to a string. Address this by explicitly casting the result of the duration_cast to the type specified in the format string. Reviewers: sammccall Differential Revision: https://reviews.llvm.org/D39597 llvm-svn: 317523	2017-11-06 23:01:46 +00:00
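The fix described here can be illustrated with plain `std::snprintf` instead of `llvm::format`. This is a sketch of the general technique only, not the patched LLVM code; `formatMillis` is a hypothetical helper. The point is that the representation type of a `std::chrono` duration is implementation-defined, so the value is cast to exactly the type the format specifier expects.

```cpp
#include <chrono>
#include <cstdio>
#include <string>

// Renders a duration as whole milliseconds.
std::string formatMillis(std::chrono::nanoseconds D) {
  auto Ms = std::chrono::duration_cast<std::chrono::milliseconds>(D);
  char Buf[32];
  // Ms.count() may be long or long long depending on the platform, so cast it
  // explicitly to match the "%lld" specifier.
  std::snprintf(Buf, sizeof(Buf), "%lld ms",
                static_cast<long long>(Ms.count()));
  return Buf;
}
```

Passing `Ms.count()` uncast would be undefined behavior on any platform where its type differs from what the specifier names, which is the kind of corruption the message reports on mips.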
Adrian Prantl	182f9fea37	InstCombine: salvage the debug info of DCE'ed add instructions. rdar://problem/31209283 llvm-svn: 317522	2017-11-06 22:49:39 +00:00
Craig Topper	428a4e6374	[X86] Make FeatureAVX512 imply FeatureF16C. The EVEX to VEX pass is already assuming this is true under AVX512VL. We had special patterns to use zmm instructions if VLX and F16C weren't available. Instead just make AVX512 imply F16C to make the EVEX to VEX behavior explicitly legal and remove the extra patterns. All known CPUs with AVX512 have F16C so this should be safe for now. llvm-svn: 317521	2017-11-06 22:49:04 +00:00
Craig Topper	cb6c38612e	[X86] Make FeatureAVX512 imply FeatureFMA. Previously our VEX patterns were checking Subtarget.hasFMA() which checked FMA || AVX512. So we were behaving as if AVX512 implied it anyway. Which means we'd allow VEX encoded 128/256 FMA when AVX512F was enabled but AVX512VL is off. Regardless of the FMA flag. EVEX to VEX also transforms scalar EVEX FMA instructions to their VEX versions even without the FMA flag. Similarly for 128/256 under AVX512VL. So this makes AVX512 imply FeatureFMA to make our current behavior explicit. All known CPUs that support AVX512 have VEX FMA instructions. llvm-svn: 317520	2017-11-06 22:49:01 +00:00
Sanjay Patel	86d24f1668	[ValueTracking] readonly (const) is a requirement for converting sqrt to llvm.sqrt; nnan is not As discussed in D39204, this is effectively a revert of rL265521 which required nnan to vectorize sqrt libcalls based on the old LangRef definition of llvm.sqrt. Now that the definition has been updated so the libcall and intrinsic have the same semantics apart from potentially setting errno, we can remove the nnan requirement. We have the right check to know that errno is not set: if (!ICS.onlyReadsMemory()) ...ahead of the switch. This will solve https://bugs.llvm.org/show_bug.cgi?id=27435 assuming that's being built for a target with -fno-math-errno. Differential Revision: https://reviews.llvm.org/D39642 llvm-svn: 317519	2017-11-06 22:40:09 +00:00
Hans Wennborg	8c4b10e84a	Revert r317510 "[InstCombine] Pull shifts through a select plus binop with constant" This broke the CodeGen/Hexagon/loop-idiom/pmpy-mod.ll test on a bunch of buildbots. > This pulls shifts through a select+binop with a constant where the select conditionally executes the binop. We already do this for just the binop, but not with the select. > > This can allow us to get the select closer to other selects to enable removing one. > > Differential Revision: https://reviews.llvm.org/D39222 > > git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317510 91177308-0d34-0410-b5e6-96231b3b80d8 llvm-svn: 317518	2017-11-06 22:28:02 +00:00
Hans Wennborg	effc12dd43	Revert r316064 "Fix the incorrect detection of ICONV_LIBRARY_PATH" This broke the use of libxml2 on machines where iconv() is provided by libc. I'll follow up on the mailing list to discuss how to fix this properly. > This is introduced in rL308711. > Check for c library is incorrect here just because libc will be found always > and it does not mean that iconv is presented. > > Thank to Andrew Krasny for narrowing down the root cause. > > Reviewers: ecbeckmann > Reviewed By: ecbeckmann > Subscribers: mgorny, llvm-commits > Differential Revision: https://reviews.llvm.org/D38875 llvm-svn: 317517	2017-11-06 22:17:23 +00:00
Xinliang David Li	a531f189fc	Fix comment /NFC llvm-svn: 317514	2017-11-06 21:57:51 +00:00
Bjorn Pettersson	a42ed3e361	[MIRPrinter] Use %subreg.xxx syntax for subregister index operands Summary: Print %subreg.<subregidxname> instead of just the subregister index when printing immediate operands corresponding to subreg indices in INSERT_SUBREG, EXTRACT_SUBREG, SUBREG_TO_REG and REG_SEQUENCE. Reviewers: qcolombet, MatzeB Reviewed By: MatzeB Subscribers: nhaehnle, javed.absar, llvm-commits Differential Revision: https://reviews.llvm.org/D39696 llvm-svn: 317513	2017-11-06 21:46:06 +00:00
Craig Topper	8917647333	[InstCombine] Pull shifts through a select plus binop with constant This pulls shifts through a select+binop with a constant where the select conditionally executes the binop. We already do this for just the binop, but not with the select. This can allow us to get the select closer to other selects to enable removing one. Differential Revision: https://reviews.llvm.org/D39222 llvm-svn: 317510	2017-11-06 21:07:22 +00:00
Justin Bogner	f7b1007345	update_mir_test_checks: Be careful about replacing entire vregs Previously, this could end up replacing a vreg like %14 with [[VREG1]]4, where VREG1 was the match for %1. That's obviously not correct, though it hasn't actually come up in any tests I've converted so far. llvm-svn: 317509	2017-11-06 21:06:09 +00:00
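The bug described here is a prefix match: replacing `%1` also rewrites the start of `%14`. A minimal sketch of the guard, written in C++ with `std::regex` for illustration (the actual script is not C++, and `replaceVReg` is a hypothetical helper, not its code):

```cpp
#include <regex>
#include <string>

// Replaces the virtual register "%<Number>" with Replacement, but only when
// the match is the entire register name.
std::string replaceVReg(const std::string &Line, const std::string &Number,
                        const std::string &Replacement) {
  // The negative lookahead rejects matches followed by another digit, so
  // "%1" no longer matches the first two characters of "%14".
  const std::regex Pattern("%" + Number + "(?![0-9])");
  return std::regex_replace(Line, Pattern, Replacement);
}
```

For example, `replaceVReg("%14 = COPY %1", "1", "[[VREG1]]")` leaves `%14` untouched and rewrites only the trailing `%1`.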
Graham Yiu	52a52a6cab	Fix buildbot breakages from r317503. Add parentheses to assignment when using result as a condition. llvm-svn: 317508	2017-11-06 21:04:19 +00:00
Graham Yiu	030621bbcb	Adds code to PPC ISEL lowering to recognize byte inserts from vector_shuffles, and use P9 shift and vector insert byte instructions instead of vperm. Extends tests from vector insert half-word. Differential Revision: https://reviews.llvm.org/D34497 llvm-svn: 317503	2017-11-06 20:18:30 +00:00
Dehao Chen	5d2a1a5045	Include already promoted counts when computing SUM for VP. Summary: When computing the SUM for indirect call promotion, if the callsite is already promoted in the profile, it will be promoted before ICP. In the current implementation, ICP only sees remaining counts in SUM. This may cause extra indirect call targets being promoted. This patch updates the SUM to include the counts already promoted earlier. This way we do not end up promoting too many indirect call targets. Reviewers: tejohnson Reviewed By: tejohnson Subscribers: llvm-commits, sanjoy Differential Revision: https://reviews.llvm.org/D38763 llvm-svn: 317502	2017-11-06 19:52:49 +00:00
Mitch Phillips	6fb3525113	[cfi-verify] Added a simple check that stops division-by-zero error when no indirect CF instructions are found in the provided file. llvm-svn: 317500	2017-11-06 19:14:09 +00:00
Guozhi Wei	e3b8d9a312	[PPC] Use xxbrd to speed up bswap64 Power doesn't have bswap instructions, so llvm generates following code sequence for bswap64. rotldi 5, 3, 16 rotldi 4, 3, 8 rotldi 9, 3, 24 rotldi 10, 3, 32 rotldi 11, 3, 48 rotldi 12, 3, 56 rldimi 4, 5, 8, 48 rldimi 4, 9, 16, 40 rldimi 4, 10, 24, 32 rldimi 4, 11, 40, 16 rldimi 4, 12, 48, 8 rldimi 4, 3, 56, 0 But Power9 has vector bswap instructions, they can also be used to speed up scalar bswap intrinsic. With this patch, bswap64 can be translated to: mtvsrdd 34, 3, 3 xxbrd 34, 34 mfvsrld 3, 34 Differential Revision: https://reviews.llvm.org/D39510 llvm-svn: 317499	2017-11-06 19:09:38 +00:00
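For reference, the operation that both instruction sequences implement is a reversal of the eight bytes of a 64-bit value. A portable sketch follows; `byteSwap64` is a hypothetical name used only for illustration and is not the LLVM intrinsic or its lowering.

```cpp
#include <cstdint>

// Reverses the byte order of a 64-bit value.
uint64_t byteSwap64(uint64_t Value) {
  uint64_t Result = 0;
  for (int I = 0; I < 8; ++I) {
    // Move the lowest remaining byte of Value into the low end of Result,
    // shifting the bytes collected so far up by one position.
    Result = (Result << 8) | (Value & 0xFF);
    Value >>= 8;
  }
  return Result;
}
```

Applied to `0x0102030405060708`, this produces `0x0807060504030201`.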
Mitch Phillips	5ebf7a87f3	Make MCAsmBackend and MCCodeEmitter passed by unique_ptr rval Summary: Fixes build breakage of llvm-mc-assemble-fuzzer introduced by rL315531. Reviewers: lhames Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D39268 llvm-svn: 317498	2017-11-06 18:56:36 +00:00
Matt Arsenault	4f6318fe1b	AMDGPU: Select v_mad_u64_u32 and v_mad_i64_i32 llvm-svn: 317492	2017-11-06 17:04:37 +00:00
Adrian Prantl	3c6491dd75	Canonicalize spelling of long-form-options in dsymutil.rst llvm-svn: 317490	2017-11-06 16:52:05 +00:00
Sanjay Patel	629c411538	[IR] redefine 'UnsafeAlgebra' / 'reassoc' fast-math-flags and add 'trans' fast-math-flag As discussed on llvm-dev: http://lists.llvm.org/pipermail/llvm-dev/2016-November/107104.html and again more recently: http://lists.llvm.org/pipermail/llvm-dev/2017-October/118118.html ...this is a step in cleaning up our fast-math-flags implementation in IR to better match the capabilities of both clang's user-visible flags and the backend's flags for SDNode. As proposed in the above threads, we're replacing the 'UnsafeAlgebra' bit (which had the 'umbrella' meaning that all flags are set) with a new bit that only applies to algebraic reassociation - 'AllowReassoc'. We're also adding a bit to allow approximations for library functions called 'ApproxFunc' (this was initially proposed as 'libm' or similar). ...and we're out of bits. 7 bits ought to be enough for anyone, right? :) FWIW, I did look at getting this out of SubclassOptionalData via SubclassData (spacious 16-bits), but that's apparently already used for other purposes. Also, I don't think we can just add a field to FPMathOperator because Operator is not intended to be instantiated. We'll defer movement of FMF to another day. We keep the 'fast' keyword. I thought about removing that, but seeing IR like this: %f.fast = fadd reassoc nnan ninf nsz arcp contract afn float %op1, %op2 ...made me think we want to keep the shortcut synonym. Finally, this change is binary incompatible with existing IR as seen in the compatibility tests. This statement: "Newer releases can ignore features from older releases, but they cannot miscompile them. For example, if nsw is ever replaced with something else, dropping it would be a valid way to upgrade the IR." ( http://llvm.org/docs/DeveloperPolicy.html#ir-backwards-compatibility ) ...provides the flexibility we want to make this change without requiring a new IR version. Ie, we're not loosening the FP strictness of existing IR. 
At worst, we will fail to optimize some previously 'fast' code because it's no longer recognized as 'fast'. This should get fixed as we audit/squash all of the uses of 'isFast()'. Note: an inter-dependent clang commit to use the new API name should closely follow commit. Differential Revision: https://reviews.llvm.org/D39304 llvm-svn: 317488	2017-11-06 16:27:15 +00:00
Simon Pilgrim	ad9b9720e8	[X86][SSE] Merge combineExtractVectorElt_SSE into combineExtractVectorElt. NFCI. We still early-out for X86ISD::PEXTRW/X86ISD::PEXTRB so no actual change in behaviour, but it'll make it easier to add support in a future patch. llvm-svn: 317485	2017-11-06 15:28:25 +00:00
Alexey Bataev	676350c16c	[SLP] Test for PR35047, NFC. llvm-svn: 317482	2017-11-06 14:52:57 +00:00
Simon Pilgrim	14450720e6	[X86][SSE] Combine EXTRACT_VECTOR_ELT with combineExtractWithShuffle before XFormVExtractWithShuffleIntoLoad combineExtractWithShuffle can handle more complex shuffles/bitcasts than we can with the equivalent code in XFormVExtractWithShuffleIntoLoad. Mainly a compile time improvement now (combineExtractWithShuffle combines will have always failed late on inside XFormVExtractWithShuffleIntoLoad), and will let us merge combineExtractVectorElt_SSE in a future commit. llvm-svn: 317481	2017-11-06 14:34:19 +00:00
Yaxun Liu	cc56a8b108	[AMDGPU] Change alloca addr space of r600 to 5 for amdgiz environment Differential Revision: https://reviews.llvm.org/D39657 llvm-svn: 317479	2017-11-06 14:32:33 +00:00
Jonas Paulsson	e54cc1a436	[SystemZ] implement hasDivRemOp() SystemZ can do division and remainder in a single instruction for scalar integer types, which are now reflected by returning true in this hook for those cases. Review: Ulrich Weigand llvm-svn: 317477	2017-11-06 13:10:31 +00:00
Yaxun Liu	1ac16619d2	[AMDGPU] Fix assertion due to assuming pointer in default addr space is 32 bit The backend assumes pointer in default addr space is 32 bit, which is not true for the new addr space mapping and causes assertion for unresolved functions. This patch fixes that. Differential Revision: https://reviews.llvm.org/D39643 llvm-svn: 317476	2017-11-06 13:01:33 +00:00
Simon Dardis	169df4e24b	[mips] Add movep for microMIPS32R6 and fix microMIPS32r3 version Previously, the 'movep' instruction was defined for microMIPS32r3 and shared that definition with microMIPS32R6. 'movep' was re-encoded for microMIPS32r6, so this patch provides the correct encoding. Secondly, correct the encoding of the 'rs' and 'rt' operands which have an instruction specific encoding for the registers those operands accept. Finally, correct the decoding of the 'dst_regs' operand, which was meant to extract the relevant field from the instruction but was actually extracting it from the already extracted field. Reviewers: atanasyan Differential Revision: https://reviews.llvm.org/D39495 llvm-svn: 317475	2017-11-06 12:59:53 +00:00
Jonas Devlieghere	aaecdc44ae	[docs] Update code block for compatibility with Sphinx 1.5.1 It is currently not possible to build the documentation with cmake and the same version of Sphinx (1.5.1) used to generate the public facing documentation on llvm.org. When code blocks cannot be parsed by Pygments, it generates a warning which is treated as an error. In addition to being annoying and confusing for developers, this needlessly increases the bar for newcomers that want to get involved. This patch removes the language specifier from the affected block. The result is the same as when parsing fails: the block is not highlighted. llvm-svn: 317472	2017-11-06 11:47:24 +00:00
Mohammed Agabaria	6691758364	[LV][X86] update the cost of interleaving mem. access of floats Recommit: This patch updates the costs of interleaved loads of v8f32 of stride 3 and 8. Fixed the location of the lit test so that it works with make check-all. Differential Revision: https://reviews.llvm.org/D39403 llvm-svn: 317471	2017-11-06 10:56:20 +00:00
Simon Dardis	e57795384c	[mips] Fix PR35140 Mark all symbols involved with TLS relocations as being TLS symbols. This resolves PR35140. Thanks to Alex Crichton for reporting the issue! Reviewers: atanasyan Differential Revision: https://reviews.llvm.org/D39591 llvm-svn: 317470	2017-11-06 10:50:04 +00:00
Raphael Isemann	53d28a9101	Fixed dead links in WritingAnLLVMPass.rst llvm-svn: 317467	2017-11-06 09:51:39 +00:00
Uriel Korach	bb86686a8b	[X86][AVX512] Improve lowering of AVX512 test intrinsics Added TESTM and TESTNM to the list of instructions that already zero the unused upper bits and do not need the redundant shift left and shift right instructions afterwards. Added a pattern for TESTM and TESTNM in iselLowering, so now icmp(neq,and(X,Y), 0) folds into TESTM and icmp(eq,and(X,Y), 0) folds into TESTNM. This commit is a preparation for lowering the test and testn X86 intrinsics to IR. Differential Revision: https://reviews.llvm.org/D38732 llvm-svn: 317465	2017-11-06 09:22:38 +00:00
Uriel Korach	eb47d95d52	[X86] Replace duplicate function call with variable. NFC Change from: if (N->getOperand(0).getValueType() == MVT::v8i32 || N->getOperand(0).getValueType() == MVT::v8f32) to: EVT OpVT = N->getOperand(0).getValueType(); if (OpVT == MVT::v8i32 || OpVT == MVT::v8f32) Change-Id: I5a105f8710b73a828e6cfcd55fac2eae6153ce25 llvm-svn: 317464	2017-11-06 08:32:45 +00:00
Zvi Rackover	3122698040	X86 ISel: Basic support for variable-index vector permutations Summary: Try to lower a BUILD_VECTOR composed of extract-extract chains that can be reasoned to be a permutation of a vector by indices in a non-constant vector. We saw this pattern created by ISPC, which resorts to creating it due to the requirement that shufflevector's mask operand be a constant vector. I didn't check this but we could possibly use this pattern for lowering the X86 permute C-intrinsics instead of llvm.x86 intrinsics. This change can be followed by more improvements: 1. Handle vectors with undef elements. 2. Utilize pshufb and zero-mask-blending to support more efficient construction of vectors with constant-0 elements. 3. Use smaller-element vectors of same width, and "interpolate" the indices, when no native operation is available. Reviewers: RKSimon, craig.topper Reviewed By: RKSimon Subscribers: chandlerc, DavidKreitzer Differential Revision: https://reviews.llvm.org/D39126 llvm-svn: 317463	2017-11-06 08:25:46 +00:00
Jina Nahias	3844f1ad5c	Revert "adding a pattern for broadcastm" This reverts commit r317457. Change-Id: If07f1fca1e3453d16c1dac906e87768661384e91 llvm-svn: 317462	2017-11-06 07:48:58 +00:00
Martin Storsjo	568c18cd0b	[test] Add test files that were missed from SVN r317459 llvm-svn: 317461	2017-11-06 07:36:17 +00:00
Martin Storsjo	bed0c519c3	[ObjectYAML] Map relocation types for COFF ARMNT and ARM64 Differential Revision: https://reviews.llvm.org/D39668 llvm-svn: 317459	2017-11-06 07:20:58 +00:00
Jina Nahias	7b705f1f91	[x86][AVX512] Lowering Broadcastm intrinsics to LLVM IR This patch, together with a matching clang patch (https://reviews.llvm.org/D38683), implements the lowering of X86 broadcastm intrinsics to IR. Differential Revision: https://reviews.llvm.org/D38684 Change-Id: I709ac0b34641095397e994c8ff7e15d1315b3540 llvm-svn: 317458	2017-11-06 07:09:24 +00:00
Jina Nahias	9c6561b648	adding a pattern for broadcastm Change-Id: I6551fb13879e098aed74de410e29815cf37d9ab5 llvm-svn: 317457	2017-11-06 07:09:09 +00:00
Craig Topper	70eaeae7f0	[X86] Use EVEX encoded intrinsics for legacy FMA intrinsics when possible. llvm-svn: 317454	2017-11-06 05:48:26 +00:00
Craig Topper	07dac55d95	[X86] Add scalar FMA ISD nodes without rounding mode. NFC Next step is to use them for the legacy FMA scalar intrinsics as well. This will enable the legacy intrinsics to use EVEX encoded opcodes and the extended registers. llvm-svn: 317453	2017-11-06 05:48:25 +00:00
Craig Topper	25cfa4cb55	[X86] Add avx512vl command line to fma-instrinsics-x86.ll Some of these demonstrate a missed EVEX to VEX compression because we aren't preferring EVEX instructions during isel. llvm-svn: 317452	2017-11-06 05:48:24 +00:00
Craig Topper	7e48aa89c7	[X86] Simplify command lines on the fma-instrinsics-x86.ll test and add -show-mc-encoding. Use feature names instead of CPU names. A future commit will add avx512vl command lines to demonstrate missed use of EVEX instructions. llvm-svn: 317451	2017-11-06 05:48:23 +00:00
Craig Topper	eff606cc0e	[X86] Use EVEX encoded instructions for legacy scalar sqrt intrinsics. Fixes PR35161. llvm-svn: 317445	2017-11-06 04:04:01 +00:00
David L. Jones	82b22e0327	[PassManager, SimplifyCFG] Revert r316908 and r316869. These cause Clang to crash with a segfault. See PR35210 for details. llvm-svn: 317444	2017-11-06 00:32:01 +00:00
Craig Topper	d6471cb934	[X86] Add missing predicate to a pattern. NFC Other patterns had higher priority so this wasn't noticed. But we shouldn't be dependent on pattern order. llvm-svn: 317442	2017-11-05 21:14:06 +00:00
Craig Topper	4e2f53511a	[X86] Remove some more RCP and RSQRT patterns from InstrAVX512.td that I missed in r317413. llvm-svn: 317441	2017-11-05 21:14:05 +00:00
Craig Topper	948c39c480	[X86] Fix outdated comment. NFC llvm-svn: 317440	2017-11-05 21:14:04 +00:00
Simon Pilgrim	879c5b15c4	[X86][SSE] Tests for integer min/max horizontal reductions Matching patterns that vectorizers should have created for us. The experimental intrinsics should probably be added as well. llvm-svn: 317439	2017-11-05 19:48:24 +00:00
Dorit Nuzman	eb13dd3eac	[LV/LAA] Avoid specializing a loop for stride=1 when this predicate implies a single-iteration loop This fixes PR34681. Avoid adding the "Stride == 1" predicate when we know that Stride >= Trip-Count. Such a predicate will effectively optimize a single or zero iteration loop, as Trip-Count <= Stride == 1. Differential Revision: https://reviews.llvm.org/D38785 llvm-svn: 317438	2017-11-05 16:53:15 +00:00
Sanjay Patel	92403c12b5	[SLPVectorizer] minimize tests and auto-generate full checks; NFC llvm-svn: 317437	2017-11-05 16:11:01 +00:00
Mohammed Agabaria	acd69dbc7c	[REVERT][LV][X86] update the cost of interleaving mem. access of floats Reverted my changes; they will be committed later after fixing the failure. This patch contains update of the costs of interleaved loads of v8f32 of stride 3 and 8. Differential Revision: https://reviews.llvm.org/D39403 llvm-svn: 317433	2017-11-05 09:36:54 +00:00
Mohammed Agabaria	f74c767de6	[LV][X86] update the cost of interleaving mem. access of floats This patch contains update of the costs of interleaved loads of v8f32 of stride 3 and 8. Differential Revision: https://reviews.llvm.org/D39403 llvm-svn: 317432	2017-11-05 09:06:23 +00:00
Serguei Katkov	aee6375b02	[CGP] Fix the bug found by asan. Try to fix the asan failure introduced by r317429. llvm-svn: 317431	2017-11-05 07:59:02 +00:00
Serguei Katkov	cde03f3d27	[CGP] Extends the scope of optimizeMemoryInst optimization. NFC Commit tests for previous commit. Reviewers: efriedma, dberlin, mkazantsev, reames, john.brawn Reviewed By: john.brawn Subscribers: javed.absar, john.brawn, dneilson, llvm-commits Differential Revision: https://reviews.llvm.org/D36073 llvm-svn: 317430	2017-11-05 05:51:44 +00:00
Serguei Katkov	d5d8d54b08	[CGP] Extends the scope of optimizeMemoryInst optimization This is an implementation of PR26223. Currently optimizeMemoryInst optimization tries to fold address computation if all possible ways to compute the address are of the form baseGV + base + scale * Index + offset where scale and offset are constants and baseGV, base and Index are exactly the same instructions if defined. The patch extends this optimization to allow different bases. In this case it tries to find/build a Phi node merging all possible bases and use this Phi node as a base for sunk address computation. Also it supports Select instruction on the way. The main motivation for this scope extension is GCRelocateInst. If there is a relocation of derived pointer it will be represented as relocation of base + offset. Also there will be a Phi node merging address computation for relocated derived pointer and derived pointer itself. If we have a Phi node merging original base and relocated base and can fold the address computation of derived pointer then we can potentially reduce the code size and Phi node for derived pointer. The latter can have a positive impact on the register allocator. Reviewers: efriedma, dberlin, mkazantsev, reames, john.brawn Reviewed By: john.brawn Subscribers: javed.absar, john.brawn, dneilson, llvm-commits Differential Revision: https://reviews.llvm.org/D36073 llvm-svn: 317429	2017-11-05 05:50:33 +00:00
Simon Pilgrim	f8105cf357	[X86][AVX] Regenerate test. NFCI. llvm-svn: 317424	2017-11-04 21:18:06 +00:00
Harlan Haskins	2ad533c0f9	Use code voice for DIBuilder in LLVM C API (This is a test commit) llvm-svn: 317422	2017-11-04 20:31:20 +00:00
Aaron Ballman	cbaf5a4f50	Move the llvm-tblgen project into the Tablegenning folder on IDEs like Visual Studio rather than leave it in the root directory. NFC. llvm-svn: 317420	2017-11-04 20:07:16 +00:00
Aaron Ballman	a5ee69a010	Move the srpm, ocaml_make_directory, llvm_vcsrevision_h, and llvm-headers projects into the Misc folder on IDEs like Visual Studio rather than leave them in the root directory. NFC. llvm-svn: 317416	2017-11-04 19:59:14 +00:00
Aaron Ballman	207751ade7	Move the LLVMCFIVerify project into the Libraries folder on IDEs like Visual Studio rather than leave it in the root directory. NFC. llvm-svn: 317415	2017-11-04 19:48:17 +00:00
Aaron Ballman	cecf7145e9	Move these CMake projects into the Tests folder on IDEs like Visual Studio rather than leave it in the root directory. NFC. llvm-svn: 317414	2017-11-04 19:39:14 +00:00
Craig Topper	692c8efe30	[X86] Don't use RCP14 and RSQRT14 for reciprocal estimations or for legacy SSE rcp/rsqrt intrinsics when AVX512 features are enabled. Summary: AVX512 added RCP14 and RSQRT14 instructions which improve accuracy over the legacy RCP and RSQRT instructions, but not enough accuracy to remove the need for a Newton Raphson refinement. Currently we use these new instructions for the legacy packed SSE intrinsics, but not the scalar intrinsics. And we use it for fast math optimization of division and reciprocal sqrt. I think switching the legacy intrinsics may be surprising to the user since it changes the answer based on which processor you're using regardless of any fastmath settings. It's also weird that we did something different between scalar and packed. As far as the reciprocal estimation goes, I think it creates unnecessary deltas in our output behavior (and prevents EVEX->VEX). A little playing around with gcc and icc and godbolt suggests they don't change which instructions they use here. This patch adds new X86ISD nodes for the RCP14/RSQRT14 and uses those for the new intrinsics. Leaving the old intrinsics to use the old instructions. Going forward I think our focus should be on -Supporting 512-bit vectors, which will have to use the RCP14/RSQRT14. -Using RSQRT28/RCP28 to remove the Newton Raphson step on processors with AVX512ER -Supporting double precision. Reviewers: zvi, DavidKreitzer, RKSimon Reviewed By: RKSimon Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D39583 llvm-svn: 317413	2017-11-04 18:26:41 +00:00
Craig Topper	be1f219050	[X86] Regenerate a couple more tests that I missed in r317410. llvm-svn: 317412	2017-11-04 18:26:39 +00:00
Craig Topper	e5d44cefea	[X86] Teach EVEX->VEX pass to turn SHUFI32X4/SHUFF32X4/SHUFI64X2/SHUFF64X2 into VPERM2F128/VPERM2I128. This recovers some of the tests that were changed by r317403. llvm-svn: 317410	2017-11-04 18:10:03 +00:00
Yaxun Liu	0d9673cff2	[AMDGPU] Remove hardcoded address space value from AMDGPULibFunc AMDGPULibFunc hardcodes address space values of the old address space mapping, which causes invalid addrspacecast instructions and undefined functions in APPSDK sample MonteCarloAsianDP. This patch fixes that. Differential Revision: https://reviews.llvm.org/D39616 llvm-svn: 317409	2017-11-04 17:37:43 +00:00
Sean Fertile	4595a915f6	[LTO][ThinLTO] Use the linker resolutions to mark global values as dso_local. Now that we have a way to mark GlobalValues as local we can use the symbol resolutions that the linker plugin provides as part of lto/thinlto link step to refine the compiler's view on what symbols will end up being local. Originally committed as r317374, but reverted in r317395 to update some missed tests. Differential Revision: https://reviews.llvm.org/D35702 llvm-svn: 317408	2017-11-04 17:04:39 +00:00
NAKAMURA Takumi	448a3e5785	llvm/test/lit.cfg.py: Don't set the feature "llvm-64-bits" if -m32 is specified. FIXME: LLVM_BUILD_32_BITS should modify host_triple. llvm-svn: 317404	2017-11-04 06:55:55 +00:00
Craig Topper	a96d62b360	[X86] Teach shuffle lowering to use 256-bit SHUF128 when possible. This allows masked operations to be used and allows the register allocator to use YMM16-31 if necessary. As a follow up I'll look into teaching EVEX->VEX how to turn this back into PERM2X128 if any of the additional features don't work out. llvm-svn: 317403	2017-11-04 06:44:47 +00:00
NAKAMURA Takumi	965429ee52	CMake: Let LLVM_BUILD_32_BITS aware of large file. llvm-svn: 317402	2017-11-04 06:03:29 +00:00
NAKAMURA Takumi	1ac3ae743d	llvm/test/Object/archive-SYM64-write.test: Delete large temp files. They are 8GiB total. llvm-svn: 317401	2017-11-04 06:00:11 +00:00
Sean Fertile	39770ca0a1	Revert "[LTO][ThinLTO] Use the linker resolutions to mark global values ..." Changes more tests than expected on one of the build bots. Reverting to investigate. This reverts https://llvm.org/svn/llvm-project/llvm/trunk@317374 llvm-svn: 317395	2017-11-04 01:54:20 +00:00
Davide Italiano	c7c05ae4be	[CallSiteSplitting] clang-format my last commit. NFCI. Thanks to Rui for pointing out. llvm-svn: 317393	2017-11-04 00:44:01 +00:00
Davide Italiano	91b4790b33	[CallSiteSplitting] Silence GCC's -Wparentheses. NFCI. llvm-svn: 317385	2017-11-03 23:03:38 +00:00
Craig Topper	d21a53f246	[X86] Give unary PERMI priority over SHUF128 in lowerV8I64VectorShuffle to make it possible to fold a load. llvm-svn: 317382	2017-11-03 22:48:13 +00:00
David Blaikie	1be62f0327	Move TargetFrameLowering.h to CodeGen where it's implemented This header already includes a CodeGen header and is implemented in lib/CodeGen, so move the header there to match. This fixes a link error with modular codegeneration builds - where a header and its implementation are circularly dependent and so need to be in the same library, not split between two like this. llvm-svn: 317379	2017-11-03 22:32:11 +00:00
Adrian Prantl	261ac8b23c	Invoke salvageDebugInfo from CodeGenPrepare's SinkCast() This preserves the debug info for the cast operation in the original location. rdar://problem/33460652 Reapplied r317340 with the test moved into an ARM-specific directory. llvm-svn: 317375	2017-11-03 21:55:03 +00:00
Sean Fertile	36528c2a9b	[LTO][ThinLTO] Use the linker resolutions to mark global values as dso_local. Now that we have a way to mark GlobalValues as local we can use the symbol resolutions that the linker plugin provides as part of lto/thinlto link step to refine the compiler's view on what symbols will end up being local. Differential Revision: https://reviews.llvm.org/D35702 llvm-svn: 317374	2017-11-03 21:45:55 +00:00
Kevin Enderby	3fc9188fa8	Fix a crash in llvm-objdump when printing a bad x86_64 relocation in a Mach-O file with a bad section number. rdar://35207539 llvm-svn: 317373	2017-11-03 21:32:44 +00:00
Peter Collingbourne	c2935db629	Revert r317046, "Object: Move some code from ELF.h into ELF.cpp." This change resulted in a measured 1.5-2% perf regression linking chrome. llvm-svn: 317371	2017-11-03 21:30:06 +00:00
Craig Topper	12463779d3	[SimplifyCFG] When merging conditional stores, don't count the store we're merging against the PHINodeFoldingThreshold Merging conditional stores tries to check to see if the code is if-convertible after the store is moved. But the store hasn't been moved yet so it's being counted against the threshold. The patch adds 1 to the threshold comparison to make sure we don't count the store. I've adjusted a test to use a lower threshold to ensure we still do that conversion with the lower threshold. Differential Revision: https://reviews.llvm.org/D39570 llvm-svn: 317368	2017-11-03 21:08:13 +00:00
David Blaikie	34eb96b03f	GCOV: Move GCOV from IR & Support into ProfileData to fix layering This class was split between libIR and libSupport, which breaks under modular code generation. Move it into the one library that uses it, ProfileData, to resolve this issue. llvm-svn: 317366	2017-11-03 20:57:10 +00:00
David Blaikie	998ff81f7c	llvm-objdump: Fix unused-lambda-capture warning by removing unused lambda capture llvm-svn: 317365	2017-11-03 20:57:09 +00:00
Mitch Phillips	c15bdf5598	[cfi-verify] Add blacklist parsing for result filtering. Adds blacklist parsing behaviour for filtering results into four categories: - Expected Protected: Things that are not in the blacklist and are protected. - Unexpected Protected: Things that are in the blacklist and are protected. - Expected Unprotected: Things that are in the blacklist and are unprotected. - Unexpected Unprotected: Things that are not in the blacklist and are unprotected. The tool can now optionally be invoked with a second command line argument, which specifies the blacklist file that the binary was built with. Current statistics for chromium: Reviewers: vlad.tsyrklevich Subscribers: mgorny, llvm-commits, pcc, kcc Differential Revision: https://reviews.llvm.org/D39525 llvm-svn: 317364	2017-11-03 20:54:26 +00:00
Jun Bum Lim	0c99007db1	Recommit r317351 : Add CallSiteSplitting pass This recommits r317351 after fixing a buildbot failure. Original commit message: Summary: This change adds a pass which tries to split a call-site to pass more constrained arguments if its argument is predicated in the control flow so that we can expose better context to the later passes (e.g., inliner, jump threading, or IPA-CP based function cloning, etc.). As of now we support two cases : 1) If a call site is dominated by an OR condition and if any of its arguments are predicated on this OR condition, try to split the condition with more constrained arguments. For example, in the code below, we try to split the call site since we can predicate the argument (ptr) based on the OR condition. Split from : if (!ptr || c) callee(ptr); to : if (!ptr) callee(null ptr) // set the known constant value else if (c) callee(nonnull ptr) // set non-null attribute in the argument 2) We can also split a call-site based on constant incoming values of a PHI For example, from : BB0: %c = icmp eq i32 %i1, %i2 br i1 %c, label %BB2, label %BB1 BB1: br label %BB2 BB2: %p = phi i32 [ 0, %BB0 ], [ 1, %BB1 ] call void @bar(i32 %p) to BB0: %c = icmp eq i32 %i1, %i2 br i1 %c, label %BB2-split0, label %BB1 BB1: br label %BB2-split1 BB2-split0: call void @bar(i32 0) br label %BB2 BB2-split1: call void @bar(i32 1) br label %BB2 BB2: %p = phi i32 [ 0, %BB2-split0 ], [ 1, %BB2-split1 ] llvm-svn: 317362	2017-11-03 20:41:16 +00:00
David Blaikie	526f30b8aa	Modularize: Include some required headers DenseMaps require the definition of a type to be available when using a pointer to that type as a key to know how many bits are available for tombstone/etc. llvm-svn: 317360	2017-11-03 20:24:19 +00:00
Martin Storsjo	1a9593b251	[llvm-ar] Support an options string that starts with a dash Some projects call $AR like "$AR -crs output input1 input2". Differential Revision: https://reviews.llvm.org/D39538 llvm-svn: 317358	2017-11-03 20:09:10 +00:00
Aaron Ballman	639ea374d6	Correcting some CRLFs that snuck in with my previous commit; NFC. llvm-svn: 317357	2017-11-03 20:05:51 +00:00
Aaron Ballman	ecf0e95267	Add llvm::for_each as a range-based extension to <algorithm> and make use of it in some cases where it is a clearer alternative to std::for_each. llvm-svn: 317356	2017-11-03 20:01:25 +00:00
Mitch Phillips	189ebb6976	[cfi-verify] Add an interesting unit test where undef search length changes result. Add an interesting unit test, found by changing --search-length-undef from the default. Program handles it correctly but good for ensuring correctness on further changes :) Reviewers: pcc Subscribers: mgorny, llvm-commits, kcc, vlad.tsyrklevich Differential Revision: https://reviews.llvm.org/D38658 llvm-svn: 317355	2017-11-03 20:00:05 +00:00
Craig Topper	8b6600363a	[X86] Promote athlon, athlon-xp, k8, and k8-sse3 to types instead of subtypes in getHostCPUName. NFCI This removes the athlon type and simplifies the string decoding. We only really need these type/subtype breaks where we need to match libgcc/compiler-rt and these CPUs aren't part of that. I'm looking into moving some of this information to a .def file to share with clang's __builtin_cpu_is handling. And while these CPUs aren't part of that, the fewer lines I have to deal with in the .def file the better. llvm-svn: 317354	2017-11-03 19:37:41 +00:00
Jun Bum Lim	0eb1c2d63a	Revert "Add CallSiteSplitting pass" Revert due to Buildbot failure. This reverts commit r317351. llvm-svn: 317353	2017-11-03 19:17:11 +00:00
Jake Ehrlich	c3a89eefd6	Reland "Add support for writing 64-bit symbol tables for archives when offsets become too large for 32-bit" Tests were failing because some bots were running out of address space and memory. Additionally the test was very slow. These issues were solved by changing the test to take advantage of sparse files and restricting the test to run only on 64-bit systems. This should fix https://bugs.llvm.org//show_bug.cgi?id=34189 This change makes it so that if, when writing a K_GNU style archive, you need to output a > 32-bit offset, it is output in K_GNU64 style instead. Differential Revision: https://reviews.llvm.org/D36812 llvm-svn: 317352	2017-11-03 19:15:06 +00:00
Jun Bum Lim	2a58933519	Add CallSiteSplitting pass Summary: This change adds a pass which tries to split a call-site to pass more constrained arguments if its argument is predicated in the control flow so that we can expose better context to the later passes (e.g., inliner, jump threading, or IPA-CP based function cloning, etc.). As of now we support two cases : 1) If a call site is dominated by an OR condition and if any of its arguments are predicated on this OR condition, try to split the condition with more constrained arguments. For example, in the code below, we try to split the call site since we can predicate the argument (ptr) based on the OR condition. Split from : if (!ptr || c) callee(ptr); to : if (!ptr) callee(null ptr) // set the known constant value else if (c) callee(nonnull ptr) // set non-null attribute in the argument 2) We can also split a call-site based on constant incoming values of a PHI For example, from : BB0: %c = icmp eq i32 %i1, %i2 br i1 %c, label %BB2, label %BB1 BB1: br label %BB2 BB2: %p = phi i32 [ 0, %BB0 ], [ 1, %BB1 ] call void @bar(i32 %p) to BB0: %c = icmp eq i32 %i1, %i2 br i1 %c, label %BB2-split0, label %BB1 BB1: br label %BB2-split1 BB2-split0: call void @bar(i32 0) br label %BB2 BB2-split1: call void @bar(i32 1) br label %BB2 BB2: %p = phi i32 [ 0, %BB2-split0 ], [ 1, %BB2-split1 ] Reviewers: davidxl, huntergr, chandlerc, mcrosier, eraman, davide Reviewed By: davidxl Subscribers: sdesmalen, ashutosh.nema, fhahn, mssimpso, aemerson, mgorny, mehdi_amini, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D39137 llvm-svn: 317351	2017-11-03 19:01:57 +00:00
Jake Ehrlich	5de70d996c	[llvm-objcopy] Add support for dwarf fission This change adds support for dwarf fission. Differential Revision: https://reviews.llvm.org/D39207 llvm-svn: 317350	2017-11-03 18:58:41 +00:00
Evandro Menezes	9dcf099944	[AArch64] Fix the number of iterations for the Newton series The number of iterations was incorrectly determined for DP FP vector types and the tests were insufficient to flag this issue. Differential revision: https://reviews.llvm.org/D39507 llvm-svn: 317349	2017-11-03 18:56:36 +00:00
Evgeny Stupachenko	d699de2b50	The patch fixes PR35131 Summary: Fix a misprint which led to false CTLZ recognition. Reviewers: craig.topper Differential Revision: https://reviews.llvm.org/D39585 From: Evgeny Stupachenko <evstupac@gmail.com> llvm-svn: 317348	2017-11-03 18:50:03 +00:00
Adrian Prantl	8fe9fb0ae5	Revert "Invoke salvageDebugInfo from CodeGenPrepare's SinkCast()" This reverts commit 317342 while investigating bot breakage. llvm-svn: 317345	2017-11-03 18:26:36 +00:00
Craig Topper	666e23b513	[CodeGen] Remove unnecessary semicolons to fix a warning. NFC llvm-svn: 317342	2017-11-03 18:02:46 +00:00
Craig Topper	741e7e6a71	[X86] Initialize Type and Subtype in getHostCPUName to 0. llvm-svn: 317341	2017-11-03 18:02:44 +00:00


156567 Commits