llvm-project

Commit Graph

Author	SHA1	Message	Date
Craig Topper	8eb1f315ac	[AVX-512] Add scalar masked max/min intrinsic instructions to the load folding tables. llvm-svn: 294153	2017-02-05 22:25:46 +00:00
Craig Topper	cb4bc8be5b	[AVX-512] Add scalar masked add/sub/mul/div intrinsic instructions to the load folding tables. llvm-svn: 294152	2017-02-05 22:25:42 +00:00
Craig Topper	59af67206d	[AVX-512] Add masked scalar FMA intrinsics to isNonFoldablePartialRegisterLoad to improve load folding of scalar loads. llvm-svn: 294151	2017-02-05 22:25:40 +00:00
Dylan McKay	ccd819ad94	[AVR] Implement stacksave/stackrestore by expanding (PR31342) Summary: Authored by Florian Zeitz. This implements the missing stacksave/stackrestore intrinsics via expansion. Output of `llc -O0 -march=avr ~/devel/llvm/test/CodeGen/Generic/stacksave-restore.ll` for sanity checking (comments mine): ``` .text .file ".../llvm/test/CodeGen/Generic/stacksave-restore.ll" .globl test .p2align 1 .type test,@function test: ; @test ; BB#0: push r28 push r29 in r28, 61 in r29, 62 sbiw r28, 4 in r0, 63 cli out 62, r29 out 63, r0 out 61, r28 in r18, 61 in r19, 62 mov r20, r22 mov r21, r23 in r30, 61 in r31, 62 lsl r22 rol r23 lsl r22 rol r23 in r26, 61 in r27, 62 sub r26, r22 sbc r27, r23 andi r26, 252 in r0, 63 cli out 62, r27 out 63, r0 out 61, r26 in r0, 63 cli out 62, r31 out 63, r0 out 61, r30 in r30, 61 in r31, 62 sub r30, r22 sbc r31, r23 andi r30, 252 in r0, 63 cli out 62, r31 out 63, r0 out 61, r30 std Y+3, r24 ; 2-byte Folded Spill std Y+4, r25 ; 2-byte Folded Spill mov r24, r26 mov r25, r27 in r0, 63 cli out 62, r19 out 63, r0 out 61, r18 std Y+1, r20 ; 2-byte Folded Spill std Y+2, r21 ; 2-byte Folded Spill adiw r28, 4 in r0, 63 cli out 62, r29 out 63, r0 out 61, r28 pop r29 pop r28 ret .Lfunc_end0: .size test, .Lfunc_end0-test ``` Reviewers: dylanmckay Reviewed By: dylanmckay Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D29553 llvm-svn: 294146	2017-02-05 21:35:45 +00:00
Kamil Rytarowski	5d2bd8dd54	Revamp llvm::once_flag to be closer to std::once_flag Summary: Make this interface reusable similarly to std::call_once and std::once_flag interface. This makes porting LLDB to NetBSD easier as there was in the original approach a portable way to specify a non-static once_flag. With this change translating std::once_flag to llvm::once_flag is mechanical. Sponsored by <The NetBSD Foundation> Reviewers: mehdi_amini, labath, joerg Reviewed By: mehdi_amini Subscribers: emaste, clayborg Differential Revision: https://reviews.llvm.org/D29566 llvm-svn: 294143	2017-02-05 21:13:06 +00:00
Craig Topper	cac328f25e	[X86] Fix printing of sha256rnds2 to include the implicit %xmm0 argument. llvm-svn: 294132	2017-02-05 18:33:31 +00:00
Craig Topper	d7ae9ab1fa	[X86] Fix printing of blendvpd/blendvps/pblendvb to include the implicit %xmm0 argument. This makes codegen output more obvious about the %xmm0 usage. llvm-svn: 294131	2017-02-05 18:33:24 +00:00
Craig Topper	6a35a81fc5	[X86] In LowerTRUNCATE, create an ISD::VECTOR_SHUFFLE instead of explicitly creating a PSHUFB. This will be lowered by regular shuffle lowering to a PSHUFB later. The same was already done for several other shuffles in this function. The test changes are because the old code used explicit zeroing for elements that could have been undef. While I was here I also changed other shuffle vectors in the same function to use the same input twice instead of creating UNDEF nodes. getVectorShuffle can create the UNDEF for us. llvm-svn: 294130	2017-02-05 18:33:14 +00:00
Geoff Berry	76ca8c2b34	[SelectionDAG] In InstrEmitter, handle EXTRACT_SUBREG of a physical register. Summary: Without this change, the getVR() call would hit an assert since it was being passed a physical register. Update the AArch64/ldst-opt.ll test with a case that triggers this behavior by adding a run with strict-align, which causes an unaligned STR XZR instruction to be split into byte stores, creating an EXTRACT_SUBREG of XZR that triggers the original problem. Reviewers: bogner, qcolombet, MatzeB, atrick Subscribers: aemerson, mcrosier, llvm-commits Differential Revision: https://reviews.llvm.org/D29495 llvm-svn: 294129	2017-02-05 18:28:14 +00:00
Amaury Sechet	143902c29f	[DAGCombiner] Leverage add's commutativity Summary: This avoids the need to duplicate every pattern and actually ends up exposing some opportunities to optimize existing patterns that did not exist in both directions on an existing test case. Reviewers: mkuper, spatel, bkramer, RKSimon, zvi Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D29541 llvm-svn: 294125	2017-02-05 14:22:20 +00:00
Daniel Sanders	c3ac566754	[globalisel][arm] Tablegen-erate current Register Bank Information. Summary: This patch tablegen-erates the ARM register bank information so that the static tables added in D27807 no longer need to be maintained. Depends on D27338 Reviewers: t.p.northover, rovka, ab, qcolombet, aditya_nandakumar Reviewed By: rovka Subscribers: aemerson, rengolin, mgorny, dberris, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D28567 llvm-svn: 294124	2017-02-05 12:07:55 +00:00
Dylan McKay	b78f36657e	[AVR] Fix a bug where asm operands are printed twice We would unconditionally call printOperand, even if PrintAsmOperand already printed the immediate. llvm-svn: 294121	2017-02-05 10:42:49 +00:00
Dylan McKay	7a3eb290ef	[AVR] Support zero-sized arguments in defined methods It is sufficient to skip emission of these arguments as we have nothing to actually pass through the function call. The AVR-GCC reference has nothing to say about zero-sized arguments, presumably because C/C++ doesn't support them. This means we don't have to worry about ABI differences. llvm-svn: 294119	2017-02-05 09:53:45 +00:00
Dehao Chen	448b5790f6	Refactor SampleProfile.cpp to make it cleaner. (NFC) llvm-svn: 294118	2017-02-05 07:32:17 +00:00
Craig Topper	978fdb75a4	[X86] Add support for folding (insert_subvector vec1, (extract_subvector vec2, idx1), idx1) -> (blendi vec2, vec1). llvm-svn: 294112	2017-02-04 23:26:46 +00:00
Craig Topper	3d95228dbe	[X86] Simplify the code that turns INSERT_SUBVECTOR into BLENDI. NFCI llvm-svn: 294111	2017-02-04 23:26:42 +00:00
Craig Topper	42b83f8d6e	[DAGCombiner] Canonicalize the order of a chain of INSERT_SUBVECTORs. Based on similar code for INSERT_VECTOR_ELT. llvm-svn: 294110	2017-02-04 23:26:39 +00:00
Craig Topper	04dce84ead	[DAGCombiner] Use DAG.getAnyExtOrTrunc to simplify some code. NFC llvm-svn: 294109	2017-02-04 23:26:37 +00:00
Craig Topper	ceaf9c1633	[DAGCombiner] In visitINSERT_VECTOR_ELT, move check for BUILD_VECTOR being legal below code that just canonicalizes INSERT_VECTOR_ELT without creating BUILD_VECTORS. llvm-svn: 294108	2017-02-04 23:26:34 +00:00
Davide Italiano	ec49313b11	[IPCP] Don't propagate return value for naked functions. This is pretty much the same change made in SCCP. llvm-svn: 294098	2017-02-04 19:44:14 +00:00
Amaury Sechet	6e2d8e49ec	Formatting in DAGCombiner. NFC llvm-svn: 294091	2017-02-04 13:01:53 +00:00
Xinliang David Li	c7db0d0e75	Fix variable name /NFC llvm-svn: 294090	2017-02-04 07:40:43 +00:00
Matthias Braun	82e7f4d877	MachineCopyPropagation: Respect implicit operands of COPY The code failed to check implicit operands of COPY instructions for defs/uses. Differential Revision: https://reviews.llvm.org/D29522 llvm-svn: 294088	2017-02-04 02:27:20 +00:00
Matthias Braun	776a1d7ecb	MachineCopyPropagation: Do not consider undef operands as clobbers This was originally introduced in r278321 to work around correctness problems in the ExecutionDepsFix pass; Probably also to keep the performance benefits of breaking the false dependencies which of course also affect undef operands. ExecutionDepsFix has been improved here recently (see for example r278321) so we should not need this exception any longer. Differential Revision: https://reviews.llvm.org/D29525 llvm-svn: 294087	2017-02-04 02:27:13 +00:00
Kyle Butt	c7d67eef5a	[CodeGen]: BlockPlacement: Skip extraneous logging. Move a check for blocks that are not candidates for tail duplication up before the logging. Reduces logging noise. No non-logging changes intended. llvm-svn: 294086	2017-02-04 02:26:34 +00:00
Kyle Butt	e9425c4ff8	[CodeGen]: BlockPlacement: Apply const liberally. NFC Anything that needs to be passed to AnalyzeBranch unfortunately can't be const, or more would be const. Added const_iterator to BlockChain to allow BlockChain to be const when we don't expect to change it. llvm-svn: 294085	2017-02-04 02:26:32 +00:00
Eugene Zelenko	502d0bc28e	[CodeGen] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC). This is preparation to reduce TargetInstrInfo.h dependencies. llvm-svn: 294084	2017-02-04 02:00:53 +00:00
Craig Topper	aa741abfc5	[TwoAddressInstruction] Fix typo in comment. NFC llvm-svn: 294083	2017-02-04 01:58:10 +00:00
Eric Christopher	b128abcf7a	Remove a bunch of unnecessary casts to a target specific version of TII and TRI as we're working from a target specific STI. llvm-svn: 294081	2017-02-04 01:52:17 +00:00
Bob Haarman	e2fe59fddd	fix nullptr Mangler in LTOModule Reviewers: kcc, pcc Subscribers: mehdi_amini Differential Revision: https://reviews.llvm.org/D29523 llvm-svn: 294079	2017-02-04 01:28:44 +00:00
Eugene Zelenko	3f37f07c7f	[Sparc] Fix some Include What You Use warnings; other minor fixes (NFC). This is preparation to reduce MCExpr.h dependencies. llvm-svn: 294072	2017-02-04 00:36:49 +00:00
Brendon Cahoon	9809b10586	[RegisterCoalescer] Do not call getInstructionIndex with DBG_VALUE An assert occurs when calling SlotIndexes::getInstructionIndex with a DBG_VALUE instruction because the function expects an instruction with a slot index. However, there is no slot index for a DBG_VALUE instruction. Differential Revision: https://reviews.llvm.org/D29048 llvm-svn: 294070	2017-02-04 00:10:22 +00:00
Eugene Zelenko	cd8ea02b4a	[Mips] Fix some Include What You Use warnings; other minor fixes (NFC). This is preparation to reduce MCExpr.h dependencies. llvm-svn: 294069	2017-02-03 23:39:33 +00:00
Eugene Zelenko	06869c04f3	[SystemZ] Fix some Include What You Use warnings; other minor fixes (NFC). This is preparation to reduce MCExpr.h dependencies. llvm-svn: 294068	2017-02-03 23:39:06 +00:00
Eugene Zelenko	e894b4dc59	[AMDGPU] Fix some Include What You Use warnings; other minor fixes (NFC). This is preparation to reduce MCExpr.h dependencies. llvm-svn: 294067	2017-02-03 23:38:40 +00:00
Sanjay Patel	0fe32ac256	[InstCombine] treat i1 as a special type in shouldChangeType() This patch is based on the llvm-dev discussion here: http://lists.llvm.org/pipermail/llvm-dev/2017-January/109631.html Folding to i1 should always be desirable because that's better for value tracking and we have special folds for i1 types. I checked for other users of shouldChangeType() where this might have an effect, but we already handle the i1 case differently than other types in all of those cases. Side note: the default datalayout includes i1, so it seems we only find this gap in shouldChangeType + phi folding for the case when there is (1) an explicit datalayout without i1, (2) casting to i1 from a legal type, and (3) a phi with exactly 2 incoming casted operands (as Björn mentioned). Differential Revision: https://reviews.llvm.org/D29336 llvm-svn: 294066	2017-02-03 23:13:11 +00:00
Amaury Sechet	fb1756b35b	[APInt] Add integer API for bitwise operations. Summary: As per title. I ran into that limitation of the API doing some other work, so I thought that'd be a nice addition. Reviewers: jroelofs, compnerd, majnemer Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D29503 llvm-svn: 294063	2017-02-03 22:54:41 +00:00
Kostya Serebryany	9f8e47b28c	[libFuzzer] properly hide the memcmp interceptor from msan llvm-svn: 294061	2017-02-03 22:51:38 +00:00
Xinliang David Li	6144a59b7f	[PGO] Add select instr profile in graph dump Differential Revision: http://reviews.llvm.org/D29474 llvm-svn: 294055	2017-02-03 21:57:51 +00:00
Eugene Zelenko	939f6b0167	[AArch64] Fix some Include What You Use warnings; other minor fixes (NFC). This is preparation to reduce MCExpr.h dependencies. llvm-svn: 294053	2017-02-03 21:49:13 +00:00
Eugene Zelenko	07dc38f67a	[ARM] Fix some Include What You Use warnings; other minor fixes (NFC). This is preparation to reduce MCExpr.h dependencies. llvm-svn: 294052	2017-02-03 21:48:12 +00:00
Eugene Zelenko	c3164b9c2f	[XCore] Fix some Include What You Use warnings; other minor fixes (NFC). This is preparation to reduce MCExpr.h dependencies. llvm-svn: 294051	2017-02-03 21:46:55 +00:00
Sanjay Patel	73fc8ddb06	[InstCombine] fix operand-complexity-based canonicalization (PR28296) The code comments didn't match the code logic, and we didn't actually distinguish the fake unary (not/neg/fneg) operators from arguments. Adding another level to the weighting scheme provides more structure and can help simplify the pattern matching in InstCombine and other places. I fixed regressions that would have shown up from this change in: rL290067 rL290127 But that doesn't mean there are no pattern-matching logic holes left; some combines may just be missing regression tests. Should fix: https://llvm.org/bugs/show_bug.cgi?id=28296 Differential Revision: https://reviews.llvm.org/D27933 llvm-svn: 294049	2017-02-03 21:43:34 +00:00
Zachary Turner	5ce0f4a9de	Properly parse the TypeServer2 record. llvm-svn: 294046	2017-02-03 21:22:27 +00:00
Matt Arsenault	f15da6c419	AMDGPU: AsmParser cleanups Use typedef, remove unnecessary enum, line wraps. llvm-svn: 294039	2017-02-03 20:49:51 +00:00
Mike Aizatsky	1b65812267	[libfuzzer] chromium-related compilation fixes Reviewers: kcc Differential Revision: https://reviews.llvm.org/D29502 llvm-svn: 294035	2017-02-03 20:26:44 +00:00
Stanislav Mekhanoshin	81db53109d	[AMDGPU] Bump -amdgpu-unroll-threshold-private to 2000 This has quite a positive performance impact according to measurements. Before the previous fixes that limit the optimization, this value was too high and blew up compile time and scratch usage, but now this is gone and we can bump the threshold. Differential Revision: https://reviews.llvm.org/D29505 llvm-svn: 294032	2017-02-03 20:08:29 +00:00
Matt Arsenault	1fa5eacf9d	AMDGPU: Set MCAsmInfo::PointerSize llvm-svn: 294031	2017-02-03 20:02:23 +00:00
Matt Arsenault	d9cd736585	AMDGPU: Don't unroll for private with dynamic allocas This won't be eliminated, so this will just bloat code if/when these are ever used/supported. llvm-svn: 294030	2017-02-03 19:36:00 +00:00
Michael Kuperstein	2a735b71b6	[SLP] Make sortMemAccesses explicitly return an error. NFC. llvm-svn: 294029	2017-02-03 19:32:50 +00:00
Ahmed Bougacha	9677cc6fb7	[TLI] Robustize SDAG LibFunc proto checking by merging it into TLI. This re-applies commit r292189, reverted in r292191. SelectionDAGBuilder recognizes libfuncs using some homegrown parameter type-checking. Use TLI instead, removing another heap of redundant code. This isn't strictly NFC, as the SDAG code was too lax. Concretely, this means changes are required to a few tests: - calling a non-variadic function via a variadic prototype isn't OK; it just happens to work on x86_64 (but not on, e.g., aarch64). - mempcpy has a size_t parameter; the SDAG code accepts any integer type, which meant using i32 on x86_64 worked. - a handful of SystemZ tests check the SDAG support for lax prototype checking: Ulrich agrees on removing them. I don't think it's worth supporting any of these (IMO) invalid testcases. Instead, fix them to be more meaningful. llvm-svn: 294028	2017-02-03 19:11:19 +00:00
Michael Kuperstein	723999d4aa	[SLP] Use SCEV to sort memory accesses. This generalizes memory access sorting to use differences between SCEVs, instead of relying on constant offsets. That allows us to properly do SLP vectorization of non-sequentially ordered loads within loops bodies. Differential Revision: https://reviews.llvm.org/D29425 llvm-svn: 294027	2017-02-03 19:09:45 +00:00
Tim Northover	c3e3f59d12	GlobalISel: translate dynamic alloca instructions. llvm-svn: 294022	2017-02-03 18:22:45 +00:00
Simon Pilgrim	034c1bd32c	[X86][SSE] Add support for combining scalar_to_vector(extract_vector_elt) into a target shuffle. Correctly flagging upper elements as undef. llvm-svn: 294020	2017-02-03 17:59:58 +00:00
Anna Thomas	b555cc8cb6	NFC: [LoopUnroll] More meaningful message in tracing llvm-svn: 294017	2017-02-03 17:12:43 +00:00
Peter Collingbourne	e6fd9ff96a	IRMover: Merge flags LinkModuleInlineAsm and IsPerformingImport. Currently these flags are always the inverse of each other, so there is no need to keep them separate. Differential Revision: https://reviews.llvm.org/D29471 llvm-svn: 294016	2017-02-03 17:01:14 +00:00
Peter Collingbourne	7c70211653	ModuleLinker: Remove importing support. NFCI. Differential Revision: https://reviews.llvm.org/D29470 llvm-svn: 294015	2017-02-03 16:58:19 +00:00
Peter Collingbourne	6d8f817f8b	FunctionImport: Use IRMover directly. The importer was previously using ModuleLinker in a sort of "IRMover mode". Use IRMover directly instead in order to remove a level of indirection. I will remove all importing support from ModuleLinker in a separate change. Differential Revision: https://reviews.llvm.org/D29468 llvm-svn: 294014	2017-02-03 16:56:27 +00:00
Simon Dardis	68e9d94055	[mips] Remove absolute size assertion for end directive The .end <symbol> directive for MIPS marks the end of a symbol and sets the symbol's size. Previously, the corresponding emitDirective handler asserted that a function's size could be evaluated to an absolute value at that point in time. This cannot be done when directives like .align have been encountered; instead, set the function's size to the corresponding symbolic expression and let ELFObjectWriter resolve the expression to an absolute value. This avoids a redundant call to evaluateAsAbsolute. llvm-svn: 294012	2017-02-03 15:48:53 +00:00
Justin Lebar	e90c468444	[NVPTX] Enable combineRepeatedFPDivisors for NVPTX. Reviewers: tra Subscribers: jholewinski, llvm-commits Differential Revision: https://reviews.llvm.org/D29477 llvm-svn: 294011	2017-02-03 15:13:50 +00:00
Artem Tamazov	43b61561b0	[AMDGPU][mc] Fix AddressSanitizer leftover issue in gfx7_asm_all test Issue occurs when assembling "ds_ordered_count v0, v0 gds". llvm-svn: 294004	2017-02-03 12:47:30 +00:00
Alexey Bataev	a0d9f2582b	[SelectionDAG] Fix for PR30775: Assertion `NodeToMatch->getOpcode() != ISD::DELETED_NODE && "NodeToMatch was removed partway through selection"' failed. NodeToMatch can be modified during matching, but code does not handle this situation. Differential Revision: https://reviews.llvm.org/D29292 llvm-svn: 294003	2017-02-03 12:28:40 +00:00
Sanne Wouda	a994185757	[ARM] Change TCReturn to tBL if tailcall optimization fails. Summary: The tail call optimisation is performed before register allocation, so at that point we don't know if LR is being spilt or not. If LR was spilt to the stack, then we cannot do a tail call optimisation. That would involve popping back into LR which is not possible in Thumb1 code. Reviewers: rengolin, jmolloy, rovka, olista01 Reviewed By: olista01 Subscribers: llvm-commits, aemerson Differential Revision: https://reviews.llvm.org/D29020 llvm-svn: 294000	2017-02-03 11:15:53 +00:00
Alexey Bataev	a16cfe6fa9	[SLP] Fix for PR31690: Allow using of extra values in horizontal reductions. Currently LLVM supports vectorization of horizontal reduction instructions with initial value set to 0. Patch supports vectorization of reduction with non-zero initial values. Also it supports a vectorization of instructions with some extra arguments, like: float f(float x[], int a, int b) { float p = a % b; p += x[0] + 3; for (int i = 1; i < 32; i++) p += x[i]; return p; } Patch allows vectorization of this kind of horizontal reductions. Differential Revision: https://reviews.llvm.org/D28961 llvm-svn: 293994	2017-02-03 08:08:50 +00:00
Mehdi Amini	1380edf4ef	Revert "[ThinLTO] Add an auto-hide feature" This reverts commit r293970. After more discussion, this belongs to the linker side and there is no added value to do it at this level. llvm-svn: 293993	2017-02-03 07:41:43 +00:00
Stanislav Mekhanoshin	f29602df65	[AMDGPU] Unroll preferences improvements Exit loop analysis early if suitable private access found. Do not account for GEPs which are invariant to loop induction variable. Do not account for Allocas which are too big to fit into register file anyway. Add option for tuning: -amdgpu-unroll-threshold-private. Differential Revision: https://reviews.llvm.org/D29473 llvm-svn: 293991	2017-02-03 02:20:05 +00:00
Marcos Pividori	db5a565514	[sanitizer coverage] Fix Instrumentation to work on Windows. On Windows, the symbols "___stop___sancov_guards" and "___start___sancov_guards" are not defined automatically. So, we need to take a different approach. We define 3 sections: Section ".SCOV$A" will only hold a variable ___start___sancov_guard. Section ".SCOV$M" will hold the main data. Section ".SCOV$Z" will only hold a variable ___stop___sancov_guards. When linking, they will be merged sorted by the characters after the $, so we can use the pointers of the variables ___[start\|stop]___sancov_guard to know the actual range of addresses of that section. In this diff, I updated instrumentation to include all the guard arrays in section ".SCOV$M". Differential Revision: https://reviews.llvm.org/D28434 llvm-svn: 293987	2017-02-03 01:08:06 +00:00
Matt Arsenault	e1b595306d	AMDGPU: Fold fneg into fmin/fmax_legacy llvm-svn: 293972	2017-02-03 00:51:50 +00:00
David Blaikie	a0e3c75187	DebugInfo: ensure type and namespace names are included in pubnames/pubtypes even when they are only present in type units While looking to add support for placing singular types (types that will only be emitted in one place (such as attached to a strong vtable or explicit template instantiation definition)) not in type units (since type units have overhead) I stumbled across that change causing an increase in pubtypes. Turns out we were missing some types from type units if they were only referenced from other type units and not from the debug_info section. This fixes that, following GCC's line of describing the offset of such entities as the CU die (since there's no compile unit-relative offset that would describe such an entity - they aren't in the CU). Also like GCC, this change prefers to describe the type stub within the CU rather than the "just use the CU offset" fallback where possible. This may give the DWARF consumer some opportunity to find the extra info in the type stub - though I'm not sure GDB does anything with this currently. The size of the pubnames/pubtypes sections now match exactly with or without type units enabled. This nearly triples (+189%) the pubtypes section for a clang self-host and grows pubnames by 0.07% (without compression). For a total of 8% increase in debug info sections of the objects of a Split DWARF build when using type units. llvm-svn: 293971	2017-02-03 00:44:18 +00:00
Mehdi Amini	b0a8ff71e5	[ThinLTO] Add an auto-hide feature When a symbol is not exported outside of the DSO, it can be hidden. Usually we try to internalize as much as possible, but it is not always possible, for instance a symbol can be referenced outside of the LTO unit, or there can be cross-module reference in ThinLTO. This is a recommit of r293912 after fixing build failures, and a recommit of r293918 after fixing LLD tests. Differential Revision: https://reviews.llvm.org/D28978 llvm-svn: 293970	2017-02-03 00:32:38 +00:00
Craig Topper	bbb2b95ce5	[X86] Mark 256-bit and 512-bit INSERT_SUBVECTOR operations as legal and remove the custom lowering. llvm-svn: 293969	2017-02-03 00:24:49 +00:00
Matt Arsenault	2511c031de	AMDGPU: Fold fneg into fminnum/fmaxnum llvm-svn: 293968	2017-02-03 00:23:15 +00:00
Matt Arsenault	a8fcfadf46	AMDGPU: Check if users of fneg can fold mods In multi-use cases this can save a few instructions. llvm-svn: 293962	2017-02-02 23:21:23 +00:00
Mehdi Amini	21c89dc920	Revert "[ThinLTO] Add an auto-hide feature" This reverts commit r293918, one lld test does not pass. llvm-svn: 293961	2017-02-02 23:20:36 +00:00
Bob Haarman	dd4ebc1d3b	[lto] add getLinkerOpts() Summary: Some compilers, including MSVC and Clang, allow linker options to be specified in source files. In the legacy LTO API, there is a getLinkerOpts() method that returns linker options for the bitcode module being processed. This change adds that method to the new API, so that the COFF linker can get the right linker options when using the new LTO API. Reviewers: pcc, ruiu, mehdi_amini, tejohnson Reviewed By: pcc Differential Revision: https://reviews.llvm.org/D29207 llvm-svn: 293950	2017-02-02 23:00:49 +00:00
Eugene Zelenko	fbd13c5c12	[X86] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC). llvm-svn: 293949	2017-02-02 22:55:55 +00:00
Reid Kleckner	3c467e225e	[X86] Avoid sorted order check in release builds Effectively reverts r290248 and fixes the unused function warning with ifndef NDEBUG. llvm-svn: 293945	2017-02-02 22:06:30 +00:00
Craig Topper	c45657375b	[X86] Move turning 256-bit INSERT_SUBVECTORS into BLENDI from legalize to DAG combine. On one test this seems to have given more chance for DAG combine to do other INSERT_SUBVECTOR/EXTRACT_SUBVECTOR combines before the BLENDI was created. Looks like we can still improve more by teaching DAG combine to optimize INSERT_SUBVECTOR/EXTRACT_SUBVECTOR with BLENDI. llvm-svn: 293944	2017-02-02 22:02:57 +00:00
Reid Kleckner	c35139ec0d	[CodeGen] Remove dead call-or-prologue enum from CCState This enum has been dead since Olivier Stannard re-implemented ARM byval handling in r202985 (2014). llvm-svn: 293943	2017-02-02 21:58:22 +00:00
Xinliang David Li	58fcc9bdce	[PGO] internal option cleanups 1. Added comments for options 2. Added missing option cl::desc field 3. Unified function filter option for graph viewing. Now PGO count/raw-counts share the same filter option: -view-bfi-func-name=. llvm-svn: 293938	2017-02-02 21:29:17 +00:00
Rafael Espindola	13a79bbfe5	Change how we handle section symbols on ELF. On ELF every section can have a corresponding section symbol. When in an assembly file we have .quad .text the '.text' refers to that symbol. The way we used to handle them is to leave .text an undefined symbol until the very end when the object writer would map them to the actual section symbol. The problem with that is that anything before the end would see an undefined symbol. This could result in bad diagnostics (test/MC/AArch64/label-arithmetic-diags-elf.s), or incorrect results when using the asm streamer (test/MC/Mips/expansion-jal-sym-pic.s). Fixing this will also allow using the section symbol earlier for setting sh_link of SHF_METADATA sections. This patch includes a few hacks to avoid changing our behaviour when handling conflicts between section symbols and other symbols. I reported pr31850 to track that. llvm-svn: 293936	2017-02-02 21:26:06 +00:00
Javed Absar	bb8dcc6aec	[ARM] Classification Improvements to ARM Sched-Model. NFCI. This is the second in the series of patches to make adding machine sched-models for ARM processors easier and more compact. This patch focuses on integer instructions and adds missing sched definitions. Reviewers: rovka, rengolin Differential Revision: https://reviews.llvm.org/D29127 llvm-svn: 293935	2017-02-02 21:08:12 +00:00
Quentin Colombet	5725f56bb0	[LiveRangeEdit] Don't mess up with LiveInterval when a new vreg is created. In r283838, we added the capability of splitting unspillable registers. When doing so we had to make sure the split live-ranges were also unspillable and we did that by marking the related live-ranges in the delegate method that is called when a new vreg is created. However, by accessing the live-range there, we also triggered their lazy computation (LiveIntervalAnalysis::getInterval) which is not what we want in general. Indeed, later code in LiveRangeEdit is going to build the live-ranges, and this lazy computation may mess up that computation, resulting in assertion failures. Namely, the createEmptyIntervalFrom method expects that the live-range is going to be empty, not computed. Thanks to Mikael Holmén <mikael.holmen@ericsson.com> for noticing and reporting the problem. llvm-svn: 293934	2017-02-02 20:44:36 +00:00
Krzysztof Parzyszek	d0d42f0ec8	[Hexagon] Adding opExtentBits and opExtentAlign to GPrel instructions Patch by Colin LeMahieu. llvm-svn: 293933	2017-02-02 20:35:12 +00:00
Michael Kuperstein	e6d59fdca5	[X86] Add costs for non-AVX512 single-source permutation integer shuffles Differential Revision: https://reviews.llvm.org/D29416 llvm-svn: 293932	2017-02-02 20:27:13 +00:00
Krzysztof Parzyszek	e17b0bfb24	[Hexagon] Fix relocation kind for extended predicated calls Patch by Sid Manning. llvm-svn: 293931	2017-02-02 20:21:56 +00:00
Krzysztof Parzyszek	357b048666	[Hexagon] Remove A4_ext_* pseudo instructions Patch by Colin LeMahieu. llvm-svn: 293929	2017-02-02 19:58:22 +00:00
Kostya Serebryany	68382d0900	[libFuzzer] reorganize the tracing code to make it easier to experiment with inlined coverage instrumentation. NFC llvm-svn: 293928	2017-02-02 19:56:01 +00:00
Krzysztof Parzyszek	d67ab623f6	[Hexagon] Fix insertBranch for loops with multiple ENDLOOP instructions llvm-svn: 293925	2017-02-02 19:36:37 +00:00
Dan Gohman	b89f2d3d92	[WebAssembly] Add instruction definitions for drop and get/set_global. llvm-svn: 293922	2017-02-02 19:29:44 +00:00
Xinliang David Li	1eb4ec6a2e	[PGO] make graph view internal options available for all builds Differential Revision: https://reviews.llvm.org/D29259 llvm-svn: 293921	2017-02-02 19:18:56 +00:00
Marcos Pividori	d64360d935	[libFuzzer] Properly handle exceptions with UnhandledExceptionFilter. Use SetUnhandledExceptionFilter instead of AddVectoredExceptionHandler. According to the documentation on Structured Exception Handling, this is the order for the Exception Dispatching: + If the process is being debugged, the system notifies the debugger. + The Vectored Exception Handler is called. + The system attempts to locate a frame-based exception handler by searching the stack frames of the thread in which the exception occurred. + If no frame-based handler can be found, the UnhandledExceptionFilter filter is called. + Default handling based on the exception type. So, similar to what we do for asan, we should use SetUnhandledExceptionFilter instead of AddVectoredExceptionHandler, so user's code that is being fuzzed can execute frame-based exception handlers before we catch them. We want to catch unhandled exceptions, not all the exceptions. Differential Revision: https://reviews.llvm.org/D29462 llvm-svn: 293920	2017-02-02 19:07:53 +00:00
Peter Collingbourne	37e2459186	FunctionImport: Remove the -disable-force-link-odr flag and change importFunctions to never force link. This removes some functionality that was only being used by tests. Differential Revision: https://reviews.llvm.org/D29439 llvm-svn: 293919	2017-02-02 18:42:25 +00:00
Mehdi Amini	97624fb1ec	[ThinLTO] Add an auto-hide feature When a symbol is not exported outside of the DSO, it can be hidden. Usually we try to internalize as much as possible, but it is not always possible, for instance a symbol can be referenced outside of the LTO unit, or there can be cross-module reference in ThinLTO. This is a recommit of r293912 after fixing build failures. Differential Revision: https://reviews.llvm.org/D28978 llvm-svn: 293918	2017-02-02 18:31:35 +00:00
Nirav Dave	93f9d5ce04	Revert "In visitSTORE, always use FindBetterChain, rather than only when UseAA is enabled." This reverts commit r293893 which is miscompiling lua on ARM and bootstrapping for x86-windows. llvm-svn: 293915	2017-02-02 18:24:55 +00:00
Mehdi Amini	827600deaf	Revert "[ThinLTO] Add an auto-hide feature" This reverts r293912, bots are broken. llvm-svn: 293914	2017-02-02 18:24:37 +00:00
Mehdi Amini	dc5a7444f0	[ThinLTO] Add an auto-hide feature When a symbol is not exported outside of the DSO, it can be hidden. Usually we try to internalize as much as possible, but it is not always possible, for instance a symbol can be referenced outside of the LTO unit, or there can be cross-module reference in ThinLTO. Differential Revision: https://reviews.llvm.org/D28978 llvm-svn: 293912	2017-02-02 18:13:46 +00:00
Simon Dardis	08ce5fb66b	[mips] Expansion of BEQL and BNEL with immediate operands Adds support for BEQL and BNEL macros with immediate operands. Patch by: Srdjan Obucina Reviewers: dsanders, zoran.jovanovic, vkalintiris, sdardis, obucina, seanbruno Differential Revision: https://reviews.llvm.org/D17040 llvm-svn: 293905	2017-02-02 16:13:49 +00:00
Amaury Sechet	f3e421d6e9	Use N0 instead of N->getOperand(0) in DagCombiner::visitAdd. NFC llvm-svn: 293903	2017-02-02 16:07:44 +00:00
Jonas Paulsson	b7a2ef8375	[SystemZ] Add comment for ISD::FP_TO_UINT expansion. (Copied from the fp-conv-10.ll test to SystemZISelLowering.cpp) Review: Ulrich Weigand llvm-svn: 293900	2017-02-02 15:42:14 +00:00
Krzysztof Parzyszek	bc4dc9b4b9	[Hexagon] Emitting individual instructions without copying them Patch by Colin LeMahieu. llvm-svn: 293899	2017-02-02 15:32:26 +00:00
Jun Bum Lim	180bc5a021	[JumpThread] Enhance finding partial redundant loads by continuing scanning single predecessor Summary: While scanning predecessors to find an available loaded value, if the predecessor has a single predecessor, we can continue scanning through the single predecessor. Reviewers: mcrosier, rengolin, reames, davidxl, haicheng Reviewed By: rengolin Subscribers: zzheng, llvm-commits Differential Revision: https://reviews.llvm.org/D29200 llvm-svn: 293896	2017-02-02 15:12:34 +00:00
Krzysztof Parzyszek	f65b8f14f4	[Hexagon] Rename TypeCOMPOUND to TypeCJ llvm-svn: 293894	2017-02-02 15:03:30 +00:00
Nirav Dave	4442667fc5	In visitSTORE, always use FindBetterChain, rather than only when UseAA is enabled. Recommiting after fixing X86 inc/dec chain bug. * Simplify Consecutive Merge Store Candidate Search Now that address aliasing is much less conservative, push through simplified store merging search and chain alias analysis which only checks for parallel stores through the chain subgraph. This is cleaner as the separation of non-interfering loads/stores from the store-merging logic. When merging stores, search up the chain through a single load, and find all possible stores by looking down through a load and a TokenFactor to all stores visited. This improves the quality of the output SelectionDAG and the output Codegen (save perhaps for some ARM cases where we correctly construct wider loads, but then promote them to float operations which appear but require more expensive constant generation). Some minor peephole optimizations to deal with improved SubDAG shapes (listed below) Additional Minor Changes: 1. Finishes removing unused AliasLoad code 2. Unifies the chain aggregation in the merged stores across code paths 3. Re-add the Store node to the worklist after calling SimplifyDemandedBits. 4. Increase GatherAllAliasesMaxDepth from 6 to 18. That number is arbitrary, but seems sufficient to not cause regressions in tests. 5. Remove Chain dependencies of Memory operations on CopyfromReg nodes as these are captured by data dependence 6. Forward loads-store values through tokenfactors containing {CopyToReg,CopyFromReg} Values. 7. Peephole to convert buildvector of extract_vector_elt to extract_subvector if possible (see CodeGen/AArch64/store-merge.ll) 8. Store merging for the ARM target is restricted to 32-bit as in some contexts invalid 64-bit operations are being generated. This can be removed once appropriate checks are added. This finishes the change Matt Arsenault started in r246307 and jyknight's original patch.
Many tests required some changes as memory operations are now reorderable, improving load-store forwarding. One test in particular is worth noting: CodeGen/PowerPC/ppc64-align-long-double.ll - Improved load-store forwarding converts a load-store pair into a parallel store and a memory-realized bitcast of the same value. However, because we lose the sharing of the explicit and implicit store values we must create another local store. A similar transformation happens before SelectionDAG as well. Reviewers: arsenm, hfinkel, tstellarAMD, jyknight, nhaehnle llvm-svn: 293893	2017-02-02 14:39:42 +00:00
Nirav Dave	e14300e270	[X86,ISEL] Fix X86 increment chain dependence calculation Merging Load-add-store pattern into a increment op previously dropped the load's chain from the instructions dependence if the store is chained to a TokenFactor. llvm-svn: 293892	2017-02-02 14:39:26 +00:00
Diana Picus	32cd9b434c	[ARM] GlobalISel: Lower pointer args and returns It is important to change the ArgInfo's type from pointer to integer, otherwise the CC assign function won't know what to do. Instead of hacking it up, we use ComputeValueVTs and introduce some of the helpers that we will need later on for lowering more complex types. llvm-svn: 293889	2017-02-02 14:01:00 +00:00
Diana Picus	0c11c7b5c7	[ARM] GlobalISel: Error out instead of asserting Allow unknown types in TLI.getValueType, otherwise we get asserts for certain types that we do not support yet (instead of returning that we don't support them and falling through the normal error path). llvm-svn: 293888	2017-02-02 14:00:54 +00:00
Anna Thomas	7f4b26e189	[LICM] Hoist loads that are dominated by invariant.start intrinsic, and are invariant in the loop. Summary: We can hoist out loads that are dominated by invariant.start, to the preheader. We conservatively assume the load is variant, if we see a corresponding use of invariant.start (it could be an invariant.end or an escaping call). Reviewers: mkuper, sanjoy, reames Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D29331 llvm-svn: 293887	2017-02-02 13:22:03 +00:00
Diana Picus	fc19a8ff07	[ARM] GlobalISel: Legalize loading pointers Make it legal to load pointer values. Also check that pointers are assigned to the GPR reg bank by default. llvm-svn: 293886	2017-02-02 13:20:49 +00:00
Simon Pilgrim	20ab6b875a	[X86][SSE] Use MOVMSK for all_of/any_of reduction patterns This is a first attempt at using the MOVMSK instructions to replace all_of/any_of reduction patterns (i.e. an and/or + shuffle chain). So far this only matches patterns where we are reducing an all/none bits source vector (i.e. a comparison result) but we should be able to expand on this in conjunction with improvements to 'bool vector' handling both in the x86 backend as well as the vectorizers etc. Differential Revision: https://reviews.llvm.org/D28810 llvm-svn: 293880	2017-02-02 11:52:33 +00:00
Craig Topper	047a8be18a	[X86] Remove some unused DAGCombinerInfo parameters. NFC llvm-svn: 293873	2017-02-02 08:03:23 +00:00
Craig Topper	94ed54b49a	[X86] Move some INSERT_SUBVECTOR optimizations from legalize to DAG combine. This moves creation of SUBV_BROADCAST and merging of adjacent loads that are being inserted together. This is a step towards removing legalizing of INSERT_SUBVECTOR except for vXi1 cases. llvm-svn: 293872	2017-02-02 08:03:20 +00:00
Adam Nemet	0bf1b863b9	[LV] Also port failure remarks to new OptimizationRemarkEmitter API llvm-svn: 293866	2017-02-02 05:41:51 +00:00
Peter Collingbourne	4613626d49	LTO: Link non-prevailing weak_odr or linkonce_odr globals into the combined module with available_externally linkage. These linkages mean that the ultimately prevailing symbol will have the same semantics as any non-prevailing copy of the symbol, so we are free to ignore the linker's resolution. Differential Revision: https://reviews.llvm.org/D29367 llvm-svn: 293865	2017-02-02 05:22:42 +00:00
Peter Collingbourne	c387e70c69	Linker: Move special casing for available_externally in IRMover to clients. NFCI. The goal is to simplify the semantic model for clients of IRMover. Differential Revision: https://reviews.llvm.org/D29435 llvm-svn: 293864	2017-02-02 05:12:15 +00:00
Craig Topper	b81e6c48f8	[AVX-512] Fix the implicit defs for VZEROALL/VZEROUPPER to include YMM16-YMM31. llvm-svn: 293862	2017-02-02 04:17:18 +00:00
Matt Arsenault	300836098f	InferAddressSpaces: Handle more cases with constant select operands llvm-svn: 293859	2017-02-02 03:37:22 +00:00
Matt Arsenault	9dba9bd4cf	AMDGPU: Use source modifiers with f16->f32 conversions The operand types were defined to fit the fp16_to_fp node, which has the half as an integer type. v_cvt_f32_f16 does support source modifiers, so change this to have an FP type and modifiers. For targets without legal f16, this requires recognizing the bit operations and trying to produce them. llvm-svn: 293857	2017-02-02 02:27:04 +00:00
Matthias Braun	9dc3b5ff89	RegisterCoalescer: Cleanup joinReservedPhysReg(); NFC - Factor out a common subexpression - Add some helpful comments - Fix printing of a register in a debug message llvm-svn: 293856	2017-02-02 02:23:27 +00:00
Matthias Braun	5b49f95592	AArch64RegisterInfo: Simplify getReservedReg(); NFC After marking a 32bit register and all its super registers the 64bit register does not need to be marked again. llvm-svn: 293855	2017-02-02 02:23:25 +00:00
Matt Arsenault	8e190b2f23	NVPTX: Fix not preserving volatile when expanding memset llvm-svn: 293851	2017-02-02 01:20:34 +00:00
Omair Javaid	f5d560bc84	Fix LLDB Android AArch64 GCC debug info build Committing after fixing suggested changes and tested release/debug builds on x86_64-linux and arm/aarch64 builds. Differential revision: https://reviews.llvm.org/D29042 llvm-svn: 293850	2017-02-02 01:17:49 +00:00
Rui Ueyama	a9b29615fb	Re-submit r293820: Return Error instead of bool from mergeTypeStreams(). llvm-svn: 293847	2017-02-02 00:47:10 +00:00
Davide Italiano	cb68f37184	[IPSCCP] Restore the old behaviour (pre r293799). It's not clear the change I made is a good idea, and it definitely needs further discussion. Thanks to Eli for pointing this out. llvm-svn: 293846	2017-02-02 00:46:54 +00:00
Peter Collingbourne	dc5e583687	X86: Produce @ABS8 symbol modifiers for absolute symbols in range [0,128). Differential Revision: https://reviews.llvm.org/D28689 llvm-svn: 293844	2017-02-02 00:32:03 +00:00
Matt Arsenault	db6e9e89a9	InferAddressSpaces: clang-format some things llvm-svn: 293843	2017-02-02 00:28:25 +00:00
Paul Robinson	5362216c36	Remove an assertion that doesn't hold when mixing -g and -gmlt through LTO. Replace it with a related assertion, ensuring that abstract variables appear only in abstract scopes. Part of PR31437. Differential Revision: http://reviews.llvm.org/D29430 llvm-svn: 293841	2017-02-01 23:51:56 +00:00
Stanislav Mekhanoshin	2b913b1f49	[AMDGPU] Account workgroup size in LDS occupancy limits Functions matching LDS use to occupancy return results for a workgroup of 64 workitems. The numbers have to be adjusted for bigger workgroups. For example, a workgroup of size 256 already occupies 4 waves just by itself. Given that all numbers of LDS use in the compiler are per workgroup, occupancy shall be multiplied by 4 in this case. Each 64 workitems are still limited by the same number, but 4 subgroups of 64 workitems each can afford 4 times more LDS to get the same occupancy. In addition, this change initializes LDS size in the subtarget to a real value for SI+ targets. This is required since LDS size is a variable in these calculations. Differential Revision: https://reviews.llvm.org/D29423 llvm-svn: 293837	2017-02-01 22:59:50 +00:00
Eugene Zelenko	c5eb8e29d0	[AArch64] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC). llvm-svn: 293836	2017-02-01 22:56:06 +00:00
Dehao Chen	0944a8c2ec	Change debug-info-for-profiling from a TargetOption to a function attribute. Summary: LTO requires the debug-info-for-profiling to be a function attribute. Reviewers: echristo, mehdi_amini, dblaikie, probinson, aprantl Reviewed By: mehdi_amini, dblaikie, aprantl Subscribers: aprantl, probinson, ahatanak, llvm-commits, mehdi_amini Differential Revision: https://reviews.llvm.org/D29203 llvm-svn: 293833	2017-02-01 22:45:09 +00:00
Marcos Pividori	ba03abebfe	[libFuzzer] Disable afl tests on non-posix systems. AflDriver is not supported on non posix systems. Differential Revision: https://reviews.llvm.org/D29422 llvm-svn: 293830	2017-02-01 22:40:50 +00:00
Marcos Pividori	36464dd6a5	[libFuzzer] Disable equivalence tests on non posix systems. We can not run this test until we implement shared memory on Windows. Differential Revision: https://reviews.llvm.org/D29421 llvm-svn: 293829	2017-02-01 22:40:45 +00:00
Marcos Pividori	b056879700	[libFuzzer] Isolate merge tests that require posix. Differential Revision: https://reviews.llvm.org/D29420 llvm-svn: 293828	2017-02-01 22:40:40 +00:00
Marcos Pividori	9c0244c1eb	[libFuzzer] Add features `windows` and `posix` for lit tests. Add 2 features: posix and windows. Sometimes we want some specific tests only for posix and we use: REQUIRES: posix Sometimes we want some specific tests only for windows and we use: REQUIRES: windows Differential Revision: https://reviews.llvm.org/D29418 llvm-svn: 293827	2017-02-01 22:40:34 +00:00
Marcos Pividori	477d153045	[libFuzzer] Accept different extensions. Differential Revision: https://reviews.llvm.org/D29417 llvm-svn: 293826	2017-02-01 22:40:29 +00:00
Marcos Pividori	b340471ff5	[libFuzzer] Fix test because cmd prompt does not expand wildcard. Commands should expand the wildcards on Windows, the cmd prompt doesn't. Because of that sancov was not finding the needed file. To deal with this, we use ls and xargs from gnu win utils. Differential Revision: https://reviews.llvm.org/D29374 llvm-svn: 293825	2017-02-01 22:39:55 +00:00
Rui Ueyama	7d07a1652d	Revert r293820: Return Error instead of bool from mergeTypeStreams(). It broke buildbots. llvm-svn: 293824	2017-02-01 22:28:43 +00:00
Sanjay Patel	52e4e6594e	[ValueTracking] remove a FIXME for something we don't want to do; NFC The comment was added with: https://reviews.llvm.org/rL293773 ...but there would be a cost to implement this and possibly no payoff. llvm-svn: 293823	2017-02-01 22:27:34 +00:00
Rui Ueyama	00d4f49717	Return Error instead of bool from mergeTypeStreams(). Previously, mergeTypeStreams returns only true or false, so it was impossible to know the reason if it failed. This patch changes the function signature so that it returns an Error object. Differential Revision: https://reviews.llvm.org/D29362 llvm-svn: 293820	2017-02-01 22:09:34 +00:00
Paul Robinson	a380e613f1	Remove an assertion that doesn't hold when mixing -g and -gmlt through LTO. Part of PR31437. Differential Revision: http://reviews.llvm.org/D29310 llvm-svn: 293818	2017-02-01 21:54:50 +00:00
Sanjay Patel	c56d1ccd79	[InstCombine] move folds for shift-shift pairs; NFCI Although this is 'no-functional-change-intended', I'm adding tests for shl-shl and lshr-lshr pairs because there is no existing test coverage for those folds. It seems like we should be able to remove some code from foldShiftedShift() at this point because we're handling those patterns on the general path. llvm-svn: 293814	2017-02-01 21:31:34 +00:00
Michael Kuperstein	3c6b3ba258	Shut up another GCC warning about operator precedence. NFC. llvm-svn: 293812	2017-02-01 21:06:33 +00:00
Matt Arsenault	74f64833bc	AMDGPU: Allow clustering flat memory operations llvm-svn: 293809	2017-02-01 20:22:51 +00:00
Jun Bum Lim	423406fdcb	[JumpThread] No need to erase BB from LoopHeaders. NFC. Summary: No need to try to erase BB from LoopHeaders as we already know that BB is not in LoopHeaders. Reviewers: hsung, majnemer, mcrosier, haicheng, rengolin Reviewed By: rengolin Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D29232 llvm-svn: 293802	2017-02-01 19:06:55 +00:00
Davide Italiano	6849f20d85	[IPSCCP] Don't propagate return values of functions marked as noinline. This tries to address what Hal defined (in the post-commit review of r293727) as a long-standing problem with noinline, where we end up de facto inlining trivial functions e.g. __attribute__((noinline)) int patatino(void) { return 5; } because of return value propagation. llvm-svn: 293799	2017-02-01 18:52:20 +00:00
Simon Dardis	ac9c30c37f	[mips] Parse the 'bopt' and 'nobopt' directives in IAS. The GAS assembler supports the ".set bopt" directive but according to the sources it doesn't do anything. It's supposed to optimize branches by filling the delay slot of a branch with its target. This patch teaches the MIPS asm parser to accept both and warn in the case of 'bopt' that the bopt directive is unsupported. This resolves PR/31841. Thanks to Sean Bruno for reporting the issue! llvm-svn: 293798	2017-02-01 18:50:24 +00:00
Zachary Turner	d50c01308e	[pdb] Add a new command for analyzing hash collisions. This introduces the `analyze` subcommand. For now there is only one option, to analyze hash collisions in the type streams. In the future, however, we could add many more things here, such as performing size analyses, compacting, and statistics about the type of records etc. llvm-svn: 293795	2017-02-01 18:30:22 +00:00
Marcos Pividori	460886e3cf	[libFuzzer] Do not use llvm-objdump for disassembling a DSO. When disassembling a DSO, for calls to functions from the PLT, llvm-objdump only prints the offset from the PLT, like: <.plt+0x30>. While objdump and dumpbin print the function name, like: <__sanitizer_cov_trace_pc_guard@plt> When analyzing the coverage in libFuzzer we disassemble and look for the calls to __sanitizer_cov_trace_pc_guard. So, this fails when using llvm-objdump on a DSO. Differential Revision: https://reviews.llvm.org/D29372 llvm-svn: 293791	2017-02-01 17:59:23 +00:00
Marcos Pividori	7a3a390afb	[libFuzzer] Properly check if we can use dumpbin. The flag "/summary" is necessary, otherwise it returns a non-zero value. Differential Revision: https://reviews.llvm.org/D29371 llvm-svn: 293790	2017-02-01 17:59:19 +00:00
Matthew Simpson	ba5cf9dfee	[LV] Move interleaved access helper functions to VectorUtils (NFC) This patch moves some helper functions related to interleaved access vectorization out of LoopVectorize.cpp and into VectorUtils.cpp. We would like to use these functions in a follow-on patch that improves interleaved load and store lowering in (ARM/AArch64)ISelLowering.cpp. One of the functions was already duplicated there and has been removed. Differential Revision: https://reviews.llvm.org/D29398 llvm-svn: 293788	2017-02-01 17:45:46 +00:00
Sanjoy Das	e0e5795f6b	[InstCombine] Allow InstCombine to merge adjacent guards Summary: If there are two adjacent guards with different conditions, we can remove one of them and include its condition into the condition of another one. This patch allows InstCombine to merge them by the following pattern: guard(a); guard(b) -> guard(a & b). Reviewers: reames, apilipenko, igor-laevsky, anna, sanjoy Reviewed By: sanjoy Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D29378 llvm-svn: 293778	2017-02-01 16:34:55 +00:00
Simon Pilgrim	ca931efc21	[X86][SSE] Remove unused argument. NFCI. llvm-svn: 293777	2017-02-01 16:34:50 +00:00
Matt Arsenault	d59e640455	AMDGPU: Improve nsw/nuw/exact when promoting uniform i16 ops These were simply preserving the flags of the original operation, which was too conservative in most cases and incorrect for mul. nsw/nuw may be needed for some combines to cleanup messes when intermediate sext_inregs are introduced later. Tested valid combinations with alive. llvm-svn: 293776	2017-02-01 16:25:23 +00:00
Sanjoy Das	08da2e28ee	[ImplicitNullCheck] Extend canReorder scope Summary: This change allows a re-order of two instructions if their uses are overlapped. Patch by Serguei Katkov! Reviewers: reames, sanjoy Reviewed By: sanjoy Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D29120 llvm-svn: 293775	2017-02-01 16:04:21 +00:00
Sanjay Patel	25f6d710d9	[ValueTracking] avoid crashing from bad assumptions (PR31809) A program may contain llvm.assume info that disagrees with other analysis. This may be caused by UB in the program, so we must not crash because of that. As noted in the code comments: https://llvm.org/bugs/show_bug.cgi?id=31809 ...we can do better, but this at least avoids the assert/crash in the bug report. Differential Revision: https://reviews.llvm.org/D29395 llvm-svn: 293773	2017-02-01 15:41:32 +00:00
Simon Dardis	6433d5af6b	[mips] Fix an initialization issue with MipsABIInfo in MipsTargetELFStreamer DebugInfoDWARFTests is the only user so far which initializes the MCObjectStreamer without initializing the ASMParser. The MIPS backend relies on the ASMParser to initialize the MipsABIInfo object and to update the target streamer with it. This should turn the mips buildbots green. Reviewers: atanasyan, zoran.jovanovic Differential Revision: https://reviews.llvm.org/D28025 llvm-svn: 293772	2017-02-01 15:39:23 +00:00
Kit Barton	d26978796e	[PowerPC] Fix sjlj pseudo instructions to use G8RC_NOX0 register class The following instructions: - LD/LWZ (expanded from sjLj pseudo-instructions) - LXVL/LXVLL vector loads - STXVL/STXVLL vector stores all require G8RC_NOX0 class registers for RA. Differential Revision: https://reviews.llvm.org/D29289 Committed for Lei Huang llvm-svn: 293769	2017-02-01 14:33:57 +00:00
Simon Pilgrim	55a9c79bd1	[X86][SSE] Merge SSE2 PINSRW lowering with SSE41 PINSRB/PINSRW lowering. NFCI. These are identical apart from the extra SSE41 guard for PINSRB. llvm-svn: 293766	2017-02-01 13:32:19 +00:00
Florian Hahn	7a5ec55fb3	[legalizetypes] Push fp16 -> fp32 extension node to worklist. Summary: This way, the type legalization machinery will take care of registering the result of this node properly. This patch fixes all failing fp16 test cases with expensive checks. (CodeGen/ARM/fp16-promote.ll, CodeGen/ARM/fp16.ll, CodeGen/X86/cvt16.ll CodeGen/X86/soft-fp.ll) Reviewers: t.p.northover, baldrick, olista01, bogner, jmolloy, davidxl, ab, echristo, hfinkel Reviewed By: hfinkel Subscribers: mehdi_amini, hfinkel, davide, RKSimon, aemerson, llvm-commits Differential Revision: https://reviews.llvm.org/D28195 llvm-svn: 293765	2017-02-01 13:01:33 +00:00
Artur Pilipenko	2cbaded5b5	[LoopPredication] Add a new line to debug output in LoopPredication pass llvm-svn: 293762	2017-02-01 12:25:38 +00:00
Javed Absar	e5ad87e939	[ARM] Enable Cortex-M23 and Cortex-M33 support. Add both cores to the target parser and TableGen. Test that eabi attributes are set correctly for both cores. Additionally, test the absence and presence of MOVT in Cortex-M23 and Cortex-M33, respectively. Committed on behalf of Sanne Wouda. Reviewers : rengolin, olista01. Differential Revision: https://reviews.llvm.org/D29073 llvm-svn: 293761	2017-02-01 11:55:03 +00:00
Florian Hahn	a35b8a4852	[LoopUnroll] Use addClonedBlockToLoopInfo to add loop header to LI (NFC). Summary: I have a similar patch up for review already (D29173). If you prefer I can squash them both together. Also I think there is more potential for code sharing between LoopUnroll.cpp and LoopUnrollRuntime.cpp. Do you think patches for that would be worthwhile? Reviewers: mkuper, mzolotukhin Reviewed By: mkuper, mzolotukhin Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D29311 llvm-svn: 293758	2017-02-01 10:39:35 +00:00
NAKAMURA Takumi	468487d71a	*MacroFusion.cpp: Suppress warnings to eliminate \param(s). [-Wdocumentation] llvm-svn: 293744	2017-02-01 07:30:46 +00:00
Craig Topper	0bcba19cdf	[X86] For AVX1/AVX2 isel, don't use FP move instructions for 128-bit loads/stores of integer types. For SSE we use fp because of the smaller encoding, but that doesn't apply to AVX. So just do the natural thing so we don't have to explain why we aren't. We can't do this for 256-bit loads/stores since integer loads and stores aren't available in AVX1 so we need fallback patterns since the integer types are legal. This doesn't affect any tests because execution domain fixing freely converts the instructions anyway. Honestly, we could probably rely on it for the SSE size optimization too. llvm-svn: 293743	2017-02-01 07:17:16 +00:00
Evandro Menezes	455382ea22	[AArch64] Add new target feature to fuse literal generation This feature enables the fusion of such operations on Cortex A57, as recommended in its Software Optimisation Guide, sections 4.14 and 4.15. Differential revision: https://reviews.llvm.org/D28698 llvm-svn: 293739	2017-02-01 02:54:42 +00:00
Evandro Menezes	b21fb29c26	[AArch64] Add new subtarget feature to fuse AES crypto operations This feature enables the fusion of such operations on Cortex A57, as recommended in its Software Optimisation Guide, section 4.13, and on Exynos M1. Differential revision: https://reviews.llvm.org/D28491 llvm-svn: 293738	2017-02-01 02:54:39 +00:00
Evandro Menezes	94edf02923	[CodeGen] Move MacroFusion to the target This patch moves the class for scheduling adjacent instructions, MacroFusion, to the target. In AArch64, it also expands the fusion to all instructions pairs in a scheduling block, beyond just among the predecessors of the branch at the end. Differential revision: https://reviews.llvm.org/D28489 llvm-svn: 293737	2017-02-01 02:54:34 +00:00
Sanjoy Das	15e50b510e	[ImplicitNullCheck] NFC isSuitableMemoryOp cleanup Summary: isSuitableMemoryOp method is responsible for verification that instruction is a candidate to use in implicit null check. Additionally it checks that base register is not re-defined before. In case base has been re-defined it just returns false and lookup is continued while any suitable instruction will not succeed this check as well. This results in redundant further operations. So when we found that base register has been re-defined we just stop. Patch by Serguei Katkov! Reviewers: reames, sanjoy Reviewed By: sanjoy Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D29119 llvm-svn: 293736	2017-02-01 02:49:25 +00:00
Justin Bogner	41e632bf6b	SanitizerCoverage: Support sanitizer guard section on darwin MachO's sections need a segment as well as a section name, and the section start and end symbols are spelled differently than on ELF. llvm-svn: 293733	2017-02-01 02:38:39 +00:00
Matthias Braun	8d115a384c	MCMacho: Allow __thread_ptr section after dwarf sections Differential Revision: https://reviews.llvm.org/D29315 llvm-svn: 293730	2017-02-01 01:31:36 +00:00
Eugene Zelenko	926883e1c2	[Mips] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC). llvm-svn: 293729	2017-02-01 01:22:51 +00:00
Stanislav Mekhanoshin	70c245e92d	Fix regalloc assignment of overlapping registers SplitEditor::defFromParent() can create a register copy. If register is a tuple of other registers and not all lanes are used a copy will be done on a full tuple regardless. Later register unit for an unused lane will be considered free and another overlapping register tuple can be assigned to a different value even though first register is live at that point. That is because interference only looks at liveness info, while full register copy clobbers all lanes, even unused. This patch fixes copy to only cover used lanes. Differential Revision: https://reviews.llvm.org/D29105 llvm-svn: 293728	2017-02-01 01:18:36 +00:00
Davide Italiano	7343b9f340	[IPSCCP] Teach how to not propagate return values of naked functions. Differential Revision: https://reviews.llvm.org/D29360 llvm-svn: 293727	2017-02-01 01:01:22 +00:00
Matt Arsenault	da7a656542	AMDGPU: Cleanup fmin/fmax legacy function Use a more specific subtarget check and combine hasOneUse checks llvm-svn: 293726	2017-02-01 00:42:40 +00:00
Matt Arsenault	bdd59e6879	InferAddressSpaces: Handle select This fails to handle some cases where one of the inputs is a constant to be fixed in a later commit. llvm-svn: 293723	2017-02-01 00:08:53 +00:00
Kostya Serebryany	5c76e3d034	[libFuzzer] increase the default size for shmem llvm-svn: 293722	2017-02-01 00:07:47 +00:00
Dean Michael Berris	0e8ababf7d	[XRay] Define the InstrumentationMap type Summary: This change implements the instrumentation map loading library which can understand both YAML-defined instrumentation maps, and ELF 64-bit object files that have the XRay instrumentation map section. We break it out into a library on its own to allow for other applications to deal with the XRay instrumentation map defined in XRay-instrumented binaries. This type provides both raw access to the logical representation of the instrumentation map entries as well as higher level functions for converting a function ID into a function address. At this point we only support ELF64 binaries and YAML-defined XRay instrumentation maps. Future changes should extend this to support 32-bit ELF binaries, as well as other binary formats (like MachO). As part of this change we also migrate all uses of the extraction logic that used to be defined in tools/llvm-xray/ to use this new type and interface for loading from files. We also remove the flag from the `llvm-xray` tool that required users to specify the type of the instrumentation map file being provided to instead make the library auto-detect the file type. Reviewers: dblaikie Subscribers: mgorny, varno, llvm-commits Differential Revision: https://reviews.llvm.org/D29319 llvm-svn: 293721	2017-02-01 00:05:29 +00:00
Matt Arsenault	864fbacb4a	InferAddressSpaces: Remove dead declaration llvm-svn: 293720	2017-01-31 23:57:20 +00:00
Matt Arsenault	517a290e4f	InferAddressSpaces: Avoid double map lookup llvm-svn: 293719	2017-01-31 23:48:44 +00:00
Matt Arsenault	2a46d81038	InferAddressSpaces: Fix broken casting of constants llvm-svn: 293718	2017-01-31 23:48:40 +00:00
Matt Arsenault	1575cb893c	AMDGPU: Fix warning llvm-svn: 293717	2017-01-31 23:48:37 +00:00
Kyle Butt	b15c06677c	CodeGen: Allow small copyable blocks to "break" the CFG. When choosing the best successor for a block, ordinarily we would have preferred a block that preserves the CFG unless there is a strong probability in the other direction. For small blocks that can be duplicated we now skip that requirement as well, subject to some simple frequency calculations. Differential Revision: https://reviews.llvm.org/D28583 llvm-svn: 293716	2017-01-31 23:48:32 +00:00
Rafael Espindola	a86be22230	Move more code to helper functions. NFC. llvm-svn: 293715	2017-01-31 23:26:32 +00:00
Justin Lebar	06fcea4cd9	[NVPTX] Compute approx sqrt as 1/rsqrt(x) rather than x*rsqrt(x). x*rsqrt(x) returns NaN for x == 0, whereas 1/rsqrt(x) returns 0, as desired. Verified that the particular nvptx approximate instructions here do in fact return 0 for x = 0. llvm-svn: 293713	2017-01-31 23:08:57 +00:00
Rafael Espindola	d9953d9dd2	Move some code to a helper function. NFC. llvm-svn: 293712	2017-01-31 23:07:08 +00:00
Michael Kuperstein	e18aad39ab	Shut up GCC warning about operator precedence. NFC. Technically, this actually changes the expression and the original assert was "wrong", but since the conjunction is with true, it doesn't matter in this case. llvm-svn: 293709	2017-01-31 22:48:45 +00:00
Daniel Berlin	97718e6081	NewGVN: Dead argument cleanup llvm-svn: 293708	2017-01-31 22:32:03 +00:00
Daniel Berlin	ff12c922fe	NewGVN: Cleanup conditions to match reality llvm-svn: 293707	2017-01-31 22:32:01 +00:00
Daniel Berlin	c22aafe5b3	NewGVN: Add basic support for symbolic comparison evaluation llvm-svn: 293706	2017-01-31 22:31:58 +00:00
Daniel Berlin	808e3ff8a2	NewGVN: Formatting cleanup after lookupOperandLeader change llvm-svn: 293705	2017-01-31 22:31:56 +00:00
Daniel Berlin	203f47bbd8	NewGVN: Remove the unused two arguments from lookupOperandLeader. llvm-svn: 293704	2017-01-31 22:31:53 +00:00
Daniel Berlin	74d300361a	NewGVN: Cleanup header files we are using. llvm-svn: 293703	2017-01-31 22:31:50 +00:00
David Blaikie	0012dd5db1	Add a verbose/human readable mode to llvm-symbolizer to investigate discriminators and other line table/backtrace features Patch by Simon Que! Differential Revision: https://reviews.llvm.org/D29094 llvm-svn: 293697	2017-01-31 22:19:38 +00:00
Davide Italiano	116464a55d	[NewGVN] Preserve TargetLibraryInfo analysis. We can maybe preserve more but this is a first step. Ack'ed by Danny on IRC. llvm-svn: 293694	2017-01-31 21:53:18 +00:00
Davide Italiano	5a473d230d	[Support] Add newline when dumping an APInt. This annoyed me a few times, but I was lazy, so I hadn't fixed it until today, when the output of my debugger was too confusing. llvm-svn: 293691	2017-01-31 21:26:18 +00:00
Rafael Espindola	2d55781ae3	Make this file clang-format friendly and clang-format it. llvm-svn: 293689	2017-01-31 21:11:12 +00:00
Taewook Oh	75acec8a14	Do not propagate DebugLoc across basic blocks Summary: DebugLoc shouldn't be propagated across basic blocks to prevent incorrect stepping and imprecise sample profile result. rL288903 addressed the wrong DebugLoc propagation issue by limiting the copy of DebugLoc when GVN removes a fully redundant load that is dominated by some other load. However, DebugLoc is still incorrectly propagated in the following example: ``` 1: extern int g; 2: 3: void foo(int x, int y, int z) { 4: if (x) 5: g = 0; 6: else 7: g = 1; 8: 9: int i = 0; 10: for ( ; i < y ; i++) 11: if (i > z) 12: g++; 13: } ``` Below is LLVM IR representation of the program before GVN: ``` @g = external local_unnamed_addr global i32, align 4 ; Function Attrs: nounwind uwtable define void @foo(i32 %x, i32 %y, i32 %z) local_unnamed_addr #0 !dbg !4 { entry: %not.tobool = icmp eq i32 %x, 0, !dbg !8 %.sink = zext i1 %not.tobool to i32, !dbg !8 store i32 %.sink, i32* @g, align 4, !tbaa !9 %cmp8 = icmp sgt i32 %y, 0, !dbg !13 br i1 %cmp8, label %for.body.preheader, label %for.end, !dbg !17 for.body.preheader: ; preds = %entry br label %for.body, !dbg !19 for.body: ; preds = %for.body.preheader, %for.inc %i.09 = phi i32 [ %inc4, %for.inc ], [ 0, %for.body.preheader ] %cmp1 = icmp sgt i32 %i.09, %z, !dbg !19 br i1 %cmp1, label %if.then2, label %for.inc, !dbg !21 if.then2: ; preds = %for.body %0 = load i32, i32* @g, align 4, !dbg !22, !tbaa !9 %inc = add nsw i32 %0, 1, !dbg !22 store i32 %inc, i32* @g, align 4, !dbg !22, !tbaa !9 br label %for.inc, !dbg !23 for.inc: ; preds = %for.body, %if.then2 %inc4 = add nuw nsw i32 %i.09, 1, !dbg !24 %exitcond = icmp ne i32 %inc4, %y, !dbg !13 br i1 %exitcond, label %for.body, label %for.end.loopexit, !dbg !17 for.end.loopexit: ; preds = %for.inc br label %for.end, !dbg !26 for.end: ; preds = %for.end.loopexit, %entry ret void, !dbg !26 } ``` where ``` !21 = !DILocation(line: 11, column: 9, scope: !15) !22 = !DILocation(line: 12, column: 8, scope: !20) !23 = 
!DILocation(line: 12, column: 7, scope: !20) !24 = !DILocation(line: 10, column: 20, scope: !25) ``` And below is after GVN: ``` @g = external local_unnamed_addr global i32, align 4 define void @foo(i32 %x, i32 %y, i32 %z) local_unnamed_addr !dbg !4 { entry: %not.tobool = icmp eq i32 %x, 0, !dbg !8 %.sink = zext i1 %not.tobool to i32, !dbg !8 store i32 %.sink, i32* @g, align 4, !tbaa !9 %cmp8 = icmp sgt i32 %y, 0, !dbg !13 br i1 %cmp8, label %for.body.preheader, label %for.end, !dbg !17 for.body.preheader: ; preds = %entry br label %for.body, !dbg !19 for.body: ; preds = %for.inc, %for.body.preheader %0 = phi i32 [ %1, %for.inc ], [ %.sink, %for.body.preheader ], !dbg !21 %i.09 = phi i32 [ %inc4, %for.inc ], [ 0, %for.body.preheader ] %cmp1 = icmp sgt i32 %i.09, %z, !dbg !19 br i1 %cmp1, label %if.then2, label %for.inc, !dbg !22 if.then2: ; preds = %for.body %inc = add nsw i32 %0, 1, !dbg !21 store i32 %inc, i32* @g, align 4, !dbg !21, !tbaa !9 br label %for.inc, !dbg !23 for.inc: ; preds = %if.then2, %for.body %1 = phi i32 [ %inc, %if.then2 ], [ %0, %for.body ] %inc4 = add nuw nsw i32 %i.09, 1, !dbg !24 %exitcond = icmp ne i32 %inc4, %y, !dbg !13 br i1 %exitcond, label %for.body, label %for.end.loopexit, !dbg !17 for.end.loopexit: ; preds = %for.inc br label %for.end, !dbg !26 for.end: ; preds = %for.end.loopexit, %entry ret void, !dbg !26 } ``` As you see, GVN removes the load in the if.then2 block and creates a phi instruction in for.body for it. The problem is that the DebugLoc of the removed load instruction is propagated to the newly created phi instruction, which is wrong. rL288903 cannot handle this case because ValuesPerBlock.size() is not 1 in this example when the load is removed. Reviewers: aprantl, andreadb, wolfgangp Reviewed By: andreadb Subscribers: davide, llvm-commits Differential Revision: https://reviews.llvm.org/D29254 llvm-svn: 293688	2017-01-31 20:57:13 +00:00
Tim Northover	c6bfa481cf	GlobalISel: the translation of an invoke must branch to the good block. Otherwise bad things happen if the basic block order isn't trivial after an invoke. llvm-svn: 293679	2017-01-31 20:12:18 +00:00
Matthias Braun	01fa962226	InterleaveAccessPass: Avoid constructing invalid shuffle masks Fix a bug where we would construct shufflevector instructions addressing invalid elements. Differential Revision: https://reviews.llvm.org/D29313 llvm-svn: 293673	2017-01-31 18:37:53 +00:00
Tim Northover	293f74355b	GlobalISel: merge invoke and call translation paths. Well, sort of. But the lower-level code that invoke used to be using completely botched the handling of varargs functions, which hopefully won't be possible if they're using the same code. llvm-svn: 293670	2017-01-31 18:36:11 +00:00
Peter Collingbourne	d763c4cc85	MC: Introduce the ABS8 symbol modifier. @ABS8 can be applied to symbols which appear as immediate operands to instructions that have an 8-bit immediate form for that operand. It causes the assembler to use the 8-bit form and an 8-bit relocation (e.g. R_386_8 or R_X86_64_8) for the symbol. Differential Revision: https://reviews.llvm.org/D28688 llvm-svn: 293667	2017-01-31 18:28:44 +00:00
Davide Italiano	aec4617dc8	[Instcombine] Combine consecutive identical fences Differential Revision: https://reviews.llvm.org/D29314 llvm-svn: 293661	2017-01-31 18:09:05 +00:00
Arnold Schwaighofer	c368563bd6	Don't combine stores to a swifterror pointer operand to a different type llvm-svn: 293658	2017-01-31 17:53:49 +00:00
Dehao Chen	274df5ea41	Explicitly promote indirect calls before sample profile annotation. Summary: In iterative sample pgo where profile is collected from PGOed binary, we may see indirect call targets promoted and inlined in the profile. Before profile annotation, we need to make this happen in order to annotate correctly on IR. This patch explicitly promotes these indirect calls and inlines them before profile annotation. Reviewers: xur, davidxl Reviewed By: davidxl Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D29040 llvm-svn: 293657	2017-01-31 17:49:37 +00:00
Matt Arsenault	d5d78510c7	AMDGPU: Use source mods with fcanonicalize llvm-svn: 293654	2017-01-31 17:28:40 +00:00
Sanjay Patel	2217f75ad1	fix formatting; NFC llvm-svn: 293652	2017-01-31 17:25:42 +00:00
Nirav Dave	a7c041d147	[X86] Implement -mfentry Summary: Insert calls to __fentry__ at function entry. Reviewers: hfinkel, craig.topper Subscribers: mgorny, llvm-commits Differential Revision: https://reviews.llvm.org/D28000 llvm-svn: 293648	2017-01-31 17:00:27 +00:00
David Bozier	60b80d2233	Add support for demangling C++11 thread_local variables. In clang, the grammar for mangling these names is "<special-name> ::= TW <object name>" for wrapper variables or "<special-name> ::= TH <object name>" for initialization variables. Initial change was made in libcxxabi r293638 llvm-svn: 293643	2017-01-31 15:56:36 +00:00
Tom Stellard	124f5cc8c2	AMDGPU/SI: Fix inst-select-load-smrd.mir on some builds Summary: For some reason instructions are being inserted in the wrong order with some builds. I'm not sure why this is happening. Reviewers: arsenm Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, tony-tye, tpr, llvm-commits Differential Revision: https://reviews.llvm.org/D29325 llvm-svn: 293639	2017-01-31 15:24:11 +00:00
Simon Pilgrim	1b39d5db7b	[X86][SSE] Add support for combining PINSRB into a target shuffle. llvm-svn: 293637	2017-01-31 14:59:44 +00:00
Nicolai Haehnle	8813d5d221	[DAGCombine] require UnsafeFPMath for re-association of addition Summary: The affected transforms all implicitly use associativity of addition, for which we usually require unsafe math to be enabled. The "Aggressive" flag is only meant to convey information about the performance of the fused ops relative to a fmul+fadd sequence. Fixes Bug 31626. Reviewers: spatel, hfinkel, mehdi_amini, arsenm, tstellarAMD Subscribers: jholewinski, nemanjai, wdng, llvm-commits Differential Revision: https://reviews.llvm.org/D28675 llvm-svn: 293635	2017-01-31 14:35:37 +00:00
Sam Parker	9bf658d5fe	[ARM] Avoid using ARM instructions in Thumb mode The Requires class overrides the target requirements of an instruction, rather than adding to them, so all ARM instructions need to include the IsARM predicate when they have overwritten requirements. This caused the swp and swpb instructions to be allowed in thumb mode assembly, and the ARM encoding of CDP to be selected in codegen (which is different for conditional instructions). Differential Revision: https://reviews.llvm.org/D29283 llvm-svn: 293634	2017-01-31 14:35:01 +00:00
Benjamin Kramer	94a833962c	[X86] Silence unused variable warning in Release builds. llvm-svn: 293631	2017-01-31 14:13:53 +00:00
Silviu Baranga	c6d21eba0e	[InstCombine] Make sure that LHS and RHS have the same type in transformToIndexedCompare If they don't have the same type, the size of the constant index would need to be adjusted (and this wouldn't be always possible). Alternatively we could try the analysis with the initial RHS value, which would guarantee that the two sides have the same type. However it is unlikely that in practice this would pass our transformation requirements. Fixes PR31808 (https://llvm.org/bugs/show_bug.cgi?id=31808). llvm-svn: 293629	2017-01-31 14:04:15 +00:00
Simon Pilgrim	4eab18f6b8	[X86][SSE] Detect unary PBLEND shuffles. These can appear during shuffle combining. llvm-svn: 293628	2017-01-31 13:58:01 +00:00
Simon Pilgrim	c29eab52e8	[X86][SSE] Add support for combining PINSRW into a target shuffle. Also add the ability to recognise PINSR(Vex, 0, Idx). Target shuffle combines won't replace multiple insertions with a bit mask until a depth of 3 or more, so we avoid codesize bloat. The unnecessary vpblendw in clearupper8xi16a will be fixed in an upcoming patch. llvm-svn: 293627	2017-01-31 13:51:10 +00:00
Nemanja Ivanovic	2f2a6ab991	[PowerPC][Altivec] Add vmr extended mnemonic Just adds the vmr (Vector Move Register) mnemonic for the VOR instruction in the PPC back end. Committing on behalf of brunoalr (Bruno Rosa). Differential Revision: https://reviews.llvm.org/D29133 llvm-svn: 293626	2017-01-31 13:43:11 +00:00
Florian Hahn	5364cf3b56	[LoopUnroll] Use addClonedBlockToLoopInfo to clone the top level loop (NFC) Summary: rL293124 added the necessary infrastructure to properly add the cloned top level loop to LoopInfo, which means we do not have to do it manually in CloneLoopBlocks. @mkuper sorry for not pointing this out during my review of D29156, I just realized that today. Reviewers: mzolotukhin, chandlerc, mkuper Reviewed By: mkuper Subscribers: llvm-commits, mkuper Differential Revision: https://reviews.llvm.org/D29173 llvm-svn: 293615	2017-01-31 11:13:44 +00:00
Simon Dardis	12850eeac5	[mips] Addition of the immediate cases for the instructions [d]div, [d]divu Related to http://reviews.llvm.org/D15772 Depends on http://reviews.llvm.org/D16888 Adds support for immediate operand for [D]DIV[U] instructions. Patch By: Srdjan Obucina Reviewers: zoran.jovanovic, vkalintiris, dsanders, obucina Differential Revision: https://reviews.llvm.org/D16889 llvm-svn: 293614	2017-01-31 10:49:24 +00:00
Craig Topper	2cfa2071bd	[AVX-512] Don't bother looking into the AVX512DQ execution domain fixing tables if AVX512DQ isn't supported since we can't do any conversion anyway. llvm-svn: 293608	2017-01-31 06:49:55 +00:00
Craig Topper	797e32dd98	[X86] Add AVX and SSE2 version of MOVSDmr to execution domain fixing table. AVX-512 already did this for the EVEX version. llvm-svn: 293607	2017-01-31 06:49:53 +00:00
Craig Topper	779e4c5bb4	[AVX-512] Fix copy and paste bug in execution domain fixing tables so that we can convert 256-bit movnt instructions. llvm-svn: 293606	2017-01-31 06:49:50 +00:00
Justin Lebar	1c9692a46f	[NVPTX] Implement NVPTXTargetLowering::getSqrtEstimate. Summary: This lets us lower to sqrt.approx and rsqrt.approx under more circumstances. * Now we emit sqrt.approx and rsqrt.approx for calls to @llvm.sqrt.f32, when fast-math is enabled. Previously, we would only emit it for calls to @llvm.nvvm.sqrt.f. (With this patch we no longer emit sqrt.approx for calls to @llvm.nvvm.sqrt.f; we rely on instcombine to simplify llvm.nvvm.sqrt.f into llvm.sqrt.f32.) * Now we emit the ftz version of rsqrt.approx when ftz is enabled. Previously, we only emitted rsqrt.approx when ftz was disabled. Reviewers: hfinkel Subscribers: llvm-commits, tra, jholewinski Differential Revision: https://reviews.llvm.org/D28508 llvm-svn: 293605	2017-01-31 05:58:22 +00:00
Craig Topper	06e038c6de	[X86] Update the broadcast fallback patterns to use shuffle instructions from the appropriate execution domain. llvm-svn: 293603	2017-01-31 05:18:29 +00:00
Craig Topper	e9e84c8284	[AVX-512] Fix the ExeDomain for VMOVDDUP, VMOVSLDUP, and VMOVSHDUP. llvm-svn: 293601	2017-01-31 05:18:24 +00:00
Matt Arsenault	f84e5d9a27	AMDGPU: Generalize matching of v_med3_f32 I think this is safe as long as no inputs are known to ever be nans. Also add an intrinsic for fmed3 to be able to handle all safe math cases. llvm-svn: 293598	2017-01-31 03:07:46 +00:00
Matt Arsenault	973c4aebad	InferAddressSpaces: Rename constant llvm-svn: 293594	2017-01-31 02:17:41 +00:00
Matt Arsenault	72f259b8eb	InferAddressSpaces: Handle icmp llvm-svn: 293593	2017-01-31 02:17:32 +00:00
Craig Topper	d064cc93b2	[X86] Remove patterns for X86VPermilpi with integer types. I don't think we've formed these since the shuffle lowering rewrite. llvm-svn: 293592	2017-01-31 02:09:53 +00:00
Craig Topper	85935f69fb	[X86] Remove duplicate patterns for X86VPermilpv that already exist in the instructions themselves. llvm-svn: 293591	2017-01-31 02:09:51 +00:00
Craig Topper	ced68315ce	[X86] Remove patterns for selecting PSHUFD with FP types. We don't seem to do this anymore and the AVX case definitely should be using VPERMILPS anyway. llvm-svn: 293590	2017-01-31 02:09:49 +00:00
Craig Topper	b76494e017	[X86] Remove 'else' after 'return'. NFC llvm-svn: 293589	2017-01-31 02:09:46 +00:00
Craig Topper	f9d901f0ea	[X86] Use integer broadcast instructions for integer broadcast patterns. I'm not sure why we were using an FP instruction before and had to have a comment calling attention to it, but not justifying it. llvm-svn: 293588	2017-01-31 02:09:43 +00:00
Matt Arsenault	6d5a8d48fd	InferAddressSpaces: Support memory intrinsics llvm-svn: 293587	2017-01-31 01:56:57 +00:00
Matt Arsenault	6c907a9bb3	InferAddressSpaces: Support atomics llvm-svn: 293584	2017-01-31 01:40:38 +00:00
Matt Arsenault	d89a6e11a7	InferAddressSpaces: Don't replace volatile users llvm-svn: 293582	2017-01-31 01:30:16 +00:00
Matt Arsenault	b6491cc854	AMDGPU: Implement hook for InferAddressSpaces For now just port some of the existing NVPTX tests and from an old HSAIL optimization pass which approximately did the same thing. Don't enable the pass yet until more testing is done. llvm-svn: 293580	2017-01-31 01:20:54 +00:00
Matt Arsenault	850657a439	NVPTX: Move InferAddressSpaces to generic code llvm-svn: 293579	2017-01-31 01:10:58 +00:00
Eugene Zelenko	342257ea92	[ARM] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC). llvm-svn: 293578	2017-01-31 00:56:17 +00:00
Eli Friedman	10d1ff64fe	[SCEV] Simplify/generalize howFarToZero solving. Make SolveLinEquationWithOverflow take the start as a SCEV, so we can solve more cases. With that implemented, get rid of the special case for powers of two. The additional functionality probably isn't particularly useful, but it might help a little for certain cases involving pointer arithmetic. Differential Revision: https://reviews.llvm.org/D28884 llvm-svn: 293576	2017-01-31 00:42:42 +00:00
Keno Fischer	578cf7aae7	[ExecutionDepsFix] Improve clearance calculation for loops Summary: In revision rL278321, ExecutionDepsFix learned how to pick a better register for undef register reads, e.g. for instructions such as `vcvtsi2sdq`. While this revision improved performance on a good number of our benchmarks, it unfortunately also caused significant regressions (up to 3x) on others. This regression turned out to be caused by loops such as: PH -> A -> B (xmm<Undef> -> xmm<Def>) -> C -> D -> EXIT ^ | +----------------------------------+ In the previous version of the clearance calculation, we would visit the blocks in order, remembering for each whether there were any incoming backedges from blocks that we hadn't processed yet and if so queuing up the block to be re-processed. However, for loop structures such as the above, this is clearly insufficient, since the block B does not have any unknown backedges, so we do not see the false dependency from the previous iteration's Def of xmm registers in B. To fix this, we need to consider all blocks that are part of the loop and reprocess them once the correct clearance values are known. As an optimization, we also want to avoid reprocessing any later blocks that are not part of the loop. In summary, the iteration order is as follows: Before: PH A B C D A'; Corrected (Naive): PH A B C D A' B' C' D'; Corrected (w/ optimization): PH A B C A' B' C' D. To facilitate this optimization we introduce two new counters for each basic block. The first counts how many of its predecessors have completed primary processing. The second counts how many of its predecessors have completed all processing (we will call such a block done). Now, the criteria to reprocess a block are as follows: - All predecessors have completed primary processing - For x the number of predecessors that have completed primary processing at the time of primary processing of this block, the number of predecessors that are done has reached x.
The intuition behind this criterion is as follows: We need to perform primary processing on all predecessors in order to find out any direct defs in those predecessors. When predecessors are done, we also know that we have information about indirect defs (e.g. defs in block B that were inherited through B->C->A->B). However, we can't wait for all predecessors to be done, since that would cause cyclic dependencies. It is guaranteed, though, that all those predecessors that are prior to us in reverse postorder will be done before us. Since we iterate over the basic blocks in reverse postorder, the number x above is precisely the count of the number of predecessors prior to us in reverse postorder. Reviewers: myatsina Differential Revision: https://reviews.llvm.org/D28759 llvm-svn: 293571	2017-01-30 23:37:03 +00:00
Sanjay Patel	8c5f236197	[InstCombine] enable (X <<nsw C1) >>s C2 --> X <<nsw (C1 - C2) for vectors with splat constants llvm-svn: 293570	2017-01-30 23:35:52 +00:00
Derek Schuff	6d76b7b455	[WebAssembly] Add wasm support for llvm-readobj Create a WasmDumper subclass of ObjDumper to support WebAssembly binary files. Patch by Sam Clegg Differential Revision: https://reviews.llvm.org/D27355 llvm-svn: 293569	2017-01-30 23:30:52 +00:00
Matt Arsenault	9f432ec24c	NVPTX: Trivial cleanups of NVPTXInferAddressSpaces - Move DEBUG_TYPE below includes - Change unknown address space constant to be consistent with other passes - Grammar fixes in debug output llvm-svn: 293567	2017-01-30 23:27:11 +00:00
Eugene Zelenko	dde94e4c4f	[Mips] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC). llvm-svn: 293565	2017-01-30 23:21:32 +00:00
Benjamin Kramer	365c9bd941	[ICP] Fix bool conversion warning and actually write out the reason instead of dropping it. llvm-svn: 293564	2017-01-30 23:11:29 +00:00
Matt Arsenault	42b6478344	NVPTX: Refactor NVPTXInferAddressSpaces to check TTI Add a new TTI hook for getting the generic address space value. llvm-svn: 293563	2017-01-30 23:02:12 +00:00
Sanjay Patel	0c39d56a60	[InstCombine] enable more lshr(shl X, C1), C2 folds for vectors with splat constants llvm-svn: 293562	2017-01-30 23:01:05 +00:00
Simon Pilgrim	3905e03a47	[X86][SSE] Fix unsigned <= 0 warning in assert. NFCI. Thanks to @mkuper llvm-svn: 293561	2017-01-30 22:58:44 +00:00
Simon Pilgrim	a80a47afef	[X86][SSE] Generalize the number of decoded shuffle inputs. NFCI. combineX86ShufflesRecursively can still only handle a maximum of 2 shuffle inputs but everything before it now supports any number of shuffle inputs. This will be necessary for combining OR(SHUFFLE, SHUFFLE) patterns. llvm-svn: 293560	2017-01-30 22:48:49 +00:00

99511 Commits