llvm-project

Commit Graph

Author	SHA1	Message	Date
Rui Ueyama	dfc8aa8e1b	Simplify WinCOFFObjectWriter by removing a template member function. llvm-svn: 295126	2017-02-14 23:58:19 +00:00
Rui Ueyama	0fcdb48c6e	Do not lookup a DenseMap twice using the same key. llvm-svn: 295124	2017-02-14 23:47:34 +00:00
Rui Ueyama	86e3ef92f3	Use endian::write32le instead of endian::write. llvm-svn: 295120	2017-02-14 23:28:19 +00:00
Rui Ueyama	cbb4e7c1fb	Use zero-initialization instead of memset. llvm-svn: 295119	2017-02-14 23:28:01 +00:00
Kostya Serebryany	32c5004cf5	[libFuzzer] increase the size of FixedWord from 27 to 64, see PR31950 llvm-svn: 295117	2017-02-14 23:02:37 +00:00
Easwaran Raman	5a12f236c6	Fix a bug in caller's BFI update code after inlining. Multiple blocks in the callee can be mapped to a single cloned block since we prune the callee as we clone it. The existing code iterates over the value map and clones the block frequency (and eventually scales the frequencies of the cloned blocks). Value map's iteration is not deterministic and so the cloned block might get the frequency of any of the original blocks. The fix is to set the max of the original frequencies to the cloned block. The first block in the sequence must have this max frequency and, in the call context, subsequent blocks must have its frequency. Differential Revision: https://reviews.llvm.org/D29696 llvm-svn: 295115	2017-02-14 22:49:28 +00:00
Kostya Serebryany	ae579a79c0	Use "%zd" format specifier for printing number of testcases executed. Summary: This helps to avoid signed integer overflow after running a fast fuzz target for several hours, e.g.: <...> Done -1097903291 runs in 54001 second(s) Reviewers: kcc Reviewed By: kcc Differential Revision: https://reviews.llvm.org/D29941 llvm-svn: 295112	2017-02-14 22:14:36 +00:00
Michael Kuperstein	569162fefe	[LV] Rename Induction to PrimaryInduction. NFC. llvm-svn: 295111	2017-02-14 22:14:01 +00:00
Peter Collingbourne	534c0175b6	WholeProgramDevirt: Change internal vcall data structures to match summary. Group calls into constant and non-constant arguments up front, and use uint64_t instead of ConstantInt to represent constant arguments. The goal is to allow the information from the summary to fit naturally into this data structure in a future change (specifically, it will be added to CallSiteInfo). This has two side effects: - We disallow VCP for constant integer arguments of width >64 bits. - We remove the restriction that the bitwidth of a vcall's argument and return types must match those of the vfunc definitions. I don't expect either of these to matter in practice. The first case is uncommon, and the second one will lead to UB (so we can do anything we like). Differential Revision: https://reviews.llvm.org/D29744 llvm-svn: 295110	2017-02-14 22:12:23 +00:00
Simon Dardis	454f2e7840	[mips] Correct mips16 return instructions definitions Correct the definition of MIPS16 instructions that act as return instructions so that isReturn = 1 as expected. llvm-svn: 295109	2017-02-14 21:53:23 +00:00
Taewook Oh	2e945ebb13	[BasicBlockUtils] Use getFirstNonPHIOrDbg to set debugloc for instructions created in SplitBlockPredecessors Summary: When setting debugloc for instructions created in SplitBlockPredecessors, the current implementation copies debugloc from the first-non-phi instruction of the original basic block. However, if the first-non-phi instruction is a call to @llvm.dbg.value, the debugloc of the instruction may point to a location outside of the block itself. For the example code of ``` 1 typedef struct _node_t { 2 struct _node_t *next; 3 } node_t; 4 5 extern node_t *root; 6 7 int foo() { 8 node_t *node, *tmp; 9 int ret = 0; 10 11 node = tmp = root->next; 12 while (node != root) { 13 while (node) { 14 tmp = node; 15 node = node->next; 16 ret++; 17 } 18 } 19 20 return ret; 21 } ``` , below is the basicblock corresponding to line 12 after Reassociate expressions pass: ``` while.cond: ; preds = %while.cond2, %entry %node.0 = phi %struct._node_t* [ %1, %entry ], [ null, %while.cond2 ] %ret.0 = phi i32 [ 0, %entry ], [ %ret.1, %while.cond2 ] tail call void @llvm.dbg.value(metadata i32 %ret.0, i64 0, metadata !19, metadata !20), !dbg !21 tail call void @llvm.dbg.value(metadata %struct._node_t* %node.0, i64 0, metadata !11, metadata !20), !dbg !31 %cmp = icmp eq %struct._node_t* %node.0, %0, !dbg !33 br i1 %cmp, label %while.end5, label %while.cond2, !dbg !35 ``` As you can see, the first-non-phi instruction is a call to @llvm.dbg.value, and the debugloc is ``` !21 = !DILocation(line: 9, column: 7, scope: !6) ``` , which is the definition of the 'ret' variable and outside of the scope of the basicblock itself. However, the current implementation picks up this debugloc for the instructions created in SplitBlockPredecessors. This patch addresses this problem by picking up debugloc from the first-non-phi-non-dbg instruction. Reviewers: dblaikie, samsonov, eugenis Reviewed By: eugenis Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D29867 llvm-svn: 295106	2017-02-14 21:10:40 +00:00
Reid Kleckner	a622fc9bdf	[BranchFolding] Tail common all identical unreachable blocks Summary: Blocks ending in unreachable are typically cold because they end the program or throw an exception, so merging them with other identical blocks is usually profitable because it reduces the size of cold code. MachineBlockPlacement generally does not arrange to fall through to such blocks, so commoning these blocks will not introduce additional unconditional branches. Reviewers: hans, iteratee, haicheng Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D29153 llvm-svn: 295105	2017-02-14 21:02:24 +00:00
Tim Northover	398c5f57f9	GlobalISel: deal with new G_PTR_MASK instruction on AArch64. It's just an AND-immediate instruction for us, surprisingly simple to select. llvm-svn: 295104	2017-02-14 20:56:29 +00:00
Tim Northover	c2f8956313	GlobalISel: introduce G_PTR_MASK to simplify alloca handling. This instruction clears the low bits of a pointer without requiring (possibly dodgy if pointers aren't ints) conversions to and from an integer. Since (as far as I'm aware) all masks are statically known, the instruction takes an immediate operand rather than a register to specify the mask. llvm-svn: 295103	2017-02-14 20:56:18 +00:00
Vedant Kumar	55891fc71e	Re-apply "[profiling] Remove dead profile name vars after emitting name data" This reverts 295092 (re-applies 295084), with a fix for dangling references from the array of coverage names passed down from frontends. I missed this in my initial testing because I only checked test/Profile, and not test/CoverageMapping as well. Original commit message: The profile name variables passed to counter increment intrinsics are dead after we emit the finalized name data in __llvm_prf_nm. However, we neglect to erase these name variables. This causes huge size increases in the __TEXT,__const section as well as slowdowns when linker dead stripping is disabled. Some affected projects are so massive that they fail to link on Darwin, because only the small code model is supported. Fix the issue by throwing away the name constants as soon as we're done with them. Differential Revision: https://reviews.llvm.org/D29921 llvm-svn: 295099	2017-02-14 20:03:48 +00:00
Eric Christopher	14303d1815	Reformat slightly. llvm-svn: 295096	2017-02-14 19:43:50 +00:00
Wolfgang Pieb	399dcfaa2a	Reapply r294532, reverted in r294787. Store instructions can have more than one memory operand as a result of optimizations that fold different stores into one. When we identify spill instructions to generate DBG_VALUE instructions to record the spilling of a variable, we disregard stores with multiple memory operands for now. We may miss some relevant spills but the handling is a bit more complex, so we'll do it in a different patch. This fixes PR31935. llvm-svn: 295093	2017-02-14 19:08:45 +00:00
Vedant Kumar	27ebdf4bcb	Revert "[profiling] Remove dead profile name vars after emitting name data" This reverts commit r295084. There is a test failure on: http://lab.llvm.org:8011/builders/clang-atom-d525-fedora-rel/builds/2620/ llvm-svn: 295092	2017-02-14 19:08:39 +00:00
Zachary Turner	8bd42a1a98	[Support] Add StringRef::getAsDouble. Differential Revision: https://reviews.llvm.org/D29918 llvm-svn: 295089	2017-02-14 19:06:37 +00:00
Vedant Kumar	bb10484662	[profiling] Remove dead profile name vars after emitting name data The profile name variables passed to counter increment intrinsics are dead after we emit the finalized name data in __llvm_prf_nm. However, we neglect to erase these name variables. This causes huge size increases in the __TEXT,__const section as well as slowdowns when linker dead stripping is disabled. Some affected projects are so massive that they fail to link on Darwin, because only the small code model is supported. Fix the issue by throwing away the name constants as soon as we're done with them. Differential Revision: https://reviews.llvm.org/D29921 llvm-svn: 295084	2017-02-14 18:48:48 +00:00
Aditya Nandakumar	bb0483bc8e	[Tablegen] Instrumenting table gen DAGGenISelDAG To help assist in debugging ISEL or to prioritize GlobalISel backend work, this patch adds two more tables to <Target>GenISelDAGISel.inc - one which contains the patterns that are used during selection and the other containing the include source location of the patterns. Enabled through the CMake variable LLVM_ENABLE_DAGISEL_COV. llvm-svn: 295081	2017-02-14 18:32:41 +00:00
Krzysztof Parzyszek	d3b5641586	[Hexagon] Remove leftover debugging code llvm-svn: 295078	2017-02-14 17:37:44 +00:00
Taewook Oh	f22fa72e4a	Do not apply redundant LastCallToStaticBonus Summary: As written in the comments above, LastCallToStaticBonus is already applied to the cost if Caller has only one user, so it is redundant to reapply the bonus here. If the only user is not a caller, TotalSecondaryCost will not be adjusted anyway because callerWillBeRemoved is false. If there's no caller at all, we don't need to care about TotalSecondaryCost because inliningPreventsSomeOuterInline is false. Reviewers: chandlerc, eraman Reviewed By: eraman Subscribers: haicheng, davidxl, davide, llvm-commits, mehdi_amini Differential Revision: https://reviews.llvm.org/D29169 llvm-svn: 295075	2017-02-14 17:30:05 +00:00
Adam Nemet	4c98023724	[LazyBFI] Fix typos llvm-svn: 295073	2017-02-14 17:21:12 +00:00
Adam Nemet	bbb141c734	Add new pass LazyMachineBlockFrequencyInfo And use it in MachineOptimizationRemarkEmitter. A test will follow on top of Justin's changes to enable MachineORE in AsmPrinter. The approach is similar to the IR-level pass. It's a bit simpler because BPI is immutable at the Machine level so we don't need to make that lazy. Because of this, a new function mapping is introduced (BPIPassTrait::getBPI). This function extracts BPI from the pass. In case of the lazy pass, this is when the calculation of the BFI occurs. For Machine-level, this is the identity function. Differential Revision: https://reviews.llvm.org/D29836 llvm-svn: 295072	2017-02-14 17:21:09 +00:00
Sanjay Patel	a109dd1398	fix documentation comments for Argument; NFC llvm-svn: 295068	2017-02-14 16:43:49 +00:00
Brian Cain	6dedf65cc9	Correct a typo, s/hosting/hoisting/ llvm-svn: 295066	2017-02-14 16:41:10 +00:00
Diego Novillo	8adfc8ef3a	Remove unused variable. llvm-svn: 295065	2017-02-14 16:39:54 +00:00
Matthew Simpson	f09d13e5cc	Reapply "[LV] Extend trunc optimization to all IVs with constant integer steps" This reapplies commit r294967 with a fix for the execution time regressions caught by the clang-cmake-aarch64-quick bot. We now extend the truncate optimization to non-primary induction variables only if the truncate isn't already free. Differential Revision: https://reviews.llvm.org/D29847 llvm-svn: 295063	2017-02-14 16:28:32 +00:00
Simon Pilgrim	6f732e026d	[X86][SSE] Allow matchVectorShuffleWithUNPCK to recognise UNDEF inputs Add support for specifying an UNPCK input as UNDEF llvm-svn: 295061	2017-02-14 16:22:04 +00:00
Igor Laevsky	c11c1ed909	[SCEV] Cache results during GetMinTrailingZeros query Differential Revision: https://reviews.llvm.org/D29759 llvm-svn: 295060	2017-02-14 15:53:12 +00:00
Alexey Bataev	2a2f35d59c	[SLP] Fix for PR31879: vectorize repeated scalar ops that don't get put back into a vector Previously the cost of the existing ExtractElement/ExtractValue instructions was considered as a dead cost only if it was detected that they have only one use. But these instructions may be considered dead also if users of the instructions are also going to be vectorized, like: ``` %x0 = extractelement <2 x float> %x, i32 0 %x1 = extractelement <2 x float> %x, i32 1 %x0x0 = fmul float %x0, %x0 %x1x1 = fmul float %x1, %x1 %add = fadd float %x0x0, %x1x1 ``` This can be transformed to ``` %1 = fmul <2 x float> %x, %x %2 = extractelement <2 x float> %1, i32 0 %3 = extractelement <2 x float> %1, i32 1 %add = fadd float %2, %3 ``` because though `%x0` and `%x1` have 2 users each, these users are part of the vectorized tree and we can consider these `extractelement` instructions as dead. Differential Revision: https://reviews.llvm.org/D29900 llvm-svn: 295056	2017-02-14 15:20:48 +00:00
Artyom Skrobov	dc66a82dc7	Removing a redundant assignment llvm-svn: 295055	2017-02-14 14:44:01 +00:00
Alexander Timofeev	9f61feac4a	Revert "[AMDGPU] Fix for SIMachineScheduler crash. SI Scheduler should track" This reverts commit ce06d9cb99298eb844b66e117f5108a06747c907. llvm-svn: 295054	2017-02-14 14:29:05 +00:00
Simon Pilgrim	a0878dea9e	[X86][SSE] Move unary inputs handling inside matchVectorShuffleWithUNPCK. llvm-svn: 295053	2017-02-14 13:47:17 +00:00
Simon Pilgrim	3efdffcb27	[X86][SSE] Tidyup matchVectorShuffleWithUNPCK helper function call. Don't bother setting the V1/V2 operands again for unary shuffles. Don't bother legalizing the value type unless the match succeeds. llvm-svn: 295051	2017-02-14 12:54:39 +00:00
Karl-Johan Karlsson	ec21b769ec	Revert "[LoopVectorize] Added address space check when analysing interleaved accesses" This reverts r295038. The buildbot clang-with-thin-lto-ubuntu failed. I'm reverting to investigate. llvm-svn: 295042	2017-02-14 10:06:16 +00:00
Karl-Johan Karlsson	2ec409cca2	[LoopVectorize] Added address space check when analysing interleaved accesses Prevent memory objects of different address spaces from being part of the same load/store groups when analysing interleaved accesses. This fixes pr31900. Reviewers: HaoLiu, mssimpso, mkuper Reviewed By: mssimpso, mkuper Subscribers: llvm-commits, efriedma, mzolotukhin Differential Revision: https://reviews.llvm.org/D29717 llvm-svn: 295038	2017-02-14 08:14:06 +00:00
Karl-Johan Karlsson	38cbf5869d	Test commit permission Removing whitespace. llvm-svn: 295037	2017-02-14 07:31:36 +00:00
Craig Topper	d2d50cba2a	[AVX-512] Add PAVGB/PAVGW to load folding tables. llvm-svn: 295035	2017-02-14 06:54:57 +00:00
Mikael Holmen	ece84cd10c	[LSR] Pointers with different address spaces are considered incompatible. Summary: Function isCompatibleIVType is already used as a guard before the call to SE.getMinusSCEV(OperExpr, PrevExpr); in LSRInstance::ChainInstruction. getMinusSCEV requires the expressions to be of the same type, so we now consider two pointers with different address spaces to be incompatible, since it is possible that the pointers in fact have different sizes. Reviewers: qcolombet, eli.friedman Reviewed By: qcolombet Subscribers: nhaehnle, Ka-Ka, llvm-commits, mzolotukhin Differential Revision: https://reviews.llvm.org/D29885 llvm-svn: 295033	2017-02-14 06:37:42 +00:00
Alex Bradbury	e4f731b813	[RISCV] Fix RV32 datalayout string and ensure initAsmInfo is called llvm-svn: 295028	2017-02-14 05:20:20 +00:00
Alex Bradbury	6be16fbfb8	[RISCV] Pseudo instructions are isCodeGenOnly, have blank asmstr llvm-svn: 295027	2017-02-14 05:17:23 +00:00
Alex Bradbury	d36e04cb6c	[RISCV] Fix unused variable in RISCVMCTargetDesc. NFC Also, for better uniformity use TargetRegistry::RegisterMCAsmInfo rather than RegisterMCAsmInfoFn. Again, no functional change. llvm-svn: 295026	2017-02-14 05:15:24 +00:00
Peter Collingbourne	002c2d5380	ThinLTOBitcodeWriter: Write available_externally copies of VCP eligible functions to merged module. Differential Revision: https://reviews.llvm.org/D29701 llvm-svn: 295021	2017-02-14 03:42:38 +00:00
Mehdi Amini	a0ddb1ed46	[ThinLTO] Make a copy of buffer identifier in ThinLTOCodeGenerator We can't assume that the `const char *` provided through libLTO has a lifetime that extends beyond the codegenerator itself. llvm-svn: 295018	2017-02-14 02:20:51 +00:00
Philip Reames	b2bca7e309	[LICM] Make store promotion work in the face of unordered atomics Extend our store promotion code to deal with unordered atomic accesses. Ordered atomics continue to be unhandled. Most of the change is straight-forward, the only complicated bit is in the reasoning around mixing of atomic and non-atomic memory access. Rather than trying to reason about the complex semantics in these cases, I simply disallowed promotion when both atomic and non-atomic accesses are present. This is conservatively correct. It seems really tempting to just promote all access to atomics, but the original accesses might have been conditional. Since we can't lower an arbitrary atomic type, it might not be safe to promote all access to atomic. Consider a loop like the following: while(b) { load i128 ... if (can lower i128 atomic) store atomic i128 ... else store i128 } It could be there's no race on the location and thus the code is perfectly well defined even if we can't lower an i128 atomically. It's not clear we need to be this conservative - arguably the program above is broken since it can't be lowered unless the branch is folded - but I didn't want to have to fix any fallout which might result. Differential Revision: https://reviews.llvm.org/D15592 llvm-svn: 295015	2017-02-14 01:38:31 +00:00
Eugene Zelenko	d96089b248	[MC] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC). Same changes in files affected by reduced MC headers dependencies. llvm-svn: 295009	2017-02-14 00:33:36 +00:00
Peter Collingbourne	c45f7f3eb4	FunctionAttrs: Factor out a function for querying memory access of a specific copy of a function. NFC. This will later be used by ThinLTOBitcodeWriter to add copies of readnone functions to the regular LTO module. Differential Revision: https://reviews.llvm.org/D29695 llvm-svn: 295008	2017-02-14 00:28:13 +00:00
Andrew Kaylor	709f1c2a9b	[X86] Add MXCSR register This adds MXCSR to the set of recognized registers for X86 targets and updates the instructions that read or write it. I do not intend for all of the various floating point instructions that implicitly use the control bits or update the status bits of this register to ever have that usage modeled by default. However, when constrained floating point modes (such as strict FP exception status modeling or dynamic rounding modes) are enabled, implicit use/def information for MXCSR will be added to those instructions. Until those additional updates are made this should cause (almost?) no functional changes. Theoretically, this will prevent instructions like LDMXCSR and STMXCSR from being moved past one another, but that should be prevented anyway and I haven't found a case where it is happening now. Differential Revision: https://reviews.llvm.org/D29903 llvm-svn: 295004	2017-02-13 23:38:52 +00:00
Sanjay Patel	4f74216da0	[FunctionAttrs] try to extend nonnull-ness of arguments from a callsite back to its parent function As discussed here: http://lists.llvm.org/pipermail/llvm-dev/2016-December/108182.html ...we should be able to propagate 'nonnull' info from a callsite back to its parent. The original motivation for this patch is our botched optimization of "dyn_cast" (PR28430), but this won't solve that problem. The transform is currently disabled by default while we wait for clang to work-around potential security problems: http://lists.llvm.org/pipermail/cfe-dev/2017-January/052066.html Differential Revision: https://reviews.llvm.org/D27855 llvm-svn: 294998	2017-02-13 23:10:51 +00:00
Tim Northover	48dfa1a6ed	GlobalISel: represent atomic loads & stores via the MachineMemOperand. Also make sure the AArch64 backend doesn't try to convert them into normal loads and stores. llvm-svn: 294993	2017-02-13 22:14:16 +00:00
Tim Northover	b73e309071	MIR: parse & print the atomic parts of a MachineMemOperand. We're going to need them very soon for GlobalISel. llvm-svn: 294992	2017-02-13 22:14:08 +00:00
Taewook Oh	4d35f9e10e	Address post-commit comments for https://reviews.llvm.org/D29596 . NFCI. llvm-svn: 294985	2017-02-13 21:12:27 +00:00
Arnold Schwaighofer	8f3df731dc	swiftcc: Don't emit tail calls from callers with swifterror parameters Backends don't support this yet. They would have to move to the swifterror register before the tail call to make sure it is live-in to the call. rdar://30495920 llvm-svn: 294982	2017-02-13 19:58:28 +00:00
Peter Collingbourne	2b33f65317	IR: Type ID summary extensions for WPD; thread summary into WPD pass. Make the whole thing testable by adding YAML I/O support for the WPD summary information and adding some negative tests that exercise the YAML support. Differential Revision: https://reviews.llvm.org/D29782 llvm-svn: 294981	2017-02-13 19:26:18 +00:00
Taewook Oh	06a2128cfa	Make MachineBasicBlock::updateTerminator to update DebugLoc as well Summary: Currently MachineBasicBlock::updateTerminator simply drops DebugLoc for newly created branch instructions, which may cause incorrect stepping and/or imprecise sample profile data. Below is an example: ``` 1 extern int bar(int x); 2 3 int foo(int *begin, int *end) { 4 int *i; 5 int ret = 0; 6 for ( 7 i = begin ; 8 i != end ; 9 i++) 10 { 11 ret += bar(*i); 12 } 13 return ret; 14 } ``` Below is a bitcode of 'foo' at the end of LLVM-IR level optimizations with -O3: ``` define i32 @foo(i32* readonly %begin, i32* readnone %end) !dbg !4 { entry: %cmp6 = icmp eq i32* %begin, %end, !dbg !9 br i1 %cmp6, label %for.end, label %for.body.preheader, !dbg !12 for.body.preheader: ; preds = %entry br label %for.body, !dbg !13 for.body: ; preds = %for.body.preheader, %for.body %ret.08 = phi i32 [ %add, %for.body ], [ 0, %for.body.preheader ] %i.07 = phi i32* [ %incdec.ptr, %for.body ], [ %begin, %for.body.preheader ] %0 = load i32, i32* %i.07, align 4, !dbg !13, !tbaa !15 %call = tail call i32 @bar(i32 %0), !dbg !19 %add = add nsw i32 %call, %ret.08, !dbg !20 %incdec.ptr = getelementptr inbounds i32, i32* %i.07, i64 1, !dbg !21 %cmp = icmp eq i32* %incdec.ptr, %end, !dbg !9 br i1 %cmp, label %for.end.loopexit, label %for.body, !dbg !12, !llvm.loop !22 for.end.loopexit: ; preds = %for.body br label %for.end, !dbg !24 for.end: ; preds = %for.end.loopexit, %entry %ret.0.lcssa = phi i32 [ 0, %entry ], [ %add, %for.end.loopexit ] ret i32 %ret.0.lcssa, !dbg !24 } ``` where ``` !12 = !DILocation(line: 6, column: 3, scope: !11) ``` . As you can see, the terminator of 'entry' block, which is a loop control branch, has a DebugLoc of line 6, column 3. However, after the execution of the 'MachineBasicBlock::updateTerminator' function, which is triggered by the MachineSinking pass, the DebugLoc info is dropped as below (see there's no debug-location for JNE_1): ``` bb.0.entry: successors: %bb.4(0x30000000), %bb.1.for.body.preheader(0x50000000) liveins: %rdi, %rsi %6 = COPY %rsi %5 = COPY %rdi %8 = SUB64rr %5, %6, implicit-def %eflags, debug-location !9 JNE_1 %bb.1.for.body.preheader, implicit %eflags ``` This patch addresses this issue and makes newly created branch instructions keep debug-location info. Reviewers: aprantl, MatzeB, craig.topper, qcolombet Reviewed By: qcolombet Subscribers: qcolombet, llvm-commits Differential Revision: https://reviews.llvm.org/D29596 llvm-svn: 294976	2017-02-13 18:15:31 +00:00
Matthew Simpson	659f92e2aa	Revert "[LV] Extend trunc optimization to all IVs with constant integer steps" This reverts commit r294967. This patch caused execution time slowdowns in a few LLVM test-suite tests, as reported by the clang-cmake-aarch64-quick bot. I'm reverting to investigate. llvm-svn: 294973	2017-02-13 18:02:35 +00:00
Quentin Colombet	fbae5fcb96	[FastISel] Add a diagnostic to warn on fallback. This is consistent with what we do for GlobalISel. That way, it is easy to see whether or not FastISel is able to fully select a function. At some point we may want to switch that to an optimization remark. llvm-svn: 294970	2017-02-13 17:38:59 +00:00
James Molloy	0ae2202235	[ARM] Fix crash caused by r294945 I'd missed a creator of FCMP nodes - duplicateCmp(). Kindly and promptly reported by Gabor Ballabas, due to his CSiBE test suite. llvm-svn: 294968	2017-02-13 17:18:00 +00:00
Matthew Simpson	7b7f40297f	[LV] Extend trunc optimization to all IVs with constant integer steps This patch extends the optimization of truncations whose operand is an induction variable with a constant integer step. Previously we were only applying this optimization to the primary induction variable. However, the cost model assumes the optimization is applied to the truncation of all integer induction variables (even regardless of step type). The transformation is now applied to the other induction variables, and I've updated the cost model to ensure it is better in sync with the transformation we actually perform. Differential Revision: https://reviews.llvm.org/D29847 llvm-svn: 294967	2017-02-13 16:48:00 +00:00
Simon Dardis	509da1a46d	[mips] divide macro instruction cleanup. Clean up the implementation of divide macro expansion by getting rid of a FIXME regarding magic numbers and branch instructions. Match GAS' behaviour for expansion of ddiv / div in the two and three operand cases. Add the two operand alias for MIPSR6. Finally, optimize macro expansion cases where the divisor is the $zero register. Reviewers: slthakur Differential Revision: https://reviews.llvm.org/D29887 llvm-svn: 294960	2017-02-13 16:06:48 +00:00
Simon Pilgrim	fd6a84fbaa	Fix indentation. NFCI. llvm-svn: 294959	2017-02-13 15:31:08 +00:00
Davide Italiano	513dfaa0a3	[PM] Hook up the instrumented PGO machinery in the new PM. Differential Revision: https://reviews.llvm.org/D29308 llvm-svn: 294955	2017-02-13 15:26:22 +00:00
Davide Italiano	20a895c4be	[LTO] Make sure we flush buffers to work around linker shenanigans. lld, at least, doesn't call global destructors by default (unless --full-shutdown is passed) because it's, allegedly, expensive. llvm-svn: 294953	2017-02-13 14:39:51 +00:00
Sanne Wouda	490d4a6da6	[CodeGen] fix alignment of JUMPTABLE_INSTS on v8M.base Summary: The attached test case fails with "fatal error: error in backend: misaligned pc-relative fixup value" as the jump table is misaligned. The EmitAlignment existed already for ARM and Thumb-1 code, but was missing for Thumb-2. The test checks that the fatal error disappears when generating an obj file, as well as checking the align directive is there when producing an asm file. Reviewers: rengolin, grosbach, t.p.northover, jmolloy, SjoerdMeijer, samparker Reviewed By: samparker Subscribers: samparker, aemerson, llvm-commits Differential Revision: https://reviews.llvm.org/D29650 llvm-svn: 294950	2017-02-13 14:07:45 +00:00
James Molloy	92497542e7	[Thumb-1] TBB generation: spot redefinitions of index register We match a sequence of 3-4 instructions into a tTBB pseudo. One of our checks is that a particular register in that sequence is killed (so it can be clobbered by the pseudo). We weren't noticing if an errant MOV or other instruction had infiltrated the sequence we were walking. If it had, and it defined the register we've already identified as killed, it makes it live across the tBR_JT and thus unclobberable. Notice this case and bail out. llvm-svn: 294949	2017-02-13 14:07:39 +00:00
James Molloy	9b3b899669	[ARM] Register ConstantIslands with the pass manager This allows us to use -stop-before/-stop-after/-run-pass - we can now write .mir tests. llvm-svn: 294948	2017-02-13 14:07:25 +00:00
Sanne Wouda	91eadad3bd	[Assembler] Improve diagnostics for inline assembly. Summary: Keep a vector of LocInfos around; one for each call to EmitInlineAsm. Since each call to EmitInlineAsm creates a new buffer in the inline asm SourceMgr, we can use the buffer number to map to the right LocInfo. Reviewers: rengolin, grosbach, rnk, echristo Reviewed By: rnk Subscribers: mehdi_amini, llvm-commits Differential Revision: https://reviews.llvm.org/D29769 llvm-svn: 294947	2017-02-13 13:58:00 +00:00
James Molloy	d508789668	[ARM] Use VCMP, not VCMPE, for floating point equality comparisons When generating a floating point comparison we currently unconditionally generate VCMPE. This has the side effect of setting the cumulative Invalid bit in FPSCR if any of the operands are QNaN. It is expected that use of a relational predicate on a QNaN value should raise Invalid. Quoting from the C standard: The relational and equality operators support the usual mathematical relationships between numeric values. For any ordered pair of numeric values exactly one of the relationships (less, greater, and equal) is true. Relational operators may raise the "invalid" floating-point exception when argument values are NaNs. The standard doesn't explicitly state the expectation for equality operators, but the implication and obvious expectation is that equality operators should not raise Invalid on a QNaN input, as those predicates are wholly defined on unordered inputs (to return not equal). Therefore, add a new operand to ARMISD::FPCMP and FPCMPZ indicating if QNaN should raise Invalid, and pipe that through to TableGen. llvm-svn: 294945	2017-02-13 12:32:47 +00:00
Simon Pilgrim	828dee1f70	[X86][SSE] Create matchVectorShuffleWithUNPCK helper function. Currently only used by target shuffle combining - will use it for lowering as well in a future patch. llvm-svn: 294943	2017-02-13 11:52:58 +00:00
Ayman Musa	f77219e035	[X86][AVX512] Fix operand classes for some AVX512 instructions to keep consistency between VEX/EVEX versions of the same instruction. Differential Revision: https://reviews.llvm.org/D29873 llvm-svn: 294937	2017-02-13 09:55:48 +00:00
Andrew V. Tischenko	8da96914f9	Compile time decreasing in the case we're dealing with Machine Combiner. Before this patch compile time was about 21s (see below). After this patch we have less than 2s (see below). Intel(R) Xeon(R) CPU E5-2676 v3 @ 2.40GHz DAGCombiner - trunk time ./llc spill_fdiv.ll -o /dev/null -enable-unsafe-fp-math real 0m1.685s DAGCombiner + Speed patch time ./llc spill_fdiv.ll -o /dev/null -enable-unsafe-fp-math real 0m1.655s MachineCombiner w/o Speed patch time ./llc spill_fdiv.ll -o /dev/null -enable-unsafe-fp-math real 0m21.614s MachineCombiner + Speed patch time ./llc spill_fdiv.ll -o /dev/null -enable-unsafe-fp-math real 0m1.593s The test spill_fdiv.ll is attached to D29627. D29627 should be closed. llvm-svn: 294936	2017-02-13 09:43:37 +00:00
Alexey Bataev	e8b1536e21	[SLP] Fix for PR31690: Allow using of extra values in horizontal reductions. Currently, LLVM supports vectorization of horizontal reduction instructions with initial value set to 0. Patch supports vectorization of reduction with non-zero initial values. Also, it supports a vectorization of instructions with some extra arguments, like: ``` float f(float x[], int a, int b) { float p = a % b; p += x[0] + 3; for (int i = 1; i < 32; i++) p += x[i]; return p; } ``` Patch allows vectorization of this kind of horizontal reductions. Differential Revision: https://reviews.llvm.org/D29727 llvm-svn: 294934	2017-02-13 08:01:26 +00:00
Craig Topper	3668bde371	[DAGCombiner] Teach DAG combine that inserting an extract_subvector result into the same location of an undef vector can just use the original input to the extract. llvm-svn: 294932	2017-02-13 04:53:33 +00:00
Craig Topper	680c73e7ab	[X86] Genericize the handling of INSERT_SUBVECTOR from an EXTRACT_SUBVECTOR to support 512-bit vectors with 128-bit or 256-bit subvectors. We now detect that both the extract and insert indices are non-zero and convert to a shuffle. This will be lowered as a blend for 256-bit vectors or as a vshuf operation for 512-bit vectors. llvm-svn: 294931	2017-02-13 04:53:29 +00:00
Craig Topper	aa46204ed9	[DAGCombiner] Remove the half vector width check for the combine of EXTRACT_SUBVECTOR from an INSERT_SUBVECTOR. This gives more parallelism opportunities for AVX-512 when dealing with 128-bit extracts from 512-bit vectors. llvm-svn: 294930	2017-02-12 23:49:49 +00:00
Craig Topper	53eafa8ea4	[X86] Don't let LowerEXTRACT_SUBVECTOR call getNode for EXTRACT_SUBVECTOR. This results in the simplifications inside of getNode running while we're legalizing nodes popped off the worklist during the final DAG combine. This basically makes a DAG combine like operation occur during this legalize step, but we don't handle something quite the same way. I think we don't recursively add the removed nodes to the DAG combiner worklist. llvm-svn: 294929	2017-02-12 23:49:46 +00:00
Daniel Berlin	4d54796f87	NewGVN: Reverse order of congruence class elimination to maximize trivial deadness llvm-svn: 294926	2017-02-12 23:24:45 +00:00
Daniel Berlin	508a1dec94	NewGVN: Use shouldSwapOperands in one more place llvm-svn: 294925	2017-02-12 23:24:42 +00:00
Sanjay Patel	0557a44287	[TargetLowering] fix SETCC SETLT folding with FP types The bug was introduced with: https://reviews.llvm.org/rL294863 ...and manifests as a selection failure in x86, but that's actually another bug. This fix prevents wrong codegen with -0.0, but in the more common case when we have NSZ and NNAN (-ffast-math), we should still be able to fold this setcc/compare. llvm-svn: 294924	2017-02-12 23:07:52 +00:00
Daniel Berlin	31e1b8fe48	Revert accidental commit titled "testing" This reverts commit r294919 llvm-svn: 294923	2017-02-12 22:40:10 +00:00
Daniel Berlin	86eab15f2b	NewGVN: Apply the fast math flags fix in r267113 to NewGVN as well. llvm-svn: 294922	2017-02-12 22:25:20 +00:00
Daniel Berlin	dbe8264c93	PredicateInfo: Handle critical edges Summary: This adds support for placing predicateinfo such that it affects critical edges. This fixes the issues mentioned by Nuno on the mailing list. Depends on D29519 Reviewers: davide, nlopes Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D29606 llvm-svn: 294921	2017-02-12 22:12:20 +00:00
Daniel Berlin	eccb8740d1	NewGVN: Fix missed call that should be to shouldSwapOperands llvm-svn: 294920	2017-02-12 22:02:47 +00:00
Daniel Berlin	3fecad0d3e	testing llvm-svn: 294919	2017-02-12 22:02:20 +00:00
Simon Pilgrim	cc9242bd1c	[X86] Fix typo in function name. NFCI. convertBitVectorToUnsiged - convertBitVectorToUnsigned llvm-svn: 294914	2017-02-12 20:53:44 +00:00
Craig Topper	cfe8ce3a58	[AVX-512] Add various EVEX move instructions to load folding tables using the VEX equivalents as a guide. llvm-svn: 294908	2017-02-12 18:47:46 +00:00
Craig Topper	5971b5488e	[AVX-512] Add VMOV64toSDZrm CodeGenOnly instruction based on the same instruction from AVX/SSE. I can't prove that we can select this instruction or the AVX/SSE version, but I'm adding it for consistency for now so I can continue matching the load folding tables. llvm-svn: 294907	2017-02-12 18:47:44 +00:00
Craig Topper	ec26801483	[X86] Fix a couple instruction names to use 'mr' instead of 'rm' to indicate they are stores. AVX-512 version was already named with 'mr'. llvm-svn: 294906	2017-02-12 18:47:40 +00:00
Craig Topper	6eca3170a8	[AVX-512] Add VPEXTRD/Q to load folding tables. llvm-svn: 294905	2017-02-12 18:47:37 +00:00
Simon Pilgrim	04ec0f2b2a	[X86][SSE] Update argument names to match function name. NFCI. The target shuffle match function arguments were using the term 'Ops' but the function names referred to them as 'Inputs' - use 'Inputs' consistently. llvm-svn: 294900	2017-02-12 16:46:41 +00:00
Sanjay Patel	45b7e69fef	[InstCombine] fold icmp sgt/slt (add nsw X, C2), C --> icmp sgt/slt X, (C - C2) I found one special case of this transform for 'slt 0', so I removed that and added the general transform. Alive code to check correctness: Name: slt_no_overflow Pre: WillNotOverflowSignedSub(C1, C2) %a = add nsw i8 %x, C2 %b = icmp slt %a, C1 => %b = icmp slt %x, C1 - C2 Name: sgt_no_overflow Pre: WillNotOverflowSignedSub(C1, C2) %a = add nsw i8 %x, C2 %b = icmp sgt %a, C1 => %b = icmp sgt %x, C1 - C2 http://rise4fun.com/Alive/MH Differential Revision: https://reviews.llvm.org/D29774 llvm-svn: 294898	2017-02-12 16:40:30 +00:00
Sanjay Patel	97e4b98749	[ValueTracking] use nonnull argument attribute to eliminate null checks Enhancing value tracking's analysis of null-ness was suggested in D27855, so here's a first attempt at that. This is part of solving: https://llvm.org/bugs/show_bug.cgi?id=28430 Differential Revision: https://reviews.llvm.org/D28204 llvm-svn: 294897	2017-02-12 15:35:34 +00:00
Simon Pilgrim	4cd841757a	[X86][AVX2] Add support for combining target shuffles to VPMOVZX Initial 256-bit vector support - 512-bit support requires extra checks for AVX512BW support (PMOVZXBW) that will be handled in a future patch. llvm-svn: 294896	2017-02-12 14:31:23 +00:00
NAKAMURA Takumi	022c6e4f33	AMDGPU::expandMemIntrinsicUses(): Fix an uninitialized variable. This function returned true or undef. llvm-svn: 294895	2017-02-12 13:15:31 +00:00
Dorit Nuzman	eac89d736c	[LV/LoopAccess] Check statically if an unknown dependence distance can be proven larger than the loop-count This fixes PR31098: Try to resolve statically data-dependences whose compile-time-unknown distance can be proven larger than the loop-count, instead of resorting to runtime dependence checking (which are not always possible). For vectorization it is sufficient to prove that the dependence distance is >= VF; But in some cases we can prune unknown dependence distances early, and even before selecting the VF, and without a runtime test, by comparing the distance against the loop iteration count. Since the vectorized code will be executed only if LoopCount >= VF, proving distance >= LoopCount also guarantees that distance >= VF. This check is also equivalent to the Strong SIV Test. Reviewers: mkuper, anemet, sanjoy Differential Revision: https://reviews.llvm.org/D28044 llvm-svn: 294892	2017-02-12 09:32:53 +00:00
Elena Demikhovsky	5d91ab46c0	AVX-512: Fixed DWARF register numbers for XMM16-31 The reference is here: https://software.intel.com/sites/default/files/article/402129/mpx-linux64-abi.pdf llvm-svn: 294890	2017-02-12 07:56:50 +00:00
Chandler Carruth	719ffe1a66	[PM] Add devirtualization-based iteration utility into the new PM's default pipeline. A clang with this patch built with ASan and asserts can build all of the test-suite as well, so it seems to not uncover any latent problems. Differential Revision: https://reviews.llvm.org/D29853 llvm-svn: 294888	2017-02-12 05:38:04 +00:00
Chandler Carruth	e87fc8cb71	[PM] Enable GlobalsAA in the new PM's pipeline by default. All the invalidation issues and bugs in this seem to be fixed, it has survived a full build of the test suite plus SPEC with asserts and ASan enabled on the Clang binary used. Differential Revision: https://reviews.llvm.org/D29815 llvm-svn: 294887	2017-02-12 05:34:04 +00:00
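
The InstCombine fold in commit 45b7e69fef, `icmp sgt/slt (add nsw X, C2), C --> icmp sgt/slt X, (C - C2)`, can be checked by brute force at a small bit width. The sketch below is only an illustration and is not LLVM code: the helper names `wrap`, `in_range`, and `check_fold` are hypothetical, and the precondition is modeled as "C - C2 does not overflow in signed arithmetic", following the `WillNotOverflowSignedSub(C1, C2)` line in the commit's Alive snippet.

```python
def wrap(value, bits):
    """Wrap an integer to a signed two's-complement value of the given width."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def in_range(value, bits):
    """True if value is representable as a signed integer of the given width."""
    return -(1 << (bits - 1)) <= value < (1 << (bits - 1))


def check_fold(bits, require_precondition=True):
    """Return (x, c2, c) triples where the folded compare disagrees with the original.

    Hypothetical checker: models 'add nsw' overflow as poison by skipping
    those inputs, since any result is acceptable there.
    """
    lo, hi = -(1 << (bits - 1)), 1 << (bits - 1)
    mismatches = []
    for c2 in range(lo, hi):
        for c in range(lo, hi):
            if require_precondition and not in_range(c - c2, bits):
                continue  # fold is not applied when C - C2 overflows
            folded_c = wrap(c - c2, bits)
            for x in range(lo, hi):
                if not in_range(x + c2, bits):
                    continue  # add nsw overflowed: poison, nothing to compare
                a = x + c2
                if (a < c) != (x < folded_c) or (a > c) != (x > folded_c):
                    mismatches.append((x, c2, c))
    return mismatches


if __name__ == "__main__":
    print("mismatches with precondition:", len(check_fold(5)))
    print("mismatches without precondition:", len(check_fold(5, False)))
```

With the precondition enforced, the checker finds no mismatches at 5 bits; dropping it produces counterexamples such as `x = 0, C2 = 1, C = -16`, where `C - C2` wraps around to 15.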


99723 Commits