llvm-project

Commit Graph

Author	SHA1	Message	Date
Hans Wennborg	07f6d3a893	Switch lowering: don't add incoming PHI values from skipped bit test MBB's (PR27135) After r245976, LLVM will skip the last bit test case if it knows it will always be true. However, we would still erroneously update PHI nodes with incoming values from the MBB that would perform the final bit test, causing -verify-machineinstrs to fail. llvm-svn: 266479	2016-04-15 21:45:30 +00:00
Hans Wennborg	c944c13dc1	SelectionDAGISel: rangeify a loop llvm-svn: 266478	2016-04-15 21:45:09 +00:00
Easwaran Raman	f53baca686	Replace the use of MaxFunctionCount module flag Adds an interface to get ProfileSummary for a module and makes InlineCost use ProfileSummary to get max function count. Differential Revision: http://reviews.llvm.org/D18622 llvm-svn: 266477	2016-04-15 21:39:58 +00:00
Vasileios Kalintiris	5a971a48c3	[mips] More range-based for loops. NFC. There are still a couple more inside the MIPS target. I opted for a single commit in order to avoid spamming the list. llvm-svn: 266472	2016-04-15 20:43:17 +00:00
Vasileios Kalintiris	36311395ae	[mips] Use range-based for loops and slightly simplify the code. NFC. llvm-svn: 266471	2016-04-15 20:18:48 +00:00
Ulrich Weigand	6e648ea533	[SystemZ] Call tryAddingSymbolicOperand in the disassembler Use the tryAddingSymbolicOperand callback to attempt to present immediate values in symbolic form when disassembling. This is currently only used for PC-relative immediates (which are most likely to be symbolic in the SystemZ ISA). Add new DecodeMethod types to allow distinguishing between branch and non-branch instructions. llvm-svn: 266469	2016-04-15 19:55:58 +00:00
Tim Northover	903f81ba18	ARM: don't try to hoist constant RHS out of a division. Divisions by a constant can be converted into multiplies which are usually cheaper, but this isn't possible if the constant gets separated (particularly in loops). Fix this by telling ConstantHoisting that the immediate in a DIV is cheap. I considered making the check generic, but neither AArch64 (strangely) nor x86 showed any benefit on the tests I had. llvm-svn: 266464	2016-04-15 18:17:18 +00:00
Chad Rosier	1fbe9bcab4	[AArch64] Add load/store pair instructions to getMemOpBaseRegImmOfsWidth(). This improves AA in the MI scheduler when reasoning about paired instructions. Phabricator Revision: http://reviews.llvm.org/D17098 PR26358 llvm-svn: 266462	2016-04-15 18:09:10 +00:00
Igor Kudrin	e880a06559	Revert "[Coverage] Prevent detection of false instantiations in case of macro expansion." This reverts commit r266436 as it broke buildbot. llvm-svn: 266458	2016-04-15 17:53:48 +00:00
Davide Italiano	7950b12957	[ParallelCG] Add a new splitCodeGen() API which takes a TargetMachineFactory. This is a recommit of r266390 with a fix that will allow tests to pass (hopefully). Before we got a StringRef to M->getTargetTriple() and right after we moved the Module so we were referencing a dangling object. llvm-svn: 266456	2016-04-15 17:34:32 +00:00
David Majnemer	2e02ba78d5	[InstCombine] Don't transform compares of calls to functions named fabs{f,l,} InstCombine wants to optimize compares of calls to fabs with zero. However, we didn't have the necessary legality checking to verify that the function call had the same behavior as fabs. llvm-svn: 266452	2016-04-15 17:21:03 +00:00
Adrian Prantl	75819aedf6	[PR27284] Reverse the ownership between DICompileUnit and DISubprogram. Currently each Function points to a DISubprogram and DISubprogram has a scope field. For member functions the scope is a DICompositeType. DIScopes point to the DICompileUnit to facilitate type uniquing. Distinct DISubprograms (with isDefinition: true) are not part of the type hierarchy and cannot be uniqued. This change removes the subprograms list from DICompileUnit and instead adds a pointer to the owning compile unit to distinct DISubprograms. This would make it easy for ThinLTO to strip unneeded DISubprograms and their transitively referenced debug info. Motivation ---------- Materializing DISubprograms is currently the most expensive operation when doing a ThinLTO build of clang. We want the DISubprogram to be stored in a separate Bitcode block (or the same block as the function body) so we can avoid having to expensively deserialize all DISubprograms together with the global metadata. If a function has been inlined into another subprogram we need to store a reference to the block containing the inlined subprogram. Attached to https://llvm.org/bugs/show_bug.cgi?id=27284 is a python script that updates LLVM IR testcases to the new format. http://reviews.llvm.org/D19034 <rdar://problem/25256815> llvm-svn: 266446	2016-04-15 15:57:41 +00:00
Sanjay Patel	f11ab05bdb	[SimplifyCFG] propagate branch metadata when creating select (PR27344) This is almost identical to: http://reviews.llvm.org/rL264527 This doesn't solve PR27344; it just allows the profile weights to survive. To solve the bug, we need to use the profile weights in the backend. llvm-svn: 266442	2016-04-15 15:32:12 +00:00
Geoff Berry	c376406669	[AArch64] Add MMOs to callee-save load/store instructions. Summary: Without MMOs, the callee-save load/store instructions were treated as volatile by the MI post-RA scheduler and AArch64LoadStoreOptimizer. Reviewers: t.p.northover, mcrosier Subscribers: aemerson, rengolin, mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D17661 llvm-svn: 266439	2016-04-15 15:16:19 +00:00
Nirav Dave	1f51c334ca	Fix typing on generated LXV2DX/STXV2DX instructions [PPC] Previously when casting generic loads to LXV2DX/ST instructions we would leave the original load return type in place allowing for an assertion failure when we merge two equivalent LXV2DX nodes with different types. This fixes PR27350. Reviewers: nemanjai Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D19133 llvm-svn: 266438	2016-04-15 15:01:38 +00:00
Jun Bum Lim	4c5bd58ebe	[MachineScheduler] Add support for store clustering Perform store clustering just like load clustering. This change adds StoreClusterMutation in machine-scheduler. To control StoreClusterMutation, added enableClusterStores() in TargetInstrInfo.h. This is enabled only on AArch64 for now. This change also adds support for unscaled stores which were not handled in getMemOpBaseRegImmOfs(). llvm-svn: 266437	2016-04-15 14:58:38 +00:00
Igor Kudrin	061d496c51	[Coverage] Prevent detection of false instantiations in case of macro expansion. The root of the problem was that findMainViewFileID(File, Function) could return some ID for any given file, even though that file was not the main file for that function. This patch ensures that the result of this function is consistent with the result of findMainViewFileID(Function). Differential Revision: http://reviews.llvm.org/D18787 llvm-svn: 266436	2016-04-15 14:56:50 +00:00
Nicolai Haehnle	750082d1fe	AMDGPU/SI: Fix regression with no-return atomics Summary: In the added test-case, the atomic instruction feeds into a non-machine CopyToReg node which hasn't been selected yet, so guard against non-machine opcodes here. Reviewers: arsenm, tstellarAMD Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D19043 llvm-svn: 266433	2016-04-15 14:42:36 +00:00
Craig Topper	18e69f4f63	Use MVT instead of EVT to remove a bunch of unnecessary calls to getSimpleVT. llvm-svn: 266414	2016-04-15 06:20:21 +00:00
Craig Topper	ea46b592ab	Add a setOperationPromotedToType convenience method that sets an operation to promoted and sets the type in one call. Use it to save code in X86. llvm-svn: 266413	2016-04-15 06:20:18 +00:00
Craig Topper	13e9dc66e4	[X86] AND, OR, and XOR of vectors are always legal; no need to set them legal explicitly. llvm-svn: 266412	2016-04-15 06:20:14 +00:00
Craig Topper	5e20fd3e7c	[X86] Combine an if and else block that had the same set of calls to setOperationAction that only varied in Legal/Custom. Use the ternary operator on that argument instead. NFC llvm-svn: 266410	2016-04-15 04:57:09 +00:00
Davide Italiano	2abf2e7c8c	Revert "[LTO] Add a new splitCodeGen() API which takes a TargetMachineFactory." This reverts commits r266390 and r266396 as they broke some bots. llvm-svn: 266408	2016-04-15 02:07:03 +00:00
Justin Lebar	cd5fbea67e	[NVPTX] Set NVPTXTTI::getInliningThresholdMultiplier to 5. Summary: Calls on NVPTX are unusually expensive (for one thing, lots of state needs to be saved to memory, which is slow), so make the inliner much more aggressive. Reviewers: chandlerc Subscribers: jholewinski, llvm-commits, tra Differential Revision: http://reviews.llvm.org/D18561 llvm-svn: 266406	2016-04-15 01:38:50 +00:00
Justin Lebar	8650a4da93	[TTI] Add getInliningThresholdMultiplier. Summary: InlineCost's threshold is multiplied by this value. This lets us adjust the inlining threshold up or down on a per-target basis. For example, we might want to increase the threshold on targets where calls are unusually expensive. Reviewers: chandlerc Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D18560 llvm-svn: 266405	2016-04-15 01:38:48 +00:00
Justin Lebar	7dba2e0d0c	[ifcnv] Don't duplicate blocks that contain convergent instructions. It's unsafe to duplicate blocks that contain convergent instructions during ifcnv. See the patch for details. Reviewers: hfinkel Differential Revision: http://reviews.llvm.org/D17518 llvm-svn: 266404	2016-04-15 01:38:41 +00:00
Justin Lebar	cf63b64fc6	[PM] Add a SpeculativeExecution pass for targets with divergent branches. Summary: This IR pass is helpful for GPUs, and other targets with divergent branches. It's a nop on targets without divergent branches. Reviewers: chandlerc Subscribers: llvm-commits, jingyue, rnk, joker.eph, tra Differential Revision: http://reviews.llvm.org/D18626 llvm-svn: 266399	2016-04-15 00:32:12 +00:00
Justin Lebar	cad81cf6b3	[Speculation] Add a SpeculativeExecution mode where the pass does nothing unless TTI::hasBranchDivergence() is true. Summary: This lets us add this pass to the IR pass manager unconditionally; it will simply not do anything on targets without branch divergence. Reviewers: tra Subscribers: llvm-commits, jingyue, rnk, chandlerc Differential Revision: http://reviews.llvm.org/D18625 llvm-svn: 266398	2016-04-15 00:32:09 +00:00
Hans Wennborg	40cfde3cb8	Option parser: class for consuming a joined arg in addition to all remaining args llvm-svn: 266394	2016-04-15 00:23:30 +00:00
Davide Italiano	3fdd27df03	[LTO] Add a new splitCodeGen() API which takes a TargetMachineFactory. This will be used in lld to avoid creating TargetMachine in two different places. See D18999 for a more detailed discussion. Differential Revision: http://reviews.llvm.org/D19139 llvm-svn: 266390	2016-04-15 00:07:28 +00:00
Matt Arsenault	9c499c3a74	AMDGPU: Remove custom load/store scalarization llvm-svn: 266385	2016-04-14 23:31:26 +00:00
Matt Arsenault	fd8ab09c0e	AMDGPU: Include LDS size in printed comment llvm-svn: 266382	2016-04-14 22:11:51 +00:00
Michael Kuperstein	16f13e252b	[AliasSetTracker] Correctly handle changing the size of an entry If the size of an AST entry changes, we also need to make sure we perform necessary alias set merges, as the new size may overlap pointers in other sets. We happen to run into this with memset, because memset allows an entry for a i8* pointer to have a decidedly non-i8 size. This fixes PR27262. Differential Revision: http://reviews.llvm.org/D18939 llvm-svn: 266381	2016-04-14 22:00:11 +00:00
Mehdi Amini	dc4c095d51	Nuke getGlobalContext() from LLVM (but the C API) The only use for getGlobalContext() is in the C API. Let's just move the static global here and nuke the C++ API. Differential Revision: http://reviews.llvm.org/D19094 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 266380	2016-04-14 21:59:18 +00:00
Mehdi Amini	03b42e41bf	Remove every use of getGlobalContext() in LLVM (but the C API) At the same time, fixes InstructionsTest::CastInst unittest: yes you can leave the IR in an invalid state and exit when you don't destroy the context (like the global one), no longer now. This is the first part of http://reviews.llvm.org/D19094 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 266379	2016-04-14 21:59:01 +00:00
Matt Arsenault	3d1c1deb04	AMDGPU: Run SIFoldOperands after PeepholeOptimizer PeepholeOptimizer cleans up redundant copies, which makes the operand folding more effective. shader-db stats: Totals: SGPRS: 34200 -> 34336 (0.40 %) VGPRS: 22118 -> 21655 (-2.09 %) Code Size: 632144 -> 633460 (0.21 %) bytes LDS: 11 -> 11 (0.00 %) blocks Scratch: 10240 -> 11264 (10.00 %) bytes per wave Max Waves: 8822 -> 8918 (1.09 %) Wait states: 0 -> 0 (0.00 %) Totals from affected shaders: SGPRS: 7704 -> 7840 (1.77 %) VGPRS: 5169 -> 4706 (-8.96 %) Code Size: 234444 -> 235760 (0.56 %) bytes LDS: 2 -> 2 (0.00 %) blocks Scratch: 0 -> 1024 (0.00 %) bytes per wave Max Waves: 1188 -> 1284 (8.08 %) Wait states: 0 -> 0 (0.00 %) Increases: SGPRS: 35 (0.01 %) VGPRS: 1 (0.00 %) Code Size: 59 (0.02 %) LDS: 0 (0.00 %) Scratch: 1 (0.00 %) Max Waves: 48 (0.02 %) Wait states: 0 (0.00 %) Decreases: SGPRS: 26 (0.01 %) VGPRS: 54 (0.02 %) Code Size: 68 (0.03 %) LDS: 0 (0.00 %) Scratch: 0 (0.00 %) Max Waves: 4 (0.00 %) Wait states: 0 (0.00 %) llvm-svn: 266378	2016-04-14 21:58:24 +00:00
Matt Arsenault	4ac341c8b3	AMDGPU: Directly emit m0 initialization with s_mov_b32 Currently what comes out of instruction selection is a register initialized to -1, and then copied to m0. MachineCSE doesn't consider copies, but we want these to be CSEed. This isn't much of a problem currently, because SIFoldOperands is run immediately after. This avoids regressions when SIFoldOperands is run later from leaving all copies to m0. llvm-svn: 266377	2016-04-14 21:58:15 +00:00
Matt Arsenault	7900334dd5	AMDGPU: Fold bitcasts of scalar constants to vectors This cleans up some messes since the individual scalar components can be CSEed. llvm-svn: 266376	2016-04-14 21:58:07 +00:00
Geoff Berry	6381713b37	[ScheduleDAGInstrs] Re-factor based on review feedback. NFC. Summary: Re-factor some code to improve clarity and style based on review comments from http://reviews.llvm.org/D18093. Reviewers: MatzeB, mcrosier Subscribers: MatzeB, mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D19128 llvm-svn: 266372	2016-04-14 21:31:07 +00:00
Renato Golin	5cb666add7	[ARM] Adding IEEE-754 SIMD detection to loop vectorizer Some SIMD implementations are not IEEE-754 compliant, for example ARM's NEON. This patch teaches the loop vectorizer to only allow transformations of loops that either contain no floating-point operations or have enough allowance flags supporting lack of precision (ex. -ffast-math, Darwin). For that, the target description now has a method which tells us if the vectorizer is allowed to handle FP math without falling into unsafe representations, plus a check on every FP instruction in the candidate loop to check for the safety flags. This commit makes LLVM behave like GCC with respect to ARM NEON support, but it stops short of fixing the underlying problem: sub-normals. Neither GCC nor LLVM have a flag for allowing sub-normal operations. Before this patch, GCC only allows it using unsafe-math flags and LLVM allows it by default with no way to turn it off (short of not using NEON at all). As a first step, we push this change to make it safe and in sync with GCC. The second step is to discuss a new sub-normals flag on both communities and come up with a common solution. The third step is to improve the FastMath flags in LLVM to encode sub-normals and use those flags to restrict NEON FP. Fixes PR16275. llvm-svn: 266363	2016-04-14 20:42:18 +00:00
Sanjay Patel	e998b91d86	[InstCombine] remove constant by inverting compare + logic (PR27105) https://llvm.org/bugs/show_bug.cgi?id=27105 We can check if all bits outside of a constant mask are set with a single constant. As noted in the bug report, although this form should be considered the canonical IR, backends may want to transform this into an 'andn' / 'andc' comparison against zero because that could be a single machine instruction. Differential Revision: http://reviews.llvm.org/D18842 llvm-svn: 266362	2016-04-14 20:17:40 +00:00
Dehao Chen	34cc676732	Fix null pointer access for discriminator assignment. Summary: This fixes the buildbot failure. Reviewers: dnovillo, davidxl Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D19129 llvm-svn: 266360	2016-04-14 19:46:38 +00:00
Tom Stellard	000c5af3e6	AMDGPU: Add skeleton GlobalIsel implementation Summary: This adds the necessary target code to be able to run the ir translator. Lowering function arguments and returns is a nop and there is no support for RegBankSelect. Reviewers: arsenm, qcolombet Subscribers: arsenm, joker.eph, vkalintiris, llvm-commits Differential Revision: http://reviews.llvm.org/D19077 llvm-svn: 266356	2016-04-14 19:09:28 +00:00
Dehao Chen	46f8fbbb1b	Update discriminator assignment algorithm to handle nested call correctly. Summary: Add discriminator for nested call correctly. Reviewers: davidxl, dnovillo Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D19127 llvm-svn: 266354	2016-04-14 18:37:18 +00:00
Reid Kleckner	28865809fe	Sink DI metadata usage out of MachineInstr.h and MachineInstrBuilder.h MachineInstr.h and MachineInstrBuilder.h are very popular headers, widely included across all LLVM backends. It turns out that there are only a handful of TUs that actually care about DI operands on MachineInstrs. After this change, touching DebugInfoMetadata.h and rebuilding llc only needs 112 actions instead of 542. llvm-svn: 266351	2016-04-14 18:29:59 +00:00
Davide Italiano	96d2a1c603	[ValueMapper] Range-loopify to improve readability. NFC. llvm-svn: 266350	2016-04-14 18:07:32 +00:00
Jacques Pienaar	ad1db3597e	[lanai] Add custom lowering for SRL_PARTS i32. llvm-svn: 266349	2016-04-14 17:59:22 +00:00
Tom Stellard	cef0fe4245	[GlobalISel] Move GISelAccessor class into public headers Reviewers: qcolombet Subscribers: joker.eph, vkalintiris, llvm-commits Differential Revision: http://reviews.llvm.org/D19120 llvm-svn: 266348	2016-04-14 17:45:38 +00:00
Nicolai Haehnle	13d90f324c	[DivergenceAnalysis] Treat PHI with incoming undef as constant Summary: If a PHI has an incoming undef, we can pretend that it is equal to one non-undef, non-self incoming value. This is particularly relevant in combination with the StructurizeCFG pass, which introduces PHI nodes with undefs. Previously, this led to branch conditions that were uniform before StructurizeCFG becoming non-uniform afterwards, which confused the SIAnnotateControlFlow pass. This fixes a crash when Mesa radeonsi compiles a shader from dEQP-GLES3.functional.shaders.switch.switch_in_for_loop_dynamic_vertex Reviewers: arsenm, tstellarAMD, jingyue Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D19013 llvm-svn: 266347	2016-04-14 17:42:47 +00:00
Nicolai Haehnle	05b127da06	[StructurizeCFG] Annotate branches that were treated as uniform Summary: This fully solves the problem where the StructurizeCFG pass does not consider the same branches as uniform as the SIAnnotateControlFlow pass. The patch in D19013 helps with this problem, but is not sufficient (and, interestingly, causes a "regression" with one of the existing test cases). No tests included here, because tests in D19013 already cover this. Reviewers: arsenm, tstellarAMD Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D19018 llvm-svn: 266346	2016-04-14 17:42:35 +00:00
Nicolai Haehnle	723b73b4eb	AMDGPU: Remove SIFixSGPRLiveRanges pass Summary: This pass is unnecessary and overly conservative. It was motivated by situations like def %vreg0:SGPR_32 ... if-block: .. def %vreg1:SGPR_32 ... else-block: ... use %vreg0:SGPR_32 ... and similar situations with uses after the non-uniform control flow, where we are not allowed to assign %vreg0 and %vreg1 to the same physical register, even though in the original, thread/workitem-based CFG, it looks like the live ranges of these registers do not overlap. However, by the time register allocation runs, we have moved to a wave-based CFG that accurately represents the fact that the wave may run through both the if- and the else-block. So the live ranges of %vreg0 and %vreg1 already overlap even without the SIFixSGPRLiveRanges pass. In addition to proving this change correct, I have tested it with Piglit and a small number of other tests. Reviewers: arsenm, tstellarAMD Subscribers: MatzeB, arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D19041 llvm-svn: 266345	2016-04-14 17:42:29 +00:00
Nicolai Haehnle	19f0f5177d	AMDGPU: change a redundant if () to an assert(). NFC Summary: I've been carrying this change around with me for a while, because the if () managed to confuse me while following the code. All callers ensure that the assertion holds. Reviewers: arsenm, tstellarAMD Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D19042 llvm-svn: 266344	2016-04-14 17:42:18 +00:00
Tom Stellard	b72a65ff53	[GlobalISel] Coding style and whitespace fixes Reviewers: qcolombet Subscribers: joker.eph, llvm-commits, vkalintiris Differential Revision: http://reviews.llvm.org/D19119 llvm-svn: 266342	2016-04-14 17:23:33 +00:00
Tim Northover	cdf1529c01	AArch64: expand cmpxchg after regalloc at -O0. FastRegAlloc works only at the basic-block level and spills all live-out registers. Unfortunately for a stack-based cmpxchg near the spill slots, this can perpetually clear the exclusive monitor, which means the cmpxchg will never succeed. I believe the only way to handle this within LLVM is by expanding the loop post-regalloc. We don't want this in general because it severely limits the optimisations that can be done, so we limit this to -O0 compilations. It's an ugly hack, and about the one good point in the whole mess is that we can treat all cmpxchg operations in the most naive way possible (seq_cst, no clrex faff) without affecting correctness. Should fix PR25526. llvm-svn: 266339	2016-04-14 17:03:29 +00:00
Jacques Pienaar	add4a274ba	[lanai] Add areMemAccessesTriviallyDisjoint, getMemOpBaseRegImmOfs and getMemOpBaseRegImmOfsWidth. Summary: Add getMemOpBaseRegImmOfsWidth to enable determining independence during MiSched. Reviewers: eliben, majnemer Subscribers: mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D18903 llvm-svn: 266338	2016-04-14 16:47:42 +00:00
Tom Stellard	79a1fd718c	AMDGPU: allow specifying a workgroup size that needs to fit in a compute unit Summary: For GL_ARB_compute_shader we need to support workgroup sizes of at least 1024. However, if we want to allow large workgroup sizes, we may need to use fewer registers, as we have to run more waves per SIMD. This patch adds an attribute to specify the maximum work group size the compiled program needs to support. It defaults to 256, as that has no wave restrictions. Reducing the number of registers available is done similarly to how the registers were reserved for chips with the sgpr init bug. Reviewers: mareko, arsenm, tstellarAMD, nhaehnle Subscribers: FireBurn, kerberizer, llvm-commits, arsenm Differential Revision: http://reviews.llvm.org/D18340 Patch By: Bas Nieuwenhuizen llvm-svn: 266337	2016-04-14 16:27:07 +00:00
Tom Stellard	f110f8f9f7	AMDGPU/SI: Use the correct scratch wave offset register for shaders. Summary: The code previously always used s1 as it was using the user + system SGPR information for compute kernels. This is incorrect for Mesa shaders though. The register should be the next SGPR after all user and system SGPR's. We rely on Mesa adding arguments for all input and system SGPR's and take the next available SGPR for the scratch wave offset register. Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewers: mareko, arsenm, nhaehnle, tstellarAMD Subscribers: qcolombet, arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D18941 Patch By: Bas Nieuwenhuizen llvm-svn: 266336	2016-04-14 16:27:03 +00:00
Betul Buyukkurt	4f1e8c94bf	[PGO] Do not attach VP metadata if value count at site is 0 [NFC] llvm-svn: 266335	2016-04-14 16:25:45 +00:00
Silviu Baranga	b77365b595	[SCEV][LAA] Add tests for SCEV expression transformations performed during LAA Summary: Add a print method to Predicated Scalar Evolution which prints all interesting transformations done by PSE. Loop Access Analysis will now print this as part of the analysis output. We now use this to check the exact expression transformations that were done by PSE in LAA. The additional checking also acts as white-box testing for the getAsAddRec method. Reviewers: anemet, sanjoy Subscribers: sanjoy, mzolotukhin, llvm-commits Differential Revision: http://reviews.llvm.org/D18792 llvm-svn: 266334	2016-04-14 16:08:45 +00:00
Simon Dardis	53a3492b71	Summary: Alias 'jic $reg, 0' to 'jrc $reg' and 'jialc $reg, 0' to 'jalrc $reg' like binutils. This patch was previously committed as r266055 and seemed to have caused some spurious test failures. They did not reappear after further local testing. llvm-svn: 266301	2016-04-14 13:43:17 +00:00
Igor Kudrin	c0774e6374	[Coverage] Avoid unnecessary copying of std::vector Approved by: Justin Bogner <mail@justinbogner.com> Differential Revision: http://reviews.llvm.org/D18756 llvm-svn: 266284	2016-04-14 09:10:00 +00:00
Adam Nemet	7aab648831	Revert "Support arbitrary addrspace pointers in masked load/store intrinsics" This reverts commit r266086. It breaks the LTO build of gcc in SPEC2000. llvm-svn: 266282	2016-04-14 08:47:17 +00:00
Mehdi Amini	8dcc8080ce	ThinLTO: linkonce compile-time optimization, do not bother when there is only one input file From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 266281	2016-04-14 08:46:22 +00:00
David Majnemer	0f26b0aeb4	[CodeGen] Teach LLVM how to lower @llvm.{min,max}num to {MIN,MAX}NAN The behavior of {MIN,MAX}NAN differs from that of {MIN,MAX}NUM when only one of the inputs is NaN: -NUM will return the non-NaN argument while -NAN would return NaN. It is desirable to lower @llvm.{min,max}num to -NAN if they don't have a native instruction for -NUM. Notably, ARMv7 NEON's vmin has the -NAN semantics. N.B. Of course, it is only safe to do this if the intrinsic call is marked nnan. llvm-svn: 266279	2016-04-14 07:13:24 +00:00
Mehdi Amini	867e91468b	Do not use getGlobalContext()... ever. This code was creating a new type in the global context, regardless of which context the user is sitting in, what can possibly go wrong? From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 266275	2016-04-14 04:36:40 +00:00
Matt Arsenault	9cd90712f0	AMDGPU: Implement canonicalize Also add generic DAG node for it. llvm-svn: 266272	2016-04-14 01:42:16 +00:00
Matthias Braun	46b0f03e12	TargetLowering: Factor out common code for tail call eligibility checking; NFC llvm-svn: 266270	2016-04-14 01:10:42 +00:00
George Burgess IV	cae581d13f	[CFLAA] Fix up code style a bit. NFC. llvm-svn: 266262	2016-04-13 23:27:37 +00:00
Tim Northover	5c02f9ad28	ARM: override cost function to re-enable ConstantHoisting (& fix it). At some point, ARM stopped getting any benefit from ConstantHoisting because the pass called a different variant of getIntImmCost. Reimplementing the correct variant revealed some problems, however: + ConstantHoisting was modifying switch statements. This is simply invalid, the cases must remain integer constants no matter the notional cost. + ConstantHoisting was mangling alloca instructions in the entry block. These should be handled by FrameLowering, so constants actually have a cost of 0. Worse, the resulting bitcasts meant they became dynamic allocas. rdar://25707382 llvm-svn: 266260	2016-04-13 23:08:27 +00:00
Amaury Sechet	2a734db7d3	Revert "Add LLVMGetAttrKindIDInContext in the C API in order to facilitate migration away from LLVMAttribute" This reverts commit 0bcfd95c268bcb180a525e1837e84475df8acdc7. llvm-svn: 266259	2016-04-13 23:01:39 +00:00
Duncan P. N. Exon Smith	11f60fd65a	ValueMapper: Resolve cycles on the new nodes Fix a major bug from r265456. Although it's now much rarer, ValueMapper sometimes has to duplicate cycles. The might-transitively-reference-a-temporary counts don't decrement on their own when there are cycles, and you need to call MDNode::resolveCycles to fix it. r265456 was checking the input nodes to see if they were unresolved. This is useless; they should never be unresolved. Instead we should check the output nodes and resolve cycles on them. llvm-svn: 266258	2016-04-13 22:54:01 +00:00
Amaury Sechet	3ef4e4a98c	Add LLVMGetAttrKindIDInContext in the C API in order to facilitate migration away from LLVMAttribute Summary: LLVMAttribute has outlived its utility and is becoming a problem for C API users that want to use all the LLVM attributes. In order to help moving away from LLVMAttribute in a smooth manner, this diff introduces LLVMGetAttrKindIDInContext, which can be used instead of the enum values. Reviewers: Wallbraker, whitequark, joker.eph, echristo Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D18749 llvm-svn: 266257	2016-04-13 22:51:40 +00:00
Matthias Braun	707e02c273	ARM: Use a callee save register for the swiftself parameter. It is very likely that the swiftself parameter is alive throughout most functions so putting it into a callee save register should avoid spills for the callers with only a minimum amount of extra spills in the callees. Currently the generated code is correct but unnecessarily spills and reloads arguments passed in callee save registers, I will address this in upcoming patches. This also adds a missing check that for tail calls the preserved value of the caller must be the same as the callee's parameter. Differential Revision: http://reviews.llvm.org/D18901 llvm-svn: 266253	2016-04-13 21:43:25 +00:00
Matthias Braun	588d1cdad4	X86: Use a callee save register for the swiftself parameter. It is very likely that the swiftself parameter is alive throughout most functions so putting it into a callee save register should avoid spills for the callers with only a minimum amount of extra spills in the callees. Currently the generated code is correct but unnecessarily spills and reloads arguments passed in callee save registers, I will address this in upcoming patches. This also adds a missing check that for tail calls the preserved value of the caller must be the same as the callee's parameter. Differential Revision: http://reviews.llvm.org/D18902 llvm-svn: 266252	2016-04-13 21:43:21 +00:00
Matthias Braun	74a0bd319a	AArch64: Use a callee save register for swiftself parameters It is very likely that the swiftself parameter is alive throughout most functions so putting it into a callee save register should avoid spills for the callers with only a minimum amount of extra spills in the callees. Currently the generated code is correct but unnecessarily spills and reloads arguments passed in callee save registers, I will address this in upcoming patches. This also adds a missing check that for tail calls the preserved value of the caller must be the same as the callee's parameter. Differential Revision: http://reviews.llvm.org/D19007 llvm-svn: 266251	2016-04-13 21:43:16 +00:00
Easwaran Raman	d295b00ae9	Return immediately from analyzeCall if analyzeBlock returns false. This is part of the patch reviewed at http://reviews.llvm.org/D17584 llvm-svn: 266249	2016-04-13 21:20:22 +00:00
Kevin Enderby	8702574557	Start to add real error messages for malformed Mach-O files. And update the existing test cases in test/Object/macho-invalid.test to use llvm-objdump with the -macho option to produce these error messages and stop producing the generic "Invalid data was encountered while parsing the file" message. Working from the beginning of the file, if the mach header is too large for the size of the file and then if the load commands that follow extend past the end of the file these two errors now generate correct error messages. Both of these have existing test cases in test/Object/macho-invalid.test . But the first with macho-invalid-header it will never trigger the error message "mach header extends past the end of the file" using any of the llvm tools as they all use identify_magic() which rejects files with the correct magic number that are too small in size. So I tested this by hacking that code and seeing the error message down in parseHeader() really does happen. So in case there is ever code in llvm that directly calls createMachOObjectFile() this error message will be correctly produced. The second error message of "load commands extends past the end of the file" is triggered by a number of existing test cases in test/Object/macho-invalid.test . Also other tests trigger different error messages now like "ilocalsym plus nlocalsym in LC_DYSYMTAB load command extends past the end of the symbol table". There are two existing test cases that still get the "Invalid data was encountered ..." error messages that I will tackle next. But they will involve a bit of plumbing an Expect<...> up through the call stack and I want to do those as separate changes. FYI, for those test cases that were trying to test specific errors that now get different errors I’ll fix those in follow on changes and create new test cases for those so they test the error they were meant to test. llvm-svn: 266248	2016-04-13 21:17:58 +00:00
JF Bastien	8331458deb	NFC mergefunc: const correctness Some of the comparators were const others weren't making it annoying to add new comparators which call existing ones. llvm-svn: 266247	2016-04-13 21:12:21 +00:00
Tom Stellard	b951f10701	AMDGPU/SI: Add support for spilling VGPRs without having to scavenge registers Summary: When we are spilling SGPRs to scratch memory, we usually don't have free SGPRs to do the address calculation, so we need to re-use the ScratchOffset register for the calculation. Reviewers: arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D18917 llvm-svn: 266244	2016-04-13 20:44:16 +00:00
Tim Northover	c0bef99bb0	AsmParser: record "# line file" context to calculate location for diag Since we can't emit diagnostics for missing "jmp 1f" labels until the end of the file, we need to be able to restore the context used to calculate file/line. This is basically the "# line file" directive that's being used at the time the expression is seen. rdar://25706972 llvm-svn: 266238	2016-04-13 19:46:54 +00:00
Peter Collingbourne	5413f6f863	LibDriver: Silently do nothing when provided no inputs. This behavior is strange, but it matches lib.exe. Based on a patch by Nico Weber. Fixes PR27335. llvm-svn: 266236	2016-04-13 19:36:04 +00:00
Betul Buyukkurt	bf8554c279	[PGO] Remove redundant VP instrumentation LLVM optimization passes may reduce a profiled target expression to a constant. Removing runtime calls at such instrumentation points would help speed up the instrumented program. llvm-svn: 266229	2016-04-13 18:52:19 +00:00
Nemanja Ivanovic	87bcae366d	[PowerPC] Basic support for P9 byte comparison and count trailing zero insns This patch corresponds to review: http://reviews.llvm.org/D17850 This patch implements the following instructions: cmprb, cmpeqb, cnttzw, cnttzw., cnttzd, cnttzd. llvm-svn: 266228	2016-04-13 18:51:18 +00:00
Evandro Menezes	8d53f88162	[AArch64] Disable LDP/STP for quads Disable LDP/STP for quads on Exynos M1 as they are not as efficient as pairs of regular LDR/STR. Patch by Abderrazek Zaafrani <a.zaafrani@samsung.com>. llvm-svn: 266223	2016-04-13 18:31:45 +00:00
Davide Italiano	266a665f9f	Revert "[IR/Verifier] Each DISubprogram with isDefinition: true must belong to a CU." This reverts commit r266102. The O(N^2) verifier check causes timeouts in LTO test suite. llvm-svn: 266221	2016-04-13 18:08:07 +00:00
David Blaikie	6662d6ad2a	[IR/DebugInfoMetadata] Simplify array length calculation by using array_lengthof instead of ArrayRef::size llvm-svn: 266218	2016-04-13 17:42:56 +00:00
Nirav Dave	2477491a92	Cleanup Store Merging in UseAA case This patch fixes a bug (PR26827) when using alias analysis in store merging. This sets the chain users of the component stores to point to the new store instead of the component stores chain parent. Reviewers: jyknight Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D18909 llvm-svn: 266217	2016-04-13 17:27:26 +00:00
Mehdi Amini	b5b289339b	Revert "Make aliases explicit in the summary" Inadvertently committed... This reverts commit e618ec93786d99df2ddf280ad2d5e02f5516cecf. From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 266215	2016-04-13 17:20:07 +00:00
Mehdi Amini	ce744a95fd	Make aliases explicit in the summary Summary: To be able to work accurately on the reference graph when taking decision about internalizing, promoting, renaming, etc. We need to have the alias information explicit. Reviewers: tejohnson Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D18836 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 266214	2016-04-13 17:18:42 +00:00
Tim Northover	b8a1ecfc62	AArch64: don't create instructions that write to xzr/wzr twice. These are unpredictable even on AArch64. Patch by Yichao Yu. llvm-svn: 266206	2016-04-13 16:25:39 +00:00
Artem Tamazov	eb4d5a9b0b	[AMDGPU][llvm-mc] Support of Trap Handler registers (TTMP0..11 and TBA/TMA) Tests added along with implemented feature. Note that there is a small leftover of unnecessary MI scheduling issue (more info in the review). CodeGen/AMDGPU/salu-to-valu.ll updated to fix the false regression. TODO: Support for TTMP quads, comma-separated syntax in "[]" and more. Differential Revision: http://reviews.llvm.org/D17825 llvm-svn: 266205	2016-04-13 16:18:41 +00:00
Zoran Jovanovic	2f6845ba39	[mips] Fix emitAtomicCmpSwapPartword to handle 64 bit pointers correctly Differential Revision: http://reviews.llvm.org/D18995 llvm-svn: 266204	2016-04-13 16:02:25 +00:00
Vasileios Kalintiris	3751d4114c	[mips] Sign-extend i32 values truncated from previously zero-extended i32 values. Summary: This is a special case for MIPS64 because the architecture requires properly 32-bit sign-extended values in the register containers. Additionally, we merge consecutive trunc + AssertZExt nodes in order to avoid unnecessary sign-extensions when the extension comes from a type smaller than i32. Reviewers: dsanders Subscribers: dsanders, sdardis, llvm-commits Differential Revision: http://reviews.llvm.org/D18893 llvm-svn: 266203	2016-04-13 15:07:45 +00:00
David L Kreitzer	752c1448fe	Simplify strlen to a subtraction for certain cases. Patch by Li Huang (li1.huang@intel.com) Differential Revision: http://reviews.llvm.org/D18230 llvm-svn: 266200	2016-04-13 14:31:06 +00:00
Petar Jovanovic	644b8c1a5d	Calculate __builtin_object_size when pointer depends on a condition This patch fixes calculating of builtin_object_size if it depends on a condition. Before this patch compiler did not know how to calculate the object size when it finds a condition that cannot be eliminated. This patch enables calculating of builtin_object_size even in case when condition cannot be eliminated by choosing minimum or maximum value as a result from condition. Choosing minimum or maximum value from condition is based on the second argument of __builtin_object_size function. Patch by Strahinja Petrovic. Differential Revision: http://reviews.llvm.org/D18438 llvm-svn: 266193	2016-04-13 12:25:25 +00:00
Zlatko Buljan	58d6a959be	[mips][microMIPS] Add CodeGen support for DIV, MOD, DIVU, MODU, DDIV, DMOD, DDIVU and DMODU instructions Differential Revision: http://reviews.llvm.org/D17137 This patch was reverted after the reversion of the dependent patch http://reviews.llvm.org/D17068. There was a problem with a test-suite failure. The problem is hopefully solved by the dependent patch, so this patch is committed again. llvm-svn: 266179	2016-04-13 08:02:26 +00:00
David Majnemer	3ee5f34469	[InstCombine] We folded an fcmp to an i1 instead of a vector of i1 Remove an ad-hoc transform in InstCombine and replace it with more general machinery (ValueTracking, InstructionSimplify and VectorUtils). This fixes PR27332. llvm-svn: 266175	2016-04-13 06:55:52 +00:00
Mehdi Amini	ce23e9702e	Simplify LTOInternalize into UpdateLLVMCompilerUsed It is now only doing the update to the llvm.compiler_used global. The client has to call separately the internalization stage. Hopefully the code is simpler to understand this way. From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 266174	2016-04-13 06:32:46 +00:00
Mehdi Amini	105938302a	Minor cleanup in Internalize, hide helper class using anonymous namespace (NFC) From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 266173	2016-04-13 06:32:29 +00:00
Mehdi Amini	16fcb418ad	LTOInternalize: Use a StringSet instead of a sorted vector and a binary search query for each function From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 266172	2016-04-13 06:32:04 +00:00
Hrvoje Varga	11dd31df9a	[mips][microMIPS] Fix for "Cannot copy registers" assertion Differential Revision: http://reviews.llvm.org/D17068 This change contains a fix for the failing test-suite. So, this patch should hopefully work now. llvm-svn: 266171	2016-04-13 06:17:21 +00:00
Mehdi Amini	deee003a58	Move "ExternalSymbols" out of LTOInternalize (NFC) This is not really related to internalization per se. From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 266170	2016-04-13 05:36:06 +00:00
Mehdi Amini	59269a874f	Really return whether Internalize did change the Module or not. From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 266169	2016-04-13 05:25:16 +00:00
Mehdi Amini	3949b9e6dd	Modernize Internalizer with for-range loop (NFC) From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 266168	2016-04-13 05:25:12 +00:00
Mehdi Amini	24d3414f06	Refactor the InternalizePass into a helper class, and expose it through a public free function (NFC) There is really no reason to require instantiating a pass manager to internalize. From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 266167	2016-04-13 05:25:08 +00:00
Mehdi Amini	4078709957	Refactor Internalization pass to use a callback instead of a StringSet (NFC) This will save a bunch of copies / initialization of intermediate data structures, and (hopefully) simplify the code. This also abstracts the symbol preservation mechanism outside of the Internalization pass into the client code, which is not forced to keep a map of strings for instance (ThinLTO will prefer hashes). From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 266163	2016-04-13 04:20:32 +00:00
Wei Mi	9a16d655c7	Recommit r265547, and r265610,r265639,r265657 on top of it, plus two fixes with one about error verify-regalloc reported, and another about live range update of phi after rematerialization. r265547: Replace analyzeSiblingValues with new algorithm to fix its compile time issue. The patch is to solve PR17409 and its duplicates. analyzeSiblingValues is a N x N complexity algorithm where N is the number of siblings generated by reg splitting. Although it causes significant compile time issues when N is large, it is also important for performance since it removes redundant spills and enables rematerialization. To solve the compile time issue, the patch removes analyzeSiblingValues and replaces it with lower cost alternatives containing two parts. The first part creates a new spill hoisting method in postOptimization of register allocation. It does spill hoisting at once after all the spills are generated instead of inside every instance of selectOrSplit. The second part queries the define expr of the original register for rematerialization and keeps it always available during register allocation even if it is already dead. It deletes those dead instructions only in postOptimization. With the two parts in the patch, it can remove analyzeSiblingValues without sacrificing performance. Patches on top of r265547: r265610 "Fix the compare-clang diff error introduced by r265547." r265639 "Fix the sanitizer bootstrap error in r265547." r265657 "InlineSpiller.cpp: Escap \@ in r265547. [-Wdocumentation]" Differential Revision: http://reviews.llvm.org/D15302 Differential Revision: http://reviews.llvm.org/D18934 Differential Revision: http://reviews.llvm.org/D18935 Differential Revision: http://reviews.llvm.org/D18936 llvm-svn: 266162	2016-04-13 03:08:27 +00:00
Mehdi Amini	ef7555fbb2	Fix FunctionImport export list computation: need to take a reference to a map entry to actually modify it From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 266159	2016-04-13 01:52:32 +00:00
Mehdi Amini	3dfc952e83	LTOInternalize: Fix member type, should be a reference and not a copy From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 266153	2016-04-12 23:58:30 +00:00
Tom Stellard	703b2ec43f	AMDGPU/SI: Fix spilling of 96-bit registers Summary: It seems like this was broken in r252327. I thought we had test cases for this, but it's really hard to trigger spills of this exact register size since they aren't used very much. Reviewers: arsenm, nhaehnle Subscribers: nhaehnle, arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D19021 llvm-svn: 266152	2016-04-12 23:57:30 +00:00
Mehdi Amini	818f67add5	Fix mismatch on returned type between header and implementation for createNameAnonFunctionPass() From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 266151	2016-04-12 23:25:11 +00:00
Justin Bogner	263f314ba7	CodeGen: Clear the MFI's save and restore point after PrologEpilogInserter This state is no longer useful and not guaranteed to be valid in later codegen passes. For example, see the added test, which would print a savepoint of %bb.-1 without this change, and crashes with a use-after-free error under ASan if you apply the recycling allocator patch from llvm.org/PR26808. llvm-svn: 266150	2016-04-12 23:21:53 +00:00
Jeroen Ketema	e48e393729	Add space between words in verify-scev-maps option help message llvm-svn: 266149	2016-04-12 23:21:46 +00:00
Sanjay Patel	5e5056d939	[x86, InstCombine] fix masked load pass-through operand to be a zero vector This bug was introduced with: http://reviews.llvm.org/rL262269 AVX masked loads are specified to set vector lanes to zero when the high bit of the mask element for that lane is zero: "If the mask is 0, the corresponding data element is set to zero in the load form of these instructions, and unmodified in the store form." --Intel manual Differential Revision: http://reviews.llvm.org/D19017 llvm-svn: 266148	2016-04-12 23:16:23 +00:00
Evandro Menezes	551af44e31	[AArch64] Fuse AES{D,E}/AESMC for Exynos M1. (NFC) llvm-svn: 266144	2016-04-12 22:42:36 +00:00
James Y Knight	7873fb9d73	Pre-fill LibcallRoutineNames with nullptr. And rearrange InitLibcallNames slightly. llvm-svn: 266142	2016-04-12 22:32:47 +00:00
David Blaikie	a0fa262181	[MC/ELFObjectWriter] Fix indentation of class body. llvm-svn: 266136	2016-04-12 21:45:53 +00:00
David L Kreitzer	99775c1b6e	Fixed a few typos and formatting problems. NFCI. llvm-svn: 266135	2016-04-12 21:45:09 +00:00
Mehdi Amini	d5faa267c4	Add a pass to name anonymous/nameless function Summary: For correct handling of alias to nameless function, we need to be able to refer them through a GUID in the summary. Here we name them using a hash of the non-private global names in the module. Reviewers: tejohnson Subscribers: joker.eph, llvm-commits Differential Revision: http://reviews.llvm.org/D18883 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 266132	2016-04-12 21:35:28 +00:00
Justin Bogner	32ad24d4ef	X86: Avoid accessing SDValues after they've been RAUW'd This fixes two use-after-frees in selectLEA64_32Addr. If matchAddress matches an ADD with an AND as an operand, and that AND hits one of the "heroic transforms" that folds masks and shifts, we end up with N pointing to an SDNode that was deleted. Make sure we're done accessing it before that. Found by ASan with the recycling allocator changes in llvm.org/PR26808. llvm-svn: 266130	2016-04-12 21:34:24 +00:00
JF Bastien	f90029bb14	NFC: MergeFunctions return early Same effect, easier to read. llvm-svn: 266128	2016-04-12 21:23:05 +00:00
Nicolai Haehnle	df77c9ada4	AMDGPU: add llvm.amdgcn.buffer.load/store intrinsics Summary: They correspond to BUFFER_LOAD/STORE_DWORD[_X2,X3,X4] and mostly behave like llvm.amdgcn.buffer.load/store.format. They will be used by Mesa for SSBO and atomic counters at least when robust buffer access behavior is desired. (These instructions perform no format conversion and do buffer range checking per component.) As a side effect of sharing patterns with llvm.amdgcn.buffer.store.format, it has become trivial to add support for the f32 and v2f32 variants of that intrinsic, so the patch does so. Also DAG-ify (and fix) some tests that I noticed intermittent failures in while developing this patch. Some tests were (temporarily) adjusted for the required mayLoad/hasSideEffects changes to the BUFFER_STORE_DWORD* instructions. See also http://reviews.llvm.org/D18291. Reviewers: arsenm, tstellarAMD, mareko Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D18292 llvm-svn: 266126	2016-04-12 21:18:10 +00:00
Teresa Johnson	c86af3345c	[ThinLTO] Only compute imports for current module in FunctionImport pass Summary: The function import pass was computing all the imports for all the modules in the index, and only using the imports for the current module. Change this to instead compute only for the given module. This means that the exports list can't be populated, but they weren't being used anyway. Longer term, the linker can collect all the imports and export lists and serialize them out for consumption by the distributed backend processes which use this pass. Reviewers: joker.eph Subscribers: llvm-commits, joker.eph Differential Revision: http://reviews.llvm.org/D18945 llvm-svn: 266125	2016-04-12 21:13:11 +00:00
JF Bastien	1bb32ac480	NFC: MergeFunctions update more comments They are wordy. Some words were wrong. llvm-svn: 266124	2016-04-12 21:13:01 +00:00
James Y Knight	19f6cce4e3	Add __atomic_* lowering to AtomicExpandPass. (Recommit of r266002, with r266011, r266016, and not accidentally including an extra unused/uninitialized element in LibcallRoutineNames) AtomicExpandPass can now lower atomic load, atomic store, atomicrmw, and cmpxchg instructions to __atomic_* library calls, when the target doesn't support atomics of a given size. This is the first step towards moving all atomic lowering from clang into llvm. When all is done, the behavior of __sync_* builtins, __atomic_* builtins, and C11 atomics will be unified. Previously LLVM would pass everything through to the ISelLowering code. There, unsupported atomic instructions would turn into __sync_* library calls. Because of that behavior, Clang currently avoids emitting llvm IR atomic instructions when this would happen, and emits __atomic_* library functions itself, in the frontend. This change makes LLVM able to emit __atomic_* libcalls, and thus will eventually allow clang to depend on LLVM to do the right thing. It is advantageous to do the new lowering to atomic libcalls in AtomicExpandPass, before ISel time, because it's important that all atomic operations for a given size either lower to __atomic_* libcalls (which may use locks), or native instructions which won't. No mixing and matching. At the moment, this code is enabled only for SPARC, as a demonstration. The next commit will expand support to all of the other targets. Differential Revision: http://reviews.llvm.org/D18200 llvm-svn: 266115	2016-04-12 20:18:48 +00:00
Tom Stellard	ab1d3a9d50	AMDGPU/SI: Insert wait states required after v_readfirstlane on SI Summary: We will be able to handle this case much better once the hazard recognizer is finished, but this conservative implementation fixes a hang with the piglit test: spec/arb_arrays_of_arrays/execution/sampler/fs-nested-struct-arrays-nonconst-nested-arra Reviewers: arsenm, nhaehnle Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D18988 llvm-svn: 266105	2016-04-12 18:40:43 +00:00
Matt Arsenault	3b08238f78	AMDGPU: Eliminate half of i64 or if one operand is zero_extend from i32 This helps clean up some of the mess when expanding unaligned 64-bit loads when changed to be promote to v2i32, and fixes situations where or x, 0 was emitted after splitting 64-bit ors during moveToVALU. I think this could be a generic combine but I'm not sure. llvm-svn: 266104	2016-04-12 18:24:38 +00:00
Davide Italiano	b390d8ee39	[IR/Verifier] Each DISubprogram with isDefinition: true must belong to a CU. Add a check to catch violations. ~60 tests were broken and prevented this change from being committed. Adrian and I (thanks Adrian!) went through them in the last week or so updating. The check can be done more efficiently but I'd still like to get this in ASAP to avoid more broken tests being checked in (if any). PR: 27101 llvm-svn: 266102	2016-04-12 18:22:33 +00:00
Ahmed Bougacha	7ac86c47d2	[CodeGen] Remove constant-folding dead code. NFC. This code was specific to vector operations with scalar operands: all the opcodes in FoldValue (via FoldConstantArithmetic) can't match those criteria. Replace it with an assert if that ever changes: at that point, we might need to add back a splat BUILD_VECTOR. llvm-svn: 266100	2016-04-12 18:15:39 +00:00
JF Bastien	5502e91c8b	Check alloca's special state Following up to a similar fix in MergeFunctions: r266022. This patch keeps both in sync, it would be nice to not have to do this. It doesn't look like there's an easy way to test this code directly at the moment: AFAICT all current uses of isSameOperationAs are looking at instructions deep inside a function. IndVarSimplify/pr24952.ll and InstMerge/st_sink_* look at alloca inadvertently but are brittle tests. llvm-svn: 266099	2016-04-12 18:06:55 +00:00
Philip Reames	92d1f0cb6d	Introduce a GCRelocateInst class [NFC] Previously, we were using isGCRelocate predicates. Using a subclass of IntrinsicInst is far more idiomatic. The refactoring also enables a couple of minor simplifications and code sharing. llvm-svn: 266098	2016-04-12 18:05:10 +00:00
Sanjay Patel	e6a0a23e08	fix indentation; NFC llvm-svn: 266097	2016-04-12 18:01:48 +00:00
Nicolai Haehnle	279970c0dc	AMDGPU/SI: Fix a mis-compilation of multi-level breaks Summary: Under certain circumstances, multi-level breaks (or what is understood by the control flow passes as such) could be miscompiled in a way that causes infinite loops, by emitting incorrect control flow intrinsics. This fixes a hang in dEQP-GLES3.functional.shaders.loops.while_dynamic_iterations.conditional_continue_vertex Reviewers: arsenm, tstellarAMD Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D18967 llvm-svn: 266088	2016-04-12 16:10:38 +00:00
Artur Pilipenko	dbe0bc8df4	Support arbitrary addrspace pointers in masked load/store intrinsics This is a resubmission of the 263158 change. This patch fixes the problem which occurs when loop-vectorize tries to use @llvm.masked.load/store intrinsic for a non-default addrspace pointer. It fails with "Calling a function with a bad signature!" assertion in CallInst constructor because it tries to pass a non-default addrspace pointer to the pointer argument which has default addrspace. The fix is to add pointer type as another overloaded type to @llvm.masked.load/store intrinsics. Reviewed By: reames Differential Revision: http://reviews.llvm.org/D17270 llvm-svn: 266086	2016-04-12 15:58:04 +00:00
Geoff Berry	c0739d8305	[ScheduleDAGInstrs] Handle instructions with multiple MMOs Summary: In getUnderlyingObjectsForInstr(): Don't give up on instructions with multiple MMOs, instead look through all the MMOs and if they all meet the conservative criteria previously used for single MMO instructions, then return all of the underlying objects derived from the MMOs. The change to ScheduleDAGInstrs::buildSchedGraph() is needed to avoid the case where multiple underlying objects are present and are related in such a way that successive iterations of the loop end up adding a dependency from an instruction to itself. Reviewers: atrick, hfinkel Subscribers: MatzeB, mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D18093 llvm-svn: 266084	2016-04-12 15:50:19 +00:00
Petar Jovanovic	48e4db1ca2	[mips] add assembler support for .set arch=octeon This patch enables assembler support for .set arch=octeon. It will fix issues with inline assembler when this directive is used. Patch by Strahinja Petrovic. Differential Revision: http://reviews.llvm.org/D18548 llvm-svn: 266081	2016-04-12 15:28:16 +00:00
Matt Arsenault	64fa2f4513	AMDGPU: Implement i64 global atomics llvm-svn: 266075	2016-04-12 14:05:11 +00:00
Matt Arsenault	a9dbdcae04	AMDGPU: Add atomic_inc + atomic_dec intrinsics These are different than atomicrmw add 1 because they have an additional input value to clamp the result. llvm-svn: 266074	2016-04-12 14:05:04 +00:00
Matt Arsenault	21ecfe43ba	AMDGPU: Remove trailing whitespace llvm-svn: 266073	2016-04-12 14:04:54 +00:00
Rafael Espindola	d41b54be11	This reverts commit r266002, r266011 and r266016. They broke the msan bot. Original message: Add __atomic_* lowering to AtomicExpandPass. AtomicExpandPass can now lower atomic load, atomic store, atomicrmw,and cmpxchg instructions to __atomic_* library calls, when the target doesn't support atomics of a given size. This is the first step towards moving all atomic lowering from clang into llvm. When all is done, the behavior of __sync_* builtins, __atomic_* builtins, and C11 atomics will be unified. Previously LLVM would pass everything through to the ISelLowering code. There, unsupported atomic instructions would turn into __sync_* library calls. Because of that behavior, Clang currently avoids emitting llvm IR atomic instructions when this would happen, and emits __atomic_* library functions itself, in the frontend. This change makes LLVM able to emit __atomic_* libcalls, and thus will eventually allow clang to depend on LLVM to do the right thing. It is advantageous to do the new lowering to atomic libcalls in AtomicExpandPass, before ISel time, because it's important that all atomic operations for a given size either lower to __atomic_* libcalls (which may use locks), or native instructions which won't. No mixing and matching. At the moment, this code is enabled only for SPARC, as a demonstration. The next commit will expand support to all of the other targets. Differential Revision: http://reviews.llvm.org/D18200 llvm-svn: 266062	2016-04-12 12:30:25 +00:00
Simon Dardis	ee1590f5f0	Revert "[mips] MIPSR6 Compact branch aliases" This reverts commit r266055. ps4-buildslave2 is highlighting a failure. llvm-svn: 266061	2016-04-12 12:22:45 +00:00
Jonas Paulsson	f0fc50905f	[SystemZ] Use LDE32 instead of LE, when Offset is small. On z13, if eliminateFrameIndex() chooses LE (and not LEY), immediately transform that LE to LDE32 to avoid partial register dependencies. LEY should be generally preferred for big offsets over an expansion into LAY + LDE32. Reviewed by Ulrich Weigand. llvm-svn: 266060	2016-04-12 12:07:23 +00:00
Simon Dardis	703c864fe3	[mips] MIPSR6 Compact branch aliases Summary: Alias 'jic $reg, 0' to 'jrc $reg' and 'jialc $reg, 0' to 'jalrc $reg' like binutils. Reviewers: dsanders Differential Revision: http://reviews.llvm.org/D18856 llvm-svn: 266055	2016-04-12 10:41:53 +00:00
Stephan Bergmann	a6ced02a67	Avoid GCC -fpermissive error about llvm::Mangler hidden by member named Mangler llvm-svn: 266049	2016-04-12 08:23:44 +00:00
Mehdi Amini	f59f2bb1b5	Refactor the Internalize stage of libLTO in a separate file (NFC) This is intended to be shared by the ThinLTOCodeGenerator. Note that there is a change in the way the verifier is run: previously it was run as a Pass on the merged module during internalization, while now the verifier is called explicitly on the merged module outside of the internalize "pass pipeline". What remains strange in the API is the fact that `DisableVerify` in the API does not disable this initial verifier. Differential Revision: http://reviews.llvm.org/D19000 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 266047	2016-04-12 06:34:10 +00:00
Chuang-Yu Cheng	94f58e79ae	[PPC64] Mark CR0 Live if PPCInstrInfo::optimizeCompareInstr Creates a Use of CR0 Resolve Bug 27046 (https://llvm.org/bugs/show_bug.cgi?id=27046). The PPCInstrInfo::optimizeCompareInstr function could create a new use of CR0, even if CR0 were previously dead. This patch marks CR0 live if a use of CR0 is created. Author: Tom Jablin (tjablin) Reviewers: hfinkel kbarton cycheng http://reviews.llvm.org/D18884 llvm-svn: 266040	2016-04-12 03:10:52 +00:00
Chuang-Yu Cheng	6efde2fb45	[PPC64] Use mfocrf in prologue when we only need to save 1 nonvolatile CR field In the ELFv2 ABI, we are not required to save all CR fields. If only one nonvolatile CR field is clobbered, use mfocrf instead of mfcr to selectively save the field, because mfocrf has shorter latency compared to mfcr. Thanks to Nemanja for the invaluable hint! Reviewers: nemanjai tjablin hfinkel kbarton http://reviews.llvm.org/D17749 llvm-svn: 266038	2016-04-12 03:04:44 +00:00
Matthias Braun	dff243e597	AArch64: Drive-by cleanup llvm-svn: 266035	2016-04-12 02:16:13 +00:00
George Burgess IV	4540ca36a0	Attempt to make buildbot happier with r266032. Apparently std::numeric_limits<unsigned>::max() isn't constexpr everywhere yet. llvm-svn: 266034	2016-04-12 01:44:13 +00:00
George Burgess IV	278199f615	Add the allocsize attribute to LLVM. `allocsize` is a function attribute that allows users to request that LLVM treat arbitrary functions as allocation functions. This patch makes LLVM accept the `allocsize` attribute, and makes `@llvm.objectsize` recognize said attribute. The review for this was split into two patches for ease of reviewing: D18974 and D14933. As promised on the revisions, I'm landing both patches as a single commit. Differential Revision: http://reviews.llvm.org/D14933 llvm-svn: 266032	2016-04-12 01:05:35 +00:00
Quentin Colombet	777a7717ef	[RegBankSelect] Teach the repairing code how to handle physical registers. llvm-svn: 266029	2016-04-12 00:38:51 +00:00
Quentin Colombet	5aacb1da00	[RegisterBankInfo] Do not provide a default mapping for non-reg of phi operations. llvm-svn: 266027	2016-04-12 00:30:14 +00:00
Quentin Colombet	904a2c7422	[RegBankSelect] Teach how to repair definitions. Although repairing definitions is not mandatory for correctness (only phis would be impacted because of the RPO traversal), not repairing might go against the cost model. Therefore, just repair when it is possible. llvm-svn: 266025	2016-04-12 00:12:59 +00:00
JF Bastien	4f43cfd2c2	MergeFunctions: test alloca better r237193 fixed handling of alloca size / align in MergeFunctions, but only tested one and didn't follow FunctionComparator::cmpOperations's usual comparison pattern. It also didn't update Instruction.cpp:haveSameSpecialState which I'll do separately. llvm-svn: 266022	2016-04-12 00:03:26 +00:00
Derek Schuff	f7b2bce1f1	Replace MachineRegisterInfo::TracksLiveness with a MachineFunctionProperty Use the MachineFunctionProperty mechanism to indicate whether the liveness info is accurate instead of a bool flag on MRI. Keeps the MRI accessor function for convenience. NFC Differential Revision: http://reviews.llvm.org/D18767 llvm-svn: 266020	2016-04-11 23:32:13 +00:00
Mehdi Amini	ae280e54a9	ThinLTO renaming: use module hash instead of position in the summary This is more robust to changes in the link ordering. Differential Revision: http://reviews.llvm.org/D18946 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 266018	2016-04-11 23:26:46 +00:00
JF Bastien	b3ac75f748	AtomicExpandPass: mark assert variable as used Avoid -Wunused-variable llvm-svn: 266016	2016-04-11 23:03:54 +00:00
James Y Knight	00db547f97	Fix compile with GCC after r266002 (Add __atomic_* lowering to AtomicExpandPass) It doesn't like implicitly calling the ArrayRef constructor with a returned array -- it appears to decay the returned value to a pointer, first, before trying to make an ArrayRef out of it. llvm-svn: 266011	2016-04-11 22:52:42 +00:00
Justin Bogner	1faf01578e	CodeGen: Fix a use-after-free in TailDuplication The call to processPHI already erased MI from its parent, so MI isn't even valid here, making the getParent() call a use-after-free in addition to being redundant. Found by ASan with the ArrayRecycler changes in llvm.org/pr26808. llvm-svn: 266008	2016-04-11 22:37:13 +00:00
JF Bastien	3b6eaace62	NFC: keep comment up to date MergeFunctions was refactored a while ago, and Instruction.cpp's comments went out of sync. The content did as well, will fix later. llvm-svn: 266007	2016-04-11 22:30:37 +00:00
Evgeniy Stepanov	f17120a85f	[safestack] Add canary to unsafe stack frames Add StackProtector to SafeStack. This adds limited protection against data corruption in the caller frame. Current implementation treats all stack protector levels as -fstack-protector-all. llvm-svn: 266004	2016-04-11 22:27:48 +00:00
Tim Northover	a6dea06fe3	ARM: use r7 as the frame-pointer on all MachO targets. This is better for a few reasons: + It matches the other tooling for iOS. + It matches EABI in more cases (i.e. Thumb-mode, and in practice we don't use ARM mode). + It leads to infinitesimally smaller code (0.2%, yay!). rdar://25369506 llvm-svn: 266003	2016-04-11 22:27:40 +00:00
James Y Knight	b91d38c5fe	Add __atomic_* lowering to AtomicExpandPass. AtomicExpandPass can now lower atomic load, atomic store, atomicrmw, and cmpxchg instructions to __atomic_* library calls, when the target doesn't support atomics of a given size. This is the first step towards moving all atomic lowering from clang into llvm. When all is done, the behavior of __sync_* builtins, __atomic_* builtins, and C11 atomics will be unified. Previously LLVM would pass everything through to the ISelLowering code. There, unsupported atomic instructions would turn into __sync_* library calls. Because of that behavior, Clang currently avoids emitting llvm IR atomic instructions when this would happen, and emits __atomic_* library functions itself, in the frontend. This change makes LLVM able to emit __atomic_* libcalls, and thus will eventually allow clang to depend on LLVM to do the right thing. It is advantageous to do the new lowering to atomic libcalls in AtomicExpandPass, before ISel time, because it's important that all atomic operations for a given size either lower to __atomic_* libcalls (which may use locks), or native instructions which won't. No mixing and matching. At the moment, this code is enabled only for SPARC, as a demonstration. The next commit will expand support to all of the other targets. Differential Revision: http://reviews.llvm.org/D18200 llvm-svn: 266002	2016-04-11 22:22:33 +00:00
Simon Pilgrim	82e54871d0	[DAGCombiner] Fold xor/and/or (bitcast(A), bitcast(B)) -> bitcast(op (A,B)) anytime before LegalizeVectorOprs xor/and/or (bitcast(A), bitcast(B)) -> bitcast(op (A,B)) was only being combined at the AfterLegalizeTypes stage; this patch permits the combine to occur anytime before then as well. The main aim of this is to improve the ability to recognise bitmasks that can be converted to shuffles. I had to modify a number of AVX512 mask tests as the basic bitcast to/from scalar pattern was being stripped out, preventing testing of the mmask bitops. By replacing the bitcasts with loads we can get almost the same result. Differential Revision: http://reviews.llvm.org/D18944 llvm-svn: 265998	2016-04-11 21:10:33 +00:00
Manman Ren	5751814eda	Swift Calling Convention: swifterror target support. Differential Revision: http://reviews.llvm.org/D18716 llvm-svn: 265997	2016-04-11 21:08:06 +00:00
Tom Stellard	0ffdf65eaa	Revert "AMDGPU/SI: Do not generate s_waitcnt after ds_permute/ds_bpermute" This reverts commit r263720. Just confirmed that s_waitcnt is required after ds_permute/ds_bpermute. llvm-svn: 265992	2016-04-11 20:38:40 +00:00
Hans Wennborg	1f09485c40	Fix broken assert, PR24624 llvm-svn: 265989	2016-04-11 20:35:41 +00:00
Hans Wennborg	e631996350	Remove redundant .c_str(), as suggested by PR25633 llvm-svn: 265988	2016-04-11 20:35:17 +00:00
Hans Wennborg	e9134897f4	Fix a couple of redundant conditional expressions (PR27283, PR28282) llvm-svn: 265987	2016-04-11 20:35:01 +00:00
Sanjay Patel	892f167aa5	use range-loops; NFCI llvm-svn: 265985	2016-04-11 20:13:44 +00:00
Tim Northover	6b3169bb97	MCParser: diagnose missing directional labels more clearly. Before, ELF at least managed a diagnostic but it was a completely untraceable "undefined symbol" error. MachO had a variety of even worse behaviours: crash, emit corrupt file, or an equally bad message. llvm-svn: 265984	2016-04-11 19:50:46 +00:00
Matthew Simpson	53207a99f9	[LoopUtils, LV] Fix PR27246 (first-order recurrences) This patch ensures that when we detect first-order recurrences, we reject a phi node if its previous value is also a phi node. During vectorization the initial and previous values of the recurrence are shuffled together to create the value for the current iteration. However, phi nodes are not widened like other instructions. This fixes PR27246. Differential Revision: http://reviews.llvm.org/D18971 llvm-svn: 265983	2016-04-11 19:48:18 +00:00
Sriraman Tallam	f39e190ad8	Test commit. llvm-svn: 265976	2016-04-11 18:40:50 +00:00
Lang Hames	f9033bbf54	[Object] Make .alt_entry directive parsing MachO specific. ELF and COFF will now treat .alt_entry like any other unrecognized directive. llvm-svn: 265975	2016-04-11 18:33:45 +00:00
Reid Kleckner	b6800b3052	Combine redundant stack realignment booleans in MachineFrameInfo MachineFrameInfo does not need to be able to distinguish between the user asking us not to realign the stack and the target telling us it doesn't support stack realignment. Either way, fixed stack objects have their alignment clamped. llvm-svn: 265971	2016-04-11 17:54:03 +00:00
Sanjay Patel	b91bcd704a	add FIXME comment; NFC llvm-svn: 265970	2016-04-11 17:35:57 +00:00
Sanjay Patel	3a48e9823e	add an assert for safety; NFC llvm-svn: 265969	2016-04-11 17:27:44 +00:00
Sanjay Patel	4b9c682acf	variable names start with a capital letter; NFC llvm-svn: 265968	2016-04-11 17:25:23 +00:00
Xinliang David Li	8dd4ca819b	Add code comment/NFC llvm-svn: 265966	2016-04-11 17:13:08 +00:00
Sanjay Patel	371290790f	[InstCombine] use canEvaluateShiftedShift() to handle the lshr case (NFCI) We need just a couple of logic tweaks to consolidate the shl and lshr cases. This is step 5 of refactoring to solve PR26760: https://llvm.org/bugs/show_bug.cgi?id=26760 llvm-svn: 265965	2016-04-11 17:11:55 +00:00
Sanjay Patel	816ec8882a	[InstCombine] don't try to shift an illegal amount (PR26760) This is the straightforward fix for PR26760: https://llvm.org/bugs/show_bug.cgi?id=26760 But we still need to make some changes to generalize this helper function and then send the lshr case into here. llvm-svn: 265960	2016-04-11 16:50:32 +00:00
Tom Stellard	52686e4182	TargetRegisterInfo: Add getRegAsmName() Summary: The motivation for this new function is to move an invalid assumption about the relationship between the names of register definitions in tablegen files and their assembly names into TargetRegisterInfo, so that we can begin working on fixing this assumption. The current problem is that if you have a register definition in TableGen like: def MYReg0 : Register<"r0", 0>; the function TargetLowering::getRegForInlineAsmConstraint() derives the assembly name from the tablegen name "MYReg0" rather than the given assembly name "r0". This works because on most targets the tablegen name and the assembly name are case-insensitive matches for each other (e.g. def EAX : X86Reg<"eax", ...>). getRegAsmName() will allow targets to override this default assumption and return the correct assembly name. Reviewers: echristo, hfinkel Subscribers: SamWot, echristo, hfinkel, llvm-commits Differential Revision: http://reviews.llvm.org/D15614 llvm-svn: 265955	2016-04-11 16:21:12 +00:00
Sanjay Patel	bd8b779d16	[InstCombine] rename variables in shifted-shift helper function (NFCI) This is step 3 of refactoring to solve PR26760: https://llvm.org/bugs/show_bug.cgi?id=26760 llvm-svn: 265954	2016-04-11 16:11:07 +00:00
Sanjay Patel	6eaff5cec6	[InstCombine] add helper function for shift-shift optimization (NFCI) This is step 2 of refactoring to solve PR26760: https://llvm.org/bugs/show_bug.cgi?id=26760 llvm-svn: 265951	2016-04-11 15:43:41 +00:00
Sanjoy Das	f9d88e650b	This reverts commit r265913 and r265912. See PR27315. r265913: "[IndVars] Eliminate op.with.overflow when possible" r265912: "[SCEV] See through op.with.overflow intrinsics" llvm-svn: 265950	2016-04-11 15:26:18 +00:00
Petar Jovanovic	e578e970cb	[mips] Make Static a default relocation model for MIPS codegen This change follows the defaults of GCC and Clang, so LLVM does not differ from them. While a number of test files are touched by this change, they all keep the old (expected) behaviour with the explicit option: "-relocation-model=pic" The tests that have not been touched are insensitive to relocation model. Differential Revision: http://reviews.llvm.org/D17995 llvm-svn: 265949	2016-04-11 15:24:23 +00:00
Daniel Sanders	a45d3e439f	[mips] Trivial corrections to range checked immediates. Summary: SYNC has a 5-bit unsigned immediate. Move MIPS16-specific pcrel16 operand to Mips16 files. Reviewers: vkalintiris Subscribers: dsanders, sdardis, llvm-commits Differential Revision: http://reviews.llvm.org/D18755 llvm-svn: 265947	2016-04-11 15:20:40 +00:00
Teresa Johnson	6f6fa36244	[ThinLTO] BitcodeWriter still requires Analysis library This should fix bot failure: http://bb.pgr.jp/builders/i686-mingw32-RA-on-linux/builds/9873 The bitcode writer unfortunately still needs the Analysis library, as it replaces old dependence on BFI etc with dependence on new ModuleSummaryAnalysis pass. llvm-svn: 265945	2016-04-11 14:59:07 +00:00
Ulrich Weigand	aa04768600	[SystemZ] README: remove an implemented idea, add some new ones The note about conditional returns can now be removed, as they are implemented. Let's also add 2 new ones in exchange. Author: koriakin Differential Revision: http://reviews.llvm.org/D18962 llvm-svn: 265944	2016-04-11 14:38:47 +00:00
Ulrich Weigand	1bac911c58	[SystemZ] Add SVC instruction This is going to be useful for inline assembly only. Author: koriakin Differential Revision: http://reviews.llvm.org/D18952 llvm-svn: 265943	2016-04-11 14:35:39 +00:00
Teresa Johnson	2d5487cf44	[ThinLTO] Move summary computation from BitcodeWriter to new pass Summary: This is the first step in also serializing the index out to LLVM assembly. The per-module summary written to bitcode is moved out of the bitcode writer and to a new analysis pass (ModuleSummaryIndexWrapperPass). The pass itself uses a new builder class to compute index, and the builder class is used directly in places where we don't have a pass manager (e.g. llvm-as). Because we are computing summaries outside of the bitcode writer, we no longer can use value ids created by the bitcode writer's ValueEnumerator. This required changing the reference graph edge type to use a new ValueInfo class holding a union between a GUID (combined index) and Value* (permodule index). The Value* are converted to the appropriate value ID during bitcode writing. Also, this enables removal of the BitWriter library's dependence on the Analysis library that was previously required for the summary computation. Reviewers: joker.eph Subscribers: joker.eph, llvm-commits Differential Revision: http://reviews.llvm.org/D18763 llvm-svn: 265941	2016-04-11 13:58:45 +00:00
Oliver Stannard	c869e9158d	[ARM] Avoid switching ARM/Thumb mode on .arch/.cpu directive When we see a .arch or .cpu directive, we should try to avoid switching ARM/Thumb mode if possible. If we do have to switch modes, we also need to emit the correct mapping symbol for the new ISA. We did not do this previously, so could emit ARM code with Thumb mapping symbols (or vice-versa). The GAS behaviour is to always stay in the same mode, and to emit an error on any instructions seen when the current mode is not available on the current target. We can't represent that situation easily (we assume that Thumb mode is available if ModeThumb is set), so we differ from the GAS behaviour when switching to a target that can't support the old mode. I've added a warning for when this implicit mode-switch occurs. Differential Revision: http://reviews.llvm.org/D18955 llvm-svn: 265936	2016-04-11 13:06:28 +00:00
Ulrich Weigand	848a513d0a	[SystemZ] Support conditional indirect sibling calls via BCR This adds a conditional variant of CallBR instruction, CallBCR. Also, it can be fused with integer comparisons, resulting in one of the new C*BCall instructions. In addition to CallBRCL limitations, this has another one: it won't trigger if the function to call isn't already in %r1 - see f22 in the test for an example (it's also why the loads in tests are volatile). Author: koriakin Differential Revision: http://reviews.llvm.org/D18928 llvm-svn: 265933	2016-04-11 12:12:32 +00:00
Ulrich Weigand	fb97c51f6f	[SystemZ] Remove incorrect CC use for C*BReturn instructions These are fused compare-and-branches, so they obviously don't use CC. Author: koriakin Differential Revision: http://reviews.llvm.org/D18927 llvm-svn: 265932	2016-04-11 12:03:30 +00:00
Andrey Turetskiy	9df334c28e	[X86] Restrict max long nop length for Lakemont. Restrict the max length of long nops for Lakemont to 7. Experiments on MCU benchmarks (Dhrystone, Coremark) show that this is the optimal length. Differential Revision: http://reviews.llvm.org/D18897 llvm-svn: 265924	2016-04-11 10:07:36 +00:00
Sanjoy Das	a07ad647ee	[IndVars] Eliminate op.with.overflow when possible Summary: If we can prove that an op.with.overflow intrinsic does not overflow, we can get rid of the intrinsic, and replace it with non-wrapping arithmetic. Reviewers: atrick, regehr Subscribers: sanjoy, mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D18685 llvm-svn: 265913	2016-04-10 22:50:31 +00:00
Sanjoy Das	3c529a40ca	[SCEV] See through op.with.overflow intrinsics Summary: This change teaches SCEV to reduce `(extractvalue 0 (op.with.overflow X Y))` into `op X Y` (with a no-wrap tag if possible). Reviewers: atrick, regehr Subscribers: mcrosier, mzolotukhin, llvm-commits Differential Revision: http://reviews.llvm.org/D18684 llvm-svn: 265912	2016-04-10 22:50:26 +00:00
Mehdi Amini	f9e4576e08	Plumb the option to emit the `ModuleHash` in the bitcode through the bitcode writer APIs From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 265907	2016-04-10 21:07:19 +00:00
Simon Pilgrim	d263fdc512	[X86][AVX512BW] Add support for v64i8 multiplies Extend the existing lowering of vXi8 multiplies to support v64i8 on avx512bw targets. I added the Lower512IntArith helper function to help with this - not sure how often this could be used in the future, but it seemed better than putting all that logic inside LowerMUL. Differential Revision: http://reviews.llvm.org/D18937 llvm-svn: 265902	2016-04-10 17:02:48 +00:00
Elena Demikhovsky	751ed0a06a	Loop vectorization with uniform load Vectorization cost of uniform load wasn't correctly calculated. As a result, a simple loop that loads a uniform value wasn't vectorized. Differential Revision: http://reviews.llvm.org/D18940 llvm-svn: 265901	2016-04-10 16:53:19 +00:00

89205 Commits