llvm-project

Commit Graph

Author	SHA1	Message	Date
Dan Gohman	69c4c76396	[WebAssembly] CodeGen support for __builtin_wasm_page_size() llvm-svn: 245872	2015-08-24 21:03:24 +00:00
Bill Schmidt	32fd189de2	[PPC64LE] Fix PR24546 - Swap optimization and debug values This patch fixes PR24546, which demonstrates a segfault during the VSX swap removal pass. The problem is that debug value instructions were not excluded from the list of instructions to be analyzed for webs of related computation. I've added the test case from the PR as a crash test in test/CodeGen/PowerPC. llvm-svn: 245862	2015-08-24 19:27:27 +00:00
Dan Gohman	7b63484b99	[WebAssembly] Skeleton FastISel support llvm-svn: 245860	2015-08-24 18:44:37 +00:00
Dan Gohman	896e53fae8	[WebAssembly] Implement floating point rounding operators. llvm-svn: 245859	2015-08-24 18:23:13 +00:00
Dan Gohman	08fc966d3c	[WebAssembly] Implement the is_zero_undef forms of cttz and ctlz llvm-svn: 245851	2015-08-24 16:39:37 +00:00
Michael Zuckerman	9beca2e7e2	[X86] Add support for mmword memory operand size for Intel-syntax x86 assembly Differential Revision: http://reviews.llvm.org/D12151 llvm-svn: 245835	2015-08-24 10:26:54 +00:00
Oliver Stannard	284f2bffc9	Add DAG optimisation for FP16_TO_FP The FP16_TO_FP node only uses the bottom 16 bits of its input, so the following pattern can be optimised by removing the AND: (FP16_TO_FP (AND op, 0xffff)) -> (FP16_TO_FP op) This is a common pattern for ARM targets when functions have __fp16 arguments, as they are passed as floats (so that they get passed in the correct registers), but then bitcast and truncated to ignore the top 16 bits. llvm-svn: 245832	2015-08-24 09:47:45 +00:00
Scott Douglass	bdef60462d	[ARM] Use AEABI helpers for i64 div and rem Differential Revision: http://reviews.llvm.org/D12232 llvm-svn: 245830	2015-08-24 09:17:18 +00:00
Sanjay Patel	453b4df973	remove FIXME; fixed by r245733 llvm-svn: 245819	2015-08-23 20:43:25 +00:00
Simon Pilgrim	2a7049abe0	[DAGCombiner] Fold CONCAT_VECTORS of bitcasted EXTRACT_SUBVECTOR Minor generalization of D12125 - peek through any bitcast to the original vector that we're extracting from. llvm-svn: 245814	2015-08-23 15:22:14 +00:00
Frederic Riss	7bb12261a3	[dwarfdump] Do not apply relocations in mach-o files if there is no LoadedObjectInfo. Not only do we not need to do anything to read correct values from the object files, but the current logic actually wrongly applies the section base address twice when there is no LoadedObjectInfo passed to the DWARFContext creation (as the added test shows). Simply do not apply any relocations on the mach-o debug info if there is no load offset to apply. llvm-svn: 245807	2015-08-23 04:44:21 +00:00
Frederic Riss	4568ce79bd	[dsymutil] Remove old ODR uniquing tests These tests have been obsoleted by the refactored versions introduced in the previous commit. llvm-svn: 245804	2015-08-23 02:38:37 +00:00
Frederic Riss	f8bcc0c610	[dsymutil] Refactor ODR uniquing tests to be more readable. This patch adds all the refactored tests in new files, the old tests will be removed by a followup commit. Thanks to D. Blaikie for all the feedback. llvm-svn: 245803	2015-08-23 02:38:29 +00:00
Joseph Tremoulet	8220bcc570	[WinEH] Require token linkage in EH pad/ret signatures Summary: WinEHPrepare is going to require that cleanuppad and catchpad produce values of token type which are consumed by any cleanupret or catchret exiting the pad. This change updates the signatures of those operators to require/enforce that the type produced by the pads is token type and that the rets have an appropriate argument. The catchpad argument of a `CatchReturnInst` must be a `CatchPadInst` (and similarly for `CleanupReturnInst`/`CleanupPadInst`). To accommodate that restriction, this change adds a notion of an operator constraint to both LLParser and BitcodeReader, allowing appropriate sentinels to be constructed for forward references and appropriate error messages to be emitted for illegal inputs. Also add a verifier rule (noted in LangRef) that a catchpad with a catchpad predecessor must have no other predecessors; this ensures that WinEHPrepare will see the expected linear relationship between sibling catches on the same try. Lastly, remove some superfluous/vestigial casts from instruction operand setters operating on BasicBlocks. Reviewers: rnk, majnemer Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D12108 llvm-svn: 245797	2015-08-23 00:26:33 +00:00
David Blaikie	0732e16e69	Update test case so it passes the verifier Some debug info was drastically out of date, from the days where we used to emit a list of length one (with a single null entry) rather than an empty list (or, more recently, no list at all) for list fields that have no elements. llvm-svn: 245796	2015-08-22 22:38:44 +00:00
David Blaikie	3c338f3a7e	Verifier: Don't crash on null entries in debug info retained types list There was already a good error path for this. Added a test for it & made a minor code change to ensure the error path was actually reached, rather than crashing before we got that far. llvm-svn: 245795	2015-08-22 22:36:40 +00:00
Davide Italiano	60963e3682	[llvm-readobj] Test --macho-data-in-code option. As an added bonus, this converts an existing test from macho-dump to llvm-readobj. Only 66 to go. llvm-svn: 245791	2015-08-22 20:30:56 +00:00
Jingyue Wu	fcec09866a	[NVPTX] Allow undef value as global initializer Summary: __shared__ variable may now emit undef value as initializer, do not throw error on that. Test Plan: test/CodeGen/NVPTX/global-addrspace.ll Patch by Xuetian Weng Reviewers: jholewinski, tra, jingyue Subscribers: llvm-commits, jholewinski Differential Revision: http://reviews.llvm.org/D12242 llvm-svn: 245785	2015-08-22 05:40:26 +00:00
Matt Arsenault	e8df879948	AMDGPU: Improve accuracy of instruction rates for some FP instructions llvm-svn: 245774	2015-08-22 00:50:41 +00:00
JF Bastien	057292a76c	Improve the determinism of MergeFunctions Summary: Merge functions previously relied on unsigned comparisons of pointer values to order functions. This caused observable non-determinism in the compiler for large bitcode programs. Basically, opt -mergefuncs program.bc | md5sum produces different hashes when run repeatedly on the same machine. Differing output was observed on three large bitcodes, but it was less frequent on the smallest file. It is possible that this only manifests on the large inputs, hence remaining undetected until now. This patch fixes this by removing (almost, see below) all places where comparisons between pointers are used to order functions. Most of these changes are local, but the comparison of global values requires assigning an identifier to each local in the order it is visited. This is very similar to the way the comparison function identifies Value's defined within a function. Because the order of visiting the functions and their subparts is deterministic, the identifiers assigned to the globals will be as well, and the order of functions will be deterministic. With these changes, there is no more observed non-determinism. There are also only minor slowdowns (negligible to 4%) compared to the baseline, which is likely a result of the fact that global comparisons involve hash lookups and not just pointer comparisons. The one caveat so far is that programs containing BlockAddress constants can still be non-deterministic. It is not clear what the right solution is here. In particular, even if the global numbers are used to order by function, we still need a way to order the BasicBlock's. Unfortunately, we cannot just bail out and fail to order the functions or consider them equal, because we require a total order over functions. Note that programs with BlockAddress constants are relatively rare, so the impact of leaving this in is minor as long as this pass is opt-in. Author: jrkoenig Reviewers: nlewycky, jfb, dschuff Subscribers: jevinskie, llvm-commits, chapuni Differential revision: http://reviews.llvm.org/D12168 llvm-svn: 245762	2015-08-21 23:27:24 +00:00
Adam Nemet	4e533ef7a9	[LAA] Hold bounds via ValueHandles during SCEV expansion SCEV expansion can invalidate previously expanded values. For example in SCEVExpander::ReuseOrCreateCast, if we already have the requested cast value but it's not at the desired location, a new cast is inserted and the old cast will be invalidated. Therefore, when expanding the bounds for the pointers, a later entry can invalidate the IR value for an earlier one. The fix is to store a value handle rather than the value itself. The newly added test has a more detailed description of how the bug triggers. This bug can have a negative but potentially highly variable performance impact in Loop Distribution. Because one of the bound values was invalidated and is an undef expression now, InstCombine is free to transform the array overlap check: Start0 <= End1 && Start1 <= End0 into: Start0 <= End1 So depending on the runtime location of the arrays, we would detect a conflict and fall back on the original loop of the versioned loop. Also tested compile time with SPEC2006 LTO bc files. llvm-svn: 245760	2015-08-21 23:19:57 +00:00
Tom Stellard	bd8a0856e2	AMDGPU/SI: Better handle s_wait insertion We can wait on either VM, EXP or LGKM. The waits are independent. Without this patch, a wait inserted because of one of them would also wait for all the previous others. This patch makes s_wait only wait for the ones we need for the next instruction. Here's an example of a subtle perf reduction this patch fixes: This is without the patch: buffer_load_format_xyzw v[8:11], v0, s[44:47], 0 idxen buffer_load_format_xyzw v[12:15], v0, s[48:51], 0 idxen s_load_dwordx4 s[44:47], s[8:9], 0xc s_waitcnt lgkmcnt(0) buffer_load_format_xyzw v[16:19], v0, s[52:55], 0 idxen s_load_dwordx4 s[48:51], s[8:9], 0x10 s_waitcnt vmcnt(1) buffer_load_format_xyzw v[20:23], v0, s[44:47], 0 idxen The s_waitcnt vmcnt(1) is useless. The reason it is added is because the last buffer_load_format_xyzw needs s[44:47], which was issued by the first s_load_dwordx4. It waits for all VM before that call to have finished. Internally, 3 counters (for VM, EXP and LGKM) are updated after every instruction. For example buffer_load_format_xyzw will increase the VM counter, and s_load_dwordx4 the LGKM one. Without the patch, for every defined register, the current 3 counters are stored, and are used to know how long to wait when an instruction needs the register. Because of that, the counters stored for s[44:47] imply that using the register requires waiting for the previous buffer_load_format_xyzw. Instead this patch stores only the counters that matter for the register, and puts zero for the other ones, since we don't need any wait for them. Patch by: Axel Davy Differential Revision: http://reviews.llvm.org/D11883 llvm-svn: 245755	2015-08-21 22:47:27 +00:00
Sanjoy Das	c86c162a58	Re-apply r245635, "[InstCombine] Transform A & (L - 1) u< L --> L != 0" The original checkin was buggy; this change has a fix. Original commit message: [InstCombine] Transform A & (L - 1) u< L --> L != 0 Summary: This transform is never a pessimization at the IR level (since it replaces an `icmp` with another), and has potential payoffs: 1. It may make the `icmp` fold away or become loop invariant. 2. It may make the `A & (L - 1)` computation dead. This shows up in Java, in range checks generated by array accesses of the form `a[i & (a.length - 1)]`. Reviewers: reames, majnemer Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D12210 llvm-svn: 245753	2015-08-21 22:22:37 +00:00
Alex Lorenz	c1136ef3b8	MIR Serialization: Serialize the pointer IR expression values in the machine memory operands. llvm-svn: 245745	2015-08-21 21:54:12 +00:00
Vedant Kumar	366dd9fd2b	[ARM] Fix MachO CPU Subtype selection Differential Revision: http://reviews.llvm.org/D12040 llvm-svn: 245744	2015-08-21 21:52:48 +00:00
Hal Finkel	ff9639d6b7	[PowerPC] PPCVSXFMAMutate should not segfault on undef input registers When PPCVSXFMAMutate would look at the input addend register, it would get its input value number. This would fail, however, if the register was undef, causing a segfault. Don't segfault (just skip such FMA instructions). Fixes the test case from PR24542 (although that may have been over-reduced). llvm-svn: 245741	2015-08-21 21:34:24 +00:00
Simon Pilgrim	76b91d9084	Line endings fix. llvm-svn: 245736	2015-08-21 21:09:51 +00:00
Sanjay Patel	f0bc07f7a5	[x86] enable machine combiner reassociations for 256-bit vector min/max llvm-svn: 245735	2015-08-21 21:04:21 +00:00
Sanjay Patel	dddad10241	remove 'FeatureSlowUAMem' from AMD CPUs based on 10H micro-arch or later See discussion in D12154 ( http://reviews.llvm.org/D12154 ), AMD Software Optimization Guides for 10H/12H/15H/16H, and Agner Fog's experimental data. llvm-svn: 245733	2015-08-21 20:39:17 +00:00
Sanjay Patel	cf942fa905	[x86] enable machine combiner reassociations for 128-bit vector min/max llvm-svn: 245715	2015-08-21 18:06:49 +00:00
Sanjay Patel	c9f4aa6b8c	save some testing time; get rid of the non-SSE chips in this test It doesn't matter what slow/fast unaligned attribute the old chips have - they can't use anything more than 4-byte stores. llvm-svn: 245709	2015-08-21 17:16:51 +00:00
Sanjay Patel	44d7bef0fa	add a test case to check the fast-unaligned-mem attribute per CPU This will confirm that the patch in D12154 is actually NFC. It will also confirm that the proposed changes for the AMD chips are behaving as expected. llvm-svn: 245704	2015-08-21 16:08:26 +00:00
John Brawn	eab960c46f	[DAGCombiner] Fold together mul and shl when both are by a constant This is intended to improve code generation for GEPs, as the index value is shifted by the element size and in GEPs of multi-dimensional arrays the index of higher dimensions is multiplied by the lower dimension size. Differential Revision: http://reviews.llvm.org/D12197 llvm-svn: 245689	2015-08-21 10:48:17 +00:00
NAKAMURA Takumi	6a6232818d	Revert r245635, "[InstCombine] Transform A & (L - 1) u< L --> L != 0" It caused miscompilation in clang. llvm-svn: 245678	2015-08-21 07:46:07 +00:00
James Y Knight	667395f334	[Sparc] Support user-specified stack object overalignment. Note: I do not implement a base pointer, so it's still impossible to have dynamic realignment AND dynamic alloca in the same function. This also moves the code for determining the frame index reference into getFrameIndexReference, where it belongs, instead of inline in eliminateFrameIndex. [Begin long-winded screed] Now, stack realignment for Sparc is actually a silly thing to support, because the Sparc ABI has no need for it -- unlike the situation on x86, the stack is ALWAYS aligned to the required alignment for the CPU instructions: 8 bytes on sparcv8, and 16 bytes on sparcv9. However, LLVM unfortunately implements user-specified overalignment using stack realignment support, so for now, I'm going to go along with that tradition. GCC instead treats objects which have alignment specification greater than the maximum CPU-required alignment for the target as a separate block of stack memory, with their own virtual base pointer (which gets aligned). Doing it that way avoids needing to implement per-target support for stack realignment, except for the targets which actually have an ABI-specified stack alignment which is too small for the CPU's requirements. Further unfortunately in LLVM, the default canRealignStack for all targets effectively returns true, despite that implementing that is something a target needs to do specifically. So, the previous behavior on Sparc was to silently ignore the user's specified stack alignment. Ugh. Yet MORE unfortunate, if a target actually does return false from canRealignStack, that also causes the user-specified alignment to be silently ignored, rather than emitting an error. (I started looking into fixing that last, but it broke a bunch of tests, because LLVM actually depends on having it silently ignored: some architectures (e.g. non-linux i386) have smaller stack alignment than spilled-register alignment. But, the fact that a register needs spilling is not known until within the register allocator. And by that point, the decision to not reserve the frame pointer has been frozen in place. And without a frame pointer, stack realignment is not possible. So, canRealignStack() returns false, and needsStackRealignment() then returns false, assuming everyone can just go on their merry way assuming the alignment requirements were probably just suggestions after-all. Sigh...) Differential Revision: http://reviews.llvm.org/D12208 llvm-svn: 245668	2015-08-21 04:17:56 +00:00
Peter Collingbourne	1dc6a8d179	TransformUtils: Introduce module splitter. The module splitter splits a module into linkable partitions. It will be used to implement parallel LTO code generation. This initial version of the splitter does not attempt to deal with the somewhat subtle symbol visibility issues around module splitting. These will be dealt with in a future change. Differential Revision: http://reviews.llvm.org/D12132 llvm-svn: 245662	2015-08-21 02:48:20 +00:00
Matthias Braun	25fd09a756	AArch64: Fix testcase of r245640 llvm-svn: 245647	2015-08-21 00:23:19 +00:00
Michael Zolotukhin	6002295c6a	[SLP] Add one more test case for propagating 'nontemporal' attributes. llvm-svn: 245644	2015-08-21 00:08:39 +00:00
Adrian Prantl	1d8741bc0c	delete more dead code from this testcase. llvm-svn: 245643	2015-08-21 00:02:04 +00:00
Adrian Prantl	62222edb77	Further reduce the IR in this testcase based on a further reduction of the original source by David Blaikie (thanks!). llvm-svn: 245642	2015-08-20 23:59:39 +00:00
Matthias Braun	46e5639806	AArch64: Fix cmp;ccmp ordering When producing conditional compare sequences for or operations we need to negate the operands and the finally tested flags. The thing is if we negate the finally tested flags this equals a logical negation of all previously emitted expressions. There was a case missing where we have to order OR expressions so they get emitted first. This fixes http://llvm.org/PR24459 llvm-svn: 245641	2015-08-20 23:33:34 +00:00
Matthias Braun	266204b7dc	AArch64: Do not create CCMP on multiple users. Creating CMP;CCMP sequences from and/or trees does not gain us anything if the and/or tree is materialized to a GP register anyway. While most of the code already checked for hasOneUse() there was one important case missing. llvm-svn: 245640	2015-08-20 23:33:31 +00:00
David Majnemer	2df38cd0c4	[InstSimplify] add nuw %x, C2 must be at least C2 Use the fact that add nuw always creates a larger bit pattern when trying to simplify comparisons. llvm-svn: 245638	2015-08-20 23:01:41 +00:00
Sanjoy Das	e472d8a57a	[InstCombine] Transform A & (L - 1) u< L --> L != 0 Summary: This transform is never a pessimization at the IR level (since it replaces an `icmp` with another), and has potential payoffs: 1. It may make the `icmp` fold away or become loop invariant. 2. It may make the `A & (L - 1)` computation dead. This shows up in Java, in range checks generated by array accesses of the form `a[i & (a.length - 1)]`. Reviewers: reames, majnemer Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D12210 llvm-svn: 245635	2015-08-20 22:31:55 +00:00
Michael Zolotukhin	51b00e6d82	[SLP] Propagate 'nontemporal' attribute into vectorized instructions. llvm-svn: 245633	2015-08-20 22:28:15 +00:00
Michael Zolotukhin	2a3d99fedf	[LoopVectorize] Propagate 'nontemporal' attribute into vectorized instructions. llvm-svn: 245632	2015-08-20 22:27:38 +00:00
Ahmed Bougacha	0cdc7719f0	[X86] Look for scalar through one bitcast when lowering to VBROADCAST. Fixes PR23464: one way to use the broadcast intrinsics is: _mm256_broadcastw_epi16(_mm_cvtsi32_si128((int)src)); We don't currently fold this, but now that we use native IR for the intrinsics (r245605), we can look through one bitcast to find the broadcast scalar. Differential Revision: http://reviews.llvm.org/D10557 llvm-svn: 245613	2015-08-20 21:02:39 +00:00
Ahmed Bougacha	69a17acb74	[X86] Add some broadcast-from-memory tests. llvm-svn: 245612	2015-08-20 20:59:41 +00:00
Jingyue Wu	ca3ef11a9b	[NVPTX] truncating 64-bit to 32-bit is free Summary: Add an LSR test that exercises isTruncateFree. Without this change, LSR creates another indvar representing the truncated value. Reviewers: jholewinski, eliben Subscribers: jholewinski, llvm-commits Differential Revision: http://reviews.llvm.org/D12058 llvm-svn: 245611	2015-08-20 20:59:02 +00:00
Ahmed Bougacha	1a498705e4	[X86] Replace avx2 broadcast intrinsics with native IR. Since r245605, the clang headers don't use these anymore. r245165 updated some of the tests already; update the others, add an autoupgrade, remove the intrinsics, and cleanup the definitions. Differential Revision: http://reviews.llvm.org/D10555 llvm-svn: 245606	2015-08-20 20:36:19 +00:00
Adrian Prantl	baf90fc265	Fix a bug that caused SimplifyCFG to drop DebugLocs. Instruction::dropUnknownMetadata(KnownSet) is supposed to preserve all metadata in KnownSet, but the condition for DebugLocs was inverted. Most users of dropUnknownMetadata() actually worked around this by not adding LLVMContext::MD_dbg to their list of KnownIDs. This is now made explicit. llvm-svn: 245589	2015-08-20 18:24:02 +00:00
Adrian Prantl	a317cd2583	Fix a debug location handling bug in GVN. Caught by the famous "DebugLoc describes the correct SubProgram" assertion. When GVN is removing a nonlocal load it updates the debug location of the SSA value that replaces the load with that of the load. In the testcase this actually overwrites a valid debug location with an empty one. In reality GVN has to make an arbitrary choice between two equally valid debug locations. This patch changes the behavior to only update the location if the value doesn't already have a debug location. llvm-svn: 245588	2015-08-20 18:23:56 +00:00
Rafael Espindola	c30c7c493f	Fix symbol value computation when part of the expression is weak. This matches the behaviour of the gnu assembler and is part of fixing pr24486. llvm-svn: 245576	2015-08-20 16:18:30 +00:00
Douglas Katzman	58195a2d74	[Sparc]: correct the 'set' synthetic instruction Differential Revision: http://reviews.llvm.org/D12194 llvm-svn: 245575	2015-08-20 16:16:16 +00:00
Balaram Makam	ccf59731e3	Optimize bitwise even/odd test (-x&1 -> x&1) to not use negation. Summary: We know that -x & 1 is equivalent to x & 1, avoid using negation for testing if a negative integer is even or odd. Reviewers: majnemer Subscribers: junbuml, mssimpso, gberry, mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D12156 llvm-svn: 245569	2015-08-20 15:35:00 +00:00
Zoran Jovanovic	56585d517b	[mips][microMIPS] Add microMIPS32r6 and microMIPS64r6 tests for existing 16-bit ADDIUR1SP, ADDIUR2, ADDIUS5 and ADDIUSP instructions Differential Revision: http://reviews.llvm.org/D10955 llvm-svn: 245554	2015-08-20 11:51:49 +00:00
Marina Yatsina	bce1ab67a5	[X86] Fix FBLD and FBSTP FBLD and FBSTP should receive TBYTE because it is defined as FBLD m80 FBSTP m80 Differential Revision: http://reviews.llvm.org/D11748 llvm-svn: 245553	2015-08-20 11:51:24 +00:00
Marina Yatsina	7a4e1ba737	[X86] Fix bug in COMISD and COMISS definition in td files COMISD should receive QWORD because it is defined as (V)COMISD xmm1, xmm2/m64 COMISS should receive DWORD because it is defined as (V)COMISS xmm1, xmm2/m32 Differential Revision: http://reviews.llvm.org/D11712 llvm-svn: 245551	2015-08-20 11:21:36 +00:00
David Majnemer	cfc1df553e	[X86] Fix the (shl (and (setcc_c), c1), c2) -> (and setcc_c, (c1 << c2)) fold We didn't check for the necessary preconditions before folding a mask/shift into a single mask. This fixes PR24516. llvm-svn: 245544	2015-08-20 09:00:56 +00:00
Bjorn Steinbrink	2e2f66557e	Revert "[DSE] Enable removal of lifetime intrinsics in terminating blocks" llvm-svn: 245543	2015-08-20 08:58:47 +00:00
Bjorn Steinbrink	cc7e8a9705	[DSE] Enable removal of lifetime intrinsics in terminating blocks Usually DSE is not supposed to remove lifetime intrinsics, but it's actually ok to remove them for dead objects in terminating blocks, because they convey no extra information there. Until we hit a lifetime start that cannot be removed, that is. Because from that point on the lifetime intrinsics become interesting again, e.g. for stack coloring. Reviewers: reames Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D11710 llvm-svn: 245542	2015-08-20 08:25:28 +00:00
Hal Finkel	9fdce9adee	[PowerPC] Fix value type on XVCMPEQDP for v2f64 comparisons XVCMPEQDP is used for VSX v2f64 equality comparisons, but the value type needs to be v2i64 (as that's the corresponding SETCC type). Fixes PR24225. llvm-svn: 245535	2015-08-20 03:02:02 +00:00
Hal Finkel	be78c25acb	[PowerPC] Fix the int2fp(fp2int(x)) DAGCombine to ignore ppc_fp128 This DAGCombine was creating custom SDAG nodes with an illegal ppc_fp128 operand type because it was triggering on f64/f32 int2fp(fp2int(ppc_fp128 x)), but shouldn't (it should only apply to f32/f64 types). The result was a crash. llvm-svn: 245530	2015-08-20 01:18:20 +00:00
Alex Lorenz	36efd3883d	MIR Serialization: Use the global value syntax for global value memory operands. This commit modifies the serialization syntax so that the global IR values in machine memory operands use the global value '@<name>' syntax instead of the current '%ir.<name>' syntax. The unnamed global IR values are handled by this commit as well, as the existing global value parsing method can parse the unnamed globals already. llvm-svn: 245527	2015-08-20 00:20:03 +00:00
Alex Lorenz	0d009645a1	MIR Serialization: Change syntax for the call entry pseudo source values. The global IR values in machine memory operands should use the global value '@<name>' syntax instead of the current '%ir.<name>' syntax. However, the global value call entry pseudo source values use the global value syntax already. Therefore, the syntax for the call entry pseudo source values has to be changed so that the global values and call entry global value PSVs can be parsed without ambiguities. llvm-svn: 245526	2015-08-20 00:12:57 +00:00
Alex Lorenz	dd13be0bcc	MIR Serialization: Serialize unnamed local IR values in memory operands. llvm-svn: 245521	2015-08-19 23:31:05 +00:00
Sanjay Patel	9e5927fdc3	[x86] enable machine combiner reassociations for scalar double-precision min/max llvm-svn: 245506	2015-08-19 21:27:27 +00:00
Sanjay Patel	4e3ee1e548	[x86] enable machine combiner reassociations for scalar single-precision maximums llvm-svn: 245504	2015-08-19 21:18:46 +00:00
Simon Pilgrim	35f528262f	[DAGCombiner] Added SMAX/SMIN/UMAX/UMIN constant folding We still need to add constant folding of vector comparisons to fold the tests for targets that don't support the respective min/max nodes I needed to update 2011-12-06-AVXVectorExtractCombine to load a vector instead of using a constant vector to prevent it folding Differential Revision: http://reviews.llvm.org/D12118 llvm-svn: 245503	2015-08-19 21:11:58 +00:00
Juergen Ributzka	b12248e9cd	[AArch64][FastISel] Don't fold shifts with UB. We are already falling back to SelectionDAG when encountering a shift with UB. This adds the same checks for shifts with UB that get folded into arithmetic or logical operations. This fixes rdar://problem/22345295. llvm-svn: 245499	2015-08-19 20:52:55 +00:00
David Majnemer	f25fe64716	[X86] Emit more efficient >= comparisons against 0 We don't do a great job with >= comparisons against zero when the result is used as an i8. Given something like: void f(long long LL, bool *B) { *B = LL >= 0; } We used to generate: shrq $63, %rdi xorb $1, %dil movb %dil, (%rsi) Now we generate: testq %rdi, %rdi setns (%rsi) Differential Revision: http://reviews.llvm.org/D12136 llvm-svn: 245498	2015-08-19 20:51:40 +00:00
Dan Gohman	dde8dce6a9	[WebAssembly] Use the default alignment for SIMD types. Previously WebAssembly's datalayout string had -v128:8:128. This had been an attempt to declare a certain level of support for unaligned SIMD accesses. However, clang makes its own determinations for SIMD alignment that are independent of the datalayout string, so this wasn't actually meaningful. llvm-svn: 245494	2015-08-19 20:30:20 +00:00
Simon Pilgrim	989cbbd2f5	[DAGCombiner] Fold CONCAT_VECTORS of EXTRACT_SUBVECTOR (or undef) to VECTOR_SHUFFLE. Check to see if this is a CONCAT_VECTORS of a bunch of EXTRACT_SUBVECTOR operations. If so, and if the EXTRACT_SUBVECTOR vector inputs come from at most two distinct vectors the same size as the result, attempt to turn this into a legal shuffle. Differential Revision: http://reviews.llvm.org/D12125 llvm-svn: 245490	2015-08-19 20:09:50 +00:00
Paul Robinson	9c10414ce0	Minor tidying of regex in a test llvm-svn: 245486	2015-08-19 19:36:35 +00:00
Douglas Katzman	2362b69dd9	[Sparc]: asm-only support for the ldstub instruction. llvm-svn: 245485	2015-08-19 19:30:57 +00:00
Alex Lorenz	5ef93b0c4c	MIR Serialization: Serialize instruction's register ties. This commit serializes the machine instruction's register operand ties. The ties are printed out only when the instruction has register ties that are different from the ties that are specified in the instruction's description. llvm-svn: 245482	2015-08-19 19:05:34 +00:00
Nemanja Ivanovic	5f1cea4141	Temporary fix for the self-host failures introduced by rL244921. This revision has introduced an issue that only affects a bootstrapped compiler when it is printing the ASM. I am working on resolving the issue, but in the meantime, I'm disabling the legalization of scalar_to_vector operation for v2i64 and the associated testing until I can get this fixed. llvm-svn: 245481	2015-08-19 19:04:47 +00:00
Alex Lorenz	e66a7ccf77	MIR Serialization: Serialize defined registers that require 'def' register flag. The defined registers are already serialized - they are represented by placing them before the '=' in a machine instruction. However, certain instructions like INLINEASM can have defined register operands after the '=', so this commit introduces the 'def' register flag for such operands. llvm-svn: 245480	2015-08-19 18:55:47 +00:00
Bruno Cardoso Lopes	27fd06922b	[PeepholeOptimizer] Look through PHIs to find additional register sources Reintroduce r245442. Remove an overly conservative assertion introduced in r245442. We could replace the assertion to use `shareSameRegisterFile` instead, but at that point in `insertPHI` we already lost the original Def subreg to check against. So drop the assertion completely. Original commit message: - Teaches the ValueTracker in the PeepholeOptimizer to look through PHI instructions. - Add findNextSourceAndRewritePHI method to look into multiple sources returned by the ValueTracker and rewrite PHIs with new sources. With these changes we can find more register sources and rewrite more copies to allow coalescing of bitcast instructions. Hence, we eliminate unnecessary VR64 <-> GR64 copies in x86, but it could be extended to other archs by marking "isBitcast" on target specific instructions. The x86 example follows: A: psllq %mm1, %mm0 movd %mm0, %r9 jmp C B: por %mm1, %mm0 movd %mm0, %r9 jmp C C: movd %r9, %mm0 pshufw $238, %mm0, %mm0 Becomes: A: psllq %mm1, %mm0 jmp C B: por %mm1, %mm0 jmp C C: pshufw $238, %mm0, %mm0 Differential Revision: http://reviews.llvm.org/D11197 rdar://problem/20404526 llvm-svn: 245479	2015-08-19 18:53:36 +00:00
Douglas Katzman	e5485c651e	[SPARC] Enable writing to floating-point-state register. llvm-svn: 245475	2015-08-19 18:34:48 +00:00
Ahmed Bougacha	9e00ec6195	[AArch64] Improve short-form diags on long-form Match_InvalidOperand. Since r244955, we try to use the short-form ErrorInfo when both tries failed, and the long-form match failed on a suffix operand. However, this means we sometimes mix ErrorInfo and MatchResult (one manifestation of this being PR24498). Instead, restore both. llvm-svn: 245469	2015-08-19 17:40:19 +00:00
Derek Schuff	55817ee604	x32. Fixes a bug in x32 exception handling. This patch updates the X86 lowering so that the Exception Pointer and Selector are 64-bit wide only if Subtarget.isTarget64BitLP64. Patch by João Porto Reviewers: dschuff, rnk Differential Revision: http://reviews.llvm.org/D12111 llvm-svn: 245454	2015-08-19 16:28:21 +00:00
JF Bastien	5ab87edbb4	x32. Fixes jmp %reg in x32 x32 has 32-bit pointers; x86-64 can't jmp %r32. This patch addresses this issue by explicitly zero-extending brind's target to 64-bits. Author: jpp Reviewers: jfb, dschuff, pavel.v.chupin Subscribers: llvm-commits Differential revision: http://reviews.llvm.org/D12112 llvm-svn: 245452	2015-08-19 16:17:08 +00:00
Bruno Cardoso Lopes	61009142b8	Revert "[PeepholeOptimizer] Look through PHIs to find additional register sources" Revert r245442 while investigating a fix. An assertion hit in http://lab.llvm.org:8080/green/job/clang-stage1-configure-RA_build/11380 llvm-svn: 245446	2015-08-19 15:10:32 +00:00
James Y Knight	d966fb6fef	[SPARC] Fix BooleanContents, so that select of a trunc doesn't eliminate the trunc. Differential Revision: http://reviews.llvm.org/D10442 llvm-svn: 245444	2015-08-19 14:47:04 +00:00
Bruno Cardoso Lopes	0a1c126684	[PeepholeOptimizer] Look through PHIs to find additional register sources Reapply r243486. - Teaches the ValueTracker in the PeepholeOptimizer to look through PHI instructions. - Add findNextSourceAndRewritePHI method to look up multiple sources returned by the ValueTracker and rewrite PHIs with new sources. With these changes we can find more register sources and rewrite more copies to allow coalescing of bitcast instructions. Hence, we eliminate unnecessary VR64 <-> GR64 copies in x86, but it could be extended to other archs by marking "isBitcast" on target specific instructions. The x86 example follows: A: psllq %mm1, %mm0 movd %mm0, %r9 jmp C B: por %mm1, %mm0 movd %mm0, %r9 jmp C C: movd %r9, %mm0 pshufw $238, %mm0, %mm0 Becomes: A: psllq %mm1, %mm0 jmp C B: por %mm1, %mm0 jmp C C: pshufw $238, %mm0, %mm0 Differential Revision: http://reviews.llvm.org/D11197 rdar://problem/20404526 llvm-svn: 245442	2015-08-19 14:34:41 +00:00
Silviu Baranga	ad1b19fcb7	[ARM] Add instruction selection patterns for vmin/vmax Summary: The mid-end was generating vector smin/smax/umin/umax nodes, but we were using vbsl to generate the code. This adds the vmin/vmax patterns and a test to check that we are now generating vmin/vmax instructions. Reviewers: rengolin, jmolloy Subscribers: aemerson, rengolin, llvm-commits Differential Revision: http://reviews.llvm.org/D12105 llvm-svn: 245439	2015-08-19 14:11:27 +00:00
Joerg Sonnenberger	7d180c59bb	Map %fprs to %asr6 in the Sparc assembler parser. llvm-svn: 245437	2015-08-19 13:55:14 +00:00
Tobias Grosser	85508e804b	Revert "[X86] Widen the 'AND' mask if doing so shrinks the encoding size" This reverts commit 245169 which miscompiles MultiSource/Applications/siod from LNT. llvm-svn: 245432	2015-08-19 11:35:10 +00:00
Michael Kuperstein	9fe42604aa	[X86] Do not lower scalar sdiv/udiv to a shifts + mul sequence when optimizing for minsize There are some cases where the mul sequence is smaller, but for the most part, using a div is preferable. This does not apply to vectors, since x86 doesn't have vector idiv, and a vector mul/shifts sequence ought to be smaller than a scalarized division. Differential Revision: http://reviews.llvm.org/D12082 llvm-svn: 245431	2015-08-19 11:21:43 +00:00
Hal Finkel	0ef2b10f16	Fix how DependenceAnalysis calls delinearization Fix how DependenceAnalysis calls delinearization, mirroring what is done in Delinearization.cpp (mostly by making sure to call getSCEVAtScope before delinearizing, and by removing the unnecessary 'Pairs == 1' check). Patch by Vaivaswatha Nagaraj! llvm-svn: 245408	2015-08-19 02:56:36 +00:00
Eric Christopher	0efe9f60bb	Revert "Fix PR24469 resulting from r245025 and re-enable dead store elimination across basicblocks." This is causing bootstrap problems, e.g.: http://bb.pgr.jp/builders/clang-3stage-i686-linux/builds/2960 This reverts r245195. llvm-svn: 245402	2015-08-19 02:15:13 +00:00
Hal Finkel	a8d205f145	Make ScalarEvolution::isKnownPredicate a little smarter Here we make ScalarEvolution::isKnownPredicate, indirectly, a little smarter. Given some relational comparison operator OP, and two AddRec SCEVs, {I,+,S} OP {J,+,T}, we can reduce this to the comparison I OP J when S == T, both AddRecs are for the same loop, and both are known not to wrap. As it turns out, because of the way that backedge-guard expressions can be leveraged when computing known predicates, this allows indvars to simplify the if-statement comparison in this loop: void foo (int *a, int *b, int n) { for (int i = 0; i < n; ++i) { if (i > n) a[i] = b[i] + 1; } } which, somewhat surprisingly, we were not previously optimizing away. llvm-svn: 245400	2015-08-19 01:51:51 +00:00
Chih-Hung Hsieh	fdcf541871	Split ARM and AArch64 emutls.ll test Differential Revision: http://reviews.llvm.org/D12127 llvm-svn: 245399	2015-08-19 01:44:51 +00:00
Alex Lorenz	df9e3c6fb0	MIR Serialization: Serialize MMI's variable debug information. llvm-svn: 245396	2015-08-19 00:13:25 +00:00
Quentin Colombet	861ad97e6f	[BasicAA] Add a test for PR24468 to be sure we won't regress when we finally get the GEP aliasing right. llvm-svn: 245395	2015-08-19 00:08:26 +00:00
Quentin Colombet	b700e357b5	[BasicAA] Revert r221876 because it can produce incorrect aliasing information: see PR24468. llvm-svn: 245394	2015-08-19 00:07:20 +00:00
Alex Lorenz	607efb6c7e	MIR Parser: Return true on error when parsing standalone registers. llvm-svn: 245384	2015-08-18 22:57:36 +00:00
Alex Lorenz	f3630113cd	MIR Serialization: Serialize the operand's bit mask target flags. This commit adds support for bit mask target flag serialization to the MIR printer and the MIR parser. It also adds support for the machine operand's target flag serialization to the AArch64 target. Reviewers: Duncan P. N. Exon Smith llvm-svn: 245383	2015-08-18 22:52:15 +00:00
Alex Lorenz	a314d81328	MIR Serialization: Serialize the frame information's stack protector index. llvm-svn: 245372	2015-08-18 22:26:26 +00:00
David Majnemer	c6bb0e2a51	[InstSimplify] Don't assume getAggregateElement will succeed It isn't always possible to get a value from getAggregateElement. This fixes PR24488. llvm-svn: 245365	2015-08-18 22:07:25 +00:00
Joerg Sonnenberger	b0ce8747c3	Load/store instructions for floating points with address space require SparcV9. To properly handle this, define the *a instructions as separate instruction classes by refactoring the LoadA and StoreA multiclasses. Move the instruction tests into the sparcv9 file to test the difference. llvm-svn: 245360	2015-08-18 21:31:46 +00:00
Simon Pilgrim	4bce6d6b73	[X86] Refreshed sign extension tests. llvm-svn: 245358	2015-08-18 21:21:35 +00:00
Simon Pilgrim	ce30ae62e2	[X86][AVX] Added shuffle concatenation tests llvm-svn: 245351	2015-08-18 20:51:15 +00:00
Matthias Braun	fa3b248a66	DAGCombiner: Improve DAGCombiner select normalization The current code normalizes select(C0, x, select(C1, x, y)) towards select(C0\|C1, x, y) if the targets prefers that form. This patch adds an additional rule that if the select(C1, x, y) part already exists in the function then we want to normalize into the other direction because the effects of reusing the existing value are bigger than transforming into the target preferred form. This addresses regressions following r238793, see also: http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20150727/290272.html Differential Revision: http://reviews.llvm.org/D11616 llvm-svn: 245350	2015-08-18 20:48:36 +00:00
Matthias Braun	2e920bd04f	DAGCombiner: Optimize SELECTs first before turning them into SELECT_CC This is part of http://reviews.llvm.org/D11616 - I just decided to split this up into a separate commit. llvm-svn: 245349	2015-08-18 20:48:29 +00:00
Simon Pilgrim	edaba3b7c3	Updated constants to give more useful min/max constant folding tests llvm-svn: 245348	2015-08-18 20:46:48 +00:00
David Majnemer	0ad363eebc	[WinEH] Calculate state numbers for the new EH representation State numbers are calculated by performing a walk from the innermost funclet to the outermost funclet. Rudimentary support for the new EH constructs has been added to the assembly printer, just enough to test the new machinery. Differential Revision: http://reviews.llvm.org/D12098 llvm-svn: 245331	2015-08-18 19:07:12 +00:00
Alex Lorenz	eb7c9be43c	MIR Parser: Implicit register verifier should accept unexpected implicit subregister operands. llvm-svn: 245315	2015-08-18 17:17:13 +00:00
Daniel Sanders	63f4a5dcad	[mips] Expand JAL instructions when PIC is enabled. Summary: This is the correct way to handle JAL instructions when PIC is enabled. Patch by Toma Tabacu Reviewers: seanbruno, tomatabacu Subscribers: brooks, seanbruno, emaste, llvm-commits Differential Revision: http://reviews.llvm.org/D6231 llvm-svn: 245305	2015-08-18 16:18:09 +00:00
Davide Italiano	485cb66d0e	[MC] Convert another bunch of tests from macho-dump to llvm-readobj. This is (almost) everything under MC/MachO/ARM. There are still some cases missing, because llvm-readobj doesn't (yet) support some features, that macho-dump provides. I plan to reduce the gap between them shortly. llvm-svn: 245302	2015-08-18 16:05:13 +00:00
Zoran Jovanovic	2fe8466f6e	[mips][microMIPS] Implement DDIV, DMOD, DDIVU and DMODU instructions Differential Revision: http://reviews.llvm.org/D10953 llvm-svn: 245297	2015-08-18 14:40:43 +00:00
Zoran Jovanovic	a6593ff613	[mips][microMIPS] Implement SW and SWE instructions Differential Revision: http://reviews.llvm.org/D10869 llvm-svn: 245293	2015-08-18 12:53:08 +00:00
Simon Pilgrim	9f4374d361	Fixed max/min typo in test names llvm-svn: 245278	2015-08-18 09:02:51 +00:00
Simon Pilgrim	19ffd57f45	[X86][SSE] Added constant SMAX/SMIN/UMAX/UMIN tests Constant folding patch to follow soon llvm-svn: 245276	2015-08-18 08:52:43 +00:00
Simon Pilgrim	08d823afe4	[X86][SSE] Added extra vector truncation tests. Including cases for PR14866 llvm-svn: 245274	2015-08-18 08:37:09 +00:00
Justin Bogner	9f00ebaeda	Revert "Constant propagation after hiting llvm.assume" This was also failing bootstrap: http://lab.llvm.org:8080/green/job/clang-stage2-configure-Rlto_build This reverts r245265. llvm-svn: 245269	2015-08-18 07:00:34 +00:00
Piotr Padlewski	94ca3783b8	Constant propagation after hiting llvm.assume After hitting @llvm.assume(X) we can: - propagate the equality X == true - if X is an icmp/fcmp (with an eq operation) and one of its operands is constant, replace all uses of the variable with the constant in the same BasicBlock http://reviews.llvm.org/D11918 llvm-svn: 245265	2015-08-18 03:55:30 +00:00
Guozhi Wei	f66d384443	Align SP adjustment in function getSPAdjust This commit adds a new function TargetFrameLowering::alignSPAdjust and calls it from TargetInstrInfo::getSPAdjust. It fixes PR24142. llvm-svn: 245253	2015-08-17 22:36:27 +00:00
Alex Lorenz	a56ba6a6dd	MIR Serialization: Serialize the local offsets for the stack objects. llvm-svn: 245249	2015-08-17 22:17:42 +00:00
Alex Lorenz	eb62568625	MIR Serialization: Serialize the memory operand's range metadata node. llvm-svn: 245247	2015-08-17 22:09:52 +00:00
Alex Lorenz	03e940d1f8	MIR Serialization: Serialize the memory operand's noalias metadata node. llvm-svn: 245246	2015-08-17 22:08:02 +00:00
Alex Lorenz	a16f624dc3	MIR Serialization: Serialize the memory operand's alias scope metadata node. llvm-svn: 245245	2015-08-17 22:06:40 +00:00
Alex Lorenz	a617c9162d	MIR Serialization: Serialize the memory operand's TBAA metadata node. llvm-svn: 245244	2015-08-17 22:05:15 +00:00
David Majnemer	83f4bb23c4	[WinEHPrepare] Replace unreasonable funclet terminators with unreachable It is possible to be in a situation where more than one funclet token is a valid SSA value. If we see a terminator which exits a funclet which doesn't use the funclet's token, replace it with unreachable. Differential Revision: http://reviews.llvm.org/D12074 llvm-svn: 245238	2015-08-17 20:56:39 +00:00
Douglas Katzman	685a7d1a70	[SPARC]: recognize '.' as the start of an assembler expression. llvm-svn: 245232	2015-08-17 19:55:01 +00:00
James Molloy	974838f294	[ARM] Fix crash when targeting CPU without NEON We emulate a scalar vmin/vmax with NEON instructions as they don't exist in the VFP ISA. So only mark these as legal when NEON is available. Found here: https://code.google.com/p/chromium/issues/detail?id=521671 llvm-svn: 245231	2015-08-17 19:37:12 +00:00
Silviu Baranga	b322aa6f53	[CostModel][AArch64] Increase cost of vector insert element and add missing cast costs Summary: Increase the estimated costs for insert/extract element operations on AArch64. This is motivated by results from benchmarking interleaved accesses. Add missing costs for zext/sext/trunc instructions and some integer to floating point conversions. These costs were previously calculated by scalarizing these operation and were affected by the cost increase of the insert/extract element operations. Reviewers: rengolin Subscribers: mcrosier, aemerson, rengolin, llvm-commits Differential Revision: http://reviews.llvm.org/D11939 llvm-svn: 245226	2015-08-17 16:05:09 +00:00
Silviu Baranga	d5ac26937c	[CostModel][ARM] Increase cost of insert/extract operations Summary: This change limits the minimum cost of an insert/extract element operation to 2 in cases where this would result in mixing of NEON and VFP code. Reviewers: rengolin Subscribers: mssimpso, aemerson, llvm-commits, rengolin Differential Revision: http://reviews.llvm.org/D12030 llvm-svn: 245225	2015-08-17 15:57:05 +00:00
Artur Pilipenko	34d8ba84c8	Take alignment into account in isSafeToSpeculativelyExecute and isSafeToLoadUnconditionally. Reviewed By: hfinkel, sanjoy, MatzeB Differential Revision: http://reviews.llvm.org/D9791 llvm-svn: 245223	2015-08-17 15:54:26 +00:00
Joseph Tremoulet	7031c9fc2e	[WinEHPrepare] Fix catchret successor phi demotion Summary: When demoting an SSA value that has a use on a phi and one of the phi's predecessors terminates with catchret, the edge needs to be split and the load inserted in the new block, else we'll still have a cross-funclet SSA value. Add a test for this, and for the similar case where a def to be spilled is on an invoke and a critical edge, which was already implemented but missing a test. Reviewers: majnemer Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D12065 llvm-svn: 245218	2015-08-17 13:51:37 +00:00
Daniel Sanders	a39ef1c68f	[mips] [IAS] Add support for the DLA pseudo-instruction and fix problems with DLI Summary: It is the same as LA, except that it can also load 64-bit addresses and it only works on 64-bit MIPS architectures. Reviewers: tomatabacu, seanbruno, vkalintiris Subscribers: brooks, seanbruno, emaste, llvm-commits Differential Revision: http://reviews.llvm.org/D9524 llvm-svn: 245208	2015-08-17 10:11:55 +00:00
Michael Kuperstein	adc4e9c414	[GMR] isNonEscapingGlobalNoAlias() should look through Bitcasts/GEPs when looking at loads. This fixes yet another case from PR24288. Differential Revision: http://reviews.llvm.org/D12064 llvm-svn: 245207	2015-08-17 10:06:08 +00:00
James Molloy	ef183397b1	Generate FMINNAN/FMINNUM/FMAXNAN/FMAXNUM from SDAGBuilder. These only get generated if the target supports them. If one of the variants is not legal and the other is, and it is safe to do so, the other variant will be emitted. For example on AArch32 (V8), we have scalar fminnm but not fmin. Fix up a couple of tests while we're here - one now produces better code, and the other was just plain wrong to start with. llvm-svn: 245196	2015-08-17 07:13:10 +00:00
Karthik Bhat	3af28945b9	Fix PR24469 resulting from r245025 and re-enable dead store elimination across basicblocks. PR24469 resulted because DeleteDeadInstruction in handleNonLocalStoreDeletion was deleting the next basic block iterator. Fixed the same by resetting the basic block iterator post call to DeleteDeadInstruction. llvm-svn: 245195	2015-08-17 05:51:39 +00:00
David Majnemer	8ed559ad22	Revert "[InstCombinePHI] Partial simplification of identity operations." This reverts commit r244887, it caused PR24470. llvm-svn: 245194	2015-08-17 03:11:26 +00:00
Chandler Carruth	2f1fd1658f	[PM] Port ScalarEvolution to the new pass manager. This change makes ScalarEvolution a stand-alone object and just produces one from a pass as needed. Making this work well requires making the object movable, using references instead of overwritten pointers in a number of places, and other refactorings. I've also wired it up to the new pass manager and added a RUN line to a test to exercise it under the new pass manager. This includes basic printing support much like with other analyses. But there is a big and somewhat scary change here. Prior to this patch ScalarEvolution was never actually invalidated!!! Re-running the pass just re-wired up the various other analyses and didn't remove any of the existing entries in the SCEV caches or clear out anything at all. This might seem OK as everything in SCEV that can, uses ValueHandles to track updates to the values that serve as SCEV keys. However, this still means that as we ran SCEV over each function in the module, we kept accumulating more and more SCEVs into the cache. At the end, we would have a SCEV cache with every value that we ever needed a SCEV for in the entire module!!! Yowzers. The releaseMemory routine would dump all of this, but that isn't really called during normal runs of the pipeline as far as I can see. To make matters worse, there is actually a key that we don't update with value handles -- there is a map keyed off of Loops. Because LoopInfo does release its memory from run to run, it is entirely possible to run SCEV over one function, then over another function, and then look up a Loop* from the second function but find an entry inserted for the first function! Ouch. To make matters still worse, there are plenty of updates that don't trip a value handle. It seems incredibly unlikely that today GVN or another pass that invalidates SCEV can update values in just such a way that a subsequent run of SCEV will incorrectly find lookups in a cache, but it is theoretically possible and would be a nightmare to debug. With this refactoring, I've fixed all this by actually destroying and recreating the ScalarEvolution object from run to run. Technically, this could increase the amount of malloc traffic we see, but then again it is also technically correct. ;] I don't actually think we're suffering from tons of malloc traffic from SCEV because if we were, the fact that we never clear the memory would seem more likely to have come up as an actual problem before now. So, I've made the simple fix here. If in fact there are serious issues with too much allocation and deallocation, I can work on a clever fix that preserves the allocations (while clearing the data) between each run, but I'd prefer to do that kind of optimization with a test case / benchmark that shows why we need such cleverness (and that can test that we actually make it faster). It's possible that this will make some things faster by making the SCEV caches have higher locality (due to being significantly smaller) so until there is a clear benchmark, I think the simple change is best. Differential Revision: http://reviews.llvm.org/D12063 llvm-svn: 245193	2015-08-17 02:08:17 +00:00
Sanjay Patel	57fd1dc5db	transform fmin/fmax calls when possible (PR24314) If we can ignore NaNs, fmin/fmax libcalls can become compare and select (this is what we turn std::min / std::max into). This IR should then be optimized in the backend to whatever is best for any given target. Eg, x86 can use minss/maxss instructions. This should solve PR24314: https://llvm.org/bugs/show_bug.cgi?id=24314 Differential Revision: http://reviews.llvm.org/D11866 llvm-svn: 245187	2015-08-16 20:18:19 +00:00
David Majnemer	e04443baff	Revert "Add support for cross block dse. This patch enables dead stroe elimination across basicblocks." This reverts commit r245025, it caused PR24469. llvm-svn: 245172	2015-08-16 07:11:59 +00:00
David Majnemer	dfa3b09541	[InstCombine] Replace an and+icmp with a trunc+icmp Bitwise arithmetic can obscure a simple sign-test. Replacing the mask with a truncate is preferable if the type is legal, because it permits us to rephrase the comparison more explicitly. llvm-svn: 245171	2015-08-16 07:09:17 +00:00
David Majnemer	1a59e49f3c	[X86] Widen the 'AND' mask if doing so shrinks the encoding size We can set additional bits in a mask given that we know the other operand of an AND already has some bits set to zero. This can be more efficient if doing so allows us to use an instruction which implicitly sign extends the immediate. This fixes PR24085. Differential Revision: http://reviews.llvm.org/D11289 llvm-svn: 245169	2015-08-16 04:52:11 +00:00
Sanjay Patel	40d4eb40f6	[x86] enable machine combiner reassociations for scalar single-precision minimums llvm-svn: 245166	2015-08-15 17:01:54 +00:00
Simon Pilgrim	d65ace84c7	Updated broadcast stack folding test to avoid use of broadcast intrinsics. llvm-svn: 245165	2015-08-15 16:54:18 +00:00
Sanjay Patel	3b7e3677e3	fix typos; NFC llvm-svn: 245164	2015-08-15 16:53:08 +00:00
Sanjay Patel	9f6c7dddd2	add test case to show current codegen llvm-svn: 245163	2015-08-15 16:49:50 +00:00
Simon Pilgrim	0750c84623	[DAGCombiner] Attempt to mask vectors before zero extension instead of after. For cases where we TRUNCATE and then ZERO_EXTEND to a larger size (often from vector legalization), see if we can mask the source data and then ZERO_EXTEND (instead of after an ANY_EXTEND). This can help avoid having to generate a larger mask, and possibly applying it to several sub-vectors. (zext (truncate x)) -> (zext (and(x, m))) Includes a minor patch to SystemZ to better recognise 8/16-bit zero extension patterns from RISBG bit-extraction code. This is the first of a number of minor patches to help improve the conversion of byte masks to clear mask shuffles. Differential Revision: http://reviews.llvm.org/D11764 llvm-svn: 245160	2015-08-15 13:27:30 +00:00
David Majnemer	0bc0eef71c	[IR] Give catchret an optional 'return value' operand Some personality routines require funclet exit points to be clearly marked, this is done by producing a token at the funclet pad and consuming it at the corresponding ret instruction. CleanupReturnInst already had a spot for this operand but CatchReturnInst did not. Other personality routines don't need to use this which is why it has been made optional. llvm-svn: 245149	2015-08-15 02:46:08 +00:00
JF Bastien	5e4303dc14	Accelerate MergeFunctions with hashing This patch makes the Merge Functions pass faster by calculating and comparing a hash value which captures the essential structure of a function before performing a full function comparison. The hash is calculated by hashing the function signature, then walking the basic blocks of the function in the same order as the main comparison function. The opcode of each instruction is hashed in sequence, which means that different functions according to the existing total order cannot have the same hash, as the comparison requires the opcodes of the two functions to be the same order. The hash function is a static member of the FunctionComparator class because it is tightly coupled to the exact comparison function used. For example, functions which are equivalent modulo a single variant callsite might be merged by a more aggressive MergeFunctions, and the hash function would need to be insensitive to these differences in order to exploit this. The hashing function uses a utility class which accumulates the values into an internal state using a standard bit-mixing function. Note that this is a different interface than a regular hashing routine, because the values to be hashed are scattered amongst the properties of a llvm::Function, not linear in memory. This scheme is fast because only one word of state needs to be kept, and the mixing function is a few instructions. The main runOnModule function first computes the hash of each function, and only further processes functions which do not have a unique function hash. The hash is also used to order the sorted function set. If the hashes differ, their values are used to order the functions, otherwise the full comparison is done. Both of these are helpful in speeding up MergeFunctions. Together they result in speedups of 9% for mysqld (a mostly C application with little redundancy), 46% for libxul in Firefox, and 117% for Chromium. (These are all LTO builds.) In all three cases, the new speed of MergeFunctions is about half that of the module verifier, making it relatively inexpensive even for large LTO builds with hundreds of thousands of functions. The same functions are merged, so this change is free performance. Author: jrkoenig Reviewers: nlewycky, dschuff, jfb Subscribers: llvm-commits, aemerson Differential revision: http://reviews.llvm.org/D11923 llvm-svn: 245140	2015-08-15 01:18:18 +00:00
Matt Arsenault	427a0fd22e	LoopStrengthReduce: Try to pass address space to isLegalAddressingMode This seems to only work some of the time. In some situations, this seems to use a nonsensical type and isn't actually aware of the memory being accessed. e.g. if branch condition is an icmp of a pointer, it checks the addressing mode of i1. llvm-svn: 245137	2015-08-15 00:53:06 +00:00
Matt Arsenault	297ae311ce	AMDGPU/SI: Fix printing useless info with amdhsa The comments at the bottom would all report 0 if amdhsa was used. llvm-svn: 245135	2015-08-15 00:12:39 +00:00

31711 Commits