llvm-project

Commit Graph

Author	SHA1	Message	Date
Mitch Bodart	e60465ddf7	[X86] Enable the post-RA-scheduler for clang's default 32-bit cpu. For compilations with no explicit cpu specified, this exhibits nice gains on Silvermont, with neutral performance on big cores. Differential Revision: http://reviews.llvm.org/D19138 llvm-svn: 267809	2016-04-27 22:52:35 +00:00
Kevin Enderby	c493085c8d	Fix a bug in llvm-objdump printing of 32-bit addresses for -section in non i386 and x86 files. rdar://25896202 llvm-svn: 267807	2016-04-27 22:36:18 +00:00
Quentin Colombet	bf200688de	[X86][FastISel] Make sure we use the right register class when we select stores. llvm-svn: 267806	2016-04-27 22:33:42 +00:00
Rong Xu	4b1dc5d60b	Fix buildbot failure due to r267792 Relax the test check as some targets do not have name compression. llvm-svn: 267803	2016-04-27 22:06:35 +00:00
Colin LeMahieu	a3782da3e3	[Hexagon] Merging nops in to previous packet rather than always creating a new one. llvm-svn: 267798	2016-04-27 21:37:44 +00:00
Quentin Colombet	d6dbec4c6f	[X86] Fix the lowering of TLS calls. The callseq_end node must be glued with the TLS calls, otherwise, the generic code will miss the uses of the returned value and will mark it dead. Moreover, TLSCall 64-bit pseudo must not set an implicit-use on RDI, the pseudo uses the symbol address at this point not RDI and the lowering will do the right thing. llvm-svn: 267797	2016-04-27 21:37:37 +00:00
Rong Xu	af5aebaa32	[PGO] Prohibit address recording if the function is both internal and COMDAT Differential Revision: http://reviews.llvm.org/D19515 llvm-svn: 267792	2016-04-27 21:17:30 +00:00
Matt Arsenault	0547b016b1	AMDGPU: Account for globals in AMDGPUPromoteAlloca pass Patch by Bas Nieuwenhuizen llvm-svn: 267791	2016-04-27 21:05:08 +00:00
Kevin Enderby	60c4e6aafa	Add a test case for the crash fixed with r267037. David Blaikie said it would be nice to have! This was crashing llvm-objdump with -macho -objc-meta-data when trying dump a non-existent section. So the test binary is simply created from an empty .s file compiled with: clang -arch armv7 empty.s -c llvm-svn: 267782	2016-04-27 20:37:06 +00:00
Ahmed Bougacha	9e71425f54	[AArch64] Set correct successors in CMPXCHG pseudo expansion. transferSuccessors() would LoadCmpBB a successor of DoneBB, whereas it should be a successor of the original MBB. Follow-up to r266339. Unfortunately, it's tricky to catch this in the verifier. llvm-svn: 267779	2016-04-27 20:33:02 +00:00
Ahmed Bougacha	b4af107239	[ARM] Set correct successors in CMPXCHG pseudo expansion. transferSuccessors() would LoadCmpBB a successor of DoneBB, whereas it should be a successor of the original MBB. The testcase changes are caused by Thumb2SizeReduction, which was previously confused by the broken CFG. Follow-up to r266679. Unfortunately, it's tricky to catch this in the verifier. llvm-svn: 267778	2016-04-27 20:32:54 +00:00
Simon Pilgrim	3f595aabe2	[InstCombine][AVX2] Add AVX2 per-element vector shift tests At the moment we don't simplify PSRAV/PSRLV/PSLLV intrinsics to generic IR for constant shift amounts, but we could. llvm-svn: 267777	2016-04-27 20:25:34 +00:00
Kevin B. Smith	c378a99ba5	[X86]: Quit promoting 16 bit loads to 32 bit. Differential Revision: http://reviews.llvm.org/D19592 llvm-svn: 267773	2016-04-27 19:58:03 +00:00
David Majnemer	0c80e2eac6	[CodeGenPrepare] Don't sink a cast past its user The sink cast machinery is supposed to sink casts as close to their user as possible. However, an EH pad is the first instruction in it's basic block. Don't sink if the user is an EH pad. This fixes PR27536. llvm-svn: 267767	2016-04-27 19:36:38 +00:00
Ahmed Bougacha	ace97c1f7d	[LIR] Set attributes on memset_pattern16. "inferattrs" will deduce the attribute, but it will be too late for many optimizations. Set it ourselves when creating the call. Differential Revision: http://reviews.llvm.org/D17598 llvm-svn: 267762	2016-04-27 19:04:50 +00:00
Ahmed Bougacha	44c19876c7	[InferAttrs] Mark memset_pattern16 params nocapture. Differential Revision: http://reviews.llvm.org/D19471 llvm-svn: 267760	2016-04-27 19:04:43 +00:00
Chad Rosier	03e1647d19	Revert "[AMDGPU][llvm-mc] Add support of TTMP quads. Rework M0 exclusion for SMRD." This reverts commit r267733 due to a -Werror,-Wunused-function error. llvm-svn: 267752	2016-04-27 18:29:11 +00:00
Matthew Simpson	622b95be7b	[LV] Reallow positive-stride interleaved load groups with gaps We previously disallowed interleaved load groups that may cause us to speculatively access memory out-of-bounds (r261331). We did this by ensuring each load group had an access corresponding to the first and last member. Instead of bailing out for these interleaved groups, this patch enables us to peel off the last vector iteration, ensuring that we execute at least one iteration of the scalar remainder loop. This solution was proposed in the review of the previous patch. Differential Revision: http://reviews.llvm.org/D19487 llvm-svn: 267751	2016-04-27 18:21:36 +00:00
Marcin Koscielnicki	7efdca5622	[Mips] Add support for llvm.thread.pointer intrinsic. This will be used to implement __builtin_thread_pointer in clang. Differential Revision: http://reviews.llvm.org/D19569 llvm-svn: 267743	2016-04-27 17:21:49 +00:00
Gerolf Hoflehner	88017c08a6	[InstCombine] Sharpended test case in pr21210.ll llvm-svn: 267742	2016-04-27 17:19:54 +00:00
Artem Tamazov	3896f8f83d	[AMDGPU][llvm-mc] Add support of TTMP quads. Rework M0 exclusion for SMRD. Added support of TTMP quads. Reworked M0 exclusion machinery for SMRD and similar instructions to enable usage of TTMP registers in those instructions as destinations. Tests added. Differential Revision: http://reviews.llvm.org/D19342 llvm-svn: 267733	2016-04-27 16:20:23 +00:00
Reid Kleckner	0336cc05e7	[PDB] Fix function names for private symbols in PDBs Summary: llvm-symbolizer wants to get linkage names of functions for historical reasons. Linkage names are only recorded in the PDB for public symbols, and the linkage name is apparently stored separately in some "public symbol" record. We had a workaround in PDBContext which would look for such symbols when the user requested linkage names. However, when given an address that was truly in a private function and public funciton, we would accidentally find nearby public symbols and return those function names. The fix is to look for both function symbols and public symbols and only prefer the public symbol name if the addresses of the symbols agree. Fixes PR27492 Reviewers: zturner Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D19571 llvm-svn: 267732	2016-04-27 16:10:29 +00:00
Nicolai Haehnle	f66bdb5ea8	AMDGPU/SI: Add llvm.amdgcn.s.waitcnt.all intrinsic Summary: So it appears that to guarantee some of the ordering requirements of a GLSL memoryBarrier() executed in the shader, we need to emit an s_waitcnt. (We can't use an s_barrier, because memoryBarrier() may appear anywhere in the shader, in particular it may appear in non-uniform control flow.) Reviewers: arsenm, mareko, tstellarAMD Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D19203 llvm-svn: 267729	2016-04-27 15:46:01 +00:00
Matthew Simpson	e5dfb08fcb	[TTI] Add hook for vector extract with extension This change adds a new hook for estimating the cost of vector extracts followed by zero- and sign-extensions. The motivating example for this change is the SMOV and UMOV instructions on AArch64. These instructions move data from vector to general purpose registers while performing the corresponding extension (sign-extend for SMOV and zero-extend for UMOV) at the same time. For these operations, TargetTransformInfo can assume the extensions are free and only report the cost of the vector extract. The SLP vectorizer has been updated to make use of the new hook. Differential Revision: http://reviews.llvm.org/D18523 llvm-svn: 267725	2016-04-27 15:20:21 +00:00
Artem Tamazov	5cd55b1784	[AMDGPU][llvm-mc] s_getreg/setreg* - Support symbolic names of hardware registers. Possibility to specify code of hardware register kept. Disassemble to symbolic name, if name is known. Tests updated/added. Differential Revision: http://reviews.llvm.org/D19335 llvm-svn: 267724	2016-04-27 15:17:03 +00:00
Nico Weber	e69b9548b8	Revert r267649, it caused PR27539. llvm-svn: 267723	2016-04-27 15:16:54 +00:00
Kristof Beyls	a14e9a5576	Remove size 1 from check as that isn't part of what the test is meant to be testing. This test also runs on e.g. ARM-native builds when the X86 backend is also built. This test produces code for the default instruction set, even though it is in a "X86" sub-directory. Given that this test doesn't seem to be testing anything architecture-specific, it seems it's best to adapt the check to not check for an architecture-dependent value (the size of the function), rather than hard-code the test to target x86. llvm-svn: 267722	2016-04-27 15:03:09 +00:00
Teresa Johnson	02e98331c0	[ThinLTO] Use valueid instead of bitcode offsets in combined index file Summary: With the removal of support for lazy parsing of combined index summary records (e.g. r267344), we no longer need to include the summary record bitcode offset in the VST entries for definitions. Change the combined index format to be similar to the per-module index format in using value ids to cross-reference from the summary record to the VST entry (rather than the summary record bitcode offset to cross-reference in the other direction). The visible changes are: 1) Add the value id to the combined summary records 2) Remove the summary offset from the combined VST records, which has the following effects: - No longer need the VST_CODE_COMBINED_GVDEFENTRY record, as all combined index VST entries now only contain the value id and corresponding GUID. - No longer have duplicate VST entries in the case where there are multiple definitions of a symbol (e.g. weak/linkonce), as they all have the same value id and GUID. An implication of #2 above is that in order to hook up an alias to the correct aliasee based on the value id of the aliasee recorded in the combined index alias record, we need to scan the entries in the index for that GUID to find the one from the same module (i.e. the case where there are multiple entries for the aliasee). But the reader no longer has to maintain a special map to hook up the alias/aliasee. Reviewers: joker.eph Subscribers: joker.eph, llvm-commits Differential Revision: http://reviews.llvm.org/D19481 llvm-svn: 267712	2016-04-27 13:28:35 +00:00
Simon Pilgrim	f23aa2a9c9	[InstCombine][SSE] Regenerated vector shift tests llvm-svn: 267699	2016-04-27 12:04:44 +00:00
Zlatko Buljan	de0bbe6d1c	[mips][microMIPS] Add CodeGen support for SUBU16, SUB, SUBU, DSUB and DSUBU instructions Differential Revision: http://reviews.llvm.org/D16676 llvm-svn: 267694	2016-04-27 11:31:44 +00:00
Zlatko Buljan	29813620bc	[mips][microMIPS] Add CodeGen support for SLL16, SRL16, SLL, SLLV, SRA, SRAV, SRL and SRLV instructions Differential Revision: http://reviews.llvm.org/D17989 llvm-svn: 267693	2016-04-27 11:02:23 +00:00
Artur Pilipenko	c97eac6555	Use DL preferred alignment for alloca in Value::getPointerAlignment Teach Value::getPointerAlignment that allocas with no explicit alignment are aligned to preferred alignment of the allocated type. Reviewed By: hfinkel Differential Revision: http://reviews.llvm.org/D17569 llvm-svn: 267689	2016-04-27 10:42:29 +00:00
Simon Pilgrim	d2ea708739	[InstCombine][SSE] Added DemandedBits tests for MOVMSK instructions MOVMSK zeros the upper bits of the gpr - we should be able to use this. llvm-svn: 267686	2016-04-27 09:53:09 +00:00
Adam Nemet	d2fa414718	[LoopDist] Add llvm.loop.distribute.enable loop metadata Summary: D19403 adds a new pragma for loop distribution. This change adds support for the corresponding metadata that the pragma is translated to by the FE. As part of this I had to rethink the flag -enable-loop-distribute. My goal was to be backward compatible with the existing behavior: A1. pass is off by default from the optimization pipeline unless -enable-loop-distribute is specified A2. pass is on when invoked directly from opt (e.g. for unit-testing) The new pragma/metadata overrides these defaults so the new behavior is: B1. A1 + enable distribution for individual loop with the pragma/metadata B2. A2 + disable distribution for individual loop with the pragma/metadata The default value whether the pass is on or off comes from the initiator of the pass. From the PassManagerBuilder the default is off, from opt it's on. I moved -enable-loop-distribute under the pass. If the flag is specified it overrides the default from above. Then the pragma/metadata can further modifies this per loop. As a side-effect, we can now also use -enable-loop-distribute=0 from opt to emulate the default from the optimization pipeline. So to be precise this is the new behavior: C1. pass is off by default from the optimization pipeline unless -enable-loop-distribute or the pragma/metadata enables it C2. pass is on when invoked directly from opt unless -enable-loop-distribute=0 or the pragma/metadata disables it Reviewers: hfinkel Subscribers: joker.eph, mzolotukhin, llvm-commits Differential Revision: http://reviews.llvm.org/D19431 llvm-svn: 267672	2016-04-27 05:28:18 +00:00
Mehdi Amini	c7b950171d	Revert "Support "preserving" the summary information when using setModule() API in LTOCodeGenerator" This reverts commit r267665. ASAN shows that there is a use of undefined value. From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 267668	2016-04-27 05:11:44 +00:00
Mehdi Amini	360ed847bc	Support "preserving" the summary information when using setModule() API in LTOCodeGenerator Another attempt at r267655... From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 267665	2016-04-27 04:24:10 +00:00
Mehdi Amini	a1b8b6cd56	Revert "Support "preserving" the summary information when using setModule() API in LTOCodeGenerator" This reverts commit r267657, r267656, and r267655. The test does not pass on multiple bots, I'm unsure why yet but let's unbreak them. From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 267664	2016-04-27 03:34:28 +00:00
Evgeny Stupachenko	23ce61b663	The patch fixes PR27392. Summary: It is incorrect to compare TripCount (which is BECount + 1) with extraiters (or Count) to check if we should enter unrolled loop or not, because TripCount can potentially overflow (when BECount is max unsigned integer). While comparing BECount with (Count - 1) is overflow safe and therefore correct. Reviewer: hfinkel Differential Revision: http://reviews.llvm.org/D19256 From: Evgeny Stupachenko <evstupac@gmail.com> llvm-svn: 267662	2016-04-27 03:04:54 +00:00
Chuang-Yu Cheng	8676c3d599	[ppc64] fix bug in prologue that mfocrf's cr operand should be explict state instead of implicit This fixes PR27414 Reviewers: kbarton mgrang tjablin http://reviews.llvm.org/D19255 llvm-svn: 267660	2016-04-27 02:59:28 +00:00
Mehdi Amini	4960919723	Fix the test from r267656: Support "preserving" the summary information when using setModule() API in LTOCodeGenerator From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 267657	2016-04-27 01:49:11 +00:00
Mehdi Amini	1f65039f06	Add a test for r267655: Support "preserving" the summary information when using setModule() API in LTOCodeGenerator From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 267656	2016-04-27 01:47:46 +00:00
Ahmed Bougacha	19a2ee591a	[X86] Don't assume that MMX extractelts are from index 0. It's probably the case for all 3 MMX users out there, but with hand-crafted IR, you can trigger selection failures. Fix that. llvm-svn: 267652	2016-04-27 01:35:29 +00:00
Ahmed Bougacha	e68363a03c	[X86] Re-enable MMX i32 extractelt combine. This effectively adds back the extractelt combine removed by r262358: the direct case can still occur (because x86_mmx is special, see r262446), but it's the indirect case that's now superseded by the generic combine. llvm-svn: 267651	2016-04-27 01:35:25 +00:00
Cong Hou	6f879d9eb1	Detects the SAD pattern on X86 so that much better code will be emitted once the pattern is matched. Differential revision: http://reviews.llvm.org/D14840 llvm-svn: 267649	2016-04-27 01:29:18 +00:00
Mehdi Amini	b4e1e8297b	ThinLTO: do not promote GlobalVariable that have a specific section. Differential Revision: http://reviews.llvm.org/D18298 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 267646	2016-04-27 00:32:13 +00:00
Mehdi Amini	da168fbc2e	LTOCodeGenerator: turns linkonce(_odr) into weak_(odr) when present "MustPreserve" set Summary: If the linker requested to preserve a linkonce function, we should honor this even if we drop all uses. Reviewers: dexonsmith Subscribers: llvm-commits, joker.eph Differential Revision: http://reviews.llvm.org/D19527 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 267644	2016-04-27 00:32:02 +00:00
Philip Reames	3f83dbeed9	[LVI] Reduce compile time by lazily scanning blocks if needed When encountering a non-local pointer, LVI would eagerly scan the block for dereferences of the given object to prove the pointer to be non null. That's all well and good, but then we'd go recurse through our input blocks. As a result, we could end up scanning each and every block we traverse, even if the final definition was obviously non null or we found a constant value somewhere up the chain. The previous code papered over this by using the isKnownNonNull routine from value tracking. This made the duplication less painful in the common case. Instead, we know do the block scan only after we've gotten the recursive results back. This lets us stop scanning individual blocks as soon as we've determined it to be non-null in any predecessor block and use our usual merge rules to propagate that information cheaply through successor blocks. For a pointer which can be found non-null, this does strictly less work and sometimes substaintially so. Note that the case where we can't prove something non-null is still the really expensive case. We end up scanning each and every block looking for a dereference and never end up finding one. llvm-svn: 267642	2016-04-27 00:30:55 +00:00
Quentin Colombet	4ff3cfb673	[X86] Make sure it is safe to clobber EFLAGS, if need be, when choosing the prologue. Do not use basic blocks that have EFLAGS live-in as prologue if we need to realign the stack. Realigning the stack uses AND instruction and this clobbers EFLAGS. An other alternative would have been to save and restore EFLAGS around the stack realignment code, but this is likely inefficient. Fixes PR27531. llvm-svn: 267634	2016-04-26 23:44:14 +00:00
Justin Bogner	c2bf63d29d	PM: Port Reassociate to the new pass manager llvm-svn: 267631	2016-04-26 23:39:29 +00:00
Mitch Bodart	807e13379b	[X86] Replace -mcpu with -mattr in several tests Differential Revision: http://reviews.llvm.org/D19568 llvm-svn: 267629	2016-04-26 23:36:38 +00:00
Sanjay Patel	29dea0d230	[SimplifyCFG] propagate branch metadata when creating select llvm-svn: 267624	2016-04-26 23:15:48 +00:00
Quentin Colombet	08e79990a0	[MachineBasicBlock] Take advantage of the partially dead information. Thanks to that information we wouldn't lie on a register being live whereas it is not. llvm-svn: 267622	2016-04-26 23:14:29 +00:00
Quentin Colombet	3f19245015	[MachineInstrBundle] Improvement the recognition of dead definitions. Now, it is possible to know that partial definitions are dead definitions and recognize that clobbered registers are also dead. llvm-svn: 267621	2016-04-26 23:14:24 +00:00
Philip Reames	053c2a6f25	[LVI] Apply transfer rule for overdefine inputs for binary operators As pointed out by John Regehr over in http://reviews.llvm.org/D19485, LVI was being incredibly stupid about applying its transfer rules. Rather than gathering local facts from the expression itself, it was simply giving up entirely if one of the inputs was overdefined. This greatly impacts the precision of the overall analysis and makes it far more fragile as well. This patch builds on 267609 which did the same thing for unary casts. llvm-svn: 267620	2016-04-26 23:10:35 +00:00
Philip Reames	e5030e85ea	[LVI] A better fix for the assertion error introduced by 267609 Essentially, I was using the wrong size function. For types which were sized, but not primitive, I wasn't getting a useful size for the operand and failed an assert. I fixed this, and also added a guard that the input is a sized type. Test case is for the original mistake. I'm not sure how to actually exercise the sized type check. llvm-svn: 267618	2016-04-26 22:52:30 +00:00
Sanjay Patel	d2d2aa52cd	[LowerExpectIntrinsic] make default likely/unlikely ratio bigger We need the default ratio to be sufficiently large that it triggers transforms based on block frequency info (BFI) and plays well with the recently introduced BranchProbability used by CGP. Differential Revision: http://reviews.llvm.org/D19435 llvm-svn: 267615	2016-04-26 22:23:38 +00:00
Philip Reames	38c87c2e50	[LVI] Infer local facts from unary expressions As pointed out by John Regehr over in http://reviews.llvm.org/D19485, LVI was being incredibly stupid about applying its transfer rules. Rather than gathering local facts from the expression itself, it was simply giving up entirely if one of the inputs was overdefined. This greatly impacts the precision of the overall analysis and makes it far more fragile as well. This patch implements only the unary operation case. Once this is in, I'll implement the same for the binary operations. Differential Revision: http://reviews.llvm.org/D19492 llvm-svn: 267609	2016-04-26 21:48:16 +00:00
David Majnemer	abb9f55c80	Revert "[SimplifyLibCalls] sprintf doesn't copy null bytes" The destination buffer that sprintf uses is restrict qualified, we do not need to worry about derived pointers referenced via format specifiers. This reverts commit r267580. llvm-svn: 267605	2016-04-26 21:04:47 +00:00
Nico Weber	2f1459cbb7	Try to get ResponseFile.ll passing on Windows after r267556. llvm-svn: 267599	2016-04-26 20:32:51 +00:00
Elena Demikhovsky	308a7eb0d2	Masked Store in Loop Vectorizer - bugfix Fixed a bug in loop vectorization with conditional store. Differential Revision: http://reviews.llvm.org/D19532 llvm-svn: 267597	2016-04-26 20:18:04 +00:00
Justin Bogner	4563a06cee	PM: Port Internalize to the new pass manager llvm-svn: 267596	2016-04-26 20:15:52 +00:00
Zachary Turner	53a65ba5c9	Parse and dump PDB DBI Stream Header Information The DBI stream contains a lot of bookkeeping information for other streams. In particular it contains information about section contributions and linked modules. This patch is a first attempt at parsing some of the information out of the DBI stream. It currently only parses and dumps the headers of the DBI stream, so none of the module data or section contribution data is pulled out. This is just a proof of concept that we understand the basic properties of the DBI stream's metadata, and followup patches will try to extract more detailed information out. Differential Revision: http://reviews.llvm.org/D19500 Reviewed By: majnemer, ruiu llvm-svn: 267585	2016-04-26 18:42:34 +00:00
Krzysztof Parzyszek	4773f647bd	[Tail duplication] Handle source registers with subregisters When a block is tail-duplicated, the PHI nodes from that block are replaced with appropriate COPY instructions. When those PHI nodes contained use operands with subregisters, the subregisters were dropped from the COPY instructions, resulting in incorrect code. Keep track of the subregister information and use this information when remapping instructions from the duplicated block. Differential Revision: http://reviews.llvm.org/D19337 llvm-svn: 267583	2016-04-26 18:36:34 +00:00
Tim Northover	4397837be2	Reapply: "ARM: put correct symbol index on indirect pointers in __thread_ptr."" A latent bug in llvm-objdump used the wrong format specifier on 32-bit targets, causing the test to fail. This fixes the issue. llvm-svn: 267582	2016-04-26 18:29:16 +00:00
David Majnemer	8cd77baebc	[SimplifyLibCalls] sprintf doesn't copy null bytes sprintf doesn't read or copy the terminating null byte from it's string operands. sprintf will append it's own after processing all of the format specifiers. This fixes PR27526. llvm-svn: 267580	2016-04-26 18:16:49 +00:00
Manman Ren	1c3f65a18c	Swift Calling Convention: use %RAX for sret. We don't need to copy the sret argument into %rax upon return. rdar://25671494 llvm-svn: 267579	2016-04-26 18:08:06 +00:00
Saleem Abdulrasool	4c6c4e2bbb	tests: tweak MIR for ARM tests to correct MI issues The Machine Instruction Verifier flagged some issues in the serialized MIR. Adjust the input to correct them. Fixes the remaining portion of PR27480. llvm-svn: 267578	2016-04-26 17:54:21 +00:00
Saleem Abdulrasool	601e029ba3	test: remove some bleeding whitespace Kill bleeding whitespace. NFC llvm-svn: 267577	2016-04-26 17:54:16 +00:00
Sanjay Patel	d66607bd8c	[CodeGenPrepare] use branch weight metadata to decide if a select should be turned into a branch This is part of solving PR27344: https://llvm.org/bugs/show_bug.cgi?id=27344 CGP should undo the SimplifyCFG transform for the same reason that earlier patches have used this same mechanism: it's possible that passes between SimplifyCFG and CGP may be able to optimize the IR further with a select in place. For the TLI hook default, >99% taken or not taken is chosen as the default threshold for a highly predictable branch. Even the most limited HW branch predictors will be correct on this branch almost all the time, so even a massive mispredict penalty perf loss would be overcome by the win from all the times the branch was predicted correctly. As a follow-up, we could make the default target hook less conservative by using the SchedMachineModel's MispredictPenalty. Or we could just let targets override the default by implementing the hook with that and other target-specific options. Note that trying to statically determine mispredict rates for close-to-balanced profile weight data is generally impossible if the HW is sufficiently advanced. Ie, 50/50 taken/not-taken might still be 100% predictable. Finally, note that this patch as-is will not solve PR27344 because the current __builtin_unpredictable() branch weight default values are 4 and 64. A proposal to change that is in D19435. Differential Revision: http://reviews.llvm.org/D19488 llvm-svn: 267572	2016-04-26 17:11:17 +00:00
Zachary Turner	f34e01624a	Refactor some more PDB reading code into DebugInfoPDB. Differential Revision: http://reviews.llvm.org/D19445 Reviewed By: David Majnemer llvm-svn: 267564	2016-04-26 16:20:00 +00:00
Konstantin Zhuravlyov	1d99c4d03c	[AMDGPU] Reserve VGPRs for trap handler usage if instructed Differential Revision: http://reviews.llvm.org/D19235 llvm-svn: 267563	2016-04-26 15:43:14 +00:00
Sam Kolton	3025e7f25f	[AMDGPU] Assembler: basic support for SDWA instructions Support for SDWA instructions for VOP1 and VOP2 encoding. Not done yet: - converters for support optional operands and modifiers - VOPC - sext() modifier - intrinsics - VOP2b (see vop_dpp.s) - V_MAC_F32 (see vop_dpp.s) Differential Revision: http://reviews.llvm.org/D19360 llvm-svn: 267553	2016-04-26 13:33:56 +00:00
Andrey Turetskiy	b405606432	[X86] PR27502: Fix the LEA optimization pass. Handle MachineBasicBlock as a memory displacement operand in the LEA optimization pass. Differential Revision: http://reviews.llvm.org/D19409 llvm-svn: 267551	2016-04-26 12:18:12 +00:00
Marcin Koscielnicki	0cfb612413	[PowerPC] Add support for llvm.thread.pointer Differential Revision: http://reviews.llvm.org/D19304 llvm-svn: 267546	2016-04-26 10:37:22 +00:00
Marcin Koscielnicki	33571e2c41	[SPARC] [SSP] Add support for LOAD_STACK_GUARD. This fixes PR22248 on sparc. Differential Revision: http://reviews.llvm.org/D19386 llvm-svn: 267545	2016-04-26 10:37:14 +00:00
Marcin Koscielnicki	fafb44951a	[SPARC] Add support for llvm.thread.pointer. Differential Revision: http://reviews.llvm.org/D19387 llvm-svn: 267544	2016-04-26 10:37:01 +00:00
Mehdi Amini	aa309b1a81	ThinLTOCodeGenerator: preserve linkonce when in "MustPreserved" set If the linker specifically requested for a linkonce to be preserved, we need to make sure we won't drop it even if all the uses in the current module disappear. From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 267543	2016-04-26 10:35:01 +00:00
Renato Golin	5a55a029c0	Revert "ARM: put correct symbol index on indirect pointers in __thread_ptr." This reverts commit r267488, as it broke some ARM buildbots. llvm-svn: 267541	2016-04-26 10:02:02 +00:00
Sanjoy Das	51df5fae4a	Symbolize operand bundle blocks for bcanalyzer Reviewers: joker.eph Subscribers: mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D19523 llvm-svn: 267524	2016-04-26 05:59:08 +00:00
Craig Topper	c5551bfc26	[AArch64] Expand v1i64 and v2i64 ctlz. The default is legal, which results in 'Cannot select' errors. llvm-svn: 267522	2016-04-26 05:26:51 +00:00
Craig Topper	d8d6be4f99	[ARM] Expand vector ctlz_zero_undef so it becomes ctlz. The default is Legal, which results in 'Cannot select' errors. llvm-svn: 267521	2016-04-26 05:04:37 +00:00
Craig Topper	edb4a6ba98	[ARM] Expand v1i64 and v2i64 ctlz. The default is legal, which results in 'Cannot select' errors. llvm-svn: 267520	2016-04-26 05:04:33 +00:00
Dehao Chen	5d6d4841ed	Tune basic block annotation algorithm. Summary: Instead of using maximum IR weight as the basic block weight, this patch uses the voting algorithm to find the most likely weight for the basic block. This can effectively avoid the cases when some IRs are annotated incorrectly due to code motion of the profiled binary. This patch also updates propagate.ll unittest to include discriminator in the input file so that it is testing something meaningful. Reviewers: davidxl, dnovillo Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D19301 llvm-svn: 267519	2016-04-26 04:59:11 +00:00
Bill Seurer	d6e92135bd	[powerpc] mark JIT tests as UNSUPPORTED on powerpc64 big endian Some of the JIT tests began failing with "[llvm] r266663 - [Orc] Re-commit r266581 with fixes for MSVC, and format cleanups." on powerpc64 big endian. To get the buildbots running I am marking these as UNSUPPORTED for now. If this is fixed remove the UNSUPPORTED flag "powerpc64-unknown-linux-gnu". In r267516 I marked these as XFAIL but they succeed on some of the bots on stage1. llvm-svn: 267518	2016-04-26 03:59:19 +00:00
Richard Trieu	d7f31a31d1	Pass the test file in through stdin instead of by filename. When passed in via filename, this test will fail if the path to the test has the strings "f1" and "f2" in somewhere. Pass the file through stdin to prevent test failures due to coincidences in path names. llvm-svn: 267517	2016-04-26 03:43:49 +00:00
Bill Seurer	ab5171f988	[powerpc] mark JIT tests as XFAIL on powerpc64 big endian Some of the JIT tests began failing with "[llvm] r266663 - [Orc] Re-commit r266581 with fixes for MSVC, and format cleanups." on powerpc64 big endian. To get the buildbots running I am marking these as XFAIL for now. If this is fixed remove the XFAIL flag "powerpc64-unknown-linux-gnu". llvm-svn: 267516	2016-04-26 02:33:22 +00:00
Hal Finkel	e4c0c1679b	[SimplifyCFG] Preserve !llvm.mem.parallel_loop_access when merging When SimplifyCFG merges identical instructions from both sides of a diamond, it can preserve !llvm.mem.parallel_loop_access (as it does with most of the other metadata). There's no real data or control dependency change in this case. llvm-svn: 267515	2016-04-26 02:06:06 +00:00
Hal Finkel	411d31ad72	[LoopVectorize] Don't consider conditional-load dereferenceability for marked parallel loops I really thought we were doing this already, but we were not. Given this input: void Test(int res, int c, int d, int p) { for (int i = 0; i < 16; i++) res[i] = (p[i] == 0) ? res[i] : res[i] + d[i]; } we did not vectorize the loop. Even with "assume_safety" the check that we don't if-convert conditionally-executed loads (to protect against data-dependent deferenceability) was not elided. One subtlety: As implemented, it will still prefer to use a masked-load instrinsic (given target support) over the speculated load. The choice here seems architecture specific; the best option depends on how expensive the masked load is compared to a regular load. Ideally, using the masked load still reduces unnecessary memory traffic, and so should be preferred. If we'd rather do it the other way, flipping the order of the checks is easy. The LangRef is updated to make explicit that llvm.mem.parallel_loop_access also implies that if conversion is okay. Differential Revision: http://reviews.llvm.org/D19512 llvm-svn: 267514	2016-04-26 02:00:36 +00:00
Dan Gohman	f456290fca	[WebAssembly] Account for implicit operands when computing operand indices. llvm-svn: 267511	2016-04-26 01:40:56 +00:00
Sanjay Patel	a31b0c0ece	[CodeGenPrepare] don't convert an unpredictable select into control flow Suggested in the review of D19488: http://reviews.llvm.org/D19488 llvm-svn: 267504	2016-04-26 00:47:39 +00:00
Justin Bogner	1a07501379	PM: Port GlobalOpt to the new pass manager llvm-svn: 267499	2016-04-26 00:28:01 +00:00
Ahmed Bougacha	5cf735a5b1	[X86] Use LivePhysRegs in X86FixupBWInsts. Kill-flags, which computeRegisterLiveness uses, are not reliable. LivePhysRegs is. Differential Revision: http://reviews.llvm.org/D19472 llvm-svn: 267495	2016-04-26 00:00:48 +00:00
Justin Bogner	6f6c5f2a02	GlobalOpt: Convert a bunch of tests from grep to FileCheck llvm-svn: 267493	2016-04-25 23:36:50 +00:00
Sanjay Patel	82059090d3	Add check for "branch_weights" with prof metadata While we're here, fix the comment and variable names to make it clear that these are raw weights, not percentages. llvm-svn: 267491	2016-04-25 23:15:16 +00:00
James Y Knight	51208eaccc	[Sparc] Fix double-float fabs and fneg on little endian CPUs. The SparcV8 fneg and fabs instructions interestingly come only in a single-float variant. Since the sign bit is always the topmost bit no matter what size float it is, you simply operate on the high subregister, as if it were a single float. However, the layout of double-floats in the float registers is reversed on little-endian CPUs, so that the high bits are in the second subregister, rather than the first. Thus, this expansion must check the endianness to use the correct subregister. llvm-svn: 267489	2016-04-25 22:54:09 +00:00
Tim Northover	cbba0aba16	ARM: put correct symbol index on indirect pointers in __thread_ptr. Otherwise the linker has no idea what should be resolved. llvm-svn: 267488	2016-04-25 22:36:07 +00:00
Arch D. Robison	be0490a6e8	Optimize store of "bitcast" from vector to aggregate. This patch is what was the "instcombine" portion of D14185, with an additional test added (see julia_pseudovec in test/Transforms/InstCombine/insert-val-extract-elem.ll). The patch causes instcombine to replace sequences of extractelement-insertvalue-store that act essentially like a bitcast followed by a store. Differential review: http://reviews.llvm.org/D14260 llvm-svn: 267482	2016-04-25 22:22:39 +00:00
Tim Northover	5c3140f745	ARM: put extern __thread stubs in a special section. The linker needs to know that the symbols are thread-local to do its job properly. llvm-svn: 267473	2016-04-25 21:12:04 +00:00
Quentin Colombet	abe2d016cf	Re-apply r267206 with a fix for the encoding problem: when the immediate of log2(Mask) is smaller than 32, we must use the 32-bit variant because the 64-bit variant cannot encode it. Therefore, set the subreg part accordingly. [AArch64] Fix optimizeCondBranch logic. The opcode for the optimized branch does not depend on the size of the activate bits in the AND masks, but the AND opcode itself. Indeed, we need to use a X or W variant based on the AND variant not based on whether the mask fits into the related variant. Otherwise, we may end up using the W variant of the optimized branch for 64-bit register inputs! This fixes the last make check verifier issues for AArch64: PR27479. llvm-svn: 267465	2016-04-25 20:54:08 +00:00
Matt Arsenault	99c14524ec	AMDGPU: Implement addrspacecast llvm-svn: 267452	2016-04-25 19:27:24 +00:00

1 2 3 4 5 ...

36048 Commits