llvm-project

Commit Graph

Author	SHA1	Message	Date
Chad Rosier	03e1647d19	Revert "[AMDGPU][llvm-mc] Add support of TTMP quads. Rework M0 exclusion for SMRD." This reverts commit r267733 due to a -Werror,-Wunused-function error. llvm-svn: 267752	2016-04-27 18:29:11 +00:00
Matthew Simpson	622b95be7b	[LV] Reallow positive-stride interleaved load groups with gaps We previously disallowed interleaved load groups that may cause us to speculatively access memory out-of-bounds (r261331). We did this by ensuring each load group had an access corresponding to the first and last member. Instead of bailing out for these interleaved groups, this patch enables us to peel off the last vector iteration, ensuring that we execute at least one iteration of the scalar remainder loop. This solution was proposed in the review of the previous patch. Differential Revision: http://reviews.llvm.org/D19487 llvm-svn: 267751	2016-04-27 18:21:36 +00:00
Arch D. Robison	aca7c412b4	[SLPVectorizer] Refactor where MinVecRegSize and MaxVecRegSize live. This is the first of two commits for extending SLP Vectorizer to deal with aggregates. This commit merely refactors existing logic. http://reviews.llvm.org/D14185 llvm-svn: 267748	2016-04-27 17:46:25 +00:00
Gerolf Hoflehner	50426191d7	[DAGCombiner] Follow coding convention for function name (NFC) llvm-svn: 267745	2016-04-27 17:27:16 +00:00
Marcin Koscielnicki	7efdca5622	[Mips] Add support for llvm.thread.pointer intrinsic. This will be used to implement __builtin_thread_pointer in clang. Differential Revision: http://reviews.llvm.org/D19569 llvm-svn: 267743	2016-04-27 17:21:49 +00:00
Reid Kleckner	7f0ae15e9d	Silence a -Wdangling-else llvm-svn: 267737	2016-04-27 16:46:33 +00:00
Matthew Simpson	47bd3994b7	Add parentheses to silence buildbot warning llvm-svn: 267734	2016-04-27 16:25:04 +00:00
Artem Tamazov	3896f8f83d	[AMDGPU][llvm-mc] Add support of TTMP quads. Rework M0 exclusion for SMRD. Added support of TTMP quads. Reworked M0 exclusion machinery for SMRD and similar instructions to enable usage of TTMP registers in those instructions as destinations. Tests added. Differential Revision: http://reviews.llvm.org/D19342 llvm-svn: 267733	2016-04-27 16:20:23 +00:00
Reid Kleckner	0336cc05e7	[PDB] Fix function names for private symbols in PDBs Summary: llvm-symbolizer wants to get linkage names of functions for historical reasons. Linkage names are only recorded in the PDB for public symbols, and the linkage name is apparently stored separately in some "public symbol" record. We had a workaround in PDBContext which would look for such symbols when the user requested linkage names. However, when given an address that was truly in a private function and public funciton, we would accidentally find nearby public symbols and return those function names. The fix is to look for both function symbols and public symbols and only prefer the public symbol name if the addresses of the symbols agree. Fixes PR27492 Reviewers: zturner Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D19571 llvm-svn: 267732	2016-04-27 16:10:29 +00:00
Nicolai Haehnle	f66bdb5ea8	AMDGPU/SI: Add llvm.amdgcn.s.waitcnt.all intrinsic Summary: So it appears that to guarantee some of the ordering requirements of a GLSL memoryBarrier() executed in the shader, we need to emit an s_waitcnt. (We can't use an s_barrier, because memoryBarrier() may appear anywhere in the shader, in particular it may appear in non-uniform control flow.) Reviewers: arsenm, mareko, tstellarAMD Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D19203 llvm-svn: 267729	2016-04-27 15:46:01 +00:00
Matthew Simpson	e5dfb08fcb	[TTI] Add hook for vector extract with extension This change adds a new hook for estimating the cost of vector extracts followed by zero- and sign-extensions. The motivating example for this change is the SMOV and UMOV instructions on AArch64. These instructions move data from vector to general purpose registers while performing the corresponding extension (sign-extend for SMOV and zero-extend for UMOV) at the same time. For these operations, TargetTransformInfo can assume the extensions are free and only report the cost of the vector extract. The SLP vectorizer has been updated to make use of the new hook. Differential Revision: http://reviews.llvm.org/D18523 llvm-svn: 267725	2016-04-27 15:20:21 +00:00
Artem Tamazov	5cd55b1784	[AMDGPU][llvm-mc] s_getreg/setreg* - Support symbolic names of hardware registers. Possibility to specify code of hardware register kept. Disassemble to symbolic name, if name is known. Tests updated/added. Differential Revision: http://reviews.llvm.org/D19335 llvm-svn: 267724	2016-04-27 15:17:03 +00:00
Nico Weber	e69b9548b8	Revert r267649, it caused PR27539. llvm-svn: 267723	2016-04-27 15:16:54 +00:00
Teresa Johnson	df5ef8711f	[ThinLTO] Refine fix to avoid renaming of uses in inline assembly. Summary: Refine the workaround from r266877 that attempts to prevent renaming of locals in inline assembly, so that in addition to looking for a llvm.used local value, that there is at least one inline assembly call in the module. Otherwise, debug functions added to the llvm.used can block importing/exporting unnecessarily. Reviewers: joker.eph Subscribers: llvm-commits, joker.eph Differential Revision: http://reviews.llvm.org/D19573 llvm-svn: 267717	2016-04-27 14:19:38 +00:00
Teresa Johnson	02e98331c0	[ThinLTO] Use valueid instead of bitcode offsets in combined index file Summary: With the removal of support for lazy parsing of combined index summary records (e.g. r267344), we no longer need to include the summary record bitcode offset in the VST entries for definitions. Change the combined index format to be similar to the per-module index format in using value ids to cross-reference from the summary record to the VST entry (rather than the summary record bitcode offset to cross-reference in the other direction). The visible changes are: 1) Add the value id to the combined summary records 2) Remove the summary offset from the combined VST records, which has the following effects: - No longer need the VST_CODE_COMBINED_GVDEFENTRY record, as all combined index VST entries now only contain the value id and corresponding GUID. - No longer have duplicate VST entries in the case where there are multiple definitions of a symbol (e.g. weak/linkonce), as they all have the same value id and GUID. An implication of #2 above is that in order to hook up an alias to the correct aliasee based on the value id of the aliasee recorded in the combined index alias record, we need to scan the entries in the index for that GUID to find the one from the same module (i.e. the case where there are multiple entries for the aliasee). But the reader no longer has to maintain a special map to hook up the alias/aliasee. Reviewers: joker.eph Subscribers: joker.eph, llvm-commits Differential Revision: http://reviews.llvm.org/D19481 llvm-svn: 267712	2016-04-27 13:28:35 +00:00
Artur Pilipenko	345f01481b	NFC. Introduce Value::getPointerDerferecnceableBytes Extract a part of isDereferenceableAndAlignedPointer functionality to Value::getPointerDerferecnceableBytes. Currently it's a NFC, but in future I'm going to accumulate all the logic about value dereferenceability in this function similarly to Value::getPointerAlignment function (D16144). Reviewed By: reames Differential Revision: http://reviews.llvm.org/D17572 llvm-svn: 267708	2016-04-27 12:51:01 +00:00
Zlatko Buljan	de0bbe6d1c	[mips][microMIPS] Add CodeGen support for SUBU16, SUB, SUBU, DSUB and DSUBU instructions Differential Revision: http://reviews.llvm.org/D16676 llvm-svn: 267694	2016-04-27 11:31:44 +00:00
Zlatko Buljan	29813620bc	[mips][microMIPS] Add CodeGen support for SLL16, SRL16, SLL, SLLV, SRA, SRAV, SRL and SRLV instructions Differential Revision: http://reviews.llvm.org/D17989 llvm-svn: 267693	2016-04-27 11:02:23 +00:00
Artur Pilipenko	9bb6beabf4	isSafeToLoadUnconditionally support queries without a context This is required to use this function from isSafeToSpeculativelyExecute Reviewed By: hfinkel Differential Revision: http://reviews.llvm.org/D16231 llvm-svn: 267692	2016-04-27 11:00:48 +00:00
Artur Pilipenko	c97eac6555	Use DL preferred alignment for alloca in Value::getPointerAlignment Teach Value::getPointerAlignment that allocas with no explicit alignment are aligned to preferred alignment of the allocated type. Reviewed By: hfinkel Differential Revision: http://reviews.llvm.org/D17569 llvm-svn: 267689	2016-04-27 10:42:29 +00:00
Adam Nemet	d2fa414718	[LoopDist] Add llvm.loop.distribute.enable loop metadata Summary: D19403 adds a new pragma for loop distribution. This change adds support for the corresponding metadata that the pragma is translated to by the FE. As part of this I had to rethink the flag -enable-loop-distribute. My goal was to be backward compatible with the existing behavior: A1. pass is off by default from the optimization pipeline unless -enable-loop-distribute is specified A2. pass is on when invoked directly from opt (e.g. for unit-testing) The new pragma/metadata overrides these defaults so the new behavior is: B1. A1 + enable distribution for individual loop with the pragma/metadata B2. A2 + disable distribution for individual loop with the pragma/metadata The default value whether the pass is on or off comes from the initiator of the pass. From the PassManagerBuilder the default is off, from opt it's on. I moved -enable-loop-distribute under the pass. If the flag is specified it overrides the default from above. Then the pragma/metadata can further modifies this per loop. As a side-effect, we can now also use -enable-loop-distribute=0 from opt to emulate the default from the optimization pipeline. So to be precise this is the new behavior: C1. pass is off by default from the optimization pipeline unless -enable-loop-distribute or the pragma/metadata enables it C2. pass is on when invoked directly from opt unless -enable-loop-distribute=0 or the pragma/metadata disables it Reviewers: hfinkel Subscribers: joker.eph, mzolotukhin, llvm-commits Differential Revision: http://reviews.llvm.org/D19431 llvm-svn: 267672	2016-04-27 05:28:18 +00:00
Vaivaswatha Nagaraj	08efb0efcd	[Cloning] cloneLoopWithPreheader(): add assert to ensure no sub-loops Summary: cloneLoopWithPreheader() does not update LoopInfo for sub-loop of the original loop being cloned. Add assert to ensure no sub-loops for loop being cloned. Reviewers: anemet, ashutosh.nema, hfinkel Subscribers: mzolotukhin, llvm-commits Differential Revision: http://reviews.llvm.org/D15922 llvm-svn: 267671	2016-04-27 05:25:09 +00:00
Craig Topper	de4318b928	[Support][X86] Add a few more Intel model numbers to getHostCPUName for airmont and knl. llvm-svn: 267670	2016-04-27 05:17:00 +00:00
Craig Topper	e7d743ccf8	[Support][X86] Change the case values in the Intel family 6 code to hex so its easier to compare with Intel's docs. NFC llvm-svn: 267669	2016-04-27 05:16:58 +00:00
Mehdi Amini	c7b950171d	Revert "Support "preserving" the summary information when using setModule() API in LTOCodeGenerator" This reverts commit r267665. ASAN shows that there is a use of undefined value. From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 267668	2016-04-27 05:11:44 +00:00
Craig Topper	0e2f14fa83	[Support][X86] Add a couple more Broadwell CPU models numbers to getHostCPUName. llvm-svn: 267666	2016-04-27 04:40:03 +00:00
Mehdi Amini	360ed847bc	Support "preserving" the summary information when using setModule() API in LTOCodeGenerator Another attempt at r267655... From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 267665	2016-04-27 04:24:10 +00:00
Mehdi Amini	a1b8b6cd56	Revert "Support "preserving" the summary information when using setModule() API in LTOCodeGenerator" This reverts commit r267657, r267656, and r267655. The test does not pass on multiple bots, I'm unsure why yet but let's unbreak them. From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 267664	2016-04-27 03:34:28 +00:00
Evgeny Stupachenko	23ce61b663	The patch fixes PR27392. Summary: It is incorrect to compare TripCount (which is BECount + 1) with extraiters (or Count) to check if we should enter unrolled loop or not, because TripCount can potentially overflow (when BECount is max unsigned integer). While comparing BECount with (Count - 1) is overflow safe and therefore correct. Reviewer: hfinkel Differential Revision: http://reviews.llvm.org/D19256 From: Evgeny Stupachenko <evstupac@gmail.com> llvm-svn: 267662	2016-04-27 03:04:54 +00:00
Philip Reames	c67651dd70	[LVI] Delete stale and misleading comment. llvm-svn: 267661	2016-04-27 03:03:15 +00:00
Chuang-Yu Cheng	8676c3d599	[ppc64] fix bug in prologue that mfocrf's cr operand should be explict state instead of implicit This fixes PR27414 Reviewers: kbarton mgrang tjablin http://reviews.llvm.org/D19255 llvm-svn: 267660	2016-04-27 02:59:28 +00:00
Ahmed Bougacha	9a0c9adade	[X86] Set AddPristinesAndCSRs to FixupBW LivePhysRegs. NFC. We run after PEI, so we need to AddPristinesAndCSRs. In practice, that makes no difference here, because we only ask about liveness of super-registers of defined GR8/GR16 registers, so they can't be pristine. Still, it's the correct thing to do. Thanks to Quentin for noticing! Follow-up to r267495. llvm-svn: 267658	2016-04-27 01:51:38 +00:00
Mehdi Amini	e2a65fe5ec	Support "preserving" the summary information when using setModule() API in LTOCodeGenerator From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 267655	2016-04-27 01:46:48 +00:00
Sanjoy Das	5253a089ba	Fix typo in comment; NFC llvm-svn: 267653	2016-04-27 01:44:31 +00:00
Ahmed Bougacha	19a2ee591a	[X86] Don't assume that MMX extractelts are from index 0. It's probably the case for all 3 MMX users out there, but with hand-crafted IR, you can trigger selection failures. Fix that. llvm-svn: 267652	2016-04-27 01:35:29 +00:00
Ahmed Bougacha	e68363a03c	[X86] Re-enable MMX i32 extractelt combine. This effectively adds back the extractelt combine removed by r262358: the direct case can still occur (because x86_mmx is special, see r262446), but it's the indirect case that's now superseded by the generic combine. llvm-svn: 267651	2016-04-27 01:35:25 +00:00
Cong Hou	6f879d9eb1	Detects the SAD pattern on X86 so that much better code will be emitted once the pattern is matched. Differential revision: http://reviews.llvm.org/D14840 llvm-svn: 267649	2016-04-27 01:29:18 +00:00
Philip Reames	2ab964e263	[LVI] Add a comment explaining a subtle piece of code Or at least, I didn't understand the implications the first several times I read it it. llvm-svn: 267648	2016-04-27 01:02:25 +00:00
Mehdi Amini	b4e1e8297b	ThinLTO: do not promote GlobalVariable that have a specific section. Differential Revision: http://reviews.llvm.org/D18298 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 267646	2016-04-27 00:32:13 +00:00
Matt Arsenault	ba437c67d2	SLSR: Use UnknownAddressSpace instead of 0 for pure arithmetic. In the case where isLegalAddressingMode is used for cases not related to addressing modes, such as pure adds and muls, it should not be using address space 0. LSR already passes -1 as the address space in these cases. llvm-svn: 267645	2016-04-27 00:32:09 +00:00
Mehdi Amini	da168fbc2e	LTOCodeGenerator: turns linkonce(_odr) into weak_(odr) when present "MustPreserve" set Summary: If the linker requested to preserve a linkonce function, we should honor this even if we drop all uses. Reviewers: dexonsmith Subscribers: llvm-commits, joker.eph Differential Revision: http://reviews.llvm.org/D19527 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 267644	2016-04-27 00:32:02 +00:00
Adam Nemet	61399ac424	[LoopDist] Split main class. NFC This splits out the per-loop functionality from the Pass class. With this the fact whether the loop is forced-distribute with the new metadata/pragma can be cached in the per-loop class rather than passed around. llvm-svn: 267643	2016-04-27 00:31:03 +00:00
Philip Reames	3f83dbeed9	[LVI] Reduce compile time by lazily scanning blocks if needed When encountering a non-local pointer, LVI would eagerly scan the block for dereferences of the given object to prove the pointer to be non null. That's all well and good, but then we'd go recurse through our input blocks. As a result, we could end up scanning each and every block we traverse, even if the final definition was obviously non null or we found a constant value somewhere up the chain. The previous code papered over this by using the isKnownNonNull routine from value tracking. This made the duplication less painful in the common case. Instead, we know do the block scan only after we've gotten the recursive results back. This lets us stop scanning individual blocks as soon as we've determined it to be non-null in any predecessor block and use our usual merge rules to propagate that information cheaply through successor blocks. For a pointer which can be found non-null, this does strictly less work and sometimes substaintially so. Note that the case where we can't prove something non-null is still the really expensive case. We end up scanning each and every block looking for a dereference and never end up finding one. llvm-svn: 267642	2016-04-27 00:30:55 +00:00
Quentin Colombet	ddad5aa152	[MachineInstrBundle] Actually set the PartialDeadDef flag only when the register is defined! The users were checking the proper thing (Defined + PartialDeadDef), but the information may have been wrong for other use cases, so fix that. llvm-svn: 267641	2016-04-27 00:16:29 +00:00
Andrew Kaylor	d9974cc913	Add optimization bisect opt-in calls for SystemZ passes Differential Revision: http://reviews.llvm.org/D19562 llvm-svn: 267636	2016-04-26 23:49:41 +00:00
Andrew Kaylor	87b10dd7b3	Add optimization bisect opt-in calls for NVPTX passes Differential Revision: http://reviews.llvm.org/D19518 llvm-svn: 267635	2016-04-26 23:44:31 +00:00
Quentin Colombet	4ff3cfb673	[X86] Make sure it is safe to clobber EFLAGS, if need be, when choosing the prologue. Do not use basic blocks that have EFLAGS live-in as prologue if we need to realign the stack. Realigning the stack uses AND instruction and this clobbers EFLAGS. An other alternative would have been to save and restore EFLAGS around the stack realignment code, but this is likely inefficient. Fixes PR27531. llvm-svn: 267634	2016-04-26 23:44:14 +00:00
Justin Bogner	c2bf63d29d	PM: Port Reassociate to the new pass manager llvm-svn: 267631	2016-04-26 23:39:29 +00:00
Justin Bogner	cb8a21c88e	Reassociate: Convert another functor into a lambda. NFC Also move the explanatory comment with it. llvm-svn: 267628	2016-04-26 23:32:00 +00:00
Philip Reames	f105db4fc3	[LVI] Cut short search if we know we can't return a useful result Previously we were recursing on our operands for unary and binary operators regardless of whether we knew how to reason about the operator in question. This has the effect of doing a potentially large amount of work, only to throw it away. By checking whether the operation is one LVI can handle, we can cut short the search and return the (overdefined) answer more quickly. The quality of the results produced should not change. llvm-svn: 267626	2016-04-26 23:27:33 +00:00

1 2 3 4 5 ...

89566 Commits