llvm-project

Commit Graph

Author	SHA1	Message	Date
Gerolf Hoflehner	01b3a6184a	[MachineCombiner] Support for floating-point FMA on ARM64 (re-commit r267098) The original patch caused crashes because it could dereference a null pointer for SelectionDAGTargetInfo for targets that do not define it. Evaluates fmul+fadd -> fmadd combines and similar code sequences in the machine combiner. It adds support for float and double similar to the existing integer implementation. The key features are: - DAGCombiner checks whether it should combine greedily or let the machine combiner do the evaluation. This is only supported on ARM64. - It gives preference to throughput over latency: the heuristic used is to combine always in loops. The target decides whether the machine combiner should optimize for throughput or latency. - Support for fmadd, f(n)msub, fmla, fmls patterns - On by default at O3 ffast-math llvm-svn: 267328	2016-04-24 05:14:01 +00:00
Daniel Sanders	591c379563	Revert r267098 - [MachineCombiner] Support for floating-point FMA on ARM64 It introduced buildbot failures on clang-cmake-mips, clang-ppc64le-linux, among others. llvm-svn: 267127	2016-04-22 09:37:26 +00:00
Gerolf Hoflehner	b32f11fc62	[MachineCombiner] Support for floating-point FMA on ARM64 Evaluates fmul+fadd -> fmadd combines and similar code sequences in the machine combiner. It adds support for float and double similar to the existing integer implementation. The key features are: - DAGCombiner checks whether it should combine greedily or let the machine combiner do the evaluation. This is only supported on ARM64. - It gives preference to throughput over latency: the heuristic used is to combine always in loops. The target decides whether the machine combiner should optimize for throughput or latency. - Support for fmadd, f(n)msub, fmla, fmls patterns - On by default at O3 ffast-math llvm-svn: 267098	2016-04-22 02:15:19 +00:00
Mehdi Amini	b550cb1750	[NFC] Header cleanup Removed some unused headers, replaced some headers with forward class declarations. Found using simple scripts like this one: clear && ack --cpp -l '#include "llvm/ADT/IndexedMap.h"' | xargs grep -L 'IndexedMap[<]' | xargs grep -n --color=auto 'IndexedMap' Patch by Eugene Kosov <claprix@yandex.ru> Differential Revision: http://reviews.llvm.org/D19219 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 266595	2016-04-18 09:17:29 +00:00
Junmo Park	272a2bc365	Minor code cleanup. NFC. llvm-svn: 262096	2016-02-27 01:10:43 +00:00
Duncan P. N. Exon Smith	e59c8af705	Reapply "CodeGen: Use references in MachineTraceMetrics::Trace, NFC" This reverts commit r261510, effectively reapplying r261509. The original commit missed a caller in AArch64ConditionalCompares. Original commit message: Pass non-null arguments by reference in MachineTraceMetrics::Trace, simplifying future work to remove implicit iterator => pointer conversions. llvm-svn: 261511	2016-02-22 03:33:28 +00:00
Duncan P. N. Exon Smith	0cc90a9147	Revert "CodeGen: Use references in MachineTraceMetrics::Trace, NFC" This reverts commit r261509. I'm not sure how this compiled locally, but something was out of whack. llvm-svn: 261510	2016-02-22 03:12:42 +00:00
Duncan P. N. Exon Smith	83d3476fd2	CodeGen: Use references in MachineTraceMetrics::Trace, NFC Pass non-null arguments by reference in MachineTraceMetrics::Trace, simplifying future work to remove implicit iterator => pointer conversions. llvm-svn: 261509	2016-02-22 03:07:49 +00:00
Sanjay Patel	33ec5dbe35	less indent; NFCI llvm-svn: 252643	2015-11-10 20:09:02 +00:00
Sanjay Patel	766589efdc	add 'MustReduceDepth' as an objective/cost-metric for the MachineCombiner This is one of the problems noted in PR25016: https://llvm.org/bugs/show_bug.cgi?id=25016 and: http://lists.llvm.org/pipermail/llvm-dev/2015-October/090998.html The spilling problem is independent and not addressed by this patch. The MachineCombiner was doing reassociations that don't improve or even worsen the critical path. This is caused by inclusion of the "slack" factor when calculating the critical path of the original code sequence. If we don't add that, then we have a more conservative cost comparison of the old code sequence vs. a new sequence. The more liberal calculation must be preserved, however, for the AArch64 MULADD patterns because benchmark regressions were observed without that. The two failing test cases now have identical asm that does what we want: a + b + c + d ---> (a + b) + (c + d) Differential Revision: http://reviews.llvm.org/D13417 llvm-svn: 252616	2015-11-10 16:48:53 +00:00
Sanjay Patel	387e66e79f	replace MachineCombinerPattern namespace and enum with enum class; NFCI Also, remove an enum hack where enum values were used as indexes into an array. We may want to make this a real class to allow pattern-based queries/customization (D13417). llvm-svn: 252196	2015-11-05 19:34:57 +00:00
Hans Wennborg	083ca9bb32	Fix Clang-tidy modernize-use-nullptr warnings in source directories and generated files; other minor cleanups. Patch by Eugene Zelenko! Differential Revision: http://reviews.llvm.org/D13321 llvm-svn: 249482	2015-10-06 23:24:35 +00:00
Sanjay Patel	acd4baefca	include equal sign in debug equations; NFC llvm-svn: 249248	2015-10-03 20:45:01 +00:00
Sanjay Patel	74ca312666	fix minsize detection: minsize attribute implies optimizing for size llvm-svn: 244604	2015-08-11 14:31:14 +00:00
Hal Finkel	17caf326e5	[MachineCombiner] Don't use the opcode-only form of computeInstrLatency In r242277, I updated the MachineCombiner to work with itineraries, but I missed a call that is scheduling-model-only (the opcode-only form of computeInstrLatency). Using the form that takes an MI* allows this to work with itineraries (and should be NFC for subtargets with scheduling models). llvm-svn: 244020	2015-08-05 07:45:28 +00:00
Sanjay Patel	924879ad2c	wrap OptSize and MinSize attributes for easier and consistent access (NFCI) Create wrapper methods in the Function class for the OptimizeForSize and MinSize attributes. We want to hide the logic of "or'ing" them together when optimizing just for size (-Os). Currently, we are not consistent about this and rely on a front-end to always set OptimizeForSize (-Os) if MinSize (-Oz) is on. Thus, there are 18 FIXME changes here that should be added as follow-on patches with regression tests. This patch is NFC-intended: it just replaces existing direct accesses of the attributes by the equivalent wrapper call. Differential Revision: http://reviews.llvm.org/D11734 llvm-svn: 243994	2015-08-04 15:49:57 +00:00
Hal Finkel	e0fa8f2c86	[MachineCombiner] Work with itineraries MachineCombiner predicated its use of scheduling-based metrics on hasInstrSchedModel(), but useful conclusions can be drawn from pipeline itineraries as well. Almost all of the logic (except for resource tracking in preservesResourceLen) can be used if we have an itinerary, so enable it in that case as well. This will be used by the PowerPC backend in an upcoming commit. llvm-svn: 242277	2015-07-15 08:22:23 +00:00
Alexander Kornienko	f00654e31b	Revert r240137 (Fixed/added namespace ending comments using clang-tidy. NFC) Apparently, the style needs to be agreed upon first. llvm-svn: 240390	2015-06-23 09:49:53 +00:00
Sanjay Patel	e79b43a01f	[x86] generalize reassociation optimization in machine combiner to 2 instructions Currently ( D10321, http://reviews.llvm.org/rL239486 ), we can use the machine combiner pass to reassociate the following sequence to reduce the critical path: A = ? op ? B = A op X C = B op Y --> A = ? op ? B = X op Y C = A op B 'op' is currently limited to x86 AVX scalar FP adds (with fast-math on), but in theory, it could be any associative math/logic op (see TODO in code comment). This patch generalizes the pattern match to ignore the instruction that defines 'A'. So instead of a sequence of 3 adds, we now only need to find 2 dependent adds and decide if it's worth reassociating them. This generalization has a compile-time cost because we can now match more instruction sequences and we rely more heavily on the machine combiner to discard sequences where reassociation doesn't improve the critical path. For example, in the new test case: A = M div N B = A add X C = B add Y We'll match 2 reassociation patterns, but this transform doesn't reduce the critical path: A = M div N B = A add Y C = B add X We need the combiner to reject that pattern but select this: A = M div N B = X add Y C = B add A Differential Revision: http://reviews.llvm.org/D10460 llvm-svn: 240361	2015-06-23 00:39:40 +00:00
Sanjay Patel	cfe0393b82	name change: hasPattern() -> getMachineCombinerPatterns() ; NFC This was suggested as part of D10460, but it's independent of any functional change. llvm-svn: 240192	2015-06-19 23:21:42 +00:00
Alexander Kornienko	70bc5f1398	Fixed/added namespace ending comments using clang-tidy. NFC The patch is generated using this command: tools/clang/tools/extra/clang-tidy/tool/run-clang-tidy.py -fix \ -checks=-*,llvm-namespace-comment -header-filter='llvm/.*|clang/.*' \ llvm/lib/ Thanks to Eugene Kosov for the original patch! llvm-svn: 240137	2015-06-19 15:57:42 +00:00
Sanjay Patel	5714998484	hoist loop-invariant; NFCI llvm-svn: 239681	2015-06-13 15:33:15 +00:00
Sanjay Patel	85924e5bf3	remove unnecessary casts; NFCI llvm-svn: 239678	2015-06-13 15:06:33 +00:00
Sanjay Patel	ccb8d5cc57	punctuation policing; NFC llvm-svn: 239484	2015-06-10 19:52:58 +00:00
Sanjay Patel	a32fadd14a	fix typo in comment; NFC llvm-svn: 239478	2015-06-10 17:08:12 +00:00
Sanjay Patel	f911484051	fix typo in comment; NFC llvm-svn: 237962	2015-05-21 21:29:13 +00:00
Sanjay Patel	f69f4e42ce	use range-based for-loops; NFCI llvm-svn: 237918	2015-05-21 17:43:26 +00:00
Duncan P. N. Exon Smith	70eb9c5ae5	CodeGen: Canonicalize access to function attributes, NFC Canonicalize access to function attributes to use the simpler API. getAttributes().getAttribute(AttributeSet::FunctionIndex, Kind) => getFnAttribute(Kind) getAttributes().hasAttribute(AttributeSet::FunctionIndex, Kind) => hasFnAttribute(Kind) Also, add `Function::getFnStackAlignment()`, and canonicalize: getAttributes().getStackAlignment(AttributeSet::FunctionIndex) => getFnStackAlignment() llvm-svn: 229208	2015-02-14 01:44:41 +00:00
Sanjay Patel	b1ca4e48d4	remove function names from comments; NFC llvm-svn: 227256	2015-01-27 22:26:56 +00:00
Sanjay Patel	6b280777b7	fix typos; NFC llvm-svn: 227253	2015-01-27 22:16:52 +00:00
Eric Christopher	3d4276f053	The subtarget is cached on the MachineFunction. Access it directly. llvm-svn: 227173	2015-01-27 07:31:29 +00:00
Pete Cooper	1175945710	Change MCSchedModel to be a struct of statically initialized data. This removes static initializers from the backends which generate this data, and also makes this struct match the other Tablegen generated structs in behaviour Reviewed by Andy Trick and Chandler C llvm-svn: 216919	2014-09-02 17:43:54 +00:00
Gerolf Hoflehner	fe2c11ffd6	[MachineCombiner] Removal of dangling DBG_VALUES after combining [20598] This is a cleaner solution to the problem described in r215431. When instructions are combined a dangling DBG_VALUE is removed. This resolves bug 20598. llvm-svn: 215587	2014-08-13 22:07:36 +00:00
Gerolf Hoflehner	97c383bc36	MachineCombiner Pass for selecting faster instruction sequence on AArch64 Re-commit of r214832,r21469 with a work-around that avoids the previous problem with gcc build compilers The work-around is to use SmallVector instead of ArrayRef of basic blocks in preservesResourceLen()/MachineCombiner.cpp llvm-svn: 215151	2014-08-07 21:40:58 +00:00
Eric Christopher	d913448b38	Remove the TargetMachine forwards for TargetSubtargetInfo based information and update all callers. No functional change. llvm-svn: 214781	2014-08-04 21:25:23 +00:00
Saleem Abdulrasool	befa21532c	CodeGen: silence a warning GCC 4.8.2 objects to the tautological condition in the assert as the unsigned value is guaranteed to be >= 0. Simplify the assertion by dropping the tautological condition. llvm-svn: 214671	2014-08-03 23:00:38 +00:00
Gerolf Hoflehner	5e1207e54c	MachineCombiner Pass for selecting faster instruction sequence - target independent framework When the DAGCombiner selects instruction sequences it could increase the critical path or resource length. For example, on arm64 there are multiply-accumulate instructions (madd, msub). If e.g. the equivalent multiply-add sequence is not on the critical path it makes sense to select it instead of the combined, single accumulate instruction (madd/msub). The reason is that the conversion from add+mul to the madd could lengthen the critical path by the latency of the multiply. But the DAGCombiner would always combine and select the madd/msub instruction. This patch uses machine trace metrics to estimate critical path length and resource length of an original instruction sequence vs a combined instruction sequence and picks the faster code based on its estimates. This patch only commits the target independent framework that evaluates and selects code sequences. The machine instruction combiner is turned off for all targets and expected to evolve over time by gradually handling DAGCombiner pattern in the target specific code. This framework lays the groundwork for fixing rdar://16319955 llvm-svn: 214666	2014-08-03 21:35:39 +00:00

37 Commits
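
The reassociation example in commit e79b43a01f (A = M div N, B = A add X, C = B add Y) can be illustrated with a small critical-path calculation. The following is only a sketch of the idea, not LLVM code: the latency numbers, the tuple encoding of instructions, and the helper names `critical_path` and `pick_fastest` are all assumptions made for illustration. A real target takes its latencies from its scheduling model or itinerary.

```python
# Hypothetical latencies in cycles (assumed values, not taken from any real target).
LATENCY = {"div": 20, "add": 3}


def critical_path(instrs):
    """Return the cycle at which the last instruction's result is ready.

    Each instruction is a tuple (dest, op, src1, src2). A source that no
    earlier instruction defines is treated as an input available at cycle 0.
    """
    ready = {}
    depth = 0
    for dest, op, src1, src2 in instrs:
        # An instruction can start once both of its operands are ready.
        start = max(ready.get(src1, 0), ready.get(src2, 0))
        ready[dest] = start + LATENCY[op]
        depth = ready[dest]
    return depth


def pick_fastest(candidates):
    """Return the name of the candidate sequence with the shortest critical path."""
    return min(candidates, key=lambda name: critical_path(candidates[name]))


# The three sequences from the commit message.
candidates = {
    # A = M div N; B = A add X; C = B add Y
    "original": [("A", "div", "M", "N"), ("B", "add", "A", "X"), ("C", "add", "B", "Y")],
    # A = M div N; B = A add Y; C = B add X  (matched, but no improvement)
    "swapped": [("A", "div", "M", "N"), ("B", "add", "A", "Y"), ("C", "add", "B", "X")],
    # A = M div N; B = X add Y; C = B add A  (the add of X and Y overlaps the div)
    "reassociated": [("A", "div", "M", "N"), ("B", "add", "X", "Y"), ("C", "add", "B", "A")],
}

for name, seq in candidates.items():
    print(name, critical_path(seq))
print("selected:", pick_fastest(candidates))
```

Under these assumed latencies the original and swapped sequences both take 26 cycles, while the reassociated one takes 23 because `X add Y` no longer waits for the divide. This is the kind of comparison the commit message expects the combiner to make when it rejects the second pattern and selects the third.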