llvm-project

Commit Graph

Author	SHA1	Message	Date
Tobias Grosser	3a99893618	Adjust to clang-format changes llvm-svn: 328005	2018-03-20 17:16:32 +00:00
Tobias Grosser	718d04c653	Use isl::manage_copy to simplify calls to isl::manage(isl_.._copy()) As part of this cleanup a couple of unnecessary isl::manage(obj.copy()) pattern are eliminated as well. We checked for all potential cleanups by scanning for: "grep -R isl::manage\( lib/ \| grep copy" llvm-svn: 325558	2018-02-20 07:26:58 +00:00
Michael Kruse	1a745a4ef6	Run clang-format after r324003. NFC. llvm-svn: 324112	2018-02-02 18:11:58 +00:00
Benjamin Kramer	e65c7bbe8a	Update polly for r323999. llvm-svn: 324003	2018-02-01 20:49:53 +00:00
Philip Pfaffe	00fd43b327	Port ScopInfo to the isl cpp bindings Summary: Most changes are mechanical, but in one place I changed the program semantics by fixing a likely bug: In `Scop::hasFeasibleRuntimeContext()`, I'm now explicitely handling the error-case. Before, when the call to `addNonEmptyDomainConstraints()` returned a null set, this (probably) accidentally worked because isl_bool_error converts to true. I'm checking for nullptr now. Reviewers: grosser, Meinersbur, bollu Reviewed By: Meinersbur Subscribers: nemanjai, kbarton, pollydev, llvm-commits Differential Revision: https://reviews.llvm.org/D39971 llvm-svn: 318632	2017-11-19 22:13:34 +00:00
Michael Kruse	0e370cf1a7	Check whether IslAstInfo and DependenceInfo were computed for the same Scop. Since -polly-codegen reports itself to preserve DependenceInfo and IslAstInfo, we might get those analysis that were computed by a different ScopInfo for a different Scop structure. This would be unfortunate because DependenceInfo and IslAstInfo hold references to resources allocated by ScopInfo/ScopBuilder/Scop (e.g. isl_id). If -polly-codegen and DependenceInfo/IslAstInfo do not agree on which Scop to use, unpredictable things can happen. When the ScopInfo/Scop object is freed, there is a high probability that the new ScopInfo/Scop object will be created at the same heap position with the same address. Comparing whether the Scop or ScopInfo address is the expected therefore is unreliable. Instead, we compare the address of the isl_ctx object. Both, DependenceInfo and IslAstInfo must hold a reference to the isl_ctx object to ensure it is not freed before the destruction of those analyses which might happen after the destruction of the Scop/ScopInfo they refer to. Hence, the isl_ctx will not be freed and its address not reused as long there is a DependenceInfo or IslAstInfo around. This fixes llvm.org/PR34441 llvm-svn: 313842	2017-09-21 00:01:13 +00:00
Michael Kruse	8dceb76066	[ScheduleOptimizer] Fix and test schedule tree statistics. Fix walking over the schedule tree to collect its properties (Number of permutable bands etc.). Also add regression tests for these statistics. llvm-svn: 313750	2017-09-20 11:53:05 +00:00
Roman Gareev	925ce50f1b	Unroll and separate the remaining parts of isolation The remaining parts produced by the full partial tile isolation can contain hot spots that are worth to be optimized. Currently, we rely on the simple loop unrolling pass, LiCM and the SLP vectorizer to optimize such parts. However, the approach can suffer from the lack of the information about aliasing that Polly provides using additional alias metadata or/and the lack of the information required by simple loop unrolling pass. This patch is the first step to optimize the remaining parts. To do it, we unroll and separate them. In case of, for instance, Intel Kaby Lake, it helps to increase the performance of the generated code from 39.87 GFlop/s to 49.23 GFlop/s. The next possible step is to avoid unrolling performed by Polly in case of isolated and remaining parts and rely only on simple loop unrolling pass and the Loop vectorizer. Reviewed-by: Tobias Grosser <tobias@grosser.es> Differential Revision: https://reviews.llvm.org/D37692 llvm-svn: 312929	2017-09-11 17:46:47 +00:00
Roman Gareev	6589748920	Use the information about the target cache provided by the TargetTransformInfo. Reviewed-by: Tobias Grosser <tobias@grosser.es> Differential Revision: https://reviews.llvm.org/D37178 llvm-svn: 312255	2017-08-31 17:07:54 +00:00
Michael Kruse	a4f447c2a4	[PM] Properly require and preserve OptimizationRemarkEmitter. NFCI. Properly require and preserve the OptimizationRemarkEmitter for use in ScopPass. Previously one had to get the ORE from ScopDetection because CodeGeneration did not mark it as preserved. It would need to be recomputed which results in the legacy PM to throw away all previous SCoP analysis. This also changes the implementation of ScopPass::getAnalysisUsage to not unconditionally preserve all passes, but only those needed to be preserved by any SCoP pass (at least when using the legacy PM). This allows invalidating DependenceInfo (and IslAstInfo) in case the pass would cause them to change (e.g. OpTree, DeLICM, MaximalArrayExpansion) JSONImporter should also invalidate the DependenceInfo. In this patch it marks DependenceInfo as preserved anyway because some regression tests depend on it. Differential Revision: https://reviews.llvm.org/D37010 llvm-svn: 311888	2017-08-28 14:07:33 +00:00
Eugene Zelenko	9248fde53a	[Polly] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC). llvm-svn: 311704	2017-08-24 21:22:41 +00:00
Michael Kruse	06ed529205	Add more statistics. Add statistics about - Which optimizations are applied - Number of loops in Scops at various stages - Number of scalar/singleton writes at various stages representative for scalar false dependencies - Number of parallel loops These will be useful to find regressions due to moving Polly further down of LLVM's pass pipeline. Differential Revision: https://reviews.llvm.org/D37049 llvm-svn: 311553	2017-08-23 13:50:30 +00:00
Roman Gareev	0956a606ff	Disable the Loop Vectorizer in case of GEMM Currently, in case of GEMM and the pattern matching based optimizations, we use only the SLP Vectorizer out of two LLVM vectorizers. Since the Loop Vectorizer can get in the way of optimal code generation, we disable the Loop Vectorizer for the innermost loop using mark nodes and emitting the corresponding metadata. Reviewed-by: Tobias Grosser <tobias@grosser.es> Differential Revision: https://reviews.llvm.org/D36928 llvm-svn: 311473	2017-08-22 17:38:46 +00:00
Michael Kruse	d091bf8d8e	[MatMul] Make MatMul detection independent of internal isl representations. The pattern recognition for MatMul is restrictive. The number of "disjuncts" in the isl_map containing constraint information was previously required to be 1 (as per isl_*_coalesce - which should ideally produce a domain map with a single disjunct, but does not under some circumstances). This was changed and made more flexible. Contributed-by: Annanay Agarwal <cs14btech11001@iith.ac.in> Differential Revision: https://reviews.llvm.org/D36460 llvm-svn: 311302	2017-08-20 21:31:11 +00:00
Roman Gareev	dbde718676	Do not use isl_set_project_out to get all loop prefixes Currently, only convex isolation sets can be efficiently processed by isl. Consequently, as a temporary solution, we use a different algorithm for partial tile isolation that helps to build convex isolation sets in some cases. Reviewed-by: Tobias Grosser <tobias@grosser.es> Differential Revision: https://reviews.llvm.org/D36278 llvm-svn: 310374	2017-08-08 16:15:33 +00:00
Tobias Grosser	327e9ecb0d	[ScheduleOptimizer] Make matmul pattern detection work with delicm output In certain cases delicm might decide to not leave the original array write in the loop body, but to remove it and instead leave a transformed phi node as write access. This commit teached the matmul pattern detection to order the memory accesses according to when the access actually happens and use this information to detect the new pattern. This makes pattern based matmul optimization work for 2mm and 3mm in polybench 4 after polly-position=before-vectorizer has been enabled. llvm-svn: 310338	2017-08-08 06:15:15 +00:00
Tobias Grosser	61bd3a4840	[ScopInfo] Move Scop::getPwAffOnly to isl++ [NFC] llvm-svn: 310231	2017-08-06 21:42:38 +00:00
Tobias Grosser	31df6f31c0	[ScopInfo] Move Scop::getDomains to isl++ [NFC] llvm-svn: 310230	2017-08-06 21:42:25 +00:00
Tobias Grosser	85048eff1a	[ScopInfo] Move ScopStmt::ScopStmt to isl++ [NFC] llvm-svn: 310210	2017-08-06 17:24:59 +00:00
Tobias Grosser	dcf8d696ff	Move ScopInfo::getDomain(), getDomainSpace(), getDomainId() to isl++ llvm-svn: 310209	2017-08-06 16:39:52 +00:00
Tobias Grosser	feae3dfe9f	[unittests] Add unittest for getPartialTilePrefixes In https://reviews.llvm.org/D36278 it was pointed out that the behavior of getPartialTilePrefixes is not very well understood. To allow for a better understanding, we first provide some basic unittests. llvm-svn: 310175	2017-08-05 09:38:09 +00:00
Tobias Grosser	7b45af13ce	Move setNewAccessRelation to isl++ llvm-svn: 309871	2017-08-02 19:27:25 +00:00
Roman Gareev	2e580538be	[ScheduleOptimizer] Translate to C++ bindings Translate the ScheduleOptimizer to use the new isl C++ bindings. Reviewed-by: Michael Kruse <llvm@meinersbur.de> Differential Revision: https://reviews.llvm.org/D35845 llvm-svn: 309119	2017-07-26 14:59:15 +00:00
Tobias Grosser	d7065e5df5	Move MemoryAccess::isStride* to isl++ llvm-svn: 308927	2017-07-24 20:50:22 +00:00
Tobias Grosser	1515f6b937	Move MemoryAccess::NewAccessRelation to isl++ We also move related accessor functions llvm-svn: 308840	2017-07-23 04:08:38 +00:00
Tobias Grosser	77eef90f50	Move ScopArrayInfo to isl++ This moves the full ScopArrayInfo class to isl++ llvm-svn: 308801	2017-07-21 23:07:56 +00:00
Michael Kruse	cd4c977b8b	[ScopInfo] Print instructions in dump(). Print a statement's instruction on dump() regardless of -polly-print-instructions. dump() is supposed to be used in the debugger only and never in regression tests. While debugging, get all the information we have and we are not bound to break anything. For non-dump purposes of print, forward the setting of -polly-print-instructions as parameters. Some calls to print() had to be changed because the PollyPrintInstructions setting is only available in ScopInfo.cpp. In ScheduleOptimizer.cpp, dump() was used in regression tests. That's not what dump() is for. The print parameter "PrintInstructions" will also be useful for an explicit print SCoP pass in a future patch. llvm-svn: 308746	2017-07-21 15:35:53 +00:00
Roman Gareev	750374181b	Make the pattern matching work with modified memory accesses Some optimizations (e.g., DeLICM) can modify memory accesses (e.g., change their MemoryKind). Consequently, the pattern matching should take it into the account. Reviewed-by: Tobias Grosser <tobias@grosser.es>, Michael Kruse <llvm@meinersbur.de> Differential Revision: https://reviews.llvm.org/D33138 llvm-svn: 308494	2017-07-19 16:59:06 +00:00
Singapuram Sanjay Srivallabh	02ca346e48	Introduce a hybrid target to generate code for either the GPU or CPU Summary: Introduce a "hybrid" `-polly-target` option to optimise code for either the GPU or CPU. When this target is selected, PPCGCodeGeneration will attempt first to optimise a Scop. If the Scop isn't modified, it is then sent to the passes that form the CPU pipeline, i.e. IslScheduleOptimizerPass, IslAstInfoWrapperPass and CodeGeneration. In case the Scop is modified, it is marked to be skipped by the subsequent CPU optimisation passes. Reviewers: grosser, Meinersbur, bollu Reviewed By: grosser Subscribers: kbarton, nemanjai, pollydev Tags: #polly Differential Revision: https://reviews.llvm.org/D34054 llvm-svn: 306863	2017-06-30 19:42:21 +00:00
Tobias Grosser	dcd94e3e93	[ScheduleOptimizer] Fix minor typo [NFC] llvm-svn: 305709	2017-06-19 16:55:48 +00:00
Tobias Grosser	2fb3ed200a	[ScheduleOptimizer] Move isolateFullPartialTiles and isolateAndUnrollMatMulInnerLoops to C++ llvm-svn: 305676	2017-06-19 10:40:12 +00:00
Michael Kruse	a6d48f59a1	Fix a lot of typos. NFC. llvm-svn: 304974	2017-06-08 12:06:15 +00:00
Tobias Grosser	7205f93a98	[ScheduleOptimizer] Move schedule construction to isl C++ [NFC] llvm-svn: 303508	2017-05-21 16:21:33 +00:00
Tobias Grosser	f3adab4c20	[Polly] Canonicalize arrays according to base-ptr equivalence class Summary: In case two arrays share base pointers in the same invariant load equivalence class, we canonicalize all memory accesses to the first of these arrays (according to their order in the equivalence class). This enables us to optimize kernels such as boost::ublas by ensuring that different references to the C array are interpreted as accesses to the same array. Before this change the runtime alias check for ublas would fail, as it would assume models of the C array with differing (but identically valued) base pointers would reference distinct regions of memory whereas the referenced memory regions were indeed identical. As part of this change we remove most of the MemoryAccess::getBaseAddr interface. We removed already all references to getBaseAddr in previous commits to ensure that no code relies on matching base pointers between memory accesses and scop arrays -- except for three remaining uses where we need the original base pointer. We document for these situations that MemoryAccess::getOriginalBaseAddr may return a base pointer that is distinct to the base pointer of the scop array referenced by this memory access. Reviewers: sebpop, Meinersbur, zinob, gareevroman, pollydev, huihuiz, efriedma, jdoerfert Reviewed By: Meinersbur Subscribers: etherzhhb Tags: #polly Differential Revision: https://reviews.llvm.org/D28518 llvm-svn: 302636	2017-05-10 10:59:58 +00:00
Roman Gareev	9d4d91ca6a	[FIX] Fix ScheduleTreeOptimizer::optimizeMatMulPattern Use new values of the dimensions during their permutation. llvm-svn: 299663	2017-04-06 17:25:08 +00:00
Roman Gareev	e0d466342b	Restore the initial ordering of dimensions before applying the pattern matching Dimensions of band nodes can be implicitly permuted by the algorithm applied during the schedule generation. For example, in case of the following matrix-matrix multiplication, for (i = 0; i < 1024; i++) for (k = 0; k < 1024; k++) for (j = 0; j < 1024; j++) C[i][j] += A[i][k] * B[k][j]; it can produce the following schedule tree domain: "{ Stmt_for_body6[i0, i1, i2] : 0 <= i0 <= 1023 and 0 <= i1 <= 1023 and 0 <= i2 <= 1023 }" child: schedule: "[{ Stmt_for_body6[i0, i1, i2] -> [(i0)] }, { Stmt_for_body6[i0, i1, i2] -> [(i1)] }, { Stmt_for_body6[i0, i1, i2] -> [(i2)] }]" permutable: 1 coincident: [ 1, 1, 0 ] The current implementation of the pattern matching optimizations relies on the initial ordering of dimensions. Otherwise, it can produce the miscompilation (e.g., [1]). This patch helps to restore the initial ordering of dimensions by recreating the band node when the corresponding conditions are satisfied. Refs.: [1] - https://bugs.llvm.org/show_bug.cgi?id=32500 Reviewed-by: Michael Kruse <llvm@meinersbur.de> Differential Revision: https://reviews.llvm.org/D31741 llvm-svn: 299662	2017-04-06 17:09:54 +00:00
Siddharth Bhat	5eeb1dd42e	[Polly] [ScheduleOptimizer] Prevent incorrect tile size computation Because Polly exposes parameters that directly influence tile size calculations, one can setup situations like divide-by-zero. Check against a possible divide-by-zero in getMacroKernelParams and return early. Also assert at the end of getMacroKernelParams that the block sizes computed for matrices are positive (>= 1). Tags: #polly Differential Revision: https://reviews.llvm.org/D31708 llvm-svn: 299633	2017-04-06 08:20:22 +00:00
Siddharth Bhat	bcbfdade41	[Polly] [DependenceInfo] change WAR, WAW generation to correct semantics = Change of WAR, WAW generation: = - `buildFlow(Sink, MustSource, MaySource, Sink)` treates any flow of the form `sink <- may source <- must source` as a may dependence. - we used to call: ```lang=cpp, name=old-flow-call.cpp Flow = buildFlow(MustWrite, MustWrite, Read, Schedule); WAW = isl_union_flow_get_must_dependence(Flow); WAR = isl_union_flow_get_may_dependence(Flow); ``` - This caused some WAW dependences to be treated as WAR dependences. - Incorrect semantics. - Now, we call WAR and WAW correctly. == Correct WAW: == ```lang=cpp, name=new-waw-call.cpp Flow = buildFlow(Write, MustWrite, MayWrite, Schedule); WAW = isl_union_flow_get_may_dependence(Flow); isl_union_flow_free(Flow); ``` == Correct WAR: == ```lang=cpp, name=new-war-call.cpp Flow = buildFlow(Write, Read, MustaWrite, Schedule); WAR = isl_union_flow_get_must_dependence(Flow); isl_union_flow_free(Flow); ``` - We want the "shortest" WAR possible (exact dependences). - We mark all the must-writes as may-source, reads as must-souce. - Then, we ask for must dependence. - This removes all the reads that flow through a must-write before reaching a sink. - Note that we only block ealier writes with must-writes. This is intuitively correct, as we do not want may-writes to block must-writes. - Leaves us with direct (R -> W). - This affects reduction generation since RED is built using WAW and WAR. = New StrictWAW for Reductions: = - We used to call: ```lang=cpp,name=old-waw-war-call.cpp Flow = buildFlow(MustWrite, MustWrite, Read, Schedule); WAW = isl_union_flow_get_must_dependence(Flow); WAR = isl_union_flow_get_may_dependence(Flow); ``` - This is the right model of WAW we need for reductions, just not in general. - Reductions need to track only strict WAW, without any interfering reductions. = Explanation: Why the new WAR dependences in tests are correct: = - We no longer set WAR = WAR - WAW - Hence, we will have WAR dependences that were originally removed. - These may look incorrect, but in fact make sense. == Code: == ```lang=llvm, name=new-war-dependence.ll ; void manyreductions(long A) { ; for (long i = 0; i < 1024; i++) ; for (long j = 0; j < 1024; j++) ; S0: A += 42; ; ; for (long i = 0; i < 1024; i++) ; for (long j = 0; j < 1024; j++) ; S1: A += 42; ; ``` === WAR dependence: === { S0[1023, 1023] -> S1[0, 0] } - Between `S0[1023, 1023]` and `S1[0, 0]`, we will have the dependences: ```lang=cpp, name=dependence-incorrect, counterexample S0[1023, 1023]: -- tmp = A (load0)-- WAR 2 add = tmp + 42 \| -> A = add (store0) \| WAR 1 S1[0, 0]: \| tmp = A (load1) \| add = tmp + 42 \| A = add (store1)<- ``` - One may assume that WAR2 hides WAR1 (since store0 happens before store1). However, within a statement, Polly has no idea about the ordering of loads and stores. - Hence, according to Polly, the code may have looked like this: ```lang=cpp, name=dependence-correct S0[1023, 1023]: A = add (store0) tmp = A (load0) ---* add = A + 42 \| WAR 1 S1[0, 0]: \| tmp = A (load1) \| add = A + 42 \| A = add (store1) <-* ``` - So, Polly generates (correct) WAR dependences. It does not make sense to remove these dependences, since they are correct with respect to Polly's model. Reviewers: grosser, Meinersbur tags: #polly Differential revision: https://reviews.llvm.org/D31386 llvm-svn: 299429	2017-04-04 13:08:23 +00:00
Roman Gareev	cdfb57dc46	Introduce another level of metadata to distinguish non-aliasing accesses Introduce another level of alias metadata to distinguish the individual non-aliasing accesses that have inter iteration alias-free base pointers marked with "Inter iteration alias-free" mark nodes. It can be used to, for example, distinguish different stores (loads) produced by unrolling of the innermost loops and, subsequently, sink (hoist) them by LICM. Reviewed-by: Tobias Grosser <tobias@grosser.es> Differential Revision: https://reviews.llvm.org/D30606 llvm-svn: 298510	2017-03-22 14:25:24 +00:00
Siddharth Bhat	3e4a7d38ab	[ScheduleOptimiser] fix typos in top comment [NFC] coice -> choice Transations -> Transactions llvm-svn: 298095	2017-03-17 14:52:19 +00:00
Tobias Grosser	c9d4cb2f42	[ScheduleOptimizer] Allow tiling after fusion In ScheduleOptimizer::isTileableBand(), allow the case in which the band node's child is an isl_schedule_sequence_node and its grandchildren isl_schedule_leaf_nodes. This case can arise when two or more statements are fused by the isl scheduler. The tile_after_fusion.ll test has two statements in separate loop nests and checks whether they are tiled after being fused when polly-opt-fusion equals "max". Reviewers: grosser Subscribers: gareevroman, pollydev Tags: #polly Contributed-by: Theodoros Theodoridis <theodort@student.ethz.ch> Differential Revision: https://reviews.llvm.org/D30815 llvm-svn: 297587	2017-03-12 19:02:31 +00:00
Roman Gareev	96e1119a96	Make optimizations based on pattern matching be enabled by default Currently, pattern based optimizations of Polly can identify matrix multiplication and optimize it according to BLIS matmul optimization pattern (see ScheduleTreeOptimizer for details). This patch makes optimizations based on pattern matching be enabled by default. Reviewed-by: Tobias Grosser <tobias@grosser.es> Differential Revision: https://reviews.llvm.org/D30293 llvm-svn: 295958	2017-02-23 11:44:12 +00:00
Roman Gareev	4eb07e481e	[FIX] Fix the typo in ScheduleOptimizer.cpp. llvm-svn: 295292	2017-02-16 07:04:41 +00:00
Roman Gareev	b196055c0c	Check reduction dependencies in case of the matrix multiplication optimization To determine parameters of the matrix multiplication, we check RAW dependencies that can be expressed using only reduction dependencies. Consequently, we should check the reduction dependencies, if this is the case. Reviewed-by: Tobias Grosser <tobias@grosser.es>, Sven Verdoolaege <skimo-polly@kotnet.org> Michael Kruse <llvm@meinersbur.de> Differential Revision: https://reviews.llvm.org/D29814 llvm-svn: 294836	2017-02-11 09:59:09 +00:00
Roman Gareev	de69293b01	[FIX] Fix the potential issue of containsOnlyMatMulDep. llvm-svn: 294835	2017-02-11 09:48:09 +00:00
Roman Gareev	5ef7e210c0	[NFC] Fix the style issue of lib/Transform/ScheduleOptimizer.cpp. llvm-svn: 294834	2017-02-11 08:43:41 +00:00
Roman Gareev	afcf026d81	[NFC] Fix style issues of lib/Transform/ScheduleOptimizer.cpp. llvm-svn: 294831	2017-02-11 07:14:37 +00:00
Roman Gareev	3d4eae31ea	Use the size of the widest type of the matrix multiplication operands The size of the operands type is the one of the parameters required to determine the BLIS micro-kernel. We get the size of the widest type of the matrix multiplication operands in case there are several different types. Reviewed-by: Michael Kruse <llvm@meinersbur.de> Differential Revision: https://reviews.llvm.org/D29269 llvm-svn: 294828	2017-02-11 07:00:05 +00:00
Roman Gareev	9989088ee9	Isolate a set of partial tile prefixes in case of the matrix multiplication optimization Isolate a set of partial tile prefixes to allow hoisting and sinking out of the unrolled innermost loops produced by the optimization of the matrix multiplication. In case it cannot be proved that the number of loop iterations can be evenly divided by tile sizes and we tile and unroll the point loop, the isl generates conditional expressions. Subsequently, the conditional expressions can prevent stores and loads of the unrolled loops from being sunk and hoisted. The patch isolates a set of partial tile prefixes, which have exactly Mr x Nr iterations of the two innermost loops, the result of the loop tiling performed by the matrix multiplication optimization, where Mr and Mr are parameters of the micro-kernel. This helps to get rid of the conditional expressions of the unrolled innermost loops. Probably this approach can be replaced with padding in future. In case of, for example, the gemm from Polybench/C 3.2 and parametric loop bounds, it helps to increase the performance from 7.98 GFlops (27.71% of theoretical peak) to 21.47 GFlops (74.57% of theoretical peak). Hence, we get the same performance as in case of scalar loops bounds. It also cause compile time regression. The compile-time is increased from 0.795 seconds to 0.837 seconds in case of scalar loops bounds and from 1.222 seconds to 1.490 seconds in case of parametric loops bounds. Reviewed-by: Michael Kruse <llvm@meinersbur.de> Differential Revision: https://reviews.llvm.org/D29244 llvm-svn: 294564	2017-02-09 07:10:01 +00:00
Roman Gareev	772498dc68	[NFC] Make ScheduleTreeOptimizer::optimizeBand return a schedule node optimized with optimizeMatMulPattern This patch makes ScheduleTreeOptimizer::optimizeBand return a schedule node optimized with optimizeMatMulPattern. Otherwise, it could not use the isolate option, because standardBandOpts could try to tile a band node with anchored subtree and get the error, since the use of the isolate option causes any tree containing the node to be considered anchored. Furthermore, it is not intended to apply standard optimizations, when the matrix multiplication has been detected. llvm-svn: 294444	2017-02-08 13:29:06 +00:00

1 2 3

138 Commits