llvm-project

Commit Graph

Author	SHA1	Message	Date
Simon Pilgrim	bc730b5e43	[InstCombine] collectBitParts - use APInt directly to check for out of range bit shifts. NFCI.	2020-10-01 12:50:36 +01:00
Andrew Paverd	ef4e971e5e	[CFGuard] Add address-taken IAT tables and delay-load support This patch adds support for creating Guard Address-Taken IAT Entry Tables (.giats$y sections) in object files, matching the behavior of MSVC. These contain lists of address-taken imported functions, which are used by the linker to create the final GIATS table. Additionally, if any DLLs are delay-loaded, the linker must look through the .giats tables and add the respective load thunks of address-taken imports to the GFIDS table, as these are also valid call targets. Reviewed By: rnk Differential Revision: https://reviews.llvm.org/D87544	2020-10-01 12:45:07 +01:00
Kerry McLaughlin	fcf70e1e3b	[SVE][CodeGen] Lower scalable fp_extend & fp_round operations This patch adds FP_EXTEND_MERGE_PASSTHRU & FP_ROUND_MERGE_PASSTHRU ISD nodes, used to lower scalable vector fp_extend/fp_round operations. fp_round has an additional argument, the 'trunc' flag, which is an integer of zero or one. This also fixes a warning introduced by the new tests added to sve-split-fcvt.ll, resulting from an implicit TypeSize -> uint64_t cast in SplitVecOp_FP_ROUND. Reviewed By: sdesmalen, paulwalker-arm Differential Revision: https://reviews.llvm.org/D88321	2020-10-01 12:17:37 +01:00
Nicolas Vasilache	a81b938b6d	[mlir][Linalg] Fix ASAN bug ``` LinalgTilingOptions &setTileSizes(ValueRange ts) ``` makes it all too easy to create stack-use-after-return errors. In particular, `c694588fc5` introduced one such issue. Instead just take a copy in the lambda and be done with it.	2020-10-01 06:57:35 -04:00
Max Kazantsev	69acdfe075	[SCEV] Prove implications via AddRec start If we know that some predicate is true for AddRec and an invariant (w.r.t. this AddRec's loop), this fact is, in particular, true on the first iteration. We can try to prove the facts we need using the start value. The motivating example is proving things like ``` isImpliedCondOperands(>=, X, 0, {X,+,-1}, 0) ``` Differential Revision: https://reviews.llvm.org/D88208 Reviewed By: reames	2020-10-01 17:09:38 +07:00
Sam Parker	38f625d0d1	[ARM][LowOverheadLoops] Adjust Start insertion. Try to move the insertion point to become the terminator of the block, usually the preheader. Differential Revision: https://reviews.llvm.org/D88638	2020-10-01 10:49:19 +01:00
Kerry McLaughlin	75db7cf78a	[SVE][CodeGen] Legalisation of integer -> floating point conversions Splitting the operand of a scalable [S|U]INT_TO_FP results in a concat_vectors operation where the operands are unpacked FP scalable vectors (e.g. nxv2f32). This patch adds custom lowering of concat_vectors which checks that the number of operands is 2, and isel patterns to match concat_vectors of scalable FP types with uzp1. Reviewed By: efriedma, paulwalker-arm Differential Revision: https://reviews.llvm.org/D88033	2020-10-01 10:43:20 +01:00
Paul Walker	8931c3d682	[NFC] Iterate across an explicit list of scalable MVTs when driving setOperationAction. Iterating across all of integer_scalable_vector_valuetypes seems wasteful when there's only a handful we care about. Also removes some rogue whitespace. Differential Revision: https://reviews.llvm.org/D88552	2020-10-01 10:17:59 +01:00
Sam Parker	6ec5f32497	[ARM][LowOverheadLoops] Iteration count liveness Before deciding to insert a [W|D]LSTP, check that defining LR with the element count won't affect any other instructions that should be taking the iteration count. Differential Revision: https://reviews.llvm.org/D88549	2020-10-01 10:11:10 +01:00
Sam Parker	7b90516d47	[ARM][LowOverheadLoops] Start insertion point If possible, try not to move the start position earlier than it already is. Differential Revision: https://reviews.llvm.org/D88542	2020-10-01 10:05:25 +01:00
Stefan Gränitz	306571cc46	[ORC][examples] Temporarily remove LLJITWithChildProcess until ORC TPC lands This solves a phase ordering problem: OrcV2 remote process support depends on OrcV2 removable code, OrcV2 removable code depends on OrcV1 removal, OrcV1 removal depends on LLJITWithChildProcess migration, and LLJITWithChildProcess migration depends on OrcV2 TargetProcessControl support.	2020-10-01 10:25:13 +02:00
Stefan Gränitz	e5795a1b36	[ORC][examples] Remove ThinLtoJIT example after LLJITWithThinLTOSummaries landed in OrcV2Examples The ThinLtoJIT example was aiming to utilize ThinLTO summaries and concurrency in ORC for speculative compilation. The latter is heavily dependent on asynchronous task scheduling which is probably done better out-of-tree with a mature library like Boost-ASIO. The pure utilization of ThinLTO summaries in ORC is demonstrated in OrcV2Examples/LLJITWithThinLTOSummaries.	2020-10-01 10:22:09 +02:00
Vitaly Buka	456974ac78	[sanitizer] Fix SymbolizedStack leak	2020-10-01 00:50:45 -07:00
Sam Parker	dfa2c14b8f	[ARM][LowOverheadLoops] Use iterator for InsertPt. Use a MachineBasicBlock::iterator instead of a MachineInstr* for the position of our LoopStart instruction. NFCish, as it changes debug info.	2020-10-01 08:32:35 +01:00
Fangrui Song	1e8fbb3b74	[MC] Inline MCExpr::printVariantKind & remove UseParensForSymbolVariantBit Note, MAI may be nullptr in -show-encoding.	2020-10-01 00:10:06 -07:00
Amara Emerson	da11479fd1	[AArch64][GlobalISel] Select all-zero G_BUILD_VECTOR into a zero mov. Unfortunately the leaf SDAG patterns aren't supported yet so we need to do this manually, but it's not a significant amount of code anyway. Differential Revision: https://reviews.llvm.org/D87924	2020-09-30 23:53:38 -07:00
Andrew Dona-Couch	1fedd90cc7	[AVR] fix interrupt stack pointer restoration This patch fixes a corruption of the stack pointer and several registers in any AVR interrupt with non-empty stack frame. Previously, the callee-saved registers were popped before restoring the stack pointer, causing the pointer math to use the wrong base value while also corrupting the caller's register. This change fixes the code to restore the stack pointer last before exiting the interrupt service routine. https://bugs.llvm.org/show_bug.cgi?id=47253 Reviewed By: dylanmckay Differential Revision: https://reviews.llvm.org/D87735 Patch by Andrew Dona-Couch.	2020-10-01 18:52:13 +13:00
Chris Lattner	71dcbe1e88	We don't need two different ways to get commit access, just simplify the policy here so that old SVN users and new contributors do the same thing.	2020-09-30 22:36:44 -07:00
Muhammad Omair Javaid	3d27a99b2e	[LLDB] Remove AArch64/Linux xfail decorator from TestGuiBasicDebug This test now passes on AArch64/Linux after following change by Jonas: `d689570d7d`	2020-10-01 10:20:22 +05:00
Igor Chervatyuk	de973e0b07	[RISCV][ASAN] implementation for previous/next pc routines for riscv64 [7/11] patch series to port ASAN for riscv64 Depends On D87575 Reviewed By: eugenis, vitalybuka, luismarques Differential Revision: https://reviews.llvm.org/D87577	2020-10-01 08:14:44 +03:00
Max Kazantsev	c93a39dd1f	[SCEV][NFC] Introduce isKnownPredicateAt method We can query known predicates in different points, respecting their dominating conditions.	2020-10-01 12:11:24 +07:00
Michael Liao	2c9dc7bbbf	Revert "[llvm-exegesis] Add option to check the hardware support for a given feature before benchmarking." This reverts commit `4fcd1a8e65` as `llvm/test/tools/llvm-exegesis/X86/lbr/mov-add.s` failed on hosts without LBR support if the build has LIBPFM enabled. On such a host, `perf_event_open` fails with `EOPNOTSUPP` on LBR config. That change's basic assumption > If this is run on a non-supported hardware, it will produce all zeroes for latency. does not hold, as the `perf_event_open` system call will fail if the underlying hardware really doesn't support LBR.	2020-09-30 23:15:35 -04:00
Fangrui Song	4e9277eda1	[ELF] --wrap: don't unnecessarily expose __real_ The routing rules are: sym -> __wrap_sym __real_sym -> sym __wrap_sym and sym are routing targets, so they need to be exposed to the symbol table. __real_sym is not and can be eliminated if not used by regular object.	2020-09-30 20:09:25 -07:00
Craig Topper	12bdd427b3	[APFloat] Improve asserts in isSignificandAllOnes and isSignificandAllZeros so they protect shift operations from undefined behavior. For example, the assert in isSignificandAllZeros allowed NumHighBits to be integerPartWidth. But since it is used directly as a shift amount it must be less than integerPartWidth.	2020-09-30 19:32:34 -07:00
Michael Kruse	d4a1db4f3f	[flang][msvc] Workaround 'forgotten' symbols in FoldOperation. NFC. This resolves an issue where the Microsoft compiler 'forgets' symbols when using constexpr in a lambda in a templated function. The symbols are: 1. The implicit lambda captures `context` and `convert`. Fix by making them explicit captures. The error message was: ``` fold-implementation.h(1220): error C2065: 'convert': undeclared identifier ``` 2. The function template argument FROMCAT. Fix by storing it in a temporary constexpr variable inside the function. The error message was: ``` fold-implementation.h(1216): error C2065: 'FROMCAT': undeclared identifier ``` This patch is part of the series to make flang compilable with MS Visual Studio <http://lists.llvm.org/pipermail/flang-dev/2020-July/000448.html>. Reviewed By: klausler Differential Revision: https://reviews.llvm.org/D88504	2020-09-30 21:28:34 -05:00
Dan Gohman	6cd8511e59	[WebAssembly] New-style command support This adds new-style command support. In this mode, all exports are considered command entrypoints, and the linker inserts calls to `__wasm_call_ctors` and `__wasm_call_dtors` for all such entrypoints. This enables support for: - Command entrypoints taking arguments other than strings and return values other than `int`. - Multicall executables without requiring the use of string-based command-line arguments. This new behavior is disabled when the input has an explicit call to `__wasm_call_ctors`, indicating code not expecting new-style command support. This change does mean that wasm-ld no longer supports DCE-ing the `__wasm_call_ctors` function when there are no calls to it. If there are no calls to it, and there are ctors present, we assume it's wasm-ld's job to insert the calls. This seems ok though, because if there are ctors present, the program is expecting them to be called. This change affects the init-fini-gc.ll test.	2020-09-30 19:02:40 -07:00
Michael Kruse	b656189e6a	[flang][msvc] Avoid ReferenceVariantBase ctor ambiguity. NFC. Msvc reports the following error when a ReferenceVariantBase is constructed using an r-value reference or instantiated as std::vector template parameter. The error message is: ``` PFTBuilder.h(59,1): error C2665: 'std::variant<...>::variant': none of the 2 overloads could convert all the argument types variant(1248,1): message : could be 'std::variant<...>::variant(std::variant<...> &&) noexcept(false)' variant(1248,1): message : or 'std::variant<...>::variant(const std::variant<...> &) noexcept(false)' PFTBuilder.h(59,1): message : while trying to match the argument list '(common::Reference<lower::pft::ReferenceVariantBase<false,...>>)' ``` Work around the ambiguity by only taking `common::Reference` arguments in the constructor. That is, conversion to common::Reference has to be done by the caller instead of being done inside the ctor. Unfortunately, with this change clang/gcc (but not msvc) insist that the ReferenceVariantBase is stored in a `std::initializer_list`-initialized variable before being used, like being passed to a function or returned. This patch is part of the series to make flang compilable with MS Visual Studio <http://lists.llvm.org/pipermail/flang-dev/2020-July/000448.html>. Reviewed By: DavidTruby Differential Revision: https://reviews.llvm.org/D88109	2020-09-30 20:54:09 -05:00
Amara Emerson	196c097bba	[AArch64][GlobalISel] Clamp oversize FP arithmetic vectors.	2020-09-30 18:03:37 -07:00
River Riddle	f050553490	[mlir] Split Dialect::addOperations into two functions The current implementation uses a fold expression to add all of the operations at once. This is really nice, but apparently the lifetime of each of the AbstractOperation instances is for the entire expression which may lead to a stack overflow for large numbers of operations. This splits the method in two to allow for the lifetime of the AbstractOperation to be properly scoped.	2020-09-30 18:03:14 -07:00
peter klausler	4fb679d3b1	[flang] Fix Gw.d format output The estimation of the decimal exponent needs to allow for all 'd' of the requested significant digits. Also accept a plus sign on a "+kP" scaling factor in a format. Differential revision: https://reviews.llvm.org/D88618	2020-09-30 18:02:25 -07:00
Geoffrey Martin-Noble	d4e889f1f5	Remove `Ops` suffix from dialect library names Dialects include more than just ops, so this suffix is outdated. Follows discussion in https://llvm.discourse.group/t/rfc-canonical-file-paths-to-dialects/621 Reviewed By: stellaraccident Differential Revision: https://reviews.llvm.org/D88530	2020-09-30 18:00:44 -07:00
Sam Clegg	3c45a06f26	[lld][WebAssembly] Allow exporting of mutable globals In particular allow explicit exporting of `__stack_pointer` but exclude this from `--export-all` to avoid requiring the mutable globals feature whenever `--export-all` is used. This uncovered a bug in populateTargetFeatures regarding checking if the mutable-globals feature is allowed. See: https://github.com/WebAssembly/binaryen/issues/2934 Differential Revision: https://reviews.llvm.org/D88506	2020-09-30 17:53:27 -07:00
Amara Emerson	93a1fc2e18	Try to fix build. May have used a C++ feature too new/not supported on all platforms.	2020-09-30 17:36:38 -07:00
Arthur Eubanks	460dda071e	[WholeProgramDevirt][NewPM] Add NPM testing path to match legacy pass The legacy pass's default constructor sets UseCommandLine = true and goes down a separate testing route. Match that in the NPM pass. This fixes all tests in llvm/test/Transforms/WholeProgramDevirt under NPM. Reviewed By: ychen Differential Revision: https://reviews.llvm.org/D88588	2020-09-30 17:27:37 -07:00
Amara Emerson	4ab45cc226	[AArch64][GlobalISel] Add some more legal types for G_PHI, G_IMPLICIT_DEF, G_FREEZE. Also use this opportunity start to clean up the mess of vector type lists we have in the LegalizerInfo. Unfortunately since the legalizer rule builders require std::initializer_list objects as parameters we can't programmatically generate the type lists.	2020-09-30 17:25:33 -07:00
peter klausler	e24f0ac7a3	[flang] Allow record advancement in external formatted sequential READ The '/' control edit descriptor causes a runtime crash for an external formatted sequential READ because the AdvanceRecord() member function for external units implemented only the tasks to finish reading the current record. Split those out into a new FinishReadingRecord() member function, call that instead from EndIoStatement(), and change AdvanceRecord() to both finish reading the current record and to begin reading the next one. Differential revision: https://reviews.llvm.org/D88607	2020-09-30 17:16:55 -07:00
Jonas Devlieghere	d689570d7d	[lldb] Make TestGuiBasicDebug more lenient Matt's change to the register allocator in `89baeaef2f` changed where we end up after the `finish`. Before we'd end up on line 4. * thread #1, queue = 'com.apple.main-thread', stop reason = step out Return value: (int) $0 = 1 frame #0: 0x0000000100003f7d a.out`main(argc=1, argv=0x00007ffeefbff630) at main.c:4:3 1 extern int func(); 2 3 int main(int argc, char *argv) { -> 4 func(); // Break here 5 func(); // Second 6 return 0; 7 } Now, we end up on line 5. thread #1, queue = 'com.apple.main-thread', stop reason = step out Return value: (int) $0 = 1 frame #0: 0x0000000100003f8d a.out`main(argc=1, argv=0x00007ffeefbff630) at main.c:5:3 2 3 int main(int argc, char **argv) { 4 func(); // Break here -> 5 func(); // Second 6 return 0; 7 } Given that this is not expected to be stable I've made the test a bit more lenient to accept both scenarios.	2020-09-30 17:06:47 -07:00
Jessica Paquette	bc43ddf42f	[AArch64][GlobalISel] NFC: Refactor G_FCMP selection code Refactor this so it's similar to the existing integer comparison code. Also add some missing 64-bit testcases to select-fcmp.mir. Refactoring to prep for improving selection for G_FCMP-related conditional branches etc. Differential Revision: https://reviews.llvm.org/D88614	2020-09-30 16:50:39 -07:00
Ranjeet Singh	e4f50e587f	[ARM] Add missing target for Arm neon test case. This is a follow-up from https://reviews.llvm.org/D61717. Where Richard described the issue with compiling arm_neon.h under -flax-vector-conversions=none. It looks like the example reproducer does actually work but what was missing was a test entry for that target. Differential Revision: https://reviews.llvm.org/D88546	2020-10-01 00:32:33 +01:00
Joachim Protze	23419bfd1c	[OpenMP][libarcher] Allow all possible argument separators in TSAN_OPTIONS Currently, the parser used to tokenize the TSAN_OPTIONS in libomp uses only spaces as separators, even though TSAN in compiler-rt supports other separators like ':' or ','. CTest uses ':' to separate sanitizer options by default. The documentation for other sanitizers mentions ':' as separator, but TSAN only lists spaces, which is probably where this mismatch originated. Patch provided by upsj Differential Revision: https://reviews.llvm.org/D87144	2020-10-01 01:10:13 +02:00
Craig Topper	b23916504a	Patch IEEEFloat::isSignificandAllZeros and IEEEFloat::isSignificandAllOnes (bug 34579) Patch IEEEFloat::isSignificandAllZeros and IEEEFloat::isSignificandAllOnes to behave correctly in the case that the size of the significand is a multiple of the width of the integerParts making up the significand. The patch to IEEEFloat::isSignificandAllOnes fixes bug 34579, and the patch to IEEEFloat::isSignificandAllZeros fixes the unit test "APFloatTest.x87Next" I added here. I have included both in this diff since the changes are very similar. Patch by Andrew Briand	2020-09-30 16:07:15 -07:00
Ahsan Saghir	66d2e3f495	[PowerPC] Add outer product instructions for MMA This patch adds outer product instructions for MMA, including related infrastructure, and their tests. Depends on D84968. Reviewed By: #powerpc, bsaleil, amyk Differential Revision: https://reviews.llvm.org/D88043	2020-09-30 18:06:49 -05:00
Akira Hatanaka	21cf2e6c26	Handle unknown OSes in DarwinTargetInfo::getExnObjectAlignment rdar://problem/69727650	2020-09-30 16:05:17 -07:00
Joachim Protze	6104b30446	[OpenMP][OMPT] Update OMPT tests for newly added GOMP interface patches This patch updates the expected results for the GOMP interface patches: D87267, D87269, and D87271. The taskwait-depend test is changed to really use taskwait-depend and copied to an task_if0-depend test. To pass the tests, the handling of the return address was fixed. Differential Revision: https://reviews.llvm.org/D87680	2020-10-01 00:53:41 +02:00
Joachim Protze	55cff5b288	[OpenMP][libomptarget] make omp_get_initial_device 5.1 compliant OpenMP 5.1 defines omp_get_initial_device to return the same value as omp_get_num_devices. Since this change is also 5.0 compliant, no versioning is needed. Differential Revision: https://reviews.llvm.org/D88149	2020-10-01 00:51:11 +02:00
peter klausler	37b2e2b04c	[flang] Semantic analysis for FINAL subroutines Represent FINAL subroutines in the symbol table entries of derived types. Enforce constraints. Update tests that have inadvertent violations or modified messages. Added a test. The specific procedure distinguishability checking code for generics was used to enforce distinguishability of FINAL procedures. (Also cleaned up some confusion and redundancy noticed in the type compatibility infrastructure while digging into that area.) Differential revision: https://reviews.llvm.org/D88613	2020-09-30 15:46:15 -07:00
Reid Kleckner	5519e4da83	Re-land "[PDB] Merge types in parallel when using ghashing" Stored Error objects have to be checked, even if they are success values. This reverts commit `8d250ac3cd`. Relands commit 49b3459930655d879b2dc190ff8fe11c38a8be5f. Original commit message: ----------------------------------------- This makes type merging much faster (-24% on chrome.dll) when multiple threads are available, but it slightly increases the time to link (+10%) when /threads:1 is passed. With only one more thread, the new type merging is faster (-11%). The output PDB should be identical to what it was before this change. To give an idea, here is the /time output placed side by side: BEFORE | AFTER Input File Reading: 956 ms | 968 ms Code Layout: 258 ms | 190 ms Commit Output File: 6 ms | 7 ms PDB Emission (Cumulative): 6691 ms | 4253 ms Add Objects: 4341 ms | 2927 ms Type Merging: 2814 ms | 1269 ms -55%! Symbol Merging: 1509 ms | 1645 ms Publics Stream Layout: 111 ms | 112 ms TPI Stream Layout: 764 ms | 26 ms trivial Commit to Disk: 1322 ms | 1036 ms -300ms ----------------------------------------- -------- Total Link Time: 8416 ms 5882 ms -30% overall The main source of the additional overhead in the single-threaded case is the need to iterate all .debug$T sections up front to check which type records should go in the IPI stream. See fillIsItemIndexFromDebugT. With changes to the .debug$H section, we could pre-calculate this info and eliminate the need to do this walk up front. That should restore single-threaded performance back to what it was before this change. This change will cause LLD to be much more parallel than it used to, and for users who do multiple links in parallel, it could regress performance. However, when the user is only doing one link, it's a huge improvement.
In the future, we can use NT worker threads to avoid oversaturating the machine with work, but for now, this is such an improvement for the single-link use case that I think we should land this as is. Algorithm ---------- Before this change, we essentially used a DenseMap<GloballyHashedType, TypeIndex> to check if a type has already been seen, and if it hasn't been seen, insert it now and use the next available type index for it in the destination type stream. DenseMap does not support concurrent insertion, and even if it did, the linker must be deterministic: it cannot produce different PDBs by using different numbers of threads. The output type stream must be in the same order regardless of the order of hash table insertions. In order to create a hash table that supports concurrent insertion, the table cells must be small enough that they can be updated atomically. The algorithm I used for updating the table using linear probing is described in this paper, "Concurrent Hash Tables: Fast and General(?)!": https://dl.acm.org/doi/10.1145/3309206 The GHashCell in this change is essentially a pair of 32-bit integer indices: <sourceIndex, typeIndex>. The sourceIndex is the index of the TpiSource object, and it represents an input type stream. The typeIndex is the index of the type in the stream. Together, we have something like a ragged 2D array of ghashes, which can be looked up as: tpiSources[tpiSrcIndex]->ghashes[typeIndex] By using these side tables, we can omit the key data from the hash table, and keep the table cell small. There is a cost to this: resolving hash table collisions requires many more loads than simply looking at the key in the same cache line as the insertion position. However, most supported platforms should have a 64-bit CAS operation to update the cell atomically. To make the result of concurrent insertion deterministic, the cell payloads must have a priority function. 
Defining one is pretty straightforward: compare the two 32-bit numbers as a combined 64-bit number. This means that types coming from inputs earlier on the command line have a higher priority and are more likely to appear earlier in the final PDB type stream than types from an input appearing later on the link line. After table insertion, the non-empty cells in the table can be copied out of the main table and sorted by priority to determine the ordering of the final type index stream. At this point, item and type records must be separated, either by sorting or by splitting into two arrays, and I chose sorting. This is why the GHashCell must contain the isItem bit. Once the final PDB TPI stream ordering is known, we need to compute a mapping from source type index to PDB type index. To avoid starting over from scratch and looking up every type again by its ghash, we save the insertion position of every hash table insertion during the first insertion phase. Because the table does not support rehashing, the insertion position is stable. Using the array of insertion positions indexed by source type index, we can replace the source type indices in the ghash table cells with the PDB type indices. Once the table cells have been updated to contain PDB type indices, the mapping for each type source can be computed in parallel. Simply iterate the list of cell positions and replace them with the PDB type index, since the insertion positions are no longer needed. Once we have a source to destination type index mapping for every type source, there are no more data dependencies. We know which type records are "unique" (not duplicates), and what their final type indices will be. We can do the remapping in parallel, and accumulate type sizes and type hashes in parallel by type source. Lastly, TPI stream layout must be done serially. Accumulate all the type records, sizes, and hashes, and add them to the PDB. 
Differential Revision: https://reviews.llvm.org/D87805	2020-09-30 15:44:38 -07:00
Stanislav Mekhanoshin	722d792499	[AMDGPU] Reorganize VOP3P encoding This changes width of encoding and opcode fields to match the documentation. Differential Revision: https://reviews.llvm.org/D88619	2020-09-30 15:27:06 -07:00
Vitaly Buka	7475bd5411	[Msan] Add ptsname, ptsname_r interceptors Reviewed By: eugenis, MaskRay Differential Revision: https://reviews.llvm.org/D88547	2020-09-30 15:00:52 -07:00
MaheshRavishankar	c694588fc5	[mlir][Linalg] Add pattern to tile and fuse Linalg operations on buffers. The pattern is structured similarly to other patterns like LinalgTilingPattern. The fusion pattern takes options that allow you to fuse with producers of multiple operands at once. - The pattern fuses only at the level that is known to be legal, i.e. if a reduction loop in the consumer is tiled, then fusion should happen "before" this loop. Some refactoring of the fusion code is needed to fuse only where it is legal. - Since the fusion on buffers uses the LinalgDependenceGraph, which is not mutable in place, the fusion pattern keeps the original operations in the IR, but they are tagged with a marker that can be later used to find the original operations. This change also fixes an issue with tiling and distribution/interchange where a loop tile size of 0 wasn't accounted for.	Differential Revision: https://reviews.llvm.org/D88435	2020-09-30 14:56:58 -07:00
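
The priority rule described in the "[PDB] Merge types in parallel when using ghashing" message above (compare the two 32-bit indices as one combined 64-bit number, then sort the surviving cells) can be illustrated with a minimal sketch. This is not LLD's code: `pack_cell`, `insert`, and `final_order` are hypothetical names, a plain Python dict stands in for the linear-probing table, a sequential update stands in for the 64-bit CAS, and the isItem bit is omitted.

```python
import random


def pack_cell(source_index: int, type_index: int) -> int:
    """Combine two 32-bit indices into one 64-bit priority value (hypothetical helper)."""
    assert 0 <= source_index < 2**32 and 0 <= type_index < 2**32
    return (source_index << 32) | type_index


def insert(table: dict, ghash: str, cell: int) -> None:
    """Keep the cell with the smallest packed value for each ghash.

    A concurrent table would do this with a compare-and-swap retry loop;
    a plain dict update stands in for it here (assumption of this sketch).
    """
    current = table.get(ghash)
    if current is None or cell < current:
        table[ghash] = cell


def final_order(records):
    """records: iterable of (ghash, source_index, type_index) tuples."""
    table = {}
    for ghash, src, ty in records:
        insert(table, ghash, pack_cell(src, ty))
    # Sort the surviving cells by priority to fix the output order.
    return [ghash for ghash, _ in sorted(table.items(), key=lambda kv: kv[1])]


records = [
    ("hashA", 1, 0),
    ("hashB", 0, 1),
    ("hashA", 0, 0),  # duplicate of hashA coming from an earlier input
    ("hashC", 1, 1),
]
shuffled = records[:]
random.Random(7).shuffle(shuffled)

order_a = final_order(records)
order_b = final_order(shuffled)
```

Because the smallest packed value always wins, `order_a` and `order_b` come out identical whatever the insertion order, which is the determinism property the commit message asks for.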


367804 Commits All Branches Search