llvm-project

Commit Graph

Author	SHA1	Message	Date
Konstantin Zhuravlyov	6054a456da	AMDGPU: Add support for amdgpu-unsafe-fp-atomics attribute If amdgpu-unsafe-fp-atomics is specified, allow {flat\|global}_atomic_add_f32 even if atomic modes don't match. Differential Revision: https://reviews.llvm.org/D95391	2021-02-04 08:09:34 -05:00
Dylan McKay	83e2710eb0	[AVR] Remove an assertion that causes generic CodeGen tests to fail It was discussed a few years ago and agreed that it makes sense to remove this assertion as other targets do not perform similar register size checking in inline assembly constraint logic, so the check just adds a needless barrier on AVR. This patch removes the assertion and removes 'XFAIL' from two Generic CodeGen tests for AVR as a result.	2021-02-05 02:05:23 +13:00
Andrzej Warzynski	62e4f22e29	[clang] Add AddClang.cmake to the list of the CMake modules that are installed This makes sure that AddClang.cmake is installed alongside other Clang CMake modules. This mirrors LLVM and MLIR in this respect and is required when building the new Flang driver out of tree (as it depends on Clang and includes AddClang.cmake). Reviewed By: bogner Differential Revision: https://reviews.llvm.org/D94533	2021-02-04 12:38:38 +00:00
Faris Rehman	3a1513c142	[flang][driver] Add forced form flags and -ffixed-line-length Add support for the following layout options: * -ffree-form * -ffixed-form - -ffixed-line-length=n (alias -ffixed-line-length-n) Additionally remove options `-fno-free-form` and `-fno-fixed-form` as they were initially added to forward to gfortran but gfortran does not support these flags. This patch adds the flag FlangOnlyOption to the existing options `-ffixed-form`, `-ffree-form` and `-ffree-line-length-` in Options.td. As of commit `6a75496836`, these flags are not currently forwarded to gfortran anyway. The default fixed line length in FrontendOptions is 72, based off the current default in Fortran::parser::Options. The line length cannot be set to a negative integer, or a positive integer less than 7 excluding 0, consistent with the behaviour of gfortran. This patch does not add `-ffree-line-length-n` as Fortran::parser::Options does not have a variable for free form columns. Whilst the `fixedFormColumns` variable is used in f18 for `-ffree-line-length-n`, f18 only allows `-ffree-line-length-none`/`-ffree-line-length-0` and not a user-specified value. `fixedFormcolumns` cannot be used in the new driver as it is ignored in the frontend when dealing with free form files. Summary of changes: - Remove -fno-fixed-form and -fno-free-form from Options.td - Make -ffixed-form, -ffree-form and -ffree-line-length-n FlangOnlyOption in Options.td - Create AddFortranDialectOptions method in Flang.cpp - Create FortranForm enum in FrontendOptions.h - Add fortranForm_ and fixedFormColumns_ to Fortran::frontend::FrontendOptions - Update fixed-form-test.f so that it guarantees that it fails when forced as a free form file to better facilitate testing. Differential Revision: https://reviews.llvm.org/D95460	2021-02-04 12:24:15 +00:00
Simon Pilgrim	fa2cdb8140	[X86] Remove stale TODO comment. NFC. We now handle implicit zero-extension shuffle mask cases.	2021-02-04 12:14:05 +00:00
Nico Weber	4874ff0241	Revert "[hip][cuda] Enable extended lambda support on Windows." This reverts commit `a2fdf9d4d7`. Slightly speculative, seeing several cuda tests fail on this Windows bot: http://45.33.8.238/win/32620/step_7.txt	2021-02-04 07:10:46 -05:00
Nico Weber	26ca503bd2	[gn build] (manually) port `0609f257dc`	2021-02-04 06:52:55 -05:00
Sander de Smalen	8f69da9f97	[ElementCount] NFC: Set 'const' qualifier for getWithIncrement/Decrement. These class methods simply return a new UnivariateLinearPolyBase (e.g. ElementCount), and do not modify the object in any way or form, so qualify for being 'const'.	2021-02-04 11:27:45 +00:00
Nicolas Vasilache	f4ac9f0334	[mlir][Linalg] Drop SliceOp This op is subsumed by rank-reducing SubViewOp and has become useless. Differential revision: https://reviews.llvm.org/D95317	2021-02-04 11:22:01 +00:00
Jeremy Morse	8998f58435	Re-land D94976 after revert in `e29552c5af` This modified patch avoids redirecting the unit in which a subprogram is created if type units are enabled -- DIEs were getting children allocated from different units' memory pools. Original commit message: [DWARF] Create subprogram's DIE in DISubprogram's unit This is a fix for PR48790. Over in D70350, subprogram DIEs were permitted to be shared between CUs. However, the creation of a subprogram DIE can be triggered early, from other CUs. The subprogram definition is then created in one CU, and when the function is actually emitted children are attached to the subprogram that expect to be in another CU. This breaks internal CU references in the children. Fix this by redirecting the creation of subprogram DIEs in getOrCreateContextDIE to the CU specified by its DISubprogram definition. This ensures that the subprogram DIE is always created in the correct CU. Differential Revision: https://reviews.llvm.org/D94976	2021-02-04 11:17:18 +00:00
David Green	649a3d00df	[ARM] Handle f16 in GeneratePerfectShuffle This new f16 shuffle under Neon would hit an assert in GeneratePerfectShuffle as it would try to treat a f16 vector as an i8. Add f16 handling, treating them like an i16. Differential Revision: https://reviews.llvm.org/D95446	2021-02-04 11:14:52 +00:00
Alex Zinenko	ba87f99168	[mlir] make vector to llvm conversion truly partial Historically, the Vector to LLVM dialect conversion subsumed the Standard to LLVM dialect conversion patterns. This was necessary because the conversion infrastructure did not have sufficient support for reconciling type conversions. This support is now available. Only keep the patterns related to the Vector dialect in the Vector to LLVM conversion and require type casts operations to be inserted if necessary. These casts will be removed by following conversions if possible. Update integration tests to also run the Standard to LLVM conversion. There is a significant amount of test churn, which is due to (a) unnecessarily strict tests in VectorToLLVM and (b) many patterns actually targeting Standard dialect ops instead of LLVM dialect ops leading to tests actually exercising a Vector->Standard->LLVM conversion. This churn is a good illustration of the reason to make the conversion partial: now the tests only check the code in the Vector to LLVM conversion and will not be randomly broken by changes in Standard to LLVM conversion. Arguably, it may be possible to extract Vector to Standard patterns into a separate pass, but given the ongoing splitting of the Standard dialect, such pass will be short-lived and will require further refactoring. Depends On D95626 Reviewed By: nicolasvasilache, aartbik Differential Revision: https://reviews.llvm.org/D95685	2021-02-04 11:33:24 +01:00
Pavel Labath	aa56b30014	[lldb] Make TestLocalVariables.py compatible with the new pass manager The new PM is more aggressive at inlining, which breaks assumptions in the test => add some __attribute__((noinline)) to prevent that.	2021-02-04 11:27:08 +01:00
Alex Zinenko	5b91060dcc	[mlir] Apply source materialization in case of transitive conversion In dialect conversion infrastructure, source materialization applies as part of the finalization procedure to results of the newly produced operations that replace previously existing values with values having a different type. However, such operations may be created to replace operations created in other patterns. At this point, it is possible that the results of the _original_ operation are still in use and have mismatching types, but the results of the _intermediate_ operation that performed the type change are not in use leading to the absence of source materialization. For example, %0 = dialect.produce : !dialect.A dialect.use %0 : !dialect.A can be replaced with %0 = dialect.other : !dialect.A %1 = dialect.produce : !dialect.A // replaced, scheduled for removal dialect.use %1 : !dialect.A and then with %0 = dialect.final : !dialect.B %1 = dialect.other : !dialect.A // replaced, scheduled for removal %2 = dialect.produce : !dialect.A // replaced, scheduled for removal dialect.use %2 : !dialect.A in the same rewriting, but only the %1->%0 replacement is currently considered. Change the logic in dialect conversion to look up all values that were replaced by the given value and performing source materialization if any of those values is still in use with mismatching types. This is performed by computing the inverse value replacement mapping. This arguably expensive manipulation is performed only if there were some type-changing replacements. An alternative could be to consider all replaced operations and not only those that resulted in type changes, but it would harm pattern-level composability: the pattern that performed the non-type-changing replacement would have to be made aware of the type converter in order to call the materialization hook. Reviewed By: rriddle Differential Revision: https://reviews.llvm.org/D95626	2021-02-04 11:15:11 +01:00
Hans Wennborg	6625680a58	[clang-cl] Remove the /fallback option As discussed in https://lists.llvm.org/pipermail/cfe-dev/2021-January/067524.html It doesn't appear to be used, isn't really maintained, and adds some complexity to the code. Let's remove it. Differential revision: https://reviews.llvm.org/D95876	2021-02-04 10:33:16 +01:00
Jan Svoboda	225ccf0c50	[clang][cli] Command line round-trip for HeaderSearch options This patch implements generation of remaining header search arguments. It's done manually in C++ as opposed to TableGen, because we need the flexibility and don't anticipate reuse. This patch also tests the generation of header search options via a round-trip. This way, the code gets exercised whenever Clang is built and tested in asserts mode. All `check-clang` tests pass. Reviewed By: dexonsmith Differential Revision: https://reviews.llvm.org/D94472	2021-02-04 10:18:34 +01:00
Joachim Meyer	e3f02302e3	[Support] Indent multi-line descr of enum cli options. As noted in https://reviews.llvm.org/D93459, the formatting of multi-line descriptions of clEnumValN and the likes is unfavorable. Thus this patch adds support for correctly indenting these. Reviewed By: serge-sans-paille Differential Revision: https://reviews.llvm.org/D93494	2021-02-04 10:14:44 +01:00
Sebastian Neubauer	6c59dc474d	[AMDGPU] Save all lanes for reserved VGPRs When SGPRs are spilled to VGPRs, they can overwrite any lane. We need to preserve the value of inactive lanes in function calls, so we save the register even if it is marked as caller saved. Also, teach buildPrologSpill to work when no registers are free like in CodeGen/AMDGPU/pei-scavenge-vgpr-spill.mir and update the comment on findScratchNonCalleeSaveRegister as it is not used anymore to realign the stack pointer since D95865. Differential Revision: https://reviews.llvm.org/D95946	2021-02-04 09:56:36 +01:00
Kirill Bobyrev	5eec9a380a	[clangd] Detect rename conflicts within enclosing scope This patch allows detecting conflicts with variables defined in the current CompoundStmt or If/While/For variable init statements. Reviewed By: hokein Differential Revision: https://reviews.llvm.org/D95925	2021-02-04 09:45:42 +01:00
Haojian Wu	6c1a23303d	[Syntax] Support condition for IfStmt. Differential Revision: https://reviews.llvm.org/D95782	2021-02-04 09:15:30 +01:00
Nicolas Vasilache	f245b7ad36	[mlir][Linalg] Generalize the definition of a Linalg contraction. This revision defines a Linalg contraction in general terms: 1. Has 2 input and 1 output shapes. 2. Has at least one reduction dimension. 3. Has only projected permutation indexing maps. 4. its body computes `u5(u1(c) + u2(u3(a) * u4(b)))` on some field (AddOpType, MulOpType), where u1, u2, u3, u4 and u5 represent scalar unary operations that may change the type (e.g. for mixed-precision). As a consequence, when vectorization of such an op occurs, the only special behavior is that the (unique) MulOpType is vectorized into a `vector.contract`. All other ops are handled in a generic fashion. In the future, we may wish to allow more input arguments and elementwise and constant operations that do not involve the reduction dimension(s). A test is added to demonstrate the proper vectorization of matmul_i8_i8_i32. Differential revision: https://reviews.llvm.org/D95939	2021-02-04 07:50:44 +00:00
Richard Smith	3b9de993c9	Give this test a target triple.	2021-02-03 23:38:52 -08:00
Richard Smith	cde8d2fddb	Fix miscompile when performing template instantiation of non-dependent doubly-nested implicit CXXConstructExprs. Ensure that we transform the parameter initializer using TransformInitializer rather than TransformExpr so that we properly strip down and rebuild the initialization, including any necessary CXXBindTemporaryExprs. Otherwise we can end up forgetting to destroy temporary objects used to construct a constructor parameter.	2021-02-03 23:38:02 -08:00
Nicolas Vasilache	1029c82c1e	[mlir][Linalg] NFC - Extract a standalone LinalgInterfaces This separation improves the layering and paves the way for more interfaces coming up in the future. Differential revision: https://reviews.llvm.org/D95941	2021-02-04 07:19:38 +00:00
Michael Liao	a2fdf9d4d7	[hip][cuda] Enable extended lambda support on Windows. - On Windows, extended lambda has extra issues because the numbering schemes differ between the host compilation (Microsoft C++ ABI) and the device compilation (Itanium C++ ABI). An additional device-side lambda number is required per lambda for the host compilation to correctly mangle the device-side lambda name. - A hybrid numbering context `MSHIPNumberingContext` is introduced to number a lambda for both host- and device-compilations. Reviewed By: rnk Differential Revision: https://reviews.llvm.org/D69322	2021-02-04 01:38:29 -05:00
wlei	ac14bb14e7	[CSSPGO][llvm-profgen] Compress recursive cycles in calling context This change compresses the context string by removing cycles due to recursive functions for CS profile generation. Removing recursion cycles is a way to normalize the calling context, which is better for sample aggregation and also makes context promotion deterministic. Specifically for implementation, we recognize adjacent repeated frames as cycles and deduplicate them through multiple rounds of iteration. For example: Consider an input context string stack: [“a”, “a”, “b”, “c”, “a”, “b”, “c”, “b”, “c”, “d”] The first iteration removes all adjacent repeated frames of size 1: [“a”, “b”, “c”, “a”, “b”, “c”, “b”, “c”, “d”] The second iteration removes all adjacent repeated frames of size 2: [“a”, “b”, “c”, “a”, “b”, “c”, “d”] A third iteration removes the adjacent repeated frames of size 3, so in the end we get the compressed output: [“a”, “b”, “c”, “d”] Compression is called in two places: once for the sample's context key right after unwinding, and once for the eventual context string id in the ProfileGenerator. Added a switch `compress-recursion` to control the size of duplicated frames; the default, -1, means no size limit. Added unit tests and a regression test for this. Differential Revision: https://reviews.llvm.org/D93556	2021-02-03 22:16:07 -08:00
wlei	6bccdcdb35	Revert "[CSSPGO][llvm-profgen] Compress recursive cycles in calling context" This reverts commit `0609f257dc`.	2021-02-03 22:16:05 -08:00
wlei	08e8bb60cf	Revert "[CSSPGO][llvm-profgen] Aggregate samples on call frame trie to speed up profile generation" This reverts commit `1714ad2336`.	2021-02-03 22:16:05 -08:00
Ben Barham	a2c1054c30	[ASTReader] Always rebuild a cached module that has errors A module in the cache with an error should just be a cache miss. If allowing errors (with -fallow-pcm-with-compiler-errors), a rebuild is needed so that the appropriate diagnostics are output and in case search paths have changed. If not allowing errors, the module was built allowing errors and thus should be rebuilt regardless. Reviewed By: akyrtzi Differential Revision: https://reviews.llvm.org/D95989	2021-02-03 22:06:46 -08:00
Petr Hosek	b42ccdf38f	[NFC] Fix the noprofile attribute comment	2021-02-03 21:54:09 -08:00
Dave Lee	0ed758b260	[lldb] Convert more assertTrue to assertEqual (NFC) Follow up to D95813, this converts multiline assertTrue to assertEqual. Differential Revision: https://reviews.llvm.org/D95899	2021-02-03 21:15:08 -08:00
Chuanqi Xu	9511fa2dda	[NFC][Coroutine] Remove redundant comment The functionality in the TODO was added before: https://reviews.llvm.org/rGb3a722e66b75328ab5e2eb5c8572022cb083855b	2021-02-04 12:54:30 +08:00
Kazu Hirata	be37475897	[Transforms/IPO] Use range-based for loops (NFC)	2021-02-03 20:41:20 -08:00
Kazu Hirata	643c00f717	[TableGen] Use ListSeparator (NFC)	2021-02-03 20:41:18 -08:00
Kazu Hirata	b4de30f6af	[Support] Drop unnecessary const from return types (NFC) Identified with const-return-type.	2021-02-03 20:41:16 -08:00
Akira Hatanaka	aade0ec23b	Fix the guaranteed alignment of memory returned by malloc/new on Darwin The guaranteed alignment is 16 bytes on Darwin. rdar://73431623 Differential Revision: https://reviews.llvm.org/D95910	2021-02-03 19:40:51 -08:00
Arthur Eubanks	781a1b1e36	[test] Pin spir-codegen.ll to legacy PM -polly-enable-delicm is not supported under the new PM but is tested here: Assertion `!EnableDeLICM && "This option is not implemented"' failed.	2021-02-03 19:37:32 -08:00
wlei	1714ad2336	[CSSPGO][llvm-profgen] Aggregate samples on call frame trie to speed up profile generation For CS profile generation, the process of call stack unwinding is time-consuming since for each LBR entry we need linear time to generate the context (hash, compression, string concatenation). This change speeds this up by grouping all the call frames within one LBR sample into a trie and aggregating the result (sample counter) on it, deferring the context compression and string generation to the end of unwinding. Specifically, it uses `StackLeaf` as the top frame on the stack and manipulates it dynamically (popping or pushing a trie node) during virtual unwinding, so that the raw sample can just be recorded on the leaf node and the path (root to leaf) represents its calling context. In the end, it traverses the trie and generates the context on the fly. Results: Our internal branch shows about 5X speed-up on some large workloads in SPEC06 benchmark. Differential Revision: https://reviews.llvm.org/D94110	2021-02-03 18:50:14 -08:00
wlei	0609f257dc	[CSSPGO][llvm-profgen] Compress recursive cycles in calling context This change compresses the context string by removing cycles due to recursive functions for CS profile generation. Removing recursion cycles is a way to normalize the calling context, which is better for sample aggregation and also makes context promotion deterministic. Specifically for implementation, we recognize adjacent repeated frames as cycles and deduplicate them through multiple rounds of iteration. For example: Consider an input context string stack: [“a”, “a”, “b”, “c”, “a”, “b”, “c”, “b”, “c”, “d”] The first iteration removes all adjacent repeated frames of size 1: [“a”, “b”, “c”, “a”, “b”, “c”, “b”, “c”, “d”] The second iteration removes all adjacent repeated frames of size 2: [“a”, “b”, “c”, “a”, “b”, “c”, “d”] A third iteration removes the adjacent repeated frames of size 3, so in the end we get the compressed output: [“a”, “b”, “c”, “d”] Compression is called in two places: once for the sample's context key right after unwinding, and once for the eventual context string id in the ProfileGenerator. Added a switch `compress-recursion` to control the size of duplicated frames; the default, -1, means no size limit. Added unit tests and a regression test for this. Differential Revision: https://reviews.llvm.org/D93556	2021-02-03 18:50:14 -08:00
Isuru Fernando	c95c0db2eb	[MLIR] Fix building unittests in in-tree build Reviewed By: mehdi_amini Differential Revision: https://reviews.llvm.org/D95978	2021-02-04 01:59:12 +00:00
Mehdi Amini	a1d5bdf819	Make the folder more robust against op fold() methods that generate a type mismatch We could extend this with an interface to allow dialect to perform a type conversion, but that would make the folder creating operation which isn't the case at the moment, and isn't necessarily always desirable. Reviewed By: rriddle Differential Revision: https://reviews.llvm.org/D95991	2021-02-04 01:58:56 +00:00
Shilei Tian	0f0ce3c12e	[OpenMP][NVPTX] Take functions in `deviceRTLs` as `convergent` OpenMP device compiler (similar to other SPMD compilers) assumes that functions are convergent by default to avoid invalid transformations, such as the bug (https://bugs.llvm.org/show_bug.cgi?id=49021). Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D95971	2021-02-03 20:58:12 -05:00
Michael Kruse	26b5be66f9	[OpenMPIRBuilder] Implement collapseLoops. The collapseLoops method implements a transformation facilitating the implementation of the collapse-clause. It takes a list of loops from a loop nest and reduces it to a single loop that can be used by other methods that are implemented on just a single loop, such as createStaticWorkshareLoop. This patch shares some changes with D92974 (such as adding some getters to CanonicalLoopNest), used by both patches. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D93268	2021-02-03 19:12:02 -06:00
Jonas Devlieghere	e3bb1c80fe	[lldb] Rollback to using i386 for the watch simulator I switched the watch simulator test from i386 to using x86_64, but apparently that's not supported on the bots. Rollback to using i386 and solve the original issue by passing the target, similar to what I did in TestSimulatorPlatform.py.	2021-02-03 16:32:55 -08:00
wlei	c82b24f475	[CSSPGO][llvm-profgen] Pseudo probe based CS profile generation This change implements profile generation infra for pseudo probe in llvm-profgen. During virtual unwinding, the raw profile is extracted into range counter and branch counter and aggregated to sample counter map indexed by the call stack context. This change introduces the last step and produces the eventual profile. Specifically, the body of function sample is recorded by going through each probe among the range and callsite target sample is recorded by extracting the callsite probe from branch's source. Please refer to https://groups.google.com/g/llvm-dev/c/1p1rdYbL93s and https://reviews.llvm.org/D89707 for more context about CSSPGO and llvm-profgen. Implementation - Extended `PseudoProbeProfileGenerator` for pseudo probe based profile generation. - `populateBodySamplesWithProbes` reading range counter is responsible for recording function body samples and inferring caller's body samples. - `populateBoundarySamplesWithProbes` reading branch counter is responsible for recording call site target samples. - Each sample is recorded with its calling context (named `ContextId`). Recall that the probe based context key doesn't include the leaf frame probe info, so the `ContextId` string is created from two parts: one from the probe stack strings' concatenation and the other from the leaf frame probe. - Added regression test Test Plan: ninja & ninja check-llvm Differential Revision: https://reviews.llvm.org/D92998	2021-02-03 16:21:53 -08:00
Jessica Paquette	56fcd4ea8d	[AArch64][GlobalISel] Change store value type from p0 -> s64 to import patterns Similar to the G_PTR_ADD + G_LOAD twiddling we do in `preISelLower`. The imported patterns expect scalars only, so they can't handle things like ``` G_STORE %ptr1, %ptr2 ``` To get around this, use s64 instead. (This probably makes a good portion of the manual selection code for G_STORE dead.) This is a 0.2% geomean code size improvement on CTMark at -Os. (Best is consumer-typeset @ -0.7%) Differential Revision: https://reviews.llvm.org/D95908	2021-02-03 16:19:16 -08:00
Nico Weber	b995314143	Revert "[InstrProfiling] Use !associated metadata for counters, data and values" This reverts commit `97ba5cde52`. Still breaks tests: https://reviews.llvm.org/D76802#2540647	2021-02-03 19:14:34 -05:00
Jessica Paquette	a1f6bb20db	[AArch64][GlobalISel] Emit G_ASSERT_ZEXT in assignValueToAddress for ZExt params When we have a zeroext parameter coming in on the stack, build ``` %x = G_LOAD ... %x_assert_zext = G_ASSERT_ZEXT %x, narrow_size %trunc = G_TRUNC %x_assert_zext ``` Rather than just loading into the truncated type. This allows us to optimize cases like this: https://godbolt.org/z/vfjhW8 Differential Revision: https://reviews.llvm.org/D95805	2021-02-03 16:06:05 -08:00
Arthur O'Dwyer	493f140792	[libc++] [P0879] constexpr std::sort This completes libc++'s implementation of P0879 "Constexpr for swap and swap related functions." http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2018/p0879r0.html For the feature-macro adjustment, see https://cplusplus.github.io/LWG/issue3256 Differential Revision: https://reviews.llvm.org/D93661	2021-02-03 18:57:05 -05:00
Amy Huang	26e9c99010	[Docs] Add some documentation for constructor homing, a debug info optimization (-fuse-ctor-homing) Adding this, since there's currently no documentation about this. Differential Revision: https://reviews.llvm.org/D95911	2021-02-03 15:25:49 -08:00
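
The recursion-cycle compression described in the `0609f257dc` / `ac14bb14e7` commit messages above can be illustrated with a short sketch. This is a naive Python rendering of the idea as the message states it (remove adjacent repeated frame sequences of size 1, then size 2, and so on), not the llvm-profgen code itself, which is C++ and may be structured differently. The function name `compress_recursion` and the `max_size` parameter are hypothetical; `max_size` only loosely mirrors the `compress-recursion` switch, with -1 meaning no size limit.

```python
def compress_recursion(frames, max_size=-1):
    """Remove adjacent repeated frame sequences from a calling context.

    Illustrative sketch only: window sizes are tried in increasing order,
    and whenever a window is immediately followed by an identical window,
    the second copy is dropped.
    """
    out = list(frames)
    size = 1
    # A repeat of length `size` needs at least 2 * size frames.
    while size <= len(out) // 2 and (max_size < 0 or size <= max_size):
        i = 0
        while i + 2 * size <= len(out):
            if out[i:i + size] == out[i + size:i + 2 * size]:
                # Drop the duplicate window and re-check the same position,
                # so runs such as a, a, a collapse to a single a.
                del out[i + size:i + 2 * size]
            else:
                i += 1
        size += 1
    return out


if __name__ == "__main__":
    stack = ["a", "a", "b", "c", "a", "b", "c", "b", "c", "d"]
    print(compress_recursion(stack, max_size=1))
    print(compress_recursion(stack, max_size=2))
    print(compress_recursion(stack))
```

Run on the input stack from the commit message, limiting the window size to 1 and then 2 reproduces the two intermediate lists quoted there, and the unlimited call yields the final compressed context.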

379011 Commits