llvm-project

Commit Graph

Author	SHA1	Message	Date
Daniel Grumberg	7443a504bf	[clang][extract-api] Add support for true anonymous enums Anonymous enums without a typedef should have a "(anonymous)" identifier. Differential Revision: https://reviews.llvm.org/D123533	2022-04-12 20:42:17 +01:00
Changpeng Fang	8edaf25986	AMDGPU: Emit metadata for the hidden_multigrid_sync_arg conditionally Summary: Introduce a new function attribute, amdgpu-no-multigrid-sync-arg, which is default. We use implicitarg_ptr + offset to check whether the multigrid synchronization pointer is used. If yes, we remove this attribute and also remove amdgpu-no-implicitarg-ptr. We generate metadata for the hidden_multigrid_sync_arg only when the amdgpu-no-multigrid-sync-arg attribute is removed from the function. Reviewers: arsenm, sameerds, b-sumner and foad Differential Revision: https://reviews.llvm.org/D123548	2022-04-12 12:36:30 -07:00
Stanislav Mekhanoshin	65b8a43243	[AMDGPU] Update ds-alignment.ll test checks. NFC.	2022-04-12 12:06:02 -07:00
Aart Bik	28063a281b	[mlir][sparse] refactored python setup of sparse compiler Reviewed By: bixia Differential Revision: https://reviews.llvm.org/D123419	2022-04-12 11:58:41 -07:00
Mahesh Ravishankar	b40e901333	[mlir][Linalg] Allow collapsing subset of the reassociations when fusing by collapsing. This change generalizes the fusion of `tensor.expand_shape` -> `linalg.generic` op by collapsing to handle cases where only a subset of the reassociations specified in the `tensor.expand_shape` are valid to be collapsed. The method that does the collapsing is refactored to allow it to be a generic utility when required. Reviewed By: gysit Differential Revision: https://reviews.llvm.org/D123153	2022-04-12 18:56:32 +00:00
Simon Pilgrim	f061c1050b	[SLP][X86] Add ray_sphere intersection methods from c-ray benchmark We're failing to vectorize several comparison reduction patterns. Issue #43090 was based off this, but while that simplified test case is now folding, the original still fails due to poor cost model values for vXi1 extractions	2022-04-12 19:51:27 +01:00
Jonas Devlieghere	a66ff2316e	[lldb] Re-enable fixed on-device tests These tests were fixed by `833882b327`.	2022-04-12 11:39:25 -07:00
Nick Desaulniers	23ec5782c3	[Bitcode] materialize Functions early when BlockAddress taken IRLinker builds a work list of functions to materialize, then moves them from a source module to a destination module one at a time. This is a problem for blockaddress Constants, since they need not refer to the function they are used in; IPSCCP is quite good at sinking these constants deep into other functions when passed as arguments. This would lead to curious errors during LTO: ld.lld: error: Never resolved function from blockaddress ... based on the ordering of function definitions in IR. The problem was that IRLinker would basically do: for function f in worklist: materialize f; splice f from source module to destination module in one pass, with Functions being lazily added to the running worklist. This confuses BitcodeReader, which cannot disambiguate whether a blockaddress is referring to a function which has not yet been parsed ("materialized") or is simply empty because its body was spliced out. This causes BitcodeReader to insert Functions into its BasicBlockFwdRefs list incorrectly, as it will never re-materialize an already materialized (but spliced out) function. Because of the possibility that blockaddress Constants may appear in Functions other than the ones they reference, this patch adds a new bitcode function code FUNC_CODE_BLOCKADDR_USERS that is a simple list of Functions that contain BlockAddress Constants that refer back to this Function, rather than the Function they are scoped in. We then materialize those functions when materializing `f` from the example loop above. This might over-materialize Functions should the user of BitcodeReader ultimately decide not to link those Functions, but at least now we can avoid this ordering-related issue with blockaddresses. Fixes: https://github.com/llvm/llvm-project/issues/52787 Fixes: https://github.com/ClangBuiltLinux/linux/issues/1215 Reviewed By: dexonsmith Differential Revision: https://reviews.llvm.org/D120781	2022-04-12 11:38:35 -07:00
Shraiysh Vaishay	b18e82186f	[mlir][OpenMP] Added omp.task This patch adds tasking construct according to Section 2.10.1 of OpenMP 5.0 Reviewed By: peixin, kiranchandramohan, abidmalikwaterloo Differential Revision: https://reviews.llvm.org/D123575	2022-04-12 23:55:47 +05:30
Fangrui Song	fdd424e37a	[ubsan] Fix print_stacktrace=1:fast_unwind_on_fatal=0 to correctly fallback to fast unwinder ubsan_GetStackTrace (from `52b751088b`) called by ~ScopeReport leaves top/bottom zeroes in the `!WillUseFastUnwind(request_fast_unwind)` code path. When BufferedStackTrace::Unwind falls back to UnwindFast, `if (stack_top < 4096) return;` will return early, leaving just one frame in the stack trace. Fix this by always initializing top/bottom like `261d6e05d5`. Reviewed By: eugenis, yln Differential Revision: https://reviews.llvm.org/D123562	2022-04-12 11:24:19 -07:00
Martin Sebor	deadda749a	[InstCombine] Add more memrchr tests (NFC).	2022-04-12 11:55:33 -06:00
Jonathan Peyton	d49ce7c356	[OpenMP][libomp] Replace global variable references with local object Remove references to global __kmp_topology within a kmp_topology_t object method. There should just be implicit references to the private object.	2022-04-12 12:50:41 -05:00
Arthur Eubanks	9faab435a3	[docs] Mention that we are in the process of removing the legacy PM for the optimization pipeline And remove references to flags to turn it off. Reviewed By: nikic, MaskRay Differential Revision: https://reviews.llvm.org/D123547	2022-04-12 10:47:58 -07:00
Louis Dionne	0cc34ca7ec	[libc++] Define legacy symbols for inline functions at a finer-grained level When we build the library with the stable ABI, we need to include some functions in the dylib that were made inline in later versions of the library (to avoid breaking code that might be relying on those symbols). However, those methods were made non-inline whenever we'd be building the library, which means that all translation units would end up using the old out-of-line definition of these methods, as opposed to the new inlined version. This patch makes it so that only the translation units that actually define the out-of-line methods use the old definition, opening up potential optimization opportunities in other translation units. This should solve some of the issues encountered in D65667. Differential Revision: https://reviews.llvm.org/D123519	2022-04-12 13:44:30 -04:00
Ahmed Bougacha	cfa4fe7c51	[AArch64][LOH] Don't ignore regmasks in bundles by iterating over instrs. The LOH pass iterates over instructions to build its custom register state machine, but it uses the top-level bundle iterator. This should be okay, because when the wrapper BUNDLE MI is built, it aggregates the register defs/uses in its instructions into MOs. However, that doesn't apply to regmasks, and accumulating regmasks across multiple instructions would be messy business. There are a couple AnalyzePhysRegInBundle (/Virt) helpers that do look at regmasks, but those don't fit in very well here. AArch64 has started to use a few bundle instructions, specifically as glorified pseudos for variant call instructions, which have regmasks. So the LOH pass ends up ignoring regmasks. Concretely, this has been wrong for a while, but, on aarch64, the most common bundle (rv_marker call) was always followed by the attached call instruction, a plain BL with a regmask. Which was properly detected by the pass. However, we recently started keeping the attached call in the bundle, so the regmask is now ignored. And the pass happily combines ADRPs, of say, x8, across the bundle, resulting in corrupt pointers later.	2022-04-12 10:34:54 -07:00
Ahmed Bougacha	f3e76dcae3	[AArch64] Cleanup call-rv-marker.ll test. NFC. This was doing -iphoneos instead of -ios. While there, remove an old TODO and cleanup some alignment.	2022-04-12 10:34:54 -07:00
Harald van Dijk	3337f50625	[X86] Fix handling of maskmovdqu in x32 differently This reverts the functional changes of D103427 but keeps its tests, and reimplements the functionality by reusing the existing 32-bit MASKMOVDQU and VMASKMOVDQU instructions as suggested by skan in review. These instructions were previously predicated on Not64BitMode. This reimplementation restores the disassembly of a class of instructions, which will see a test added in followup patch D122449. These instructions are in 64-bit mode special cased in X86MCInstLower::Lower, because we use flags with one meaning for subtly different things: we have an AdSize32 class which indicates both that the instruction needs a 0x67 prefix and that the text form of the instruction implies a 0x67 prefix. These instructions are special in needing a 0x67 prefix but having a text form that does not imply a 0x67 prefix, so we encode this in MCInst as an instruction that has an explicit address size override. Note that originally VMASKMOVDQU64 was special cased to be excluded from disassembly, as we cannot distinguish between VMASKMOVDQU and VMASKMOVDQU64 and rely on the fact that these are indistinguishable, or close enough to it, at the MCInst level that it does not matter which we use. Because VMASKMOVDQU now receives special casing, even though it does not make a difference in the current implementation, as a precaution VMASKMOVDQU is excluded from disassembly rather than VMASKMOVDQU64. Reviewed By: RKSimon, skan Differential Revision: https://reviews.llvm.org/D122540	2022-04-12 18:32:14 +01:00
Groverkss	20aedb148b	[MLIR][Presburger] Remove inheritance from PresburgerSpace in IntegerRelation, PresburgerRelation and PWMAFunction This patch removes inheritance from PresburgerSpace in IntegerRelation and instead makes it a member of these classes. This is required for three reasons: - It prevents implicit casting to PresburgerSpace. - Not all functions of PresburgerSpace need to be exposed by the deriving classes. - IntegerRelation and IntegerPolyhedron are defined in a PresburgerSpace. It makes more sense for the space to be a member instead of them inheriting from a space. Reviewed By: arjunp, ftynse Differential Revision: https://reviews.llvm.org/D123585	2022-04-12 22:48:52 +05:30
Zixu Wang	e08c435401	[clang][ExtractAPI][NFC] Fix sed delimiter in test Fix path replacement in sed (properly this time) using lit regex_replacement. Differential Revision: https://reviews.llvm.org/D123526 Co-authored-by: Michele Scandale <michele.scandale@gmail.com> Co-authored-by: Zixu Wang <9819235+zixu-w@users.noreply.github.com>	2022-04-12 10:00:15 -07:00
Shao-Ce SUN	e90110e696	[NFC][CodeGen] Use ArrayRef in TargetLowering functions This patch is similar to D122557, adding an `ArrayRef` version for `setOperationAction`, `setLoadExtAction`, `setCondCodeAction`, `setLibcallName`. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D123467	2022-04-13 00:46:05 +08:00
Anshil Gandhi	528aa09010	[AMDGPU][Codegen] Unsupported image sample texture map instructions Disables image_sample_*_g16 instructions on architectures lacking g16 support. This patch fixes the issue 54672. Differential Revision: https://reviews.llvm.org/D123461	2022-04-12 10:38:59 -06:00
Sanjay Patel	d9211be13d	[SimplifyCFG] cleanup code for converting switch to select (NFC) This renames functions for more general usage (and current capitalization style) before a proposed logic change in D122485. Differential Revision: https://reviews.llvm.org/D123614	2022-04-12 12:17:54 -04:00
Jonathan Peyton	747a490612	[OpenMP][libomp] Fix some Doxygen issues Fix spelling of variable names and remove accidental references (#) in Doxygen comments.	2022-04-12 11:05:30 -05:00
Momchil Velikov	d0ea42a7c1	[AArch64] Async unwind - function epilogues Reviewed By: MaskRay, chill Differential Revision: https://reviews.llvm.org/D112330	2022-04-12 16:50:50 +01:00
Mark de Wever	7738db2c06	[NFC][libc++][test] Move time tests. In the C++20 Standard, time is no longer a section under utilities but has become its own chapter. This moves the time tests accordingly so their location matches the current Standard. Reviewed By: ldionne, #libc Differential Revision: https://reviews.llvm.org/D122745	2022-04-12 17:49:48 +02:00
Jay Foad	8a53b25ed5	[AMDGPU] Use default member initializers in Subtarget classes Use default member initializers in AMDGPUSubtarget and subclasses. This is to guard against adding a new feature boolean in AMDGPUSubtarget.h but forgetting to initialize it to false in AMDGPUSubtarget.cpp. This was mostly autogenerated by: clang-tidy -checks=-,cppcoreguidelines-prefer-member-initializer,modernize-use-default-member-init -header-filter=Subtarget -fix lib/Target/AMDGPU/Subtarget.cpp Differential Revision: https://reviews.llvm.org/D123613	2022-04-12 16:42:30 +01:00
Nico Weber	2ac876c52c	[gn build] Fix a URL in a comment	2022-04-12 11:38:12 -04:00
Nikita Popov	1d530b914e	[InstSimplify] Don't fold phi of poison and trapping const expr (PR49839) Folding this case would result in the constant expression being executed unconditionally, which may introduce a new trap. Fixes https://github.com/llvm/llvm-project/issues/49839.	2022-04-12 17:32:25 +02:00
Nikita Popov	bc6d7ed8a9	[InstSimplify] Add test for PR49839 (NFC)	2022-04-12 17:32:25 +02:00
Stanislav Mekhanoshin	3870b36025	[AMDGPU] Split unaligned 3 DWORD DS operations I have written a minitest to check the performance. Overall, the benefit of aligned b96 operations on data that is not known to be aligned but happens to be is small, while the performance hit of using b96 operations on truly unaligned memory is high. The only exception is data that is not aligned even by 4, where it is better to use b96. Here is the test output on Vega and Navi: ``` Using platform: AMD Accelerated Parallel Processing Using device: gfx900:xnack- ds_write_b96 aligned: 3.4 sec ds_write_b32 + ds_write_b64 aligned: 4.5 sec ds_write_b32 * 3 aligned: 4.8 sec ds_write_b96 misaligned by 1: 4.8 sec ds_write_b32 + ds_write_b64 misaligned by 1: 7.2 sec ds_write_b32 * 3 misaligned by 1: 10.0 sec ds_write_b96 misaligned by 2: 4.8 sec ds_write_b32 + ds_write_b64 misaligned by 2: 7.2 sec ds_write_b32 * 3 misaligned by 2: 10.1 sec ds_write_b96 misaligned by 4: 4.8 sec ds_write_b32 + ds_write_b64 misaligned by 4: 4.2 sec ds_write_b32 * 3 misaligned by 4: 4.9 sec ds_write_b96 misaligned by 8: 4.8 sec ds_write_b32 + ds_write_b64 misaligned by 8: 4.6 sec ds_write_b32 * 3 misaligned by 8: 4.9 sec ds_read_b96 aligned: 3.3 sec ds_read_b32 + ds_read_b64 aligned: 4.9 sec ds_read_b32 * 3 aligned: 2.6 sec ds_read_b96 misaligned by 1: 4.1 sec ds_read_b32 + ds_read_b64 misaligned by 1: 7.2 sec ds_read_b32 * 3 misaligned by 1: 10.1 sec ds_read_b96 misaligned by 2: 4.1 sec ds_read_b32 + ds_read_b64 misaligned by 2: 7.2 sec ds_read_b32 * 3 misaligned by 2: 10.1 sec ds_read_b96 misaligned by 4: 4.1 sec ds_read_b32 + ds_read_b64 misaligned by 4: 2.6 sec ds_read_b32 * 3 misaligned by 4: 2.6 sec ds_read_b96 misaligned by 8: 4.1 sec ds_read_b32 + ds_read_b64 misaligned by 8: 4.9 sec ds_read_b32 * 3 misaligned by 8: 2.6 sec Using platform: AMD Accelerated Parallel Processing Using device: gfx1030 ds_write_b96 aligned: 4.1 sec ds_write_b32 + ds_write_b64 aligned: 13.0 sec ds_write_b32 * 3 aligned: 4.5 sec ds_write_b96 misaligned by 1: 12.5 sec ds_write_b32 + ds_write_b64 misaligned by 1: 22.0 sec ds_write_b32 * 3 misaligned by 1: 31.5 sec ds_write_b96 misaligned by 2: 12.4 sec ds_write_b32 + ds_write_b64 misaligned by 2: 22.0 sec ds_write_b32 * 3 misaligned by 2: 31.5 sec ds_write_b96 misaligned by 4: 12.4 sec ds_write_b32 + ds_write_b64 misaligned by 4: 4.0 sec ds_write_b32 * 3 misaligned by 4: 4.5 sec ds_write_b96 misaligned by 8: 12.4 sec ds_write_b32 + ds_write_b64 misaligned by 8: 13.0 sec ds_write_b32 * 3 misaligned by 8: 4.5 sec ds_read_b96 aligned: 3.8 sec ds_read_b32 + ds_read_b64 aligned: 12.8 sec ds_read_b32 * 3 aligned: 4.4 sec ds_read_b96 misaligned by 1: 10.9 sec ds_read_b32 + ds_read_b64 misaligned by 1: 21.8 sec ds_read_b32 * 3 misaligned by 1: 31.5 sec ds_read_b96 misaligned by 2: 10.9 sec ds_read_b32 + ds_read_b64 misaligned by 2: 21.9 sec ds_read_b32 * 3 misaligned by 2: 31.5 sec ds_read_b96 misaligned by 4: 10.9 sec ds_read_b32 + ds_read_b64 misaligned by 4: 3.8 sec ds_read_b32 * 3 misaligned by 4: 4.5 sec ds_read_b96 misaligned by 8: 10.9 sec ds_read_b32 + ds_read_b64 misaligned by 8: 12.8 sec ds_read_b32 * 3 misaligned by 8: 4.5 sec ``` Fixes: SWDEV-330802 Differential Revision: https://reviews.llvm.org/D123524	2022-04-12 07:52:39 -07:00
Stanislav Mekhanoshin	b8e09f1553	[AMDGPU] Refactor LDS alignment checks. Move features/bugs checks into the single place allowsMisalignedMemoryAccessesImpl. This is mostly NFCI except for the order of selection in a couple of places. A separate change may be needed to stop lying about Fast. Differential Revision: https://reviews.llvm.org/D123343	2022-04-12 07:49:40 -07:00
Simon Pilgrim	0488c6638b	[X86] getFauxShuffleMask - remove use DemandedElts TODO Most of the getTargetShuffleInputs recursive calls have now gone and the remaining uses aren't likely to benefit from a DemandedElts mask	2022-04-12 15:36:30 +01:00
Sam McCall	60502ed11a	[pseudo] Remove unused clangTesting dep. NFC	2022-04-12 16:17:43 +02:00
Fabian Wolff	a18634b74f	[clang-tidy] Never consider assignments as equivalent in `misc-redundant-expression` check Fixes https://github.com/llvm/llvm-project/issues/35853. Reviewed By: aaron.ballman Differential Revision: https://reviews.llvm.org/D122535	2022-04-12 16:03:14 +02:00
Pavel Labath	45428412fd	[lldb] Adjust libc++ string formatter for changes in D122598 The __size_ member is now in a slightly different location.	2022-04-12 15:44:16 +02:00
Jun Zhang	f9c2f821d7	[Clang] Fix unknown type attributes diagnosed twice with [[]] spelling Don't warn on unknown type attributes in Parser::ProhibitCXX11Attributes for most cases, but leave the diagnostic to the later checks. Module declarations and module import declarations are special cases. Fixes https://github.com/llvm/llvm-project/issues/54817 Differential Revision: https://reviews.llvm.org/D123447	2022-04-12 21:11:51 +08:00
serge-sans-paille	e810d55809	[ValueTracking] Make getStringLenth aware of strdup During strlen compile-time evaluation, make it possible to track size of strduped strings. Differential Revision: https://reviews.llvm.org/D123497	2022-04-12 14:47:29 +02:00
David Spickett	0231a90bc4	[lldb][AArch64] Automatically add all extensions to disassembler This means we don't have to remember to update this code as much. This is all tested in lldb/test/Shell/Commands/command-disassemble-aarch64-extensions.s which I added previously. We don't have a way to get the latest base architecture yet so that remains manual. Having all the extensions specified will probably be equivalent to the latest architecture version in any case. Reviewed By: labath Differential Revision: https://reviews.llvm.org/D123582	2022-04-12 12:31:43 +00:00
Dmitry Preobrazhensky	c33770d87f	[AMDGPU][DOC][NFC] Updated GFX10 assembler syntax description The description has been updated to reflect AMDGPU MC changes: - enabled literals for src0 of v_fmaak_f, v_fmamk_f, v_madak_f32, v_madmk_f32; - enabled global_atomic_fcmpswap and global_atomic_fcmpswap_x2; - enabled dlc with flat_atomic* and global_atomic_*. Bug fixing and improvements: - enabled s_wait_idle; - enabled s_waitcnt_depctr; - added description of s_waitcnt_depctr syntactic sugar; - disabled SYSMSG_OP_HOST_TRAP_ACK (it is not supported on GFX10); - corrected description of lgkmcnt (accept values from 0 to 63).	2022-04-12 15:18:44 +03:00
Arjun P	0ac213667d	[MLIR][Presburger] normalizeDiv: add assert that denom > 0	2022-04-12 13:06:53 +01:00
Dmitry Preobrazhensky	4e83d4fd92	[AMDGPU][DOC][NFC] Updated GFX1030 assembler syntax description Summary of changes: - enabled null for VOP operands; - added description of s_waitcnt_depctr syntactic sugar.	2022-04-12 14:58:18 +03:00
Simon Pilgrim	bc32a1dd76	[DAG] Add non-uniform vector support to (shl (sr[la] exact X, C1), C2) folds	2022-04-12 12:57:56 +01:00
Dmitri Gribenko	e67b90bdb3	Update the Bazel build files for "[mlir][Math] Replace some constant ..."	2022-04-12 13:47:51 +02:00
jacquesguan	83bd4fe2e8	[mlir][Math] Replace some constant folder functions with common folder functions. Differential Revision: https://reviews.llvm.org/D123485	2022-04-12 11:34:47 +00:00
Arjun P	4aeb2a57f4	[MLIR][Presburger][Simplex] addSymbolicCut: don't add symbol div if denom is 1 This is unnecessary, so we remove it as an optimization. Reviewed By: Groverkss Differential Revision: https://reviews.llvm.org/D123540	2022-04-12 12:27:27 +01:00
Simon Pilgrim	bb1a1f42db	[X86] Fix extact -> exact typo in test names	2022-04-12 12:21:45 +01:00
LLVM GN Syncbot	dbf1557359	[gn build] Port `95f0f69f1f`	2022-04-12 09:55:37 +00:00
Haojian Wu	95f0f69f1f	Revert "[AST] Add a new TemplateKind for template decls found via a using decl." It breaks the arm build: there is no free bit for the extra UsingShadowDecl in TemplateName::StorageType. Reverting it to bring the buildbot back until we come up with a fix. This reverts commit `5a5be4044f`.	2022-04-12 11:51:00 +02:00
Andrzej Warzynski	fb16ed258c	[mlir] Prefix pass manager options with `mlir-` With this change, there's going to be a clear distinction between LLVM and MLIR pass manager options (e.g. `-mlir-print-after-all` vs `-print-after-all`). This change is desirable from the point of view of projects that depend on both LLVM and MLIR, e.g. Flang. For consistency, all pass manager options in MLIR are prefixed with `mlir-`, even options that don't have equivalents in LLVM. Differential Revision: https://reviews.llvm.org/D123495	2022-04-12 09:32:44 +00:00
Matthias Springer	fa087b4352	[mlir][scf][bufferize][NFC] Lookup buffer using helper function Lookup iter_arg buffers using `lookupBuffer` instead of always creating a new `ToMemrefOp`. Also cast all yielded buffers (if necessary), regardless of whether they are an equivalent buffer or a new allocation. Note: This should have been part of D123369. Differential Revision: https://reviews.llvm.org/D123383	2022-04-12 18:09:30 +09:00
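
The ordering problem described in commit 23ec5782c3 can be illustrated with a small toy model. This is only a sketch under simplifying assumptions: `ToyReader`, `link`, and the `eager_users` flag are hypothetical names invented for illustration and are not part of LLVM's BitcodeReader or IRLinker. The model tracks which functions have been spliced out purely so it can report the failure; according to the commit message, the real reader cannot make that distinction.

```python
# Toy model only. ToyReader, link and eager_users are hypothetical names and
# do not correspond to LLVM's BitcodeReader or IRLinker APIs.


class ToyReader:
    def __init__(self, refs, eager_users):
        # refs maps a function name to the functions whose blocks it takes
        # the address of (a stand-in for blockaddress constants).
        self.refs = refs
        # When True, materializing a function also materializes every
        # function that takes the address of one of its blocks.
        self.eager_users = eager_users
        self.materialized = set()
        self.spliced = set()
        self.pending = {}     # target -> users waiting for it to be parsed
        self.unresolved = []  # (user, target) pairs that can never resolve

    def users_of(self, target):
        return [u for u, targets in self.refs.items() if target in targets]

    def materialize(self, name):
        if name in self.materialized:
            return  # never parsed a second time
        self.materialized.add(name)
        # Parsing this body satisfies forward references made earlier.
        self.pending.pop(name, None)
        for target in self.refs.get(name, []):
            if target in self.spliced:
                # The target body was already moved to the destination
                # module and will not be parsed again.
                self.unresolved.append((name, target))
            elif target not in self.materialized:
                self.pending.setdefault(target, []).append(name)
        if self.eager_users:
            for user in self.users_of(name):
                self.materialize(user)

    def splice(self, name):
        self.spliced.add(name)


def link(order, refs, eager_users):
    """Materialize and splice one function at a time, in worklist order."""
    reader = ToyReader(refs, eager_users)
    for f in order:
        reader.materialize(f)
        reader.splice(f)
    return reader.unresolved


refs = {"user": ["target"]}
print(link(["target", "user"], refs, eager_users=False))
print(link(["target", "user"], refs, eager_users=True))
```

In this model the first call reports one unresolved reference because `target` is spliced out before `user` is parsed, while the second call reports none because `user` is materialized together with `target`. Reversing the worklist order also avoids the failure without the flag, which mirrors the commit's observation that the error depended on the ordering of function definitions.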

420992 Commits
