llvm-project

Commit Graph

Author	SHA1	Message	Date
Shilei Tian	0c623ab1bf	[Clang][OpenMP] Only check value if the expression is not instantiation dependent Currently the following case fails: ``` template<typename Ty> Ty foo(Ty addr, Ty val) { Ty v; #pragma omp atomic compare capture { v = addr; if (addr > val) addr = val; } return v; } ``` The compiler complains `addr` is not a lvalue. That's because when an expression is instantiation dependent, we cannot tell if it is lvalue or not. Reviewed By: ABataev Differential Revision: https://reviews.llvm.org/D135224	2022-10-05 08:44:56 -04:00
David Stuttard	d1d7d2235c	[AggressiveInstCombine] Fix cases where non-opaque pointers are used In the case of non-opaque pointers, when combining consecutive loads, need to bitcast the pointer source to the combined type size, otherwise asserts are triggered. Differential Revision: https://reviews.llvm.org/D135249	2022-10-05 13:42:46 +01:00
Oleksandr "Alex" Zinenko	d14a029222	[mlir] tweak declarative assembly doc Change the formal argument of the `functional-type` directive from "results" to "outputs" to avoid confusion with the `results` directive.	2022-10-05 14:33:47 +02:00
Peixin Qiao	f4accbf55f	[flang][OpenMP] Support privatization for single construct This supports the lowering of private and firstprivate clauses in single construct. The alloca ops are emitted in the entry block according to https://llvm.org/docs/Frontend/PerformanceTips.html#use-of-allocas, and the load/store ops are emitted in the single region. The data race problem is handled in OMPIRBuilder. That is, the barrier is emitted in OMPIRBuilder. Co-authored-by: Nimish Mishra <neelam.nimish@gmail.com> Reviewed By: kiranchandramohan Differential Revision: https://reviews.llvm.org/D128596	2022-10-05 20:22:33 +08:00
Nikita Popov	e94619b955	Reapply [InstCombine] Switch foldOpIntoPhi() to use InstSimplify The infinite loop seen on buildbots should be fixed by `11897708c0` (assuming there are not multiple infinite combine loops...) ----- foldOpIntoPhi() currently only folds operations into the phi if all but one operands constant-fold. The two exceptions to this are freeze and select, where we allow more general simplification. This patch makes foldOpIntoPhi() generally simplification based and removes all the instruction-specific logic. We just try to simplify the instruction for each operand, and for the (potentially) one non-simplified operand, we move it into the new block with adjusted operands. This fixes https://github.com/llvm/llvm-project/issues/57448, which was my original motivation for the change. Differential Revision: https://reviews.llvm.org/D134954	2022-10-05 14:00:19 +02:00
Emmmer	d0dcbb9b02	[LLDB][RISCV][NFC] Rewrite instruction in algebraic datatype The old approach (dedicated ExecXXX for each instruction) is not flexible and results in duplicated code when RVC kicks in. According to the spec, every compressed instruction can be decoded to a non-compressed one. So we can lower compressed instructions to instructions we already had, which requires a decoupling between the decoder and executor. This patch: - use llvm::Optional and its combinators AMAP. - use template constraints on common instruction. - make instructions strongly-typed (no uint32_t everywhere bc it is error-prone and burdens the developer when lowering the RVC) with the help of algebraic datatype (std::variant). Note: (NFC) because this is more of a refactoring in preparation for RVC. Reviewed By: DavidSpickett Differential Revision: https://reviews.llvm.org/D135015	2022-10-05 19:45:28 +08:00
Nikita Popov	11897708c0	[InstCombine] Directly replace instr in foldIntegerTypedPHI() (NFCI) Rather than inserting a ptrtoint + inttoptr pair, directly replace the inttoptr with the new phi node. This ensures that no other transform can undo it before the pair gets folded away. This avoids the infinite loop when combined with D134954. This is NFCI in the sense that it shouldn't make a difference, but could due to different worklist order.	2022-10-05 13:28:23 +02:00
Nikita Popov	e035f03e92	[InstCombine] Add test for infinite combine loop with D134954 (NFC) The patch interacts badly with foldIntegerTypedPHI().	2022-10-05 13:12:14 +02:00
Guray Ozen	e68a7bed59	[mlir][transform] Add failing test for GPU transform dialect The GPU transform dialect currently has restrictions and several situations where we can't use the transform dialect. This update includes a method to test failing cases in the GPU transform dialect. Differential Revision: https://reviews.llvm.org/D135063	2022-10-05 13:10:13 +02:00
Guray Ozen	78305720f3	[mlir][transform][nfc] typo fix fix typo Reviewed By: nicolasvasilache, ftynse Differential Revision: https://reviews.llvm.org/D135242	2022-10-05 13:05:46 +02:00
LLVM GN Syncbot	b1bdcd4d5c	[gn build] Port `f0f474dfd0`	2022-10-05 09:44:51 +00:00
David Sherwood	f0f474dfd0	[AArch64][SME] Add codegen pass to handle ZA state in arm_new_za functions. The new pass implements the following: * Inserts code at the start of an arm_new_za function to commit a lazy-save when the lazy-save mechanism is active. * Adds a smstart intrinsic at the start of the function. * Adds a smstop intrinsic at the end of the function. Patch co-authored by kmclaughlin. Differential Revision: https://reviews.llvm.org/D133896	2022-10-05 09:43:57 +00:00
Fraser Cormack	08497a785b	[VP] Fix unused variable in release configurations	2022-10-05 10:33:07 +01:00
Mikhail Goncharov	c21e57156c	Fix clang baremetal test `def48cae45` accidentally dropped -no-canonical-prefixes	2022-10-05 11:32:28 +02:00
Florian Hahn	469f0fc6a6	[SimpleLoopUnswitch] Clear dispos in deleteDeadBlocksFromLoop. SimpleLoopUnswitch may remove blocks from loops. Clear block and loop dispositions in that case, to clean up invalid entries in the cache. Fixes #58158. Fixes #58159.	2022-10-05 10:28:15 +01:00
Florian Hahn	eb975cae4b	[SimpleLoopUnswitch] Simplify test, reduce the passes to trigger crash. This simplifies the test case added in `e399dd601` to only require indvars and simple-loop-unswitch. This allows adding the test case for #58158 to the same file, keeping related tests together.	2022-10-05 10:19:55 +01:00
Kadir Cetinkaya	feea7ef23c	Revert "[clang][Lex] Fix a crash on malformed string literals" This reverts commit `36a200208f`.	2022-10-05 10:37:32 +02:00
Nikita Popov	2813bc5d24	[SROA] Regenerate test checks (NFC)	2022-10-05 10:31:03 +02:00
David Sherwood	991a36da1b	[AArch64][SME] Prevent SVE object address calculations between smstop and call This patch introduces a new AArch64 ISD node (OBSCURE_COPY) that can be used when we want to prevent SVE object address calculations from being rematerialised between a smstop/smstart and a call. At the moment we use COPY to copy the frame index to a register, which leads to problems because the "simple register coalescing" pass understands the COPY instruction and attempts to rematerialise an address calculation with 'addvl' between an smstop and a call. When in streaming mode the 'addvl' instruction may have different behaviour because the streaming SVE vector length is not guaranteed to equal the normal SVE vector length. The new ISD opcode OBSCURE_COPY gets lowered to a new pseudo instruction also called OBSCURE_COPY. This ensures it cannot be rematerialised and we expand this into a simple move very late in the machine instruction pipeline. A new test is added here: CodeGen/AArch64/sme-streaming-interface.ll Differential Revision: https://reviews.llvm.org/D134940	2022-10-05 08:11:16 +00:00
Valentin Clement	e50e19af00	[flang] Update to fir::isUnlimitedPolymorphicType and fir::isPolymorphicType functions This patch updates the fir::isUnlimitedPolymorphicType function to reflect the chosen design. It also adds a fir::isPolymorphicType function. Reviewed By: jeanPerier Differential Revision: https://reviews.llvm.org/D135143	2022-10-05 10:05:11 +02:00
Martin Storsjö	2f7fbf8376	[AArch64] Add missing SEH_Nop when aligning the stack This makes sure that the instructions of the prologue match the SEH opcodes. Also remove a couple of redundant cases of setting HasWinCFI; it was already set unconditionally after the conditional cases. Differential Revision: https://reviews.llvm.org/D135101	2022-10-05 11:00:36 +03:00
David Spickett	a9ffb47345	Fix LLDB build on old Linux kernels (pre-4.1) These fields are guarded elsewhere, but were missing here. Reviewed By: wallace Differential Revision: https://reviews.llvm.org/D133778	2022-10-05 08:00:05 +00:00
Nicolas Vasilache	5fc28ebbaf	[mlir][Linalg] NFC - Add bbarg pretty printing to linalg::generic Differential Revision: https://reviews.llvm.org/D135151	2022-10-05 00:59:42 -07:00
Kadir Cetinkaya	36a200208f	[clang][Lex] Fix a crash on malformed string literals Differential Revision: https://reviews.llvm.org/D135161	2022-10-05 09:55:50 +02:00
Nicolas Vasilache	05fa8e88f4	[mlir][Linalg] Retire LinalgStrategyLowerVectorsPass and filter-based patterns Context: https://discourse.llvm.org/t/psa-retire-linalg-filter-based-patterns/63785 Depends on D135200 Differential Revision: https://reviews.llvm.org/D135222	2022-10-05 00:55:27 -07:00
Rainer Orth	93b1256e38	[compiler-rt][test] Heed COMPILER_RT_DEBUG when compiling unittests When trying to debug some `compiler-rt` unittests, I initially had a hard time because - even in a `Debug` build one needs to set `COMPILER_RT_DEBUG` to get debugging info for some of the code and - even so the unittests used a hardcoded `-O2` which often makes debugging impossible. This patch addresses this by instead using `-O0` if `COMPILER_RT_DEBUG`. Two tests in `sanitizer_type_traits_test.cpp` need to be disabled since they have undefined references to `__sanitizer::integral_constant<bool, true>::value`. Tested on `sparcv9-sun-solaris2.11`, `amd64-pc-solaris2.11`, and `x86_64-pc-linux-gnu`. Differential Revision: https://reviews.llvm.org/D91620	2022-10-05 09:53:26 +02:00
Nicolas Vasilache	27c634aed6	[mlir][Linalg] Retire LinalgStrategyPeelPass and filter-based pattern. Context: https://discourse.llvm.org/t/psa-retire-linalg-filter-based-patterns/63785 Differential Revision: https://reviews.llvm.org/D135200	2022-10-05 00:50:13 -07:00
Adrian Kuegel	ac06f7169f	[mlir] Add attribute constraints for sorted order. We often have constraints for array attributes that they are sorted non-decreasing or strictly increasing. This change adds AttrConstraint classes that support DenseArrayAttr for integer types. Differential Revision: https://reviews.llvm.org/D134944	2022-10-05 09:46:19 +02:00
Siva Chandra Reddy	995105de1b	[libc] Add the POSIX waitpid function and the BSD wait4 function. Reviewed By: lntue, michaelrj Differential Revision: https://reviews.llvm.org/D135225	2022-10-05 07:38:55 +00:00
Fraser Cormack	a3a9b0743e	[VP][NFC] Remove \brief commands from doxygen comments Following a precedent set in D46861.	2022-10-05 08:08:30 +01:00
Fraser Cormack	3362e2d57f	[VP] Add IR expansion for vp.icmp and vp.fcmp These intrinsics are simply expanded to regular icmp/fcmp instructions. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D121594	2022-10-05 08:07:39 +01:00
Emilia Dreamer	2d149d17f0	[clang-tidy] Fix crashes on `if consteval` in readability checks The `readability-braces-around-statements` check tries to look at the closing parens of the if condition to determine where to insert braces, however, "consteval if" statements don't have a condition, and always have braces regardless, so the check can be skipped. The `readability-simplify-boolean-expr` check looks at the condition of the if statement to determine what could be simplified, but as "consteval if" statements do not have a condition that could be simplified, they can also be skipped here. There may still be more checks that try to look at the conditions of `if`s that aren't included here. Fixes https://github.com/llvm/llvm-project/issues/57568 Reviewed By: njames93, aaron.ballman Differential Revision: https://reviews.llvm.org/D133413	2022-10-05 09:38:05 +03:00
Serguei Katkov	d330731f94	[RegAllocFast] Clean-up. Remove redundant operations. NFC. Reviewed By: MatzeB, arsenm Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D109213	2022-10-05 11:38:54 +07:00
Johannes Doerfert	f8ee045c6d	[OpenMP] Eliminate the ThreadStates array in favor of indirection If we have thread states, the program is going to be rather slow. If we don't, we want to avoid wasting shared memory. This patch introduces a slight penalty (malloc + indirection) for the slow path and reduces resource usage for the fast path. Differential Revision: https://reviews.llvm.org/D135037	2022-10-04 20:27:34 -07:00
Johannes Doerfert	b113965073	[OpenMP] Introduce more atomic operations into the runtime We should use OpenMP atomics but they don't take variable orderings. Maybe we should expose all of this in the header but that solves only part of the problem anyway. Differential Revision: https://reviews.llvm.org/D135036	2022-10-04 20:20:55 -07:00
Johannes Doerfert	f85c1f3b7c	[OpenMP] Replace __ATOMIC_XYZ with atomic::xyz for style Also fixes one ordering argument not used. Differential Revision: https://reviews.llvm.org/D135035	2022-10-04 19:43:30 -07:00
Johannes Doerfert	a9557115b4	[Attributor] Qualify variables to avoid clashes in the future	2022-10-04 19:43:04 -07:00
Johannes Doerfert	abbc3fa17b	[OpenMP] Replace pointer comparison with `isSharedMemPtr` check The pointer comparison was causing confusion for capture tracking, let's avoid confusion. Differential Revision: https://reviews.llvm.org/D135160	2022-10-04 19:24:22 -07:00
Johannes Doerfert	792e60c744	[NVVM] Mark the pointer argument of `llvm.nvvm.isspace.*` `nocapture` Differential Revision: https://reviews.llvm.org/D135172	2022-10-04 19:22:26 -07:00
Yeting Kuo	74a130af97	[RISCV] Add isel patterns for vfmacc, vfnmacc, vfmsac and vfnmsac. The patch selects VSELECT_VL/VP_MERGE_VL that uses VF(N)M(ACC\|SAC) as its true operand and the addend of the true operand as its false operand. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D135080	2022-10-05 09:56:43 +08:00
Eli Friedman	0b61db423b	[AArch64][Windows] Add llvm-readobj support for save_any_reg unwind opcode. This is primarily used for Arm64EC, but it could be used for other non-standard calling conventions. The testcase is based on an Arm64EC thunk generated by MSVC. The name save_any_reg comes from Microsoft documentation, but the full encoding isn't specified there; this is reverse-engineered from the behavior of the unwinder. (Thanks to Martin Storsjö for his example of how to write simple unwinder testcases by directly calling RtlVirtualUnwind.) Differential Revision: https://reviews.llvm.org/D135196	2022-10-04 18:55:01 -07:00
Stefan Pintilie	30d639180f	[PowerPC] Fix the register allocation hints for ACC registers. The allocation hints for copies of ACC registers assumed that we would only be copying between VSRp and UACC registers. In reality it is also possible to copy between UACC and ACC registers. This patch adds a new case for the ACC copy to fix that issue. Note that the test case added with this patch will hit an assert without the fix. Reviewed By: lei, amyk Differential Revision: https://reviews.llvm.org/D134501	2022-10-04 20:30:16 -05:00
owenca	b60e7a7f1a	[clang-format] Handle C# interpolated verbatim string prefix @$ Fixes #58062. Differential Revision: https://reviews.llvm.org/D135026	2022-10-04 18:27:36 -07:00
Matthias Springer	129420df51	[mlir][bufferization][NFC] Move EmptyTensorToAllocTensorPass This change moves the pass from the Linalg dialect to the bufferization dialect. Differential Revision: https://reviews.llvm.org/D135130	2022-10-05 09:57:22 +09:00
Stella Laurenzo	e28b15b572	Add APFloat and MLIR type support for fp8 (e5m2). (Re-Apply with fixes to clang MicrosoftMangle.cpp) This is a first step towards high level representation for fp8 types that have been built in to hardware with near term roadmaps. Like the BFLOAT16 type, the family of fp8 types are inspired by IEEE-754 binary floating point formats but, due to the size limits, have been tweaked in various ways in order to maximally use the range/precision in various scenarios. The list of variants is small/finite and bounded by real hardware. This patch introduces the E5M2 FP8 format as proposed by Nvidia, ARM, and Intel in the paper: https://arxiv.org/pdf/2209.05433.pdf As the more conformant of the two implemented datatypes, we are plumbing it through LLVM's APFloat type and MLIR's type system first as a template. It will be followed by the range optimized E4M3 FP8 format described in the paper. Since that format deviates further from the IEEE-754 norms, it may require more debate and implementation complexity. Given that we see two parts of the FP8 implementation space represented by these cases, we are recommending naming of: * `F8M<N>` : For FP8 types that can be conceived of as following the same rules as FP16 but with a smaller number of mantissa/exponent bits. Including the number of mantissa bits in the type name is enough to fully specify the type. This naming scheme is used to represent the E5M2 type described in the paper. * `F8M<N>F` : For FP8 types such as E4M3 which only support finite values. The first of these (this patch) seems fairly non-controversial. The second is previewed here to illustrate options for extending to the other known variant (but can be discussed in detail in the patch which implements it). Many conversations about these types focus on the Machine-Learning ecosystem where they are used to represent mixed-datatype computations at a high level. 
At that level (which is why we also expose them in MLIR), it is important to retain the actual type definition so that when lowering to actual kernels or target specific code, the correct promotions, casts and rescalings can be done as needed. We expect that most LLVM backends will only experience these types as opaque `I8` values that are applicable to some instructions. MLIR does not make it particularly easy to add new floating point types (i.e. the FloatType hierarchy is not open). Given the need to fully model FloatTypes and make them interop with tooling, such types will always be "heavy-weight" and it is not expected that a highly open type system will be particularly helpful. There are also a bounded number of floating point types in use for current and upcoming hardware, and we can just implement them like this (perhaps looking for some cosmetic ways to reduce the number of places that need to change). Creating a more generic mechanism for extending floating point types seems like it wouldn't be worth it and we should just deal with defining them one by one on an as-needed basis when real hardware implements a new scheme. Hopefully, with some additional production use and complete software stacks, hardware makers will converge on a set of such types that is not terribly divergent at the level that the compiler cares about. (I cleaned up some old formatting and sorted some items for this case: If we converge on landing this in some form, I will NFC commit format only changes as a separate commit) Differential Revision: https://reviews.llvm.org/D133823	2022-10-04 17:18:17 -07:00
Sam Clegg	be758cd4a3	[WebAssembly][MC] Fix missing `else` after `return` due to type checker bug Once we are in the `Unreachable` we want to disable type checking, but we were unconditionally returning `true` here which means we encountered an error. Instead we unconditionally return false to signal no error. Fixes: https://github.com/llvm/llvm-project/issues/56935 Differential Revision: https://reviews.llvm.org/D135195	2022-10-04 16:43:22 -07:00
Amara Emerson	c5cebf78bd	[GlobalISel] Add computeNumSignBits() support for compares. Doing so allows G_SEXT_INREG to be combined away for many vector cases. Differential Revision: https://reviews.llvm.org/D135168	2022-10-05 00:28:08 +01:00
Amara Emerson	8055aa8e8a	[AArch64][GlobalISel] Make vector G_SEXT_INREG legal and allow combining. As a result of making these legal, and tweaking the combine to allow vectors, we generate vector G_SEXT_INREG during legalization. The reason we want to make these legal in the first place is to allow for more combine opportunities. Once those have been done, we can just lower them back to shifts in the post-legalizer lowering. This needs to be one commit otherwise we start causing tests to fail due to incomplete support for selection etc.	2022-10-05 00:28:08 +01:00
Tarun Prabhu	43fe6f7cc3	[flang] Add -fpass-plugin option to Flang frontend Add the -fpass-plugin option to flang which dynamically loads LLVM passes from the shared object passed as the argument to the flag. The behavior of the option is designed to replicate that of the same option in clang and thus has the same capabilities and limitations. - Multiple instances of -fpass-plugin=path-to-file can be specified and each of the files will be loaded in that order. - The flag can be passed to both flang-new and flang-new -fc1. Differential Revision: https://reviews.llvm.org/D129156	2022-10-04 17:02:45 -06:00
Craig Topper	ece4bb5ab8	[RISCV] Teach SExtWRemoval to recognize sign extended values that come from arguments. This information is not preserved in MIR today. So this patch adds information to RISCVMachineFunctionInfo when the vreg is created for the argument. Reviewed By: reames Differential Revision: https://reviews.llvm.org/D134621	2022-10-04 15:39:10 -07:00
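
The fp8 (e5m2) entry above (`e28b15b572`) describes E5M2 as following the same rules as FP16 with fewer exponent and mantissa bits. As a rough illustration of what that means at the bit level, here is a minimal standalone sketch that decodes one E5M2 byte into a `float`, assuming the usual IEEE-style layout (1 sign bit, 5 exponent bits, 2 mantissa bits, exponent bias 15, with infinities and NaNs). The names `decodeE5M2` and `checkE5M2Examples` are hypothetical; this is not LLVM's APFloat or MLIR code.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <limits>

// Hypothetical helper: decode an E5M2 byte, assuming an IEEE-style layout
// of 1 sign bit, 5 exponent bits (bias 15), and 2 mantissa bits.
float decodeE5M2(std::uint8_t bits) {
  const int sign = (bits >> 7) & 0x1;
  const int exponent = (bits >> 2) & 0x1F;
  const int mantissa = bits & 0x3;

  float magnitude;
  if (exponent == 0x1F) {
    // All-ones exponent: infinity if the mantissa is zero, NaN otherwise.
    magnitude = mantissa == 0 ? std::numeric_limits<float>::infinity()
                              : std::numeric_limits<float>::quiet_NaN();
  } else if (exponent == 0) {
    // Zero exponent: zero or subnormal, with no implicit leading one.
    magnitude = std::ldexp(static_cast<float>(mantissa) / 4.0f, -14);
  } else {
    // Normal number: implicit leading one plus two fraction bits.
    magnitude =
        std::ldexp(1.0f + static_cast<float>(mantissa) / 4.0f, exponent - 15);
  }
  return sign ? -magnitude : magnitude;
}

// Hypothetical usage example: a few byte patterns worked out by hand.
void checkE5M2Examples() {
  assert(decodeE5M2(0x3C) == 1.0f);      // 0 01111 00
  assert(decodeE5M2(0xC0) == -2.0f);     // 1 10000 00
  assert(decodeE5M2(0x7B) == 57344.0f);  // 0 11110 11, largest finite value
  assert(std::isinf(decodeE5M2(0x7C)));  // 0 11111 00
  assert(std::isnan(decodeE5M2(0x7E)));  // 0 11111 10
}
```

Under these assumptions the format has only four representable values per power of two, which is why the commit message stresses keeping the precise type visible until promotions and rescalings are chosen.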


438003 Commits
