Recent code for folding MINVAL() didn't allow for architectures
whose C/C++ char type is unsigned, so the value of the maximum
Fortran character was incorrect. This was caught by the
folding20.f90 test. The fix is to avoid numeric_limits<> and
use hard-coded values for the maximum signed integers of the various character kinds.
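A minimal illustration of the pitfall and the shape of the fix (the constants for kinds 2 and 4 are illustrative, not necessarily the exact values used in the patch):
```
#include <cstdint>
#include <limits>

// On hosts where plain 'char' is unsigned (e.g. ARM, POWER),
// numeric_limits<char>::max() is 255 rather than 127, so it is not a
// portable way to get the maximum signed value of a character kind.
// Hard-coded maxima do not depend on the host's char signedness:
constexpr std::int64_t MaxSignedCharValue(int kind) {
  return kind == 1 ? 0x7f : kind == 2 ? 0x7fff : 0x7fffffff;
}
```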
Pushing into llvm-project/main to restore ARM/POWER buildbots.
Currently the value is only used when calling `F->viewCFG()`, which misses out on much of its potential usefulness.
So I added the check to the printer passes as well.
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D102011
Use a "double-double" accumulator, a/k/a Kahan summation,
in the SUM intrinsic in the runtime for real & complex.
This seems to be the best-recommended technique for reducing
error, as opposed to the initial implementation of SUM's
distinct accumulators for positive and negative items.
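A minimal standalone sketch of the technique (not the runtime's actual code):
```
#include <cstddef>

// Compensated (Kahan) summation: 'correction' carries the low-order
// bits that are lost when 'sum' absorbs each element.
template <typename T> T KahanSum(const T *x, std::size_t n) {
  T sum{}, correction{};
  for (std::size_t j = 0; j < n; ++j) {
    T corrected = x[j] - correction; // apply the previous error
    T next = sum + corrected;        // low-order bits may be lost here
    correction = (next - sum) - corrected; // recover the lost bits
    sum = next;
  }
  return sum;
}
```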
Differential Revision: https://reviews.llvm.org/D104338
The `MockAllocator` used in `ScudoTSDTest` wasn't allocated
properly aligned, which resulted in the `TSDs` of the shared
registry not being aligned either. This led to some failures
like: https://reviews.llvm.org/D103119#2822008
This changes how the `MockAllocator` is allocated, in the same way
Vitaly did in the combined tests: it is now properly aligned, which
results in the `TSDs` being aligned as well.
This also adds a `DCHECK` in the shared registry to check that it is.
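A sketch of the idea (names hypothetical; not the actual test code):
```
#include <memory>
#include <new>

// Obtain storage with an explicit alignment and construct the object
// in place, so members such as the TSDs inherit that alignment.
template <typename T> std::unique_ptr<T, void (*)(T *)> makeAligned() {
  void *Mem = ::operator new(sizeof(T), std::align_val_t(alignof(T)));
  return {new (Mem) T(), [](T *P) {
            P->~T();
            ::operator delete(P, std::align_val_t(alignof(T)));
          }};
}
```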
Differential Revision: https://reviews.llvm.org/D104402
Implement constant folding for the reduction transformational
intrinsic functions MAXVAL and MINVAL.
In anticipation of more folding work to follow, with (I hope)
some common infrastructure, these two have been implemented in a
new header file.
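The shape of the fold, as a standalone sketch over plain integers (the real implementation works on typed constant expressions in the new header):
```
#include <algorithm>
#include <cstdint>
#include <vector>

// Fold MAXVAL over a constant integer array: start from the identity
// element (the type's minimum value) and keep the largest element seen.
std::int64_t FoldMaxval(const std::vector<std::int64_t> &elements) {
  std::int64_t result = INT64_MIN; // identity for MAXVAL
  for (std::int64_t v : elements)
    result = std::max(result, v);
  return result;
}
```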
Differential Revision: https://reviews.llvm.org/D104337
The test added in https://reviews.llvm.org/D104305 will only work with
the new driver and should be marked as such.
Sending this without a review as it's fairly straightforward and fixes
test failures for developers that don't want to build the new driver.
When a program attempts to put something like a subprogram
into an array constructor, emit an error rather than crashing.
Differential Revision: https://reviews.llvm.org/D104336
Two-level distributed barrier is a new experimental barrier designed
for Intel hardware that has better performance in some cases than the
default hyper barrier.
This barrier is designed to handle fine granularity parallelism where
barriers are used frequently with little compute and memory access
between barriers. There is no need to use it for codes with few
barriers and coarse-grained compute, or for memory-intensive
applications, as little difference will be seen between this barrier
and the default hyper barrier. This barrier is designed to work
optimally with a fixed number of threads, and has a significant setup
time, so should NOT be used in situations where the number of threads
in a team is varied frequently.
The two-level distributed barrier is off by default -- hyper barrier
is used by default. To use this barrier, you must set all barrier
patterns to use this type, because it will not work with other barrier
patterns. Thus, to turn it on, the following settings are required:
KMP_FORKJOIN_BARRIER_PATTERN=dist,dist
KMP_PLAIN_BARRIER_PATTERN=dist,dist
KMP_REDUCTION_BARRIER_PATTERN=dist,dist
Branching factors (set with KMP_FORKJOIN_BARRIER, KMP_PLAIN_BARRIER,
and KMP_REDUCTION_BARRIER) are ignored by the two-level distributed
barrier.
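For illustration, a toy C++/OpenMP kernel with the fine-grained, frequent-barrier pattern this barrier targets; run it with the three KMP_*_BARRIER_PATTERN settings above:
```
// Many implicit barriers (one per worksharing loop) with trivial
// compute between them.
void step(double *a, int n, int iters) {
#pragma omp parallel
  for (int t = 0; t < iters; ++t) {
#pragma omp for
    for (int i = 0; i < n; ++i)
      a[i] += 1.0;
    // implicit barrier at the end of each '#pragma omp for'
  }
}
```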
Differential Revision: https://reviews.llvm.org/D103121
Currently, `access` doesn't recognize a dereferenced smart pointer. So,
`access(e, "field")` where `e = *x`, yields:
* `x->field`, for normal-pointer x,
* `(*x).field`, for smart-pointer x.
This patch normalizes the handling of smart pointers to match normal
pointers, when the smart pointer type supports `->`.
Differential Revision: https://reviews.llvm.org/D104390
Adds an integration test for the SPMM (sparse matrix multiplication) kernel, which multiplies a sparse matrix by a dense matrix, resulting in a dense matrix. This is just a simple modification of the existing matrix-vector multiplication kernel.
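For reference, the computation in plain scalar C++ (a CSR sketch, not the generated MLIR kernel):
```
// C[i][j] += A[i][k] * B[k][j], iterating only the nonzeros of the
// sparse matrix A, stored in compressed sparse row (CSR) form.
void spmm(int m, int n, const int *rowptr, const int *col,
          const double *val, const double *B, double *C) {
  for (int i = 0; i < m; ++i)
    for (int p = rowptr[i]; p < rowptr[i + 1]; ++p)
      for (int j = 0; j < n; ++j)
        C[i * n + j] += val[p] * B[col[p] * n + j];
}
```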
Reviewed By: aartbik
Differential Revision: https://reviews.llvm.org/D104334
Currently, `hasUnaryOperand` fails for the overloaded `operator*`. This patch fixes the bug and
adds tests for this case.
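For example (a hedged sketch; the in-tree tests may spell it differently), the fix lets a matcher like this hit the overloaded dereference `*p`:
```
#include "clang/ASTMatchers/ASTMatchers.h"
using namespace clang::ast_matchers;

// Matches a call to an overloaded unary 'operator*' (e.g. '*p' on a
// smart pointer) and binds its single operand.
auto Matcher = cxxOperatorCallExpr(hasUnaryOperand(declRefExpr().bind("arg")));
```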
Differential Revision: https://reviews.llvm.org/D104389
Make the store-to-load forwarding condition for -memref-dataflow-opt less
conservative. Post-dominance info is not really needed. Add an additional
check for common cases.
Differential Revision: https://reviews.llvm.org/D104174
To control the number of outer parallel loops, we need to process the
outer loops first; hence, a pre-order walk fixes the issue.
Reviewed By: bondhugula
Differential Revision: https://reviews.llvm.org/D104361
This allows dialects to do different post-processing with the inliner depending on the operations involved (my use case requires different attribute propagation rules depending on the call op). This hook runs before the regular processInlinedBlocks method.
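A hedged sketch of what a dialect override might look like (assuming the hook is the processInlinedCallBlocks method on DialectInlinerInterface; details may differ):
```
#include "mlir/Transforms/InliningUtils.h"
using namespace mlir;

// Dialect-specific post-processing that runs before the regular
// processInlinedBlocks handling.
struct MyInlinerInterface : public DialectInlinerInterface {
  using DialectInlinerInterface::DialectInlinerInterface;
  void processInlinedCallBlocks(
      Operation *call,
      iterator_range<Region::iterator> inlinedBlocks) const final {
    // e.g. apply attribute propagation rules that depend on 'call'
  }
};
```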
Differential Revision: https://reviews.llvm.org/D104399
Put the dtor of mca::CustomBehaviour into the cpp file to avoid
undefined vtable when linking libLLVMMCACustomBehaviourAMDGPU as shared
library.
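For reference, the classic pattern: declare the virtual destructor in the header and define it in exactly one .cpp file, which anchors the vtable there:
```
// CustomBehaviour.h
class CustomBehaviour {
public:
  virtual ~CustomBehaviour(); // declared here, not defined inline
};

// CustomBehaviour.cpp
CustomBehaviour::~CustomBehaviour() = default; // vtable emitted here
```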
Differential Revision: https://reviews.llvm.org/D104401
In a region with multiple blocks, the verifier will try to check
dominance and may query a block's successor list, even though the block
may be empty or may not end with a terminator.
Differential Revision: https://reviews.llvm.org/D104411
I removed them in rG5de7467e982 but @thakis pointed out that
they were useful to keep, so here they are again. I've also converted
the `!isCoalescedWeak()` asserts into `!shouldOmitFromOutput()` asserts,
since the latter check subsumes the former.
Reviewed By: #lld-macho, thakis
Differential Revision: https://reviews.llvm.org/D104169
Previously, dangling samples were represented by INT64_MAX in the sample profile, while probes that were never executed were not reported. This was based on the observation that dangling probes made up a smaller portion than zero-count probes. However, with compiler optimizations, dangling probes end up being a large portion of all probes in general, and reporting them does not make sense from a profile size point of view. This change flips sample reporting by reporting zero-count probes instead, which enables a dangling probe to be represented by nothing (a missing entry in the profile). This has several benefits:
1. Reducing sample profile size in optimize mode, even when the number of non-executed probes outnumbers the number of dangling probes, since INT64_MAX takes more space than 0 to encode.
2. Binary size savings. No need to encode dangling probe anymore, since missing probes are treated as dangling in the profile reader.
3. Reducing compiler work to track dangling probes. However, for probes that are really dead and removed, we still need the compiler to identify them so that they can be reported as zero-count, instead of being mistreated as dangling probes.
4. Improving counts quality by respecting the counts already collected on the non-dangling copy of a probe. A probe, when duplicated, gets two copies at runtime. If one of them is dangling while the other is not, merging the two probes at profile generation time will cause the real samples collected on the non-dangling one to be discarded. Not reporting the dangling counterpart will keep the real samples.
5. Better readability.
6. Consistency with the non-CS dwarf line-number-based profile. Zero counts are trusted by the compiler's count inferencer, while missing counts will be inferred by the compiler.
Note that the current patch does not include any work for #3. There will be follow-up changes.
For #1, I've seen the text profile for a large Facebook service reduced by 7%. For the extbinary profile, the size of the LBRProfileSection is reduced by 35%.
For #4, I've seen the general counts quality for SPEC2017 improve by 10%.
Reviewed By: wenlei, wlei, wmi
Differential Revision: https://reviews.llvm.org/D104129
We have several ways of introducing a scalar invariant value into
linalg generic ops (should we limit this somewhat?). This revision
makes sure we handle all of them correctly in the sparse compiler.
Reviewed By: gysit
Differential Revision: https://reviews.llvm.org/D104335
I'm not sure what behavior we want if the FP environment is
not default (also not sure if there's a way to enumerate
the full list of intrinsics programmatically), but currently
these are all defaulting to 'false' (doesn't propagate).
When chasing down another unrelated bug, I noticed that the
implementations of various character intrinsic functions assume
that the lower bounds of (some of) their arguments are 1.
This isn't necessarily the case, so I've cleaned them up, tweaked
the unit tests to exercise the fix, and regularized the allocation
pattern used for results to call SetBounds() before Allocate() rather
than going through the original Descriptor::Allocate() wrapper around
CFI_allocate().
Since there were only a few other remaining uses of that original
Descriptor::Allocate() wrapper, I also converted them to the new
pattern and deleted the old one.
Differential Revision: https://reviews.llvm.org/D104325
When using FileIndexRecord with macros, symbol references can be seen
out of source order, which made inserting the symbols into a sorted
vector a performance regression. Instead, we now sort the vector lazily. The
impact is small on most code, but in very large files with many macro
references (M) near the beginning of the file followed by many decl
references (D) it was O(M*D). A particularly bad protobuf-generated
header was observed with a 100% regression in practice.
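The pattern, as a standalone sketch with hypothetical names:
```
#include <algorithm>
#include <utility>
#include <vector>

// Append possibly out-of-order references in O(1); sort once, only
// when a consumer actually needs ordered data.
struct LazySortedRefs {
  std::vector<std::pair<unsigned, unsigned>> Refs; // (offset, symbol id)
  bool IsSorted = true;
  void add(unsigned Offset, unsigned Id) {
    if (!Refs.empty() && Offset < Refs.back().first)
      IsSorted = false; // defer the sorting work
    Refs.emplace_back(Offset, Id);
  }
  const std::vector<std::pair<unsigned, unsigned>> &sorted() {
    if (!IsSorted) {
      std::sort(Refs.begin(), Refs.end());
      IsSorted = true;
    }
    return Refs;
  }
};
```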
rdar://78628133
There is no need to differentiate whether `UseSegments` is true or
false. Unifying the cases makes the behavior closer to BinaryWriter.
This improves compatibility with objcopy because SHF_ALLOC sections not in
a PT_LOAD will not be skipped. Such cases are usually erroneous input, though.
Reviewed By: jhenderson
Differential Revision: https://reviews.llvm.org/D104186
The original change was pushed in main as commit f7a23ecece.
It was then reverted by commit a04f01bab2 because it caused linker failures
on buildbots that don't build the AMDGPU target.
--
Some instructions are not defined well enough within the target’s scheduling
model for llvm-mca to be able to properly simulate their behaviour. The ideal
solution to this situation is to modify the scheduling model, but that’s not
always a viable strategy. Maybe other parts of the backend depend on that
instruction being modelled the way that it is. Or maybe the instruction is quite
complex and it’s difficult to fully capture its behaviour with tablegen. The
CustomBehaviour class (which I will refer to as CB frequently) is designed to
provide intuitive scaffolding for developers to implement the correct modelling
for these instructions.
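For a flavour of the interface (hook name as introduced by the patch; surrounding details simplified), a target subclasses CB and reports extra stall cycles for hazards the scheduling model cannot express:
```
#include "llvm/MCA/CustomBehaviour.h"
using namespace llvm;
using namespace llvm::mca;

class MyCustomBehaviour : public CustomBehaviour {
public:
  using CustomBehaviour::CustomBehaviour;
  // Return how many cycles 'IR' must stall due to dependencies on the
  // already-issued instructions; 0 means no custom hazard.
  unsigned checkCustomHazard(ArrayRef<InstRef> IssuedInst,
                             const InstRef &IR) override {
    return 0;
  }
};
```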
More details are available in the original commit log message (f7a23ecece).
Differential Revision: https://reviews.llvm.org/D104149
This changeset removes libelf usage from the elf_common part of the plugins.
libelf is still used in x86_64 generic plugin code and in some plugins
(e.g. amdgpu) - these will have to be cleaned up in separate checkins.
Differential Revision: https://reviews.llvm.org/D103545
We already have this fold:
fadd float poison, 1.0 --> poison
...via ConstantFolding, so this makes the behavior consistent
if the other operand(s) are non-constant.
The fold for undef was added before poison existed as a
value/type in IR.
This came up in D102673 / D103169
because we're trying to sort out the more complicated handling
for constrained math ops.
We should have the handling for the regular instructions done
first, so we can build on that (or diverge as needed).
Differential Revision: https://reviews.llvm.org/D104383
Currently, the implementation combines OOP and overloads, using a template to
tie the two together. In practice, this has proven confusing with no
benefits. This patch simplifies the code to use standard OOP design (a
collection of classes deriving from an interface).
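Schematically, with generic names (not the actual classes in the patch):
```
#include <memory>
#include <string>
#include <utility>

// One interface, one concrete class per case, instead of overloads
// tied together by a template.
struct StencilInterface {
  virtual ~StencilInterface() = default;
  virtual std::string toString() const = 0;
};

struct RawTextStencil final : StencilInterface {
  std::string Text;
  explicit RawTextStencil(std::string T) : Text(std::move(T)) {}
  std::string toString() const override { return Text; }
};
```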
Differential Revision: https://reviews.llvm.org/D104317
It's a warning in ld64. While having LLD be stricter would be nice, it
makes it harder for it to be a drop-in replacement in existing builds.
Reviewed By: #lld-macho, thakis
Differential Revision: https://reviews.llvm.org/D104333
This only applies to FastISel. GlobalISel seems to sidestep
the issue.
This fixes https://bugs.llvm.org/show_bug.cgi?id=46996
One of the things we do in llvm is decide if a type needs
consecutive registers. Previously, we just checked if it
was an array or not.
(plus an SVE specific check that is not changing here)
This causes some confusion when you have arbitrary IR like:
```
%T1 = type { double, i1 };
define [ 1 x %T1 ] @foo() {
entry:
ret [ 1 x %T1 ] zeroinitializer
}
```
We see it is an array so we call CC_AArch64_Custom_Block
which bails out when it sees the i1, a type we don't want
to put into a block.
This leaves the location of the double in some kind of
intermediate state and leads to odd codegen, which then crashes
the backend because it doesn't know how to implement
what it's been asked for.
You get this:
```
renamable $d0 = FMOVD0
$w0 = COPY killed renamable $d0
```
Rather than this:
```
$d0 = FMOVD0
$w0 = COPY $wzr
```
The backend knows how to copy a 64-bit register to a 64-bit register,
but not a 64-bit to a 32-bit one. It could certainly be taught how, but
the real issue seems to be that we even try to assign a register block
in the first place.
This change makes the logic of
AArch64TargetLowering::functionArgumentNeedsConsecutiveRegisters
a bit more thorough. If we find an array, we also check that all the
nested aggregates in that array have a single member type.
Then CC_AArch64_Custom_Block's assumption of a type that looks
like [ N x type ] will be valid and we get the expected codegen.
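The shape of that stricter check, as a hedged sketch (helper name hypothetical; the real logic lives in functionArgumentNeedsConsecutiveRegisters):
```
#include "llvm/IR/DerivedTypes.h"
using namespace llvm;

// Walk an aggregate and confirm every leaf has one common type, so
// [ N x type ] is the only shape handed to CC_AArch64_Custom_Block.
static bool hasSingleElementType(Type *Ty, Type *&Elt) {
  if (auto *AT = dyn_cast<ArrayType>(Ty))
    return hasSingleElementType(AT->getElementType(), Elt);
  if (auto *ST = dyn_cast<StructType>(Ty)) {
    for (Type *M : ST->elements())
      if (!hasSingleElementType(M, Elt))
        return false;
    return true;
  }
  if (Elt && Elt != Ty)
    return false; // mixed leaf types: not a homogeneous block
  Elt = Ty;
  return true;
}
```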
New tests have been added to exercise these situations. Note that
some of the output is not ABI compliant. The aim of this change is
to simply handle these situations and not to make our processing
of arbitrary IR ABI compliant.
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D104123