llvm-project

Commit Graph

Author	SHA1	Message	Date
Paul Kirth	46774df307	[misexpect] Re-implement MisExpect Diagnostics Reimplements MisExpect diagnostics from D66324 to reconstruct its original checking methodology only using MD_prof branch_weights metadata. New checks rely on 2 invariants: 1) For frontend instrumentation, MD_prof branch_weights will always be populated before llvm.expect intrinsics are lowered. 2) for IR and sample profiling, llvm.expect intrinsics will always be lowered before branch_weights are populated from the IR profiles. These invariants allow the checking to assume how the existing branch weights are populated depending on the profiling method used, and emit the correct diagnostics. If these invariants are ever invalidated, the MisExpect related checks would need to be updated, potentially by re-introducing MD_misexpect metadata, and ensuring it always will be transformed the same way as branch_weights in other optimization passes. Frontend based profiling is now enabled without using LLVM Args, by introducing a new CodeGen option, and checking if the -Wmisexpect flag has been passed on the command line. Reviewed By: tejohnson Differential Revision: https://reviews.llvm.org/D115907	2022-03-31 17:38:21 +00:00
Joachim Protze	7641e42def	[OpenMP][Tools] Fix handling of initial-task-end Latest OpenMP spec says parallel_data is NULL for initial/implicit-task-end. We nevertheless need to cleanup the ParallelData here, as there is no other callback for the end of the implicit parallel region. We can use the reference stored in the TaskData. Reviewed By: dreachem Differential Revision: https://reviews.llvm.org/D114005	2022-03-31 12:33:40 -05:00
Thomas Symalla	1a6aa8b195	[AMDGPU] Add missing use check in SIOptimizeExecMasking pass. Whenever a v_cmp, s_and_saveexec instruction sequence shall be transformed to an equivalent s_mov, v_cmpx sequence, it needs to be detected if the v_cmp target register is used between the two instructions as the v_cmp result gets omitted by using the v_cmpx instruction, resulting in invalid code. Reviewed By: foad Differential Revision: https://reviews.llvm.org/D122797	2022-03-31 19:25:35 +02:00
Simon Pilgrim	535211c3eb	[X86] Remove redundant FIXME lowerV64I8Shuffle has been extended a lot since this was added.	2022-03-31 18:05:52 +01:00
Simon Pilgrim	fac1729924	[X86] lowerV64I8Shuffle - don't use lowerShuffleWithPERMV until we've tried simpler options Shuffle combining will still lower to this with better fast cross lane checks. Noticed while triaging Issue #54658	2022-03-31 18:05:51 +01:00
Okwan Kwon	65bdeddb1e	[mlir] Bubble up tensor.extract_slice above linalg operation Bubble up extract_slice above Linalg operation. A sequence of operations %0 = linalg.<op> ... arg0, arg1, ... %1 = tensor.extract_slice %0 ... can be replaced with %0 = tensor.extract_slice %arg0 %1 = tensor.extract_slice %arg1 %2 = linalg.<op> ... %0, %1, ... This results in the reduce computation of the linalg operation. The implementation uses the tiling utility functions. One difference from the tiling process is that we don't need to insert the checking code for the out-of-bound accesses. The use of the slice itself represents that the code writer is sure about the boundary condition. To avoid adding the boundary condtion check code, `omitPartialTileCheck` is introduced for the tiling utility functions. Differential Revision: https://reviews.llvm.org/D122437	2022-03-31 16:48:38 +00:00
Chris Bieneman	19054163e1	[HLSL] Further improve to numthreads diagnostics This adds diagnostics for conflicting attributes on the same declarataion, conflicting attributes on a forward and final declaration, and defines a more narrowly scoped HLSLEntry attribute target. Big shout out to @aaron.ballman for the great feedback and review on this!	2022-03-31 11:34:01 -05:00
Abinav Puthan Purayil	898d5776ec	[AMDGPU][GlobalISel] Scalarize add/sub with overflow ops in the legalizer Differential Revision: https://reviews.llvm.org/D122803	2022-03-31 21:46:34 +05:30
Abinav Puthan Purayil	db17ebd593	[AMDGPU][GlobalISel] Add end to end IR tests for add/sub with overflow Differential Revision: https://reviews.llvm.org/D122818	2022-03-31 21:46:34 +05:30
Aaron Ballman	2267549296	Fix the build after `cd26190a10` These variables were being used uninitialized and it caused a significant number of test failures on Windows.	2022-03-31 12:03:53 -04:00
Kirill Bobyrev	f43c4c5be2	Revert "[clangd] IncludeCleaner: Add support for IWYU pragma private" This reverts commit `4cb38bfe76`. Awkwardly enough, this builds Windows buildbots: http://45.33.8.238/win/55402/step_9.txt It is yet unclear why this is happening but I will need more time to diagnose the issue.	2022-03-31 17:59:52 +02:00
Michał Górny	09b53121c3	[compiler-rt] [scudo] Use -mcrc32 on x86 when available Update the hardware CRC32 logic in scudo to support using `-mcrc32` instead of `-msse4.2`. The CRC32 intrinsics use the former flag in the newer compiler versions, e.g. in clang since `12fa608af4`. With these compilers, passing `-msse4.2` is insufficient to enable the instructions and causes build failures when `-march` does not enable CRC32: /var/tmp/portage/sys-libs/compiler-rt-sanitizers-14.0.0/work/compiler-rt/lib/scudo/scudo_crc32.cpp:20:10: error: always_inline function '_mm_crc32_u32' requires target feature 'crc32', but would be inlined into function 'computeHardwareCRC32' that is compiled without support for 'crc32' return CRC32_INTRINSIC(Crc, Data); ^ /var/tmp/portage/sys-libs/compiler-rt-sanitizers-14.0.0/work/compiler-rt/lib/scudo/scudo_crc32.h:27:27: note: expanded from macro 'CRC32_INTRINSIC' # define CRC32_INTRINSIC FIRST_32_SECOND_64(_mm_crc32_u32, _mm_crc32_u64) ^ /var/tmp/portage/sys-libs/compiler-rt-sanitizers-14.0.0/work/compiler-rt/lib/scudo/../sanitizer_common/sanitizer_platform.h:132:36: note: expanded from macro 'FIRST_32_SECOND_64' # define FIRST_32_SECOND_64(a, b) (a) ^ 1 error generated. For backwards compatibility, use `-mcrc32` when available and fall back to `-msse4.2`. The `<smmintrin.h>` header remains in use as it still works and is compatible with GCC, while clang's `<crc32intrin.h>` is not. Originally reported in https://bugs.gentoo.org/835870. Differential Revision: https://reviews.llvm.org/D122789	2022-03-31 17:49:42 +02:00
Siva Chandra	97417e0300	[libc] Enable threads.h functions on aarch64. Reviewed By: lntue Differential Revision: https://reviews.llvm.org/D122788	2022-03-31 08:42:07 -07:00
Sven van Haastregt	4dfec37037	[OpenCL] Set MinVersion for sub_group_barrier with memory_scope The memory_scope enum is not available before OpenCL 2.0, so ensure the sub_group_barrier overload with a memory_scope argument is restricted to OpenCL 2.0 and above. This is already the case in opencl-c.h. Fixes the issue revealed by https://reviews.llvm.org/D120254 Reported-by: Harald van Dijk (hvdijk)	2022-03-31 16:41:40 +01:00
Mark de Wever	11c14bca58	[libc++][ci] Installs Japanese locale in Docker. The alternative outputs of std::put_time and std::strftime are the easiest to test with the Japanese locale. This is a preparation for the tests of the chrono formatters. Note since it takes a while before the Docker file changes propagate to the build nodes the verification of the locale is done in a separate patch. Reviewed By: ldionne, #libc Differential Revision: https://reviews.llvm.org/D122736	2022-03-31 17:41:06 +02:00
Mark de Wever	e3ad15d7ff	[libc++][doc] Update formatting status. Reduced the details of the non-chrono formatting information. This has been shipped and these details part of P0645 which is still documented. Removing this information keeps the information up-to-date. Adds the formatters required for the types chrono namespace. Reviewed By: ldionne, #libc Differential Revision: https://reviews.llvm.org/D122735	2022-03-31 17:37:51 +02:00
Vince Bridgers	4d5b824e3d	[analyzer] Avoid checking addrspace pointers in cstring checker This change fixes an assert that occurs in the SMT layer when refuting a finding that uses pointers of two different sizes. This was found in a downstream build that supports two different pointer sizes, The CString Checker was attempting to compute an overlap for the 'to' and 'from' pointers, where the pointers were of different sizes. In the downstream case where this was found, a specialized memcpy routine patterned after memcpy_special is used. The analyzer core hits on this builtin because it matches the 'memcpy' portion of that builtin. This cannot be duplicated in the upstream test since there are no specialized builtins that match that pattern, but the case does reproduce in the accompanying LIT test case. The amdgcn target was used for this reproducer. See the documentation for AMDGPU address spaces here https://llvm.org/docs/AMDGPUUsage.html#address-spaces. The assert seen is: `Solver->getSort(LHS) == Solver->getSort(RHS) && "AST's must have the same sort!"' Ack to steakhal for reviewing the fix, and creating the test case. Reviewed By: steakhal Differential Revision: https://reviews.llvm.org/D118050	2022-03-31 17:34:56 +02:00
Peter Waller	f1cb816f90	[AArch64][SVE] Mark {CNT*,RDVL,INDEX} as materializable Differential Revision: https://reviews.llvm.org/D122731	2022-03-31 15:28:24 +00:00
Fraser Cormack	ee51aefba0	[RISCV][NFC] Minor formatting fix	2022-03-31 16:15:22 +01:00
Jay Foad	e8e32e5714	[AMDGPU] Fix typo in RUN line	2022-03-31 16:23:40 +01:00
Wenju He	0bda12b5bc	[NewPM] Add OptimizerEarly module extension point VectorizerStart extension is module callback in old PM, but is function callback in new PM. We lack a module extension point between end of buildModuleSimplificationPipeline and the function optimization (including vectorizer) pipeline. So this patch adds a new module extension point before the function optimization pipeline. Reviewed By: aeubanks Differential Revision: https://reviews.llvm.org/D122296	2022-03-31 08:22:27 -07:00
Groverkss	152e501d87	[MLIR][Presburger] Carry IdKind information in LinearTransform::applyTo This patch fixes a bug in LinearTransform::applyTo where it did not carry the IdKind information, and instead treated every id as IdKind::Domain. Reviewed By: arjunp Differential Revision: https://reviews.llvm.org/D122823	2022-03-31 20:42:50 +05:30
Nico Weber	d2f7547f14	[gn build] (manually) port `19246b0779`	2022-03-31 11:10:18 -04:00
David Goldman	d9739f29cd	Serialize PragmaAssumeNonNullLoc to support preambles Previously, if a `#pragma clang assume_nonnull begin` was at the end of a premable with a `#pragma clang assume_nonnull end` at the end of the main file, clang would diagnose an unterminated begin in the preamble and an unbalanced end in the main file. With this change, those errors no longer occur and the case above is now properly handled. I've added a corresponding test to clangd, which makes use of preambles, in order to verify this works as expected. Differential Revision: https://reviews.llvm.org/D122179	2022-03-31 11:08:01 -04:00
Changpeng Fang	1711020c37	AMDGPU: Use isLiteralConstantLike to check whether the operand could ever be literal Summary: To compute the size of a VALU/SALU instruction, we need to check whether an operand could ever be literal. Previously isLiteralConstant was used, which missed cases like global variables or external symbols. These misses lead to under-estimation of the instruction size and branch offset, and thus incorrectly skip the necessary branch relaxation when the branch offset is actually greater than what the branch bits can hold. In this work, we use isLiteralConstantLike to check the operands. It maybe conservative, but it is safe. Reviewers: arsenm Differential Revision: https://reviews.llvm.org/D122778	2022-03-31 08:06:31 -07:00
Louis Dionne	0a460416e6	[libc++] Install psutil on the macOS nodes	2022-03-31 10:52:58 -04:00
Nikita Popov	0721d7c4d8	[X86] Add test for PR54369 (NFC)	2022-03-31 16:45:05 +02:00
Carlo Marcelo Arenas Belón	81f5c6270c	[compiler-rt] Implement __clear_cache on FreeBSD/powerpc `dd9173420f` (Add clear_cache implementation for ppc64. Fix buffer to meet ppc64 alignment., 2017-07-28), adds an implementation for __builtin___clear_cache on powerpc64, which was promptly ammended to also be used with big endian mode in `f67036b62c` (This ppc64 implementation of clear_cache works for both big and little endian., 2017-08-02) clang will use this implementation for it's builtin on FreeBSD and result in an abort() in the cases where 32-bit generation was requested (ex in macppc or when the big endian powerpc64 build was done with "-m32") and as reported[1] recently with pcre2, but there is no reason why the same code couldn't be used in those cases, so use instead the more generic identifier for the PowerPC architecture. While at it, update the comment to reflect that POWER8/9 have a 128 byte wide cache line and so the code could instead use 64 byte windows instead but that possible optimization has been punted for now. [1] https://github.com/PhilipHazel/pcre2/issues/92 Reviewed By: jhibbits, #powerpc, MaskRay Differential Revision: https://reviews.llvm.org/D122640	2022-03-31 14:19:26 +00:00
Arjun P	9615d717d1	[MLIR][Presburger] IntegerRelation::truncate: fix bug when truncating equalities This was truncating inequalities instead of equalities. Reviewed By: Groverkss Differential Revision: https://reviews.llvm.org/D122811	2022-03-31 15:16:30 +01:00
Nikita Popov	33ac23e7cf	[Float2Int] Avoid unnecessary lamdbas (NFC) Instead of first creating a lambda for calculating the range, then collecting the ranges for the operands, and then calling the lambda on those ranges, we can first calculate the operand ranges and then calculate the result directly in the switch.	2022-03-31 16:13:13 +02:00
Nikita Popov	f66975555f	[Float2Int] Extract calcRange() method (NFC) This avoids the awkward "Abort" flag, because we can simply early-return instead.	2022-03-31 16:13:13 +02:00
Arjun P	d81fa76f3a	[MLIR][Presburger] MultiAffineFunction:eliminateRedundantLocalId: fix bug where local offset was not considered Previously, when updating the outputs matrix, the local offset was not being considered. Reviewed By: Groverkss Differential Revision: https://reviews.llvm.org/D122812	2022-03-31 15:11:55 +01:00
Florian Hahn	8378a71b6c	Recommit "[LV] Remove unneeded createHeaderBranch.(NFCI)" This reverts the revert commit `2760cdc9c6`. This version pulls in the code to create the vector loop object in VPlan from D121624. This is needed because otherwise existing LoopInfo verification will fail, as a loop block doesn't have in-loop successors now that we do not replace the branch. Now that we do not add new loops during skeleton construction, there's also no need to verify LI there.	2022-03-31 14:48:32 +01:00
Louis Dionne	19246b0779	[libc++] Remove the __libcpp_version file It seems to have been added back in `761e42fa3d` for Clang to use it, however it seems to have never been used for that purpose, so it is probably fine to remove it. Differential Revision: https://reviews.llvm.org/D122330	2022-03-31 09:34:41 -04:00
Vy Nguyen	33b3c86afa	Revert "[llvm-readobj][MachO] Add option to sort the symbol table before dumping (MachO only, for now)." This reverts commit `ea9cf2dc96`. Broke LLDB - reverting to investigage	2022-03-31 09:33:32 -04:00
Louis Dionne	cb055e51f9	[libc++] Add a CI job running MSAN For some reason, we've been going without a MSAN CI job, even though even run-buildbot defined a generic-msan job. This must have been an oversight that went unnoticed. Thanks to @EricWF for the catch. Differential Revision: https://reviews.llvm.org/D120851	2022-03-31 09:31:22 -04:00
Yitzhak Mandelbaum	7f076004e9	[clang][dataflow] Add support for `value_or` in a comparison. This patch adds limited modeling of the `value_or` method. Specifically, when used in a particular idiom in a comparison to implicitly check whether the optional holds a value. Differential Revision: https://reviews.llvm.org/D122231	2022-03-31 13:21:39 +00:00
Sanjay Patel	4a54e3eed3	[x86] try to replace 0.0 in fcmp with negated operand This inverts a fold recently added to IR with: `3491f2f4b0` We can put -bidirectional on the Alive2 examples to show that the reverse transforms work: https://alive2.llvm.org/ce/z/8iVQwB The motivation for the IR change was to improve matching to 'fabs' in IR (see https://github.com/llvm/llvm-project/issues/38828 ), but it regressed x86 codegen for 'not-quite-fabs' patterns like (X > -X) ? X : -X. Ie, when there is no fast-math (nsz), the cmp+select is not a proper fabs operation, but it does map nicely to the unusual NAN semantics of MINSS/MAXSS. I drafted this as a target-independent fold, but it doesn't appear to help any other targets and seems to cause regressions for SystemZ at least. Differential Revision: https://reviews.llvm.org/D122726	2022-03-31 09:17:49 -04:00
Vy Nguyen	ea9cf2dc96	[llvm-readobj][MachO] Add option to sort the symbol table before dumping (MachO only, for now). This would help making tests less brittle as the order will be fixed. (see also PR/53026) Differential Revision: https://reviews.llvm.org/D116787	2022-03-31 09:13:31 -04:00
Jay Foad	fdaf606c8e	[AMDGPU] Fix last remaining checks in perfhint.ll Unfortunately this just shows that the test case for D47740 never really tested what it was supposed to test. Differential Revision: https://reviews.llvm.org/D122664	2022-03-31 13:39:15 +01:00
Fraser Cormack	a276d1f44b	[RISCV][NFC] Fix formatting on one line	2022-03-31 13:17:37 +01:00
Sergei Lebedev	e1fdd8048c	Fixed the type of context in type stubs for MLIR Python bindings Reviewed By: ftynse Differential Revision: https://reviews.llvm.org/D122795	2022-03-31 14:28:10 +02:00
Abinav Puthan Purayil	2f284b0ff9	[AMDGPU] Regenerate checks in some mir tests	2022-03-31 17:49:00 +05:30
Serge Pavlov	47b3b76825	Implement inlining of strictfp functions According to the current design, if a floating point operation is represented by a constrained intrinsic somewhere in a function, all floating point operations in the function must be represented by constrained intrinsics. It imposes additional requirements to inlining mechanism. If non-strictfp function is inlined into strictfp function, all ordinary FP operations must be replaced with their constrained counterparts. Inlining strictfp function into non-strictfp is not implemented as it would require replacement of all FP operations in the host function, which now is undesirable due to expected performance loss. Differential Revision: https://reviews.llvm.org/D69798	2022-03-31 19:15:52 +07:00
Alexandros Lamprineas	b4417075dc	[FuncSpec] Constant propagate multiple arguments for recursive functions. This fixes a TODO in constantArgPropagation() to make it feature complete. However, I do find myself in agreement with the review comments in https://reviews.llvm.org/D106426. I don't think we should pursue specializing such recursive functions as the code size increase becomes linear to 'max-iters'. Compiling the modified test just with -O3 (no function specialization) generates the same code. Differential Revision: https://reviews.llvm.org/D122755	2022-03-31 13:00:08 +01:00
Priyansh Singh	1cb299165c	Fixed minor documentation issues Fixed whitespace and punctuation issues, added a name to a link, and fixed a typo.	2022-03-31 07:37:45 -04:00
Florian Hahn	2760cdc9c6	Revert "[LV] Remove unneeded createHeaderBranch.(NFCI)" This reverts commit `32bc83d11e`. This is causing bots with expensive-checks to fail. Revert while I investigate.	2022-03-31 12:32:50 +01:00
Abinav Puthan Purayil	acf83abcbf	[AMDGPU][GlobalISel] Remove unused variable. NFC.	2022-03-31 16:50:34 +05:30
Luo, Yuanke	6753eb0c90	[X86][AMX] Materialize undef or zero value to tilezero The AMX combiner would store undef or zero to stack and invoke tileload to load the data to tile register. To avoid the store/load, we can materialzie undef or zero value to tilezero. Differential Revision: https://reviews.llvm.org/D122714	2022-03-31 19:10:28 +08:00
Kirill Bobyrev	4cb38bfe76	[clangd] IncludeCleaner: Add support for IWYU pragma private Reviewed By: sammccall Differential Revision: https://reviews.llvm.org/D120306	2022-03-31 12:49:52 +02:00

1 2 3 4 5 ...

419683 Commits All Branches Search

419683 Commits

All Branches