llvm-project

Commit Graph

Author	SHA1	Message	Date
Dmitry Vyukov	b988d8ddc2	tsan: remove unnecessary brackets Reviewed By: melver Differential Revision: https://reviews.llvm.org/D130236	2022-07-21 12:11:44 +02:00
Nikita Popov	8d58c8e57b	Reapply [InstCombine] Don't check for alloc fn before fetching alloc size Reapply the patch with getObjectSize() replaced by getAllocSize(). The former will also look through calls that return their argument, and we'll end up placing dereferenceable attributes on intrinsics like llvm.launder.invariant.group. While this isn't wrong, it also doesn't seem to be particularly useful. For now, use getAllocSize() instead, which sticks closer to the original behavior of this code. ----- This code is just interested in the allocsize, not any other allocator properties.	2022-07-21 11:48:24 +02:00
Nikita Popov	d144ae6e1b	[MemoryBuiltins] Default to trivial mapper in getAllocSize() (NFC) Default getAllocSize() to use the trivial mapper. Also switch from using std::function to function_ref. Furthermore, update the doc comment to point out a subtle difference between getAllocSize() and getObjectSize(): The latter may also return something for calls that return their argument (via "returned" attribute or special intrinsics like invariant groups).	2022-07-21 11:43:48 +02:00
jacquesguan	e60eb7053d	recommit "[DAGCombiner] Teach scalarizeBinOpOfSplats handle scalable splat." With fix for AArch64 and Hexagon test cases.	2022-07-21 17:34:34 +08:00
Nikita Popov	235fb602ed	[MemoryBuiltins] Don't query TLI for non-pointer functions (NFC) Fetching allocation data for calls is a rather hot operation, and TLI lookups are slow. We can greatly reduce the number of calls for which TLI is queried by checking that they return a pointer value first, as this is a requirement for allocation functions anyway.	2022-07-21 11:28:36 +02:00
Chuanqi Xu	ea623af7c9	[C++20] [Modules] Avoid infinite loop when iterating default args Currently, clang may hit an infinite loop in a very tricky case when it iterates the default args. This patch tries to fix this by adding a `fixed` check.	2022-07-21 17:25:05 +08:00
Andrzej Warzynski	7c49f56956	[flang][nfc] Add missing `REQUIRES: asserts` in tests Tests that use `--mlir-pass-statistics-display=` from MLIR require the following condition to hold: (extracted from LLVM's Statistics.h): ``` #define LLVM_ENABLE_STATS 1 ``` This is normally enforced with `REQUIRES: asserts`. This patch updates relevant Flang tests accordingly. For "Release" builds (with assertions disabled), the affected tests will be failing without this change. Differential Revision: https://reviews.llvm.org/D130185	2022-07-21 09:22:01 +00:00
Ivan Butygin	d4217e6cc8	[mlir][memref] Missing type conversion in memref.reshape llvm lowering Shape can be a memref of index type, so the memref::LoadOp result needs to be converted into an LLVM type. Differential Revision: https://reviews.llvm.org/D129965	2022-07-21 11:15:35 +02:00
Nikita Popov	70056d04e2	Revert "[InstCombine] Don't check for alloc fn before fetching object size" This reverts commit `c72c22c04d`. This affected an Analysis test that I missed. Reverting for now.	2022-07-21 10:59:12 +02:00
Nikita Popov	c72c22c04d	[InstCombine] Don't check for alloc fn before fetching object size This code is just interested in the allocsize, not any other allocator properties.	2022-07-21 10:45:03 +02:00
Qiu Chaofan	708084ec37	[PowerPC] Support x86 compatible intrinsics on AIX These headers used to be guarded only on PowerPC64 Linux or FreeBSD, but they can also be enabled for AIX OS target since it's big-endian ready. Reviewed By: shchenz Differential Revision: https://reviews.llvm.org/D129461	2022-07-21 16:33:41 +08:00
Chen Zheng	bc5c637376	enable P10 vector builtins test on AIX 64 bit; NFC Verify that P10 vector builtins with type `vector signed __int128` and `vector unsigned __int128` work well on AIX 64 bit.	2022-07-21 04:23:02 -04:00
Iain Sandoe	97af17c5ca	re-land [C++20][Modules] Update handling of implicit inlines [P1779R3] re-land fixes an unwanted interaction with module-map modules, seen in Greendragon testing. This provides updates to [class.mfct]: Pre C++20 [class.mfct]p2: A member function may be defined (8.4) in its class definition, in which case it is an inline member function (7.1.2) Post C++20 [class.mfct]p1: If a member function is attached to the global module and is defined in its class definition, it is inline. and [class.friend]: Pre-C++20 [class.friend]p5 A function can be defined in a friend declaration of a class . . . . Such a function is implicitly inline. Post C++20 [class.friend]p7 Such a function is implicitly an inline function if it is attached to the global module. We add the output of implicit-inline to the TextNodeDumper, and amend a couple of existing tests to account for this, plus add tests for the cases covered above. Differential Revision: https://reviews.llvm.org/D129045	2022-07-21 09:17:01 +01:00
lorenzo chelini	2ed7c3fd84	[MLIR][SCF] Enable better bufferization for `TileConsumerAndFuseProducersUsingSCFForOp` Replace iterators of the outermost loop with region arguments of the innermost one. The changes avoid later `bufferization` passes to insert allocation within the body of the innermost loop. Reviewed By: mravishankar Differential Revision: https://reviews.llvm.org/D130083	2022-07-21 10:14:26 +02:00
Haojian Wu	2955192df8	[pseudo] Make sure we rebuild pseudo_gen tool.	2022-07-21 10:09:21 +02:00
Daniel Bertalan	54e18b2397	[lld-macho] Optimize rebase opcode generation This commit reduces the size of the emitted rebase sections by generating the REBASE_OPCODE_DO_REBASE_ADD_ADDR_ULEB and REBASE_OPCODE_DO_REBASE_ULEB_TIMES_SKIPPING_ULEB opcodes. With this change, chromium_framework's rebase section is a 40% smaller 197 kilobytes, down from the previous 320 kB. That is 6 kB smaller than what ld64 produces for the same input. Performance figures from my M1 Mac mini: x before + after N Min Max Median Avg Stddev x 10 4.2269349 4.3300061 4.2689675 4.2690016 0.031151669 + 10 4.219331 4.2914009 4.2398136 4.2448277 0.023817308 No difference proven at 95.0% confidence Differential Revision: https://reviews.llvm.org/D130180	2022-07-21 10:00:39 +02:00
Zi Xuan Wu (Zeson)	08db089124	[CSKY] Fix the testcase error due to the verifyInstructionPredicates - Test cases for arch only has 16-bit instruction such as ck801/ck802 need compile with -mattr=+btst16 - Fix the GPR copy instruction with MOV16 for 16-bit only arch.	2022-07-21 15:53:50 +08:00
Chen Zheng	ecdeabef38	enable P10 vector builtins test on AIX 64 bit; NFC Verify that P10 vector builtins with type `vector signed __int128` and `vector unsigned __int128` work well on AIX 64 bit.	2022-07-21 03:51:30 -04:00
lorenzo chelini	7f1c03171d	Revert "[RFC][MLIR][SCF] Enable better bufferization for `TileConsumerAndFuseProducersUsingSCFForOp`" This reverts commit `9e65850305`.	2022-07-21 09:40:30 +02:00
Nikita Popov	f45ab43332	[MemoryBuiltins] Avoid isAllocationFn() call before checking removable alloc Allow directly checking whether a given call is a removable allocation, instead of first checking whether it is an allocation.	2022-07-21 09:39:19 +02:00
Rainer Orth	3776db9a4f	[sanitizer_common] Support Solaris < 11.4 in GetStaticTlsBoundary This patch, on top of D120048 <https://reviews.llvm.org/D120048>, supports GetTls on Solaris 11.3 and Illumos that lack `dlpi_tls_modid`. It's the same method originally used in D91605 <https://reviews.llvm.org/D91605>, but integrated into `GetStaticTlsBoundary`. Tested on `amd64-pc-solaris2.11`, `sparcv9-sun-solaris2.11`, and `x86_64-pc-linux-gnu`. Differential Revision: https://reviews.llvm.org/D120059	2022-07-21 09:18:10 +02:00
David Green	23d6186be0	[SelectionDAG] Fix fptoi.sat scalable vector lowering Vector fptosi_sat and fptoui_sat were being expanded by unrolling the vector operation. This doesn't work for scalable vector, so this patch adds a call to TLI.expandFP_TO_INT_SAT if the vector is scalable. Scalable tests are added for AArch64 and RISCV. Some of the AArch64 fptoi_sat operations should be legal, but that will be handled in another patch. Differential Revision: https://reviews.llvm.org/D130028	2022-07-21 08:00:22 +01:00
lorenzo chelini	9e65850305	[RFC][MLIR][SCF] Enable better bufferization for `TileConsumerAndFuseProducersUsingSCFForOp` Replace iterators of the outermost loop with region arguments of the innermost one. The changes avoid later `bufferization` passes to insert allocation within the body of the innermost loop. Reviewed By: mravishankar Differential Revision: https://reviews.llvm.org/D130083	2022-07-21 08:56:50 +02:00
Luo, Yuanke	cc72af4e13	[X86] Add test case for shuffle	2022-07-21 14:42:03 +08:00
Congzhe Cao	05ccde8023	[LoopCacheAnalysis] Fix a type mismatch problem in cost calculation There is a problem in loop cache analysis that the types of SCEV variables `Coeff` and `ElemSize` in function `isConsecutive()` may not match. The mismatch would cause SCEV failures when `Coeff` is multiplied with `ElemSize`. The fix in this patch is to extend the type of both `Coeff` and `ElemSize` to whichever is wider in those two variables. As a clean-up, duplicate calculations of `Stride` in `computeRefCost()` is then removed. Reviewed By: Meinersbur, #loopoptwg Differential Revision: https://reviews.llvm.org/D128877	2022-07-21 01:57:05 -04:00
Shraiysh Vaishay	61fa7a88c7	[clang][OpenMP] Add IRBuilder support for taskgroup This patch makes use of OMPIRBuilder support for codegen of taskgroup construct in clang. Depends on D128203 Reviewed By: Meinersbur Differential Revision: https://reviews.llvm.org/D129992	2022-07-21 11:13:57 +05:30
Craig Topper	add17fc8e4	[RISCV] Combine (select_cc (srl (and X, 1<<C), C), 0, eq/ne, true, false) (srl (and X, 1<<C), C) is the form we receive for testing bit C. An earlier combine removed the setcc so it wasn't there to match when we created the SELECT_CC. This doesn't happen for BR_CC because generic DAG combine rebuilds the setcc if it is used by BRCOND. We can shift X left by XLen-1-C to put the bit to be tested in the MSB, and use a signed compare with 0 to test the MSB.	2022-07-20 22:32:11 -07:00
Ian Anderson	28800c2e18	[sanitizer] Use consistent checks for XDR sanitizer_platform_limits_posix.h defines `__sanitizer_XDR ` if `SANITIZER_LINUX && !SANITIZER_ANDROID`, but sanitizer_platform_limits_posix.cpp tries to check it if `HAVE_RPC_XDR_H`. This coincidentally works because macOS has a broken <rpc/xdr.h> which causes `HAVE_RPC_XDR_H` to be 0, but if <rpc/xdr.h> is fixed then clang fails to compile on macOS. Restore the platform checks so that <rpc/xdr.h> can be fixed on macOS. Reviewed By: vitalybuka Differential Revision: https://reviews.llvm.org/D130060	2022-07-20 22:28:07 -07:00
esmeyi	339392ecf2	[AIX] follow-up of D124654. Emitting the remaining aliases instead of reporting an error to avoid SPEC2017 PEAK failures. And mark this as a TODO.	2022-07-21 01:10:09 -04:00
Mahesh Ravishankar	485190df95	[mlir][Linalg] Deprecate `tileAndFuseLinalgOps` method and associated patterns. The `tileAndFuseLinalgOps` is a legacy approach for tiling + fusion of Linalg operations. Since it was also intended to work on operations with buffer operands, this method had fairly complex logic to make sure tile and fuse was correct even with side-effecting linalg ops. While complex, it still wasn't robust enough. This patch deprecates this method and thereby deprecates the tiling + fusion method for ops with buffer semantics. Note that the core transformation to do fusion of a producer with a tiled consumer still exists. The deprecation here only removes methods that auto-magically tried to tile and fuse correctly in presence of side-effects. The `tileAndFuseLinalgOps` also works with operations with tensor semantics. There are at least two other ways the same functionality exists. 1) The `tileConsumerAndFuseProducers` method. This does a similar transformation, but using a slightly different logic to automatically figure out the legal tile + fuse code. Note that this is also to be deprecated soon. 2) The preferred way uses the `TilingInterface` for tile + fuse, and relies on the caller to set the tiling options correctly to ensure that the generated code is correct. As proof that (2) is equivalent to the functionality provided by `tileAndFuseLinalgOps`, relevant tests have been moved to use the interface, where the test driver sets the tile sizes appropriately to generate the expected code. Differential Revision: https://reviews.llvm.org/D129901	2022-07-21 05:05:06 +00:00
owenca	a4c62f6654	[clang-format][NFC] Refactor RequiresDoesNotChangeParsingOfTheRest Differential Revision: https://reviews.llvm.org/D129982	2022-07-20 21:56:48 -07:00
owenca	892a9968ec	[clang-format] Indent tokens after hash only if it starts a line Fixes #56602. Differential Revision: https://reviews.llvm.org/D130136	2022-07-20 21:52:17 -07:00
Craig Topper	7dda6c71b1	[RISCV] Refactor the common combines for SELECT_CC and BR_CC into a helper function. The only difference between the combines were the calls to getNode that include the true/false values for SELECT_CC or the chain and branch target for BR_CC. Wrap the rest of the code into a helper that reads LHS, RHS, and CC and outputs new values and a bool if a new node needs to be created.	2022-07-20 21:18:07 -07:00
Xi Ruoyao	bba1f26f2e	Port address sanitizer to LoongArch Depends on D129371. It survived all GCC ASan tests. Changes are trivial and mostly "borrowed" RISC-V logics, except that a different SHADOW_OFFSET is used. Reviewed By: SixWeining, MaskRay, XiaodongLoong Differential Revision: https://reviews.llvm.org/D129418	2022-07-21 11:32:21 +08:00
jacquesguan	9c22853ec4	[mlir][Math] Add constant folder for LogOp. This patch adds constant folder for LogOp which only supports single and double precision floating-point. Reviewed By: ftynse Differential Revision: https://reviews.llvm.org/D130148	2022-07-21 10:58:32 +08:00
Chenbing Zheng	8c124c9088	[InstCombine] (ShiftValC >> Y) >s -1/<s 0 --> Y != 0/==0 We can do folds (ShiftValC >> Y) >s -1 --> Y != 0 and (ShiftValC >> Y) <s 0 --> Y == 0, with ShiftValC < 0. Alive2: https://alive2.llvm.org/ce/z/-PRHfD Reviewed By: spatel Differential Revision: https://reviews.llvm.org/D129726	2022-07-21 10:12:29 +08:00
Chenbing Zheng	8075f680c8	[InstCombine] add fold (X > C - 1) ^ (X < C + 1) --> X != C Considering the correctness of this pattern, we should avoid that C - 1 is non-negative and C + 1 is negative. Alive2: https://alive2.llvm.org/ce/z/c_rBaq Reviewed By: spatel Differential Revision: https://reviews.llvm.org/D129622	2022-07-21 10:08:21 +08:00
Vitaly Buka	82995e0e82	[NFC][asan] Clang-format a code	2022-07-20 18:57:37 -07:00
Vitaly Buka	e8554402b3	[NFC][memprof] Remove unused code	2022-07-20 18:50:45 -07:00
Vitaly Buka	26a7ee3d54	[NFC][asan] Use RoundDownTo	2022-07-20 18:50:44 -07:00
Craig Topper	8983db15a3	[RISCV] Optimize (brcond (seteq (and X, 1 << C), 0)) If C > 10, this will require a constant to be materialized for the And. To avoid this, we can shift X left by XLen-1-C bits to put the tested bit in the MSB, then we can do a signed compare with 0 to determine if the MSB is 0 or 1. Thanks to @reames for the suggestion. I've implemented this inside of translateSetCCForBranch which is called when setcc+brcond or setcc+select is converted to br_cc or select_cc during lowering. It doesn't make sense to do this for general setcc since we lack a sgez instruction. I've tested bit 10, 11, 31, 32, 63 and a couple bits between 11 and 31 and between 32 and 63 for both i32 and i64 where applicable. Select has some deficiencies where we receive (and (srl X, C), 1) instead. This doesn't happen for br_cc due to the call to rebuildSetCC in the generic DAGCombiner for brcond. I'll explore improving select in a future patch. Reviewed By: reames Differential Revision: https://reviews.llvm.org/D130203	2022-07-20 18:40:49 -07:00
Hui Xie	7abbd6224b	[libc++] Fix proxy iterator issues that trigger an assertion in Chromium. Crash report: https://bugs.chromium.org/p/chromium/issues/detail?id=1346012 The triggered assertion is related to sorting with `v8::internal::AtomicSlot`. `AtomicSlot` is a proxy iterator with a proxy type `AtomicSlot::Reference` (see `9bcb5eb590/src/objects/slots-atomic-inl.h`). https://reviews.llvm.org/D130197 correctly spotted the issue in `__iter_move` but doesn't actually fix the issue. The reason is that `AtomicSlot::operator` returns a prvalue `Reference`. After the fix in D130197, the return type of `__iter_move` is `Reference&&`. But the rvalue reference is bound to the temporary value returned by `operator`, which will be dangling after `__iter_move` returns. The idea of the fix in this change is borrowed from C++17's move_iterator https://timsong-cpp.github.io/cppwp/n4659/move.iterators#move.iterator-1 When the underlying reference is a prvalue, we just return it by value. Differential Revision: https://reviews.llvm.org/D130212	2022-07-20 18:05:49 -07:00
LLVM GN Syncbot	f6b5f24c19	[gn build] Port `4fcf8434dd`	2022-07-21 00:53:15 +00:00
Anubhab Ghosh	4fcf8434dd	[ORC] Add a new MemoryMapper-based JITLinkMemoryManager implementation. MapperJITLinkMemoryManager supports executor memory management using any implementation of MemoryMapper to do the transfer such as InProcessMapper or SharedMemoryMapper. Reviewed By: lhames Differential Revision: https://reviews.llvm.org/D129495	2022-07-20 17:52:37 -07:00
Slava Zakharin	7434375666	Revert "[flang] Run algebraic simplification optimization pass." This reverts commit `4fbd1d6c87`.	2022-07-20 16:56:28 -07:00
Lang Hames	aabc4b13e8	[ORC] Don't try to copy from an empty segment in SimpleExecutorMemoryManager. Since `67220c2ad7` empty SPSSequence<char>s deserialize to default-constructed ArrayRef<char>s, which have a null data field. We need to check for this to avoid memcpy'ing from a nullptr. This should fix the bot failure in https://lab.llvm.org/buildbot/#/builders/85/builds/9323	2022-07-20 16:47:00 -07:00
Steven Wu	d072826057	[Darwin toolchain] Tune the logic for finding arclite. The heuristic used to determine where the arclite libraries are to be found was based on the path of the `clang` executable. However, in some scenarios the `clang` executable is within a toolchain that does not have arclite. When this happens, derive the arclite paths from the sysroot option. This allows Clang to correctly derive the arclite directory in, e.g., Swift CI, using similar logic to what the Swift driver has been doing for several years. Patched by Doug Gregor. Reviewed By: keith Differential Revision: https://reviews.llvm.org/D130205	2022-07-20 16:45:52 -07:00
Slava Zakharin	4fbd1d6c87	[flang] Run algebraic simplification optimization pass. Flang algebraic simplification pass will run algebraic simplification rewrite patterns for Math/Complex/etc. dialects. It is enabled under opt-for-speed optimization levels (i.e. for O1/O2/O3; Os/Oz will not enable it). With this change the FIR/MLIR optimization pipeline becomes affected by the -O* optimization level switches. Until now these switches only affected the middle-end and back-end. Differential Revision: https://reviews.llvm.org/D130035	2022-07-20 16:33:52 -07:00
River Riddle	ed344c8877	[mlir:LSP] Add a quickfix code action for inserting expected-* diagnostic checks This allows for automatically inserting expected checks for parser and verifier diagnostics, which simplifies the workflow when building new dialect constructs or extending existing ones. Differential Revision: https://reviews.llvm.org/D130152	2022-07-20 15:43:59 -07:00
Johannes Doerfert	ad98ef8be4	[Attributor] Deal with complex PHI nodes better during AAPointerInfo We were quite conservative when it came to PHI node handling to avoid recursive reasoning. Now we check more direct if we have seen a PHI already or not. This allows non-recursive PHI chains to be handled. This also exposed a bug as we did only model the effect of one loop traversal. `phi_no_store_3` has been adapted to show how we would have used `undef` instead of `1` before. With this patch we don't replace it at all, which is expected as we do not argue about loop iterations (or alignments).	2022-07-20 17:34:50 -05:00


430772 Commits
