llvm-project

Commit Graph

Author	SHA1	Message	Date
Nico Weber	ea3777fe22	[gn build] (semi-manually) port `0b10bb7ddd` more	2021-05-05 18:15:13 -04:00
Dave Lee	c5cf4b8f11	[lldb] Handle missing SBStructuredData copy assignment cases Fix cases that can crash `SBStructuredData::operator=`. This happened in a case where `rhs` had a null `SBStructuredDataImpl`. Differential Revision: https://reviews.llvm.org/D101585	2021-05-05 15:12:03 -07:00
Vy Nguyen	23233ad139	[lld-macho] Check simulator platforms to avoid issuing false positive errors. Currently the linker causes unnecessary errors when either the target or the config's platform is a simulator. Differential Revision: https://reviews.llvm.org/D101855	2021-05-05 18:07:58 -04:00
Nico Weber	ceccfaae14	[gn build] (semi-manually) port `0b10bb7ddd`	2021-05-05 18:06:52 -04:00
Matt Arsenault	ef5f0adecd	AMDGPU: Add a few more tail call tests Add some cases I noticed were missing when porting to GlobalISel. The cases that required any argument splitting did not work at first.	2021-05-05 17:55:02 -04:00
Matt Arsenault	6e88539ab1	ARM/GlobalISel: Don't store a MachineInstrBuilder reference This is basically a pointer anyway	2021-05-05 17:55:02 -04:00
Richard Smith	6bbfa0fd40	When performing template argument deduction to select a partial specialization while substituting a partial template parameter pack, don't try to extend the existing deduction. This caused us to select the wrong partial specialization in some rare cases. A recent change to libc++ caused this to happen in practice for code using std::conjunction.	2021-05-05 14:47:18 -07:00
Stanislav Mekhanoshin	909a5ccf3b	[AMDGPU] Improve global SADDR selection An address can be a uniform sum of two i64 bit values. That regularly happens in a loop where index is an induction variable promoted to 64 bit by the LSR. We can materialize zero in a VGPR and still use SADDR form of the load. Differential Revision: https://reviews.llvm.org/D101591	2021-05-05 14:44:21 -07:00
Kirill Bobyrev	e623ce6188	[clangd] Split CC and refs limit and increase refs limit to 1000 Related discussion: https://github.com/clangd/clangd/discussions/761 Reviewed By: kadircet Differential Revision: https://reviews.llvm.org/D101902	2021-05-05 23:39:48 +02:00
Matt Arsenault	e723b511e6	GlobalISel: Update documentation	2021-05-05 17:35:02 -04:00
Matt Arsenault	8fc4eb9e73	AMDGPU/GlobalISel: Remove unnecessary override This is the same as the default implementation	2021-05-05 17:35:02 -04:00
Matt Arsenault	23ae35e858	X86/GlobalISel: Use generic version of splitToValueTypes The custom insert of an unmerge and the callback weirdness should be unnecessary. Since handleAssignments should now use getRegisterTypeForCallingConv as SelectionDAG builder would, this should now just be able to use the generic code. X86-32 relies on the generated CCAssignFns not seeing illegal types and sharing code with x86_64, so i64 values would incorrectly be assigned to 64-bit registers.	2021-05-05 17:35:02 -04:00
Matt Arsenault	fa0b93b5a0	GlobalISel: Use DAG call lowering infrastructure in a more compatible way Unfortunately the current call lowering code is built on top of the legacy MVT/DAG based code. However, GlobalISel was not using it the same way. In short, the DAG passes legalized types to the assignment function, and GlobalISel was passing the original raw type if it was simple. I do believe the DAG lowering is conceptually broken since it requires picking a type up front before knowing how/where the value will be passed. This ends up being a problem for AArch64, which wants to pass i1/i8/i16 values as a different size if passed on the stack or in registers. The argument type decision is split across 3 different places which is hard to follow. SelectionDAG builder uses getRegisterTypeForCallingConv to pick a legal type, tablegen gives the illusion of controlling the type, and the target may have additional hacks in the C++ part of the call lowering. AArch64 hacks around this by not using the standard AnalyzeFormalArguments and special casing i1/i8/i16 by looking at the underlying type of the original IR argument. I believe people have generally assumed the calling convention code is processing the original types, and I've discovered a number of dead paths in several targets. x86 actually relies on the opposite behavior from AArch64, and relies on x86_32 and x86_64 sharing calling convention code where the 64-bit cases implicitly do not work on x86_32 due to using the pre-legalized types. AMDGPU targets without legal i16/f16 have always used a broken ABI that promotes to i32/f32. GlobalISel accidentally fixed this to be the ABI we should have, but this fixes it so we're using the worse ABI that is compatible with the DAG. Ideally we would fix the DAG to match the old GlobalISel behavior, but I don't wish to fight that battle. A new native GlobalISel call lowering framework should let the target process the incoming types directly. 
CCValAssigns select a "ValVT" and "LocVT" but the meanings of these aren't entirely clear. Different targets don't use them consistently, even within their own call lowering code. My current belief is the intent was "ValVT" is supposed to be the legalized value type to use in the end, and LocVT was supposed to be the ABI passed type (which is also legalized). With the default CCState::Analyze functions always passing the same type for these arguments, these only differ when the TableGen part of the lowering decides to promote the type from one legal type to another. AArch64's i1/i8/i16 hack ends up inverting the meanings of these values, so I had to add an additional hack to let the target interpret how large the argument memory is. Since targets don't consistently interpret ValVT and LocVT, this doesn't produce quite equivalent code to the initial DAG lowerings. I've opted to consistently interpret LocVT as the in-memory size for stack passed values, and ValVT as the register type to assign from that memory. We therefore produce extending loads directly out of the IRTranslator, whereas the DAG would emit regular loads of smaller values. This will also produce loads/stores that are wider than the argument value if the allocated stack slot is larger (and there will be undef padding bytes). If we had the optimizations to reduce load/stores based on truncated values, this wouldn't produce a different end result. Since ValVT/LocVT are more consistently interpreted, we now will emit more G_BITCASTS as requested by the CCAssignFn. For example AArch64 was directly assigning types to some physical vector registers which according to the tablegen spec should have been cast to a vector with a different element type. This also moves the responsibility for inserting G_ASSERT_SEXT/G_ASSERT_ZEXT from the target ValueHandlers into the generic code, which is closer to how SelectionDAGBuilder works.
I had to xfail an x86 test since I don't see a quick way to fix it right now (I filed bug 50035 for this). It's broken independently of this change, and only triggers since now we end up with more ands which hit the improperly handled selection pattern. I also observed that FP arguments that need promotion (e.g. f16 passed as f32) are broken, and use regular G_TRUNC and G_ANYEXT. TLDR; the current call lowering infrastructure is bad and nobody has ever understood how it chooses types.	2021-05-05 17:35:02 -04:00
Emilio Cota	0edc4bc84a	[mlir] Add polynomial approximation for math::ExpM1 This approximation matches the one in Eigen.
```
name                        old cpu/op   new cpu/op   delta
BM_mlir_Expm1_f32/10        90.9ns ± 4%  52.2ns ± 4%  -42.60%  (p=0.000 n=74+87)
BM_mlir_Expm1_f32/100        837ns ± 3%   231ns ± 4%  -72.43%  (p=0.000 n=79+69)
BM_mlir_Expm1_f32/1k        8.43µs ± 3%  1.58µs ± 5%  -81.30%  (p=0.000 n=77+83)
BM_mlir_Expm1_f32/10k       83.8µs ± 3%  15.4µs ± 5%  -81.65%  (p=0.000 n=83+69)
BM_eigen_s_Expm1_f32/10     68.8ns ±17%  72.5ns ±14%   +5.40%  (p=0.000 n=118+115)
BM_eigen_s_Expm1_f32/100     694ns ±11%   717ns ± 2%   +3.34%  (p=0.000 n=120+75)
BM_eigen_s_Expm1_f32/1k     7.69µs ± 2%  7.97µs ±11%   +3.56%  (p=0.000 n=95+117)
BM_eigen_s_Expm1_f32/10k    88.0µs ± 1%  89.3µs ± 6%   +1.45%  (p=0.000 n=74+106)
BM_eigen_v_Expm1_f32/10     44.3ns ± 6%  45.0ns ± 8%   +1.45%  (p=0.018 n=81+111)
BM_eigen_v_Expm1_f32/100     351ns ± 1%   360ns ± 9%   +2.58%  (p=0.000 n=73+99)
BM_eigen_v_Expm1_f32/1k     3.31µs ± 1%  3.42µs ± 9%   +3.37%  (p=0.000 n=71+100)
BM_eigen_v_Expm1_f32/10k    33.7µs ± 8%  34.1µs ± 9%   +1.04%  (p=0.007 n=99+98)
```
Reviewed By: ezhulenev Differential Revision: https://reviews.llvm.org/D101852	2021-05-05 14:31:34 -07:00
Michael Kitzan	a11489ae3e	[MachineCSE][NFC]: Refactor and comment on preventing CSE for isConvergent instrs - Move the code preventing CSE of `isConvergent` instrs into `ProcessBlockCSE` (from `isProfitableToCSE`) - Add comments explaining why `isConvergent` is used to prevent CSE of non-local instrs in MachineCSE and the new test	2021-05-05 14:22:03 -07:00
Giorgis Georgakoudis	78a7d8c4dd	[Utils][NFC] Rename replace-function-regex in update_cc_test_checks This patch renames the replace-function-regex to replace-value-regex to indicate that the existing regex replacement functionality can replace any IR value besides functions. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D101934	2021-05-05 14:19:30 -07:00
Krzysztof Parzyszek	1817dae192	Preserve metadata on masked intrinsics in auto-upgrade When auto-upgrade was replacing a call to a masked intrinsic, it would not copy the metadata from the original call. If an intrinsic had metadata, but did not need any updates, the metadata would stay, but if an update was needed, the metadata would end up being removed. A similar effect could be observed with masked_expandload and masked_compressstore, which at the moment are not handled by auto-upgrade: the metadata remained untouched. Differential Revision: https://reviews.llvm.org/D101201	2021-05-05 15:51:46 -05:00
Roman Lebedev	40147c33d1	[NFC][X86][Codegen] Add some tests for 64-bit shift by (32-x)	2021-05-05 23:47:11 +03:00
Thomas Lively	81fce29d6e	[WebAssembly] Add SIMD const_splat intrinsics These intrinsics do not correspond to their own underlying instruction, but are a convenience for the common case of materializing a constant vector that has the same value in each lane. Differential Revision: https://reviews.llvm.org/D101885	2021-05-05 13:46:45 -07:00
Isuru Fernando	662a58fa05	[lld] Convert LLVM_CMAKE_PATH to a CMake path Otherwise I get the following error on windows. ``` CMake Error at D:/bld/lld_1569206597988/work/build/CMakeFiles/CMakeTmp/CMakeLists.txt:2 (set): Syntax error in cmake code at D:/bld/lld_1569206597988/work/build/CMakeFiles/CMakeTmp/CMakeLists.txt:2 when parsing string D:\bld\lld_1569206597988\_h_env\Library\lib\cmake\llvm Invalid character escape '\b'. CMake Error at D:/bld/lld_1569206597988/_build_env/Library/share/cmake-3.15/Modules/CheckSymbolExists.cmake:100 (try_compile): Failed to configure test project build system. Call Stack (most recent call first): D:/bld/lld_1569206597988/_build_env/Library/share/cmake-3.15/Modules/CheckSymbolExists.cmake:57 (__CHECK_SYMBOL_EXISTS_IMPL) D:/bld/lld_1569206597988/_h_env/Library/lib/cmake/llvm/HandleLLVMOptions.cmake:943 (check_symbol_exists) CMakeLists.txt:56 (include) ``` Reviewed By: sbc100 Differential Revision: https://reviews.llvm.org/D68158	2021-05-05 15:42:55 -05:00
Rob Suderman	7abb56c78b	[mlir][tosa] Add tosa.depthwise lowering to existing linalg.depthwise_conv Implements support for undilated depthwise convolution using the existing depthwise convolution operation. Once convolutions migrate to yaml defined versions we can rewrite for cleaner implementation. Reviewed By: mravishankar Differential Revision: https://reviews.llvm.org/D101579	2021-05-05 13:30:05 -07:00
Vitaly Buka	1d767b13bf	[scudo] Align objects with alignas Operator new must align allocations for types with large alignment. Before C++17 behavior was implementation defined and both clang and g++ before 11 ignored alignment. Misaligned objects mysteriously crashed tests on Ubuntu 14. Alternatives are to compile with -std=c++17 or -faligned-new, but they were discarded as less portable. Reviewed By: hctim Differential Revision: https://reviews.llvm.org/D101874	2021-05-05 13:29:21 -07:00
Arthur O'Dwyer	9ea2db2c51	[libc++] [LIBCXX-DEBUG-FIXME] Stop using invalid iterators to insert into sets/maps. This simply applies Howard's commit `4c80bfbd53` consistently across all the associative and unordered container tests. "unord.set/insert_hint_const_lvalue.pass.cpp" failed with `-D_LIBCPP_DEBUG=1` before this patch; it was the only one that incorrectly reused invalid iterator `e`. The others already used valid iterators (generally `c.end()`); I'm just making them all match the same pattern of usage: "e, then r, then c.end() for the rest." Differential Revision: https://reviews.llvm.org/D101679	2021-05-05 16:21:09 -04:00
Arthur O'Dwyer	9571b8f238	[libc++] [LIBCXX-DEBUG-FIXME] std::advance shouldn't use ADL `>=` on the _Distance type. Convert to a primitive type first; then use primitive `>=` on that value. Differential Revision: https://reviews.llvm.org/D101678	2021-05-05 16:21:09 -04:00
Arthur O'Dwyer	165ad89947	[libc++] [LIBCXX-DEBUG-FIXME] Our `__debug_less` breaks some complexity guarantees. `__debug_less` ends up running the comparator up-to-twice per comparison, because whenever `(x < y)` it goes on to verify that `!(y < x)`. This breaks the strict "Complexity" guarantees of algorithms like `inplace_merge`, which we test in the test suite. So, just skip the complexity assertions in debug mode. Differential Revision: https://reviews.llvm.org/D101677	2021-05-05 16:21:09 -04:00
Arthur O'Dwyer	12dd9cdf1a	[libc++] [LIBCXX-DEBUG-FIXME] Iterating a string::iterator "off the end" is UB. The range of char pointers [data, data+size] is a valid closed range, but the range [begin, end) is valid only half-open. Differential Revision: https://reviews.llvm.org/D101676	2021-05-05 16:21:09 -04:00
Arthur O'Dwyer	db9425cb06	[libc++] [LIBCXX-DEBUG-FIXME] Fix an iterator-invalidation issue in string::assign. This appears to be a bug in our string::assign: when assigning into a longer string, from a shorter snippet of itself, we invalidate iterators before doing the copy. We should invalidate them afterward. Also drive-by improve the formatting of a function header. Differential Revision: https://reviews.llvm.org/D101675	2021-05-05 16:20:53 -04:00
Arthur O'Dwyer	0b10bb7ddd	[libc++] Move <__sso_allocator> out of include/ into src/. NFCI. This allocator is not intended for libc++'s users to use; it's strictly an implementation detail of `src/locale.cpp`. So, move it to the `src/include/` directory. Drive-by const-qualify its comparison operators. For consistency with `__hidden_allocator` (defined in `src/thread.cpp`), do not remove it from "libcxx/lib/libc++unexp.exp", "libcxx/utils/symcheck-blacklists/linux_blacklist.txt", etc. Differential Revision: https://reviews.llvm.org/D101293	2021-05-05 16:20:52 -04:00
Thomas Lively	602f318cfd	[WebAssembly] Fix constness of pointer params to load intrinsics Update the SIMD builtin load functions to take pointers to const data and update the intrinsics themselves to not cast away constness. Differential Revision: https://reviews.llvm.org/D101884	2021-05-05 13:16:56 -07:00
Thomas Lively	627a526955	[WebAssembly] Update narrowing builtin function operand types Make the inputs to all narrowing builtins signed, which is how they are interpreted by the underlying instructions (only the result changes sign between instructions). Differential Revision: https://reviews.llvm.org/D101883	2021-05-05 13:04:04 -07:00
Tomasz Miąsko	0e7c2aeaa8	Add fuzzer for Rust demangler Reviewed By: dblaikie Differential Revision: https://reviews.llvm.org/D101823	2021-05-05 12:50:50 -07:00
Jez Ng	75ba351300	[lld-macho] Try to unbreak build Looks like the PointerUnion casting cares about const-ness...	2021-05-05 15:47:14 -04:00
Martin Storsjö	9b24ff9cd2	[libcxx] [ci] Add a Windows CI configuration for a statically linked libc++ On Windows, static vs DLL linking affects details in quite a few cases, so it's good to have coverage for both cases. Testing with static linking also increases coverage for a number of cases and individual checks that have had to be waived for the DLL case, and allows testing libc++experimental, increasing the number of test cases actually executed by 180 (176 new tests from libc++experimental and 4 ones that are XFAIL windows-dll). Also drop the "generic-" prefix from these configuration names, as they're perhaps not what the "generic" prefix intended originally in the other generic-posix configurations. Differential Revision: https://reviews.llvm.org/D101565	2021-05-05 22:28:00 +03:00
Louis Dionne	7fbc7bfdfd	[libc++] NFC: Remove stray semicolon in from-scratch config files	2021-05-05 15:06:12 -04:00
Thomas Lively	89333b35a7	[WebAssembly] Set alignment to 1 for SIMD memory intrinsics The WebAssembly SIMD intrinsics in wasm_simd128.h generally try not to require any particular alignment for memory operations to be maximally flexible. For builtin memory access functions and their corresponding LLVM IR intrinsics, there's no way to set the expected alignment, so the best we can do is set the alignment to 1 in the backend. This change means that the alignment hints in the emitted code will no longer be incorrect when users use the intrinsics to access unaligned data. Differential Revision: https://reviews.llvm.org/D101850	2021-05-05 11:59:33 -07:00
Jon Chesterfield	25fe17d3c1	[libomptarget] Initial documentation on amdgpu offload Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D101927	2021-05-05 19:58:52 +01:00
Evgenii Stepanov	18959a6a09	[hwasan] Fix missing synchronization in AllocThread. The problem was introduced in D100348. It's really hard to trigger the bug in a stress test - the race is just too narrow - but the new checks in Thread::Init should at least provide usable diagnostic if the problem ever returns. Differential Revision: https://reviews.llvm.org/D101881	2021-05-05 11:57:18 -07:00
Jez Ng	8806df4778	[lld-macho] Preliminary support for ARM_RELOC_BR24 ARM_RELOC_BR24 is used for BL/BLX instructions from within ARM (i.e. not Thumb) code. This diff just handles the basic case: branches from ARM to ARM, or from ARM to Thumb where no shimming is required. (See comments in ARM.cpp for why shims are required.) Note: I will likely be deprioritizing ARM work for the near future to focus on other parts of LLD. Apologies for the half-done state of this; I'm just trying to wrap up what I've already worked on. Reviewed By: #lld-macho, alexshap Differential Revision: https://reviews.llvm.org/D101814	2021-05-05 14:41:01 -04:00
Jez Ng	20f51ffe67	[lld-macho] Have --reproduce account for path rerooting We need to account for path rerooting when generating the response file. We could either reroot the paths before generating the file, or pass through the original filenames and change just the syslibroot. I've opted for the latter, in order that the reproduction run more closely mirrors the original. We must also be careful not to make an absolute path relative if it is shadowed by a rerooted path. See repro6.tar in reroot-path.s for details. I've moved the call to `createResponseFile()` after the initialization of `config->systemLibraryRoots`, since it now needs to know what those roots are. Reviewed By: #lld-macho, oontvoo Differential Revision: https://reviews.llvm.org/D101224	2021-05-05 14:41:01 -04:00
Harald van Dijk	7907c46fe6	Make clangd CompletionModel not depend on directory layout. The current code accounts for two possible layouts, but there is at least a third supported layout: clang-tools-extra may also be checked out as clang/tools/extra with the releases, which was not yet handled. Rather than treating that as a special case, use the location of CompletionModel.cmake to handle all three cases. This should address the problems that prompted D96787 and the problems that prompted the proposed revert D100625. Reviewed By: usaxena95 Differential Revision: https://reviews.llvm.org/D101851	2021-05-05 19:25:34 +01:00
Nick Desaulniers	aefbfbcbd7	[Clang] remove text extension from diag::err_drv_invalid_value_with_suggestion This hinders translations, as per: https://clang.llvm.org/docs/InternalsManual.html#the-format-string Reviewed By: MaskRay, xbolva00 Differential Revision: https://reviews.llvm.org/D101387	2021-05-05 11:01:43 -07:00
Roman Lebedev	8048005739	[NFC][SimplifyCFG] Update documentation comments for SinkCommonCodeFromPredecessors() after `1886aad`	2021-05-05 20:34:59 +03:00
Fangrui Song	b3336bfa2e	[llvm-objcopy][ELF] --only-keep-debug: set offset/size of segments with no sections to zero PR50160: we currently ignore non-PT_PHDR segments with no sections, not accounting for its p_offset and p_filesz: this can cause an out-of-bounds write in `writeSegmentData` if the p_offset+p_filesz is larger than the total file size. This can be fixed by setting p_offset=p_filesz=0. The logic nicely unifies with the logic added in D90897. Reviewed By: jhenderson, rupprecht Differential Revision: https://reviews.llvm.org/D101560	2021-05-05 10:26:57 -07:00
Saleem Abdulrasool	ba5c122647	RISCV: clang-format RISC-V AsmParser (NFC) This corrects a few issues identified by `clang-format`. This is meant to be preparation for a subsequent change.	2021-05-05 10:16:41 -07:00
Roman Lebedev	833b33a7f4	[NFC][X86][CostModel] Add tests for byteswap intrinsic	2021-05-05 20:11:46 +03:00
Philipp Krones	632ebc4ab4	[MC] Untangle MCContext and MCObjectFileInfo This untangles the MCContext and the MCObjectFileInfo. There is a circular dependency between MCContext and MCObjectFileInfo. Currently this dependency also exists during construction: You can't construct a MOFI without a MCContext without constructing the MCContext with a dummy version of that MOFI first. This removes this dependency during construction. In a perfect world, MCObjectFileInfo wouldn't depend on MCContext at all, but only be stored in the MCContext, like other MC information. This is future work. This also shifts/adds more information to the MCContext making it more available to the different targets. Namely: - TargetTriple - ObjectFileType - SubtargetInfo Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D101462	2021-05-05 10:03:02 -07:00
Philip Reames	80e8025083	[LV] Workaround PR49900 (a crash due to analyzing partially mutated IR) LoopVectorize has a fairly deeply baked in design problem where it will try to query analysis (primarily SCEV, but also ValueTracking) in the midst of mutating IR. In particular, the intermediate IR state does not represent the semantics of the original (or final) program. Fixing this for real is hard, but all of the cases seen so far share a common symptom. In cases seen to date, the analysis being queried is the computation of the original loop's trip count. We can fix this particular instance of the issue by simply computing the trip count early, and caching it. I want to be really clear that this is nothing but a workaround. It does nothing to fix the root issue, and at best, delays the time until we have to fix this for real. Florian and I have discussed an eventual solution in the review comments for https://reviews.llvm.org/D100663, but it's a lot of work. Test taken from https://reviews.llvm.org/D100663. Differential Revision: https://reviews.llvm.org/D101487	2021-05-05 09:56:28 -07:00
Javier Setoain	95861216ac	[mlir][ArmSVE] Add masked arithmetic operations These instructions map to SVE-specific intrinsics that accept a predicate operand to support control flow in vector code. Differential Revision: https://reviews.llvm.org/D100982	2021-05-05 17:41:58 +01:00
Nico Weber	f16afcd9b5	[clang] remove an incremental build workaround This cleaned up an oversight over a year ago. Should no longer be needed.	2021-05-05 12:21:56 -04:00
Stanislav Mekhanoshin	4c178d809b	[AMDGPU] Pre-commit 2 new saddr load tests. NFC.	2021-05-05 09:16:11 -07:00
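
Commit 165ad89947 above notes that `__debug_less` may run the comparator up to twice per comparison, which breaks the exact complexity bounds that tests such as those for `inplace_merge` assert. The sketch below illustrates that effect only; it is not libc++'s implementation. `CheckedLess`, `CountingLess`, and `count_comparator_calls` are hypothetical names introduced for this example, and the input data is arbitrary.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <vector>

// Hypothetical comparator that counts how often it is invoked.
struct CountingLess {
    std::size_t* calls;
    bool operator()(int a, int b) const {
        ++*calls;
        return a < b;
    }
};

// Hypothetical stand-in for a debug-mode wrapper: whenever comp(x, y)
// is true, it also evaluates comp(y, x) to check for an inconsistent
// ordering. That second evaluation is the extra cost described above.
template <class Compare>
struct CheckedLess {
    Compare comp;
    template <class T>
    bool operator()(const T& x, const T& y) const {
        bool result = comp(x, y);
        if (result && comp(y, x)) {
            std::abort();  // comparator claims both x < y and y < x
        }
        return result;
    }
};

// Merges two sorted halves with std::inplace_merge and returns how many
// times the underlying comparator ran, with or without the checking wrapper.
std::size_t count_comparator_calls(bool checked) {
    std::vector<int> v = {1, 3, 5, 7, 2, 4, 6, 8};
    std::size_t calls = 0;
    CountingLess less{&calls};
    if (checked) {
        std::inplace_merge(v.begin(), v.begin() + 4, v.end(),
                           CheckedLess<CountingLess>{less});
    } else {
        std::inplace_merge(v.begin(), v.begin() + 4, v.end(), less);
    }
    assert(std::is_sorted(v.begin(), v.end()));
    return calls;
}
```

Comparing `count_comparator_calls(true)` with `count_comparator_calls(false)` shows the wrapped run invoking the comparator more often, but never more than twice as often. A test that asserts an exact comparison count therefore has to be relaxed or skipped when such a wrapper is active, which is the approach the commit message describes.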

387572 Commits

All Branches