llvm-project

Commit Graph

Author	SHA1	Message	Date
Yaxun (Sam) Liu	187658b8a6	Recommit "[HIP] Change default --gpu-max-threads-per-block value to 1024" Recommit `04abbb3a78`	2020-09-28 22:43:17 -04:00
Yaxun (Sam) Liu	10eb3bf2d4	Skip -fPIE for AMDGPU and HIP toolchain AMDGPU toolchain does not support -fPIE, therefore skip it if specified by driver. Differential Revision: https://reviews.llvm.org/D88425	2020-09-28 22:03:18 -04:00
Richard Smith	c375635d05	Ensure that we don't compute linkage for an anonymous class too early if it has a member whose name is the same as a builtin. Fixes a regression from the introduction of BuiltinAttr.	2020-09-28 17:22:40 -07:00
Jan Korous	6fd8c69049	[clang] Update warning-wall.c test Follow-up to 1e86d637eb4f: [clang] Selectively ena/disa-ble format-insufficient-args warning	2020-09-28 17:19:51 -07:00
Zahira Ammarguellat	efd04721c9	BuildVectorType with a dependent (array) type is crashing the compiler - Fix for PR-47542 Reviewed By: rnk Differential Revision: https://reviews.llvm.org/D88150	2020-09-28 17:10:32 -07:00
Yonghong Song	54d9f743c8	BPF: move AbstractMemberAccess and PreserveDIType passes to EP_EarlyAsPossible Move abstractMemberAccess and PreserveDIType passes as early as possible, right after clang code generation. Currently, the compiler may transform the code p1 = llvm.bpf.builtin.preserve.struct.access(base, 0, 0); p2 = llvm.bpf.builtin.preserve.struct.access(p1, 1, 2); a = llvm.bpf.builtin.preserve_field_info(p2, EXIST); if (a) { p1 = llvm.bpf.builtin.preserve.struct.access(base, 0, 0); p2 = llvm.bpf.builtin.preserve.struct.access(p1, 1, 2); bpf_probe_read(buf, buf_size, p2); } to p1 = llvm.bpf.builtin.preserve.struct.access(base, 0, 0); p2 = llvm.bpf.builtin.preserve.struct.access(p1, 1, 2); a = llvm.bpf.builtin.preserve_field_info(p2, EXIST); if (a) { bpf_probe_read(buf, buf_size, p2); } and eventually assembly code looks like reloc_exist = 1; reloc_member_offset = 10; //calculate member offset from base p2 = base + reloc_member_offset; if (reloc_exist) { bpf_probe_read(bpf, buf_size, p2); } if during libbpf relocation resolution, reloc_exist is actually resolved to 0 (not exist), reloc_member_offset relocation cannot be resolved and will be patched with illegal instruction. This will cause verifier failure. This patch attempts to address this issue by doing chaining analysis and replacing chains with special globals right after clang code gen. This will remove the cse possibility described above. The IR typically looks like %6 = load @llvm.sk_buff:0:50$0:0:0:2:0 %7 = bitcast %struct.sk_buff* %2 to i8* %8 = getelementptr i8, i8* %7, %6 for a particular address computation relocation. But this transformation has another consequence, code sinking may happen like below: PHI = <possibly different @preserve__access_globals> %7 = bitcast %struct.sk_buff %2 to i8* %8 = getelementptr i8, i8* %7, %6 For such cases, we will not be able to generate relocations since multiple relocations are merged into one. This patch introduced a passthrough builtin to prevent such optimization. Looks like inline assembly has more impact for optimization, e.g., inlining. Using passthrough has less impact on optimizations. A new IR pass is introduced at the beginning of target-dependent IR optimization, which does: - report fatal error if any reloc global in PHI nodes - remove all bpf passthrough builtin functions Changes for existing CORE tests: - for clang tests, add "-Xclang -disable-llvm-passes" flags to avoid builtin->reloc_global transformation so the test is still able to check correctness for clang generated IR. - for llvm CodeGen/BPF tests, add "opt -O2 <ir_file> | llvm-dis" command before "llc" command since "opt" is needed to call newly-placed builtin->reloc_global transformation. Add target triple in the IR file since "opt" requires it. - Since target triple is added in IR file, if a test may produce different results for different endianness, two tests will be created, one for bpfeb and another for bpfel, e.g., some tests for relocation of lshift/rshift of bitfields. - field-reloc-bitfield-1.ll has different relocations compared to old code. This is because for the structure in the test, new code returns struct layout alignment 4 while old code is 8. Align 8 is more precise and permits double load. With align 4, the new mechanism uses 4-byte load, so generating different relocations. - test intrinsic-transforms.ll is removed. This is used to test cse on intrinsics so we do not lose metadata. Now metadata is attached to global and not instruction, it won't get lost with cse. Differential Revision: https://reviews.llvm.org/D87153	2020-09-28 16:56:22 -07:00
David Tenty	ee80615b5c	[clang][driver][AIX] Set compiler-rt as default rtlib Reviewed By: hubert.reinterpretcast Differential Revision: https://reviews.llvm.org/D88182	2020-09-28 19:45:43 -04:00
Jan Korous	1e86d637eb	[clang] Selectively ena/disa-ble format-insufficient-args warning Differential Revision: https://reviews.llvm.org/D87176	2020-09-28 16:24:50 -07:00
Aaron Ballman	e7549dafcd	Fix a think-o with the numerical suffixes in the docs for init_priority.	2020-09-28 16:52:58 -04:00
Craig Topper	288c5776c9	[X86] Use inlineasm flag output for the _bittest* intrinsics. Instead of explicitly emitting a setc in the inline asm instructions, we can use flag output. This allows the backend to use the flag directly if it is needed by a branch. Previously we needed a test instruction to convert the register back to a flag. If the flag can't be used directly, the backend will emit a setcc. Differential Revision: https://reviews.llvm.org/D87888	2020-09-28 13:33:22 -07:00
Baptiste Saleil	0156914275	[PowerPC] Legalize v256i1 and v512i1 and implement load and store of these types This patch legalizes the v256i1 and v512i1 types that will be used for MMA. It implements loads and stores of these types. v256i1 is a pair of VSX registers, so for this type, we load/store the two underlying registers. v512i1 is used for MMA accumulators. So in addition to loading and storing the 4 associated VSX registers, we generate instructions to prime (copy the VSX registers to the accumulator) after loading and unprime (copy the accumulator back to the VSX registers) before storing. This patch also adds the UACC register class that is necessary to implement the loads and stores. This class represents accumulators in their unprimed form and allows the distinction between primed and unprimed accumulators to avoid invalid copies of the VSX registers associated with primed accumulators. Differential Revision: https://reviews.llvm.org/D84968	2020-09-28 14:39:37 -05:00
Paweł Bylica	0c82fa677f	[python][tests] Fix string comparison with "is"	2020-09-28 21:11:50 +02:00
Vedant Kumar	06bc685fa2	[ubsan] nullability-arg: Fix crash on C++ member pointers Extend -fsanitize=nullability-arg to handle call sites which accept C++ member pointers. rdar://62476022 Differential Revision: https://reviews.llvm.org/D88336	2020-09-28 09:41:18 -07:00
Michael Liao	5dbf80cad9	[clang][codegen] Annotate `correctly-rounded-divide-sqrt-fp-math` fn-attr for OpenCL only. - `-cl-fp32-correctly-rounded-divide-sqrt` is an OpenCL-specific option and `correctly-rounded-divide-sqrt-fp-math` should be added for OpenCL at most. Differential revision: https://reviews.llvm.org/D88303	2020-09-28 11:40:32 -04:00
Haojian Wu	bf890dcb0f	[clang] Don't emit "no member" diagnostic if the lookup fails on an invalid record decl. The "no member" diagnostic is likely bogus. Reviewed By: sammccall, #libc Differential Revision: https://reviews.llvm.org/D86765	2020-09-28 15:10:00 +02:00
David Sherwood	bafdd11326	[SVE] Replace / operator in TypeSize/ElementCount with divideCoefficientBy After some recent upstream discussion we decided that it was best to avoid having the / operator for both ElementCount and TypeSize, since this could give the impression that these classes can be used in the same way as basic integer types. However, division for scalable types is a bit odd because we are only dividing the minimum quantity by a value, as opposed to something like: (MinSize * Vscale) / SomeValue This is why when performing division it's important the caller first establishes whether the operation makes sense, perhaps by calling isKnownMultipleOf() prior to division. The caller must now explicitly call divideCoefficientBy() on the class to perform the operation. Differential Revision: https://reviews.llvm.org/D87700	2020-09-28 08:03:00 +01:00
Richard Smith	df2a1f2aab	Add profiling support for APValues. For C++20 P0732R2; unused so far. Will be used and tested by a follow-on commit.	2020-09-27 20:05:39 -07:00
Richard Smith	9dcd96f728	Canonicalize declaration pointers when forming APValues. References to different declarations of the same entity aren't different values, so shouldn't have different representations. Recommit of `e6393ee813` with fixed handling for weak declarations. We now look for attributes on the most recent declaration when determining whether a declaration is weak. (Second recommit with further fixes for mishandling of weak declarations. Our behavior here is fundamentally unsound -- see PR47663 -- but this approach attempts to not make things worse.)	2020-09-27 19:05:26 -07:00
Aaron Ballman	de55ebe3bb	Typo fix; NFC	2020-09-27 08:30:41 -04:00
Aaron Puchert	485501899d	Fix sphinx warnings in AttributeReference, NFC The previous attempt in `d34c8c70` didn't help (the problem was missing indentation), and another issue was introduced by `a51d51a0`.	2020-09-27 00:52:36 +02:00
Russell Yanofsky	f702a6fa7c	Thread safety analysis: Improve documentation for ASSERT_CAPABILITY Previous description didn't actually state the effect the attribute has on thread safety analysis (causing analysis to assume the capability is held). Previous description was also ambiguous about (or slightly overstated) the noreturn assumption made by thread safety analysis, implying the assumption had to be true about the function's behavior in general, and not just its behavior in places where it's used. Stating the assumption specifically should avoid a perceived need to disable thread safety analysis in places where only asserting that a specific capability is held would be better. Reviewed By: aaronpuchert, vasild Differential Revision: https://reviews.llvm.org/D87629	2020-09-26 22:16:50 +02:00
Florian Hahn	915310bf14	Revert "[DSE] Switch to MemorySSA-backed DSE by default." There appears to be a mis-compile with MemorySSA-backed DSE in combination with llvm.lifetime.end. It currently appears like DSE is doing the right thing and the llvm.lifetime.end markers are incorrect. The reverted patch uncovers the mis-compile. This patch temporarily switches back to the legacy DSE implementation, while we investigate. This reverts commit `9d172c8e9c`.	2020-09-26 18:35:27 +01:00
Serge Pavlov	f91b9c0f98	Run test on particular target only The test `AST/const-fpfeatures-diag.c` requires setting strict FP semantics, so it fails on targets where support for such semantics is limited.	2020-09-26 20:26:34 +07:00
Serge Pavlov	6314f412a8	[FPEnv] Evaluate constant expressions under non-default rounding modes The change implements evaluation of constant floating point expressions under non-default rounding modes. The main objective was to support evaluation of global variable initializers, where constant rounding mode may be specified by `#pragma STDC FENV_ROUND`. Differential Revision: https://reviews.llvm.org/D87822	2020-09-26 17:59:39 +07:00
Dmitry Antipov	2ca0ea15e5	[Driver] Fix formatting as suggested by clang-format (NFC)	2020-09-26 08:52:51 +03:00
Dmitry Antipov	96318f64a7	[Driver] Perform Linux distribution detection only once Differential Revision: https://reviews.llvm.org/D87187	2020-09-26 08:44:08 +03:00
Shilei Tian	ebb1092a28	[Clang][OpenMP] Added support for nowait target in CodeGen via regular task Previously for nowait target, CG emitted a function call to `__tgt_target_nowait`, etc. However, in OpenMP RTL, these functions just directly call the no-nowait version, which means nowait is not working as expected. OpenMP specification says a target is actually a target task, which is an untied and detachable task. It is natural to go to the direction that generates a task for a nowait target. However, OpenMP task has a problem that it must be within a parallel region; otherwise the task will be executed immediately. As a result, if we directly wrap to a regular task, the `target nowait` outside of a parallel region is still a synchronous version. In D77609, I added the support for unshackled task in OpenMP RTL. Basically, unshackled task is a task that is not bound to any parallel region. So all nowait targets will be transformed into an unshackled task. In order to distinguish from regular task, a new flag bit is set for unshackled task. This flag will be used by RTL for later process. Since all target tasks are allocated via `__kmpc_omp_target_task_alloc`, and in current `libomptarget`, `__kmpc_omp_target_task_alloc` just calls `__kmpc_omp_task_alloc`. Therefore, we can modify the flag in `__kmpc_omp_target_task_alloc` so that we don't need to modify the FE too much. If users choose to opt out of the feature, they just need to use a RTL w/o support of unshackled threads. As a result, in this patch, the `target nowait` region is simply wrapped into a regular task. Later once we have RTL support for unshackled tasks, the wrapped tasks can be executed by unshackled threads w/o changes in the FE. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D78075	2020-09-25 22:10:36 -04:00
Evandro Menezes	a000580a89	[RISCV] Update driver tests Add the RISC-V Bullet core to the driver tests.	2020-09-25 18:36:53 -05:00
Saleem Abdulrasool	58cdbf518b	Sema: add support for `__attribute__((__swift_private__))` This attribute allows declarations to be restricted to the framework itself, enabling Swift to remove the declarations when importing libraries. This is useful in the case that the functions can be implemented in a more natural way for Swift. This is based on the work of the original changes in `8afaf3aad2` Differential Revision: https://reviews.llvm.org/D87720 Reviewed By: Aaron Ballman	2020-09-25 22:33:53 +00:00
Matt Arsenault	55c4ff91bd	OpaquePtr: Add type to sret attribute Make the corresponding change that was made for byval in `b7141207a4`. Like byval, this requires a bulk update of the test IR tests to include the type before this can be mandatory.	2020-09-25 14:07:30 -04:00
Saleem Abdulrasool	76eb163259	Sema: remove unnecessary parameter for SwiftName handling (NFCI) This code never actually did anything in the implementation. `mergeDeclAttribute` is declared as `static`, and referenced exactly once in the file: from `Sema::mergeDeclAttributes`. `Sema::mergeDeclAttributes` sets `LocalAMK` to `AMK_None`. If the attribute is `DeprecatedAttr`, `UnavailableAttr`, or `AvailabilityAttr` then the `LocalAMK` is updated. However, because we are dealing with a `SwiftNameDeclAttr` here, `LocalAMK` remains `AMK_None`. This is then passed to the function which will as a result pass the value of `AMK_None == AMK_Override` aka `false`. Simply propagate the value through and erase the dead codepath. Thanks to Aaron Ballman for flagging the use of the availability merge kind here leading to this simplification! Differential Revision: https://reviews.llvm.org/D88263 Reviewed By: Aaron Ballman	2020-09-25 17:01:06 +00:00
Vedant Kumar	62c372770d	[profile] Add %t LLVM_PROFILE_FILE option to substitute $TMPDIR Add support for expanding the %t filename specifier in LLVM_PROFILE_FILE to the TMPDIR environment variable. This is supported on all platforms. On Darwin, TMPDIR is used to specify a temporary application-specific scratch directory. When testing apps on remote devices, it can be challenging for the host device to determine the correct TMPDIR, so it's helpful to have the runtime do this work. rdar://68524185 Differential Revision: https://reviews.llvm.org/D87332	2020-09-25 09:39:40 -07:00
Aaron Ballman	a51d51a0d4	Fix some of the more egregious 80-col and whitespace issues; NFC	2020-09-25 10:37:38 -04:00
Aaron Ballman	85cea77ecb	Typo fix; NFC	2020-09-25 10:26:29 -04:00
Benjamin Kramer	6a1bca8798	[Analyzer] Fix unused variable warning in Release builds clang/lib/StaticAnalyzer/Core/ExprEngineCXX.cpp:377:19: warning: unused variable 'Init'	2020-09-25 14:09:43 +02:00
Manuel Klimek	e336b74c99	[clang-format] Add a MacroExpander. Summary: The MacroExpander allows expanding simple (non-recursive) macro definitions from a macro identifier token and macro arguments. It annotates the tokens with a newly introduced MacroContext that keeps track of the role a token played in expanding the macro in order to be able to reconstruct the macro expansion from an expanded (formatted) token stream. Made Token explicitly copy-able to enable copying tokens from the parsed macro definition. Reviewers: sammccall Subscribers: mgorny, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D83296	2020-09-25 14:08:13 +02:00
Chris Bowler	f330d9f163	[PPC] [AIX] Implement calling convention IR for C99 complex types on AIX Add AIX calling convention logic to Clang for C99 complex types on AIX Differential Revision: https://reviews.llvm.org/D88130	2020-09-25 07:43:31 -04:00
Adam Balogh	facad21b29	[Analyzer] Fix for `ExprEngine::computeObjectUnderConstruction()` for base and delegating constructor initializers For /C++/ constructor initializers `ExprEngine::computeObjectUnderConstruction()` asserts that they are all member initializers. This is not necessarily true when this function is used to get the return value for the construction context thus attempts to fetch return values of base and delegating constructor initializers result in assertions. This small patch fixes this issue. Differential Revision: https://reviews.llvm.org/D85351	2020-09-25 13:28:22 +02:00
Momchil Velikov	a88c722e68	[AArch64] PAC/BTI code generation for LLVM generated functions PAC/BTI-related codegen in the AArch64 backend is controlled by a set of LLVM IR function attributes, added to the function by Clang, based on command-line options and GCC-style function attributes. However, functions, generated in the LLVM middle end (for example, asan.module.ctor or __llvm_gcov_write_out) do not get any attributes and the backend incorrectly does not do any PAC/BTI code generation. This patch records the default state of PAC/BTI codegen in a set of LLVM IR module-level attributes, based on command-line options: * "sign-return-address", with non-zero value means generate code to sign return addresses (PAC-RET), zero value means disable PAC-RET. * "sign-return-address-all", with non-zero value means enable PAC-RET for all functions, zero value means enable PAC-RET only for functions, which spill LR. * "sign-return-address-with-bkey", with non-zero value means use B-key for signing, zero value means use A-key. This set of attributes is always added for AArch64 targets (as opposed, for example, to interpreting a missing attribute as having a value 0) in order to be able to check for conflicts when combining module attributes during LTO. Module-level attributes are overridden by function level attributes. All the decision making about whether or not to generate PAC and/or BTI code is factored out into AArch64FunctionInfo, there shouldn't be any places left, other than AArch64FunctionInfo, which directly examine PAC/BTI attributes, except AArch64AsmPrinter.cpp, which is/will-be handled by a separate patch. Differential Revision: https://reviews.llvm.org/D85649	2020-09-25 11:47:14 +01:00
Ian Levesque	7db7a35545	Fix uninitialized XRayArg	2020-09-25 00:20:36 -04:00
Chris Bowler	64b8a633a8	[NFC] [PPC] Add PowerPC expected IR tests for C99 complex Adding this test so that I can extend it in a follow on patch with expected IR for AIX when I implement complex handling in AIXABIInfo. Reviewed By: daltenty, ZarkoCA Differential Revision: https://reviews.llvm.org/D88105	2020-09-24 23:28:40 -04:00
Ian Levesque	6f7fbdd285	[xray] Function coverage groups Add the ability to selectively instrument a subset of functions by dividing the functions into N logical groups and then selecting a group to cover. By selecting different groups over time you could cover the entire application incrementally with lower overhead than instrumenting the entire application at once. Differential Revision: https://reviews.llvm.org/D87953	2020-09-24 22:09:53 -04:00
Richard Smith	8c98c88034	PR47176: Don't read from an inactive union member if a friend function has default arguments and an exception specification.	2020-09-24 19:02:27 -07:00
Reid Kleckner	276f68eace	Revert "Add a static_assert confirming that DiagnosticBuilder is small" This reverts commit `a32feed0db`. This assert doesn't hold in 32-bit builds, I didn't do the math right.	2020-09-24 16:39:46 -07:00
Reid Kleckner	a32feed0db	Add a static_assert confirming that DiagnosticBuilder is small	2020-09-24 16:38:41 -07:00
Reid Kleckner	ecfc9b9712	[MS] For unknown ISAs, pass non-trivially copyable arguments indirectly Passing them directly is likely to be non-conforming, since it usually involves copying the bytes of the record. For unknown architectures, we don't know what MSVC does or will do, but we should at least try to conform as well as we can.	2020-09-24 16:29:48 -07:00
Reid Kleckner	b8a50e9207	[MS] Simplify rules for passing C++ records Regardless of the target architecture, we should always use the C rules (RAA_Default) for records that "canBePassedInRegisters". Those are trivially copyable things, and things marked with [[trivial_abi]]. This should be NFC, although it changes where the final decision about x86_32 overaligned records is made. The current x86_32 C rules say that overaligned things are passed indirectly, so there is no functional difference.	2020-09-24 16:29:47 -07:00
Bill Wendling	c9b53b3bf2	Fix regex in test.	2020-09-24 15:21:28 -07:00
Amy Huang	c8df781e54	[DebugInfo] Fix bug in constructor homing with classes with trivial constructors. This changes the code to avoid using constructor homing for aggregate classes and classes with trivial default constructors, instead of trying to loop through the constructors. Differential Revision: https://reviews.llvm.org/D87808	2020-09-24 14:43:48 -07:00
Bill Wendling	f97b68ef4d	Fix testcase.	2020-09-24 14:34:28 -07:00

85942 Commits