llvm-project

Commit Graph

Author	SHA1	Message	Date
Igor Kudrin	aa84289629	[DebugInfo] Keep the DWARF64 flag in the module metadata This allows the option to affect the LTO output. Module::Max helps to generate debug info for all modules in the same format. Differential Revision: https://reviews.llvm.org/D96597	2021-02-17 17:03:34 +07:00
Artur Gainullin	ff50b121e3	[SYCL] Ignore file-scope asm during device-side SYCL compilation. Reviewed By: bader, eandrews Differential Revision: https://reviews.llvm.org/D96538	2021-02-12 17:00:45 -08:00
Akira Hatanaka	ed4718eccb	[ObjC][ARC] Use operand bundle 'clang.arc.attachedcall' instead of explicitly emitting retainRV or claimRV calls in the IR Background: This fixes a longstanding problem where llvm breaks ARC's autorelease optimization (see the link below) by separating calls from the marker instructions or retainRV/claimRV calls. The backend changes are in https://reviews.llvm.org/D92569. https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-autoreleasereturnvalue What this patch does to fix the problem: - The front-end adds operand bundle "clang.arc.attachedcall" to calls, which indicates the call is implicitly followed by a marker instruction and an implicit retainRV/claimRV call that consumes the call result. In addition, it emits a call to @llvm.objc.clang.arc.noop.use, which consumes the call result, to prevent the middle-end passes from changing the return type of the called function. This is currently done only when the target is arm64 and the optimization level is higher than -O0. - ARC optimizer temporarily emits retainRV/claimRV calls after the calls with the operand bundle in the IR and removes the inserted calls after processing the function. - ARC contract pass emits retainRV/claimRV calls after the call with the operand bundle. It doesn't remove the operand bundle on the call since the backend needs it to emit the marker instruction. The retainRV and claimRV calls are emitted late in the pipeline to prevent optimization passes from transforming the IR in a way that makes it harder for the ARC middle-end passes to figure out the def-use relationship between the call and the retainRV/claimRV calls (which is the cause of PR31925). - The function inliner removes an autoreleaseRV call in the callee if nothing in the callee prevents it from being paired up with the retainRV/claimRV call in the caller. It then inserts a release call if claimRV is attached to the call since autoreleaseRV+claimRV is equivalent to a release. If it cannot find an autoreleaseRV call, it tries to transfer the operand bundle to a function call in the callee. This is important since the ARC optimizer can remove the autoreleaseRV returning the callee result, which makes it impossible to pair it up with the retainRV/claimRV call in the caller. If that fails, it simply emits a retain call in the IR if retainRV is attached to the call and does nothing if claimRV is attached to it. - SCCP refrains from replacing the return value of a call with a constant value if the call has the operand bundle. This ensures the call always has at least one user (the call to @llvm.objc.clang.arc.noop.use). - This patch also fixes a bug in replaceUsesOfNonProtoConstant where multiple operand bundles of the same kind were being added to a call. Future work: - Use the operand bundle on x86-64. - Fix the auto upgrader to convert call+retainRV/claimRV pairs into calls with the operand bundles. rdar://71443534 Differential Revision: https://reviews.llvm.org/D92808	2021-02-12 09:51:57 -08:00
Anastasia Stulova	79b222c39f	[OpenCL] Fix types with signed prefix in arginfo metadata. Signed prefix is removed and the single word spelling is printed for the scalar types. Tags: #clang Differential Revision: https://reviews.llvm.org/D96161	2021-02-09 15:13:19 +00:00
Anastasia Stulova	ecc8ac3f08	[OpenCL] Fix pipe type printing in arg info metadata Pipe element type spelling for arg info metadata should follow the same behavior as normal type spelling. We should only use the canonical type spelling in the base type field. This patch also removed duplication in type handling. Tags: #clang Differential Revision: https://reviews.llvm.org/D96151	2021-02-08 16:05:13 +00:00
Yaxun (Sam) Liu	b008ea304d	[CUDA][HIP] Fix device variable linkage For -fgpu-rdc, shadow variables should not be internalized, otherwise they cannot be accessed by other TUs. This is necessary because the shadow variable of external device variables are always emitted as undefined symbols, which need to resolve to a global symbols. Managed variables need to be emitted as undefined symbols in device compilations. Reviewed by: Artem Belevich Differential Revision: https://reviews.llvm.org/D95901	2021-02-05 15:11:12 -05:00
Yaxun (Sam) Liu	0b2af1a288	[NFC][CUDA] Refactor registering device variable Extract registering device variable to CUDA runtime codegen function since it will be called in multiple places. Reviewed by: Artem Belevich Differential Revision: https://reviews.llvm.org/D95558	2021-02-03 14:29:51 -05:00
Hans Wennborg	0479c53b6c	[dllimport] Honor always_inline when deciding whether a dllimport function should be available for inlining (PR48925) Normally, Clang will not make dllimport functions available for inlining if they reference non-imported symbols, as this can lead to confusing link errors. But if the function is marked always_inline, the user presumably knows what they're doing and the attribute should be honored. Differential revision: https://reviews.llvm.org/D95673	2021-02-02 10:28:32 +01:00
Petr Hosek	bb9eb19829	Support for instrumenting only selected files or functions This change implements support for applying profile instrumentation only to selected files or functions. The implementation uses the sanitizer special case list format to select which files and functions to instrument, and relies on the new noprofile IR attribute to exclude functions from instrumentation. Differential Revision: https://reviews.llvm.org/D94820	2021-01-26 17:13:34 -08:00
Petr Hosek	1e634f3952	Revert "Support for instrumenting only selected files or functions" This reverts commit `4edf35f11a` because the test fails on Windows bots.	2021-01-26 12:25:28 -08:00
Petr Hosek	4edf35f11a	Support for instrumenting only selected files or functions This change implements support for applying profile instrumentation only to selected files or functions. The implementation uses the sanitizer special case list format to select which files and functions to instrument, and relies on the new noprofile IR attribute to exclude functions from instrumentation. Differential Revision: https://reviews.llvm.org/D94820	2021-01-26 11:11:39 -08:00
Bjorn Pettersson	ea2cfda386	[CGExpr] Use getCharWidth() more consistently in CCGExprConstant. NFC Most of CGExprConstant.cpp is using the CharUnits abstraction and is using getCharWidth() (directly of indirectly) when converting between size of a char and size in bits. This patch is making that abstraction more consistent by adding CharTy to the CodeGenTypeCache (honoring getCharWidth() when mapping from char to LLVM IR types, instead of using Int8Ty directly). Reviewed By: rjmccall Differential Revision: https://reviews.llvm.org/D94979	2021-01-22 21:12:17 +01:00
Yaxun (Sam) Liu	622eaa4a4c	[HIP] Support __managed__ attribute This patch implements codegen for __managed__ variable attribute for HIP. Diagnostics will be added later. Differential Revision: https://reviews.llvm.org/D94814	2021-01-22 11:43:58 -05:00
Zequan Wu	e53bbd9951	[IR] move nomerge attribute from function declaration/definition to callsites Move nomerge attribute from function declaration/definition to callsites to allow virtual function calls attach the attribute. Differential Revision: https://reviews.llvm.org/D94537	2021-01-12 12:10:46 -08:00
Fangrui Song	e2e82c9983	[CodeGenModule] Drop dso_local on function declarations for ELF -fno-pic -fno-direct-access-external-data ELF -fno-pic sets dso_local on a function declaration to allow direct accesses when taking its address (similar to a data symbol). The emitted code follows the traditional GCC/Clang -fno-pic behavior: an absolute relocation is produced. If the function is not defined in the executable, a canonical PLT entry will be needed at link time. This is similar to a copy relocation and is incompatible with (-Bsymbolic or --dynamic-list linked shared objects / protected symbols in a shared object). This patch gives -fno-pic code a way to avoid such a canonical PLT entry. The FIXME was about a generalization for -fpie -mpie-copy-relocations (now -fpie -fdirect-access-external-data). While we could set dso_local to avoid GOT when taking the address of a function declaration (there is an ignorable difference about R_386_PC32 vs R_386_PLT32 on i386), it likely does not provide any benefit and can just cause trouble, so we don't make the generalization.	2021-01-09 16:31:56 -08:00
Fangrui Song	38a716c30f	Make -fno-pic respect -fno-direct-access-external-data D92633 added -f[no-]direct-access-external-data to supersede -m[no-]pie-copy-relocations. (The option works for -fpie but is a no-op for -fno-pic and -fpic.) This patch makes -fno-pic -fno-direct-access-external-data drop dso_local from global variable declarations. This usually causes the backend to emit a GOT indirection for external data access. With a GOT relocation, the subsequent -no-pie link will not have copy relocation even if the data symbol turns out to be defined by a shared object. Differential Revision: https://reviews.llvm.org/D92714	2021-01-09 00:32:02 -08:00
Fangrui Song	1d3ebbf537	Add -f[no-]direct-access-external-data to supersede -mpie-copy-relocations GCC r218397 "x86-64: Optimize access to globals in PIE with copy reloc" made -fpie code emit R_X86_64_PC32 to reference external data symbols by default. Clang adopted -mpie-copy-relocations D19996 as a flexible alternative. The name -mpie-copy-relocations can be improved [1] and does not capture the idea that this option can apply to -fno-pic and -fpic [2], so this patch introduces -f[no-]direct-access-external-data and makes -mpie-copy-relocations their aliases for compatibility. [1] For ``` extern int var; int get() { return var; } ``` if var is defined in another translation unit in the link unit, there is no copy relocation. [2] -fno-pic -fno-direct-access-external-data is useful to avoid copy relocations. https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65888 If a shared object is linked with -Bsymbolic or --dynamic-list and exports a data symbol, normally the data symbol cannot be accessed by -fno-pic code (because by default an absolute relocation is produced which will lead to a copy relocation). -fno-direct-access-external-data can prevent copy relocations. -fpic -fdirect-access-external-data can avoid GOT indirection. This is like the undefined counterpart of -fno-semantic-interposition. However, the user should define var in another translation unit and link with -Bsymbolic or --dynamic-list, otherwise the linker will error in a -shared link. Generally the user has better tools for their goal but I want to mention that this combination is valid. On COFF, the behavior is like always -fdirect-access-external-data. `__declspec(dllimport)` is needed to enable indirect access. There is currently no plan to affect non-ELF behaviors or -fpic behaviors. -fno-pic -fno-direct-access-external-data will be implemented in the subsequent patch. GCC feature request https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98112 Reviewed By: tmsriram Differential Revision: https://reviews.llvm.org/D92633	2021-01-09 00:32:01 -08:00
Fangrui Song	d1fd72343c	Refactor how -fno-semantic-interposition sets dso_local on default visibility external linkage definitions The idea is that the CC1 default for ELF should set dso_local on default visibility external linkage definitions in the default -mrelocation-model pic mode (-fpic/-fPIC) to match COFF/Mach-O and make output IR similar. The refactoring is made available by `2820a2ca3a`. Currently only x86 supports local aliases. We move the decision to the driver. There are three CC1 states: * -fsemantic-interposition: make some linkages interposable and make default visibility external linkage definitions dso_preemptable. * (default): selected if the target supports .Lfoo$local: make default visibility external linkage definitions dso_local * -fhalf-no-semantic-interposition: if neither option is set or the target does not support .Lfoo$local: like -fno-semantic-interposition but local aliases are not used. So references can be interposed if not optimized out. Add -fhalf-no-semantic-interposition to a few tests using the half-based semantic interposition behavior.	2020-12-31 13:59:45 -08:00
Fangrui Song	809a1e0ffd	[CodeGenModule] Set dso_local for Mach-O GlobalValue * static relocation model: always * other relocation models: if isStrongDefinitionForLinker This will make LLVM IR emitted for COFF/Mach-O and executable ELF similar.	2020-12-30 20:52:01 -08:00
Fangrui Song	2820a2ca3a	Move -fno-semantic-interposition dso_local logic from TargetMachine to Clang CodeGenModule This simplifies TargetMachine::shouldAssumeDSOLocal and and gives frontend the decision to use dso_local. For LLVM synthesized functions/globals, they may lose inferred dso_local but such optimizations are probably not very useful. Note: the hasComdat() condition in canBenefitFromLocalAlias (D77429) may be dead now. (llvm/CodeGen/X86/semantic-interposition-comdat.ll) (Investigate whether we need test coverage when Fuchsia C++ ABI is clearer)	2020-12-29 23:37:55 -08:00
Rong Xu	3733463dbb	[IR][PGO] Add hot func attribute and use hot/cold attribute in func section Clang FE currently has hot/cold function attribute. But we only have cold function attribute in LLVM IR. This patch adds support of hot function attribute to LLVM IR. This attribute will be used in setting function section prefix/suffix. Currently .hot and .unlikely suffix only are added in PGO (Sample PGO) compilation (through isFunctionHotInCallGraph and isFunctionColdInCallGraph). This patch changes the behavior. The new behavior is: (1) If the user annotates a function as hot or isFunctionHotInCallGraph is true, this function will be marked as hot. Otherwise, (2) If the user annotates a function as cold or isFunctionColdInCallGraph is true, this function will be marked as cold. The changes are: (1) user annotated function attribute will used in setting function section prefix/suffix. (2) hot attribute overwrites profile count based hotness. (3) profile count based hotness overwrite user annotated cold attribute. The intention for these changes is to provide the user a way to mark certain function as hot in cases where training input is hard to cover all the hot functions. Differential Revision: https://reviews.llvm.org/D92493	2020-12-17 18:41:12 -08:00
Zequan Wu	fb0f728805	[Clang] Make nomerge attribute a function attribute as well as a statement attribute. Differential Revision: https://reviews.llvm.org/D92800	2020-12-17 07:45:38 -08:00
Rong Xu	c36f31c4db	[PGO] remove unintentional code in early commit Remove unintentional code in commit 54e03d [PGO] Verify BFI counts after loading profile data.	2020-12-14 18:41:49 -08:00
Rong Xu	54e03d03a7	[PGO] Verify BFI counts after loading profile data This patch adds the functionality to compare BFI counts with real profile counts right after reading the profile. It will print remarks under -Rpass-analysis=pgo, or the internal option -pass-remarks-analysis=pgo. Differential Revision: https://reviews.llvm.org/D91813	2020-12-14 15:56:10 -08:00
Fangrui Song	1ab9327d1c	[TargetMachine][CodeGenModule] Delete unneeded ppc32 special case from shouldAssumeDSOLocal PPCMCInstLower does not actually call shouldAssumeDSOLocal for ppc32 so this is dead. Actually Clang ppc32 does produce a pair of absolute relocations which match GCC. This also fixes a comment (R_PPC_COPY and R_PPC64_COPY do exist).	2020-12-05 00:42:07 -08:00
Nico Weber	0cbf61be8b	[mac/arm] Fix rtti codegen tests when running on an arm mac shouldRTTIBeUnique() returns false for iOS64CXXABI, which causes RTTI objects to be emitted hidden. Update two tests that didn't expect this to happen for the default triple. Also rename iOS64CXXABI to AppleARM64CXXABI, since it's used for arm64-apple-macos triples too. Part of PR46644. Differential Revision: https://reviews.llvm.org/D91904	2020-12-03 09:11:03 -05:00
Ben Dunbobbin	e42021d5cc	[Clang][-fvisibility-from-dllstorageclass] Set DSO Locality from final visibility Ensure that the DSO Locality of the globals in the IR is derived from their final visibility when using -fvisibility-from-dllstorageclass. To accomplish this we reset the DSO locality of globals (before setting their visibility from their dllstorageclass) at the end of IRGen in Clang. This removes any effects that visibility options or annotations may have had on the DSO locality. The resulting DSO locality of the globals will be pessimistic w.r.t. to the normal compiler IRGen. Differential Revision: https://reviews.llvm.org/D91779	2020-11-24 00:32:14 +00:00
Xiangling Liao	17497ec514	[AIX][FE] Support constructor/destructor attribute Support attribute((constructor)) and attribute((destructor)) on AIX Differential Revision: https://reviews.llvm.org/D90892	2020-11-19 09:24:01 -05:00
Nick Desaulniers	f4c6080ab8	Revert "[IR] add fn attr for no_stack_protector; prevent inlining on mismatch" This reverts commit `b7926ce6d7`. Going with a simpler approach.	2020-11-17 17:27:14 -08:00
Richard Smith	b637148ecb	[c++20] For P0732R2 / P1907R1: Basic code generation and name mangling support for non-type template parameters of class type and template parameter objects. The Itanium side of this follows the approach I proposed in https://github.com/itanium-cxx-abi/cxx-abi/issues/47 on 2020-09-06. The MSVC side of this was determined empirically by observing MSVC's output. Differential Revision: https://reviews.llvm.org/D89998	2020-11-09 22:10:27 -08:00
Tyker	d093401a26	[NFC] Remove string parameter of annotation attribute from AST childs. this simplifies using annotation attributes when using clang as library	2020-11-09 16:39:59 +01:00
Simon Pilgrim	8930032f53	Don't dereference a dyn_cast<> result - use cast<> instead. NFCI. We were relying on the dyn_cast<> succeeding - better use cast<> and have it assert that its the correct type than dereference a null result.	2020-11-08 13:06:07 +00:00
Jan Ole Hüser	d2e7dca5ca	[CodeGen] Fix Bug 47499: __unaligned extension inconsistent behaviour with C and C++ For the language C++ the keyword __unaligned (a Microsoft extension) had no effect on pointers. The reason, why there was a difference between C and C++ for the keyword __unaligned: For C, the Method getAsCXXREcordDecl() returns nullptr. That guarantees that hasUnaligned() is called. If the language is C++, it is not guaranteed, that hasUnaligend() is called and evaluated. Here are some links: The Bug: https://bugs.llvm.org/show_bug.cgi?id=47499 Thread on the cfe-dev mailing list: http://lists.llvm.org/pipermail/cfe-dev/2020-September/066783.html Diff, that introduced the check hasUnaligned() in getNaturalTypeAlignment(): https://reviews.llvm.org/D30166 Reviewed By: rnk Differential Revision: https://reviews.llvm.org/D90630	2020-11-05 12:57:17 -08:00
Ben Dunbobbin	7ad6010f58	Fix - [Clang] Add the ability to map DLL storage class to visibility `415f7ee883` had a silly typo introduced when I inlined some code into a loop from its own function. Original commit message: For PlayStation we offer source code compatibility with Microsoft's dllimport/export annotations; however, our file format is based on ELF. To support this we translate from DLL storage class to ELF visibility at the end of codegen in Clang. Other toolchains have used similar strategies (e.g. see the documentation for this ARM toolchain: https://developer.arm.com/documentation/dui0530/i/migrating-from-rvct-v3-1-to-rvct-v4-0/changes-to-symbol-visibility-between-rvct-v3-1-and-rvct-v4-0) This patch adds the ability to perform this translation. Options are provided to support customizing the mapping behaviour. Differential Revision: https://reviews.llvm.org/D89970	2020-11-03 19:13:54 +00:00
Yaxun (Sam) Liu	abd8cd9199	[CUDA][HIP] Fix linkage for -fgpu-rdc Currently for explicit template function instantiation in CUDA/HIP device compilation clang emits instantiated kernel with external linkage and instantiated device function with internal linkage. This is fine for -fno-gpu-rdc since there is only one TU. However this causes duplicate symbols for kernels for -fgpu-rdc if the same instantiation happen in multiple TU. Or missing symbols if a device function calls an explicitly instantiated template function in a different TU. To make explicit template function instantiation work for -fgpu-rdc we need to follow the C++ linkage paradigm, i.e. use weak_odr linkage. Differential Revision: https://reviews.llvm.org/D90311	2020-11-03 08:07:19 -05:00
Ben Dunbobbin	ae9231ca2a	Reland - [Clang] Add the ability to map DLL storage class to visibility `415f7ee883` had LIT test failures on any build where the clang executable was not called "clang". I have adjusted the LIT CHECKs to remove the binary name to fix this. Original commit message: For PlayStation we offer source code compatibility with Microsoft's dllimport/export annotations; however, our file format is based on ELF. To support this we translate from DLL storage class to ELF visibility at the end of codegen in Clang. Other toolchains have used similar strategies (e.g. see the documentation for this ARM toolchain: https://developer.arm.com/documentation/dui0530/i/migrating-from-rvct-v3-1-to-rvct-v4-0/changes-to-symbol-visibility-between-rvct-v3-1-and-rvct-v4-0) This patch adds the ability to perform this translation. Options are provided to support customizing the mapping behaviour. Differential Revision: https://reviews.llvm.org/D89970	2020-11-02 23:24:49 +00:00
Ben Dunbobbin	5024d3aa18	Revert "[Clang] Add the ability to map DLL storage class to visibility" This reverts commit `415f7ee883`. The added tests were failing on the build bots!	2020-11-02 17:33:54 +00:00
Ben Dunbobbin	415f7ee883	[Clang] Add the ability to map DLL storage class to visibility For PlayStation we offer source code compatibility with Microsoft's dllimport/export annotations; however, our file format is based on ELF. To support this we translate from DLL storage class to ELF visibility at the end of codegen in Clang. Other toolchains have used similar strategies (e.g. see the documentation for this ARM toolchain: https://developer.arm.com/documentation/dui0530/i/migrating-from-rvct-v3-1-to-rvct-v4-0/changes-to-symbol-visibility-between-rvct-v3-1-and-rvct-v4-0) This patch adds the ability to perform this translation. Options are provided to support customizing the mapping behaviour. Differential Revision: https://reviews.llvm.org/D89970	2020-11-02 17:08:23 +00:00
Teresa Johnson	0949f96dc6	[MemProf] Pass down memory profile name with optional path from clang Similar to -fprofile-generate=, add -fmemory-profile= which takes a directory path. This is passed down to LLVM via a new module flag metadata. LLVM in turn provides this name to the runtime via the new __memprof_profile_filename variable. Additionally, always pass a default filename (in $cwd if a directory name is not specified vi the = form of the option). This is also consistent with the behavior of the PGO instrumentation. Since the memory profiles will generally be fairly large, it doesn't make sense to dump them to stderr. Also, importantly, the memory profiles will eventually be dumped in a compact binary format, which is another reason why it does not make sense to send these to stderr by default. Change the existing memprof tests to specify log_path=stderr when that was being relied on. Depends on D89086. Differential Revision: https://reviews.llvm.org/D89087	2020-11-01 17:38:23 -08:00
Nick Desaulniers	c8f84bd094	[Clang][CodeGen] fix failed assertion Ensure we can emit symbol aliases via function attribute even when function signatures contain incomplete types. Via bugreport: https://reviews.llvm.org/D66492#2350947 Reviewed By: erichkeane Differential Revision: https://reviews.llvm.org/D90073	2020-10-26 11:37:55 -07:00
Tyker	d3205bbca3	[Annotation] Allows annotation to carry some additional constant arguments. This allows using annotation in a much more contexts than it currently has. especially when annotation with template or constexpr. Reviewed By: aaron.ballman Differential Revision: https://reviews.llvm.org/D88645	2020-10-26 10:50:05 +01:00
Melanie Blower	2e204e2391	[clang] Enable support for #pragma STDC FENV_ACCESS Reviewers: rjmccall, rsmith, sepavloff Differential Revision: https://reviews.llvm.org/D87528	2020-10-25 06:46:25 -07:00
Nick Desaulniers	b7926ce6d7	[IR] add fn attr for no_stack_protector; prevent inlining on mismatch It's currently ambiguous in IR whether the source language explicitly did not want a stack a stack protector (in C, via function attribute no_stack_protector) or doesn't care for any given function. It's common for code that manipulates the stack via inline assembly or that has to set up its own stack canary (such as the Linux kernel) would like to avoid stack protectors in certain functions. In this case, we've been bitten by numerous bugs where a callee with a stack protector is inlined into an __attribute__((__no_stack_protector__)) caller, which generally breaks the caller's assumptions about not having a stack protector. LTO exacerbates the issue. While developers can avoid this by putting all no_stack_protector functions in one translation unit together and compiling those with -fno-stack-protector, it's generally not very ergonomic or as ergonomic as a function attribute, and still doesn't work for LTO. See also: https://lore.kernel.org/linux-pm/20200915172658.1432732-1-rkir@google.com/ https://lore.kernel.org/lkml/20200918201436.2932360-30-samitolvanen@google.com/T/#u Typically, when inlining a callee into a caller, the caller will be upgraded in its level of stack protection (see adjustCallerSSPLevel()). By adding an explicit attribute in the IR when the function attribute is used in the source language, we can now identify such cases and prevent inlining. Block inlining when the callee and caller differ in the case that one contains `nossp` when the other has `ssp`, `sspstrong`, or `sspreq`. Fixes pr/47479. Reviewed By: void Differential Revision: https://reviews.llvm.org/D87956	2020-10-23 11:55:39 -07:00
Richard Smith	ba4768c966	[c++20] For P0732R2 / P1907R1: Basic frontend support for class types as non-type template parameters. Create a unique TemplateParamObjectDecl instance for each such value, representing the globally unique template parameter object to which the template parameter refers. No IR generation support yet; that will follow in a separate patch.	2020-10-21 13:21:41 -07:00
Hans Wennborg	0628bea513	Revert "[PM/CC1] Add -f[no-]split-cold-code CC1 option to toggle splitting" This broke Chromium's PGO build, it seems because hot-cold-splitting got turned on unintentionally. See comment on the code review for repro etc. > This patch adds -f[no-]split-cold-code CC1 options to clang. This allows > the splitting pass to be toggled on/off. The current method of passing > `-mllvm -hot-cold-split=true` to clang isn't ideal as it may not compose > correctly (say, with `-O0` or `-Oz`). > > To implement the -fsplit-cold-code option, an attribute is applied to > functions to indicate that they may be considered for splitting. This > removes some complexity from the old/new PM pipeline builders, and > behaves as expected when LTO is enabled. > > Co-authored by: Saleem Abdulrasool <compnerd@compnerd.org> > Differential Revision: https://reviews.llvm.org/D57265 > Reviewed By: Aditya Kumar, Vedant Kumar > Reviewers: Teresa Johnson, Aditya Kumar, Fedor Sergeev, Philip Pfaffe, Vedant Kumar This reverts commit `273c299d5d`.	2020-10-19 12:31:14 +02:00
Vedant Kumar	273c299d5d	[PM/CC1] Add -f[no-]split-cold-code CC1 option to toggle splitting This patch adds -f[no-]split-cold-code CC1 options to clang. This allows the splitting pass to be toggled on/off. The current method of passing `-mllvm -hot-cold-split=true` to clang isn't ideal as it may not compose correctly (say, with `-O0` or `-Oz`). To implement the -fsplit-cold-code option, an attribute is applied to functions to indicate that they may be considered for splitting. This removes some complexity from the old/new PM pipeline builders, and behaves as expected when LTO is enabled. Co-authored by: Saleem Abdulrasool <compnerd@compnerd.org> Differential Revision: https://reviews.llvm.org/D57265 Reviewed By: Aditya Kumar, Vedant Kumar Reviewers: Teresa Johnson, Aditya Kumar, Fedor Sergeev, Philip Pfaffe, Vedant Kumar	2020-10-15 23:13:33 +00:00
Leonard Chan	79829a4704	Revert "[clang] Add -fc++-abi= flag for specifying which C++ ABI to use" This reverts commits `683b308c07` and `8487bfd4e9`. We will go for a more restricted approach that does not give freedom to everyone to change ABIs on whichever platform. See the discussion on https://reviews.llvm.org/D85802.	2020-10-15 14:24:38 -07:00
Leonard Chan	683b308c07	[clang] Add -fc++-abi= flag for specifying which C++ ABI to use This implements the flag proposed in RFC http://lists.llvm.org/pipermail/cfe-dev/2020-August/066437.html. The goal is to add a way to override the default target C++ ABI through a compiler flag. This makes it easier to test and transition between different C++ ABIs through compile flags rather than build flags. In this patch: - Store `-fc++-abi=` in a LangOpt. This isn't stored in a CodeGenOpt because there are instances outside of codegen where Clang needs to know what the ABI is (particularly through ASTContext::createCXXABI), and we should be able to override the target default if the flag is provided at that point. - Expose the existing ABIs in TargetCXXABI as values that can be passed through this flag. - Create a .def file for these ABIs to make it easier to check flag values. - Add an error for diagnosing bad ABI flag values. Differential Revision: https://reviews.llvm.org/D85802	2020-10-14 12:31:21 -07:00
Fangrui Song	a2cc883368	[CUDA] Don't call __cudaRegisterVariable on C++17 inline variables D17779: host-side shadow variables of external declarations of device-side global variables have internal linkage and are referenced by `__cuda_register_globals`. nvcc from CUDA 11 does not allow `__device__ inline` or `__device__ constexpr` (C++17 inline variables) but clang has incorrectly supported them for a while: ``` error: A __device__ variable cannot be marked constexpr error: An inline __device__/__constant__/__managed__ variable must have internal linkage when the program is compiled in whole program mode (-rdc=false) ``` If such a variable (which has a comdat group) is discarded (a copy from another translation unit is prevailing and selected), accessing the variable from outside the section group (`__cuda_register_globals`) is a violation of the ELF specification and will be rejected by linkers: > A symbol table entry with STB_LOCAL binding that is defined relative to one of a group's sections, and that is contained in a symbol table section that is not part of the group, must be discarded if the group members are discarded. References to this symbol table entry from outside the group are not allowed. As a workaround, don't register such inline variables for now. (If we register the variables in all TUs, we will keep multiple instances of the shadow and break the C++ semantics for inline variables). We should reject such variables in Sema but our internal users need some time to migrate. Reviewed By: tra Differential Revision: https://reviews.llvm.org/D88786	2020-10-05 12:53:59 -07:00
Momchil Velikov	a88c722e68	[AArch64] PAC/BTI code generation for LLVM generated functions PAC/BTI-related codegen in the AArch64 backend is controlled by a set of LLVM IR function attributes, added to the function by Clang, based on command-line options and GCC-style function attributes. However, functions, generated in the LLVM middle end (for example, asan.module.ctor or __llvm_gcov_write_out) do not get any attributes and the backend incorrectly does not do any PAC/BTI code generation. This patch record the default state of PAC/BTI codegen in a set of LLVM IR module-level attributes, based on command-line options: * "sign-return-address", with non-zero value means generate code to sign return addresses (PAC-RET), zero value means disable PAC-RET. * "sign-return-address-all", with non-zero value means enable PAC-RET for all functions, zero value means enable PAC-RET only for functions, which spill LR. * "sign-return-address-with-bkey", with non-zero value means use B-key for signing, zero value mean use A-key. This set of attributes are always added for AArch64 targets (as opposed, for example, to interpreting a missing attribute as having a value 0) in order to be able to check for conflicts when combining module attributed during LTO. Module-level attributes are overridden by function level attributes. All the decision making about whether to not to generate PAC and/or BTI code is factored out into AArch64FunctionInfo, there shouldn't be any places left, other than AArch64FunctionInfo, which directly examine PAC/BTI attributes, except AArch64AsmPrinter.cpp, which is/will-be handled by a separate patch. Differential Revision: https://reviews.llvm.org/D85649	2020-09-25 11:47:14 +01:00

1 2 3 4 5 ...

1678 Commits