llvm-project

Commit Graph

Author	SHA1	Message	Date
Chih-Ping Chen	5f75dcf571	[DebugInfo] Support Fortran 'use <external module>' statement. The main change is to add a 'IsDecl' field to DIModule so that when IsDecl is set to true, the debug info entry generated for the module would be marked as a declaration. That way, the debugger would look up the definition of the module in the gloabl scope. Please see the comments in llvm/test/DebugInfo/X86/dimodule.ll for what the debug info entries would look like. Differential Revision: https://reviews.llvm.org/D93462	2020-12-18 13:10:57 -05:00
Rong Xu	3733463dbb	[IR][PGO] Add hot func attribute and use hot/cold attribute in func section Clang FE currently has hot/cold function attribute. But we only have cold function attribute in LLVM IR. This patch adds support of hot function attribute to LLVM IR. This attribute will be used in setting function section prefix/suffix. Currently .hot and .unlikely suffix only are added in PGO (Sample PGO) compilation (through isFunctionHotInCallGraph and isFunctionColdInCallGraph). This patch changes the behavior. The new behavior is: (1) If the user annotates a function as hot or isFunctionHotInCallGraph is true, this function will be marked as hot. Otherwise, (2) If the user annotates a function as cold or isFunctionColdInCallGraph is true, this function will be marked as cold. The changes are: (1) user annotated function attribute will used in setting function section prefix/suffix. (2) hot attribute overwrites profile count based hotness. (3) profile count based hotness overwrite user annotated cold attribute. The intention for these changes is to provide the user a way to mark certain function as hot in cases where training input is hard to cover all the hot functions. Differential Revision: https://reviews.llvm.org/D92493	2020-12-17 18:41:12 -08:00
Gulfem Savrun Yeniceri	7c0e3a77bc	[clang][IR] Add support for leaf attribute This patch adds support for leaf attribute as an optimization hint in Clang/LLVM. Differential Revision: https://reviews.llvm.org/D90275	2020-12-14 14:48:17 -08:00
Fangrui Song	b5ad32ef5c	Migrate deprecated DebugLoc::get to DILocation::get This migrates all LLVM (except Kaleidoscope and CodeGen/StackProtector.cpp) DebugLoc::get to DILocation::get. The CodeGen/StackProtector.cpp usage may have a nullptr Scope and can trigger an assertion failure, so I don't migrate it. Reviewed By: #debug-info, dblaikie Differential Revision: https://reviews.llvm.org/D93087	2020-12-11 12:45:22 -08:00
Zhengyang Liu	75f50e15bf	Adding PoisonValue for representing poison value explicitly in IR Define ConstantData::PoisonValue. Add support for poison value to LLLexer/LLParser/BitcodeReader/BitcodeWriter. Add support for poison value to llvm-c interface. Add support for poison value to OCaml binding. Add m_Poison in PatternMatch. Differential Revision: https://reviews.llvm.org/D71126	2020-11-25 17:33:51 -07:00
Nick Desaulniers	f4c6080ab8	Revert "[IR] add fn attr for no_stack_protector; prevent inlining on mismatch" This reverts commit `b7926ce6d7`. Going with a simpler approach.	2020-11-17 17:27:14 -08:00
Nick Desaulniers	dd6087cac0	Revert "[BitCode] decode nossp fn attr" This reverts commit `0b11d018cc`. Going with a simpler approach.	2020-11-17 17:27:14 -08:00
serge-sans-paille	9218ff50f9	llvmbuildectomy - replace llvm-build by plain cmake No longer rely on an external tool to build the llvm component layout. Instead, leverage the existing `add_llvm_componentlibrary` cmake function and introduce `add_llvm_component_group` to accurately describe component behavior. These function store extra properties in the created targets. These properties are processed once all components are defined to resolve library dependencies and produce the header expected by llvm-config. Differential Revision: https://reviews.llvm.org/D90848	2020-11-13 10:35:24 +01:00
Simon Pilgrim	bd76f3724d	[Bitcode] Make some basic PlaceholderQueue/MetadataLoaderImpl helper methods const. NFCI. Fixes a number of cppcheck remarks.	2020-10-31 12:16:48 +00:00
Simon Pilgrim	5bf45ee156	BitcodeReader::popValue - pass SmallVectorImpl<> as const reference. NFCI. Fixes cppcheck warning.	2020-10-30 14:33:19 +00:00
Mircea Trofin	735ab4be35	[ThinLTO] Fix .llvmcmd emission llvm::EmbedBitcodeInModule needs (what used to be called) EmbedMarker set, in order to emit .llvmcmd. EmbedMarker is really about embedding the command line, so renamed the parameter accordingly, too. This was not caught at test because the check-prefix was incorrect, but FileCheck does not report that when multiple prefixes are provided. A separate patch will address that. Differential Revision: https://reviews.llvm.org/D90278	2020-10-28 17:45:30 -07:00
Alok Kumar Sharma	a6dd01afa3	[DebugInfo] Support for DW_TAG_generic_subrange This is needed to support fortran assumed rank arrays which have runtime rank. Summary: Fortran assumed rank arrays have dynamic rank. DWARF TAG DW_TAG_generic_subrange is needed to support that. Testing: unit test cases added (hand-written) check llvm check debug-info Reviewed By: aprantl Differential Revision: https://reviews.llvm.org/D89218	2020-10-29 01:34:15 +05:30
Mircea Trofin	6fa35541a0	[NFC][ThinLTO] Change command line passing to EmbedBitcodeInModule Changing to pass by ref - less null checks to worry about. Differential Revision: https://reviews.llvm.org/D90330	2020-10-28 12:33:39 -07:00
Nick Desaulniers	0b11d018cc	[BitCode] decode nossp fn attr I missed this in https://reviews.llvm.org/D87956. Reviewed By: void Differential Revision: https://reviews.llvm.org/D90177	2020-10-26 13:06:54 -07:00
Nick Desaulniers	b7926ce6d7	[IR] add fn attr for no_stack_protector; prevent inlining on mismatch It's currently ambiguous in IR whether the source language explicitly did not want a stack a stack protector (in C, via function attribute no_stack_protector) or doesn't care for any given function. It's common for code that manipulates the stack via inline assembly or that has to set up its own stack canary (such as the Linux kernel) would like to avoid stack protectors in certain functions. In this case, we've been bitten by numerous bugs where a callee with a stack protector is inlined into an __attribute__((__no_stack_protector__)) caller, which generally breaks the caller's assumptions about not having a stack protector. LTO exacerbates the issue. While developers can avoid this by putting all no_stack_protector functions in one translation unit together and compiling those with -fno-stack-protector, it's generally not very ergonomic or as ergonomic as a function attribute, and still doesn't work for LTO. See also: https://lore.kernel.org/linux-pm/20200915172658.1432732-1-rkir@google.com/ https://lore.kernel.org/lkml/20200918201436.2932360-30-samitolvanen@google.com/T/#u Typically, when inlining a callee into a caller, the caller will be upgraded in its level of stack protection (see adjustCallerSSPLevel()). By adding an explicit attribute in the IR when the function attribute is used in the source language, we can now identify such cases and prevent inlining. Block inlining when the callee and caller differ in the case that one contains `nossp` when the other has `ssp`, `sspstrong`, or `sspreq`. Fixes pr/47479. Reviewed By: void Differential Revision: https://reviews.llvm.org/D87956	2020-10-23 11:55:39 -07:00
David Stenberg	0c0fcea557	Handle value uses wrapped in metadata for the use-list order When generating the use-list order, also consider value uses that are operands which are wrapped in metadata; e.g. llvm.dbg.value operands. This fixes PR36778. The test case is based on the reproducer from that report. Reviewed By: dexonsmith Differential Revision: https://reviews.llvm.org/D53758	2020-10-20 20:05:59 +02:00
Atmn Patel	595c615606	[IR] Adds mustprogress as a LLVM IR attribute This adds the LLVM IR attribute `mustprogress` as defined in LangRef through D86233. This attribute will be applied to functions with in languages like C++ where forward progress is guaranteed. Functions without this attribute are not required to make progress. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D85393	2020-10-20 03:09:57 -04:00
Matt Arsenault	0a7cd99a70	Reapply "OpaquePtr: Add type to sret attribute" This reverts commit `eb9f7c28e5`. Previously this was incorrectly handling linking of the contained type, so this merges the fixes from D88973.	2020-10-16 11:05:02 -04:00
Craig Topper	a184c758b7	[BitCodeAnalyzer] Add a few missing TYPE_CODES and MODULE_CODE_COMDAT to GetCodeName Happened to notice some of these printing as UnknownCode while running llvm-bcanalyzer on a bc file I had. Differential Revision: https://reviews.llvm.org/D86900	2020-10-12 15:43:12 -07:00
Teresa Johnson	c27ab339ad	Restore "[ThinLTO] Avoid temporaries when loading global decl attachment metadata" This restores commit `ab1b4810b5` which was reverted in `01b9deba76`, with a fix for the issue it caused. We should use a temporary BitstreamCursor when loading the global decl attachment records so that the abbrev ids held in the lazy loading IndexCursor are not clobbered. Enhanced the test so that the issue is exposed there. Original description: When performing ThinLTO importing, the metadata loader attempts to lazy load, by building an index. However, module level global decl attachment metadata was being parsed early while building the index, since the associated (module level) global values aren't materialized on demand. This results in the creation of forward reference temporary metadatas, which are expensive. Normally, these module level global values don't have much attached metadata. However, in the case of -fwhole-program-vtables (e.g. for whole program devirtualization), the vtables may have many attached type metadatas. This was resulting in very slow performance when performing ThinLTO importing with the default lazy loading. This patch restructures the handling of these global decl attachment records, delaying their parsing until after the lazy loading index has been built. Then the parser can use the interface that loads from the index, which resolves forward references immediately instead of creating expensive temporaries. For one ThinLTO backend that imports from modules containing huge numbers of vtables and associated types, I measured the following compile times for the metadata materialization during function importing, rounded to nearest second: No -fwhole-program-vtables: Lazy loading on (head): 1s Lazy loading off (head): 3s Lazy loading on (patch): 1s With -fwhole-program-vtables: Lazy loading on (head): 440s Lazy loading off (head): 4s Lazy loading on (patch): 2s Differential Revision: https://reviews.llvm.org/D87970	2020-10-12 10:11:56 -07:00
Alok Kumar Sharma	96bd4d34a2	[DebugInfo] Support for DWARF attribute DW_AT_rank This patch adds support for DWARF attribute DW_AT_rank. Summary: Fortran assumed rank arrays have dynamic rank. DWARF attribute DW_AT_rank is needed to support that. Testing: unit test cases added (hand-written) check llvm check debug-info Reviewed By: aprantl Differential Revision: https://reviews.llvm.org/D89141	2020-10-10 17:51:12 +05:30
Sam McCall	b953a01b2c	Reapply [ADT] function_ref's constructor is unavailable if the argument is not callable. This reverts commit `281703e67f`. GCC 5.4 bugs are worked around by avoiding use of variable templates. Differential Revision: https://reviews.llvm.org/D88977	2020-10-07 18:31:12 +02:00
Sam McCall	281703e67f	Revert "[ADT] function_ref's constructor is unavailable if the argument is not callable." This reverts commit `4cae6228d1`. Breaks GCC build: http://lab.llvm.org:8011/#/builders/8/builds/33/steps/6/logs/stdio	2020-10-07 16:37:13 +02:00
Sam McCall	4cae6228d1	[ADT] function_ref's constructor is unavailable if the argument is not callable. This allows overload sets containing function_ref arguments to work correctly Otherwise they're ambiguous as anything "could be" converted to a function_ref. This matches proposed std::function_ref, absl::function_ref, etc. Differential Revision: https://reviews.llvm.org/D88901	2020-10-07 16:31:09 +02:00
Tres Popp	eb9f7c28e5	Revert "OpaquePtr: Add type to sret attribute" This reverts commit `55c4ff91bd`. Issues were introduced as discussed in https://reviews.llvm.org/D88241 where this change made previous bugs in the linker and BitCodeWriter visible.	2020-09-29 10:31:04 +02:00
Matt Arsenault	55c4ff91bd	OpaquePtr: Add type to sret attribute Make the corresponding change that was made for byval in `b7141207a4`. Like byval, this requires a bulk update of the test IR tests to include the type before this can be mandatory.	2020-09-25 14:07:30 -04:00
Fangrui Song	01b9deba76	Revert D87970 "[ThinLTO] Avoid temporaries when loading global decl attachment metadata" This reverts commit `ab1b4810b5`. It caused an issue in llvm::lto::thinBackend for a -fsanitize=cfi build. ``` AbbrevNo is 0 => "Invalid abbrev number" 0 llvm::BitstreamCursor::getAbbrev (this=0x9db4c8, AbbrevID=4) at llvm/include/llvm/Bitstream/BitstreamReader.h:528 1 0x00007f5f777a6eb4 in llvm::BitstreamCursor::readRecord (this=0x9db4c8, AbbrevID=4, Vals=llvm::SmallVector of Size 0, Capacity 64, Blob=0x7ffcd0e26558) at usr/local/google/home/maskray/llvm/llvm/lib/Bitstream/Reader/BitstreamReader.cpp:228 2 0x00007f5f796bf633 in llvm::MetadataLoader::MetadataLoaderImpl::lazyLoadOneMetadata (this=0x9db3a0, ID=188, Placeholders=...) at /usr/local/google/home/mas ray/llvm/llvm/lib/Bitcode/Reader/MetadataLoader.cpp:1091 3 0x00007f5f796c2527 in llvm::MetadataLoader::MetadataLoaderImpl::getMetadataFwdRefOrLoad (this=0x9db3a0, ID=188) at llvm lib/Bitcode/Reader/MetadataLoader.cpp:668 4 0x00007f5f796bfff3 in llvm::MetadataLoader::getMetadataFwdRefOrLoad (this=0xd31580, Idx=188) at llvm/lib/Bitcode/Reader MetadataLoader.cpp:2290 5 0x00007f5f79638265 in (anonymous namespace)::BitcodeReader::parseFunctionBody (this=0xd312e0, F=0x9de758) at llvm/lib/B tcode/Reader/BitcodeReader.cpp:3938 6 0x00007f5f79635d32 in (anonymous namespace)::BitcodeReader::materialize (this=0xd312e0, GV=0x9de758) at llvm/lib/Bitcod /Reader/BitcodeReader.cpp:5408 7 0x00007f5f7f8dbe3e in llvm::Module::materialize (this=0x9b92c0, GV=0x9de758) at llvm/lib/IR/Module.cpp:442 8 0x00007f5f7f7f8fbe in llvm::GlobalValue::materialize (this=0x9de758) at llvm/lib/IR/Globals.cpp:50 9 0x00007f5f83b9b5f5 in llvm::FunctionImporter::importFunctions (this=0x7ffcd0e2a730, DestModule=..., ImportList=...) at llvm/lib/Transforms/IPO/FunctionImport.cpp:1182 ```	2020-09-23 10:24:08 -07:00
Teresa Johnson	ab1b4810b5	[ThinLTO] Avoid temporaries when loading global decl attachment metadata When performing ThinLTO importing, the metadata loader attempts to lazy load, by building an index. However, module level global decl attachment metadata was being parsed early while building the index, since the associated (module level) global values aren't materialized on demand. This results in the creation of forward reference temporary metadatas, which are expensive. Normally, these module level global values don't have much attached metadata. However, in the case of -fwhole-program-vtables (e.g. for whole program devirtualization), the vtables may have many attached type metadatas. This was resulting in very slow performance when performing ThinLTO importing with the default lazy loading. This patch restructures the handling of these global decl attachment records, delaying their parsing until after the lazy loading index has been built. Then the parser can use the interface that loads from the index, which resolves forward references immediately instead of creating expensive temporaries. For one ThinLTO backend that imports from modules containing huge numbers of vtables and associated types, I measured the following compile times for the metadata materialization during function importing, rounded to nearest second: No -fwhole-program-vtables: Lazy loading on (head): 1s Lazy loading off (head): 3s Lazy loading on (patch): 1s With -fwhole-program-vtables: Lazy loading on (head): 440s Lazy loading off (head): 4s Lazy loading on (patch): 2s Differential Revision: https://reviews.llvm.org/D87970	2020-09-22 20:32:07 -07:00
Simon Pilgrim	d566771779	ValueList.cpp - remove unnecessary includes. NFCI. Already included in ValueList.h	2020-09-17 15:06:01 +01:00
Simon Pilgrim	abe0d8551d	MetadataLoader.cpp - remove unnecessary StringRef include. NFCI. Already included in MetadataLoader.h	2020-09-17 13:18:54 +01:00
Jianzhou Zhao	11201315d5	Flush bitcode incrementally for LTO output Bitcode writer does not flush buffer until the end by default. This is fine to small bitcode files. When -flto,--plugin-opt=emit-llvm,-gmlt are used, the final bitcode file is large, for example, >8G. Keeping all data in memory consumes a lot of memory. This change allows bitcode writer flush data to disk early when buffered data size is above some threshold. This is only enabled when lld emits LLVM bitcode. One issue to address is backpatching bitcode: subblock length, function body indexes, meta data indexes need to backfill. If buffer can be flushed partially, we introduced raw_fd_stream that supports read/seek/write, and enables backpatching bitcode flushed in disk. Reviewed-by: tejohnson, MaskRay Differential Revision: https://reviews.llvm.org/D86905	2020-09-17 03:32:31 +00:00
Simon Pilgrim	c6a82fdbf2	ValueEnumerator.cpp - remove duplicate includes. NFCI. Remove headers already included in ValueEnumerator.h	2020-09-16 18:32:28 +01:00
Mircea Trofin	e543708e5e	[NFC][ThinLTO] Let llvm::EmbedBitcodeInModule handle serialization. llvm::EmbedBitcodeInModule handles serializing the passed-in module, if the provided MemoryBufferRef is invalid. This is already the path taken in one of the uses of the API - clang::EmbedBitcode, when called from BackendConsumer::HandleTranslationUnit - so might as well do the same here and reduce (by very little) code duplication. The only difference this patch introduces is that the serialization happens with ShouldPreserveUseListOrder set to true. Differential Revision: https://reviews.llvm.org/D87339	2020-09-10 10:25:00 -07:00
Guillaume Chatelet	5a4a0cfcfb	[NFC] Separate bitcode reading for FUNC_CODE_INST_CMPXCHG(_OLD) This is preparatory work to unable storing alignment for AtomicCmpXchgInst. See D83136 for context and bug: https://bugs.llvm.org/show_bug.cgi?id=27168 This is the fixed version of D83375, which was submitted and reverted. Differential Revision: https://reviews.llvm.org/D87373	2020-09-09 19:10:30 +00:00
Fangrui Song	6ae7b403c3	Set alignment of .llvmbc and .llvmcmd to 1 Otherwise their alignment is dependent on the size of the section. If the size is large than 16, the alignment will be 16. 16 is a bad choice for both .llvmbc and .llvmcmd because the padding between two contributions from input sections is of a variable size. A bitstream is actually guaranteed to be 4-byte aligned, but consumers don't need this property.	2020-08-29 18:27:34 -07:00
David Sherwood	f4257c5832	[SVE] Make ElementCount members private This patch changes ElementCount so that the Min and Scalable members are now private and can only be accessed via the get functions getKnownMinValue() and isScalable(). In addition I've added some other member functions for more commonly used operations. Hopefully this makes the class more useful and will reduce the need for calling getKnownMinValue(). Differential Revision: https://reviews.llvm.org/D86065	2020-08-28 14:43:53 +01:00
Sourabh Singh Tomar	f91d18eaa9	[DebugInfo][flang]Added support for representing Fortran assumed length strings This patch adds support for representing Fortran `character(n)`. Primarily patch is based out of D54114 with appropriate modifications. Test case IR is generated using our downstream classic-flang. We're in process of upstreaming flang PR's but classic-flang has dependencies on llvm, so this has to get in first. Patch includes functional test case for both IR and corresponding dwarf, furthermore it has been manually tested as well using GDB. Source snippet: ``` program assumedLength call sub('Hello') call sub('Goodbye') contains subroutine sub(string) implicit none character(len=), intent(in) :: string print , string end subroutine sub end program assumedLength ``` GDB: ``` (gdb) ptype string type = character (5) (gdb) p string $1 = 'Hello' ``` Reviewed By: aprantl, schweitz Differential Revision: https://reviews.llvm.org/D86305	2020-08-22 10:13:40 +05:30
Vitaly Buka	fc4fd89852	[StackSafety] Use ValueInfo in ParamAccess::Call This avoid GUID lookup in Index.findSummaryInModule. Follow up for D81242. Reviewed By: tejohnson Differential Revision: https://reviews.llvm.org/D85269	2020-08-14 12:42:44 -07:00
Kai Nacke	b3aece0531	[SystemZ/ZOS] Add binary format goff and operating system zos to the triple Adds the binary format goff and the operating system zos to the triple class. goff is selected as default binary format if zos is choosen as operating system. No further functionality is added. Reviewers: efriedma, tahonermann, hubert.reinterpertcast, MaskRay Reviewed By: efriedma, tahonermann, hubert.reinterpertcast Differential Revision: https://reviews.llvm.org/D82081	2020-08-11 05:26:26 -04:00
Guillaume Chatelet	754deffd11	[NFC] Move BitcodeCommon.h from Bitstream to Bitcode	2020-07-27 20:49:17 +00:00
Guillaume Chatelet	1b4d24912a	[NFC] Replace ".size() < 1" with ".empty()"	2020-07-27 13:54:53 +00:00
Guillaume Chatelet	d9bbe85943	[Alignment][NFC] Update Bitcodewriter to use Align Differential Revision: https://reviews.llvm.org/D83533	2020-07-27 08:16:45 +00:00
Steven Wu	ac375c2fe3	[Bitcode] Avoid duplicating linker option when upgrading Summary: The upgrading path from old ModuleFlag based linker options to the new NamedMetadata based linker option in in materializeMetadata() which gets called once for the module and once for every GV. The linker options are getting dup'ed every time and it can create massive amount of the linker options in the object file that gets created from old bitcode. Fix the problem by checking if the new option exists or not before upgrade again. rdar://64543389 Reviewers: pcc, t.p.northover, dexonsmith, arphaman Reviewed By: arphaman Subscribers: hiraditya, jkorous, ributzka, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D83688	2020-07-23 13:07:28 -07:00
Steven Wu	78709345fb	[Bitcode] Drop invalid branch_weight in BitcodeReader Summary: If bitcode reader gets an invalid branch weight, drop that from the inputs. This allows us to read the broken modules we generated before the verifier was able to catch this. rdar://64870641 Reviewers: yrouban, t.p.northover, dexonsmith, arphaman, aprantl Reviewed By: aprantl Subscribers: aprantl, hiraditya, jkorous, ributzka, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D83699	2020-07-23 09:07:15 -07:00
Alok Kumar Sharma	2d10258a31	[DebugInfo] Support for DW_AT_associated and DW_AT_allocated. Summary: This support is needed for the Fortran array variables with pointer/allocatable attribute. This support enables debugger to identify the status of variable whether that is currently allocated/associated. for pointer array (before allocation/association) without DW_AT_associated (gdb) pt ptr type = integer (140737345375288:140737354129776) (gdb) p ptr value requires 35017956 bytes, which is more than max-value-size with DW_AT_associated (gdb) pt ptr type = integer (:) (gdb) p ptr $1 = <not associated> for allocatable array (before allocation) without DW_AT_allocated (gdb) pt arr type = integer (140737345375288:140737354129776) (gdb) p arr value requires 35017956 bytes, which is more than max-value-size with DW_AT_allocated (gdb) pt arr type = integer, allocatable (:) (gdb) p arr $1 = <not allocated> Testing - unit test cases added - check-llvm - check-debuginfo Reviewed By: aprantl Differential Revision: https://reviews.llvm.org/D83544	2020-07-20 19:54:35 +05:30
Matt Arsenault	5e999cbe8d	IR: Define byref parameter attribute This allows tracking the in-memory type of a pointer argument to a function for ABI purposes. This is essentially a stripped down version of byval to remove some of the stack-copy implications in its definition. This includes the base IR changes, and some tests for places where it should be treated similarly to byval. Codegen support will be in a future patch. My original attempt at solving some of these problems was to repurpose byval with a different address space from the stack. However, it is technically permitted for the callee to introduce a write to the argument, although nothing does this in reality. There is also talk of removing and replacing the byval attribute, so a new attribute would need to take its place anyway. This is intended avoid some optimization issues with the current handling of aggregate arguments, as well as fixes inflexibilty in how frontends can specify the kernel ABI. The most honest representation of the amdgpu_kernel convention is to expose all kernel arguments as loads from constant memory. Today, these are raw, SSA Argument values and codegen is responsible for turning these into loads. Background: There currently isn't a satisfactory way to represent how arguments for the amdgpu_kernel calling convention are passed. In reality, arguments are passed in a single, flat, constant memory buffer implicitly passed to the function. It is also illegal to call this function in the IR, and this is only ever invoked by a driver of some kind. It does not make sense to have a stack passed parameter in this context as is implied by byval. It is never valid to write to the kernel arguments, as this would corrupt the inputs seen by other dispatches of the kernel. These argumets are also not in the same address space as the stack, so a copy is needed to an alloca. From a source C-like language, the kernel parameters are invisible. Semantically, a copy is always required from the constant argument memory to a mutable variable. The current clang calling convention lowering emits raw values, including aggregates into the function argument list, since using byval would not make sense. This has some unfortunate consequences for the optimizer. In the aggregate case, we end up with an aggregate store to alloca, which both SROA and instcombine turn into a store of each aggregate field. The optimizer never pieces this back together to see that this is really just a copy from constant memory, so we end up stuck with expensive stack usage. This also means the backend dictates the alignment of arguments, and arbitrarily picks the LLVM IR ABI type alignment. By allowing an explicit alignment, frontends can make better decisions. For example, there's real no advantage to an aligment higher than 4, so a frontend could choose to compact the argument layout. Similarly, there is a high penalty to using an alignment lower than 4, so a frontend could opt into more padding for small arguments. Another design consideration is when it is appropriate to expose the fact that these arguments are all really passed in adjacent memory. Currently we have a late IR optimization pass in codegen to rewrite the kernel argument values into explicit loads to enable vectorization. In most programs, unrelated argument loads can be merged together. However, exposing this property directly from the frontend has some disadvantages. We still need a way to track the original argument sizes and alignments to report to the driver. I find using some side-channel, metadata mechanism to track this unappealing. If the kernel arguments were exposed as a single buffer to begin with, alias analysis would be unaware that the padding bits betewen arguments are meaningless. Another family of problems is there are still some gaps in replacing all of the available parameter attributes with metadata equivalents once lowered to loads. The immediate plan is to start using this new attribute to handle all aggregate argumets for kernels. Long term, it makes sense to migrate all kernel arguments, including scalars, to be passed indirectly in the same manner. Additional context is in D79744.	2020-07-20 10:23:09 -04:00
Eric Christopher	cc28058c13	Temporarily revert "[NFC] Separate bitcode reading for FUNC_CODE_INST_CMPXCHG(_OLD)" as it wasn't NFC and is causing issues with thinlto bitcode reading. I've followed up offline with reproduction instructions and testcases. This reverts commit `30582457b4`.	2020-07-10 15:21:00 -07:00
Guillaume Chatelet	30582457b4	[NFC] Separate bitcode reading for FUNC_CODE_INST_CMPXCHG(_OLD) This is preparatory work to unable storing alignment for AtomicCmpXchgInst. See D83136 for context and bug: https://bugs.llvm.org/show_bug.cgi?id=27168 Differential Revision: https://reviews.llvm.org/D83375	2020-07-10 04:27:39 +00:00
Gui Andrade	ff7900d5de	[LLVM] Accept `noundef` attribute in function definitions/calls The `noundef` attribute indicates an argument or return value which may never have an undef value representation. This patch allows LLVM to parse the attribute. Differential Revision: https://reviews.llvm.org/D83412	2020-07-08 19:02:04 +00:00
Guillaume Chatelet	74c723757e	[NFC] Adding the align attribute on Atomic{CmpXchg\|RMW}Inst This is the first step to add support for the align attribute to AtomicRMWInst and AtomicCmpXchgInst. Next step is to add support in IRBuilder and BitcodeReader. Bug: https://bugs.llvm.org/show_bug.cgi?id=27168 Differential Revision: https://reviews.llvm.org/D83136	2020-07-07 09:54:13 +00:00

1 2 3 4 5 ...

1756 Commits