The main change is to add a 'IsDecl' field to DIModule so
that when IsDecl is set to true, the debug info entry generated
for the module would be marked as a declaration. That way, the debugger
would look up the definition of the module in the gloabl scope.
Please see the comments in llvm/test/DebugInfo/X86/dimodule.ll
for what the debug info entries would look like.
Differential Revision: https://reviews.llvm.org/D93462
Clang FE currently has hot/cold function attribute. But we only have
cold function attribute in LLVM IR.
This patch adds support of hot function attribute to LLVM IR. This
attribute will be used in setting function section prefix/suffix.
Currently .hot and .unlikely suffix only are added in PGO (Sample PGO)
compilation (through isFunctionHotInCallGraph and
isFunctionColdInCallGraph).
This patch changes the behavior. The new behavior is:
(1) If the user annotates a function as hot or isFunctionHotInCallGraph
is true, this function will be marked as hot. Otherwise,
(2) If the user annotates a function as cold or
isFunctionColdInCallGraph is true, this function will be marked as
cold.
The changes are:
(1) user annotated function attribute will used in setting function
section prefix/suffix.
(2) hot attribute overwrites profile count based hotness.
(3) profile count based hotness overwrite user annotated cold attribute.
The intention for these changes is to provide the user a way to mark
certain function as hot in cases where training input is hard to cover
all the hot functions.
Differential Revision: https://reviews.llvm.org/D92493
This migrates all LLVM (except Kaleidoscope and
CodeGen/StackProtector.cpp) DebugLoc::get to DILocation::get.
The CodeGen/StackProtector.cpp usage may have a nullptr Scope
and can trigger an assertion failure, so I don't migrate it.
Reviewed By: #debug-info, dblaikie
Differential Revision: https://reviews.llvm.org/D93087
Define ConstantData::PoisonValue.
Add support for poison value to LLLexer/LLParser/BitcodeReader/BitcodeWriter.
Add support for poison value to llvm-c interface.
Add support for poison value to OCaml binding.
Add m_Poison in PatternMatch.
Differential Revision: https://reviews.llvm.org/D71126
No longer rely on an external tool to build the llvm component layout.
Instead, leverage the existing `add_llvm_componentlibrary` cmake function and
introduce `add_llvm_component_group` to accurately describe component behavior.
These function store extra properties in the created targets. These properties
are processed once all components are defined to resolve library dependencies
and produce the header expected by llvm-config.
Differential Revision: https://reviews.llvm.org/D90848
llvm::EmbedBitcodeInModule needs (what used to be called) EmbedMarker
set, in order to emit .llvmcmd. EmbedMarker is really about embedding the
command line, so renamed the parameter accordingly, too.
This was not caught at test because the check-prefix was incorrect, but
FileCheck does not report that when multiple prefixes are provided. A
separate patch will address that.
Differential Revision: https://reviews.llvm.org/D90278
This is needed to support fortran assumed rank arrays which
have runtime rank.
Summary:
Fortran assumed rank arrays have dynamic rank. DWARF TAG
DW_TAG_generic_subrange is needed to support that.
Testing:
unit test cases added (hand-written)
check llvm
check debug-info
Reviewed By: aprantl
Differential Revision: https://reviews.llvm.org/D89218
It's currently ambiguous in IR whether the source language explicitly
did not want a stack a stack protector (in C, via function attribute
no_stack_protector) or doesn't care for any given function.
It's common for code that manipulates the stack via inline assembly or
that has to set up its own stack canary (such as the Linux kernel) would
like to avoid stack protectors in certain functions. In this case, we've
been bitten by numerous bugs where a callee with a stack protector is
inlined into an __attribute__((__no_stack_protector__)) caller, which
generally breaks the caller's assumptions about not having a stack
protector. LTO exacerbates the issue.
While developers can avoid this by putting all no_stack_protector
functions in one translation unit together and compiling those with
-fno-stack-protector, it's generally not very ergonomic or as
ergonomic as a function attribute, and still doesn't work for LTO. See also:
https://lore.kernel.org/linux-pm/20200915172658.1432732-1-rkir@google.com/https://lore.kernel.org/lkml/20200918201436.2932360-30-samitolvanen@google.com/T/#u
Typically, when inlining a callee into a caller, the caller will be
upgraded in its level of stack protection (see adjustCallerSSPLevel()).
By adding an explicit attribute in the IR when the function attribute is
used in the source language, we can now identify such cases and prevent
inlining. Block inlining when the callee and caller differ in the case that one
contains `nossp` when the other has `ssp`, `sspstrong`, or `sspreq`.
Fixes pr/47479.
Reviewed By: void
Differential Revision: https://reviews.llvm.org/D87956
When generating the use-list order, also consider value uses that are
operands which are wrapped in metadata; e.g. llvm.dbg.value operands.
This fixes PR36778. The test case is based on the reproducer from that
report.
Reviewed By: dexonsmith
Differential Revision: https://reviews.llvm.org/D53758
This adds the LLVM IR attribute `mustprogress` as defined in LangRef through D86233. This attribute will be applied to functions with in languages like C++ where forward progress is guaranteed. Functions without this attribute are not required to make progress.
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D85393
Happened to notice some of these printing as UnknownCode while running llvm-bcanalyzer on a bc file I had.
Differential Revision: https://reviews.llvm.org/D86900
This restores commit ab1b4810b5 which was
reverted in 01b9deba76, with a fix for the
issue it caused. We should use a temporary BitstreamCursor when
loading the global decl attachment records so that the abbrev ids held
in the lazy loading IndexCursor are not clobbered. Enhanced the test so
that the issue is exposed there.
Original description:
When performing ThinLTO importing, the metadata loader attempts to lazy
load, by building an index. However, module level global decl attachment
metadata was being parsed early while building the index, since the
associated (module level) global values aren't materialized on demand.
This results in the creation of forward reference temporary metadatas,
which are expensive.
Normally, these module level global values don't have much attached
metadata. However, in the case of -fwhole-program-vtables (e.g. for
whole program devirtualization), the vtables may have many attached type
metadatas. This was resulting in very slow performance when performing
ThinLTO importing with the default lazy loading.
This patch restructures the handling of these global decl attachment
records, delaying their parsing until after the lazy loading index has
been built. Then the parser can use the interface that loads from the
index, which resolves forward references immediately instead of creating
expensive temporaries.
For one ThinLTO backend that imports from modules containing huge
numbers of vtables and associated types, I measured the following
compile times for the metadata materialization during function
importing, rounded to nearest second:
No -fwhole-program-vtables:
Lazy loading on (head): 1s
Lazy loading off (head): 3s
Lazy loading on (patch): 1s
With -fwhole-program-vtables:
Lazy loading on (head): 440s
Lazy loading off (head): 4s
Lazy loading on (patch): 2s
Differential Revision: https://reviews.llvm.org/D87970
This patch adds support for DWARF attribute DW_AT_rank.
Summary:
Fortran assumed rank arrays have dynamic rank. DWARF attribute
DW_AT_rank is needed to support that.
Testing:
unit test cases added (hand-written)
check llvm
check debug-info
Reviewed By: aprantl
Differential Revision: https://reviews.llvm.org/D89141
This allows overload sets containing function_ref arguments to work correctly
Otherwise they're ambiguous as anything "could be" converted to a function_ref.
This matches proposed std::function_ref, absl::function_ref, etc.
Differential Revision: https://reviews.llvm.org/D88901
This reverts commit 55c4ff91bd.
Issues were introduced as discussed in https://reviews.llvm.org/D88241
where this change made previous bugs in the linker and BitCodeWriter
visible.
Make the corresponding change that was made for byval in
b7141207a4. Like byval, this requires a
bulk update of the test IR tests to include the type before this can
be mandatory.
This reverts commit ab1b4810b5.
It caused an issue in llvm::lto::thinBackend for a -fsanitize=cfi build.
```
AbbrevNo is 0 => "Invalid abbrev number"
0 llvm::BitstreamCursor::getAbbrev (this=0x9db4c8, AbbrevID=4) at llvm/include/llvm/Bitstream/BitstreamReader.h:528
1 0x00007f5f777a6eb4 in llvm::BitstreamCursor::readRecord (this=0x9db4c8, AbbrevID=4, Vals=llvm::SmallVector of Size 0, Capacity 64, Blob=0x7ffcd0e26558) at
usr/local/google/home/maskray/llvm/llvm/lib/Bitstream/Reader/BitstreamReader.cpp:228
2 0x00007f5f796bf633 in llvm::MetadataLoader::MetadataLoaderImpl::lazyLoadOneMetadata (this=0x9db3a0, ID=188, Placeholders=...) at /usr/local/google/home/mas
ray/llvm/llvm/lib/Bitcode/Reader/MetadataLoader.cpp:1091
3 0x00007f5f796c2527 in llvm::MetadataLoader::MetadataLoaderImpl::getMetadataFwdRefOrLoad (this=0x9db3a0, ID=188) at llvm
lib/Bitcode/Reader/MetadataLoader.cpp:668
4 0x00007f5f796bfff3 in llvm::MetadataLoader::getMetadataFwdRefOrLoad (this=0xd31580, Idx=188) at llvm/lib/Bitcode/Reader
MetadataLoader.cpp:2290
5 0x00007f5f79638265 in (anonymous namespace)::BitcodeReader::parseFunctionBody (this=0xd312e0, F=0x9de758) at llvm/lib/B
tcode/Reader/BitcodeReader.cpp:3938
6 0x00007f5f79635d32 in (anonymous namespace)::BitcodeReader::materialize (this=0xd312e0, GV=0x9de758) at llvm/lib/Bitcod
/Reader/BitcodeReader.cpp:5408
7 0x00007f5f7f8dbe3e in llvm::Module::materialize (this=0x9b92c0, GV=0x9de758) at llvm/lib/IR/Module.cpp:442
8 0x00007f5f7f7f8fbe in llvm::GlobalValue::materialize (this=0x9de758) at llvm/lib/IR/Globals.cpp:50
9 0x00007f5f83b9b5f5 in llvm::FunctionImporter::importFunctions (this=0x7ffcd0e2a730, DestModule=..., ImportList=...) at
llvm/lib/Transforms/IPO/FunctionImport.cpp:1182
```
When performing ThinLTO importing, the metadata loader attempts to lazy
load, by building an index. However, module level global decl attachment
metadata was being parsed early while building the index, since the
associated (module level) global values aren't materialized on demand.
This results in the creation of forward reference temporary metadatas,
which are expensive.
Normally, these module level global values don't have much attached
metadata. However, in the case of -fwhole-program-vtables (e.g. for
whole program devirtualization), the vtables may have many attached type
metadatas. This was resulting in very slow performance when performing
ThinLTO importing with the default lazy loading.
This patch restructures the handling of these global decl attachment
records, delaying their parsing until after the lazy loading index has
been built. Then the parser can use the interface that loads from the
index, which resolves forward references immediately instead of creating
expensive temporaries.
For one ThinLTO backend that imports from modules containing huge
numbers of vtables and associated types, I measured the following
compile times for the metadata materialization during function
importing, rounded to nearest second:
No -fwhole-program-vtables:
Lazy loading on (head): 1s
Lazy loading off (head): 3s
Lazy loading on (patch): 1s
With -fwhole-program-vtables:
Lazy loading on (head): 440s
Lazy loading off (head): 4s
Lazy loading on (patch): 2s
Differential Revision: https://reviews.llvm.org/D87970
Bitcode writer does not flush buffer until the end by default. This is
fine to small bitcode files. When -flto,--plugin-opt=emit-llvm,-gmlt are
used, the final bitcode file is large, for example, >8G. Keeping all
data in memory consumes a lot of memory.
This change allows bitcode writer flush data to disk early when buffered
data size is above some threshold. This is only enabled when lld emits
LLVM bitcode.
One issue to address is backpatching bitcode: subblock length, function
body indexes, meta data indexes need to backfill. If buffer can be
flushed partially, we introduced raw_fd_stream that supports
read/seek/write, and enables backpatching bitcode flushed in disk.
Reviewed-by: tejohnson, MaskRay
Differential Revision: https://reviews.llvm.org/D86905
llvm::EmbedBitcodeInModule handles serializing the passed-in module, if
the provided MemoryBufferRef is invalid. This is already the path taken
in one of the uses of the API - clang::EmbedBitcode, when called from
BackendConsumer::HandleTranslationUnit - so might as well do the same
here and reduce (by very little) code duplication.
The only difference this patch introduces is that the serialization happens
with ShouldPreserveUseListOrder set to true.
Differential Revision: https://reviews.llvm.org/D87339
Otherwise their alignment is dependent on the size of the section. If the size
is large than 16, the alignment will be 16.
16 is a bad choice for both .llvmbc and .llvmcmd because the padding between two
contributions from input sections is of a variable size.
A bitstream is actually guaranteed to be 4-byte aligned, but consumers don't
need this property.
This patch changes ElementCount so that the Min and Scalable
members are now private and can only be accessed via the get
functions getKnownMinValue() and isScalable(). In addition I've
added some other member functions for more commonly used operations.
Hopefully this makes the class more useful and will reduce the
need for calling getKnownMinValue().
Differential Revision: https://reviews.llvm.org/D86065
This patch adds support for representing Fortran `character(n)`.
Primarily patch is based out of D54114 with appropriate modifications.
Test case IR is generated using our downstream classic-flang. We're in process
of upstreaming flang PR's but classic-flang has dependencies on llvm, so
this has to get in first.
Patch includes functional test case for both IR and corresponding
dwarf, furthermore it has been manually tested as well using GDB.
Source snippet:
```
program assumedLength
call sub('Hello')
call sub('Goodbye')
contains
subroutine sub(string)
implicit none
character(len=*), intent(in) :: string
print *, string
end subroutine sub
end program assumedLength
```
GDB:
```
(gdb) ptype string
type = character (5)
(gdb) p string
$1 = 'Hello'
```
Reviewed By: aprantl, schweitz
Differential Revision: https://reviews.llvm.org/D86305
This avoid GUID lookup in Index.findSummaryInModule.
Follow up for D81242.
Reviewed By: tejohnson
Differential Revision: https://reviews.llvm.org/D85269
Adds the binary format goff and the operating system zos to the triple
class. goff is selected as default binary format if zos is choosen as
operating system. No further functionality is added.
Reviewers: efriedma, tahonermann, hubert.reinterpertcast, MaskRay
Reviewed By: efriedma, tahonermann, hubert.reinterpertcast
Differential Revision: https://reviews.llvm.org/D82081
Summary:
The upgrading path from old ModuleFlag based linker options to the new
NamedMetadata based linker option in in materializeMetadata() which gets
called once for the module and once for every GV. The linker options are
getting dup'ed every time and it can create massive amount of the linker
options in the object file that gets created from old bitcode. Fix the
problem by checking if the new option exists or not before upgrade
again.
rdar://64543389
Reviewers: pcc, t.p.northover, dexonsmith, arphaman
Reviewed By: arphaman
Subscribers: hiraditya, jkorous, ributzka, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D83688
Summary:
If bitcode reader gets an invalid branch weight, drop that from the
inputs. This allows us to read the broken modules we generated before
the verifier was able to catch this.
rdar://64870641
Reviewers: yrouban, t.p.northover, dexonsmith, arphaman, aprantl
Reviewed By: aprantl
Subscribers: aprantl, hiraditya, jkorous, ributzka, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D83699
Summary:
This support is needed for the Fortran array variables with pointer/allocatable
attribute. This support enables debugger to identify the status of variable
whether that is currently allocated/associated.
for pointer array (before allocation/association)
without DW_AT_associated
(gdb) pt ptr
type = integer (140737345375288:140737354129776)
(gdb) p ptr
value requires 35017956 bytes, which is more than max-value-size
with DW_AT_associated
(gdb) pt ptr
type = integer (:)
(gdb) p ptr
$1 = <not associated>
for allocatable array (before allocation)
without DW_AT_allocated
(gdb) pt arr
type = integer (140737345375288:140737354129776)
(gdb) p arr
value requires 35017956 bytes, which is more than max-value-size
with DW_AT_allocated
(gdb) pt arr
type = integer, allocatable (:)
(gdb) p arr
$1 = <not allocated>
Testing
- unit test cases added
- check-llvm
- check-debuginfo
Reviewed By: aprantl
Differential Revision: https://reviews.llvm.org/D83544
This allows tracking the in-memory type of a pointer argument to a
function for ABI purposes. This is essentially a stripped down version
of byval to remove some of the stack-copy implications in its
definition.
This includes the base IR changes, and some tests for places where it
should be treated similarly to byval. Codegen support will be in a
future patch.
My original attempt at solving some of these problems was to repurpose
byval with a different address space from the stack. However, it is
technically permitted for the callee to introduce a write to the
argument, although nothing does this in reality. There is also talk of
removing and replacing the byval attribute, so a new attribute would
need to take its place anyway.
This is intended avoid some optimization issues with the current
handling of aggregate arguments, as well as fixes inflexibilty in how
frontends can specify the kernel ABI. The most honest representation
of the amdgpu_kernel convention is to expose all kernel arguments as
loads from constant memory. Today, these are raw, SSA Argument values
and codegen is responsible for turning these into loads.
Background:
There currently isn't a satisfactory way to represent how arguments
for the amdgpu_kernel calling convention are passed. In reality,
arguments are passed in a single, flat, constant memory buffer
implicitly passed to the function. It is also illegal to call this
function in the IR, and this is only ever invoked by a driver of some
kind.
It does not make sense to have a stack passed parameter in this
context as is implied by byval. It is never valid to write to the
kernel arguments, as this would corrupt the inputs seen by other
dispatches of the kernel. These argumets are also not in the same
address space as the stack, so a copy is needed to an alloca. From a
source C-like language, the kernel parameters are invisible.
Semantically, a copy is always required from the constant argument
memory to a mutable variable.
The current clang calling convention lowering emits raw values,
including aggregates into the function argument list, since using
byval would not make sense. This has some unfortunate consequences for
the optimizer. In the aggregate case, we end up with an aggregate
store to alloca, which both SROA and instcombine turn into a store of
each aggregate field. The optimizer never pieces this back together to
see that this is really just a copy from constant memory, so we end up
stuck with expensive stack usage.
This also means the backend dictates the alignment of arguments, and
arbitrarily picks the LLVM IR ABI type alignment. By allowing an
explicit alignment, frontends can make better decisions. For example,
there's real no advantage to an aligment higher than 4, so a frontend
could choose to compact the argument layout. Similarly, there is a
high penalty to using an alignment lower than 4, so a frontend could
opt into more padding for small arguments.
Another design consideration is when it is appropriate to expose the
fact that these arguments are all really passed in adjacent
memory. Currently we have a late IR optimization pass in codegen to
rewrite the kernel argument values into explicit loads to enable
vectorization. In most programs, unrelated argument loads can be
merged together. However, exposing this property directly from the
frontend has some disadvantages. We still need a way to track the
original argument sizes and alignments to report to the driver. I find
using some side-channel, metadata mechanism to track this
unappealing. If the kernel arguments were exposed as a single buffer
to begin with, alias analysis would be unaware that the padding bits
betewen arguments are meaningless. Another family of problems is there
are still some gaps in replacing all of the available parameter
attributes with metadata equivalents once lowered to loads.
The immediate plan is to start using this new attribute to handle all
aggregate argumets for kernels. Long term, it makes sense to migrate
all kernel arguments, including scalars, to be passed indirectly in
the same manner.
Additional context is in D79744.
as it wasn't NFC and is causing issues with thinlto bitcode reading.
I've followed up offline with reproduction instructions and testcases.
This reverts commit 30582457b4.
The `noundef` attribute indicates an argument or return value which
may never have an undef value representation.
This patch allows LLVM to parse the attribute.
Differential Revision: https://reviews.llvm.org/D83412