This is still not sufficient for lld to handle its own output when a
fde points to a discarded section. I am investigating if it is better
to change the -r output or make lld able to read the current version.
llvm-svn: 295141
freescale triple.
On multiarch systems, this previously caused us to stat every file in
/usr/lib/<triple> (typically several thousand files). This change halves
the runtime of a clang invocation on an empty file on my system.
llvm-svn: 295140
This patch corrects the maximum workgroups per CU if we have big
workgroups (more than 128). This calculation contributes to the
occupancy calculation in respect to LDS size.
Differential Revision: https://reviews.llvm.org/D29974
llvm-svn: 295134
This is a really horrible case. If a .eh_frame points to a discarded
section, it is not clear what is the correct thing to do.
It looks like ld.bfd discards the entire .eh_frame content and gold
discards the second relocation, leaving one frame with an fde that
refers to a bogus location. This is similar to what gold does.
llvm-svn: 295133
This reverts commit r295102.
In the link of seabios the assumption seems to be that the section has
an actual address, so this is not sufficient. Changing the assembly
code to add a "a" flag seems like the correct thing to do instead of
extending this hack.
Sorry about the noise.
Original message:
Relax the restriction on what relocations can be in a non-alloc section.
The main thing that they can't have is relocations that require the
creation of gots or plt. For now also accept R_PC.
Found while linking seabios.
llvm-svn: 295130
Summary: Previously the cleanups (e.g. dtor calls) are inserted into the
outer scope (e.g. function body scope), instead of it's own scope. After
the fix, the cleanups are inserted right after getting the size value.
This fixes pr30306.
Reviewers: rsmith
Subscribers: cfe-commits
Differential Revision: https://reviews.llvm.org/D24333
llvm-svn: 295123
that has been explicitly specialized!
We assume in various places that we can tell the template specialization kind
of a class type by looking at the declaration produced by TagType::getDecl.
That was previously not quite true: for an explicit specialization, we could
have first seen a template-id denoting the specialization (with a use that does
not trigger an implicit instantiation of the defintiion) and then seen the
first explicit specialization declaration. TagType::getDecl would previously
return an arbitrary declaration when called on a not-yet-defined class; it
now consistently returns the most recent declaration in that case.
llvm-svn: 295118
Summary:
The YAML output produced by llvm-xray is supposed to be wrapped at the
arbitrary default of 70 columns set by `yaml:Output`. Unfortunately,
the wrapping is rather unpredictable, and can easily go past the set
number of columns, depending on the execution environment.
To make the YAML output environment-independent, disable wrapping
instead.
Reviewers: dberris
Reviewed By: dberris
Subscribers: fhahn, llvm-commits
Differential Revision: https://reviews.llvm.org/D29962
llvm-svn: 295116
Multiple blocks in the callee can be mapped to a single cloned block
since we prune the callee as we clone it. The existing code
iterates over the value map and clones the block frequency (and
eventually scales the frequencies of the cloned blocks). Value map's
iteration is not deterministic and so the cloned block might get the
frequency of any of the original blocks. The fix is to set the max of
the original frequencies to the cloned block. The first block in the
sequence must have this max frequency and, in the call context,
subsequent blocks must have its frequency.
Differential Revision: https://reviews.llvm.org/D29696
llvm-svn: 295115
In case user did not provide valid standard name for -std option, available
values (with short description) will be reported.
Patch by Paweł Żukowski!
llvm-svn: 295113
Summary:
This helps to avoid signed integer overflow after running a fast fuzz target for several hours, e.g.:
<...>
Done -1097903291 runs in 54001 second(s)
Reviewers: kcc
Reviewed By: kcc
Differential Revision: https://reviews.llvm.org/D29941
llvm-svn: 295112
Group calls into constant and non-constant arguments up front, and use uint64_t
instead of ConstantInt to represent constant arguments. The goal is to allow
the information from the summary to fit naturally into this data structure in
a future change (specifically, it will be added to CallSiteInfo).
This has two side effects:
- We disallow VCP for constant integer arguments of width >64 bits.
- We remove the restriction that the bitwidth of a vcall's argument and return
types must match those of the vfunc definitions.
I don't expect either of these to matter in practice. The first case is
uncommon, and the second one will lead to UB (so we can do anything we like).
Differential Revision: https://reviews.llvm.org/D29744
llvm-svn: 295110
Summary:
When setting debugloc for instructions created in SplitBlockPredecessors, current implementation copies debugloc from the first-non-phi instruction of the original basic block. However, if the first-non-phi instruction is a call for @llvm.dbg.value, the debugloc of the instruction may point the location outside of the block itself. For the example code of
```
1 typedef struct _node_t {
2 struct _node_t *next;
3 } node_t;
4
5 extern node_t *root;
6
7 int foo() {
8 node_t *node, *tmp;
9 int ret = 0;
10
11 node = tmp = root->next;
12 while (node != root) {
13 while (node) {
14 tmp = node;
15 node = node->next;
16 ret++;
17 }
18 }
19
20 return ret;
21 }
```
, below is the basicblock corresponding to line 12 after Reassociate expressions pass:
```
while.cond: ; preds = %while.cond2, %entry
%node.0 = phi %struct._node_t* [ %1, %entry ], [ null, %while.cond2 ]
%ret.0 = phi i32 [ 0, %entry ], [ %ret.1, %while.cond2 ]
tail call void @llvm.dbg.value(metadata i32 %ret.0, i64 0, metadata !19, metadata !20), !dbg !21
tail call void @llvm.dbg.value(metadata %struct._node_t* %node.0, i64 0, metadata !11, metadata !20), !dbg !31
%cmp = icmp eq %struct._node_t* %node.0, %0, !dbg !33
br i1 %cmp, label %while.end5, label %while.cond2, !dbg !35
```
As you can see, the first-non-phi instruction is a call for @llvm.dbg.value, and the debugloc is
```
!21 = !DILocation(line: 9, column: 7, scope: !6)
```
, which is a definition of 'ret' variable and outside of the scope of the basicblock itself. However, current implementation picks up this debugloc for the instructions created in SplitBlockPredecessors. This patch addresses this problem by picking up debugloc from the first-non-phi-non-dbg instruction.
Reviewers: dblaikie, samsonov, eugenis
Reviewed By: eugenis
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D29867
llvm-svn: 295106
Summary:
Blocks ending in unreachable are typically cold because they end the
program or throw an exception, so merging them with other identical
blocks is usually profitable because it reduces the size of cold code.
MachineBlockPlacement generally does not arrange to fall through to such
blocks, so commoning these blocks will not introduce additional
unconditional branches.
Reviewers: hans, iteratee, haicheng
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D29153
llvm-svn: 295105
This instruction clears the low bits of a pointer without requiring (possibly
dodgy if pointers aren't ints) conversions to and from an integer. Since (as
far as I'm aware) all masks are statically known, the instruction takes an
immediate operand rather than a register to specify the mask.
llvm-svn: 295103
The main thing that they can't have is relocations that require the
creation of gots or plt. For now also accept R_PC.
Found while linking seabios.
llvm-svn: 295102
This is a re-try of r295085: fix up some test cases that assume that
profile name variables are preserved by the instrprof pass.
This catches one additional case in test/CoverageMapping/unused_names.c.
llvm-svn: 295101
This reverts 295092 (re-applies 295084), with a fix for dangling
references from the array of coverage names passed down from frontends.
I missed this in my initial testing because I only checked test/Profile,
and not test/CoverageMapping as well.
Original commit message:
The profile name variables passed to counter increment intrinsics are dead
after we emit the finalized name data in __llvm_prf_nm. However, we neglect to
erase these name variables. This causes huge size increases in the
__TEXT,__const section as well as slowdowns when linker dead stripping is
disabled. Some affected projects are so massive that they fail to link on
Darwin, because only the small code model is supported.
Fix the issue by throwing away the name constants as soon as we're done with
them.
Differential Revision: https://reviews.llvm.org/D29921
llvm-svn: 295099
Store instructions can have more than one memory operand as a result
of optimizations that fold different stores into one.
When we identify spill instructions to generate DBG_VALUE instructions
to record the spilling of a variable, we disregard stores with
multiple memory operands for now. We may miss some relevant spills but
the handling is a bit more complex, so we'll do it in a different patch.
This fixes PR31935.
llvm-svn: 295093