- Add "amdgpu-waitcnt-forcezero" to force all waitcnt instrs to be emitted as s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
- Add debug counters to control force emit of s_waitcnt instrs; debug counters:
si-insert-waitcnts-forceexp: force emit s_waitcnt expcnt(0) instrs
si-insert-waitcnts-forcevm: force emit s_waitcnt lgkmcnt(0) instrs
si-insert-waitcnts-forcelgkm: force emit s_waitcnt vmcnt(0) instrs
- Add some debug statements
Note that a variant of this patch was previously committed/reverted.
Differential Revision: https://reviews.llvm.org/D45888
llvm-svn: 330862
Passing the features in random order will lead to unpredictable results
when some of the features are related (like the architecture-version
features on ARM).
It might be possible to fix this particular case in the ARM target code,
to avoid adding overlapping target features. But we should probably be
sorting in any case: the behavior shouldn't depend on StringMap's
hashing algorithm.
Differential Revision: https://reviews.llvm.org/D46030
llvm-svn: 330861
Debug var, expr and loc were only supported for non-fixed stack objects.
This patch adds the following fields to the "fixedStack:" entries, and
renames the ones from "stack:" to:
* debug-info-variable
* debug-info-expression
* debug-info-location
Differential Revision: https://reviews.llvm.org/D46032
llvm-svn: 330859
The current statement domain derivation algorithm does not (always)
consider that different exit blocks of a loop can have different
conditions to be reached.
From the code
for (int i = n; ; i-=2) {
if (i <= 0) goto even;
if (i <= 1) goto odd;
A[i] = i;
}
even:
A[0] = 42;
return;
odd:
A[1] = 21;
return;
Polly currently derives the following domains:
Stmt_even_critedge
Domain :=
[n] -> { Stmt_even_critedge[] };
Stmt_odd
Domain :=
[n] -> { Stmt_odd[] : (1 + n) mod 2 = 0 and n > 0 };
while the domain for the odd case is correct, Stmt_even is assumed to be
executed unconditionally, which is obviously wrong. While projecting out
the loop dimension in `adjustDomainDimensions`, it does not consider
that there are other exit condition that have matched before.
I don't know a how to fix this without changing a lot of code. Therefore
This patch rejects loops with multiple exist blocks to fix the
miscompile of test-suite's uuencode.
The odd condition is transformed by LLVM to
%cmp1 = icmp eq i64 %indvars.iv, 1
such that the project_out in adjustDomainDimensions() indeed only
matches for odd n (using this condition only, we'd have an infinite loop
otherwise).
The even condition manifests as
%cmp = icmp slt i64 %indvars.iv, 3
Because buildDomainsWithBranchConstraints() does not consider other exit
conditions, it has to assume that the induction variable will eventually
be lower than 3 and taking this exit.
IMHO we need to reuse the algorithm that determines the number of
iterations (addLoopBoundsToHeaderDomain) to determine which exit
condition applies first. It has to happen in
buildDomainsWithBranchConstraints() because the result will need to
propagate to successor BBs. Currently addLoopBoundsToHeaderDomain() just
look for union of all backedge conditions (which means leaving not the
loop here). The patch in llvm.org/PR35465 changes it to look for exit
conditions instead. This is required because there might be other exit
conditions that do not alternatively go back to the loop header.
Differential Revision: https://reviews.llvm.org/D45649
llvm-svn: 330858
Summary:
This adds `__scudo_print_stats` as an interface function to display the Primary
and Secondary allocator statistics for Scudo.
Reviewers: alekseyshl, flowerhack
Reviewed By: alekseyshl
Subscribers: delcypher, llvm-commits, #sanitizers
Differential Revision: https://reviews.llvm.org/D46016
llvm-svn: 330857
This is necessary in order to get a working C++ compiler on Darwin
since Clang expects libc++ headers to be part of the toolchain.
Differential Revision: https://reviews.llvm.org/D46075
llvm-svn: 330855
Imports in a wasm module can have custom module name. This change
adds the module name to the WasmSymbol structure so that the linker
can preserve this module name.
This is needed to fix: https://bugs.llvm.org/show_bug.cgi?id=37168
Differential Revision: https://reviews.llvm.org/D45797
llvm-svn: 330854
Previously we only formed MUL_IMM when we split a constant. This blocked load folding on those cases. We should also form MUL_IMM for 3/5/9 to favor LEA over load folding.
Differential Revision: https://reviews.llvm.org/D46040
llvm-svn: 330850
There are only a few cases of importing a frienddecl which is currently supported.
This patch aims to improve the friend import process.
Set FriendObjectKind in case of decls, insert friend into the friend chain
correctly, checks structurally equivalent in a more advanced manner.
Test cases added as well.
llvm-svn: 330847
Previously `call zero`, `call f0` etc would fail. This leads to compilation
failures if building programs that define functions with those names and using
-save-temps.
llvm-svn: 330846
Summary:
When performing indirect call promotion, current implementation inspects "all" parameters of the callsite and attemps to match with the formal argument type of the callee function. However, it is not possible to find the type for variable length arguments, and the compiler crashes when it attemps to match the type for variable lenght argument.
It seems that the bug is introduced with D40658. Prior to that, the type matching is performed only for the parameters whose ID is less than callee->getFunctionNumParams(). The attached test case will crash without the patch.
Reviewers: mssimpso, davidxl, davide
Reviewed By: mssimpso
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D46026
llvm-svn: 330844
Virtually all other tablegen outputs are called .inc, not .gen, so rename these two too for consistency.
No behavior change.
https://reviews.llvm.org/D46058
llvm-svn: 330843
As discussed in D45862, we want to delete parts of
this code because it can create more instructions
than it removes. But we also want to preserve some
folds that are winners, so tidy up what's here to
make splitting the good from bad a bit easier.
llvm-svn: 330841
The read/write flag is set by manually decoding the instruction that caused
the exception. It is implemented this way because the cause register which
contains the needed flag was removed from the signal context structure which
the user handler receives from the kernel.
Patch by Milos Stojanovic.
Differential Revision: https://reviews.llvm.org/D45768
llvm-svn: 330840
As discussed in D45862, we want these folds sometimes
because they're good improvements.
But as we can see here, the current logic doesn't
check uses and doesn't produce optimal code in all
cases.
llvm-svn: 330837
Summary:
This is a convenient function when we try to get std::string of
SymbolID.
Reviewers: ioeric
Subscribers: klimek, ilya-biryukov, MaskRay, jkorous, cfe-commits
Differential Revision: https://reviews.llvm.org/D46065
llvm-svn: 330835
Summary:
The example that was broken before (^ designates completion points):
class Foo {
Foo() : fie^ld^() {} // no completions were provided here.
int field;
};
To fix it we don't cut off lexing after an identifier followed by code
completion token is lexed. Instead we skip the rest of identifier and
continue lexing.
This is consistent with behavior of completion when completion token is
right before the identifier.
Reviewers: sammccall, aaron.ballman, bkramer, sepavloff, arphaman, rsmith
Reviewed By: aaron.ballman
Subscribers: cfe-commits
Differential Revision: https://reviews.llvm.org/D44932
llvm-svn: 330833
Summary: This adds some delimiters to detect cpp code in raw strings.
Reviewers: klimek
Reviewed By: klimek
Subscribers: cfe-commits
Differential Revision: https://reviews.llvm.org/D46062
llvm-svn: 330832
To do this:
1. Change GlobalAddress SDNode to TargetGlobalAddress to avoid legalizer
split the symbol.
2. Change ExternalSymbol SDNode to TargetExternalSymbol to avoid legalizer
split the symbol.
3. Let PseudoCALL match direct call with target operand TargetGlobalAddress
and TargetExternalSymbol.
Differential Revision: https://reviews.llvm.org/D44885
llvm-svn: 330827
To do this:
1. Add PseudoCALLIndirct to match indirect function call.
2. Add PseudoCALL to support parsing and print pseudo `call` in assembly
3. Expand PseudoCALL to the following form with R_RISCV_CALL relocation type
while encoding:
auipc ra, func
jalr ra, ra, 0
If we expand PseudoCALL before emitting assembly, we will see auipc and jalr
pair when compile with -S. It's hard for assembly parser to parsing this
pair and identify it's semantic is function call and then insert R_RISCV_CALL
relocation type. Although we could insert R_RISCV_PCREL_HI20 and
R_RISCV_PCREL_LO12_I relocation types instead of R_RISCV_CALL.
Due to RISCV relocation design, auipc and jalr pair only can relax to jal with
R_RISCV_CALL + R_RISCV_RELAX relocation types.
We expand PseudoCALL as late as encoding(RISCVMCCodeEmitter) instead of before
emitting assembly(RISCVAsmPrinter) because we want to preserve call
pseudoinstruction in assembly code. It's more readable and assembly parser
could identify call assembly and insert R_RISCV_CALL relocation type.
Differential Revision: https://reviews.llvm.org/D45859
llvm-svn: 330826
ISel is currently picking 'JAL' over 'JAL_MM' for calling a function when
targeting microMIPS. A later patch will correct this behaviour.
This patch extends the mechanism for transforming instructions into their short
delay to recognise 'JAL_MM' for transforming into 'JALS_MM'.
llvm-svn: 330825
source/Symbol/ClangASTContext.cpp:391:13: error: enumeration value 'HIP' not handled in switch [-Werror,-Wswitch]
switch (IK.getLanguage()) {
llvm-svn: 330823
With this patch, options to add/tweak views are all grouped together in the
-help output.
The new "View Options" category looks like this:
```
View Options:
-dispatch-stats - Print dispatch statistics
-instruction-info - Print the instruction info view
-instruction-tables - Print instruction tables
-register-file-stats - Print register file statistics
-resource-pressure - Print the resource pressure view
-retire-stats - Print retire control unit statistics
-scheduler-stats - Print scheduler statistics
-timeline - Print the timeline view
-timeline-max-cycles=<uint> - Maximum number of cycles in the timeline view. Defaults to 80 cycles
-timeline-max-iterations=<uint> - Maximum number of iterations to print in timeline view
```
llvm-svn: 330816
Currently, LLD supports ASSERT as a separate command.
We support two forms now.
Assign expression-form: . = ASSERT(0x100)
(old GNU ld required it and some scripts in the wild are still using
something like . = ASSERT((_end - _text <= (512 * 1024 * 1024)), "kernel image bigger than KERNEL_IMAGE_SIZE");
Nowadays above is not a mandatory form and command-like form is commonly used:
ASSERT(<expr>, "text);
The return value of the ASSERT is Dot. That was implemented in D30171.
It looks like (2) is just a short version of (1) then.
GNU ld does *not* list ASSERT as a SECTIONS command:
https://sourceware.org/binutils/docs/ld/SECTIONS.html#SECTIONS
Given above we probably can change ASSERT to be an assignment to Dot.
That makes the rest of the code much simpler. Patch do that.
Differential revision: https://reviews.llvm.org/D45434
llvm-svn: 330814