We have an issue currently. The following YAML piece just ignores the `Excluded` key.
```
SectionHeaderTable:
Sections: []
Excluded:
- Name: .foo
```
Currently the meaning is: exclude the whole table.
The code checks that the `Sections` key is empty and doesn't catch/check
invalid/duplicated/missed `Excluded` entries.
Also there is no way to exclude all sections except the first null section,
because `Sections: []` currently just excludes the whole the sections header table.
To fix it, I suggest a change of the behavior.
1) A new `NoHeaders` key is added. It provides an explicit syntax to drop the whole table.
2) The meaning of the following is changed:
```
SectionHeaderTable:
Sections: []
Excluded:
- Name: .foo
```
Assuming there are 2 sections in the object (a null section and `.foo`), with this patch it
means: exclude the `.foo` section, keep the null section. The null section is an implicit
section and I think it is reasonable to make "Sections: []" to mean it is implicitly added.
It will be consistent with the global "Sections" tag that is used to describe sections.
3) `SectionHeaderTable->Sections` is now optional. No `Sections` is the same as
`Sections: []` (I think it avoids a confusion).
4) Using of `NoHeaders` together with `Sections`/`Excluded` is not allowed.
5) It is possible to use the `Excluded` key without the `Sections` key now (in this case
`Excluded` must contain all sections).
6) `SectionHeaderTable:` or `SectionHeaderTable: []` is not allowed.
7) When the `SectionHeaderTable` key is present, we still require all sections to be
present in `Sections` and `Excluded` lists. No changes here, we are still strict.
Differential revision: https://reviews.llvm.org/D81655
Summary:
Teach MachineVerifier to check branches for MBB operands if they are not declared indirect.
Add `isBarrier`, `isIndirectBranch` to `G_BRINDIRECT` and `G_BRJT`.
Without these, `MachineInstr.isConditionalBranch()` was giving a
false-positive for those instructions.
Reviewers: aemerson, qcolombet, dsanders, arsenm
Reviewed By: dsanders
Subscribers: hiraditya, wdng, simoncook, s.egerton, arsenm, rovka, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D81587
Have BasicTTI call the base implementation so that both agree on the
default behaviour, which the default being a cost of '1'. This has
required an X86 specific implementation as it seems to be very
reliant on those instructions being free. Changes are also made to
AMDGPU so that their implementations distinguish between cost kinds,
so that the unrolling isn't affected. PowerPC also has its own
implementation to prevent changes to the reg-usage vectorizer test.
The cost model test changes now reflect that ret instructions are not
generally free.
Differential Revision: https://reviews.llvm.org/D79164
Enable TTIImpl::getUserCost to handle FNeg so that
getInstructionThroughput can call that instead. This means we can
remove the code in the AMDGPU backend too.
Differential Revision: https://reviews.llvm.org/D81635
When checking for an enum function attribute, use hasFnAttribute()
rather than hasAttribute() at FunctionIndex, because it is
significantly faster (and more concise to boot).
Move the cost modelling, with the reduction pattern matching, from
getInstructionThroughput into generic TTIImpl::getUserCost. The
modelling in the AMDGPU backend can now be removed.
Differential Revision: https://reviews.llvm.org/D81643
This patch tries to reassociate two patterns related to FMA to expose
more ILP on PowerPC.
// Pattern 1:
// A = FADD X, Y (Leaf)
// B = FMA A, M21, M22 (Prev)
// C = FMA B, M31, M32 (Root)
// -->
// A = FMA X, M21, M22
// B = FMA Y, M31, M32
// C = FADD A, B
// Pattern 2:
// A = FMA X, M11, M12 (Leaf)
// B = FMA A, M21, M22 (Prev)
// C = FMA B, M31, M32 (Root)
// -->
// A = FMUL M11, M12
// B = FMA X, M21, M22
// D = FMA A, M31, M32
// C = FADD B, D
Reviewed By: jsji
Differential Revision: https://reviews.llvm.org/D80175
Summary:
When an SCC got split due to inlining, we have two mechanisms for reprocessing the updated SCC, first is UR.UpdatedC
that repeatedly rerun the new, current SCC; second is a worklist for all newly split SCCs. We can avoid rerun of
the same SCC when the SCC is set to be processed by both mechanisms *back to back*. In pathological cases, such redundant
rerun could cause exponential size growth due to inlining along cycles, even when there's no SCC mutation and hence
convergence is not a problem.
Note that it's ok to have SCC updated and rerun immediately, and also in the work list if we have actually moved an SCC
to be topologically "below" the current one due to merging. In that case, we will need to revisit the current SCC after
those moved SCCs. For that reason, the redundant avoidance here only targets back to back rerun of the same SCC - the
case described by the now removed FIXME comment.
Reviewers: chandlerc, wmi
Subscribers: llvm-commits, hoy
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D80589
Currently, there doesn't seem to be any way to look up a Value*
in a map/set indexed by AssertingVH/PoisoningVH, without creating
a value handle -- which is fairly expensive, because it involves
adding the value handle to the use list and immediately removing
it again. Using find_as(Value *) does not work (and is in fact
worse than just using find(Value *)), because it will end up
creating multiple value handles during the lookup itself.
For AssertingVH, address this by simply using DenseMapInfo<T *>
instead of manually implementing something. The AssertingVH<T>
will now get coerced to T*, rather than the other way around.
For PoisoningVH, add extra overloads of getHashValue() and
isEqual() that accept a T* argument.
This allows using find_as(Value *) to perform efficient lookups
in assertion-enabled builds.
Differential Revision: https://reviews.llvm.org/D81793
This patch adds a new field `bool Is64bit` in `DWARFYAML::Data` to indicate the address size of target. It's helpful for inferring the `AddrSize` in some DWARF sections.
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D81709
Summary:
ThinLTO linking runs dataflow processing on collected
function parameters. Then StackSafetyGlobalInfoWrapperPass
in ThinLTO backend will run as usual looking up to external
symbol in the summary if needed.
Depends on D80985.
Reviewers: eugenis, pcc
Reviewed By: eugenis
Subscribers: inglorion, hiraditya, steven_wu, dexonsmith, cfe-commits, llvm-commits
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D81242
Summary:
This commit slightly modifies the MCDisassembler, and llvm-objdump to
allow targets to also decode entire symbols.
WebAssembly uses the onSymbolStart hook it to decode preludes.
WebAssembly partially disassembles the symbol in its target specific
way; and then falls back to the normal flow of llvm-objdump.
AMDGPU needs it to decode kernel descriptors entirely, and move to the
next symbol.
This commit is to split the above task into 2.
- Changes to llvm-objdump and MC-layer without breaking WebAssembly code
[ this commit ]
- AMDGPU's implementation of onSymbolStart that decodes kernel
descriptors. [ https://reviews.llvm.org/D80713 ]
Reviewers: scott.linder, t-tye, sunfish, arsenm, jhenderson, MaskRay, aardappel
Reviewed By: scott.linder, jhenderson, aardappel
Subscribers: bcain, dschuff, wdng, tpr, sbc100, jgravelle-google, hiraditya, aheejin, MaskRay, rupprecht, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D80512
Summary:
Inline functions in Type.h depended upon inline functions isVectorTy and
getScalarType defined in DerivedTypes.h. Reimplement these functions in
Type.h in terms of Type
Reviewers: rengolin, efriedma, echristo, c-rhodes, david-arm
Reviewed By: echristo
Subscribers: tschuett, rkruppe, psnobl, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D81684
This patch adds a new option to CriticalEdgeSplittingOptions to control
whether loop-simplify form must be preserved. It is them used by GVN to
indicate that loop-simplify form does not have to be preserved.
This fixes a crash exposed by 189efe295b.
If the critical edge we are splitting goes from a block inside a loop to
a block outside the loop, splitting the edge will create a new exit
block. As a result, the new block will branch to the original exit
block, which will add a non-loop predecessor, breaking loop-simplify
form. To preserve loop-simplify form, the predecessor blocks of the
original exit are split, but that does not work for blocks with
indirectbr terminators. If preserving loop-simplify form is requested,
bail out , before making any changes.
Reviewers: reames, hfinkel, davide, efriedma
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D81582
Summary:
This completes the needed glueing to support reading tbd files from nm.
This includes specifying which slice filtering with `--arch` and a new
option specifically for tbd files `--add-inlinedinfo` which will show
the reexported libraries that are appended in the tbd file.
Reviewers: ributzka, steven_wu, JDevlieghere, jhenderson
Reviewed By: JDevlieghere
Subscribers: hiraditya, MaskRay, dexonsmith, rupprecht, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D81614
SUMMARY:
Since we deal with aix emitLinkage in the PPCAIXAsmPrinter::emitLinkage() in the patch https://reviews.llvm.org/D75866. It do not go to AsmPrinter::emitLinkage() any more, we clean up some aix related code in the AsmPrinter::emitLinkage()
Reviewers: Jason liu
Differential Revision: https://reviews.llvm.org/D81613
Summary:
Other derivations will all want to emit optimization remarks and, as
part of that, use debug info.
Additionally, drive-by const-ing.
Reviewers: davidxl, dblaikie
Subscribers: aprantl, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D81507
Change BasicBlock::removePredecessor to optionally return a vector of
instructions which might be dead. Use this in ConstantFoldTerminator to
delete them if they are dead.
Reapply with a bug fix: don't drop the "!KeepOneInputPHIs" argument when
removePredecessor calls PHINode::removeIncomingValue.
Differential Revision: https://reviews.llvm.org/D80206
Change BasicBlock::removePredecessor to optionally return a vector of
instructions which might be dead. Use this in ConstantFoldTerminator to
delete them if they are dead.
Differential Revision: https://reviews.llvm.org/D80206
Previously these functions either returned a "changed" flag or a "repeat
instruction" flag, and could also modify an iterator to control which
instruction would be processed next.
Simplify this by always returning a "changed" flag, and handling all of
the "repeat instruction" functionality by modifying the iterator.
No functional change intended except in this case:
// If the source and destination of the memcpy are the same, then zap it.
... where the previous code failed to process the instruction after the
zapped memcpy.
Differential Revision: https://reviews.llvm.org/D81540
By moving target-independent code from
llvm/lib/Target/X86/X86IndirectThunks.cpp
to
llvm/include/llvm/CodeGen/IndirectThunks.h
Differential Revision: https://reviews.llvm.org/D81401
Summary:
The patch wraps ThinLTO index into immutable
pass which can be used by StackSafety analysis.
Reviewers: eugenis, pcc
Reviewed By: eugenis
Subscribers: hiraditya, steven_wu, dexonsmith, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D80985
Before this patch, we have to calculate the offset for the current range list entry. This patch helps make the "Offset" field optional.
Reviewed By: jhenderson
Differential Revision: https://reviews.llvm.org/D81220
This change introduces an LLD switch --thinlto-single-module to allow compiling only a part of the input modules. This is specifically enables:
1. Fast investigating/debugging modules of interest without spending time on compiling unrelated modules.
2. Compiler debug dump with -mllvm -debug-only= for specific modules.
It will be useful for large applications which has 1K+ input modules for thinLTO.
The switch can be combined with `--lto-obj-path=` or `--lto-emit-asm` to obtain intermediate object files or assembly files. So far the module name matching is implemented as a fuzzy name lookup where the modules with name containing the switch value are compiled.
E.g,
Command:
ld.lld main.o thin.a --thinlto-single-module=thin.a --lto-obj-path=single.o
log:
[ThinLTO] Selecting thin.a(thin1.o at 168) to compile
[ThinLTO] Selecting thin.a(thin2.o at 228) to compile
Command:
ld.lld main.o thin.a --thinlto-single-module=thin1.o --lto-obj-path=single.o
log:
[ThinLTO] Selecting thin.a(thin1.o at 168) to compile
Differential Revision: https://reviews.llvm.org/D80406
Summary:
New file include to support platform dependent grid constants. It will be
used by clang, libomptarget plugins, and deviceRTLs to access constant
values consistently and with fast access in the deviceRTLs.
Originally authored by Greg Rodgers (@gregrodgers).
Reviewers: arsenm, sameerds, jdoerfert, yaxunl, b-sumner, scchan, JonChesterfield
Reviewed By: arsenm
Subscribers: llvm-commits, pdhaliwal, jholewinski, jvesely, wdng, nhaehnle, guansong, kerbowa, sstefan1, cfe-commits, ronlieb, gregrodgers
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D80917
MIRBuilder was in the middle of of a bunch of methods and not group
with the other member variables, which made it harder to see what
state this carries around. Move these to the top as is the usual
convention.
If the target explicitly requested custom legalization, it should be
required to implement this. Also move default legalizeIntrinsic
implementation into the header so it's next to the related
legalizeCustom.
Summary:
This makes the code easier to reason about, as it will behave the same
way regardless of whether there is any more data coming after the
presumed end of the prologue.
Reviewers: jhenderson, dblaikie, probinson, ikudrin
Subscribers: hiraditya, MaskRay, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D77557
Summary:
This patch splits the Attributor::run() function into multiple
functions.
Simple Logic changes to make this possible:
# Moved iteration count verification earlier.
# NumFinalAAs get set a little bit later.
Reviewers: jdoerfert, sstefan1, uenoku
Reviewed By: jdoerfert
Subscribers: hiraditya, uenoku, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D81022
The relocation codes for R_<CLS>_PLT32 are incorrectly in the dynamic
relocation range that starts at 1024 for AArch64 and 180 for AArch64_32.
Correct these so that they start at the next available static relocation
code in the non-TLS range. The R_<CLS>_PLT32 description is currently in
unpublished so this change corrects LLVM to match the values that will
appear in the final ELF for the 64-bit Arm Architecture document.
Differential Revision: https://reviews.llvm.org/D81410
[ v1 was reverted by c6ec352a6b due to
modpost failing; v2 fixes this. More info:
https://github.com/ClangBuiltLinux/linux/issues/1045#issuecomment-640381783 ]
This makes -fsanitize=kernel-address emit the correct globals
constructors for the kernel. We had to do the following:
* Disable generation of constructors that rely on linker features such
as dead-global elimination.
* Only instrument globals *not* in explicit sections. The kernel uses
sections for special globals, which we should not touch.
* Do not instrument globals that are prefixed with "__" nor that are
aliased by a symbol that is prefixed with "__". For example, modpost
relies on specially named aliases to find globals and checks their
contents. Unfortunately modpost relies on size stored as ELF debug info
and any padding of globals currently causes the debug info to cause size
reported to be *with* redzone which throws modpost off.
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=203493
Tested:
* With 'clang/test/CodeGen/asan-globals.cpp'.
* With test_kasan.ko, we can see:
BUG: KASAN: global-out-of-bounds in kasan_global_oob+0xb3/0xba [test_kasan]
* allyesconfig, allmodconfig (x86_64)
Reviewed By: glider
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D81390
Instead of loading from e.g. `<vscale x 16 x i8>*`, load from element
pointer `i8*`. This is more in line with the other load/store
intrinsics for SVE.
Reviewers: fpetrogalli, c-rhodes, rengolin, efriedma
Reviewed By: efriedma
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D81458
Multiple times we faced an issue of huge outputs due to unexpected behavior
or incorrect test cases. The last one was https://reviews.llvm.org/D80629#2073066.
This patch limits the output to 10 Mb for ELF and introduces the --max-size to change this
limit.
I've tried to keep the implementation non-intrusive.
The current logic we have is that we prepare section content in a buffer first and write
it to the output later. This patch checks the available limit on each writing attempt to this buffer
and stops writing when the limit is reached and raises the internal error flag.
Later, this flag is is checked before the actual writing to a file happens and
an error is reported.
Differential revision: https://reviews.llvm.org/D81258
Summary:
This patch splits the Attributor::run() function into multiple functions.
Simple Logic changes to make this possible:
# Moved iteration count verification earlier.
# NumFinalAAs get set a little bit later.
Reviewers: jdoerfert, sstefan1, uenoku
Reviewed By: jdoerfert
Subscribers: hiraditya, uenoku, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D81022