Rather than handling zlib handling manually, use find_package from CMake
to find zlib properly. Use this to normalize the LLVM_ENABLE_ZLIB,
HAVE_ZLIB, HAVE_ZLIB_H. Furthermore, require zlib if LLVM_ENABLE_ZLIB is
set to YES, which requires the distributor to explicitly select whether
zlib is enabled or not. This simplifies the CMake handling and usage in
the rest of the tooling.
This is a reland of abb0075 with all followup changes and fixes that
should address issues that were reported in PR44780.
Differential Revision: https://reviews.llvm.org/D79219
LTO builds have been creating invalid DWARF and one of the errors was a file index that was out of bounds. "llvm-dwarfdump --verify" will check all file indexes for line tables already, but there are no checks for the validity of file indexes in attributes.
The verification will verify if there is a DW_AT_decl_file/DW_AT_call_file that:
- there is a line table for the compile unit
- the file index is valid
- the encoding is appropriate
Tests are added that test all of the above conditions.
Differential Revision: https://reviews.llvm.org/D84817
This patch changes the functionality of AsmPrinter to name the basic block end labels as LBB_END${i}_${j}, with ${i} being the identifier for the function and ${j} being the identifier for the basic block. The new naming scheme is consistent with how basic block labels are named (.LBB${i}_{j}), and how function end symbol are named (.Lfunc_end${i}) and helps to write stronger tests for the upcoming patch for BB-Info section (as proposed in https://lists.llvm.org/pipermail/llvm-dev/2020-July/143512.html). The end label is used with basicblock-labels (BB-Info section in future) and basicblock-sections to compute the size of basic blocks and basic block sections, respectively. For BB sections, the section containing the entry basic block will not have a BB end label since it already gets the function end-label.
This label is cached for every basic block (CachedEndMCSymbol) like the label for the basic block (CachedMCSymbol).
Differential Revision: https://reviews.llvm.org/D83885
MSan removes readnone/readonly and similar attributes from callees,
because after MSan instrumentation those attributes no longer apply.
This change removes the attributes from call sites, as well.
Failing to do this may cause DSE of paramTLS stores before calls to
readonly/readnone functions.
Differential Revision: https://reviews.llvm.org/D85259
This reverts commit e9761688e4. It breaks the build:
```
~/src/llvm-project/llvm/lib/Analysis/IVDescriptors.cpp:868:10: error: no viable conversion from returned value of type 'SmallVector<[...], 8>' to function return type 'SmallVector<[...], 4>'
return ReductionOperations;
```
These were implementation detail, but become necessary for generic data
copying.
Also added const variations to them, and move assignment, since we had a
move ctor (and the move assignment helps in a subsequent patch).
Differential Revision: https://reviews.llvm.org/D85262
Arm MVE has multiple instructions such as VMLAVA.s8, which (in this
case) can take two 128bit vectors, sign extend the inputs to i32,
multiplying them together and sum the result into a 32bit general
purpose register. So taking 16 i8's as inputs, they can multiply and
accumulate the result into a single i32 without any rounding/truncating
along the way. There are also reduction instructions for plain integer
add and min/max, and operations that sum into a pair of 32bit registers
together treated as a 64bit integer (even though MVE does not have a
plain 64bit addition instruction). So giving the vectorizer the ability
to use these instructions both enables us to vectorize at higher
bitwidths, and to vectorize things we previously could not.
In order to do that we need a way to represent that the reduction
operation, specified with a llvm.experimental.vector.reduce when
vectorizing for Arm, occurs inside the loop not after it like most
reductions. This patch attempts to do that, teaching the vectorizer
about in-loop reductions. It does this through a vplan recipe
representing the reductions that the original chain of reduction
operations is replaced by. Cost modelling is currently just done through
a prefersInloopReduction TTI hook (which follows in a later patch).
Differential Revision: https://reviews.llvm.org/D75069
The `decode_relrs` helper is declared as:
`Expected<std::vector<Elf_Rel>> decode_relrs(Elf_Relr_Range relrs) const;`
it never returns an error though and hence can be simplified to return
a vector.
Differential revision: https://reviews.llvm.org/D85302
This quietly disabled use of zlib on Windows even when building with
-DLLVM_ENABLE_ZLIB=FORCE_ON.
> Rather than handling zlib handling manually, use find_package from CMake
> to find zlib properly. Use this to normalize the LLVM_ENABLE_ZLIB,
> HAVE_ZLIB, HAVE_ZLIB_H. Furthermore, require zlib if LLVM_ENABLE_ZLIB is
> set to YES, which requires the distributor to explicitly select whether
> zlib is enabled or not. This simplifies the CMake handling and usage in
> the rest of the tooling.
>
> This is a reland of abb0075 with all followup changes and fixes that
> should address issues that were reported in PR44780.
>
> Differential Revision: https://reviews.llvm.org/D79219
This reverts commit 10b1b4a231 and follow-ups
64d99cc6ab and
f9fec0447e.
Currently, we only test the `--stackmap` option here:
https://github.com/llvm/llvm-project/blob/master/llvm/test/Object/stackmap-dump.test
it uses a precompiled MachO binary currently and I've found no tests for this option for ELF.
The implementation also has issues. For example, it might assert on a wrong version
of the .llvm-stackmaps section. Or it might crash on an empty or truncated section.
This patch introduces a new tools/llvm-readobj/ELF test file as well as implements a few
basic checks to catch simple crashes/issues
It also eliminates `unwrapOrError` calls in `printStackMap()`.
Differential revision: https://reviews.llvm.org/D85208
Fixed the commit c35585e209.
This is a fix for the bug 46098 where PostDominatorTree
is unexpectedly changed by InstCombine's branch swapping
transformation.
This patch fixes PostDomTree builder. While looking for
the furthest away node in a reverse unreachable subgraph
this patch runs DFS with successors in their function order.
This order is indifferent to the order of successors, so is
the furthest away node.
Reviewers: kuhar, nikic, lebedev.ri
Differential Revision: https://reviews.llvm.org/D84763
This reverts commit c35585e209.
The MLIR is broken with this patch, reproduce by adding
-DLLVM_ENABLE_PROJECTS=mlir to the cmake configuration and
build `ninja tools/mlir/lib/IR/CMakeFiles/obj.MLIRIR.dir/Dominance.cpp.o`
This is one more NFC part extracted from D79485. Normal and SCC based loops have very different representation and have to be handled separatly each time we deal with loops. D79485 is going to introduce much more extensive use of loops what will be problematic with out this change.
Reviewed By: davidxl
Differential Revision: https://reviews.llvm.org/D84838
This is another fix for the bug 46098 where PostDominatorTree
is unexpectedly changed by InstCombine's branch swapping
transformation.
This patch fixes PostDomTree builder. While looking for
the furthest away node in a reverse unreachable subgraph
this patch runs DFS with successors in their function order.
This order is indifferent to the order of successors, so is
the furthest away node.
Reviewers: kuhar, nikic, lebedev.ri
Differential Revision: https://reviews.llvm.org/D84763
Added a mechanism to check the element type, get the total element
count, and the size of an element.
Differential Revision: https://reviews.llvm.org/D85250
This patch simplified IR generation for __builtin_btf_type_id().
For __builtin_btf_type_id(obj, flag), previously IR builtin
looks like
if (obj is a lvalue)
llvm.bpf.btf.type.id(obj.ptr, 1, flag) !type
else
llvm.bpf.btf.type.id(obj, 0, flag) !type
The purpose of the 2nd argument is to differentiate
__builtin_btf_type_id(obj, flag) where obj is a lvalue
vs.
__builtin_btf_type_id(obj.ptr, flag)
Note that obj or obj.ptr is never used by the backend
and the `obj` argument is only used to derive the type.
This code sequence is subject to potential llvm CSE when
- obj is the same .e.g., nullptr
- flag is the same
- metadata type is different, e.g., typedef of struct "s"
and strust "s".
In the above, we don't want CSE since their metadata is different.
This patch change IR builtin to
llvm.bpf.btf.type.id(seq_num, flag) !type
and seq_num is always increasing. This will prevent potential
llvm CSE.
Also report an error if the type name is empty for
remote relocation since remote relocation needs non-empty
type name to do relocation against vmlinux.
Differential Revision: https://reviews.llvm.org/D85174
This is the last remaining use of ConstantProp, migrate it to InstSimplify in the goal of removing ConstantProp.
Add -hexagon-instsimplify option to enable skipping of instsimplify in
tests that can't handle the extra optimization.
Differential Revision: https://reviews.llvm.org/D85047
Get the argument register and ensure there's a copy to the virtual
register. AMDGPU and AArch64 have similarish code to get the livein
value, and I also want to use this in multiple places.
This is a bit more aggressive about setting the register class than
the original function, but that's probably OK.
I think we're missing a few verifier checks for function live ins. I
noticed AArch64's calling convention code is not actually adding
liveins to functions, only the entry block (which apparently might not
matter that much?). There should probably be a verifier check that
entry block live ins are also live into the function. We also might
need a verifier check that the copy to the livein virtual register is
in the entry block.
Teach SCCP to create notconstant lattice values from inequality
assumes and nonnull metadata, and update getConstant() to make
use of them. Additionally isOverdefined() needs to be changed to
consider notconstant an overdefined value.
Handling inequality branches is delayed until our branch on undef
story in other passes has been improved.
Differential Revision: https://reviews.llvm.org/D83643
This patch stops unconditionally transforming FSUB(-0, X) into an FNEG(X) while building the MIR.
This corresponds with the SelectionDAGISel change in D84056.
Differential Revision: https://reviews.llvm.org/D85139
for the advantage outlined by D83639 ([OptTable] Support grouped short options)
Some behavior changes:
* -i={0,false} is removed. Use --no-inlines instead.
* --demangle={0,false} is removed. Use --no-demangle instead
* -untag-addresses={0,false} is removed. Use --no-untag-addresses instead
Added a higher level API OptTable::parseArgs which handles optional
initial options populated from an environment variable, expands response
files recursively, and parses options.
Reviewed By: jhenderson
Differential Revision: https://reviews.llvm.org/D83530
This patch added the following additional compile-once
run-everywhere (CO-RE) relocations:
- existence/size of typedef, struct/union or enum type
- enum value and enum value existence
These additional relocations will make CO-RE bpf programs more
adaptive for potential kernel internal data structure changes.
For existence/size relocations, the following two code patterns
are supported:
1. uint32_t __builtin_preserve_type_info(*(<type> *)0, flag);
2. <type> var;
uint32_t __builtin_preserve_field_info(var, flag);
flag = 0 for existence relocation and flag = 1 for size relocation.
For enum value existence and enum value relocations, the following code
pattern is supported:
uint64_t __builtin_preserve_enum_value(*(<enum_type> *)<enum_value>,
flag);
flag = 0 means existence relocation and flag = 1 for enum value.
relocation. In the above <enum_type> can be an enum type or
a typedef to enum type. The <enum_value> needs to be an enumerator
value from the same enum type. The return type is uint64_t to
permit potential 64bit enumerator values.
Differential Revision: https://reviews.llvm.org/D83242
The CFA is calculated as (SP/FP + offset), but when there are
SVE objects on the stack the SP offset is partly scalable and
should instead be expressed as the DWARF expression:
SP + offset + scalable_offset * VG
where VG is the Vector Granule register, containing the
number of 64bits 'granules' in a scalable vector.
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D84043
When mapping an optional value, if the value is <none> and followed
by comments, there will be a parsing error. This patch helps fix this
issue.
e.g.,
When mapping the following YAML,
```
Sections:
- Name: blah
Type: SHT_foo
Flags: [[FLAGS=<none>]] ## some comments.
```
the raw value of `ScalarNode` is "<none> " rather than "<none>". We need
to remove the spaces.
Differential Revision: https://reviews.llvm.org/D85180
Adds the function createMCInst() to MCContext that creates a MCInst using
a typed bump alloctor.
MCInst contains a SmallVector<MCOperand, 8>. The SmallVector is POD only
for <= 8 operands. The default untyped bump pointer allocator of MCContext
does not delete the MCInst, so if the SmallVector grows, it's a leak.
This fixes https://bugs.llvm.org/show_bug.cgi?id=46900.
Merging alias results from different paths, when a path did phi
translation is not necesarily correct. Conservatively terminate such paths.
Aimed to fix PR46156.
Differential Revision: https://reviews.llvm.org/D84905
Specified in https://github.com/WebAssembly/simd/pull/237, these
instructions load the first vector lane from memory and zero the other
lanes. Since these instructions are not officially part of the SIMD
proposal, they are only available on an opt-in basis via LLVM
intrinsics and clang builtin functions. If these instructions are
merged to the proposal, this implementation will change so that the
instructions will be generated from normal IR. At that point the
intrinsics and builtin functions would be removed.
This PR also changes the opcodes for the experimental f32x4.qfm{a,s}
instructions because their opcodes conflicted with those of the
v128.load{32,64}_zero instructions. The new opcodes were chosen to
match those used in V8.
Differential Revision: https://reviews.llvm.org/D84820
Extend the memop value profile buckets to be more flexible (could accommodate a
mix of individual values and ranges) and to cover more value ranges (from 11 to
22 buckets).
Disabled behind a flag (to be enabled separately) and the existing code to be
removed later.
Differential Revision: https://reviews.llvm.org/D81682
The strictfp attribute is required on all function calls in a function
that is itself marked with the strictfp attribute. The IRBuilder knows
this and has a method for adding the attribute to function call instructions.
If a function being called has the strictfp attribute itself then the
IRBuilder will refuse to add the attribute to the calling instruction
despite being asked to add it. Eliminate this error.
Differential Revision: https://reviews.llvm.org/D84878
A JSON->TensorSpec utility we will use subsequently to specify
additional outputs needed for certain training scenarios.
Differential Revision: https://reviews.llvm.org/D84976
`DenseMapAPIntKeyInfo` is now located in `lib/IR/LLVMContextImpl.h`.
Moved it into `include/ADT/DenseMapInfo.h` to use it.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D85131
Use pad with undef and unmerge with unused results. This is annoyingly
similar to several other places in LegalizerHelper, but they're all
slightly different.
It implements an approach suggested in the D84398 thread.
With it the following:
```
Sections:
- Name: .bar
Type: SHT_PROGBITS
Offset: [[MACRO=<none>]]
```
works just like the `Offset` key was not specified.
It is useful for tests that want to have a default value for a field and to
have a way to override it at the same time.
Differential revision: https://reviews.llvm.org/D84526
As discussed on D81500, this adds a more general ElementCount variant of the build helper and converts the (non-scalable) unsigned NumElts variant to use it internally.
For AMDGPU, vectors with elements < 32 bits should be indexed in
32-bit elements and the desired bits extracted from there. For
elements > 64-bits, these should be reduce to 64/32 elements to enable
the normal dynamic indexing paths.
In the dynamic index cases, this produces shorter code most of the
time. This does immediately regress the constant index cases, but this
should be fixed once we have the most basic of shift combines.
The element size > 64 case is pretty much ported from the exisiting
DAG implementation for extract element promote. The increasing element
size case is new.
Try to be more consistent with the SDLoc param in the TargetLowering methods.
This also exposes an issue where we were passing a SDNode as a SDLoc, relying on the implicit SDLoc(SDNode) constructor.
This is a split patch of D80991.
This patch introduces AAPotentialValues and its interface only.
For more detail of AAPotentialValues abstract attribute, see the original patch.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D83283
formLCSSAForInstructions is used by SCEVExpander, which tracks all
inserted instructions including LCSSA phis using asserting value
handles. This means cleanup needs to happen in the caller.
Extend formLCSSAForInstructions to take an optional pointer to a
vector. If this argument is non-nullptr, instead of directly deleting
the phis, add them to the vector, so the caller can process them.
This should address various PPC buildbot failures, including
http://lab.llvm.org:8011/builders/clang-ppc64be-linux-lnt/builds/40567
Use IRBuilder instead PHINode::Create. This should not impact the
generated code, but IRBuilder provides a way to register callbacks for
inserted instructions, which is convenient for some users.
Reviewed By: lebedev.ri
Differential Revision: https://reviews.llvm.org/D85037
The word "dependency graph" is a bit misleading. When there is an
edge from node A to B (A -> B), it actually mean that B depends on
A and when the state of A is updated, B should also be updated. So
I update the comment to make the description clearer.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D85065
querying getSCEV() for incomplete phis leads to wrong cache value in `ExprToIVMap`,
because incomplete phis may be simplified to same value before get SCEV expression.
Reviewed By: lebedev.ri, mkazantsev
Differential Revision: https://reviews.llvm.org/D77560
Summary: This patch separates the Loop Peeling Utilities from Loop Unrolling.
The reason for this change is that Loop Peeling is no longer only being used by
loop unrolling; Patch D82927 introduces loop peeling with fusion, such that
loops can be modified to have to same trip count, making them legal to be
peeled.
Reviewed By: Meinersbur
Differential Revision: https://reviews.llvm.org/D83056
https://reviews.llvm.org/D84909
Patch adds two new GICombinerRules, one for G_INTTOPTR and one for
G_PTRTOINT. The G_INTTOPTR elides ptr2int(int2ptr(x)) to a copy of x, if
the cast is within the same address space. The G_PTRTOINT elides
int2ptr(ptr2int(x)) to a copy of x. Patch additionally adds new combiner
tests for the AArch64 target to test these new combiner rules.
Patch by mkitzan
Refactoring `Slice` class and function `createUniversalBinary` from
`llvm-lipo` into MachOUniversalWriter. This refactoring is necessary so
as to use the refactored code for creating universal binaries under
llvm-libtool-darwin.
Reviewed by alexshap, smeenai
Differential Revision: https://reviews.llvm.org/D84662
This patch makes the 'debug_aranges' entry optional. If the entry is
empty, yaml2obj will only emit the header for it.
Reviewed By: jhenderson
Differential Revision: https://reviews.llvm.org/D84921
In this patch, we add a helper function getDWARFEmitterByName(). This
function returns the proper DWARF section emitting method by the name.
Reviewed By: jhenderson
Differential Revision: https://reviews.llvm.org/D84952
In this patch, emitDebugPubnames(), emitDebugPubtypes(),
emitDebugGNUPubnames(), emitDebugGNUPubtypes() are added.
Reviewed By: jhenderson
Differential Revision: https://reviews.llvm.org/D85003
It turned out that the D78704 included a private LLVM header, which is excluded
from the LLVM install target.
I'm substituting that `#include` with the public one by moving the necessary
`#define` into that. There was a discussion about this at D78704 and on the
cfe-dev mailing list.
I'm also placing a note to remind others of this pitfall.
Reviewed By: mgorny
Differential Revision: https://reviews.llvm.org/D84929
is enabled.
When -sample-profile-merge-inlinee is enabled, new FunctionSamples may be
created during profile merge without GUIDToFuncNameMap being initialized.
That will occasionally cause compiler crash. The patch fixes it.
Differential Revision: https://reviews.llvm.org/D84994
If an analysis is actually invalidated, there's already a log statement
for that: 'Invalidating analysis: FooAnalysis'.
Otherwise the statement is not very useful.
Reviewed By: asbirlea, ychen
Differential Revision: https://reviews.llvm.org/D84981
findAllocaForValue uses AllocaForValue to cache resolved values.
The function is used only to resolve arguments of lifetime
intrinsic which usually are not fare for allocas. So result reuse
is likely unnoticeable.
In followup patches I'd like to replace the function with
GetUnderlyingObjects.
Depends on D84616.
Differential Revision: https://reviews.llvm.org/D84617
This patch addes time trace functionality to have a better understanding
of the analysis times.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D84980
The -harness option enables new testing use-cases for llvm-jitlink. It takes a
list of objects to treat as a test harness for any regular objects passed to
llvm-jitlink.
If any files are passed using the -harness option then the following
transformations are applied to all other files:
(1) Symbols definitions that are referenced by the harness files are promoted
to default scope. (This enables access to statics from test harness).
(2) Symbols definitions that clash with definitions in the harness files are
deleted. (This enables interposition by test harness).
(3) All other definitions in regular files are demoted to local scope.
(This causes untested code to be dead stripped, reducing memory cost and
eliminating spurious unresolved symbol errors from untested code).
These transformations allow the harness files to reference and interpose
symbols in the regular object files, which can be used to support execution
tests (including fuzz tests) of functions in relocatable objects produced by a
build.
This allows clients to detect invalid transformations applied by JITLink passes
(e.g. inserting or removing symbols in unexpected ways) and terminate linking
with an error.
This change is used to simplify the error propagation logic in
ObjectLinkingLayer.
Problem:
Right now, our "Running pass" is not accurate when passes are wrapped in adaptor because adaptor is never skipped and a pass could be skipped. The other problem is that "Running pass" for a adaptor is before any "Running pass" of passes/analyses it depends on. (for example, FunctionToLoopPassAdaptor). So the order of printing is not the actual order.
Solution:
Doing things like PassManager::Debuglogging is very intrusive because we need to specify Debuglogging whenever adaptor is created. (Actually, right now we're not specifying Debuglogging for some sub-PassManagers. Check PassBuilder)
This patch move debug logging for pass as a PassInstrument callback. We could be sure that all running passes are logged and in the correct order.
This could also be used to implement hierarchy pass logging in legacy PM. We could also move logging of pass manager to this if we want.
The test fixes looks messy. It includes changes:
- Remove PassInstrumentationAnalysis
- Remove PassAdaptor
- If a PassAdaptor is for a real pass, the pass is added
- Pass reorder (to the correct order), related to PassAdaptor
- Add missing passes (due to Debuglogging not passed down)
Reviewed By: asbirlea, aeubanks
Differential Revision: https://reviews.llvm.org/D84774
We need to keep track of the alloca insertion point (which we already
communicate via the callback to the user) as we place allocas as well.
Reviewed By: fghanim, SouraVX
Differential Revision: https://reviews.llvm.org/D82470
As far as I know, ipconstprop has not been used in years and ipsccp has
been used instead. This has the potential for confusion and sometimes
leads people to spend time finding & reporting bugs as well as
updating it to work with the latest API changes.
This patch moves the tests over to SCCP. There's one functional difference
I am aware of: ipconstprop propagates for each call-site individually, so
for functions that are called with different constant arguments it can sometimes
produce better results than ipsccp (at much higher compile-time cost).But
IPSCCP can be thought to do so as well for internal functions and as mentioned
earlier, the pass seems unused in practice (and there are no plans on working
towards enabling it anytime).
Also discussed on llvm-dev:
http://lists.llvm.org/pipermail/llvm-dev/2020-July/143773.html
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D84447
This patch makes the 'Length' field of the address range table optional.
Reviewed By: jhenderson
Differential Revision: https://reviews.llvm.org/D84911
This patch makes the 'AddressSize' and 'SegmentSelectorSize' fields of
address range table optional.
Reviewed By: jhenderson
Differential Revision: https://reviews.llvm.org/D84907
This change define RAII class `FileLocker` and methods `lock` and
`tryLockFor` of the class `raw_fd_stream` to facilitate using file locks.
Differential Revision: https://reviews.llvm.org/D79066
I still think it's highly questionable that we have two intrinsics
with identical behavior and only vary by the name of the libcall used
if it happens to be lowered that way, but try to reduce the feature
delta between SDAG and GlobalISel for recently added intrinsics. I'm
not sure which opcode should be considered the canonical one, but
lower roundeven back to round.
Further abstracting the specification of a tensor, to more easily
support different types and shapes of tensor, and also to perform
initialization up-front, at TFModelEvaluator construction time.
Differential Revision: https://reviews.llvm.org/D84685
This adds a common API for compute constant ranges of intrinsics.
The intention here is that
a) we can reuse the same code across different passes that handle
constant ranges, i.e. this can be reused in SCCP
b) we only have to add knowledge about supported intrinsics to
ConstantRange, not any consumers.
Differential Revision: https://reviews.llvm.org/D84587
This patch supports the situation where caller does not have a valid TOC and
calls using the R_PPC64_REL24_NOTOC relocation and the callee is not DSO local.
In this case the call cannot be made directly since the callee may or may not
require a valid TOC pointer. As a result this situation require a PC-relative
plt stub to set up r12.
Reviewed By: sfertile, MaskRay, stefanp
Differential Revision: https://reviews.llvm.org/D83669