An inbounds GEP may still cross the sign boundary, so signed icmps
cannot be folded (https://alive2.llvm.org/ce/z/XSgi4D). This was
previously fixed for other folds in this function, but this one
was missed.
New method `FCmpInst::compare` is added, which evaluates the given
compare predicate for constant operands. Interface is made similar to
`ICmpInst::compare`.
Differential Revision: https://reviews.llvm.org/D116168
The recursive implementation can run into stack overflows, e.g. like in PR52844.
The order the users are visited changes, but for the current use case
this only impacts the order error messages are emitted.
With Control-Flow Integrity (CFI), the LowerTypeTests pass replaces
function references with CFI jump table references, which is a problem
for low-level code that needs the address of the actual function body.
For example, in the Linux kernel, the code that sets up interrupt
handlers needs to take the address of the interrupt handler function
instead of the CFI jump table, as the jump table may not even be mapped
into memory when an interrupt is triggered.
This change adds the no_cfi constant type, which wraps function
references in a value that LowerTypeTestsModule::replaceCfiUses does not
replace.
Link: https://github.com/ClangBuiltLinux/linux/issues/1353
Reviewed By: nickdesaulniers, pcc
Differential Revision: https://reviews.llvm.org/D108478
It was already possible to create an AttributeList from an Index
and an AttributeSet. However, this would actually end up using
the implicit constructor on AttrBuilder, thus doing an unnecessary
conversion from AttributeSet to AttrBuilder to AttributeSet.
Instead we can accept the AttributeSet directly, as that is what
we need anyway.
As requested in D115787, I've added a test for LLVMConstGEP2 and
LLVMConstInBoundsGEP2. However, to make this work in the echo test,
I also had to change a couple of APIs to work on GEP operators,
rather than only GEP instructions.
Differential Revision: https://reviews.llvm.org/D115858
Weirdly, the opaque pointer compatible variants LLVMConstGEP2 and
LLVMConstInBoundsGEP2 were already declared in the header, but not
actually implemented. This adds the missing implementations and
deprecates the incompatible functions.
Differential Revision: https://reviews.llvm.org/D115787
These are deprecated and should be replaced with getAlign().
Some of these asserts don't do anything because Load/Store/AllocaInst never have a 0 align value.
These flags are documented as generating poison values for particular input values. As such, we should really be consistent about their handling with how we handle nsw/nuw/exact/inbounds.
Differential Revision: https://reviews.llvm.org/D115460
The vp.load and vp.gather intrinsics require the intrinsic return
type to determine the correct function signature. With opaque pointers,
it cannot be derived from the parameter pointee types.
Differential Revision: https://reviews.llvm.org/D115632
Rust allows enums to be scopes, as shown by the previous change. Sadly,
D111770 disallowed enums-as-scopes in the LLVM Verifier, which means
that LLVM HEAD stopped working for Rust compiles. As a result, we back
out the verifier part of D111770 with a modification to the testcase so
we don't break this in the future.
The testcase is now actual IR from rustc at commit 8f8092cc3, which is
the nightly as of 2021-09-28. I would expect rustc 1.57 to produce
similar or identical IR if someone wants to reproduce this IR in the
future with minimal changes. A recipe for reproducing the IR using rustc
is included in the test file.
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D115353
This exposes the core logic of getGEPIndicesForOffset() as a
getGEPIndexForOffset() method that only returns a single offset,
instead of following the whole chain.
Reverts 02940d6d22. Fixes breakage in the modules build.
LLVM loops cannot represent irreducible structures in the CFG. This
change introduce the concept of cycles as a generalization of loops,
along with a CycleInfo analysis that discovers a nested
hierarchy of such cycles. This is based on Havlak (1997), Nesting of
Reducible and Irreducible Loops.
The cycle analysis is implemented as a generic template and then
instatiated for LLVM IR and Machine IR. The template relies on a new
GenericSSAContext template which must be specialized when used for
each IR.
This review is a restart of an older review request:
https://reviews.llvm.org/D83094
Original implementation by Nicolai Hähnle <nicolai.haehnle@amd.com>,
with recent refactoring by Sameer Sahasrabuddhe <sameer.sahasrabuddhe@amd.com>
Differential Revision: https://reviews.llvm.org/D112696
Currently, it is impossible to specify a DataLayout with pointer
size and index size that is not a whole number of bytes.
This patch modifies
the DataLayout class to accept arbitrary pointer sizes and to
store the size as a number of bits, rather than as a number of bytes.
Generally speaking, the external interface of the class as used
by in-tree architectures remains the same and shouldn't affect the
behavior of architecures with pointer sizes equal to a whole number
of bytes.
Note the interface of setPointerAlignment has changed and takes
a pointer and index size that is a number of bits, rather than a number
of bytes.
Patch originally by Ajit Kumar Agarwal
Differential Revision: https://reviews.llvm.org/D114141
This patch extends LLVM IR to add metadata that can be used to emit macho files with two build version load commands.
It utilizes "darwin.target_variant.triple" and "darwin.target_variant.SDK Version" metadata names for that,
which will be set by a future patch in clang.
MachO uses two build version load commands to represent an object file / binary that is targeting both the macOS target,
and the Mac Catalyst target. At runtime, a dynamic library that supports both targets can be loaded from either a native
macOS or a Mac Catalyst app on a macOS system. We want to add support to this to upstream to LLVM to be able to build
compiler-rt for both targets, to finish the complete support for the Mac Catalyst platform, which is right now targetable
by upstream clang, but the compiler-rt bits aren't supported because of the lack of this multiple build version support.
Differential Revision: https://reviews.llvm.org/D112189
The default for min is changed to 1. The behaviour of -mvscale-{min,max}
in Clang is also changed such that 16 is the max vscale when targeting
SVE and no max is specified.
Reviewed By: sdesmalen, paulwalker-arm
Differential Revision: https://reviews.llvm.org/D113294
LLVM loops cannot represent irreducible structures in the CFG. This
change introduce the concept of cycles as a generalization of loops,
along with a CycleInfo analysis that discovers a nested
hierarchy of such cycles. This is based on Havlak (1997), Nesting of
Reducible and Irreducible Loops.
The cycle analysis is implemented as a generic template and then
instatiated for LLVM IR and Machine IR. The template relies on a new
GenericSSAContext template which must be specialized when used for
each IR.
This review is a restart of an older review request:
https://reviews.llvm.org/D83094
Original implementation by Nicolai Hähnle <nicolai.haehnle@amd.com>,
with recent refactoring by Sameer Sahasrabuddhe <sameer.sahasrabuddhe@amd.com>
Differential Revision: https://reviews.llvm.org/D112696
This patch implements the intrinsic for ref.null.
In the process of implementing int_wasm_ref_null_func() and
int_wasm_ref_null_extern() intrinsics, it removes the redundant
HeapType.
This also causes the textual assembler syntax for ref.null to
change. Instead of receiving an argument: `func` or `extern`, the
instruction mnemonic is either ref.null_func or ref.null_extern,
without the need for a further operand.
Reviewed By: tlively
Differential Revision: https://reviews.llvm.org/D114979
Avoid the use of deprecated (opaque pointer incompatible) APIs
in C API tests, in preparation for header deprecation. Add a
LLVMGetGEPSourceElementType() to cover a bit of functionality
that is necessary for the echo test.
This change is split out from https://reviews.llvm.org/D114936.
Add generic support for vec3 types, and in particular define
llvm_v3f32_ty which will be used by AMDGPU's
llvm.amdgcn.image.bvh.intersect.ray intrinsic.
Differential Revision: https://reviews.llvm.org/D114956
As mentioned in D106585, this causes non-determinism, which can also be
shown by this test case being flaky without this patch.
We were using the APSInt's bit width for hashing, but not for checking
for equality. APInt::isSameValue() does not check bit width.
Reviewed By: rnk
Differential Revision: https://reviews.llvm.org/D115054
Helps appease MSVC which is complaining about "fatal error C1061: compiler limit: blocks nested too deeply" - we already do the same thing for avx512.mask.store intrinsics.
This is only a stopgap solution until another else-if case needs adding - we really need to refactor this chain of ifs properly.
Try to appease the microsoft compiler which is apparently running out of
if statements. Separate the new ARM code into a separate function to
keep it simpler.
I'm not having a lot of luck with the microosft compiler recently. Maybe
this will help it with its errors:
llvm\lib\IR\AutoUpgrade.cpp(3726): fatal error C1061: compiler limit: blocks nested too deeply
If not, it's a good code cleanup anyway.
This adjusts all the MVE and CDE intrinsics now that v2i1 is a legal
type, to use a <2 x i1> as opposed to emulating the predicate with a
<4 x i1>. The v4i1 workarounds have been removed leaving the natural
v2i1 types, notably in vctp64 which now generates a v2i1 type.
AutoUpgrade code has been added to upgrade old IR, which needs to
convert the old v4i1 to a v2i1 be converting it back and forth to an
integer with arm.mve.v2i and arm.mve.i2v intrinsics. These should be
optimized away in the final assembly.
Differential Revision: https://reviews.llvm.org/D114455
Deprecate LLVMAddAlias in favor of LLVMAddAlias2, which accepts a
value type and an address space. Previously these were extracted
from the pointer type.
Differential Revision: https://reviews.llvm.org/D114860
When creating a new DIBuilder with an existing DICompileUnit, load the
DINodes from the current DICompileUnit so they don't get overwritten.
This is done in the MachineOutliner pass, but it didn't change the CU so
the bug never appeared. We need this if we ever want to add DINodes to
the CU after it has been created, e.g., DIGlobalVariables.
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D114556
In-loop vector reductions which use the llvm.fmuladd intrinsic involve
the creation of two recipes; a VPReductionRecipe for the fadd and a
VPInstruction for the fmul. If the call to llvm.fmuladd has fast-math flags
these should be propagated through to the fmul instruction, so an
interface setFastMathFlags has been added to the VPInstruction class to
enable this.
Differential Revision: https://reviews.llvm.org/D113125
Usage and naming of macros in VPIntrinsics.def has been inconsistent. Rename all property macros to VP_PROPERTY_<name>. Use BEGIN/END scope macros to attach properties to vp intrinsics and SDNodes (instead of specifying either directly with the property macro).
A follow-up patch has documentation on how the macros are (intended) to be used.
Reviewed By: frasercrmck
Differential Revision: https://reviews.llvm.org/D114144
Currently the stepvector intrinsic only supports element types that
are integers of size 8 bits or more. This patch adds support for the
creation of stepvectors with smaller element types by creating
the intrinsic with i8 elements that we then truncate to the requested
size.
It's not currently possible to write a vectoriser test to exercise
this code path so I have added a unit test here:
llvm/unittests/IR/IRBuilderTest.cpp
Differential Revision: https://reviews.llvm.org/D113767
ProfileCount could model invalid values, but a user had no indication
that the getCount method could return bogus data. Optional<ProfileCount>
addresses that, because the user must dereference the optional. In
addition, the patch removes concept duplication.
Differential Revision: https://reviews.llvm.org/D113839
This trivial patch runs clang-format on some unformatted files before
doing logic changes and prevent hard to review diffs.
Differential Revision: https://reviews.llvm.org/D113572
When creating a splat of 0 for scalable vectors we tend to create them
with using a combination of shufflevector and insertelement, i.e.
shufflevector (<vscale x 4 x i32> insertelement (<vscale x 4 x i32> poison, i32 0, i32 0),
<vscale x 4 x i32> poison, <vscale x 4 x i32> zeroinitializer)
However, for the case of a zero splat we can actually just replace the
above with zeroinitializer instead. This makes the IR a lot simpler and
easier to read. I have changed ConstantFoldShuffleVectorInstruction to
use zeroinitializer when creating a splat of integer 0 or FP +0.0 values.
Differential Revision: https://reviews.llvm.org/D113394
This patch introduces a new abstract attributor instance that propagates
assumption information from functions. Conceptually, if a function is
only called by functions that have certain assumptions, then we can
apply the same assumptions to that function. This problem is similar to
calculating the dominator set, but the assumptions are merged instead of
nodes.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D111054
Add UNIQUED and DISTINCT properties in Metadata.def and use them to
implement restrictions on the `distinct` property of MDNodes:
* DIExpression can currently be parsed from IR or read from bitcode
as `distinct`, but this property is silently dropped when printing
to IR. This causes accepted IR to fail to round-trip. As DIExpression
appears inline at each use in the canonical form of IR, it cannot
actually be `distinct` anyway, as there is no syntax to describe it.
* Similarly, DIArgList is conceptually always uniqued. It is currently
restricted to only appearing in contexts where there is no syntax for
`distinct`, but for consistency it is treated equivalently to
DIExpression in this patch.
* DICompileUnit is already restricted to always being `distinct`, but
along with adding general support for the inverse restriction I went
ahead and described this in Metadata.def and updated the parser to be
general. Future nodes which have this restriction can share this
support.
The new UNIQUED property applies to DIExpression and DIArgList, and
forbids them to be `distinct`. It also implies they are canonically
printed inline at each use, rather than via MDNode ID.
The new DISTINCT property applies to DICompileUnit, and requires it to
be `distinct`.
A potential alternative change is to forbid the non-inline syntax for
DIExpression entirely, as is done with DIArgList implicitly by requiring
it appear in the context of a function. For example, we would forbid:
!named = !{!0}
!0 = !DIExpression()
Instead we would only accept the equivalent inlined version:
!named = !{!DIExpression()}
This essentially removes the ability to create a `distinct` DIExpression
by construction, as there is no syntax for `distinct` inline. If this
patch is accepted as-is, the result would be that the non-canonical
version is accepted, but the following would be an error and produce a diagnostic:
!named = !{!0}
; error: 'distinct' not allowed for !DIExpression()
!0 = distinct !DIExpression()
Also update some documentation to consistently use the inline syntax for
DIExpression, and to describe the restrictions on `distinct` for nodes
where applicable.
Reviewed By: StephenTozer, t-tye
Differential Revision: https://reviews.llvm.org/D104827
For some optimizations on comparisons it's necessary that the
union/intersect is exact and not a superset. Add methods that
return Optional<ConstantRange> only if the result is exact.
For the sake of simplicity this is implemented by comparing
the subset and superset approximations for now, but it should be
possible to do this more directly, as unionWith() and intersectWith()
already distinguish the cases where the result is imprecise for the
preferred range type functionality.
When accumulating the GEP offset in BasicAA, we should use the
pointer index size rather than the pointer size.
Differential Revision: https://reviews.llvm.org/D112370
Add a variant of getEquivalentICmp() that produces an optional
offset. This allows us to create an equivalent icmp for all ranges.
Use this in the with.overflow folding code, which was doing this
adjustment separately -- this clarifies that the fold will indeed
always apply.
These `M` parameters shadow the `M` member in `VerifierSupport`, and
both always refer to the same module. Eliminate the redundant parameters
and always use the member.
Reviewed By: dexonsmith
Differential Revision: https://reviews.llvm.org/D106474
When we have an actual shuffle, we can impose the additional restriction
that the mask replicates the elements of the first operand, so we know
the replication factor as a ratio of output and op0 vector sizes.
This is to revert commit f95bd18b5f (Revert "[Attr] support
btf_type_tag attribute") plus a bug fix.
Previous change failed to handle cases like below:
$ cat reduced.c
void a(*);
void a() {}
$ clang -c reduced.c -O2 -g
In such cases, during clang IR generation, for function a(),
CGCodeGen has numParams = 1 for FunctionType. But for
FunctionTypeLoc we have FuncTypeLoc.NumParams = 0. By using
FunctionType.numParams as the bound to access FuncTypeLoc
params, a random crash is triggered. The bug fix is to
check against FuncTypeLoc.NumParams before accessing
FuncTypeLoc.getParam(Idx).
Differential Revision: https://reviews.llvm.org/D111199
Avid readers of this saga may recall from previous installments,
that replication mask replicates (lol) each of the `VF` elements
in a vector `ReplicationFactor` times. For example, the mask for
`ReplicationFactor=3` and `VF=4` is: `<0,0,0,1,1,1,2,2,2,3,3,3>`.
More importantly, replication mask is used by LoopVectorizer
when using masked interleaved memory operations.
As discussed in previous installments, while it is used by LV,
and we **seem** to support masked interleaved memory operations on X86,
it's support in cost model leaves a lot to be desired:
until basically yesterday even for AVX512 we had no cost model for it.
As it has been witnessed in the recent
AVX2 `X86TTIImpl::getInterleavedMemoryOpCost()`
costmodel patches, while it is hard-enough to query the cost
of a particular assembly sequence [from llvm-mca],
afterwards the check lines LV costmodel tests must be updated manually.
This is, at the very least, boring.
Okay, now we have decent costmodel coverage for interleaving shuffles,
but now basically the same mind-killing sequence has to be performed
for replication mask. I think we can improve at least the second half
of the problem, by teaching
the `TargetTransformInfoImplCRTPBase::getUserCost()` to recognize
`Instruction::ShuffleVector` that are repetition masks,
adding exhaustive test coverage
using `-cost-model -analyze` + `utils/update_analyze_test_checks.py`
This way we can have good exhaustive coverage for cost model,
and only basic coverage for the LV costmodel.
This patch adds precise undef-aware `isReplicationMask()`,
with exhaustive test coverage.
* `InstructionsTest.ShuffleMaskIsReplicationMask` shows that
it correctly detects all the known masks.
* `InstructionsTest.ShuffleMaskIsReplicationMask_undef`
shows that replacing some mask elements in a known replication mask
still allows us to recognize it as a replication mask.
Note, with enough undef elts, we may detect a different tuple.
* `InstructionsTest.ShuffleMaskIsReplicationMask_Exhaustive_Correctness`
shows that if we detected the replication mask with given params,
then if we actually generate a true replication mask with said params,
it matches element-wise ignoring undef mask elements.
Reviewed By: spatel
Differential Revision: https://reviews.llvm.org/D113214
This reverts commits 737e4216c5 and
ce7ac9e66a.
After those commits, the compiler can crash with a reduced
testcase like this:
$ cat reduced.c
void a(*);
void a() {}
$ clang -c reduced.c -O2 -g
This patch added clang codegen and llvm support
for btf_type_tag support. Currently, btf_type_tag
attribute info is preserved in DebugInfo IR only for
pointer types associated with typedef, global variable
and function declaration. Eventually, such information
is emitted to dwarf.
The following is an example:
$ cat test.c
#define __tag __attribute__((btf_type_tag("tag")))
int __tag *g;
$ clang -O2 -g -c test.c
$ llvm-dwarfdump --debug-info test.o
...
0x0000001e: DW_TAG_variable
DW_AT_name ("g")
DW_AT_type (0x00000033 "int *")
DW_AT_external (true)
DW_AT_decl_file ("/home/yhs/test.c")
DW_AT_decl_line (2)
DW_AT_location (DW_OP_addr 0x0)
0x00000033: DW_TAG_pointer_type
DW_AT_type (0x00000042 "int")
0x00000038: DW_TAG_LLVM_annotation
DW_AT_name ("btf_type_tag")
DW_AT_const_value ("tag")
0x00000041: NULL
0x00000042: DW_TAG_base_type
DW_AT_name ("int")
DW_AT_encoding (DW_ATE_signed)
DW_AT_byte_size (0x04)
0x00000049: NULL
Basically, a DW_TAG_LLVM_annotation tag will be inserted
under DW_TAG_pointer_type tag if that pointer has a btf_type_tag
associated with it.
Differential Revision: https://reviews.llvm.org/D111199
The patch in D112349 added a previously nonexistant restriction on ifunc
resolvers that they MUST be defintions. However, the function
multiversioning depends on being able to resolve these resolvers at
link-time, so this additional restriction was breaking.
Implement two builtins to pack/unpack IBM extended long double float,
according to GCC 'Basic PowerPC Builtin Functions Available ISA 2.05'.
Reviewed By: jsji
Differential Revision: https://reviews.llvm.org/D112055
When a constant expression CE is being converted into a corresponding instruction I,
CE is supposed to be replaced by I. However, it is possible that CE is being used multiple
times within a parent instruction PI. Make sure that *all* the uses of CE within PI are
replaced by I.
Reviewed By: rampitec, arsenm
Differential Revision: https://reviews.llvm.org/D112717
Verify that the resolver exists, that it is a defined
Function, and that its return type matches the ifunc's
type. Add corresponding check to BitcodeReader, change
clang to emit the correct type, and fix tests to comply.
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D112349
For certain combination of LHS and RHS constant ranges,
the signedness of the relational comparison predicate is irrelevant.
This implements complete and precise model for all predicates,
as confirmed by the brute-force tests. I'm not sure if there are
some more cases that we can handle here.
In a follow-up, CVP will make use of this.
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D90924
As noted in https://reviews.llvm.org/D90924#inline-1076197
apparently this is a pretty common pattern,
let's not repeat it yet again, but have it in a common place.
There may be some more places where it could be used,
but these are the most obvious ones.
createReplacementInstr was a trivial wrapper around
ConstantExpr::getAsInstruction, which also inserted the new instruction
into a basic block. Implement this directly in getAsInstruction by
adding an InsertBefore parameter and change all callers to use it. NFC.
A follow-up patch will remove createReplacementInstr.
Differential Revision: https://reviews.llvm.org/D112791
Upon further investigation and discussion,
this is actually the opposite direction from what we should be taking,
and this direction wouldn't solve the motivational problem anyway.
Additionally, some more (polly) tests have escaped being updated.
So, let's just take a step back here.
This reverts commit f3190dedee.
This reverts commit 749581d21f.
This reverts commit f3df87d57e.
This reverts commit ab1dbcecd6.
This method parallels the dropPoisonGeneratingFlags on Instruction, but is hoisted to operator to handle constant expressions as well.
This is mostly code movement, but I did go ahead and add the inrange constexpr gep case. This had been discussed previously, but apparently never followed up o.
While we could emit such a tautological `select`,
it will stick around until the next instsimplify invocation,
which may happen after we count the cost of this redundant `select`.
Which is precisely what happens with loop vectorization legality checks,
and that artificially increases the cost of said checks,
which is bad.
There is prior art for this in `IRBuilderBase::CreateAnd()`/`IRBuilderBase::CreateOr()`.
Refs. https://reviews.llvm.org/D109368#3089809
Otherwise, ODRUniquing would map some member method/variable MDNodes
to have enum type DIScope, resulting in invalid debug info and bad
DWARF.
- Add a Verifier check that when a 'scope:' operand is an ODR type that is not an enum.
- Makes ODRUniquing apply to only ODR types with the same tag so that the debuginfo/DWARF is well-formed.
Reviewed By: probinson, aprantl
Differential Revision: https://reviews.llvm.org/D111770
Currently strip.invariant/launder.invariant are handled by
constructing constant expressions with the intrinsics skipped.
This takes an alternative approach of accumulating the offset
using stripAndAccumulateConstantOffsets(), with a flag to look
through invariant.group intrinsics.
Differential Revision: https://reviews.llvm.org/D112382
Optimize the iterator comparison logic to compare Current.data()
pointers. Use std::tie for assignments from std::pair. Replace
the custom class with a function returning iterator_range.
Differential Revision: https://reviews.llvm.org/D110535
IRBuilder has been updated to support preserving metdata in a more
general manner. This patch adds `LLVMAddMetadataToInst` and
deprecates `LLVMSetInstDebugLocation` in favor of the more
general function.
Reviewed By: aprantl
Differential Revision: https://reviews.llvm.org/D93454
As discussed in:
* https://reviews.llvm.org/D94166
* https://lists.llvm.org/pipermail/llvm-dev/2020-September/145031.html
The GlobalIndirectSymbol class lost most of its meaning in
https://reviews.llvm.org/D109792, which disambiguated getBaseObject
(now getAliaseeObject) between GlobalIFunc and everything else.
In addition, as long as GlobalIFunc is not a GlobalObject and
getAliaseeObject returns GlobalObjects, a GlobalAlias whose aliasee
is a GlobalIFunc cannot currently be modeled properly. Creating
aliases for GlobalIFuncs does happen in the wild (e.g. glibc). In addition,
calling getAliaseeObject on a GlobalIFunc will currently return nullptr,
which is undesirable because it should return the object itself for
non-aliases.
This patch refactors the GlobalIFunc class to inherit directly from
GlobalObject, and removes GlobalIndirectSymbol (while inlining the
relevant parts into GlobalAlias and GlobalIFunc). This allows for
calling getAliaseeObject() on a GlobalIFunc to return the GlobalIFunc
itself, making getAliaseeObject() more consistent and enabling
alias-to-ifunc to be properly modeled in the IR.
I exercised some judgement in the API clients of GlobalIndirectSymbol:
some were 'monomorphized' for GlobalAlias and GlobalIFunc, and
some remained shared (with the type adapted to become GlobalValue).
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D108872
Fixes an issue where GEP salvaging did not properly account for GEP
instructions which stepped over array elements of width 0 (effectively a
no-op). This unnecessarily produced long expressions by appending
`... + (x * 0)` and potentially extended the number of SSA values used
in the dbg.value. This also erroneously triggered an assert in the
salvage function that the element width would be strictly positive.
These issues are resolved by simply ignoring these useless operands.
Reviewed By: aprantl
Differential Revision: https://reviews.llvm.org/D111809
The multiply() implementation is very slow -- it performs six
multiplications in double the bitwidth, which means that it will
typically work on allocated APInts and bypass fast-path
implementations. Add an additional implementation that doesn't
try to produce anything better than a full range if overflow is
possible. At least for the BasicAA use-case, we really don't care
about more precise modeling of overflow behavior. The current
use of multiply() is fine while the implementation is limited to
a single index, but extending it to the multiple-index case makes
the compile-time impact untenable.
For the common case where the shift amount is constant (a single
element range) we can easily compute a precise range (up to
unsigned envelope), so do that.
This patch continues unblocking optimizations that are blocked by pseudo probe instrumentation.
Not exactly like DbgIntrinsics, PseudoProbe intrinsic has other attributes (such as mayread, maywrite, mayhaveSideEffect) that can block optimizations. The issues fixed are:
- Flipped default param of getFirstNonPHIOrDbg API to skip pseudo probes
- Unblocked CSE by avoiding pseudo probe from clobbering memory SSA
- Unblocked induction variable simpliciation
- Allow empty loop deletion by treating probe intrinsic isDroppable
- Some refactoring.
Reviewed By: wenlei
Differential Revision: https://reviews.llvm.org/D110847
In the review of D111085 it was pointed out that these functions don't
conform to the naming scheme in use in LLVM. With this commit we should
be good for all of FPEnv.h.
To better reflect the meaning of the now-disambiguated {GlobalValue,
GlobalAlias}::getBaseObject after breaking off GlobalIFunc::getResolverFunction
(D109792), the function is renamed to getAliaseeObject.
Currently the max alignment representable is 1GB, see D108661.
Setting the align of an object to 4GB is desirable in some cases to make sure the lower 32 bits are clear which can be used for some optimizations, e.g. https://crbug.com/1016945.
This uses an extra bit in instructions that carry an alignment. We can store 15 bits of "free" information, and with this change some instructions (e.g. AtomicCmpXchgInst) use 14 bits.
We can increase the max alignment representable above 4GB (up to 2^62) since we're only using 33 of the 64 values, but I've just limited it to 4GB for now.
The one place we have to update the bitcode format is for the alloca instruction. It stores its alignment into 5 bits of a 32 bit bitfield. I've added another field which is 8 bits and should be future proof for a while. For backward compatibility, we check if the old field has a value and use that, otherwise use the new field.
Updating clang's max allowed alignment will come in a future patch.
Reviewed By: hans
Differential Revision: https://reviews.llvm.org/D110451
Currently the max alignment representable is 1GB, see D108661.
Setting the align of an object to 4GB is desirable in some cases to make sure the lower 32 bits are clear which can be used for some optimizations, e.g. https://crbug.com/1016945.
This uses an extra bit in instructions that carry an alignment. We can store 15 bits of "free" information, and with this change some instructions (e.g. AtomicCmpXchgInst) use 14 bits.
We can increase the max alignment representable above 4GB (up to 2^62) since we're only using 33 of the 64 values, but I've just limited it to 4GB for now.
The one place we have to update the bitcode format is for the alloca instruction. It stores its alignment into 5 bits of a 32 bit bitfield. I've added another field which is 8 bits and should be future proof for a while. For backward compatibility, we check if the old field has a value and use that, otherwise use the new field.
Updating clang's max allowed alignment will come in a future patch.
Reviewed By: hans
Differential Revision: https://reviews.llvm.org/D110451
Currently the max alignment representable is 1GB, see D108661.
Setting the align of an object to 4GB is desirable in some cases to make sure the lower 32 bits are clear which can be used for some optimizations, e.g. https://crbug.com/1016945.
This uses an extra bit in instructions that carry an alignment. We can store 15 bits of "free" information, and with this change some instructions (e.g. AtomicCmpXchgInst) use 14 bits.
We can increase the max alignment representable above 4GB (up to 2^62) since we're only using 33 of the 64 values, but I've just limited it to 4GB for now.
The one place we have to update the bitcode format is for the alloca instruction. It stores its alignment into 5 bits of a 32 bit bitfield. I've added another field which is 8 bits and should be future proof for a while. For backward compatibility, we check if the old field has a value and use that, otherwise use the new field.
Updating clang's max allowed alignment will come in a future patch.
Reviewed By: hans
Differential Revision: https://reviews.llvm.org/D110451
As described on D111049, we're trying to remove the <string> dependency from error handling and replace uses of report_fatal_error(const std::string&) with the Twine() variant which can be forward declared.
Stop using APInt constructors and methods that were soft-deprecated in
D109483. This fixes all the uses I found in llvm, except for the APInt
unit tests which should still test the deprecated methods.
Differential Revision: https://reviews.llvm.org/D110807
This is basically the same change as 42cc7f3c52
but for integer attributes. Rather than treating each attribute
individually, handle them all the same way. The only thing that
needs to be done per attribute is specify how get/add convert
from/to the raw representation.
This patch adds the functionalities to print MDNode in tree shape. For
example, instead of printing a MDNode like this:
```
<0x5643e1166888> = !DILocalVariable(name: "foo", arg: 2, scope: <0x5643e11c9740>, file: <0x5643e11c6ec0>, line: 8, type: <0x5643e11ca8e0>, flags: DIFlagPublic | DIFlagFwdDecl, align: 8)
```
The printTree/dumpTree functions can give you:
```
<0x5643e1166888> = !DILocalVariable(name: "foo", arg: 2, scope: <0x5643e11c9740>, file: <0x5643e11c6ec0>, line: 8, type: <0x5643e11ca8e0>, flags: DIFlagPublic | DIFlagFwdDecl, align: 8)
<0x5643e11c9740> = distinct !DISubprogram(scope: null, spFlags: 0)
<0x5643e11c6ec0> = distinct !DIFile(filename: "file.c", directory: "/path/to/dir")
<0x5643e11ca8e0> = distinct !DIDerivedType(tag: DW_TAG_pointer_type, baseType: <0x5643e11668d8>, size: 1, align: 2)
<0x5643e11668d8> = !DIBasicType(tag: DW_TAG_unspecified_type, name: "basictype")
```
Which is useful when using it in debugger. Where sometimes printing the
whole module to see all MDNodes is too expensive.
Differential Revision: https://reviews.llvm.org/D110113
AsmWriterContext is a simple compound that stores TypePrinting,
SlotTracker (i.e. "Machine" in AsmWriter), and Module instances -- three
of the most commonly used objects in the AsmWriter infrastructure.
Previously these three objects are passed as separate function arguments
to most of the printer functions in this file. Tidying them up can bring
easier code refactoring on printer functions in the future (e.g. when we
want to pass additional objects to all printer functions).
NOTE: Theoritically, this patch should be NFC.
Differential Revision: https://reviews.llvm.org/D110112
ParameterABIAttributes functions work with unsigned integers as the index, so having the getter be signed makes no sense. Additionally, for this reason, the loop vars that were signed were changed to unsigned too.
Reviewed By: jeroen.dobbelaere
Differential Revision: https://reviews.llvm.org/D110344
We expose the fact that we rely on unsigned wrapping to iterate through
all indexes. This can be confusing. Rather, keeping it as an
implementation detail through an iterator is less confusing and is less
code.
Reviewed By: rnk
Differential Revision: https://reviews.llvm.org/D110885
`__stdcall`, `__fastcall` and `__vectorcall` return large values via a
hidden pointer argument. However, the size of that argument should not
be included in the argument list byte count suffix added to the
function's decorated name.
This patch fixes that issue so that LLVM generates the same decorated
name as MSVC does.
MSVC example: https://godbolt.org/z/nc35MKPhr
Reviewed By: rnk
Differential Revision: https://reviews.llvm.org/D110719
To avoid using the AST when emitting diagnostics, split the "dontcall"
attribute into "dontcall-warn" and "dontcall-error", and also add the
frontend attribute value as the LLVM attribute value. This gives us all
the information to report diagnostics we need from within the IR (aside
from access to the original source).
One downside is we directly use LLVM's demangler rather than using the
existing Clang diagnostic pretty printing of symbols.
Previous revisions didn't properly declare the new dependencies.
Reviewed By: nickdesaulniers
Differential Revision: https://reviews.llvm.org/D110364
To avoid using the AST when emitting diagnostics, split the "dontcall"
attribute into "dontcall-warn" and "dontcall-error", and also add the
frontend attribute value as the LLVM attribute value. This gives us all
the information to report diagnostics we need from within the IR (aside
from access to the original source).
One downside is we directly use LLVM's demangler rather than using the
existing Clang diagnostic pretty printing of symbols.
Reviewed By: nickdesaulniers
Differential Revision: https://reviews.llvm.org/D110364
This patch is for fixing potential insertElement-related bugs like D93818.
```
V = UndefValue::get(VecTy);
for(...)
V = Builder.CreateInsertElementy(V, Elt, Idx);
=>
V = PoisonValue::get(VecTy);
for(...)
V = Builder.CreateInsertElementy(V, Elt, Idx);
```
Like above, this patch changes the placeholder V to poison.
The patch will be separated into several commits.
Reviewed By: aqjune
Differential Revision: https://reviews.llvm.org/D110311
This patch emits DW_TAG_namelist and DW_TAG_namelist_item for fortran
namelist variables. DICompositeType is extended to support this fortran
feature.
Reviewed By: aprantl
Differential Revision: https://reviews.llvm.org/D108553
Thinlink provides an opportunity to propagate function attributes across modules, enabling additional propagation opportunities.
This change propagates (currently default off, turn on with `disable-thinlto-funcattrs=1`) noRecurse and noUnwind based off of function summaries of the prevailing functions in bottom-up call-graph order. Testing on clang self-build:
1. There's a 35-40% increase in noUnwind functions due to the additional propagation opportunities.
2. Throughput is measured at 10-15% increase in thinlink time which itself is 1.5% of E2E link time.
Implementation-wise this adds the following summary function attributes:
1. noUnwind: function is noUnwind
2. mayThrow: function contains a non-call instruction that `Instruction::mayThrow` returns true on (e.g. windows SEH instructions)
3. hasUnknownCall: function contains calls that don't make it into the summary call-graph thus should not be propagated from (e.g. indirect for now, could add no-opt functions as well)
Testing:
Clang self-build passes and 2nd stage build passes check-all
ninja check-all with newly added tests passing
Reviewed By: tejohnson
Differential Revision: https://reviews.llvm.org/D36850
This is a fix for the issue reported at
https://reviews.llvm.org/D110043#3019942:
The ElementSize is a uint64_t and as such may be larger than the
index space, or be negative in the index space. This is UB, but
shouldn't cause assertion failures.
We address this by detecting whether the size is too large and
use a zero index in that case (which is always conservatively
correct).
Differential Revision: https://reviews.llvm.org/D110437
In ThinLTO for locals we normally compute the GUID from the name after
prepending the source path to get a unique global id. SamplePGO indirect
call profiles contain the target GUID without this uniquification,
however (unless compiling with -funique-internal-linkage-names).
In order to correctly handle the call edges added to the combined index
for these indirect calls, during importing and bitcode writing we
consult a map of original to full GUID to identify the actual callee.
However, for a large application this was consuming a lot of compile
time as we need to do this repeatedly (especially during importing where
we may traverse call edges multiple times).
To fix this implement a suggestion in one of the FIXME comments, and
actually modify the call edges during a single traversal after the index
is built to perform the fixups once. I combined this fixup with the dead
code analysis performed on the index in order to avoid adding an
additional walk of the index. The dead code analysis is the first
analysis performed on the index.
This reduced the time required for a large thin link with SamplePGO by
about 20%.
No new test added, but I confirmed that there are existing tests that
will fail when no fixup is performed.
Differential Revision: https://reviews.llvm.org/D110374
- This patch adds in the GOFF mangling support to the LLVM data layout string. A corresponding additional line has been added into the data layout section in the language reference documentation.
- Furthermore, this patch also sets the right data layout string for the z/OS target in the SystemZ backend.
Reviewed By: uweigand, Kai, abhina.sreeskantharajan, MaskRay
Differential Revision: https://reviews.llvm.org/D109362
While both GlobalAlias and GlobalIFunc are GlobalIndirectSymbol, their
`getIndirectSymbol()` usage is quite different (GlobalIFunc's resolver
is an entity different from GlobalIFunc itself).
As discussed on https://lists.llvm.org/pipermail/llvm-dev/2020-September/144904.html
("[IR] Modelling of GlobalIFunc"), the name `getBaseObject` is confusing when
used with GlobalIFunc.
To resolve the confusion:
* Move GloalIndirectSymol::getBaseObject to GlobalAlias:: (GlobalIFunc should use `getResolver` instead)
* Change GlobalValue::getBaseObject not to inspect GlobalIFunc. Note: the function has 7 references.
* Add GlobalIFunc::getResolverFunction to peel off potential ConstantExpr indirection
(`strlen` in `test/LTO/Resolution/X86/ifunc.ll`)
Note: GlobalIFunc::getResolver (like GlobalAlias::getAliasee which does not peel
off ConstantExpr indirection) is kept to be used by ValueEnumerator.
Reviewed By: ibookstein
Differential Revision: https://reviews.llvm.org/D109792
And always print it.
This makes some LLVM diagnostics match up better with Clang's diagnostics.
Updated some AMDGPU uses of DiagnosticInfoResourceLimit and now we print
better diagnostics for those.
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D110204
A logic incompleteness may lead MemorySSA to be too conservative
in its results. Specifically, when dealing with a call of kind
`call i32 bitcast (i1 (i1)* @test to i32 (i32)*)(i32 %1)`, where
the function `test` is declared with readonly attribute, the
bitcast is not looked through, obscuring function attributes. Hence,
some methods of CallBase (e.g., doesNotReadMemory) could provide
suboptimal results.
Differential Revision: https://reviews.llvm.org/D109888
This patch allows sinking an instruction which can have multiple uses in a
single user. We were previously over-restrictive by looking for exactly one use,
rather than one user.
Also added an API for retrieving a unique undroppable user.
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D109700
One of the two inputs of the Shufflevector is often a placeholder.
Previously, there were cases where the placeholder was undef, and there were cases where it was poison.
I added these constructors to create a placeholder consistently.
Changing to use the newly added constructor will be written in a separate patch.
Reviewed By: spatel
Differential Revision: https://reviews.llvm.org/D110146
We implement logic to convert a byte offset into a sequence of GEP
indices for that offset in a number of places. This patch adds a
DataLayout::getGEPIndicesForOffset() method, which implements the
core logic. I've updated SROA, ConstantFolding and InstCombine to
use it, and there's a few more places where it looks relevant.
Differential Revision: https://reviews.llvm.org/D110043
Some buildbots fail with:
> C:\a\llvm-clang-x86_64-expensive-checks-win\llvm-project\llvm\lib\IR\Verifier.cpp(4352): error C2678: binary '==': no operator found which takes a left-hand operand of type 'const llvm::MDOperand' (or there is no acceptable conversion)
Possibly the explicit MDOperand to Metadata* conversion will help?
New field `elements` is added to '!DIImportedEntity', representing
list of aliased entities.
This is needed to dump optimized debugging information where all names
in a module are imported, but a few names are imported with overriding
aliases.
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D109343
This reverts commit 4ac4e52189.
There are couple of test failures, which needs update of the test cases.
Doing a clean revert and will recommit the change along with fixed
testcases.
The API was removed in 4ac4e52189 in favor of
getUniqueUndroppableUser.
However, this caused a buildbot failure in AbstractCallSiteTest.cpp,
which uses the API and the AbstractCallSite class requires a "use"
rather than a user.
Retain the API so that the unittest compiles and passes.
This patch allows sinking an instruction which can have multiple uses in a
single user. We were previously over-restrictive by looking for exactly one use,
rather than one user.
Also, the API for retrieving undroppable user has been updated accordingly since
in both usecases (Attributor and InstCombine), we seem to care about the user,
rather than the use.
Reviewed-By: nikic
Differential Revision: https://reviews.llvm.org/D109700
This patch adds functionality to check assumption attributes on call
sites as well.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D109376
Currently, opaque pointers are supported in two forms: The
-force-opaque-pointers mode, where all pointers are opaque and
typed pointers do not exist. And as a simple ptr type that can
coexist with typed pointers.
This patch removes support for the mixed mode. You either get
typed pointers, or you get opaque pointers, but not both. In the
(current) default mode, using ptr is forbidden. In -opaque-pointers
mode, all pointers are opaque.
The motivation here is that the mixed mode introduces additional
issues that don't exist in fully opaque mode. D105155 is an example
of a design problem. Looking at D109259, it would probably need
additional work to support mixed mode (e.g. to generate GEPs for
typed base but opaque result). Mixed mode will also end up
inserting many casts between i8* and ptr, which would require
significant additional work to consistently avoid.
I don't think the mixed mode is particularly valuable, as it
doesn't align with our end goal. The only thing I've found it to
be moderately useful for is adding some opaque pointer tests in
between typed pointer tests, but I think we can live without that.
Differential Revision: https://reviews.llvm.org/D109290
This renames the primary methods for creating a zero value to `getZero`
instead of `getNullValue` and renames predicates like `isAllOnesValue`
to simply `isAllOnes`. This achieves two things:
1) This starts standardizing predicates across the LLVM codebase,
following (in this case) ConstantInt. The word "Value" doesn't
convey anything of merit, and is missing in some of the other things.
2) Calling an integer "null" doesn't make any sense. The original sin
here is mine and I've regretted it for years. This moves us to calling
it "zero" instead, which is correct!
APInt is widely used and I don't think anyone is keen to take massive source
breakage on anything so core, at least not all in one go. As such, this
doesn't actually delete any entrypoints, it "soft deprecates" them with a
comment.
Included in this patch are changes to a bunch of the codebase, but there are
more. We should normalize SelectionDAG and other APIs as well, which would
make the API change more mechanical.
Differential Revision: https://reviews.llvm.org/D109483
integer 0/1 for the operand of bundle "clang.arc.attachedcall"
https://reviews.llvm.org/D102996 changes the operand of bundle
"clang.arc.attachedcall". This patch makes changes to llvm that are
needed to handle the new IR.
This should make it easier to understand what the IR is doing and also
simplify some of the passes as they no longer have to translate the
integer values to the runtime functions.
Differential Revision: https://reviews.llvm.org/D103000
Index is 0 when the return value has the returned attribute. But the
return value cannot have the returned attribute, so the check is
pointless.
Reviewed By: rnk
Differential Revision: https://reviews.llvm.org/D109334
llvm.vp.select extends the regular select instruction with an explicit
vector length (%evl).
All lanes with indexes at and above %evl are
undefined. Lanes below %evl are taken from the first input where the
mask is true and from the second input otherwise.
Reviewed By: rogfer01
Differential Revision: https://reviews.llvm.org/D105351
Added opt option -print-pipeline-passes to print a -passes compatible
string describing the built pass pipeline.
As an example:
$ opt -enable-new-pm=1 -adce -licm -simplifycfg -o /dev/null /dev/null -print-pipeline-passes
verify,function(adce),function(loop-mssa(licm)),function(simplifycfg<bonus-inst-threshold=1;no-forward-switch-cond;no-switch-to-lookup;keep-loops;no-hoist-common-insts;no-sink-common-insts>),verify,BitcodeWriterPass
At the moment this is best-effort only and there are some known
limitations:
- Not all passes accepting parameters will print their parameters
(currently only implemented for simplifycfg).
- Some ClassName to pass-name mappings are not unique.
- Some ClassName to pass-name mappings are missing (e.g.
BitcodeWriterPass).
Differential Revision: https://reviews.llvm.org/D108298
Added opt option -print-pipeline-passes to print a -passes compatible
string describing the built pass pipeline.
As an example:
$ opt -enable-new-pm=1 -adce -licm -simplifycfg -o /dev/null /dev/null -print-pipeline-passes
verify,function(adce),function(loop-mssa(licm)),function(simplifycfg<bonus-inst-threshold=1;no-forward-switch-cond;no-switch-to-lookup;keep-loops;no-hoist-common-insts;no-sink-common-insts>),verify,BitcodeWriterPass
At the moment this is best-effort only and there are some known
limitations:
- Not all passes accepting parameters will print their parameters
(currently only implemented for simplifycfg).
- Some ClassName to pass-name mappings are not unique.
- Some ClassName to pass-name mappings are missing (e.g.
BitcodeWriterPass).
This is part one of a couple of patches to fully rename these methods.
I've made the mistake of assuming that these indexes are for parameters
multiple times, but actually they're based off of a weird indexing
scheme AttributeList::AttrIndex where 0 is the return value and ~0 is
the function. Hopefully renaming these methods will make this clearer.
Ideally users should use more specific methods like
AttributeList::getFnAttr().
This patch simply adds the name that we want in the end. This is so the
removal of the methods with the original names happens in a separate
change to make it easier for downstream users.
This touches all relevant methods in AttributeList, CallBase, and Function.
Reviewed By: rnk
Differential Revision: https://reviews.llvm.org/D108788
We have a large compile showing occasional non-deterministic behavior
that is due to DIArgList not being properly uniqued in some cases. I
tracked this down to handleChangedOperands, for which there is a custom
implementation for DIArgList, that does not take care of re-uniquing
after updating the DIArgList Args, unlike the default version of
handleChangedOperands for MDNode.
Since the Args in the DIArgList form the key for the store, this seems
to be occasionally breaking the lookup in that DenseSet. Specifically,
when invoking DIArgList::get() from replaceVariableLocationOp, very
occasionally it returns a new DIArgList object, when one already exists
having the same exact Args pointers. This in turn causes a subsequent
call to Instruction::isIdenticalToWhenDefined on those two otherwise
identical DIArgList objects during a later pass to return false, leading
to different IR in those rare cases.
I modified DIArgList::handleChangedOperands to perform similar
re-uniquing as the MDNode version used by other metadata node types.
This also necessitated a change to the context destructor, since in some
cases we end up with DIArgList as distinct nodes: DIArgList is the only
metadata node type to have a custom dropAllReferences, so we need to
invoke that version on DIArgList in the DistinctMDNodes store to clean
it up properly.
Differential Revision: https://reviews.llvm.org/D108968
In D87099, the mangler learned to quote export directives that contain
special characters. Only alhpanumerical characters as well as
'_', '$', '.' and '@' were exmpt from this quoting. However, at least
binutils considers an unquoted '.' to be syntax and object files
containing such symbols will cause errors during linking. Fix that
by removing '.' from the list of allowed exemptions.
Differential Revision: https://reviews.llvm.org/D100359
It looks like this array was missed in 4276d4a8d0
Fixed tests that expected `elements` to be empty or depeneded on the order of the empty DINode.
Reviewed By: aprantl
Differential Revision: https://reviews.llvm.org/D107024
This patch emits DW_TAG_namelist and DW_TAG_namelist_item for fortran
namelist variables. DICompositeType is extended to support this fortran
feature.
Reviewed By: aprantl
Differential Revision: https://reviews.llvm.org/D108553
Generate btf_tag annotations for function parameters.
A field "annotations" is introduced to DILocalVariable, and
annotations are represented as an DINodeArray, similar to
DIComposite elements. The following example illustrates how
annotations are encoded in IR:
distinct !DILocalVariable(name: "info",, arg: 1, ..., annotations: !10)
!10 = !{!11, !12}
!11 = !{!"btf_tag", !"a"}
!12 = !{!"btf_tag", !"b"}
Differential Revision: https://reviews.llvm.org/D106620
Generate btf_tag annotations for DIGlobalVariable.
A field "annotations" is introduced to DIGlobalVariable, and
annotations are represented as an DINodeArray, similar to
DIComposite elements. The following example illustrates how
annotations are encoded in IR:
distinct !DIGlobalVariable(..., annotations: !10)
!10 = !{!11, !12}
!11 = !{!"btf_tag", !"a"}
!12 = !{!"btf_tag", !"b"}
Differential Revision: https://reviews.llvm.org/D106619
Generate btf_tag annotations for DISubprogram types.
A field "annotations" is introduced to DISubprogram, and
annotations are represented as an DINodeArray, similar to
DIComposite elements. The following example illustrates how
annotations are encoded in IR:
distinct !DISubprogram(..., annotations: !10)
!10 = !{!11, !12}
!11 = !{!"btf_tag", !"a"}
!12 = !{!"btf_tag", !"b"}
Differential Revision: https://reviews.llvm.org/D106618
Add support for the GNU C style __attribute__((error(""))) and
__attribute__((warning(""))). These attributes are meant to be put on
declarations of functions whom should not be called.
They are frequently used to provide compile time diagnostics similar to
_Static_assert, but which may rely on non-ICE conditions (ie. relying on
compiler optimizations). This is also similar to diagnose_if function
attribute, but can diagnose after optimizations have been run.
While users may instead simply call undefined functions in such cases to
get a linkage failure from the linker, these provide a much more
ergonomic and actionable diagnostic to users and do so at compile time
rather than at link time. Users instead may be able use inline asm .err
directives.
These are used throughout the Linux kernel in its implementation of
BUILD_BUG and BUILD_BUG_ON macros. These macros generally cannot be
converted to use _Static_assert because many of the parameters are not
ICEs. The Linux kernel still needs to be modified to make use of these
when building with Clang; I have a patch that does so I will send once
this feature is landed.
To do so, we create a new IR level Function attribute, "dontcall" (both
error and warning boil down to one IR Fn Attr). Then, similar to calls
to inline asm, we attach a !srcloc Metadata node to call sites of such
attributed callees.
The backend diagnoses these during instruction selection, while we still
know that a call is a call (vs say a JMP that's a tail call) in an arch
agnostic manner.
The frontend then reconstructs the SourceLocation from that Metadata,
and determines whether to emit an error or warning based on the callee's
attribute.
Link: https://bugs.llvm.org/show_bug.cgi?id=16428
Link: https://github.com/ClangBuiltLinux/linux/issues/1173
Reviewed By: aaron.ballman
Differential Revision: https://reviews.llvm.org/D106030
Generate btf_tag annotations for DIDrived types. More specifically,
clang frontend generates the btf_tag annotations for record
fields. The annotations are represented as an DINodeArray
in DebugInfo. The following example illustrate how
annotations are encoded in IR:
distinct !DIDerivedType(tag: DW_TAG_member, ..., annotations: !10)
!10 = !{!11, !12}
!11 = !{!"btf_tag", !"a"}
!12 = !{!"btf_tag", !"b"}
Differential Revision: https://reviews.llvm.org/D106616
Clang patch D106614 added attribute btf_tag support. This patch
generates btf_tag annotations for DIComposite types.
A field "annotations" is introduced to DIComposite, and the
annotations are represented as an DINodeArray, similar to
DIComposite elements. The following example illustrates
how annotations are encoded in IR:
distinct !DICompositeType(..., annotations: !10)
!10 = !{!11, !12}
!11 = !{!"btf_tag", !"a"}
!12 = !{!"btf_tag", !"b"}
Each btf_tag annotation is represented as a 2D array of
meta strings. Each record may have more than one
btf_tag annotations, as in the above example.
Reland with additional fixes for llvm/unittests/IR/DebugTypeODRUniquingTest.cpp.
Differential Revision: https://reviews.llvm.org/D106615
The COFF specific `DataReferencedByCode` complexity (D103372 D103717) is due to
a link.exe limitation: an external symbol in IMAGE_COMDAT_SELECT_ASSOCIATIVE is
not really dropped, so it can cause duplicate definition error.
Clang patch D106614 added attribute btf_tag support. This patch
generates btf_tag annotations for DIComposite types.
A field "annotations" is introduced to DIComposite, and the
annotations are represented as an DINodeArray, similar to
DIComposite elements. The following example illustrates
how annotations are encoded in IR:
distinct !DICompositeType(..., annotations: !10)
!10 = !{!11, !12}
!11 = !{!"btf_tag", !"a"}
!12 = !{!"btf_tag", !"b"}
Each btf_tag annotation is represented as a 2D array of
meta strings. Each record may have more than one
btf_tag annotations, as in the above example.
Differential Revision: https://reviews.llvm.org/D106615
This patch adds vector-predicated ("VP") reduction intrinsics corresponding to
each of the existing unpredicated `llvm.vector.reduce.*` versions. Unlike the
unpredicated reductions, all VP reductions have a start value. This start value
is returned when the no vector element is active.
Support for expansion on targets without native vector-predication support is
included.
This patch is based on the ["reduction
slice"](https://reviews.llvm.org/D57504#1732277) of the LLVM-VP reference patch
(https://reviews.llvm.org/D57504).
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D104308
SwitchInst should have a void result type.
Add a check to the verifier to catch this error.
Reviewed By: samparker
Differential Revision: https://reviews.llvm.org/D108084
This patch updates ConstantVector::getSplat to use poison instead
of undef when using insertelement/shufflevector to splat.
This follows on from D93793.
Differential Revision: https://reviews.llvm.org/D107751
When LLVM is used in other projects, it may happen that global cons-
tructors will execute before the call to ParseCommandLineOptions.
Since OptBisect is initialized via a constructor, and has no ability
to be updated at a later time, passing "-opt-bisect-limit" to the
parse function may have no effect.
To avoid this problem use a cl::cb (callback) to set the bisection
limit when the option is actually processed.
Differential Revision: https://reviews.llvm.org/D104551
It's entirely possible (because it actually happened) for a bool
variable to end up with a 256-bit DW_AT_const_value. This came about
when a local bool variable was initialized from a bitfield in a
32-byte struct of bitfields, and after inlining and constant
propagation, the variable did have a constant value. The sequence of
optimizations had it carrying "i256" values around, but once the
constant made it into the llvm.dbg.value, no further IR changes could
affect it.
Technically the llvm.dbg.value did have a DIExpression to reduce it
back down to 8 bits, but the compiler is in no way ready to emit an
oversized constant *and* a DWARF expression to manipulate it.
Depending on the circumstances, we had either just the very fat bool
value, or an expression with no starting value.
The sequence of optimizations that led to this state did seem pretty
reasonable, so the solution I came up with was to invent a DWARF
constant expression folder. Currently it only does convert ops, but
there's no reason it couldn't do other ops if that became useful.
This broke three tests that depended on having convert ops survive
into the DWARF, so I added an operator that would abort the folder to
each of those tests.
Differential Revision: https://reviews.llvm.org/D106915
Constant::getSplatValue has O(N) time complexity in the worst case,
where N is the # of elements in a vector. So we call
Constant::getAggregateElement first and return earlier if possible to
avoid unnecessary getSplatValue calls.
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D107252
This patch adds an initial ShuffleVectorInst::isInsertSubvectorMask helper to recognize 2-op shuffles where the lowest elements of one of the sources are being inserted into the "in-place" other operand, this includes "concat_vectors" patterns as can be seen in the Arm shuffle cost changes. This also helped fix a x86 issue with irregular/length-changing SK_InsertSubvector costs - I'm hoping this will help with D107188
This doesn't currently attempt to work with 1-op shuffles that could either be a "widening" shuffle or a self-insertion.
The self-insertion case is tricky, but we currently always match this with the existing SK_PermuteSingleSrc logic.
The widening case will be addressed in a follow up patch that treats the cost as 0.
Masks with a high number of undef elts will still struggle to match optimal subvector widths - its currently bounded by minimum-width possible insertion, whilst some cases would benefit from wider (pow2?) subvectors.
Differential Revision: https://reviews.llvm.org/D107228
We might want to use info from GC strategy in middle end analysis.
The motivation for this is provided in D99135: we may want to ask
a GC if it's going to work with a given pointer (currently this code
makes naive check by the method name).
Differetial Revision: https://reviews.llvm.org/D100559
Reviewed By: reames
Currently, the default alignment is much larger than the actual size of
the vector in memory. Fix this to use a sane default.
For SVE, temporarily remove lowering of load/store operations for
predicates with less than 16 elements. The layout the backend was
assuming for SVE predicates with less than 16 elements doesn't agree
with the frontend. More work probably needs to be done here.
This change is, strictly speaking, not backwards-compatible at the
bitcode level. But probably nobody is actually depending on that; i1
vectors in memory are rare, and the code that does use them probably
ends up forcing the alignment to something sane anyway. If we think
this is a concern, I can restrict this to scalable vectors for now
(where it's actually causing issues for me at the moment).
Differential Revision: https://reviews.llvm.org/D88994
Target-dependent constant folding will fold these down to simple
constants (or at least, expressions that don't involve a GEP). We don't
need heroics to try to optimize the form of the expression before that
happens.
Fixes https://bugs.llvm.org/show_bug.cgi?id=51232 .
Differential Revision: https://reviews.llvm.org/D107116
This is a second attempt to fix the EXPENSIVE_CHECKS issue that was mentioned In D91661#2875179 by @jroelofs.
(The first attempt was in D105983)
D91661 more or less completely reverted D49126 and by doing so also removed the cleanup logic of the created declarations and calls.
This patch is a replacement for D91661 (which must itself be reverted first). It replaces the custom declaration creation with the
generic version and shows the test impact. It also tracks the number of NamedValues to detect if a new prototype was added instead
of looking at the available users of a prototype.
Reviewed By: jroelofs
Differential Revision: https://reviews.llvm.org/D106147
When hoisting/moving calls to locations, we strip unknown metadata. Such calls are usually marked `speculatable`, i.e. they are guaranteed to not cause undefined behaviour when run anywhere. So, we should strip attributes that can cause immediate undefined behaviour if those attributes are not valid in the context where the call is moved to.
This patch introduces such an API and uses it in relevant passes. See
updated tests.
Fix for PR50744.
Reviewed By: nikic, jdoerfert, lebedev.ri
Differential Revision: https://reviews.llvm.org/D104641
This fixes an assert firing when compiling code which involves 128 bit
integrals.
This would trigger runtime checks similar to this:
```
Assertion failed: getMinSignedBits() <= 64 && "Too many bits for int64_t", file llvm/include/llvm/ADT/APInt.h, line 1646
```
To get around this, we just saturate those big values.
Reviewed By: rnk
Differential Revision: https://reviews.llvm.org/D105320
DIEnumerator stores an APInt as of April 2020, so now we don't need to
truncate the enumerator value to 64 bits. Fixes assertions during IRGen.
Split from D105320, thanks to Matheus Izvekov for the test case and
report.
Differential Revision: https://reviews.llvm.org/D106585
Proposed alternative to D105338.
This is ugly, but short-term I think it's the best way forward: first,
let's formalize the hacks into a coherent model. Then we can consider
extensions of that model (we could have different flavors of volatile
with different rules).
Differential Revision: https://reviews.llvm.org/D106309
This adjusts mayHaveSideEffect() to return true for !willReturn()
instructions. Just like other side-effects, non-willreturn calls
(aka "divergence") cannot be removed and cannot be reordered relative
to other side effects. This fixes a number of bugs where
non-willreturn calls are either incorrectly dropped or moved. In
particular, it also fixes the last open problem in
https://bugs.llvm.org/show_bug.cgi?id=50511.
I performed a cursory review of all current mayHaveSideEffect()
uses, which convinced me that these are indeed the desired default
semantics. Places that do not want to consider non-willreturn as a
sideeffect generally do not want mayHaveSideEffect() semantics at
all. I identified two such cases, which are addressed by D106591
and D106742. Finally, there is a use in SCEV for which we don't
really have an appropriate API right now -- what it wants is
basically "would this be considered forward progress". I've just
spelled out the previous semantics there.
Differential Revision: https://reviews.llvm.org/D106749
Rather than adding methods for dropping these attributes in
various places, add a function that returns an AttrBuilder with
these attributes, which can then be used with existing methods
for dropping attributes. This is with an eye on D104641, which
also needs to drop them from returns, not just parameters.
Also be more explicit about the semantics of the method in the
documentation. Refer to UB rather than Undef, which is what this
is actually about.
From LangRef:
> if the parameter or return pointer is null, poison value is
> returned or passed instead. The nonnull attribute should be
> combined with the noundef attribute to ensure a pointer is not
> null or otherwise the behavior is undefined.
Dropping noundef is sufficient to prevent UB. Including nonnull
in this method just muddies the semantics.
This was previously combining indices even though they operate on
different types. For non-opaque pointers, the condition is
automatically satisfied based on the pointer types being equal.
This is part of a patch series working towards the ability to make
SourceLocation into a 64-bit type to handle larger translation units.
!srcloc is generated in clang codegen, and pulled back out by llvm
functions like AsmPrinter::emitInlineAsm that need to report errors in
the inline asm. From there it goes to LLVMContext::emitError, is
stored in DiagnosticInfoInlineAsm, and ends up back in clang, at
BackendConsumer::InlineAsmDiagHandler(), which reconstitutes a true
clang::SourceLocation from the integer cookie.
Throughout this code path, it's now 64-bit rather than 32, which means
that if SourceLocation is expanded to a 64-bit type, this error report
won't lose half of the data.
The compiler will tolerate both of i32 and i64 !srcloc metadata in
input IR without faulting. Test added in llvm/MC. (The semantic
accuracy of the metadata is another matter, but I don't know of any
situation where that matters: if you're reading an IR file written by
a previous run of clang, you don't have the SourceManager that can
relate those source locations back to the original source files.)
Original version of the patch by Mikhail Maltsev.
Reviewed By: dexonsmith
Differential Revision: https://reviews.llvm.org/D105491
Remove the assert in AssemblyWriter::printBasicBlock() and
in BasicBlock::isEntryBlock() that require blocks to have parents.
Instead, have BasicBlock::isEntryBlock() return false for unattached
blocks. This allows us to call these functions for blocks that are
not yet added to a module which is a useful debugging capability.
Committing for xiaoqing_wu
https://reviews.llvm.org/D106127k
In the textual format, `noduplicates` means no COMDAT/section group
deduplication is performed. Therefore, if both sets of sections are retained, and
they happen to define strong external symbols with the same names,
there will be a duplicate definition linker error.
In PE/COFF, the selection kind lowers to `IMAGE_COMDAT_SELECT_NODUPLICATES`.
The name describes the corollary instead of the immediate semantics. The name
can cause confusion to other binary formats (ELF, wasm) which have implemented/
want to implement the "no deduplication" selection kind. Rename it to be clearer.
Reviewed By: rnk
Differential Revision: https://reviews.llvm.org/D106319
For musttail calls, ABI attributes between the function and the
musttail call must match. The current check discards the type of
type attributes like byval, which means that it will consider
byval(i32) and byval(i64) (or similar) as compatible.
I assume this is a leftover from before these attributes had a
type argument. Ran into this while trying to tighten an assertion
in AttrBuilder.
Differential Revision: https://reviews.llvm.org/D105841
Use the elementtype attribute introduced in D105407 for the
llvm.preserve.array/struct.index intrinsics. It carries the
element type of the GEP these intrinsics effectively encode.
This patch:
* Adds a verifier check that the attribute is required.
* Adds it in the IRBuilder methods for these intrinsics.
* Autoupgrades old bitcode without the attribute.
* Updates the lowering code to use the attribute rather than
the pointer element type.
* Updates lots of tests to specify the attribute.
* Adds -force-opaque-pointers to the intrinsic-array.ll test
to demonstrate they work now.
https://reviews.llvm.org/D106184
As suggested on D105733, this adds a verifier rule that calls to
intrinsics must match the signature of the intrinsic.
Without opaque pointers this is automatically enforced for all
calls, because the pointer types need to match. If the signatures
don't match, a pointer bitcast has to be inserted. For intrinsics
in particular, such bitcasts are not legal, because the address of
intrinsics cannot be taken.
With opaque pointers, there are no more pointer bitcasts, so it's
generally possible for the call and the callee signature to differ.
However, for intrinsics we still want to enforce that the signatures
must match, the same as was done before through the address taken
check.
We can't enforce this more generally for non-intrinsics, because
calls with mismatched signatures at the very least can legally
occur in unreachable code, and might also be valid in some other
cases, depending on how exactly the signatures differ.
Differential Revision: https://reviews.llvm.org/D106013
Intrinsics can only be called directly, taking their address is not
legal. This is currently only enforced for intrinsics that have an
ID, rather than all intrinsics. Adjust the check to cover all
intrinsics.
This came up in D106013.
Differential Revision: https://reviews.llvm.org/D106095
This implements the elementtype attribute specified in D105407. It
just adds the attribute and the specified verifier rules, but
doesn't yet make use of it anywhere.
Differential Revision: https://reviews.llvm.org/D106008
With the current deref semantics, this is redundant - since we assume that anything which is dereferenceable (ever) can't be freed - but it becomes neccessary for the deref-at-point semantics.
Testing wise, this is covered by test/CodeGen/X86/hoist-invariant-load.ll when -use-dereferenceable-at-point-semantics is active. I didn't bother duplicating the command line since a) it's an in-development mode, and b) the change is pretty obvious.
While it is nice to have separate methods in the public AttributeSet
API, we can fetch the type from the internal AttributeSetNode
using a generic API for all type attribute kinds.
A couple of attributes had explicit checks for incompatibility
with pointer types. However, this is already handled generically
by the typeIncompatible() check. We can drop these after adding
SwiftError to typeIncompatible().
However, the previous implementation of the check prints out all
attributes that are incompatible with a given type, even though
those attributes aren't actually used. This has the annoying
result that the error message changes every time a new attribute
is added to the list. Improve this by explicitly finding which
attribute isn't compatible and printing just that.
Add tests for D105088, as well as an option to disable the
(generally) unsound inttoptr of ptrtoint optimization.
Differential Revision: https://reviews.llvm.org/D105771
It is possible that the remangled name for an intrinsic already exists with a different (and wrong) prototype within the module.
As the bitcode reader keeps both versions of all remangled intrinsics around for a longer time, this can result in a
crash, as can be seen in https://bugs.llvm.org/show_bug.cgi?id=50923
This patch makes 'remangleIntrinsicFunction' aware of this situation. When it is detected, it moves the version with the wrong prototype to a different name. That version will be removed anyway once the module is completely loaded.
With thanks to @asbirlea for reporting this issue when trying out an lto build with the full restrict patches, and @efriedma for suggesting a sane resolution mechanism.
Reviewed By: apilipenko
Differential Revision: https://reviews.llvm.org/D105118
Continuing from D105763, this allows placing certain properties
about attributes in the TableGen definition. In particular, we
store whether an attribute applies to fn/param/ret (or a combination
thereof). This information is used by the Verifier, as well as the
ForceFunctionAttrs pass. I also plan to use this in LLParser,
which also duplicates info on which attributes are valid where.
This keeps metadata about attributes in one place, and makes it
more likely that it stays in sync, rather than in various
functions spread across the codebase.
Differential Revision: https://reviews.llvm.org/D105780
This is now the same as isIntAttrKind(), so use that instead, as
it does not require manual maintenance. The naming is also more
accurate in that both int and type attributes have an argument,
but this method was only targeting int attributes.
I initially wanted to tighten the AttrBuilder assertion, but we
have some in-tree uses that would violate it.
Assert that enum/int/type attributes go through the constructor
they are supposed to use.
To make sure this can't happen via invalid bitcode, explicitly
verify that the attribute kind if correct there.
Followup to D105658 to make AttrBuilder automatically work with
new type attributes. TableGen is tweaked to emit First/LastTypeAttr
markers, based on which we can handle type attributes
programmatically.
Differential Revision: https://reviews.llvm.org/D105763
In the spirit of TRegions [0], this patch analyzes a kernel and tracks
if it can be executed in SPMD-mode. If so, we flip the arguments of
the __kmpc_target_init and deinit call to enable the mode. We also
update the `<kernel>_exec_mode` flag to indicate to the runtime we
changed the mode to SPMD.
The code analysis is done interprocedurally by extending the
AAKernelInfo abstract attribute to track SPMD compatibility as well.
[0] https://link.springer.com/chapter/10.1007/978-3-030-28596-8_11
Differential Revision: https://reviews.llvm.org/D102307
Broke check-clang, see https://reviews.llvm.org/D102307#2869065
Ran `git revert -n ebbe149a6f08535ede848a531a601ae6591cfbc5..269416d41908bb670f67af689155d5ab8eea689a`
In the spirit of TRegions [0], this patch analyzes a kernel and tracks
if it can be executed in SPMD-mode. If so, we flip the arguments of
the __kmpc_target_init and deinit call to enable the mode. We also
update the `<kernel>_exec_mode` flag to indicate to the runtime we
changed the mode to SPMD.
The code analysis is done interprocedurally by extending the
AAKernelInfo abstract attribute to track SPMD compatibility as well.
[0] https://link.springer.com/chapter/10.1007/978-3-030-28596-8_11
Differential Revision: https://reviews.llvm.org/D102307
While working on the elementtype attribute, I felt that the type
attribute handling in AttrBuilder is overly repetitive. This patch
converts the separate Type* members into an std::array<Type*>, so
that all type attribute kinds can be handled generically.
There's more room for improvement here (especially when it comes to
converting the AttrBuilder to an Attribute), but this seems like a
good starting point.
Differential Revision: https://reviews.llvm.org/D105658
To support options like -print-before=<pass> and -print-after=<pass>
the PassBuilder will register PassInstrumentation callbacks as well
as a mapping between internal pass class names and the pass names
used in those options (and other cmd line interfaces). But for
some reason all the passes that takes options where missing in those
maps, so for example "-print-after=loop-vectorize" didn't work.
This patch will add the missing entries by also taking care of
function and loop passes with params when setting up the class to
pass name maps.
One might notice that even with this patch it might be tricky to
know what pass name to use in options such as -print-after. This
because there only is a single mapping from class name to pass name,
while the PassRegistry currently is a bit messy as it sometimes
reuses the same class for different pass names (without using the
"pass with params" scheme, or the pass-name<variant> syntax).
It gets extra messy in some situations. For example the
MemorySanitizerPass can run like this (with debug and print-after)
opt -passes='kmsan' -print-after=msan-module -debug-only=msan
The 'kmsan' alias for 'msan<kernel>' is just confusing as one might
think that 'kmsan' is a separate pass (but the DEBUG_TYPE is still
just 'msan'). And since the module pass version of the pass adds
a mapping from 'MemorySanitizerPass' to 'msan-module' one need to
use 'msan-module' in the print-before and print-after options.
Reviewed By: ychen
Differential Revision: https://reviews.llvm.org/D105006
C++23 will make these conversions ambiguous - so fix them to make the
codebase forward-compatible with C++23 (& a follow-up change I've made
will make this ambiguous/invalid even in <C++23 so we don't regress
this & it generally improves the code anyway)
Several subclasses of User override operator new without also overriding
operator delete. This means that delete expressions fall back to using
operator delete of the base class, which would be User. However, this is
only allowed if the base class has a virtual destructor which is not the
case for User, so this is UB.
See also [expr.delete] (3) for the exact wording.
This is actually detected in some cases by GCC 11's
-Wmismatched-new-delete now which is how I found this error.
Differential Revision: https://reviews.llvm.org/D103143