This method parallels the dropPoisonGeneratingFlags on Instruction, but is hoisted to operator to handle constant expressions as well.
This is mostly code movement, but I did go ahead and add the inrange constexpr gep case. This had been discussed previously, but apparently never followed up o.
While we could emit such a tautological `select`,
it will stick around until the next instsimplify invocation,
which may happen after we count the cost of this redundant `select`.
Which is precisely what happens with loop vectorization legality checks,
and that artificially increases the cost of said checks,
which is bad.
There is prior art for this in `IRBuilderBase::CreateAnd()`/`IRBuilderBase::CreateOr()`.
Refs. https://reviews.llvm.org/D109368#3089809
Otherwise, ODRUniquing would map some member method/variable MDNodes
to have enum type DIScope, resulting in invalid debug info and bad
DWARF.
- Add a Verifier check that when a 'scope:' operand is an ODR type that is not an enum.
- Makes ODRUniquing apply to only ODR types with the same tag so that the debuginfo/DWARF is well-formed.
Reviewed By: probinson, aprantl
Differential Revision: https://reviews.llvm.org/D111770
Currently strip.invariant/launder.invariant are handled by
constructing constant expressions with the intrinsics skipped.
This takes an alternative approach of accumulating the offset
using stripAndAccumulateConstantOffsets(), with a flag to look
through invariant.group intrinsics.
Differential Revision: https://reviews.llvm.org/D112382
Optimize the iterator comparison logic to compare Current.data()
pointers. Use std::tie for assignments from std::pair. Replace
the custom class with a function returning iterator_range.
Differential Revision: https://reviews.llvm.org/D110535
IRBuilder has been updated to support preserving metdata in a more
general manner. This patch adds `LLVMAddMetadataToInst` and
deprecates `LLVMSetInstDebugLocation` in favor of the more
general function.
Reviewed By: aprantl
Differential Revision: https://reviews.llvm.org/D93454
As discussed in:
* https://reviews.llvm.org/D94166
* https://lists.llvm.org/pipermail/llvm-dev/2020-September/145031.html
The GlobalIndirectSymbol class lost most of its meaning in
https://reviews.llvm.org/D109792, which disambiguated getBaseObject
(now getAliaseeObject) between GlobalIFunc and everything else.
In addition, as long as GlobalIFunc is not a GlobalObject and
getAliaseeObject returns GlobalObjects, a GlobalAlias whose aliasee
is a GlobalIFunc cannot currently be modeled properly. Creating
aliases for GlobalIFuncs does happen in the wild (e.g. glibc). In addition,
calling getAliaseeObject on a GlobalIFunc will currently return nullptr,
which is undesirable because it should return the object itself for
non-aliases.
This patch refactors the GlobalIFunc class to inherit directly from
GlobalObject, and removes GlobalIndirectSymbol (while inlining the
relevant parts into GlobalAlias and GlobalIFunc). This allows for
calling getAliaseeObject() on a GlobalIFunc to return the GlobalIFunc
itself, making getAliaseeObject() more consistent and enabling
alias-to-ifunc to be properly modeled in the IR.
I exercised some judgement in the API clients of GlobalIndirectSymbol:
some were 'monomorphized' for GlobalAlias and GlobalIFunc, and
some remained shared (with the type adapted to become GlobalValue).
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D108872
Fixes an issue where GEP salvaging did not properly account for GEP
instructions which stepped over array elements of width 0 (effectively a
no-op). This unnecessarily produced long expressions by appending
`... + (x * 0)` and potentially extended the number of SSA values used
in the dbg.value. This also erroneously triggered an assert in the
salvage function that the element width would be strictly positive.
These issues are resolved by simply ignoring these useless operands.
Reviewed By: aprantl
Differential Revision: https://reviews.llvm.org/D111809
The multiply() implementation is very slow -- it performs six
multiplications in double the bitwidth, which means that it will
typically work on allocated APInts and bypass fast-path
implementations. Add an additional implementation that doesn't
try to produce anything better than a full range if overflow is
possible. At least for the BasicAA use-case, we really don't care
about more precise modeling of overflow behavior. The current
use of multiply() is fine while the implementation is limited to
a single index, but extending it to the multiple-index case makes
the compile-time impact untenable.
For the common case where the shift amount is constant (a single
element range) we can easily compute a precise range (up to
unsigned envelope), so do that.
This patch continues unblocking optimizations that are blocked by pseudo probe instrumentation.
Not exactly like DbgIntrinsics, PseudoProbe intrinsic has other attributes (such as mayread, maywrite, mayhaveSideEffect) that can block optimizations. The issues fixed are:
- Flipped default param of getFirstNonPHIOrDbg API to skip pseudo probes
- Unblocked CSE by avoiding pseudo probe from clobbering memory SSA
- Unblocked induction variable simpliciation
- Allow empty loop deletion by treating probe intrinsic isDroppable
- Some refactoring.
Reviewed By: wenlei
Differential Revision: https://reviews.llvm.org/D110847
In the review of D111085 it was pointed out that these functions don't
conform to the naming scheme in use in LLVM. With this commit we should
be good for all of FPEnv.h.
To better reflect the meaning of the now-disambiguated {GlobalValue,
GlobalAlias}::getBaseObject after breaking off GlobalIFunc::getResolverFunction
(D109792), the function is renamed to getAliaseeObject.
Currently the max alignment representable is 1GB, see D108661.
Setting the align of an object to 4GB is desirable in some cases to make sure the lower 32 bits are clear which can be used for some optimizations, e.g. https://crbug.com/1016945.
This uses an extra bit in instructions that carry an alignment. We can store 15 bits of "free" information, and with this change some instructions (e.g. AtomicCmpXchgInst) use 14 bits.
We can increase the max alignment representable above 4GB (up to 2^62) since we're only using 33 of the 64 values, but I've just limited it to 4GB for now.
The one place we have to update the bitcode format is for the alloca instruction. It stores its alignment into 5 bits of a 32 bit bitfield. I've added another field which is 8 bits and should be future proof for a while. For backward compatibility, we check if the old field has a value and use that, otherwise use the new field.
Updating clang's max allowed alignment will come in a future patch.
Reviewed By: hans
Differential Revision: https://reviews.llvm.org/D110451
Currently the max alignment representable is 1GB, see D108661.
Setting the align of an object to 4GB is desirable in some cases to make sure the lower 32 bits are clear which can be used for some optimizations, e.g. https://crbug.com/1016945.
This uses an extra bit in instructions that carry an alignment. We can store 15 bits of "free" information, and with this change some instructions (e.g. AtomicCmpXchgInst) use 14 bits.
We can increase the max alignment representable above 4GB (up to 2^62) since we're only using 33 of the 64 values, but I've just limited it to 4GB for now.
The one place we have to update the bitcode format is for the alloca instruction. It stores its alignment into 5 bits of a 32 bit bitfield. I've added another field which is 8 bits and should be future proof for a while. For backward compatibility, we check if the old field has a value and use that, otherwise use the new field.
Updating clang's max allowed alignment will come in a future patch.
Reviewed By: hans
Differential Revision: https://reviews.llvm.org/D110451
Currently the max alignment representable is 1GB, see D108661.
Setting the align of an object to 4GB is desirable in some cases to make sure the lower 32 bits are clear which can be used for some optimizations, e.g. https://crbug.com/1016945.
This uses an extra bit in instructions that carry an alignment. We can store 15 bits of "free" information, and with this change some instructions (e.g. AtomicCmpXchgInst) use 14 bits.
We can increase the max alignment representable above 4GB (up to 2^62) since we're only using 33 of the 64 values, but I've just limited it to 4GB for now.
The one place we have to update the bitcode format is for the alloca instruction. It stores its alignment into 5 bits of a 32 bit bitfield. I've added another field which is 8 bits and should be future proof for a while. For backward compatibility, we check if the old field has a value and use that, otherwise use the new field.
Updating clang's max allowed alignment will come in a future patch.
Reviewed By: hans
Differential Revision: https://reviews.llvm.org/D110451
As described on D111049, we're trying to remove the <string> dependency from error handling and replace uses of report_fatal_error(const std::string&) with the Twine() variant which can be forward declared.
Stop using APInt constructors and methods that were soft-deprecated in
D109483. This fixes all the uses I found in llvm, except for the APInt
unit tests which should still test the deprecated methods.
Differential Revision: https://reviews.llvm.org/D110807
This is basically the same change as 42cc7f3c52
but for integer attributes. Rather than treating each attribute
individually, handle them all the same way. The only thing that
needs to be done per attribute is specify how get/add convert
from/to the raw representation.
This patch adds the functionalities to print MDNode in tree shape. For
example, instead of printing a MDNode like this:
```
<0x5643e1166888> = !DILocalVariable(name: "foo", arg: 2, scope: <0x5643e11c9740>, file: <0x5643e11c6ec0>, line: 8, type: <0x5643e11ca8e0>, flags: DIFlagPublic | DIFlagFwdDecl, align: 8)
```
The printTree/dumpTree functions can give you:
```
<0x5643e1166888> = !DILocalVariable(name: "foo", arg: 2, scope: <0x5643e11c9740>, file: <0x5643e11c6ec0>, line: 8, type: <0x5643e11ca8e0>, flags: DIFlagPublic | DIFlagFwdDecl, align: 8)
<0x5643e11c9740> = distinct !DISubprogram(scope: null, spFlags: 0)
<0x5643e11c6ec0> = distinct !DIFile(filename: "file.c", directory: "/path/to/dir")
<0x5643e11ca8e0> = distinct !DIDerivedType(tag: DW_TAG_pointer_type, baseType: <0x5643e11668d8>, size: 1, align: 2)
<0x5643e11668d8> = !DIBasicType(tag: DW_TAG_unspecified_type, name: "basictype")
```
Which is useful when using it in debugger. Where sometimes printing the
whole module to see all MDNodes is too expensive.
Differential Revision: https://reviews.llvm.org/D110113
AsmWriterContext is a simple compound that stores TypePrinting,
SlotTracker (i.e. "Machine" in AsmWriter), and Module instances -- three
of the most commonly used objects in the AsmWriter infrastructure.
Previously these three objects are passed as separate function arguments
to most of the printer functions in this file. Tidying them up can bring
easier code refactoring on printer functions in the future (e.g. when we
want to pass additional objects to all printer functions).
NOTE: Theoritically, this patch should be NFC.
Differential Revision: https://reviews.llvm.org/D110112
ParameterABIAttributes functions work with unsigned integers as the index, so having the getter be signed makes no sense. Additionally, for this reason, the loop vars that were signed were changed to unsigned too.
Reviewed By: jeroen.dobbelaere
Differential Revision: https://reviews.llvm.org/D110344
We expose the fact that we rely on unsigned wrapping to iterate through
all indexes. This can be confusing. Rather, keeping it as an
implementation detail through an iterator is less confusing and is less
code.
Reviewed By: rnk
Differential Revision: https://reviews.llvm.org/D110885
`__stdcall`, `__fastcall` and `__vectorcall` return large values via a
hidden pointer argument. However, the size of that argument should not
be included in the argument list byte count suffix added to the
function's decorated name.
This patch fixes that issue so that LLVM generates the same decorated
name as MSVC does.
MSVC example: https://godbolt.org/z/nc35MKPhr
Reviewed By: rnk
Differential Revision: https://reviews.llvm.org/D110719
To avoid using the AST when emitting diagnostics, split the "dontcall"
attribute into "dontcall-warn" and "dontcall-error", and also add the
frontend attribute value as the LLVM attribute value. This gives us all
the information to report diagnostics we need from within the IR (aside
from access to the original source).
One downside is we directly use LLVM's demangler rather than using the
existing Clang diagnostic pretty printing of symbols.
Previous revisions didn't properly declare the new dependencies.
Reviewed By: nickdesaulniers
Differential Revision: https://reviews.llvm.org/D110364
To avoid using the AST when emitting diagnostics, split the "dontcall"
attribute into "dontcall-warn" and "dontcall-error", and also add the
frontend attribute value as the LLVM attribute value. This gives us all
the information to report diagnostics we need from within the IR (aside
from access to the original source).
One downside is we directly use LLVM's demangler rather than using the
existing Clang diagnostic pretty printing of symbols.
Reviewed By: nickdesaulniers
Differential Revision: https://reviews.llvm.org/D110364
This patch is for fixing potential insertElement-related bugs like D93818.
```
V = UndefValue::get(VecTy);
for(...)
V = Builder.CreateInsertElementy(V, Elt, Idx);
=>
V = PoisonValue::get(VecTy);
for(...)
V = Builder.CreateInsertElementy(V, Elt, Idx);
```
Like above, this patch changes the placeholder V to poison.
The patch will be separated into several commits.
Reviewed By: aqjune
Differential Revision: https://reviews.llvm.org/D110311
This patch emits DW_TAG_namelist and DW_TAG_namelist_item for fortran
namelist variables. DICompositeType is extended to support this fortran
feature.
Reviewed By: aprantl
Differential Revision: https://reviews.llvm.org/D108553
Thinlink provides an opportunity to propagate function attributes across modules, enabling additional propagation opportunities.
This change propagates (currently default off, turn on with `disable-thinlto-funcattrs=1`) noRecurse and noUnwind based off of function summaries of the prevailing functions in bottom-up call-graph order. Testing on clang self-build:
1. There's a 35-40% increase in noUnwind functions due to the additional propagation opportunities.
2. Throughput is measured at 10-15% increase in thinlink time which itself is 1.5% of E2E link time.
Implementation-wise this adds the following summary function attributes:
1. noUnwind: function is noUnwind
2. mayThrow: function contains a non-call instruction that `Instruction::mayThrow` returns true on (e.g. windows SEH instructions)
3. hasUnknownCall: function contains calls that don't make it into the summary call-graph thus should not be propagated from (e.g. indirect for now, could add no-opt functions as well)
Testing:
Clang self-build passes and 2nd stage build passes check-all
ninja check-all with newly added tests passing
Reviewed By: tejohnson
Differential Revision: https://reviews.llvm.org/D36850
This is a fix for the issue reported at
https://reviews.llvm.org/D110043#3019942:
The ElementSize is a uint64_t and as such may be larger than the
index space, or be negative in the index space. This is UB, but
shouldn't cause assertion failures.
We address this by detecting whether the size is too large and
use a zero index in that case (which is always conservatively
correct).
Differential Revision: https://reviews.llvm.org/D110437