When ASan is enabled together with, e.g., Dead Virtual Function Elimination, the
latter relies on type metadata to determine whether certain virtual calls can be
removed. However, ASan currently does not copy type metadata, which can cause
virtual function calls to be incorrectly removed.
Differential Revision: https://reviews.llvm.org/D88368
There are two `WasmSignature` structs, one in
include/llvm/BinaryFormat/Wasm.h and the other in
lib/MC/WasmObjectWriter.cpp. I don't know why they got separated in this
way in the first place, but it seems we can unify them to use the one in
Wasm.h for all cases.
Reviewed By: dschuff, sbc100
Differential Revision: https://reviews.llvm.org/D88428
Some instructions (G_LOAD, G_SELECT, G_UNMERGE_VALUES) check if their uses
will define/use FPRs (using `onlyUsesFP` and `onlyDefinesFP`).
The register bank of a use isn't necessarily known when an instruction asks for
this.
Teach `hasFPConstraints` to look at the instructions feeding into a G_PHI when
its destination bank is unknown. If any of them are FPR, assume the entire
G_PHI will also be assigned an FPR.
Since a phi can have many inputs, and those inputs can in turn be phis,
restrict the search depth to a very low number.
Also improve the docs for `hasFPConstraints` and friends a little.
This is a 0.3% code size improvement on CTMark/Bullet at -O3, and a 0.2% code
size improvement on CTMark/pairlocalalign at -O3.
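As a rough illustration of the bounded search through G_PHI inputs described
above, the following C++ sketch walks the inputs of a phi whose bank is unknown
and reports whether any of them is already on the FPR bank. `Bank`, `Node`,
`feedsFromFPR`, the `MaxDepth` value and the `Example` graph are hypothetical
stand-ins for illustration, not the actual AArch64 GlobalISel code.

```cpp
// Hypothetical stand-ins for register banks and instructions; the real code
// operates on MachineInstrs and register bank info.
enum class Bank { Unknown, GPR, FPR };

struct Node {
  bool IsPhi;    // true for a G_PHI-like node
  Bank RegBank;  // bank of the destination, if already known
  int Inputs[2]; // indices of the feeding nodes, -1 when unused
};

// Assumed depth limit; the commit message only says "a very low number".
constexpr unsigned MaxDepth = 2;

// Returns true if the node, or anything feeding it through phis with an
// unknown bank, is on the FPR bank. The depth limit also guarantees
// termination when phis form a cycle.
constexpr bool feedsFromFPR(const Node *Nodes, int Idx, unsigned Depth = 0) {
  if (Idx < 0 || Depth > MaxDepth)
    return false;
  const Node &N = Nodes[Idx];
  if (N.RegBank == Bank::FPR)
    return true;
  if (!N.IsPhi || N.RegBank != Bank::Unknown)
    return false;
  for (int In : N.Inputs)
    if (feedsFromFPR(Nodes, In, Depth + 1))
      return true;
  return false;
}

// Small example graph (illustrative only).
constexpr Node Example[] = {
    {true, Bank::Unknown, {1, 2}}, // 0: phi(1, 2)
    {false, Bank::GPR, {-1, -1}},  // 1: GPR definition
    {true, Bank::Unknown, {3, 0}}, // 2: phi(3, 0), forms a cycle with node 0
    {false, Bank::FPR, {-1, -1}},  // 3: FPR definition
    {true, Bank::Unknown, {1, 4}}, // 4: phi(1, 4), fed only by GPR and itself
};
```

In this sketch node 0 is treated as FPR because node 3 is reachable within the
depth limit, while node 4 is not, since nothing feeding it is on the FPR bank.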
Differential Revision: https://reviews.llvm.org/D88177
This should be NFC unless some target was expecting that
some form of cttz/ctlz/memcpy is free in terms of size/latency
but not free in throughput cost.
This should be close to NFC (no-functional-change), but I
can't completely rule out that some call on some target
travels down a different path. There's an especially large
amount of code spaghetti in this part of the cost model.
The goal is to clean up the intrinsic cost handling so
we can canonicalize to the new min/max intrinsics without
causing regressions.
Support emitting ANDSXrs and ANDSWrs in `emitTST`. Update opt-fold-compare.mir
to show that it works.
Differential Revision: https://reviews.llvm.org/D87530
When we see this:
```
%and = G_AND %x, %y
%xor = G_XOR %and, %y
```
Produce this:
```
%not = G_XOR %x, -1
%new_and = G_AND %not, %y
```
as long as we are guaranteed to eliminate the original G_AND.
Also matches all commuted forms. E.g.
```
%and = G_AND %y, %x
%xor = G_XOR %y, %and
```
will be matched as well.
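The rewrite relies on the bitwise identity `(x & y) ^ y == ~x & y`. A minimal
C++ sketch that checks it is shown below; the function names are made up for
illustration, and plain `unsigned` values stand in for the generic virtual
registers.

```cpp
// Original sequence: %and = G_AND %x, %y ; %xor = G_XOR %and, %y
constexpr unsigned beforeCombine(unsigned X, unsigned Y) {
  unsigned And = X & Y;
  return And ^ Y;
}

// Rewritten sequence: %not = G_XOR %x, -1 ; %new_and = G_AND %not, %y
constexpr unsigned afterCombine(unsigned X, unsigned Y) {
  unsigned Not = X ^ ~0u; // XOR with all-ones is a bitwise NOT
  return Not & Y;
}

// Both sequences are purely bitwise, so checking every pair of 4-bit values
// exercises all per-bit input combinations.
constexpr bool identityHolds() {
  for (unsigned X = 0; X < 16; ++X)
    for (unsigned Y = 0; Y < 16; ++Y)
      if (beforeCombine(X, Y) != afterCombine(X, Y))
        return false;
  return true;
}
```

For example, with `x = 0xC` and `y = 0xA` both sequences produce `0x2`.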
Differential Revision: https://reviews.llvm.org/D88104
By default, clangd scores a code completion item using a heuristics model.
Scoring can instead be done by a Decision Forest model by passing `--ranking_model=decision_forest` to
clangd.
Features omitted from the model:
- `NameMatch` is excluded because the final score must be multiplicative in `NameMatch` to allow rescoring by the editor.
- `NeedsFixIts` is excluded because generating a dataset that needs 'fixits' is non-trivial.
There are multiple ways (heuristics) to combine the above two features with the prediction of the DF:
- `NeedsFixIts` is used as is with a penalty of `0.5`.
Various alternatives for combining NameMatch `N` and the Decision Forest prediction `P`:
- N * scale(P, 0, 1): Linearly scale the output of the model to the range [0, 1].
- N * a^P:
  - More natural: the prediction of each Decision Tree can be considered a multiplicative boost (like NameMatch).
  - Ordering is independent of the absolute value of P. The ratio of the scores of two items is proportional to `a^{difference in model prediction score}`. A higher `a` gives more weight to the model output compared to the NameMatch score.
Baseline MRR = 0.619
MRR for various combinations:
N * P = 0.6346, advantage%=2.5768
N * 1.1^P = 0.6600, advantage%=6.6853
N * **1.2**^P = 0.6669, advantage%=**7.8005**
N * **1.3**^P = 0.6668, advantage%=**7.7795**
N * **1.4**^P = 0.6659, advantage%=**7.6270**
N * 1.5^P = 0.6646, advantage%=7.4200
N * 1.6^P = 0.6636, advantage%=7.2671
N * 1.7^P = 0.6629, advantage%=7.1450
N * 2^P = 0.6612, advantage%=6.8673
N * 2.5^P = 0.6598, advantage%=6.6491
N * 3^P = 0.6590, advantage%=6.5242
N * scaled[0, 1] = 0.6465, advantage%=4.5054
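The `N * a^P` combination above can be sketched in a few lines of C++. This is
only an illustration of the formula: `combineScore` and `scoreRatio` are
hypothetical helpers, not clangd functions, and the base `a` is passed in
explicitly rather than fixed to any particular value.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper: combine the NameMatch score N with the Decision Forest
// prediction P as N * Base^P.
inline double combineScore(double NameMatch, double Prediction, double Base) {
  assert(Base > 0.0 && "the base must be positive");
  return NameMatch * std::pow(Base, Prediction);
}

// Ratio of the final scores of two items. Algebraically this equals
// (N1 / N2) * Base^(P1 - P2), so it depends only on the difference of the
// predictions, not on their absolute values.
inline double scoreRatio(double N1, double P1, double N2, double P2,
                         double Base) {
  return combineScore(N1, P1, Base) / combineScore(N2, P2, Base);
}
```

With `N1 = 0.5, P1 = 2` and `N2 = 1.0, P2 = 0`, a base of 1.2 keeps the second
item ahead (0.72 versus 1.0), while a base of 2 puts the first item ahead
(2.0 versus 1.0), which illustrates how a larger `a` shifts weight towards the
model prediction.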
Differential Revision: https://reviews.llvm.org/D88281
We need to preserve the LD_LIBRARY_PATH environment variable when
spawning a child process (certain setups rely on non-standard paths
for e.g. libstdc++). In order to achieve this, set
LLVM_CRC_UNIXCRCRETURNCODE in the parent process instead of creating
the child's environment from scratch.
Reviewed By: aganea
Differential Revision: https://reviews.llvm.org/D88308
Extend -fsanitize=nullability-arg to handle call sites which accept C++
member pointers.
rdar://62476022
Differential Revision: https://reviews.llvm.org/D88336
Replaces the dummy CodeCompletion model with a trained DecisionForest
model.
The features.json needs to be manually curated specifying the features
to be used. This is a one-time cost and does not change if the model
changes until we decide to add/remove features.
Differential Revision: https://reviews.llvm.org/D88071
- `-cl-fp32-correctly-rounded-divide-sqrt` is an OpenCL-specific option,
and `correctly-rounded-divide-sqrt-fp-math` should be added only for
OpenCL.
Differential revision: https://reviews.llvm.org/D88303
Added patterns to generate an SSAT or USAT with shift for
SSAT/USAT instructions that are matched from IR patterns.
Differential Revision: https://reviews.llvm.org/D88145
Essentially the same as the signed variants from D88259. Also includes a cleanup of the lowering function.
Differential Revision: https://reviews.llvm.org/D88317
It was mentioned in D88276 that when a phi node is visited, the terminators of its incoming edges should be used for CtxI.
This patch makes two functions (ComputeNumSignBitsImpl, isGuaranteedNotToBeUndefOrPoison) do so.
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D88360
Fixes a minor bug in D88402 where we were using the original shift constant (with undefs) instead of one with the splat values (re)splatted to all elements.
This is a part of https://bugs.llvm.org/show_bug.cgi?id=47581.
We have the following computation:
```
(1) uint64_t Location = Address & 0x7fffffff;
(2) if (Location & 0x04000000)
(3) Location |= (uint64_t) ~0x7fffffff;
(4) return Location + Place;
```
Line 2 contains a typo. The constant should be `0x40000000`,
not `0x04000000`, because the intention here is to sign extend `Location`,
which is a 31-bit signed value.
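To make the difference between the two constants concrete, here is a small C++
sketch of the sign-extension step only (the final addition of `Place` is left
out). The function name `signExtend31` and the `SignBit` parameter are
illustrative assumptions; the parameter exists solely so that both masks can be
compared side by side.

```cpp
#include <cstdint>

// Sign-extension step from the snippet above, with the tested bit made a
// parameter (an assumption made for comparison purposes).
constexpr uint64_t signExtend31(uint64_t Address, uint64_t SignBit) {
  uint64_t Location = Address & 0x7fffffff;
  if (Location & SignBit)
    Location |= (uint64_t)~0x7fffffff; // sets bits 31..63
  return Location;
}

constexpr uint64_t FixedMask = 0x40000000; // bit 30, the sign bit of a 31-bit value
constexpr uint64_t TypoMask = 0x04000000;  // bit 26, the mistyped constant
```

With the corrected mask, `0x40000000` (the most negative 31-bit value) extends
to `0xFFFFFFFFC0000000`, whereas the mistyped mask leaves it positive. The
mistyped mask also wrongly extends the positive value `0x04000000` to
`0xFFFFFFFF84000000`.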
Differential revision: https://reviews.llvm.org/D88407
We have been running tests/benchmarks downstream with tail-predication enabled
for some time now and this behaves as expected: we are not aware of any
correctness issues, and this performs better across the board than with
tail-predication disabled. Time to flip the switch!
Differential Revision: https://reviews.llvm.org/D88093
Similar to collecting information from branches guarding a loop, we can
also collect information from assumes dominating the loop header.
Fixes PR47247.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D87854
The module flags are now derived exclusively from the compiler flag `-mbranch-protection`.
The note is generated based on the module flags accordingly.
After this change, a compile unit without functions won't have
the .note.gnu.property section if the compiler flag is not present [1].
[1] https://bugs.llvm.org/show_bug.cgi?id=46480
Reviewed By: chill
Differential Revision: https://reviews.llvm.org/D80791
A while ago, we converted isShuffleEquivalent/isTargetShuffleEquivalent to both use IsElementEquivalent internally.
This allows us to make the shuffle args optional like isTargetShuffleEquivalent and update foldShuffleOfHorizOp to use isShuffleEquivalent (which it should, as it's using an ISD::VECTOR_SHUFFLE mask).
Add a tweak that populates an empty switch statement of an enumeration type with all of the enumerators of that type.
Before:
```
enum Color { RED, GREEN, BLUE };
void f(Color color) {
switch (color) {}
}
```
After:
```
enum Color { RED, GREEN, BLUE };
void f(Color color) {
switch (color) {
case RED:
case GREEN:
case BLUE:
break;
}
}
```
Reviewed By: sammccall
Differential Revision: https://reviews.llvm.org/D88383
According to the POWER ISA, floating point instructions altering exception
bits in FPSCR should be 'may raise FP exception' (excluding those that
read or write the whole FPSCR directly, like mffs/mtfsf). We need to
model FPSCR well in future patches to handle the special case properly.
Instructions added mayRaiseFPException:
- fre(s)/frsqrte(s)
- fmadd(s)/fmsub(s)/fnmadd(s)/fnmsub(s)
- xscmpoqp/xscmpuqp/xscmpeqdp/xscmpgedp/xscmpgtdp
- xscvdphp/xscvhpdp/xvcvhpsp/xvcvsphp/xsrqpxp
- xsmaxcdp/xsmincdp/xsmaxjdp/xsminjdp
Instructions removed mayRaiseFPException:
- xstdivdp/xvtdiv(d|s)p/xstsqrtdp/xvtsqrt(d|s)p
- xsabsdp/xsnabsdp/xvabs(d|s)p/xvnabs(d|s)p
- xsnegdp/xscpsgndp/xvneg(d|s)p/xvcpsgn(d|s)p
- xvcvsxwdp/xvcvuxwdp
- xscvdpspn/xscvspdpn
Reviewed By: steven.zhang
Differential Revision: https://reviews.llvm.org/D87738
This tends to increase code size but more importantly it reduces vgpr
usage, and could avoid costly readfirstlanes if the result needs to be
in an sgpr.
Differential Revision: https://reviews.llvm.org/D88245
With the recent patches to the ASTImporter that improve template type importing
(D87444), most of the import-std-module tests can now finally import the
type of the STL container they are testing. This patch removes most of the casts
that were added to simplify types to something the ASTImporter can import
(for example, std::vector<int>::size_type was cast to `size_t` until now).
Also adds the missing tests that require referencing the container type (for
example simply printing the whole container) as here we couldn't use a casting
workaround.
The only casts that remain are in the forward_list tests that reference
the iterator and the stack test. Both tests are still failing to import the
respective container type correctly (or crash while trying to import).
Currently we are always recognizing the `SHT_MIPS_ABIFLAGS` section,
even on non-MIPS targets.
The problem with doing this is briefly discussed in D88228, which does the same for `SHT_ARM_EXIDX`:
"The problem is that `SHT_ARM_EXIDX` shares the value with `SHT_X86_64_UNWIND (0x70000001U)`.
We might have other machine specific conflicts, e.g.
`SHT_ARM_ATTRIBUTES` vs `SHT_MSP430_ATTRIBUTES` vs `SHT_RISCV_ATTRIBUTES (0x70000003U)`."
I think we should only recognize target specific sections when the machine type
matches. I.e. `SHT_MIPS_*` should be recognized only on `MIPS`, `SHT_ARM_*`
only on `ARM` etc.
This patch stops recognizing `SHT_MIPS_ABIFLAGS` on non-MIPS targets.
Note: I had to update `ScalarEnumerationTraits<ELFYAML::MIPS_ISA>::enumeration`, because
otherwise the test crashes by calling `llvm_unreachable`.
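The idea of recognizing target-specific section types only when the machine
type matches can be sketched as follows. This is a simplified illustration, not
the actual implementation: `Machine`, `SectionKind` and `classify` are
hypothetical names, and only the shared value `0x70000001U` quoted above is
taken from the discussion.

```cpp
#include <cstdint>

// Hypothetical, simplified machine and section kinds for illustration.
enum class Machine { ARM, X86_64, MIPS, Other };
enum class SectionKind { ArmExidx, X86_64Unwind, Unknown };

// Value shared by SHT_ARM_EXIDX and SHT_X86_64_UNWIND.
constexpr uint32_t SharedTypeValue = 0x70000001U;

// The same numeric type maps to different section kinds depending on the
// machine, and to nothing at all on machines that do not define it.
constexpr SectionKind classify(Machine M, uint32_t Type) {
  if (Type != SharedTypeValue)
    return SectionKind::Unknown;
  switch (M) {
  case Machine::ARM:
    return SectionKind::ArmExidx;
  case Machine::X86_64:
    return SectionKind::X86_64Unwind;
  default:
    return SectionKind::Unknown;
  }
}
```

Under this scheme a MIPS object that happens to contain a section of type
`0x70000001U` is not misreported as an ARM or x86-64 section.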
Differential revision: https://reviews.llvm.org/D88294
This is a reimplementation of the overflow checks for the element count,
i.e. the 2nd argument of intrinsic get.active.lane.mask. The element
count is lowered in each iteration of the tail-predicated loop, and
we must prove that this expression doesn't overflow.
Many thanks to Eli Friedman and Sam Parker for all their help with
this work.
Differential Revision: https://reviews.llvm.org/D88086
9d9a11c7be added this check for predicatable instructions between the
D/WLSTP and the loop's start, but it was missing the last instruction in
the block. Change it to use some iterators instead.
Differential Revision: https://reviews.llvm.org/D88354