To be consistent with OpenACC and will find the tests in one single
directory for OpenMP.
Reviewed By: kiranchandramohan
Differential Revision: https://reviews.llvm.org/D127529
Implement double precision FMA (Fused Multiply-Add) for targets without
FMA instructions using __uint128_t to store the intermediate results.
Reviewed By: michaelrj, sivachandra
Differential Revision: https://reviews.llvm.org/D124495
Replace a single dash with a double dash for options that have more
than a single letter.
llvm-bolt-wrapper.py has special treatment for output options such as
"-o" and "-w" causing issues when a single dash is used, e.g. for
"-write-dwp". The wrapper can be fixed as well, but using a double dash
has other advantages as well.
Reviewed By: rafauler
Differential Revision: https://reviews.llvm.org/D127538
A previous patch added the constant EXP_MANT_MASK to the FloatProperties
for other types of long double. This patch adds it to the special 80-bit
x87 long double.
Reviewed By: lntue
Differential Revision: https://reviews.llvm.org/D127550
Mark fragments related to split jump table as non-simple.
A function could be splitted into hot and cold fragments. A split jump table is
challenging for correctly reconstructing control flow graphs, so it was marked
as ignored. This update marks those fragments as non-simple, allowing them
to be printed and partial control flow graph construction.
Test Plan:
```
llvm-lit -a tools/bolt/test/X86/split-func-icf.s
```
This test has two functions (main, main2), each has a jump table target to the
same cold portion main2.cold.1(*2). We try to print out only this cold portion.
If it is ignored, it cannot be printed. If it is non-simple, it can be printed. We
verify that it can be printed.
Reviewed By: Amir
Differential Revision: https://reviews.llvm.org/D127464
This commit extends the UnifyAliasedResourcePass to handle scalar
types of different bitwidths. It requires to get the smaller bitwidth
resource as the canonical resource so that we can avoid subcomponent
load/store. Instead we load/store multiple smaller bitwidth ones.
Reviewed By: hanchung
Differential Revision: https://reviews.llvm.org/D127266
When running scan-build-py's analyze-build script with output format set
to sarif & html it wants to print a message on how to look at the
defects mentioning the directory name twice.
But the path argument was only given once to the logging function,
causing "TypeError: not enough arguments for format string" exception.
Differential Revision: https://reviews.llvm.org/D126974
This patch changes the `crashlog` command behavior to print the help
message if no argument was provided with the command.
rdar://94576026
Differential Revision: https://reviews.llvm.org/D127362
Signed-off-by: Med Ismail Bennani <medismail.bennani@gmail.com>
1. When checking if a candidate contains a CFI instruction, actually
iterate over all of the instructions, instead of stopping halfway
through.
2. Make sure copied CFI directives refer to the correct instruction.
Fixes https://github.com/llvm/llvm-project/issues/55842
Differential Revision: https://reviews.llvm.org/D126930
The `EndLoc` parameter was always unset so no fixit was emitted. But it is also unnecessary for determining the range so we can remove it.
Differential Revision: https://reviews.llvm.org/D127251
Our actual lowering for i1 reductions uses ctpop combined with possibly a vector negate and possibly a logic op afterwards. I believe ctpop to be low cost on all reasonable hardware.
The default costing implementation here was returning quite inconsistent costs. and/or were returning very high costs (because we seem to think moving into scalar registers is very expensive?) and others were returning lower but still too high (because of the assumed tree reduce strategy). While we should probably improve the generic costing strategy for i1 vectors, let's start by fixing the immediate problem.
Differential Revision: https://reviews.llvm.org/D127511
This brings us into alignment with AArch64, and in the process fixes a compiler crash bug in uniform store handling in the vectorizer.
Before the recent invalid cost bailout work, this would have also avoided crashes on invalid costs in some cases. I honestly think the vectorizer should gracefully bailout on uniform stores it can't use a scatter for, but it doesn't, so lets take the path of least resistance here. It's also possible that there are other vectorizer bugs AArch64 isn't seeing because of this hook; we don't want to be finding them either.
Differential Revision: https://reviews.llvm.org/D127514
We need a preheader and a single latch, but we don't need a dedicated
exit.
Reviewed By: reames
Differential Revision: https://reviews.llvm.org/D127513
Our optimization pass checks for loop simplify form, before doing
the transform. The loops here aren't in loop simplify form because
the exit block has two predecessors.
Reviewed By: reames
Differential Revision: https://reviews.llvm.org/D127451
Plan is the migrate the global variable metadata for sanitizers, that's
currently carried around generally in the 'llvm.asan.globals' section,
onto the global variable itself.
This patch adds the attribute and plumbs it through the LLVM IR and
bitcode formats, but is a no-op other than that so far.
Reviewed By: vitalybuka, kstoimenov
Differential Revision: https://reviews.llvm.org/D126100
Implements eh frame handling by using generic EHFrame passes. The c++ exception handling works correctly with this change.
Reviewed By: lhames
Differential Revision: https://reviews.llvm.org/D127063
Currently, linking the device libraries requires setting a constant
that indicates the code object ABI version the compilation is
targeting.
This fixes the MLIR linking process by setting this constant to 400,
which is the value corresponding to the current code object ABI
default, version 4.
Reviewed By: Mogball
Differential Revision: https://reviews.llvm.org/D126913
The encoding of COMPUTE_TMPRING_SIZE.WAVESIZE and
SPI_TMPRING_SIZE.WAVESIZE has changed in GFX11: it is now in units
of 64 dwords instead of 256 dwords, and the field has been widened
from 13 bits to 15 bits.
Depends on D126989
Reviewed By: rampitec, arsenm, #amdgpu
Differential Revision: https://reviews.llvm.org/D127248
Add pattern to hoist scalar code outside of warp distribute region as
those cannot be distributed and we would want to execute them on all
the lanes.
Add patterns to distribute transfer_write ops. Those operations can be
distributed in different ways and it is control by user.
Differential Revision: https://reviews.llvm.org/D127152
The DWARF version was raised to 5 for all platforms which do not opt
out. Default to DWARF version to 4 for z/OS again.
Reviewed By: abhina.sreeskantharajan, uweigand
Differential Revision: https://reviews.llvm.org/D127498
Building on D126796, this commit adds the infrastructure for being able
to print out descriptions of what each warning does.
Differential Revision: https://reviews.llvm.org/D126832