After the changes introduced by D106799 it is possible to tag
outlined function with both AlwaysInline and NoInline attributes using
-fno-inline command line options.
This issue is similiar to D107649.
Differential Revision: https://reviews.llvm.org/D112645
This patch changes the ScriptedProcess test to use a stack-only skinny
corefile as a backing store.
The corefile is saved as a temporary file at the beginning of the test,
and a second target is created for the ScriptedProcess. To do so, we use
the SBAPI from the ScriptedProcess' python script to interact with the
corefile process.
This patch also makes some small adjustments to the other ScriptedProcess
scripts to resolve some inconsistencies and removes the raw memory dump
that was previously checked in.
Differential Revision: https://reviews.llvm.org/D112047
Signed-off-by: Med Ismail Bennani <medismail.bennani@gmail.com>
This patch changes the `ScriptedThread` initializer in couple of ways:
- It replaces the `SBTarget` parameter by a `SBProcess` (pointing to the
`ScriptedProcess` that "owns" the `ScriptedThread`).
- It adds a reference to the `ScriptedProcessInfo` Dictionary, to pass
arbitrary user-input to the `ScriptedThread`.
This patch also fixes the SWIG bindings methods that call the
`ScriptedProcess` and `ScriptedThread` initializers by passing all the
arguments to the appropriate `PythonCallable` object.
Differential Revision: https://reviews.llvm.org/D112046
Signed-off-by: Med Ismail Bennani <medismail.bennani@gmail.com>
This patch adds a new `StructuredData::Dictionary` constructor that
takes a `StructuredData::ObjectSP` as an argument. This is used to pass
the opaque_ptr from the `SBStructuredData` used to initialize a
ScriptedProecss, to the `ProcessLaunchInfo` class.
This also updates `SBLaunchInfo::SetScriptedProcessDictionary` to
reflect the formentionned changes which solves the nullptr deref.
Differential Revision: https://reviews.llvm.org/D112107
Signed-off-by: Med Ismail Bennani <medismail.bennani@gmail.com>
Remove the getSmallestBoundingIndex method that has no uses anymore.
Depends On D113548
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D113549
The call to getTypeSizeInChars() is replaced with
getTypeSizeInCharsIfKnown(), which does not crash on forward declared
structs. This only affects printing.
Differential Revision: https://reviews.llvm.org/D113570
Replace the getSmallestBoundingIndex method used in promotion by getConstantUpperBoundForIndex that uses flat affine constraints to compute a constant upper bound.
Depends On D113546
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D113548
Similar to D113199 but dealing with the vector size, this extends the
fptosi+fmul to fixed point fold to handle fptosi.sat nodes that are
equally viable, so long as the saturation width matches the output
width.
Differential Revision: https://reviews.llvm.org/D113200
Texture/sampler/surface operands can be either a register or an
immediate (an index of .texref, .samplerref or .surfref).
TableGen declarations for these instructions used to only have
Int64Regs operands, so this caused issues when machine verifier
is turned on:
*** Bad machine code: Expected a register operand. ***
- function: bar
- basic block: %bb.0 (0x55b144d99ab8)
- instruction: %4:int32regs = SULD_1D_I32_TRAP 0, killed %2:int32regs
- operand 1: 0
The solution is to duplicate these instructions for all possible
operand types (i16imm and Int64Regs). Since this would
essentially double the amount code in TableGen, the patch also
does some refactoring for the original instructions to keep
things manageable.
Differential Revision: https://reviews.llvm.org/D112232
Use the custom upper bound computation in hoisting by the new getUpperBoundForIndex method.
Depends On D113546
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D113547
Use CodegenStrategy instead of a separate test pass to test iterator interchange.
Depends On D113409
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D113550
Entropic scheduling with exec-time option can be misled, if inputs
on the right path to become crashing inputs accidentally take more
time to execute before it's added to the corpus. This patch, by letting
more of such inputs added to the corpus (four inputs of size 7 to 10,
instead of a single input of size 2), reduces possibilities of being
influenced by timing flakiness.
A longer-term fix could be to reduce timing flakiness in the fuzzer;
one way could be to execute inputs multiple times and take average of
their execution time before they are added to the corpus.
Reviewed By: morehouse
Differential Revision: https://reviews.llvm.org/D113544
`llvm-otool -tV foo.o` and `llvm-objdump --macho -d foo.o` would
previously fail on object files containing @TLVPPAGE or @TLVPPAGEOFF relocs.
Move llvm-objdump-specific test from
llvm/test/MC/AArch64/arm64-tls-modifiers-darwin.s to new
llvm/test/tools/llvm-objdump/MachO/disassemble-arm64-tlv-modifers.test
and put test for this fix to that new file.
Fixes PR52356.
Differential Revision: https://reviews.llvm.org/D112843
Remove padding test pass that was replaced by CodegenStrategy.
Depends On D113411
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D113412
Extension of D112504. Lower amdgpu printf to `__llvm_omp_vprintf`
which takes the same const char*, void* arguments as cuda vprintf and also
passes the size of the void* alloca which will be needed by a non-stub
implementation of `__llvm_omp_vprintf` for amdgpu.
This removes the amdgpu link error on any printf in a target region in favour
of silently compiling code that doesn't print anything to stdout.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D112680
[NFC] As part of using inclusive language within the llvm project and to match
the renamed master branch of `google/benchmark`, this patch replaces master with
main in the benchmark releasing docs.
Reviewed By: kbobyrev
Differential Revision: https://reviews.llvm.org/D113513
Replace the getSmallestBoundingIndex method used in padding by getConstantUpperBoundForIndex that uses flat affine constraints to compute a constant upper bound.
Depends On D113398
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D113546
Use CodegenStrategy instead of a separate test pass to test hoisting.
Depends On D113410
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D113411
Use CodegenStrategy instead of a separate test pass to test padding.
Depends On D113409
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D113410
Use AffineApplyOp instead of SubIOp to compute the padding width when creating a pad tensor operation.
Depends On D113382
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D113404
A problem was noted in the post-commit review for
c36b7e21bd / D113035 :
If the source type is not integer or integer vector,
then we could crash when trying to ComputeNumSignBits().
This patch adds conversion for basic box operations that extract
information from the box.
This patch is part of the upstreaming effort from fir-dev branch.
Reviewed By: awarzynski
Differential Revision: https://reviews.llvm.org/D113551
Co-authored-by: Eric Schweitz <eschweitz@nvidia.com>
Add constant to index the different values in a box so that
they can be reused for the codegen part as well.
Reviewed By: awarzynski
Differential Revision: https://reviews.llvm.org/D113553
Currently, the floating point instructions that depend on
rounding mode are correctly marked in the PPC back end with
an implicit use of the RM register. Similarly, instructions
that explicitly define the register are marked with an
implicit def of the same register. So for the most part,
RM-using code won't be moved across RM-setting instructions.
However, calls are not marked as RM-setting instructions so
code can be moved across calls. This is generally desired,
but so is the ability to turn off this behaviour with an
appropriate option - and -frounding-math really should be
that option.
This patch provides a set of call instructions (for direct
and indirect calls) that are marked with an implicit def of
the RM register. These will be used for calls that are marked
with the strictfp attribute.
Differential revision: https://reviews.llvm.org/D111433
Add a flag to control if CodegenStrategy runs the EnablePass between the transformations.
Depends On D113382
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D113409
Remove unused members and store the indexing and packing loops in SmallVector.
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D113398
If processLogicalImmediate fails, we should return from the function
without changing InsInstrs or DelInstrs. This happens for
CodeGen/AArch64/urem-seteq-nonzero.ll LIT test as described in
https://reviews.llvm.org/D99662#2662296.
Callers of genAlternativeCodeSequence skip patterns where InsInstrs
stays empty, so this does not cause any issues now.
Differential Revision: https://reviews.llvm.org/D100047
When reducing functions with very large basic blocks (~ almost 1 million
BBs), the majority of time is spent maintaining the order in the std::set
for the basic blocks to keep.
In those cases, DenseSet<> is much more efficient. Use it instead.
Remove the padding options from the tiling options since padding is now implemented by a separate pattern/pass introduced in https://reviews.llvm.org/D112412.
The revsion remove the tile-and-pad-tensors.mlir and replaces it with the pad.mlir that tests padding in isolation (without tiling). Similarly, hoist-padding.mlir is replaced by pad-and-hoist.mlir introduced in https://reviews.llvm.org/D112713.
Depends On D112838
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D113382
NFC refactor of D113351, pulling out the APInt split/merge code from the BuildVectorSDNode bits extraction into a BuildVectorSDNode::recastRawBits helper. This is to allow us to reuse the code when we're packing constant folded APInt data back together.
This patch is part of the upstreaming effort from fir-dev.
Differential Revision: https://reviews.llvm.org/D113560
Co-authored-by: Eric Schweitz <eschweitz@nvidia.com>
Co-authored-by: Jean Perier <jperier@nvidia.com>
This patch is part of the upstreaming effort for fir-dev.
Differential Revision: https://reviews.llvm.org/D113559
Co-authored-by: Jean Perier <jperier@nvidia.com>
Unfortunately sinking recipes for first-order recurrences relies on
the original position of recipes. So if a recipes needs to be sunk after
an optimized induction, it needs to stay in the original position, until
sinking is done. This is causing PR52460.
To fix the crash, keep the recipes in the original position until
sink-after is done.
Post-commit follow-up to c45045bfd0 to address PR52460.
This reverts commit 7cd273c339.
Several patches with tests fixes have been applied:
0cada82f0a "[Test] Remove incorrect test in GVN"
97cb13615d "[Test] Separate IndVars test into AArch64 and X86 parts"
985cc490f1 "[Test] Remove separated test in IndVars",
and test failures caused by 5ec2386 should be resolved now.
This is useful for ops such as scf::IfOp, which always bufferize in-place.
This commit is in preparation of decoupling BufferizationAliasInfo from the SCF dialect.
Differential Revision: https://reviews.llvm.org/D113339