Revision https://reviews.llvm.org/D122332 added a pattern transformation
where v_cmpx instructions are introduced. However, the modifiers are
not correctly inherited from the original operands. The patch
adds the source modifiers, if they are exist, or sets them to 0.
Reviewed By: foad
Differential Revision: https://reviews.llvm.org/D122489
Support `TransformerResult<void>` in the consumer callback, which
allows generic code to more naturally use the `Transformer` interface
(instead of needing to specialize on `void`).
This also delete the specialization that existed within `Transformer`
itself, instead replacing it with an `std::function` adapter.
Reviewed By: ymandel
Differential Revision: https://reviews.llvm.org/D122499
This patch moves the code to set the correct incoming block for the
backedge value to VPlan::execute.
When generating the phi node, the backedge value is temporarily added
using the pre-header as incoming block. The invalid phi node will be
fixed up during VPlan::execute after main VPlan code generation.
At the same time, the backedge value is also moved to the latch.
This change removes the requirement to create the latch block up-front
for VPWidenIntOrFpInductionRecipe::execute, which in turn will enable
modeling the pre-header in VPlan.
As an alternative, the increment could be modeled as separate recipe,
but that would require more work and a bit of redundant code, as we need
to create the step-vector during VPWidenIntOrFpInductionRecipe::execute
anyways, to create the values for different parts.
Reviewed By: Ayal
Differential Revision: https://reviews.llvm.org/D121617
This revision supports padding only a subset of the iteration dimensions via an additional padding-dimensions parameter. This control allows us to pad an operation in multiple steps. For example, one may want to pad only the output dimensions of a producer matmul fused into a consumer loop nest, before tiling and padding its reduction dimension.
Depends On D122309
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D122560
NVPTXTargetLowering::getFunctionParamOptimizedAlign, which was introduces in
D120129, contained a poorly designed assertion checking that a function with
internal or private linkage is not a kernel. It relied on invariants that
were not actually guaranteed, and that resulted in compiler crash with some
CUDA versions (see discussion with @jdoerfert in D120129). This patch changes
that assertion and makes it use isKernelFunction which is designed exactly for
such checks. This patch also includes a test with IR that caused compiler crash
before.
Differential Revision: https://reviews.llvm.org/D122562
Pass the padding options using arrays instead of lambdas. In particular pass the padding value as string and use the argument parser to create the padding value. Arrays are a more natural choice that matches the current use cases and avoids converting arrays to lambdas.
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D122309
Make the warning more specific as downstream compilers could produce other warnings.
Reviewed By: tstellar
Differential Revision: https://reviews.llvm.org/D122487
With the addition of `__attribute__((visibility("hidden")))` to the test, the test fails because AIX's current default behaviour is to ignore hidden visibility, so the expected error is not seen. This patch marks the test `XFAIL` on AIX for now.
Reviewed By: cebowleratibm
Differential Revision: https://reviews.llvm.org/D122519
The demangler had no concept of operator precendence, and would
parenthesize many more subexpressions than necessary. In particular
it would parenthesize primary-expressions, such as '4', which just
looks strange. It would also parenthesize '>' expressions, just in
case they were inside a template parameter list.
This patch fixes both issues.
* Add operator precedence to the OpInfo structure, and add a
subexpression helper that will parenthesize a lower precedence
subexpression.
* Add a 'greater-than is greater-than' indicator to the output buffer,
so the expression printer knows whether it is immediately inside a
template parameter list (and must therefore parenthesize 'expr >
expr'). This is a counter, so that ...
* Add open and close printers to the output buffer, that increment and
decrement the gt-is-gt indicator.
* Parenthesize comma operators inside comma-separated lists. (probably
a rare case, but still).
This dramatically reduces the extraneous parentheses being printed.
Reviewed By: dblaikie, bruno
Differential Revision: https://reviews.llvm.org/D120905
This was reusing a cast to GlobalVariable to check for an
Instruction, which means we'll try to dereference a null pointer
if it's not actually a GlobalVariable. We should be casting
MTI->getSource() instead.
I don't think this problem is really specific to opaque pointers,
but it certainly makes it a lot easier to reproduce.
Fixes https://github.com/llvm/llvm-project/issues/54572.
This patch adds the lowering of coarray statements to the runtime
functions. The runtime functions are currently not implemented.
This patch is part of the upstreaming effort from fir-dev branch.
Reviewed By: jeanPerier
Differential Revision: https://reviews.llvm.org/D122466
The repository field we want to leave blank is on the page as the
`Create Diff` button, so merged the instructions about leaving the
field blank and clicking the button.
The current implementation of Function Specialization does not allow
specializing more than one arguments per function call, which is a
limitation I am lifting with this patch.
My main challenge was to choose the most suitable ADT for storing the
specializations. We need an associative container for binding all the
actual arguments of a specialization to the function call. We also
need a consistent iteration order across executions. Lastly we want
to be able to sort the entries by Gain and reject the least profitable
ones.
MapVector fits the bill but not quite; erasing elements is expensive
and using stable_sort messes up the indices to the underlying vector.
I am therefore using the underlying vector directly after calculating
the Gain.
Differential Revision: https://reviews.llvm.org/D119880
The "in-class initializer" expression should be set in the field of a
default initialization expression before this expression node is created.
The `CXXDefaultInitExpr` objects are created after the AST is loaded and
at import not present in the "To" AST. And the in-class initializers of
the used fields can be missing too, these must be set at import.
This fixes a github issue #54061.
Reviewed By: martong
Differential Revision: https://reviews.llvm.org/D120824
This patch adds the ReductionClauseInterface and also adds reduction
support for `omp.parallel` operation.
Reviewed By: kiranchandramohan
Differential Revision: https://reviews.llvm.org/D122402
Split waterfall loops into multiple blocks so that exec mask
manipulation (s_and_saveexec) does not occur in the middle of
a block.
VGPR live range optimizer is updated to handle waterfall loops
spanning multiple blocks.
Reviewed By: ruiling
Differential Revision: https://reviews.llvm.org/D122200
PointerDeallocate was silently doing nothing because it relied on
Destroy that doe not do anything for Pointers. Add an option to Destroy
in order to destroy pointers.
Add a unit test for PointerDeallocate.
Differential Revision: https://reviews.llvm.org/D122492
Fix#54456: `objcopy --only-keep-debug` produces a linked image with invalid
empty dynamic section. llvm-objdump -p currently reports an error which seems
excessive.
```
% llvm-readelf -l a.out
llvm-readelf: warning: 'a.out': no valid dynamic table was found
...
```
Follow the spirit of llvm-readelf -l (D64472) and report a warning instead.
This allows later files to be dumped despite warnings for an input file, and
improves objdump compatibility in that the exit code is now 0 instead of 1.
```
% llvm-objdump -p a.out # new behavior
...
Program Header:
llvm-objdump: warning: 'a.out': invalid empty dynamic section
% objdump -p a.out
...
Dynamic Section:
```
Reviewed By: jhenderson, raj.khem
Differential Revision: https://reviews.llvm.org/D122505
When the -fdirectives-only option is used together with -E, the preprocessor
output reflects evaluation of if/then/else directives.
As such, it preserves defines and undefs of macros that are still live after
such processing. The intent is that this output could be consumed as input
to generate considered a C++20 header unit.
We strip out any (unused) defines that come from built-in, built-in-file or
command line; these are re-added when the preprocessed source is consumed.
Differential Revision: https://reviews.llvm.org/D121099
Add explanations for WPD(whole program devirtualization) and another meaning for CFI(control flow Integrity).
Reviewed By: tejohnson
Differential Revision: https://reviews.llvm.org/D122473
AArch64 requires CI to be aligned to 8 bytes due to access instructions
restrictions. E.g. the ldr with imm, where imm must be aligned to 8 bytes.
Differential Revision: https://reviews.llvm.org/D122065
In contrast to Linux it does not provide entries which can be readlinked
-- these are just regular files, not giving the expected outcome. That's
on top of procfs not being mounted by default to begin with.
This is probably the case on other BSDs as well, so I expect there will
be more ifdefs added down the road.
Reviewed By: emaste, dim
Differential Revision: https://reviews.llvm.org/D122545