Summary:
Add new options -print-changed=[dot-cfg | dot-cfg-quiet] which create
a website of DOT files showing colourized changes as the IR is changed
by passes in the new pass manager pipeline.
A new change reporter is introduced that creates a website of changes made
by passes in the opt pipeline that change the IR. The hidden option
-dot-cfg-dir=<dir> specifies a directory (defaulting to "./") into which the
website will be created.
A file passes.html is created that contains a list of all the passes that
act on the IR. Those that do not change the IR are listed as omitted
because of no change, ignored or filtered out (using -filter-print-func
and -filter-passes) or not listed in quiet mode. Those that
do change the IR are listed as a link to a DOT file which contains a
CFG depiction of the IR (ala -dot-cfg) except that the instructions,
basic blocks and links that are only in the IR before the pass (ie, removed)
and those that are only in the IR after the pass (ie, added) are shown in
red and green, respectively, while the aspects of the CFG that do not change
are shown in black. Additional hidden options
-dot-cfg-before-color=<dot named color>,
-dot-cfg-after-color=<dot named color> and
-dot-cfg-common-color=<dot named color> are defined that allow the
customization of the colors used in colorizing the CFG.
-change-printer-dot-path=<path to dot exe> is also added.
Author: Jamie Schmeiser <schmeise@ca.ibm.com>
Reviewed By: aeubanks (Arthur Eubanks)
Differential Revision: https://reviews.llvm.org/D87202
This patch adds the ZeroOpConversion pattern to LLVM IR dialect.
Conversion of aggregate types is not implemented yet and will trigger a
failure to legalize the operation. This is tested in the
convert-to-llvm-invalid.fir test file.
This patch is part of the upstreaming effort from fir-dev branch.
Reviewed By: kiranchandramohan
Differential Revision: https://reviews.llvm.org/D113014
Co-authored-by: Jean Perier <jperier@nvidia.com>
Co-authored-by: Eric Schweitz <eschweitz@nvidia.com>
These were added to prevent functions from being removed by WPO.
But that doesn't make sense, correct WPO will not remove functions we actually use.
I noticed these because compiling cc1_main.cpp was pulling in random LLVM pass headers.
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D112971
I recently spent some extra time debugging a false positive because I
didn't realize the "real" tag was in the short granule. Adding the
short tag here makes it more obvious that we could be dealing with a
short granule.
Reviewed By: hctim, eugenis
Differential Revision: https://reviews.llvm.org/D112949
The mask type for the llvm.experimental.vp.splice intrinsics must have
the same number of elements as the result type.
Reviewed By: simoll
Differential Revision: https://reviews.llvm.org/D112924
In the added cases we have two congruent IVs. IndVars widens at least one of them.
If they are both widened, then one of them is erased as they stay congruent after
widening. However if only one IV is widened, the other one stays in the loop.
We can simply erase the narrow IV and replace its uses with truncates of the
widest IV.
This is a new draft of D28234. I previously did the unorthodox thing of
pushing to it when I wasn't the original author, but since this version
- Uses `GNUInstallDirs`, rather than mimics it, as the original author
was hesitant to do but others requested.
- Is much broader, effecting many more projects than LLVM itself.
I figured it was time to make a new revision.
I am using this patch (and many back-ports) as the basis of
https://github.com/NixOS/nixpkgs/pull/111487 for my distro (NixOS). It
looked like people were generally on board in D28234, but I make note of
this here in case extra motivation is useful.
---
As pointed out in the original issue, a central tension is that LLVM
already has some partial support for these sorts of things. For example
`LLVM_LIBDIR_SUFFIX`, or `COMPILER_RT_INSTALL_PATH`. Because it's not
quite clear yet what to do about those, we are holding off on changing
libdirs and `compiler-rt`. for this initial PR.
---
On the advice of @lebedev.ri, I am splitting this up a bit per
subproject, starting with LLVM. To allow it to be more easily reviewed. This and the subsequent patch must be landed together, as this will not build alone. But the rest can be landed on their own.
Reviewed By: compnerd
Differential Revision: https://reviews.llvm.org/D100810
Header "llvm/Transforms/Scalar/SimpleLoopUnswitch.h" is currently
included twice. This commit removes the duplicate 'include' line.
Previous commit 693eedb138
seems to have mistakenly added the duplicate 'include'.
Reviewed By: fhahn
Differential Revision: https://reviews.llvm.org/D112979
The tests diffs are logically equivalent, and so this is
generally NFC, but this makes the code match the code
comment.
It should also be more efficient. If we choose the 'not'
operand (rather than the 'not' instruction) as the select
condition, then we don't have to invert the select
condition/operands as a subsequent transform.
This patch implements __builtin_reduce_max and __builtin_reduce_min as
specified in D111529.
The order of operations does not matter for min or max reductions and
they can be directly lowered to the corresponding
llvm.vector.reduce.{fmin,fmax,umin,umax,smin,smax} intrinsic calls.
Reviewed By: aaron.ballman
Differential Revision: https://reviews.llvm.org/D112001
We'd like to take a progressive approach towards Fconvolution op
CodeGen, by 1) tiling it to fit compute hierarchy first, and then
2) tiling along window dimensions with size 1 to reduce the problem
to be matmul-like. After that, we can 3) downscale high-D convolution
ops to low-D by removing the size-1 window dimensions. The final
step would be 4) vectorizing the low-D convolution op directly.
We have patterns for 1), 2), and 4). This commit adds a pattern for
3) for `linalg.conv_2d_nhwc_hwcf` ops as a starter. Supporting other
high-D convolution ops should be similar and mechanical.
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D112928
The scalarizer pass seems to be inserting instructions in-between PHI nodes or debug intrinsics that end up staying at the end of the pass, resulting in malformed IR and violating assumptions.
This patch adds a check to make sure the `extractelement` instructions that it adds are correctly placed after all PHI nodes and debug intrinsics.
Patch by vettoreldaniele.
Reviewed By: bjope
Differential Revision: https://reviews.llvm.org/D112472
Previously, if accidentally multiple checkers `eval::Call`-ed the same
`CallEvent`, in debug builds the analyzer detected this and crashed
with the message stating this. Unfortunately, the message did not state
the offending checkers violating this invariant.
This revision addresses this by printing a more descriptive message
before aborting.
Reviewed By: martong
Differential Revision: https://reviews.llvm.org/D112889
Back in the mists of time, the CXXRecordDecl for the injected-class-name was
a redecl of the outer class itself.
This got changed in 470c454a61, but only for plain
classes: class template instantation was still detecting the injected-class-name
in the template body and marking its instantiation as a redecl.
This causes some subtle inconsistent behavior between the two, e.g.
hasDefinition() returns true for Foo<int>::Foo but false for Bar::Bar.
This is the root cause of PR51912.
Differential Revision: https://reviews.llvm.org/D112765
Symbol tables are a largely useful top-level IR construct, for example, they
make it easy to access functions in a module by name instead of traversing the
list of module's operations to find the corresponding function.
Depends On D112886
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D112821
Inserting a symbol into a SymbolTable may lead to the name of the symbol being
changed in order to ensure uniqueness of symbol names in the table. Return this
new name to spare the caller the need to extract it from the symbol operation.
Depends On D112700
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D112886
Fold (srl (mul (zext i32:$a to i64), i64:c), 32) -> (mulhu $a, $b),
if c can truncate to i32 without loss.
Reviewed By: frasercrmck, craig.topper, RKSimon
Differential Revision: https://reviews.llvm.org/D108129
If the type of a funnel shift needs to be expanded, expand it to two funnel shifts instead of regular shifts. For constant shifts, this doesn't make much difference, but for variable shifts it allows a more optimal lowering.
Also use the optimized funnel shift lowering for rotates.
Alive2: https://alive2.llvm.org/ce/z/TvHDB- / https://alive2.llvm.org/ce/z/yzPept
(Branched from D108058 as getting this completed should help unlock some other WIP patches).
Original Patch: @efriedma (Eli Friedman)
Differential Revision: https://reviews.llvm.org/D112443
Rely on the hasOperation() instead - as commented on D77804, the mid-term intention is to recognise rotate/funnel-by-constant pre-legalization to help avoid SimplifyDemandedBits regressions.
We may want to restrict the detected platforms to only `x86_64` and `aarch64`.
There are still custom detection in api.td but I don't think we can handle these:
- config/linux/api.td:205
- config/linux/api.td:199
Differential Revision: https://reviews.llvm.org/D112818
The CERT rule ERR33-C can be modeled partially by the existing check
'bugprone-unused-return-value'. The existing check is reused with
a fixed set of checked functions.
Reviewed By: aaron.ballman
Differential Revision: https://reviews.llvm.org/D112409