The ComplexPattern is looking for an immediate in a certain range
that has a single use. This can be handled with a PatLeaf since
we aren't matching multiple patterns or checking any complicated
relationships between nodes.
This shrinks the isel table a little bit since tablegen no longer
has to generate patterns with commuted operands. With the PatLeaf,
tablegen can see we're matching an immediate which should always
be on the right hand side of add.
Reviewed By: benshi001
Differential Revision: https://reviews.llvm.org/D102510
This adds a simple fold to CodeGenPrepare that converts the comparison
feeding a branch into a comparison with zero where possible. For example:
%c = icmp ult %x, 8
br %c, bla, blb
%tc = lshr %x, 3
becomes
%tc = lshr %x, 3
%c = icmp eq %tc, 0
br %c, bla, blb
As a first order approximation, this can reduce the number of
instructions needed to perform the branch as the shift is (often) needed
anyway. At the moment this does not affect very much, as LLVM tends to
prefer the opposite form. But it can protect against regressions from
commits like rG9423f78240a2.
Simple cases of Add and Sub are added along with Shift, as the
comparison to zero can often be folded with cpsr flags.
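The equivalence behind the shift case can be checked with a small Python
sketch. It assumes non-negative (unsigned) values and a power-of-two bound.
The helper names are invented for illustration and are not part of the patch.

```python
def ult_pow2(x: int, k: int) -> bool:
    """Original form: icmp ult %x, 2**k (unsigned compare)."""
    return x < (1 << k)


def shifted_is_zero(x: int, k: int) -> bool:
    """Rewritten form: icmp eq (lshr %x, k), 0."""
    return (x >> k) == 0


def forms_agree(k: int, samples) -> bool:
    """True if both forms give the same answer for every sample value."""
    return all(ult_pow2(x, k) == shifted_is_zero(x, k) for x in samples)
```

For k = 3 this is the `icmp ult %x, 8` example above: both forms hold for
0..7 and fail from 8 onwards.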
Differential Revision: https://reviews.llvm.org/D101778
Running this script gives
```
"llvm-project/llvm/./utils/wciia.py", line 56
if word == "N:":
TabError: inconsistent use of tabs and spaces in indentation
```
Under emacs' whitespace-mode, it shows
```
for·line·in·code_owners_file:$
····for·word·in·line.split():$
» if·word·==·"N:":$
» » name·=·line[2:].strip()$
» » if·code_owner:$
» » » process_code_owner(code_owner)$
» » » code_owner·=·{}$
```
I formatted the script directly with `yapf`, and it now runs correctly.
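The failure can be reproduced without the script itself. The sketch below
is an illustration only: the snippet strings and the helper name are made up,
and they merely mimic the indentation pattern shown above (four spaces on one
level, a tab on the next).

```python
def has_tab_error(source: str) -> bool:
    """Return True if compiling `source` raises TabError."""
    try:
        compile(source, "<snippet>", "exec")
    except TabError:
        return True
    return False


# Same shape as the whitespace-mode dump: spaces, then tabs one level deeper.
MIXED = (
    "for line in lines:\n"
    "    for word in line.split():\n"
    "\tif word == 'N:':\n"
    "\t\tpass\n"
)

# The same code with every tab expanded to spaces compiles cleanly.
SPACES_ONLY = MIXED.replace("\t\t", " " * 12).replace("\t", " " * 8)
```

`compile()` only parses the source, so the undefined name `lines` does not
matter here.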
This code was re-implementing the same-BB case of
isPotentiallyReachable(). Historically, this was done because
CaptureTracking used additional caching for local dominance
queries. Now that it is no longer needed, the code is effectively
the same as isPotentiallyReachable().
The only differences are extra checks for invoke/phis. These are
misleading checks related to dominance in the value availability
sense that are not relevant for control reachability. The invoke
check was correct but redundant in that invokes are always
terminators, so `I` could never come before the invoke. The phi
check is a matter of interpretation (should an earlier phi node be
considered reachable from a later phi node in the same block?)
but ultimately doesn't matter because phis don't capture anyway.
Reapply after adjusting the synchronized.m test case, where the
TODO is now resolved. The pointer is only captured on the exception
handling path.
-----
For the CapturesBefore tracker, it is sufficient to check that
I can not reach BeforeHere. This does not necessarily require
that BeforeHere dominates I, it can also occur if the capture
happens on an entirely disjoint path.
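A toy model of the disjoint-path case, as a sketch only: the CFG is a plain
dictionary of block names, the search is block-level (it ignores instruction
order inside a block), and none of the names correspond to LLVM APIs.

```python
from collections import deque


def can_reach(cfg, src, dst):
    """Return True if `dst` is reachable from `src` by following >= 1 edge."""
    seen = set()
    work = deque(cfg.get(src, ()))
    while work:
        block = work.popleft()
        if block == dst:
            return True
        if block in seen:
            continue
        seen.add(block)
        work.extend(cfg.get(block, ()))
    return False


# Diamond CFG: the capture sits in "left", BeforeHere sits in "right".
DIAMOND = {
    "entry": ["left", "right"],
    "left": ["exit"],
    "right": ["exit"],
    "exit": [],
}
```

Here "right" does not dominate "left" (the edge entry -> left bypasses it),
yet `can_reach(DIAMOND, "left", "right")` is false, so a capture in "left"
cannot have happened before an instruction in "right".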
This change was previously accepted in D90688, but had to be
reverted due to large compile-time impact in some cases: It
increases the number of reachability queries that are performed.
After recent changes, the compile-time impact is largely mitigated,
so I'm reapplying this patch. The remaining compile-time impact
is largely proportional to changes in code-size.
The intention is to be able to run this from additional locations (such as shuffle combining) in the future.
Reapplies rGb95a103808ac (after reversion at rGc012a388a15b), with SSE3/SSSE3 typo fix, test added at rG0afb10de1449.
This reverts commit 6b8b43e7af.
This causes clang test to fail (CodeGenObjC/synchronized.m).
Revert until I can figure out whether that's an expected change.
For the CapturesBefore tracker, it is sufficient to check that
I can not reach BeforeHere. This does not necessarily require
that BeforeHere dominates I, it can also occur if the capture
happens on an entirely disjoint path.
This change was previously accepted in D90688, but had to be
reverted due to large compile-time impact in some cases: It
increases the number of reachability queries that are performed.
After recent changes, the compile-time impact is largely mitigated,
so I'm reapplying this patch. The remaining compile-time impact
is largely proportional to changes in code-size.
This is based on the test from D90688, without the argmemonly
attribute. The argmemonly attribute would guarantee no modref
by itself and the question of captures would not arise in the
first place.
Provide an option to specify optimization level when creating an
ExecutionEngine via the MLIR JIT Python binding. Not only is the
specified optimization level used for code generation, but all LLVM
optimization passes at the optimization level are also run prior to
machine code generation (akin to the mlir-cpu-runner tool).
The default optimization level remains 2 (-O2).
Contributions in part from Prashant Kumar <prashantk@polymagelabs.com>
as well.
Differential Revision: https://reviews.llvm.org/D102551
On AIX, we have to ship `libatomic.a` for compatibility. First, a new `clang_rt.atomic` is added. Second, using the added CMake modules for AIX, we are able to build a compatible `libatomic.a` for AIX. The second step can't be implemented perfectly with CMake at the moment, since AIX's archive approach is fairly unique: shared libraries are archived into a static archive file.
Reviewed By: jsji
Differential Revision: https://reviews.llvm.org/D102155
The default AsmPrinter prints the GV in comments;
AIX should do so too.
This also fixes LLVM :: CodeGen/Generic/inline-asm-mem-clobber.ll.
Reviewed By: hubert.reinterpretcast
Differential Revision: https://reviews.llvm.org/D102534
This patch replaces the `powerpc64` token with the `system-aix` one in
the UNSUPPORTED line of a test. The `powerpc64` token was originally
added temporarily in 71a0609a2b.
If AIX uses integrated-as by default and it works both for 32-bit and
64-bit objects, then the issues encountered so far (see comments in
D96033) would be mostly solved.
As it is, marking the test as expected-to-fail (as opposed to
unsupported) on AIX might cause more trouble in the form of 32-bit
versus 64-bit differences. I am not aware of other situations where LIT
tests are dependent on whether the LLVM build is 64-bit or 32-bit.
Reviewed By: jsji
Differential Revision: https://reviews.llvm.org/D102560
This patch makes it possible to do call site specific deductions
for AAValueSimplification and AAIsDead.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D84722
Reachability queries are very expensive, and currently performed
for each instruction we look at, even though most of them will
not lead to a capture and are thus ultimately irrelevant. It is
more efficient to walk a few unnecessary instructions than to
perform unnecessary reachability queries.
Theoretically, this may produce worse results, because the additional
instructions considered may cause us to hit the use count limit
earlier. In practice, this does not appear to be a problem, e.g.
on test-suite O3 we report only one more captured-before with this
change, with no resulting codegen differences.
This makes PointerMayBeCapturedBefore() significantly cheaper in
practice, hopefully allowing it to be used in more places.
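The reordering can be illustrated with a toy model. This is a sketch under
strong assumptions: uses are a flat list of `(name, may_capture)` pairs rather
than a use graph, and the reachability query is a caller-supplied function.
Neither helper mirrors the actual CaptureTracking code.

```python
def query_every_use(uses, can_reach):
    """Old order: ask the expensive reachability question for every use."""
    queries = 0
    for name, may_capture in uses:
        queries += 1
        if not can_reach(name):
            continue  # use cannot execute before the point of interest
        if may_capture:
            return True, queries
    return False, queries


def query_captures_only(uses, can_reach):
    """New order: only ask about uses that would actually capture."""
    queries = 0
    for name, may_capture in uses:
        if not may_capture:
            continue  # cheap check first, no reachability query needed
        queries += 1
        if can_reach(name):
            return True, queries
    return False, queries


USES = [("load", False), ("gep", False), ("store", True)]
```

With every use reachable, both orders report a capture for `USES`, but the
first performs three reachability queries and the second only one.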
This patch introduces source loading and pruning functions.
This will make it possible to use the DWARF embedded source and to share the same code for the JSON printout.
No functional changes.
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D102539
https://reviews.llvm.org/D101681 landed a change to check the testing
configuration which relies on using the `-print-runtime-dir` flag of
clang to determine where the runtime testing library is.
The patch treated not being able to find the path reported by clang
as an error. Unfortunately this seems to break the
`llvm-clang-win-x-aarch64` bot. Either the bot is misconfigured or
clang is reporting a bogus path.
To temporarily unbreak the bot, downgrade the fatal error to a warning.
While we're here also print information about the command used to
determine the path to aid debugging.
https://reviews.llvm.org/D101681 introduced a check to make sure the
compiler and compiler-rt were using the same library path when
`COMPILER_RT_TEST_STANDALONE_BUILD_LIBS=ON`, i.e. the developer's
intention is to test the just-built libs rather than those shipped with the
compiler used for testing.
It seems this broke some bots that are likely misconfigured.
So to unbreak them, for now let's make this a warning so the bot
owners can investigate without breaking their builds.
This patch adds support for GCC's -fstack-usage flag. With this flag, a stack
usage file (i.e., .su file) is generated for each input source file. The format
of the stack usage file is also similar to what is used by GCC. For each
function defined in the source file, a line with the following information is
produced in the .su file.
<source_file>:<line_number>:<function_name> <size_in_byte> <static/dynamic>
"Static" means that the function's frame size is static and the size info is an
accurate reflection of the frame size, while "dynamic" means that the function's
frame size can only be determined at run time because the function manipulates
the stack dynamically (e.g., due to variable-size objects). In the dynamic case
the size info only reflects the size of the fixed-size frame objects and is
therefore not a reliable measure of the total frame size.
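A line in that layout could be consumed as in the sketch below. It assumes
exactly the five fields listed above, whitespace between the last three, and a
source path without colons. The `StackUsage` type and both helpers are
hypothetical names, not part of clang or GCC.

```python
from typing import NamedTuple


class StackUsage(NamedTuple):
    source_file: str
    line: int
    function: str
    size_bytes: int
    kind: str  # "static" or "dynamic", per the description above


def parse_su_line(text: str) -> StackUsage:
    """Parse '<source_file>:<line_number>:<function_name> <size> <kind>'."""
    # Split from the right so a function name containing spaces survives.
    head, size, kind = text.rsplit(None, 2)
    # Split from the left so a function name containing '::' survives.
    source_file, line, function = head.split(":", 2)
    return StackUsage(source_file, int(line), function, int(size), kind)


def size_is_reliable(entry: StackUsage) -> bool:
    """Only a static frame size reflects the whole frame."""
    return entry.kind == "static"
```

For example, `parse_su_line("foo.c:12:main 32 static")` yields an entry whose
`size_bytes` is 32 and for which `size_is_reliable` returns true.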
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D100509
findIndirectCallFunctionSamples leaves Sum uninitialized if it returns an empty vector. We don't really use Sum in this case (although we do make a copy of it, which isn't used either), so initialize the value to zero to at least silence the static analysis warning.
These checks are not specific to the instruction based variant of
isPotentiallyReachable(), they are equally valid for the basic
block based variant. Move them there, to make sure that switching
between the instruction and basic block variants cannot introduce
regressions.