If a function doesn't contain loops and does not call non-willreturn
functions, then it is willreturn. Loops are detected by checking
for backedges in the function. We don't attempt to handle finite
loops at this point.
Differential Revision: https://reviews.llvm.org/D94633
`X86AsmParser::ParseIntelExpression` has a while loop. In the body,
calls to MCAsmLexer::UnLex can force a reallocation in the MCAsmLexer's
`CurToken` SmallVector, invalidating saved references to
`MCAsmLexer::getTok()`.
`const MCAsmToken &Tok` is such a saved reference, and this moves it
from outside the while loop to inside the body, fixing a
use-after-realloc.
`Tok` will still be reused across calls to `Lex()`, each of which
effectively destroys and constructs the pointed-to token. I'm a bit
skeptical of this usage pattern, but it seems broadly used in the
X86AsmParser (and others) so I'm leaving it alone (for now).
Somehow this bug was exposed by https://reviews.llvm.org/D94739,
resulting in test failures in dot-operator related tests in
llvm/test/tools/llvm-ml. I suspect the exposure path is related to
optimizer changes from splitting up the grow operation, but I haven't
dug all the way in. Regardless, there are already tests in tree that
cover this; they might fail consistently if we added ASan
instrumentation to SmallVector.
Differential Revision: https://reviews.llvm.org/D95112
Summary:
Prior to D91261 the information checked the OMP_MAP_TARGET_PARAM flag, change this as it has been removed. The INFO macro was changed to accept a flag as input to make conditionally printing information easier.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D95133
Defaulted destructor was treated inconsistently, compared to other
compiler-generated functions.
When Sema::IdentifyCUDATarget() got called on just-created dtor which didn't
have implicit __host__ __device__ attributes applied yet, it would treat it as a
host function. That happened to (sometimes) hide the error when dtor referred
to a host-only functions.
Even when we had identified defaulted dtor as a HD function, we still treated it
inconsistently during selection of usual deallocators, where we did not allow
referring to wrong-side functions, while it is allowed for other HD functions.
This change brings handling of defaulted dtors in line with other HD functions.
Differential Revision: https://reviews.llvm.org/D94732
The place-holding implementation of C_LOC just didn't work
when used with our more complete semantic checking, specifically
in the case of a polymorphic argument; convert it to an external
function with an implicit interface. C_ASSOCIATED needs to be
a generic interface with specific implementations for C_PTR and
C_FUNPTR.
Differential Revision: https://reviews.llvm.org/D94714
Profiling has been recently implemented in libomptarget (D93055). This patch enables time profiling support for libomptarget in libomp, to support profiling of multi-threaded execution of offloaded regions.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D94855
All Fortran options should be set in `CompilerInstance` (via its
`CompilerInvocation`) before any of `FrontendAction` is entered -
that's one of the tasks of the driver. However, this is a bit tricky
with fixed and free from detection introduced in
https://reviews.llvm.org/D94228.
Fixed-free form detection needs to happen:
* before any frontend action (we need to specify `isFixedForm` in
`Fortran::parser::Options` before running any actions)
* separately for every input file (we might be compiling multiple
Fortran files, some in free form, some in fixed form)
In other words, we need this to happen early (before any
`FrontendAction`), but not too early (we need to know what the current
input file is). In practice, `isFixedForm` can only be set later
than other options (other options are inferred from compiler flags). So
we can't really set all of them in one place, which is not ideal.
All changes in this patch are NFCs (hence no new tests). Quick summary:
* move fixed/free form detection from `FrontendAction::ExecuteAction` to
`CompilerInstance::ExecuteAction`
* add a bool flag in `FrontendInputFile` to mark a file as fixed/free
form
* updated a few comments
Differential Revision: https://reviews.llvm.org/D95042
Define OrderedOp and UnorderedOp instructions in SPIR-V and convert
cmpf operations with `ord` and `uno` tag to these instructions
respectively.
Differential Revision: https://reviews.llvm.org/D95098
The SPIR-V spec uses OpSpecConstantOp. Using an inconsistent name
makes the dialect generation scripts fail. Update to use the right
operation name, and fix the auto generation scripts as well.
Differential Revision: https://reviews.llvm.org/D95097
This pass is required to get correct codegen for image instructions with
the tfe or lwe bits set.
Differential Revision: https://reviews.llvm.org/D95132
Pretty similar to D95058, this patch added forward declaration for
CUDA atomic functions. We already have definitions with right mangled names in
internal CUDA headers so the forward declaration here can work properly.
Reviewed By: jdoerfert, JonChesterfield
Differential Revision: https://reviews.llvm.org/D95085
Make this look more like the DAG handling and move to common code.
I also noticed AArch64 seems to not be properly adding the
physreg:virtreg mapping to the function live ins.
Allow parsing generated mir with custom pseudo source value tokens.
Also rename pseudo source values to have more meaningful names.
Differential Revision: https://reviews.llvm.org/D94768
It turns out the vectorizer calls the getIntrinsicInstrCost functions
with a scalar return type and vector VF. This updates the costmodel to
handle that, still producing the correct vector costs.
A vectorizer test is added to show it vectorizing at the correct factor
again.
This patch makes sure that diagnostics from the prescanner are reported
when running `flang-new -E` (i.e. only the preprocessor phase is
requested). More specifically, the `PrintPreprocessedAction` action is
updated.
With this patch we make sure that the `f18` and `flang-new` provide
identical output when running the preprocessor and the prescanner
generates diagnostics.
Differential Revision: https://reviews.llvm.org/D94782
Summary:
The custom mapper API did not previously support the mapping names added previously. This means they were not present if a user requested debugging information while using the mapper functions. This adds basic support for passing the mapped names to the runtime library.
Reviewers: jdoerfert
Differential Revision: https://reviews.llvm.org/D94806
GCC/libstdc++ before 6.1 can't handle scoped enums as unordered_map keys. LLVM
(and some build) bots officially support some GCC 5.x versions, so this patch
just makes the enum unscoped until we can require GCC 6.x.
In https://llvm.org/PR48810 , we are crashing while trying to
propagate attributes from mempcpy (returns void*) to memcpy
(returns nothing - void).
We can avoid the crash by removing known incompatible
attributes for the void return type.
I'm not sure if this goes far enough (should we just drop all
attributes since this isn't the same function?). We also need
to audit other transforms in LibCallSimplifier to make sure
there are no other cases that have the same problem.
Differential Revision: https://reviews.llvm.org/D95088
Add DemandedElts support inside the TRUNCATE analysis.
REAPPLIED - this was reverted by @hans at rGa51226057fc3 due to an issue with vector shift amount types, which was fixed in rG935bacd3a724 and an additional test case added at rG0ca81b90d19d
Differential Revision: https://reviews.llvm.org/D56387
As noticed on D56387, for vectors we must always correctly adjust the shift amount type during truncation (not just after legalization). We were getting away with it as we currently only accepted scalars via the dyn_cast<ConstantSDNode>.
The original patch got reverted as a dependency of cf1c774d6a .
That patch got relanded so it's also necessary to reland this patch.
Original summary:
After cf1c774d6a, Clang seems to generate code
that is more similar to icc/Clang, so we can use the same line numbers for
all compilers in this test.
The DSYM variant of this test is failing since D94890. But as we explicitly
try to disable the DSYM generation in the makefile and build the archive on
our own, I don't see why we even need to run the DSYM version of the test.
This patch disables the generated derived versions of this test for the
different debug information containers (which includes the failing DSYM one).
Currently when LLDB has enough data in the debug information to import the `std` module,
it will just try to import it. However when debugging libraries where the sources aren't
available anymore, importing the module will generate a confusing diagnostic that
the module couldn't be built.
For the fallback mode (where we retry failed expressions with the loaded module), this
will cause the second expression to fail with a module built error instead of the
actual parsing issue in the user expression.
This patch adds checks that ensures that we at least have any source files in the found
include paths before we try to import the module. This prevents the module from being
loaded in the situation described above which means we don't emit the bogus 'can't
import module' diagnostic and also don't waste any time retrying the expression in the
fallback mode.
For the unit tests I did some refactoring as they now require a VFS with the files in it
and not just the paths. The Python test just builds a binary with a fake C++ module,
then deletes the module before debugging.
Fixes rdar://73264458
Reviewed By: JDevlieghere
Differential Revision: https://reviews.llvm.org/D95096
This patch applies the idea from D93734 to LoopUnswitch.
It adds support for unswitching on conditions that are only
invariant along certain paths through a loop.
In particular, it targets conditions in the loop header that
depend on values loaded from memory. If either path from
the true or false successor through the loop does not modify
memory, perform partial loop unswitching.
That is, duplicate the instructions feeding the condition in the pre-header.
Then unswitch on the duplicated condition. The condition is now known
in the unswitched version for the 'invariant' path through the original loop.
On caveat of this approach is that one of the loops created can be partially
unswitched again. To avoid this behavior, `llvm.loop.unswitch.partial.disable`
metadata is added to the unswitched loops, to avoid subsequent partial
unswitching.
If that's the approach to go, I can move the code handling the metadata kind
into separate functions.
This increases the cases we unswitch quite a bit in SPEC2006/SPEC2000 &
MultiSource. It also allows us to eliminate a dead loop in SPEC2017's omnetpp
```
Tests: 236
Same hash: 170 (filtered out)
Remaining: 66
Metric: loop-unswitch.NumBranches
Program base patch diff
test-suite...000/255.vortex/255.vortex.test 2.00 23.00 1050.0%
test-suite...T2006/401.bzip2/401.bzip2.test 7.00 55.00 685.7%
test-suite :: External/Nurbs/nurbs.test 5.00 26.00 420.0%
test-suite...s-C/unix-smail/unix-smail.test 1.00 3.00 200.0%
test-suite.../Prolangs-C++/ocean/ocean.test 1.00 3.00 200.0%
test-suite...tions/lambda-0.1.3/lambda.test 1.00 3.00 200.0%
test-suite...yApps-C++/PENNANT/PENNANT.test 2.00 5.00 150.0%
test-suite...marks/Ptrdist/yacr2/yacr2.test 1.00 2.00 100.0%
test-suite...lications/viterbi/viterbi.test 1.00 2.00 100.0%
test-suite...plications/d/make_dparser.test 12.00 24.00 100.0%
test-suite...CFP2006/433.milc/433.milc.test 14.00 27.00 92.9%
test-suite.../Applications/lemon/lemon.test 7.00 12.00 71.4%
test-suite...ce/Applications/Burg/burg.test 6.00 10.00 66.7%
test-suite...T2006/473.astar/473.astar.test 16.00 26.00 62.5%
test-suite...marks/7zip/7zip-benchmark.test 78.00 121.00 55.1%
```
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D93764
Previous code build the model that tile config register is the user of
each AMX instruction. There is a problem for the tile config register
spill. When across function, the ldtilecfg instruction may be inserted
on each AMX instruction which use tile config register. This cause all
tile data register clobber.
To fix this issue, we remove the model of tile config register. We
analyze the regmask of call instruction and insert ldtilecfg if there is
any tile data register live across the call. Inserting the sttilecfg
before the call is unneccessary, because the tile config doesn't change
and we can just reload the config.
Besides we also need check tile config register interference. Since we
don't model the config register we should check interference from the
ldtilecfg to each tile data register def.
ldtilecfg
/ \
BB1 BB2
/ \
call BB3
/ \
%1=tileload %2=tilezero
We can start from the instruction of each tile def, and backward to
ldtilecfg. If there is any call instruction, and tile data register is
not preserved, we should insert ldtilecfg after the call instruction.
Differential Revision: https://reviews.llvm.org/D94155
This makes the following improvements.
For `SHT_GNU_versym`:
* yaml2obj: set `sh_link` to index of `.dynsym` section automatically.
For `SHT_GNU_verdef`:
* yaml2obj: set `sh_link` to index of `.dynstr` section automatically.
* yaml2obj: set `sh_info` field automatically.
* obj2yaml: don't dump the `Info` field when its value matches the number of version definitions.
For `SHT_GNU_verneed`:
* yaml2obj: set `sh_link` to index of `.dynstr` section automatically.
* yaml2obj: set `sh_info` field automatically.
* obj2yaml: don't dump the `Info` field when its value matches the number of version dependencies.
Also, simplifies few test cases.
Differential revision: https://reviews.llvm.org/D94956
In case of indirect calls or address taken functions,
skip propagating any attributes to them. We just
propagate features to such functions.
Reviewed By: rampitec
Differential Revision: https://reviews.llvm.org/D94585