83daa49758 made loop-rotate more conservative in the presence of
function calls in the prepare-for-lto stage. The code did not properly
account for calls that are no actual function calls, like calls to
intrinsics. This patch updates the code to ensure only calls that are
lowered to actual calls are considered inline candidates.
Add Semantic checks for OpenMP 4.5 - 2.7.4 Workshare Construct.
- The structured block in a workshare construct may consist of only
scalar or array assignments, forall or where statements,
forall, where, atomic, critical or parallel constructs.
- All array assignments, scalar assignments, and masked array
assignments must be intrinsic assignments.
- The construct must not contain any user defined function calls unless
the function is ELEMENTAL.
Test cases : omp-workshare03.f90, omp-workshare04.f90, omp-workshare05.f90
Resolve test cases (omp-workshare01.f90 and omp-workshare02.f90) marked as XFAIL
Reviewed By: kiranchandramohan
Differential Revision: https://reviews.llvm.org/D93091
This CPU supports all v8.5a features except BTI, and so identifies as v8.5a to
Clang. A bit weird, but the best way for things like xnu to detect the new
features it cares about.
In prehistorical times, AffineApplyOp was allowed to produce multiple values.
This allowed the creation of intricate SSA use-def chains.
AffineApplyNormalizer was originally introduced as a means of reusing the AffineMap::compose method to write SSA use-def chains.
Unfortunately, symbols that were produced by an AffineApplyOp needed to be promoted to dims and reordered for the mathematical composition to be valid.
Since then, single result AffineApplyOp became the law of the land but the original assumptions were not revisited.
This revision revisits these assumptions and retires AffineApplyNormalizer.
Differential Revision: https://reviews.llvm.org/D94920
Such files (Thin-%%%%%%.tmp.o) are supposed to be deleted immediately
after they're used (either by renaming or deletion). However, we've seen
instances on Windows where this doesn't happen, probably due to the
filesystem being flaky. This is effectively a resource leak which has
prevented us from using the ThinLTO cache on Windows.
Since those temporary files are in the thinlto cache directory which we
prune periodically anyway, allowing them to be pruned too seems like a
tidy way to solve the problem.
Differential revision: https://reviews.llvm.org/D94962
This patch updates the llvm module map to reflect changes made in
`24672ddea3c97fd1eca3e905b23c0116d7759ab8` and fixes the module builds
(`-DLLVM_ENABLE_MODULES=On`).
Signed-off-by: Med Ismail Bennani <medismail.bennani@gmail.com>
Add the following standard predefinitions that f18 supports:
* `__flang__`,
* `__flang_major__`,
* `__flang_minor__`,
* `__flang_patchlevel__`
Summary of changes:
- Populate Fortran::parser::Options#predefinitions with the default
supported predefinitions
Differential Revision: https://reviews.llvm.org/D94516
Tweak dexter-tests/memvars/inline-escaping-function.c added in D94761
(b7e516202e) by adding a 'param' use after the merge point. The test XFAILS
with and without this change, but without it the test looks very similar to
memvars/unused-merged-value.c. The test now demonstrates the problem more
clearly.
Currently the new flang driver always runs in free form mode. This patch
adds support for fixed form mode detection based on the file extensions.
Like `f18`, `flang-new` will treat files ending with ".f", ".F" and
".ff" as fixed form. Additionally, ".for", ".FOR", ".fpp" and ".FPP"
file extensions are recognised as fixed form files. This is consistent
with gfortran [1]. In summary, files with the following extensions are
treated as fixed-form:
* ".f", ".F", ".ff", ".for", ".FOR", ".fpp", ".FPP"
For consistency with flang/test/lit.cfg.py and f18, this patch also adds
support for the following file extensions:
* ".ff", ".FOR", ".for", ".ff90", ".fpp", ".FPP"
This is added in flang/lib/Frontend/FrontendOptions.cpp. Additionally,
the following extensions are included:
* ".f03", ".F03", ".f08", ".F08"
This is for compatibility with gfortran [1] and other popular Fortran
compilers [2].
NOTE: internally Flang will only differentiate between fixed and free
form files. Currently Flang does not support switching between language
standards, so in this regard file extensions are irrelevant. More
specifically, both `file.f03` and `file.f18` are represented with
`Language::Fortran` (as opposed to e.g. `Language::Fortran03`).
Summary of changes:
- Set Fortran::parser::Options::sFixedForm according to the file type
- Add isFixedFormSuffix and isFreeFormSuffix helper functions to
FrontendTool/Utils.h
- Change FrontendOptions::GetInputKindForExtension to support the missing
file extensions that f18 supports and some additional ones
- FrontendActionTest.cpp is updated to make sure that the test input is
treated as free-form
[1] https://gcc.gnu.org/onlinedocs/gfortran/GNU-Fortran-and-GCC.html
[2] https://github.com/llvm/llvm-project/blob/master/flang/docs/OptionComparison.md#notes
Differential Revision: https://reviews.llvm.org/D94228
This was already done in SemaTemplateInstantiateDecl.cpp, but not in
SemaTemplateInstantiate.cpp.
Anecdotally I've seen some clangd crashes where coredumps point to this
being a problem, but I cannot reproduce this so far.
Differential Revision: https://reviews.llvm.org/D94933
On z/OS, the following error message is not matched correctly in lit tests. This patch updates the CHECK expression to match the end period successfully.
```
EDC5129I No such file or directory.
```
Differential Revision: https://reviews.llvm.org/D94239
This patch computes the cost for vector.reduce<operand> for scalable vectors.
The cost is split into two parts: the legalization cost and the horizontal
reduction.
Differential Revision: https://reviews.llvm.org/D93639
These dexter tests illustrate PR48719, the summary of which is:
Sometimes we insert dbg.values for merged values (PHIs) when promoting
variables, sometimes we don't. Sometimes there is no PHI because the merged
value is never used. It doesn't matter because LiveDebugValues understands these
merged values (implicit or otherwise) and correctly updates the debug
info. Importantly, these merged variable values (which may or may not exist as
PHIs, and may or not be represented with dbg.values) are //always// implicitly
defined by the combination of incoming edges and the incoming variable locations
along those edges by virtue of LiveDebugValues existing. Unfortunately, it is
possible to mess with the CFG and remove / move these edges before
LiveDebugValues runs. In this case our debug info model only works when the
merged value is tracked by a dbg.value. Currently, this is only done rigorously
for variables which are A) promoted in the first round of mem2reg and B) are
used after the merge point.
As an example, compile the following source with -O3 -g and step through with a
debugger. You will see parama=5 throughout the function fun which is incorrect -
we expect to see param=20 after the conditional assignment.
__attribute__((optnone))
void esc(int* p) {}
__attribute__((optnone))
void fluff() {}
__attribute__((noinline))
int fun(int parama, int paramb) {
if (parama)
parama = paramb;
fluff(); // DexLabel('s0')
esc(¶ma);
return 0;
}
int main() {
return fun(5, 20);
}
1. parama is escaped by esc(¶ma) so it is not promoted by
SROA/mem2reg (failing condition "A" above).
2. InstCombine's LowerDbgDeclare converts the dbg.declare to a set of
dbg.values (tracking the stored SSA values).
3. InstCombine replaces the two stores to parama's alloca (the initial
parameter register store in entry and the assignment in if.then) with a
PHI+store in the common sucessor.
4. SimplifyCFG folds the blocks together and converts the PHI to a
select.
The debug info is not updated to account for the merged value in the successor
prior to SimplifyCFG when it exists as a PHI, or during when it becomes a
select.
As with D89543, which added some dexter tests for escaped locals, the idea is
to build a set of source-level tests which highlights existing issues and
might be useful in evaluating a new debug info model.
Reviewed By: rnk
Differential Revision: https://reviews.llvm.org/D94761
Add support for option -I in the new Flang driver. This will allow for
included headers and module files in other directories, as the default
search path is currently the working folder. The behaviour of this is
consistent with the current f18 driver, where the current folder (i.e.
".") has the highest priority followed by the order of '-I's taking
priority from first to last.
Summary of changes:
- Add SearchDirectoriesFromDashI to PreprocessorOptions, to be forwarded
into the parser's searchDirectories
- Add header files and non-functional module files to be used in
regression tests. The module files are just text files and are used to
demonstrated that paths specified with `-I` are taken into account when
searching for .mod files.
Differential Revision: https://reviews.llvm.org/D93453
If a srl doesn't introduce any sign bits into the truncated result, then replace with a sra to let us use a PACKSS truncation - fixes a regression noticed in D56387 on pre-SSE41 targets that don't have PACKUSDW.
This caused a miscompile in Chromium, see comments on the codereview for
discussion and pointer to a reproducer.
> InstCombine already performs a fold where X == Y ? f(X) : Z is
> transformed to X == Y ? f(Y) : Z if f(Y) simplifies. However,
> if f(X) only has one use, then we can always directly replace the
> use inside the instruction. To actually be profitable, limit it to
> the case where Y is a non-expr constant.
>
> This could be further extended to replace uses further up a one-use
> instruction chain, but for now this only looks one level up.
>
> Among other things, this also subsumes D94860.
>
> Differential Revision: https://reviews.llvm.org/D94862
This also reverts the follow-up
a003f26539cf4db744655e76c41f4c4a8913f116:
> [llvm] Prevent infinite loop in InstCombine of select statements
>
> This fixes an issue where the RHS and LHS the comparison operation
> creating the predicate were swapped back and forth forever.
>
> Differential Revision: https://reviews.llvm.org/D94934
D84108 exposed a bad interaction between inlining and loop-rotation
during regular LTO, which is causing notable regressions in at least
CINT2006/473.astar.
The problem boils down to: we now rotate a loop just before the vectorizer
which requires duplicating a function call in the preheader when compiling
the individual files ('prepare for LTO'). But this then prevents further
inlining of the function during LTO.
This patch tries to resolve this issue by making LoopRotate more
conservative with respect to rotating loops that have inline-able calls
during the 'prepare for LTO' stage.
I think this change intuitively improves the current situation in
general. Loop-rotate tries hard to avoid creating headers that are 'too
big'. At the moment, it assumes all inlining already happened and the
cost of duplicating a call is equal to just doing the call. But with LTO,
inlining also happens during full LTO and it is possible that a previously
duplicated call is actually a huge function which gets inlined
during LTO.
From the perspective of LV, not much should change overall. Most loops
calling user-provided functions won't get vectorized to start with
(unless we can infer that the function does not touch memory, has no
other side effects). If we do not inline the 'inline-able' call during
the LTO stage, we merely delayed loop-rotation & vectorization. If we
inline during LTO, chances should be very high that the inlined code is
itself vectorizable or the user call was not vectorizable to start with.
There could of course be scenarios where we inline a sufficiently large
function with code not profitable to vectorize, which would have be
vectorized earlier (by scalarzing the call). But even in that case,
there probably is no big performance impact, because it should be mostly
down to the cost-model to reject vectorization in that case. And then
the version with scalarized calls should also not be beneficial. In a way,
LV should have strictly more information after inlining and make more
accurate decisions (barring cost-model issues).
There is of course plenty of room for things to go wrong unexpectedly,
so we need to keep a close look at actual performance and address any
follow-up issues.
I took a look at the impact on statistics for
MultiSource/SPEC2000/SPEC2006. There are a few benchmarks with fewer
loops rotated, but no change to the number of loops vectorized.
Reviewed By: sanwou01
Differential Revision: https://reviews.llvm.org/D94232
This patch adds a new test case which depends on AArch64 SVE support and
dynamic resize capability enabled. It created two seperate threads which
have different values of sve registers and SVE vector granule at various
points during execution.
We test that LLDB is doing the size and offset updates properly for all
of the threads including the main thread and when we VG is updated using
prctl call or by 'register write vg' command the appropriate changes are
also update in register infos.
Reviewed By: labath
Differential Revision: https://reviews.llvm.org/D82866
This patch builds on previously submitted SVE patches regarding expedited
register set and per thread register infos. (D82853 D82855 and D82857)
We need to resize SVE register based on value received in expedited list.
Also we need to resize SVE registers when we write vg register using
register write vg command. The resize will result in a updated offset
for all of fpr and sve register set. This offset will be configured
in native register context by RegisterInfoInterface and will also be
be updated on client side in GDBRemoteRegisterContext.
A follow up patch will provide a API test to verify this change.
Reviewed By: labath
Differential Revision: https://reviews.llvm.org/D82863
The test couldn't find lldb-server as it's path was being overridden by
LLDB_DEBUGSERVER_PATH environment variable (pointing to debugserver).
This test should always use lldb-server, as it tests its platform
capabilities.
There's no need for the environment override, as lldb-server tests
should test the executable they just built, so I just remote the
override capability.
This patch handles cases where we have to save/restore the link register
into the stack and and load/store instruction which use the stack are
part of the outlined region. It checks that there will be no overflow
introduced by the new offset and fixup these instructions accordingly.
Differential Revision: https://reviews.llvm.org/D92934
When a command option does not have a short version
(e.g. -f for --file), we use an arbitrary value in the
short_option field to mark it as invalid.
(though this value is unqiue to be used later for other
things)
We check that this short option is valid to print using
llvm::isPrint. This implicitly casts our int to char,
meaning we check the last char of any short_option value.
Since the arbitrary value we chose for these options is
some shortened hex version of the name, this returned true
even for invalid values.
Since llvm::isPrint returns true we later call std::islower
and/or std::isupper on the short_option value. (the int)
Calling these functions with something that cannot be validly
converted to unsigned char is undefined. Somehow we got/get
away with this but for me compiling with g++-9 I got a crash
for "help memory read".
The other command that uses this is "target variable" but that
didn't crash for unknown reasons.
Checking that short_option can fit into an unsigned char before
we call llvm::isPrint means we will not attempt to call islower/upper
on these options since we have no reason to print them.
This also fixes bogus short options being shown for "memory read"
and target variable.
For "target variable", before:
-e <filename> ( --file <filename> )
-b <filename> ( --shlib <filename> )
After:
--file <filename>
--shlib <filename>
(note that the bogus short options are just the bottom byte of our
arbitrary short_option value)
Reviewed By: labath
Differential Revision: https://reviews.llvm.org/D94917
This fixes an issue where the RHS and LHS the comparison operation
creating the predicate were swapped back and forth forever.
Differential Revision: https://reviews.llvm.org/D94934
In addition to consistency, we'll hit a wall when 11.1.0 gets released, because
we cannot represent it with lit versioning scheme.
Differential Revision: https://reviews.llvm.org/D94157
Previously uniqueCallSite could have race conditions between different
threads. Now it is accessed with an atomic RMW and will be unique
between different threads.
Differential Revision: https://reviews.llvm.org/D94784
A previous patch has already changed getInstructionCost to return
an InstructionCost type. This patch changes the other various
getXXXCost functions to return an InstructionCost too. This is a
non-functional change - I've added a few asserts that the costs
are valid in places where we're selecting between vector call
and intrinsic costs. However, since we don't yet return invalid
costs from any of the TTI implementations these asserts should
not fire.
See this patch for the introduction of the type: https://reviews.llvm.org/D91174
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2020-November/146408.html
Differential Revision: https://reviews.llvm.org/D94065
This patch moves the parsing of `{Lang,CodeGen}Options` from `parseSimpleArgs` to the original `Parse{Lang,CodeGen}Args` functions.
This ensures all marshalled `LangOptions` are being parsed **after** the call `setLangDefaults`, which in turn enables us to marshall `LangOptions` that somehow depend on the defaults. (In a future patch.)
Now, `CodeGenOptions` need to be parsed **after** `LangOptions`, because `-cl-mad-enable` (a `CodeGenOpt`) depends on the value of `-cl-fast-relaxed-math` and `-cl-unsafe-math-optimizations` (`LangOpts`).
Unfortunately, this removes the nice property that marshalled options get parsed in the exact order they appear in the `.td` file. Now we cannot be sure that a TableGen record referenced in `ImpliedByAnyOf` has already been parsed. This might cause an ordering issues (i.e. reading value of uninitialized variable). I plan to mitigate this by moving each `XxxOpt` group from `parseSimpleArgs` back to their original parsing function. With this setup, if an option from group `A` references option from group `B` in TableGen, the compiler will require us to make the `CompilerInvocation` member for `B` visible in the parsing function for `A`. That's where we notice that `B` didn't get parsed yet.
Reviewed By: Bigcheese
Differential Revision: https://reviews.llvm.org/D94682
Element sections will also need flags, so we shouldn't squat the
WASM_SEGMENT namespace.
Depends on D90948.
Differential Revision: https://reviews.llvm.org/D92315
This patch changes to make call_indirect explicitly refer to the
corresponding function table, residualizing TABLE_NUMBER relocs against
it.
With this change, wasm-ld now sees all references to tables, and can
link multiple tables.
Differential Revision: https://reviews.llvm.org/D90948
`ssize_t` is from POSIX and is not standard unfortunately.
Rewritting the code so it doesn't depend on it.
Differential Revision: https://reviews.llvm.org/D94760