Implements parts of:
- P0645 Text Formatting
Depends on D92214
Reland with changes:
The format header will only be compiled if the compiler used has support
for concepts. This should fix the issues with the initial version.
Differential Revision: https://reviews.llvm.org/D93166
This patch implemented TTI.IntImmCost() properly.
Each BPF insn has 32bit immediate space, so for any immediate
which can be represented as 32bit signed int, the cost
is technically free. If an int cannot be presented as
a 32bit signed int, a ld_imm64 instruction is needed
and a TCC_Basic is returned.
This change is motivated when we observed that
several bpf selftests failed with latest llvm trunk, e.g.,
#10/16 strobemeta.o:FAIL
#10/17 strobemeta_nounroll1.o:FAIL
#10/18 strobemeta_nounroll2.o:FAIL
#10/19 strobemeta_subprogs.o:FAIL
#96 snprintf_btf:FAIL
The reason of the failure is due to that
SpeculateAroundPHIsPass did aggressive transformation
which alters control flow for which currently verifer
cannot handle well. In llvm12, SpeculateAroundPHIsPass
is not called.
SpeculateAroundPHIsPass relied on TTI.getIntImmCost()
and TTI.getIntImmCostInst() for profitability
analysis. This patch implemented TTI.getIntImmCost()
properly for BPF backend which also prevented
transformation which caused the above test failures.
Differential Revision: https://reviews.llvm.org/D96448
In addition to wall time etc. this should allow us to get less noisy
values for time measurements.
Reviewed By: JDevlieghere
Differential Revision: https://reviews.llvm.org/D96049
Right now when running `expr --top-level -- void foo() {}`, LLDB just prints a cryptic
`error: Couldn't find $__lldb_expr() in the module` error. The reason for that is
that if we don't have a running process, we try to set our execution policy to always use the
IR interpreter (ExecutionPolicyNever) which works even without a process. However
that code didn't consider the special ExecutionPolicyTopLevel which we use for
top-level expressions. By changing the execution policy to ExecutionPolicyNever,
LLDB thinks we're actually trying to interpret a normal expression inside our
`$__lldb_expr` function and then fails when looking for it.
This just adds an exception for top-level expressions to that code and a bunch of tests.
Reviewed By: shafik
Differential Revision: https://reviews.llvm.org/D91723
Clang emits a warning when accessing an Objective-C getter but not using the result.
This gets triggered when just trying to print a getter value in the expression parser (where
Clang just sees a normal expression like `obj.getter` while parsing).
This patch just disables the warning in the expression parser (similar to what we do with
the C++ equivalent of just accessing a member variable but not doing anything with it).
Reviewed By: kastiglione
Differential Revision: https://reviews.llvm.org/D94307
This adds the CostKind to getMVEVectorCostFactor, so that it can
automatically account for CodeSize costs, where it returns a cost of 1
not the MVEFactor used for Throughput/Latency. This helps simplify the
caller code and allows us to get the codesize cost more correct in more
cases.
With https://reviews.llvm.org/D63376, we began storing the APValue
directly into the ConstantExpr object so that we could reuse the
calculated value later. However, it missed a case when not in C++11
mode but the expression is known to be constant.
This used to be a LLDB_LOGF call that used the printf %s syntax.
0ab109d43d changed it to LLDB_LOG but didn't
update this format string to use formatv's syntax so this just printed '%s'.
As for SETCC, use a less expensive condition code when generating
STRICT_FSETCC if the node is known not to have Nan.
Reviewed By: SjoerdMeijer
Differential Revision: https://reviews.llvm.org/D91972
This commit regroups commonalities among InputGlobal, InputEvent, and
InputTable into the new InputElement. The subclasses are defined
inline in the new InputElement.h. NFC.
Reviewed By: sbc100
Differential Revision: https://reviews.llvm.org/D94677
This commit moves a line in SelectionDAGBuilder::handleDebugValue to
avoid implicitly casting a TypeSize object to an unsigned earlier than
necessary. It was possible that we bail out of the loop before the value
is ever used, which means we could create a superfluous TypeSize
warning.
Reviewed By: DavidTruby
Differential Revision: https://reviews.llvm.org/D96423
ModuleTranslation contains multiple fields that keep track of the mappings
between various MLIR and LLVM IR components. The original ModuleTranslation
extension model was based on inheritance, with these fields being protected and
thus accessible in the ModuleTranslation and derived classes. The
inheritance-based model doesn't scale to translation of more than one derived
dialect and will be progressively replaced with a more flexible one based on
dialect interfaces and a translation state that is separate from
ModuleTranslation. This change prepares the replacement by making the mappings
private and providing public methods to access them.
Depends On D96436
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D96437
Historically, JitRunner has been registering all available dialects with the
context and depending on them without the real need. Make it take a registry
that contains only the dialects that are expected in the input and stop linking
in all dialects.
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D96436
Before this commit, expression statements could not be annotated
with statement attributes. Whenever parser found attribute, it
unconditionally assumed that it was followed by a declaration.
This not only doesn't allow expression attributes to have attributes,
but also produces spurious error diagnostics.
In order to maintain all previously compiled code, we still assume
that GNU attributes are followed by declarations unless ALL of those
are statement attributes. And even in this case we are not forcing
the parser to think that it should parse a statement, but rather
let it proceed as if no attributes were found.
Differential Revision: https://reviews.llvm.org/D93630
Our test configuration logic assumes that the tests can be run either
with debugserver or with lldb-server. This is not entirely correct,
since lldb server has two "personalities" (platform server and debug
server) and debugserver is only a replacement for the latter.
A consequence of this is that it's not possible to test the platform
behavior of lldb-server on macos, as it is not possible to get a hold of
the lldb-server binary.
One solution to that would be to duplicate the server configuration
logic to be able to specify both executables. However, that seems
excessively redundant.
A well-behaved lldb should be able to find the debug server on its own,
and testing lldb with a different (lldb-|debug)server does not seem very
useful (even in the out-of-tree debugserver setup, we copy the server
into the build tree to make it appear "real").
Therefore, this patch deletes the configuration altogether and changes
the low-level server retrieval functions to be able to both lldb-server
and debugserver paths. They do this by consulting the "support
executable" directory of the lldb under test.
Differential Revision: https://reviews.llvm.org/D96202
In the tablegen architecture definition, the Name field for the
ARMv87a record read "ARMv86a". All the other records contain their own
names.
Corrected it to "ARMv87a", and added the necessary value in
ARMArchEnum for that to refer to.
Reviewed By: pratlucas
Differential Revision: https://reviews.llvm.org/D96493
Various get_image builtin function declarations did not have the const
attribute. Bring the const attributes of `-fdeclare-opencl-builtins`
more in sync with `opencl-c.h`.
The patch did not account for one corner case where cmp does not dominate
the loop latch. This patch adds this check, hopefully it's cheap because
the CFG does not change during the transform, so DT queries should be
executed quickly.
If you see compile time slowness from this, please revert.
Differential Revision: https://reviews.llvm.org/D96119
The swift_bridge attribute warns when the attribute is applied multiple
times to the same declaration. However, it warns about the arguments
being different to the attribute without ever checking if the arguments
actually are different. If the arguments are different, diagnose,
otherwise silently accept the code. Either way, drop the duplicated
attribute.
This changes which of the getScalarizationOverhead overloads is used in
the gather/scatter cost to use the base variant directly, not relying on
the version using heuristics on the number of args with no args
provided. It should still produce the same costs for scalarized
gathers/scatters.
The '%dexter_regression_test' substitution was missing quotes around the
python executable, unlike other substitutions of a similar nature in the
file. This changes fixes the issue.
Differential Revision: https://reviews.llvm.org/D96420
Reviewed by: jmorse, aganea
This patch just addresses one of the outstanding TODOs. More
specifically, it moves all the outstanding standard macro predefinitions
from `SetDefaultFortranOpts` to `setDefaultPredefinitions`. This
dedicated method for standard macro predefs was introduced in:
* https://reviews.llvm.org/D96032
Move implementation of kill intrinsics to WQM pass. Add live lane
tracking by updating a stored exec mask when lanes are killed.
Use live lane tracking to enable early termination of shader
at any point in control flow.
Reviewed By: piotr
Differential Revision: https://reviews.llvm.org/D94746
With t2DoLoopDec we can be left with some extra MOV's in the preheaders
of tail predicated loops. This removes them, in the same way we remove
other dead variables.
Differential Revision: https://reviews.llvm.org/D91857
Today, inside a template, you can get completion for:
Foo<T> t;
t.^
t has dependent type Foo<T>, and we use the primary template to find its members.
However we also want this to work:
t.foo.bar().^
The type of t.foo.bar() is DependentTy, so we attempt to resolve using similar
heuristics (e.g. primary template).
Differential Revision: https://reviews.llvm.org/D96376
Since the new pass manager has been enabled by default these tests had their
-O1 variations failing due to the tested functions being inlined. This just
adds no_inline to the respective code similar to what we did in other
tests (e.g. aa56b30014 ).
Add the builtin functions brought by the
cl_khr_subgroup_extended_types extension to
`-fdeclare-opencl-builtins`.
Differential Revision: https://reviews.llvm.org/D96279
This will be needed in the loop-vectorizer where the minimum VF
requested may be a scalable VF. getMinimumVF now takes an additional
operand 'IsScalableVF' that indicates whether a scalable VF is required.
Reviewed By: kparzysz, rampitec
Differential Revision: https://reviews.llvm.org/D96020
With the standard dialect being split up, the set of dialects that are
used when converting to GPU is growing. This change modifies the
SCFToGpu pass to allow all operations inside launch bodies.
Differential Revision: https://reviews.llvm.org/D96480
We were storing predicate registers, such as a <8 x i1>, in the opposite
order to how the rest of llvm expects. This actually turns out to be
correct for the one place that usually uses it - the
ScalarizeMaskedMemIntrin pass, but only because the pass was incorrect
itself. This fixes the order so that bits are stored in the opposite
order and bitcasts work as expected. This allows the Scalarization pass
to be fixed, as in https://reviews.llvm.org/D94765.
Differential Revision: https://reviews.llvm.org/D94867
The EndLoc of a type loc can be invalid for broken code.
Also extend the existing test to support error code with `error-ok`
annotation.
Differential Revision: https://reviews.llvm.org/D96261
This patch is NFC and changes occurrences of `unsigned Width`
and `unsigned i` to work on type ElementCount instead.
This patch is a preparatory patch with the ultimate goal of making
`computeMaxVF()` return both a max fixed VF and a max scalable VF,
so that `selectVectorizationFactor()` can pick the most cost-effective
vectorization factor.
Reviewed By: david-arm
Differential Revision: https://reviews.llvm.org/D96019
This patch fixes an issue in the implementation of DUP/CPY where certain
immediates were not accepted. Immediates should be interpreted as a two's
complement encoding of a value that fits the number of bits of the element
type.
mov z0.b, p0/z, #127
<=> mov z0.b, p0/z, #-129
<=> mov z0.b, p0/z, #0xffffffffffffff7f
This behaviour is in line with the GNU assembler.
Reviewed By: c-rhodes
Differential Revision: https://reviews.llvm.org/D94776
The dimension order of a filter in tensorflow is
[filter_height, filter_width, in_channels, out_channels], which is different
from current definition. The current definition follows TOSA spec. Add TF
version conv ops to .tc, so we do not have to insert a transpose op around a
conv op.
Reviewed By: antiagainst
Differential Revision: https://reviews.llvm.org/D96038