Default address space (applies when no explicit address space was
specified) maps to generic (4) address space.
Added SYCL named address spaces `sycl_global`, `sycl_local` and
`sycl_private` defined as sub-sets of the default address space.
Static variables without address space now reside in global address
space when compile for SPIR target, unless they have an explicit address
space qualifier in source code.
Differential Revision: https://reviews.llvm.org/D89909
1. Add an accessor function to MCSymbolizer to retrieve addresses
referenced by a symbolizable operand, but not resolved to a symbol.
That way, the caller can synthesize labels at those addresses and
then retry disassembling the section.
2. Implement that in AMDGPU -- a failed symbol lookup results in the
address being added to a vector returned by the new function.
3. Use that in llvm-objdump when using MCSymbolizer (which only happens
on AMDGPU) and SymbolizeOperands is on.
Differential Revision: https://reviews.llvm.org/D101145
Change-Id: I19087c3bbfece64bad5a56ee88bcc9110d83989e
When transforming a loop terminating condition into a "max" comparison,
the DebugLoc from the old condition should be set on the newly created
comparison. They are the same operation, just optimized. Fixes PR48067.
Differential Revision: https://reviews.llvm.org/D98218
This expands the VMOVRRD(extract(..(build_vector(a, b, c, d)))) pattern,
to also handle insert_vectors. Providing we can find the correct insert,
this helps further simplify patterns by removing the redundant VMOVRRD.
Differential Revision: https://reviews.llvm.org/D100245
We can already vectorize loops that involve int<>int, fp<>fp, int<>fp
and fp<>int conversions, however we didn't previously have any tests
for them. This patch adds some tests for each conversion type.
Differential Revision: https://reviews.llvm.org/D99951
When vectorising for AArch64 targets if you specify the SVE attribute
we automatically then treat masked loads and stores as legal. Also,
since we have no cost model for masked memory ops we believe it's
cheap to use the masked load/store intrinsics even for fixed width
vectors. This can lead to poor code quality as the intrinsics will
currently be scalarised in the backend. This patch adds a basic
cost model that marks fixed-width masked memory ops as significantly
more expensive than for scalable vectors.
Tests for the cost model are added here:
Transforms/LoopVectorize/AArch64/masked-op-cost.ll
Differential Revision: https://reviews.llvm.org/D100745
When iterating over const blocks, the base type in the lambdas needs
to use const VPBlockBase *, otherwise it cannot be used with input
iterators over const VPBlockBase.
Also adjust the type of the input iterator range to const &, as it
does not take ownership of the input range.
When generating output for `-fdebug-dump-symbols`, make sure that
BuildRuntimeDerivedTypeTables is also run. This change is needed in
order to make the implementation of `-fdebug-dump-symbols` in
`flang-new` consistent with `f18`. It also allows us to port more tests
to use the new driver whenever it is enabled.
Differential Revision: https://reviews.llvm.org/D100649
The test added in D97533 (and modified by this patch) has some overly
strict printed metadata ordering requirements, specifically the
interleaving of DILocalVariable nodes and DILocation nodes. Slight changes
in metadata emission can easily break this unfortunately.
This patch stops after clang codegen rather than allowing the coro splitter
to run, and reduces the need for ordering: it picks out the
DILocalVariable nodes being sought, in any order (CHECK-DAG), and doesn't
examine any DILocations. The implicit CHECK-NOT is what's important: the
test seeks to ensure a duplicate set of DILocalVariables aren't emitted in
the same scope.
Differential Revision: https://reviews.llvm.org/D100298
Clang-format was indenting the lines following the `?` in the added test
case by +5 instead of +4. This only happens in a very specific
situation, where the `?` is followed by a multiline block comment, as in
the example. This fix addresses this without regressing any of the
existing tests.
Differential Revision: https://reviews.llvm.org/D101033
CGP can move instructions like a ptrtoint into a loop, but the
MVETailPredication when converting them will currently assume invariant
trip counts. This tries to ensure the operands are loop invariant, and
bails if not.
Differential Revision: https://reviews.llvm.org/D100550
Initial (D96045) patch didn't handle split dwarf cases,
so this fixes that bug.
In addition, before applying this patch, we had a slowdown
that happened after the D96045. With this patch,
the slowdown will be fixed as well.
Differential Revision: https://reviews.llvm.org/D100951
Add option to `clang-scan-deps` to enable/disable generation of command-line arguments with absolute paths. This is essentially a revert of D100533, but with improved naming and added test.
Reviewed By: dexonsmith
Differential Revision: https://reviews.llvm.org/D101051
We have several extensions that need i32 to be Custom for
INTRINSIC_WO_CHAIN with RV64 so enable it for all RV64.
For V extension, make i32 Custom for RV64 and i64 Custom for RV32.
When the i32 or i64 is legal, the operation action doesn't matter.
LegalizeDAG checks MVT::Other rather than the real type.
This teaches DAG combine that shift amount operands for grev, gorc
shfl, unshfl only read a few bits.
This also teaches DAG combine that grevw, gorcw, shflw, unshflw,
bcompressw, bdecompressw only consume the lower 32 bits of their
inputs.
In the future we can teach SimplifyDemandedBits to also propagate
demanded bits of the output to the inputs in some cases.
https://reviews.llvm.org/D99400 set clang DefaultDebuggerTuning for AIX
to dbx. However, we still need to update the target default so that llc
and other tools will get the same default debuggertuning, and avoid
passing extra options in LTO.
Reviewed By: #powerpc, shchenz, dblaikie
Differential Revision: https://reviews.llvm.org/D101197
In EHFrameRegistrationPlugin::notifyTransferringResources if SrcKey had
eh-frames associated but DstKey did not we would create a new entry for DskKey,
invalidating the iterator for SrcKey in the process. This commit fixes that by
removing SrcKey first in this case.
`X86TTIImpl::getInterleavedMemoryOpCostAVX2()` currently contains data
only for a handful of tuples. For now, at least add tests for a few more.
I'm guessing that we care how well the patterns codegen since
we use their presumed cost for vectorization decisions,
so i've added codegen tests too.
There's one really easy caveat for these codegen tests:
for interleaved load tests, we really have to ensure that the
deinterleaved vectors are escaped separately. Similarly for stores.
Bootstrap with `-Werror` is currently broken due to D79714.
This patch is required to bring the bootstrap bot back to green. The
code will likely need to be fixed and the pragmas removed in due time,
but for now we need to bring the bot back up.
Bot that is currently failing:
https://lab.llvm.org/buildbot/#/builders/36/builds/7680
Differential Revision: https://reviews.llvm.org/D101214
Try to fix bug 49974.
This patch fixes two issues:
1. BL does not use predicate (BL_pred is the predicate version of BL),
so we shouldn't add predicate operands in DecodeBranchImmInstruction.
2. Inside DecodeT2AddSubSPImm, we shouldn't add predicate operands into
the MCInst because ARMDisassembler::AddThumbPredicate will do that for us.
However, we should handle CC-out operand for t2SUBspImm and t2AddspImm.
Differential Revision: https://reviews.llvm.org/D100585