D9844 fixed a problem where the ss suffix in the AsmString "cmp${cc}ss"
was recognised as the X86 SS register, by only recognising a token as a
register name if it is "isolated", i.e. surrounded by separator
characters.
In the AMDGPU backend there are many operands like $clamp which expand
to an optional string " clamp" including the preceding space, so we want
to have AsmStrings including sequences like "vcc$clamp" where vcc is a
register name.
This patch relaxes the rules for an isolated token, to say that it's OK
if the token is immediately followed by a '$'.
Differential Revision: https://reviews.llvm.org/D90315
When moving +0.0 into a float vector, we can use to vi*gpr variants of
INS.
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D90176
When trying to prove that a memory access touches only dereferenceable memory across all iterations of a loop, use the maximum exit count rather than an exact one. In many cases we can't prove exact exit counts whereas we can prove an upper bound.
The test included is for a single exit loop with a min(C,V) exit count, but the true motivation is support for multiple exits loops. It's just really hard to write a test case for multiple exits because the vectorizer (the primary user of this API), bails far before this. For multiple exits, this allows a mix of analyzeable and unanalyzable exits when only analyzeable exits are needed to prove deref.
classes into the enclosing block scope.
We weren't properly detecting whether the name would be injected into a
block scope in the case where it was lexically declared in a local
class.
This fixes a subtle issue, described in the comment starting with
"Clone the op without the regions and inline the regions from the old op",
which prevented this conversion from working on non-trivial examples.
Differential Revision: https://reviews.llvm.org/D90203
Previously, if make_paths_relative() failed due to some reason, it would
happily keep going and set the ${out_pathlist} to the standard output
of the command, which would be the empty string if the command failed.
This can lead to issues that are difficult to diagnose, since the calling
code will usually try to keep going with a variable that was set to the
empty string.
Differential Revision: https://reviews.llvm.org/D89985
Split `FileEntry` and `FileEntryRef` out into a new file
`clang/Basic/FileEntry.h`. This allows current users of a
forward-declared `FileEntry` to transition to `FileEntryRef` without
adding more includers of `FileManager.h`.
Also split `UniqueID` out to llvm/Support/FileSystem/UniqueID.h, so
`FileEntry.h` doesn't need to include all of `FileSystem.h` for just
that type.
Differential Revision: https://reviews.llvm.org/D89761
Use LocationSize::upperBound instead of precise since we only know an upper bound on the number of bytes read/written.
Reviewed By: fhahn
Differential Revision: https://reviews.llvm.org/D89885
The support of a few debug info attributes specifically for Fortran
arrays have been added to LLVM recently, but there's no way to take
advantage of them through DIBuilder. This patch extends
DIBuilder::createArrayType to enable the settings of those attributes.
Patch by Chih-Ping Chen!
Differential Review: https://reviews.llvm.org/D90323
This is needed to support fortran assumed rank arrays which
have runtime rank.
Summary:
Fortran assumed rank arrays have dynamic rank. DWARF TAG
DW_TAG_generic_subrange is needed to support that.
Testing:
unit test cases added (hand-written)
check llvm
check debug-info
Reviewed By: aprantl
Differential Revision: https://reviews.llvm.org/D89218
Branch weights are not represented internally linearly with the value in
the IR. In its current state the test happened to pass, but the branch
weights for 0,3,6 and 2,5,8,9 were not actually equal.
$ opt -passes='print<branch-prob>'
shows that the sum of the branch probabilities going to bb0 and bb2 were not the same.
Printing analysis results of BPI for function 'bt_order_by_weight':
---- Branch Probabilities ----
edge entry -> bb0 probability is 0x00000003 / 0x80000000 = 0.00%
edge entry -> bb2 probability is 0x00000004 / 0x80000000 = 0.00%
with this change:
Printing analysis results of BPI for function 'bt_order_by_weight':
---- Branch Probabilities ----
edge entry -> bb0 probability is 0x00000004 / 0x80000000 = 0.00%
edge entry -> bb2 probability is 0x00000004 / 0x80000000 = 0.00%
Reviewed By: hans
Differential Revision: https://reviews.llvm.org/D90273
Previously we added support for target nowait, but target data nowait
has not been supported yet. In this patch, target data nowait will also be
wrapped into a task.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D90099
If most elements of BUILD_VECTOR are the same, with a few different
elements, it is better to use DUP for the common elements and
INSERT_VECTOR_ELT for the different elements.
Currently this transform is guarded quite restrictively to only trigger
in clearly beneficial cases.
With D90176, the lowering for patterns originating from code like
` float32x4_t y = {a,a,a,0};` (common in 3D apps) are lowered even
better (unnecessary fmov is removed).
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D90233
https://reviews.llvm.org/D88060
This adds the following combines
1) build_vector formation from insert_vec_elts
2) insert_vec_elts (build_vector) -> build_vector
Check for duplicate clauses associated with directive. Clauses can appear only once
in the 4 lists associated with each directive (allowedClauses, allowedOnceClauses,
allowedExclusiveClauses, requiredClauses). Duplicates were already present (removed with this
patch) or were introduce in new patches by mistake (D89861).
Reviewed By: kiranchandramohan
Differential Revision: https://reviews.llvm.org/D90241
Fix an out-of-bounds shift in emitLegacyZExt by using a slightly more
complicated dwarf expression to create the zext mask.
This addresses a UBSan diagnostic seen when compiling compiler-rt
(llvm.org/PR47927).
rdar://70307714
Differential Revision: https://reviews.llvm.org/D89838
Make the virtual method Toolchain::GetDefaultStackProtectorLevel()
return an explict enum value rather than an integral constant. This
makes the code subjectively easier to read, and should help prevent bugs
that may (or may never) arise from changing the enum values. Previously,
these were just kept in sync via a comment, which is brittle. The trade
off is including a additional header in a few new places. It is not
necessary, but in my opinion helps the readability.
Split off from https://reviews.llvm.org/D90194 to help cut down on lines
changed in code review.
Reviewed By: phosek
Differential Revision: https://reviews.llvm.org/D90271
Define the __vector_pair and __vector_quad types that are used to manipulate
the new accumulator registers introduced by MMA on PowerPC. Because these two
types are specific to PowerPC, they are defined in a separate new file so it
will be easier to add other PowerPC specific types if we need to in the future.
Differential Revision: https://reviews.llvm.org/D81508
Completing the series of FIXME removals for special-case intrinsics:
50dfa19cc7f2c25c7079c963bde01501ea93d85d
This one looks quite different than the others. The size/blended
cost is still potentially very far off from the throughput cost,
but this is hopefully not worse on the whole. It looks like the
underlying costs for the expanded shift/logic have their own
cost-kind limitations. Also, we are not asking the target if
it has a legal funnel shift op, so we just assume that the
intrinsic gets expanded.
Getting the body of a Module is a common need which justifies a
dedicated accessor instead of forcing users to go through the
region->blocks->front unwrapping manually.
Differential Revision: https://reviews.llvm.org/D90287
This patch modifies the Clang AVR toolchain so that it always passes
the '-Tdata=0x800100' to the linker for ATmega328 devices. This matches
AVR-GCC behaviour, and also corresponds to the address of the start of
the data section in data space according to the ATmega328 datasheet.
Without this, clang does not produce a valid ATmega328 binary.
When targeting all non-ATmega328 chips, a warning will be emitted due to
the fact that proper handling for the chips data section address is
not yet implemented.
I've held off adding other microcontrollers for now, mostly because the
AVR toolchain logic is smeared across LLVM core TableGen files, and two Clang
libraries. The 'family detection' logic is also only implemented for
ATmega328 at the moment, for similar reasons.
In the future, I aim to write an RFC to llvm-dev to find a better way
for LLVM to expose target-specific details such as these to compiler
frontends.
Differential Revision: https://reviews.llvm.org/D86629
both deferred_globals.cpp namespace.cpp require lldb in order to run and will
fail if it's not available.
add the required lines to the top of the tests.
As proposed in https://github.com/WebAssembly/simd/pull/376. This commit
implements new builtin functions and intrinsics for these instructions, but does
not yet add them to wasm_simd128.h because they have not yet been merged to the
proposal. These are the first instructions with opcodes greater than 0xff, so
this commit updates the MC layer and disassembler to handle that correctly.
Differential Revision: https://reviews.llvm.org/D90253