This enables us to intercept changes to the token type via setType(), which
is a precondition for being able to use multi-pass formatting for macro
arguments.
Differential Revision: https://reviews.llvm.org/D67405
Summary: This allows for suppressing warnings about the conversion function never being called if it overrides a virtual function in a base class.
Differential Revision: https://reviews.llvm.org/D78444
Under MVE a vdup will always take a gpr register, not a floating point
value. During DAG combine we convert the types to a bitcast to an
integer in an attempt to fold the bitcast into other instructions. This
is OK, but only works inside the same basic block. To do the same trick
across a basic block boundary we need to convert the type in
codegenprepare, before the splat is sunk into the loop.
This adds a convertSplatType function to codegenprepare to do that,
putting bitcasts around the splat to force the type to an integer. There
is then some adjustment to the code in shouldSinkOperands to handle the
extra bitcasts.
Differential Revision: https://reviews.llvm.org/D78728
The existing implementation of SubViewOp::getRanges relies on all
offsets/sizes/strides to be dynamic values and does not work in
combination with canonicalization. This revision adds a
SubViewOp::getOrCreateRanges to create the missing constants in the
canonicalized case.
This allows reactivating the fused pass with staged pattern
applications.
However another issue surfaces that the SubViewOp verifier is now too
strict to allow folding. The existing folding pattern is turned into a
canonicalization pattern which rewrites memref_cast + subview into
subview + memref_cast.
The transform-patterns-matmul-to-vector can then be reactivated.
Differential Revision: https://reviews.llvm.org/D79759
Similar to fmul/fadd, we can sink a splat into a loop containing a fma
in order to use more register instruction variants. For that there are
also adjustments to the sinking code to handle more than 2 arguments.
Differential Revision: https://reviews.llvm.org/D78386
This patch adds a new TTI hook to allow targets to tell LSR that
a chain including some instruction is already profitable and
should not be optimized. This patch also adds an implementation
of this TTI hook for ARM so LSR doesn't optimize chains that include
the VCTP intrinsic.
Differential Revision: https://reviews.llvm.org/D79418
Due to the extension of Liveness, Buffer Assignment can now work on nested regions. This PR provides a test case to show that existing functionally of BA works properly.
Differential Revision: https://reviews.llvm.org/D79332
Summary:
In the assembler or inline assembler,
attempting to use an invalid fixup type
gives a crash with a segmentation fault.
__attribute__((naked))
void foo(void) {
__asm__("mov r9, :lower16:bar(prel31)");
}
This should give a proper error message when building for ARM or Thumb.
This brings it in line with AARCH64.
This fixes all 8 instances of llvm_unreachable("Unsupported Modifier");
in ARM/MCTargetDesc/ARMELFObjectWriter.cpp.
A test is provided for each instance.
Reviewers: llvm-commits, MarkMurrayARM
Reviewed By: MarkMurrayARM
Subscribers: kristof.beyls, hiraditya, danielkiss
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D79782
Change-Id: I6971ba37f129cc453568fe71514ccb2ac9d16831
This was reverted because of a miscompilation. At closer inspection, the
problem was actually visible in a changed llvm regression test too. This
one-line follow up fix/recommit will splat the IV, which is what we are trying
to avoid if unnecessary in general, if tail-folding is requested even if all
users are scalar instructions after vectorisation. Because with tail-folding,
the splat IV will be used by the predicate of the masked loads/stores
instructions. The previous version omitted this, which caused the
miscompilation. The original commit message was:
If tail-folding of the scalar remainder loop is applied, the primary induction
variable is splat to a vector and used by the masked load/store vector
instructions, thus the IV does not remain scalar. Because we now mark
that the IV does not remain scalar for these cases, we don't emit the vector IV
if it is not used. Thus, the vectoriser produces less dead code.
Thanks to Ayal Zaks for the direction how to fix this.
This is a reimplementation of the `orderNodes` function, as the old
implementation didn't take into account all cases.
Fix PR41509
Differential Revision: https://reviews.llvm.org/D79037
Summary:
Synchronize the function definition with the LLVM documentation.
https://llvm.org/docs/Atomics.html#libcalls-atomic
GCC also returns bool for the same atomic builtin.
Reviewers: theraven
Reviewed By: theraven
Subscribers: theraven, dberris, jfb, #sanitizers
Tags: #sanitizers
Differential Revision: https://reviews.llvm.org/D79845
The current standard Alloca node is not annotated with the
MemEffect<Alloc> trait. This CL updates the Alloc and Alloca
memory-effect annotations using the latest Resource objects. Alloca
nodes will use a newly defined AutomaticAllocationScopeResource to
distinguish between Alloc and Alloca memory effects.
Differential Revision: https://reviews.llvm.org/D79620
The near-identical implementations of this function for posix-y
platforms were merged in r293910. PlatformWindows was left out of this
merge because at the time we did not have a suitable base class to sink
the code into. That is no longer true, so this commit finishes the job
by moving the code into RemoteAwarePlatform::ResolveExecutable.
Summary:
The D programming language has 'char', 'wchar', and 'dchar' as base types,
which are defined as UTF-8, UTF-16, and UTF-32, respectively.
It also has type constructors (e.g. 'const' and 'immutable'),
that leads to D compilers emitting DW_TAG_base_type with DW_ATE_UTF
and name 'char', 'immutable(wchar)', 'const(char)', etc...
Before this patch, DW_ATE_UTF would only recognize types that
followed the C/C++ naming, and emit an error message for the rest, e.g.:
```
error: need to add support for DW_TAG_base_type 'immutable(char)'
encoded with DW_ATE = 0x10, bit_size = 8
```
The code was changed to check the byte size first,
then fall back to the old name-based check.
Reviewers: clayborg, labath
Reviewed By: labath
Subscribers: labath, lldb-commits
Tags: #lldb
Differential Revision: https://reviews.llvm.org/D79559
This test tried to verify that "wait()" returned quickly but "quick" is
impossible to define given a busy and/or slow system.
Instead, I've refactored the test to verify that `wait()` actually
waits which the old test did not verify.
Windows doesn't properly support pass plugins (as a shared library
can't have undefined references, which pass plugins assume, being
loaded into a host process that contains provides them), thus
disable building it and the corresponding test.
This matches what was done for the passes unit test in
bc8e442188.
Differential Revision: https://reviews.llvm.org/D79771
Summary:
Nonnull attribute can be applied to non-pointers. This caused assertion
failures in NonNullParamChecker when we tried to *assume* such parameters
to be non-zero.
rdar://problem/63150074
Differential Revision: https://reviews.llvm.org/D79843
Summary:
This LWG issue states that the result of `year_month_day_last::day()` is implementation defined if `ok()` is `false`.
However, from user perspective, calling `day()` in this situation will lead to a (possibly difficult to find) crash.
Hence, I have added an assertion to warn user at least when assertions are enabled.
I am however not aware of the libc++ stand on the desired behaviour.
Reviewers: ldionne, mclow.lists, EricWF, #libc
Reviewed By: ldionne, #libc
Subscribers: christof, dexonsmith, libcxx-commits
Tags: #libc
Differential Revision: https://reviews.llvm.org/D70346
A lot of tests under PowerPC are using fast flag, while fast is just
alias of 7 fast-math flags. This change makes test points clearer.
mc-instrlat.ll and sms-iterator.ll keeps unchanged since they are not
testing fast-math behavior. (one for machine combiner crash, one for
machine pipeliner bug)
Reviewed By: steven.zhang, spatel
Differential Revision: https://reviews.llvm.org/D78989
Summary:
In TableGen's instruction selection table generator, references to
register classes were handled by generating a matcher table entry in the
form of "EmitStringInteger, MVT::i32, 'RegisterClassID'". This ID is in
fact the enum integer value corresponding to the register class.
However, both the table generator and the table consumer
(SelectionDAGISel) assume that this ID is less than or equal to 127,
i.e. at most 7 bits. Values greater than this threshold cause completely
wrong behaviours in the instruction selection process.
This patch adds a check to determine if the enum integer value is
greater than the limit of 127. In finding so, the generator emits an
"EmitInteger" instead, which properly supports values with arbitrary
sizes.
Commit f8d044bbcf fixed the very same bug
for register subindices. The present patch now extends this cover to
register classes.
Reviewers: rampitec
Reviewed By: rampitec
Subscribers: llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D79705
This patch extends DIModule Debug metadata in LLVM to support
Fortran modules. DIModule is extended to contain File and Line
fields, these fields will be used by Flang FE to create debug
information necessary for representing Fortran modules at IR level.
Furthermore DW_TAG_module is also extended to contain these fields.
If these fields are missing, debuggers like GDB won't be able to
show Fortran modules information correctly.
Reviewed By: aprantl
Differential Revision: https://reviews.llvm.org/D79484
Hide the method that allows setting probability for particular
edge and introduce a public method that sets probabilities for
all outgoing edges at once.
Setting individual edge probability is error prone. More over
it is difficult to check that the total probability is 1.0
because there is no easy way to know when the user finished
setting all the probabilities.
Reviewers: yamauchi, ebrevnov
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D79396
xsnegdp, xsabsdp and xsnabsdp can be used to operate on f32 operand.
This patch adds the missing patterns since we prefer VSX instructions
when available.
Reviewed By: steven.zhang
Differential Revision: https://reviews.llvm.org/D75344
Fix the assumption that all bitcasts of the same type sizes are free.
We now only assume that bitcasts between ints and ptrs of the same
size are free. This allows TTImpl to just call the concrete
implementation of getCastInstrCost.
Differential Revision: https://reviews.llvm.org/D78918
Legalizer should respect both command-line options or SDNode-level
fast-math flags.
Also, this patch propagates other flags during custom simplifying.
Reviewed By: steven.zhang
Differential Revision: https://reviews.llvm.org/D79074
This is only valid if the source tensors (result tensor) is static
shaped with all unit-extents when the reshape is collapsing
(expanding) dimensions.
Differential Revision: https://reviews.llvm.org/D79764
Summary:
The ppc-early-ret pass use the addReg() to add operand to the new
instruction, it can't reserve the flag of old operand. This has caused
machine verfications failed.
This patch use add() to instead of addReg().
Reviewed By: steven.zhang
Differential Revision: https://reviews.llvm.org/D77997
We need to avoid declaring dependencies on strings which are valid
LINK_LIBS and not valid targets. Previously, we used if(TARGET) to
check this condition. However, if(TARGET) checks whether a target has
been created (in the cmake subdirectory traversal order) and not
whether it *will* be created. This results in annoying directory
ordering problems.
This patch changes the check to more explicitly eliminate problematic
libraries (namely -lpthread) using a REGEX.
Differential Revision: https://reviews.llvm.org/D79837
Fixes PR45673
The commit 9180c14fe4 (D76206) resolved only a part of the problem
of concurrent .gcda file creation. It ensured that only one process
creates the file but did not ensure that the process locks the
file first. If not, the process which created the file may clobber
the contents written by a process which locked the file first.
This is the cause of PR45673.
This commit prevents the clobbering by revising the assumption
that a process which creates the file locks the file first.
Regardless of file creation, a process which locked the file first
uses fwrite (new_file==1) and other processes use mmap (new_file==0).
I also tried to keep the creation/first-lock process same by using
mkstemp/link/unlink but the code gets long. This commit is more
simple.
Note: You may be confused with other changes which try to resolve
concurrent file access. My understanding is (may not be correct):
D76206: Resolve race of .gcda file creation (but not lock)
This one: Resolve race of .gcda file creation and lock
D54599: Same as D76206 but abandoned?
D70910: Resolve race of multi-threaded counter flushing
D74953: Resolve counter sharing between parent/children processes
D78477: Revision of D74953
Differential Revision: https://reviews.llvm.org/D79556
Fixes PR41696
The loop-reroll pass generates an invalid IR (or its assertion
fails in debug build) if values of the base instruction and
other root instructions (terms used in the loop-reroll pass)
are used outside the loop block. See IRs written in PR41696
as examples.
The current implementation of the loop-reroll pass can reroll
only loops that don't have values that are used outside the
loop, except reduced values (the last values of reduction chains).
This is described in the comment of the `LoopReroll::reroll`
function.
https://github.com/llvm/llvm-project/blob/llvmorg-10.0.0/llvm/lib/Transforms/Scalar/LoopRerollPass.cpp#L1600
This is checked in the `LoopReroll::DAGRootTracker::validate`
function.
https://github.com/llvm/llvm-project/blob/llvmorg-10.0.0/llvm/lib/Transforms/Scalar/LoopRerollPass.cpp#L1393
However, the base instruction and other root instructions skip
this check in the validation loop.
https://github.com/llvm/llvm-project/blob/llvmorg-10.0.0/llvm/lib/Transforms/Scalar/LoopRerollPass.cpp#L1229
Moving the check in front of the skip is the logically simplest
fix. However, inserting the check in an earlier stage is better
in terms of compilation time of unrerollable loops. This fix
inserts the check for the base instruction into the function
to validate possible base/root instructions. Check for other
root instructions is unnecessary because they don't match any
base instructions if they have uses outside the loop.
Differential Revision: https://reviews.llvm.org/D79549
This patch marks following tests as xfail for arm-linux target.
lldb/test/API/functionalities/load_using_paths/TestLoadUsingPaths.py
lldb/test/API/python_api/thread/TestThreadAPI.py
lldb/test/Shell/Recognizer/assert.test
Bugs have been filed for all of them for the corresponding failing
component.
Summary:
Makes this operation runnable on CPU by generating MLIR instructions
that are eventually folded into an LLVM IR constant for the mask.
Reviewers: nicolasvasilache, ftynse, reidtatge, bkramer, andydavis1
Reviewed By: nicolasvasilache, ftynse, andydavis1
Subscribers: mehdi_amini, rriddle, jpienaar, shauheen, antiagainst, nicolasvasilache, arpith-jacob, mgester, lucyrfox, liufengdb, stephenneuendorffer, Joonsoo, grosul1, frgossen, Kayjukh, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D79815
This patch fixes minidebuginfo-set-and-hit-breakpoint.test for arm-linux
targets. 32-bit elf executables use .rel.dyn and 64-bit uses .rela.dyn for
relocation entries for dynamic symbols.
For AAReturnedValues we treated new and existing information differently
in the updateImpl. Only the latter was properly analyzed and
categorized. The former was thought to be analyzed in the subsequent
update. Since the Attributor does not support "self-updates" we need to
make sure the state is "stable" after each updateImpl invocation. That
is, if the surrounding information does not change, the state is valid.
Now we make sure all return values have been handled and properly
categorized each iteration. We might not update again if we have not
requested a non-fix attribute so we cannot "wait" for the next update to
analyze a new return value.
Bug reported by @sdmitriev.