Summary:
AIX compilers define macros based on the version of the operating
system.
This patch implements updating of versionless AIX triples to include the
host AIX version. Also, the host triple detection in the build system is
adjusted to strip the AIX version information so that the run-time
detection is preferred.
Reviewers: xingxue, stefanp, nemanjai, jasonliu
Reviewed By: xingxue
Subscribers: mgorny, kristina, jdoerfert, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D58798
llvm-svn: 355995
This patch adds an XCOFF triple object format type into LLVM.
This XCOFF triple object file type will be used later by object file and assembly generation for the AIX platform.
Differential Revision: https://reviews.llvm.org/D58930
llvm-svn: 355989
Currently we have -Rpass for filtering the remarks that are displayed as
diagnostics, but when using -fsave-optimization-record, there is no way
to filter the remarks while generating them.
This adds support for filtering remarks by passes using a regex.
Ex: `clang -fsave-optimization-record -foptimization-record-passes=inline`
will only emit the remarks coming from the pass `inline`.
This adds:
* `-foptimization-record-passes` to the driver
* `-opt-record-passes` to cc1
* `-lto-pass-remarks-filter` to the LTOCodeGenerator
* `--opt-remarks-passes` to lld
* `-pass-remarks-filter` to llc, opt, llvm-lto, llvm-lto2
* `-opt-remarks-passes` to gold-plugin
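To make the filtering rule concrete, here is a minimal sketch of selecting remarks by pass name with a regex. The `Remark` struct and `filterRemarksByPass` helper are hypothetical names, not the LLVM API, and the use of unanchored search semantics (`std::regex_search`) is an assumption.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical, simplified remark: only the fields needed for filtering.
struct Remark {
  std::string pass;
  std::string message;
};

// Keeps the remarks whose pass name matches `pattern` anywhere in the name.
std::vector<Remark> filterRemarksByPass(const std::vector<Remark> &remarks,
                                        const std::string &pattern) {
  const std::regex filter(pattern);
  std::vector<Remark> kept;
  for (const Remark &remark : remarks)
    if (std::regex_search(remark.pass, filter))
      kept.push_back(remark);
  return kept;
}
```

With the pattern `inline`, only remarks produced by passes whose name contains `inline` survive; an anchored pattern such as `^inline$` would restrict the match to that exact name.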
Differential Revision: https://reviews.llvm.org/D59268
Original llvm-svn: 355964
llvm-svn: 355984
A faulting_op is one that has specified behavior when a fault occurs, generally redirecting control flow to another location. This change just adds a comment to the assembly output which makes it both human readable, and machine checkable w/o having to parse the FaultMap section. This is used to split a test file into two parts, so that I can (in a near future commit) easily extend the test file to demonstrate another case.
llvm-svn: 355982
This indicates an intrinsic parameter is required to be a constant,
and should not be replaced with a non-constant value.
Add the attribute to all AMDGPU and generic intrinsics that comments
indicate it should apply to. I scanned other target intrinsics, but I
don't see any obvious comments indicating which arguments are intended
to be only immediates.
This breaks one questionable testcase for the autoupgrade. I'm unclear
on whether the autoupgrade is supposed to really handle declarations
which were never valid. The verifier fails because the attributes now
refer to a parameter past the end of the argument list.
llvm-svn: 355981
Summary:
This is similar to how addr2line handles consecutive entries with the
same address - pick the last one.
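As a rough illustration of this lookup rule (a sketch, not the actual line-table code; the `Row` struct and `findRowIndex` helper are hypothetical names), taking the last row whose address does not exceed the query selects the last of several rows that share one address:

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical, simplified line-table row.
struct Row {
  uint64_t address;
  unsigned line;
};

// Returns the index of the row covering `addr`, or -1 if `addr` precedes
// every row. `rows` must be sorted by address.
int findRowIndex(const std::vector<Row> &rows, uint64_t addr) {
  assert(std::is_sorted(
      rows.begin(), rows.end(),
      [](const Row &a, const Row &b) { return a.address < b.address; }));
  // First row strictly past `addr`; the row before it is the last one
  // whose address is <= addr, so ties resolve to the last entry.
  auto it = std::upper_bound(
      rows.begin(), rows.end(), addr,
      [](uint64_t a, const Row &r) { return a < r.address; });
  if (it == rows.begin())
    return -1;
  return static_cast<int>(it - rows.begin()) - 1;
}
```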
Reviewers: dblaikie, friss, JDevlieghere
Reviewed By: dblaikie
Subscribers: ormris, echristo, JDevlieghere, probinson, aprantl, hiraditya, rupprecht, jdoerfert, llvm-commits
Tags: #llvm, #debug-info
Differential Revision: https://reviews.llvm.org/D58952
llvm-svn: 355972
Every time a physical register reference was parsed, this would
initialize a string map for every register in the target, and discard
it for the next. The same applies for the other fields initialized
from target information.
Follow along with how the function state is tracked, and add a new
tracking class for target information.
The string->register class and string->register bank maps were, for some
reason, kept separately, so track them in the same place.
llvm-svn: 355970
Currently we have -Rpass for filtering the remarks that are displayed as
diagnostics, but when using -fsave-optimization-record, there is no way
to filter the remarks while generating them.
This adds support for filtering remarks by passes using a regex.
Ex: `clang -fsave-optimization-record -foptimization-record-passes=inline`
will only emit the remarks coming from the pass `inline`.
This adds:
* `-foptimization-record-passes` to the driver
* `-opt-record-passes` to cc1
* `-lto-pass-remarks-filter` to the LTOCodeGenerator
* `--opt-remarks-passes` to lld
* `-pass-remarks-filter` to llc, opt, llvm-lto, llvm-lto2
* `-opt-remarks-passes` to gold-plugin
Differential Revision: https://reviews.llvm.org/D59268
llvm-svn: 355964
The included test case currently crashes on tip of tree. Rather than adding a bailout, I chose to restructure the code so that the existing helper function could be used. Given that, the majority of the diff is NFC-ish, but the key difference is that canConvertValue returns false when only one side is a non-integral pointer.
Thanks to Cherry Zhang for the test case.
Differential Revision: https://reviews.llvm.org/D59000
llvm-svn: 355962
The existing statepoint lowering code does something odd; it adds machine memory operands post instruction selection. This was copied from the stackmap/patchpoint implementation, but appears to be non-idiomatic.
This change is largely NFC. It moves the MMO creation logic into SelectionDAG building. It ends up not quite being NFC because the size of the stack slot is reflected in the MMO. The old code blindly used pointer size for the MMO size, which appears to have always been incorrect for larger values. It just happened nothing actually relied on the MMOs, so it worked out okay.
For context, I'm planning on removing the MOVolatile flag from these in a future commit, and then removing the MOStore flag from deopt spill slots in a separate one. Doing so is motivated by a small test case where we should be able to better schedule spill slots, but don't do so due to a memory use/def implied by the statepoint.
Differential Revision: https://reviews.llvm.org/D59106
llvm-svn: 355953
Summary:
This fixes an extremely long compile time caused by recursive analysis
of truncs, which were not previously subject to any depth limits unlike
some of the other ops. I decided to use the same control used for
sext/zext, since the routines analyzing these are sometimes mutually
recursive with the trunc analysis.
Reviewers: mkazantsev, sanjoy
Subscribers: sanjoy, jdoerfert, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D58994
llvm-svn: 355949
Vector immediate-setting instructions like XXLXORz/XXLXORspz/XXLXORdpz
should behave like LI8.
We should set corresponding flags to allow rematerialization and other
opts in LICM, RA, Scheduling etc.
Differential Revision: https://reviews.llvm.org/D58645
llvm-svn: 355948
This patch adds a new option to SplitAllCriticalEdges and uses it to avoid splitting critical edges when the destination basic block ends with unreachable. Otherwise if we split the critical edge, sanitizer coverage will instrument the new block that gets inserted for the split. But since this block itself shouldn't be reachable this is pointless. These basic blocks will stick around and generate assembly, but they don't end in sane control flow and might get placed at the end of the function. This makes it look like one function has code that flows into the next function.
This showed up while compiling the linux kernel with clang. The kernel has a tool called objtool that detected the code that appeared to flow from one function to the next. https://github.com/ClangBuiltLinux/linux/issues/351#issuecomment-461698884
Differential Revision: https://reviews.llvm.org/D57982
llvm-svn: 355947
If a symbol points to the end of a fragment, instead of searching for
fixups in that fragment, search in the next fragment.
Fixes spurious assembler error with subtarget change next to "la"
pseudo-instruction, or expanded equivalent.
Alternate proposal to fix the problem discussed in
https://reviews.llvm.org/D58759.
Testcase by Ana Pazos.
Differential Revision: https://reviews.llvm.org/D58943
llvm-svn: 355946
Expand MULO with constant power of two operand into a shift. The
overflow is checked with (x << shift) >> shift == x, where the right
shift will be logical for umulo and arithmetic for smulo (with
exception for multiplications by signed_min).
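The check can be sketched in plain C++ for 32-bit values. This illustrates the arithmetic only, not the DAG expansion itself; the struct and function names are hypothetical. The signed variant assumes an arithmetic right shift on signed integers (guaranteed since C++20, and the usual behavior before that) and excludes the signed_min multiplier by requiring shift < 31.

```cpp
#include <cassert>
#include <cstdint>

struct UMulResult { uint32_t value; bool overflow; };
struct SMulResult { int32_t value; bool overflow; };

// x * 2^shift for unsigned x: overflow iff the logical shift back loses bits.
UMulResult umuloPow2(uint32_t x, unsigned shift) {
  assert(shift < 32 && "shift must be smaller than the bit width");
  const uint32_t product = x << shift;
  return {product, (product >> shift) != x};
}

// x * 2^shift for signed x: overflow iff the arithmetic shift back differs.
SMulResult smuloPow2(int32_t x, unsigned shift) {
  assert(shift < 31 && "2^31 is signed_min in i32 and is handled separately");
  // Shift in the unsigned domain to avoid signed-overflow undefined behavior.
  const int32_t product =
      static_cast<int32_t>(static_cast<uint32_t>(x) << shift);
  return {product, (product >> shift) != x};
}
```

For example, `umuloPow2(0x80000000u, 1)` wraps to 0 and reports overflow, while `smuloPow2(-3, 2)` yields -12 without overflow.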
Differential Revision: https://reviews.llvm.org/D59041
llvm-svn: 355937
This patch removes two assertions that were preventing writing of a test
that checked an empty line followed by some text. For example:
CHECK: {{^$}}
CHECK-NEXT: foo()
The assertion fired because the location that CHECK-NEXT was
scanning from was the start of the buffer. A similar issue occurred with
CHECK-SAME. These assertions don't protect against anything, as there is
already an error check that checks that CHECK-NEXT/EMPTY/SAME don't
appear first in the checks, and the following code works fine if the
pointer is at the start of the input.
Reviewed by: probinson, thopre, jdenny
Differential Revision: https://reviews.llvm.org/D58784
llvm-svn: 355928
Targets can potentially emit more efficient code if they know address
computations never overflow. For example ILP32 code on AArch64 (which only has
64-bit address computation) can ignore the possibility of overflow with this
extra information.
llvm-svn: 355926
The code might intend to replace puts("") with putchar('\n') even if the
return value is used. It failed because use_empty() was used to guard
the whole block. While returning '\n' (putchar('\n')) is technically
correct (puts is only required to return a nonnegative number on
success), doing this looks weird and there is really little benefit to
optimize puts whose return value is used. So don't do that.
llvm-svn: 355921
This was found when we generated a COPY from G8RC to F8RC in
EmitInstrWithCustomInserter without checking for the proper architecture:
we silently generated mtvsrd, which requires P8 and up.
This is an NFC patch that adds an assert when we call copyPhysReg, in case
someone accidentally generates a COPY from G8RC to F8RC for P7 and
below.
llvm-svn: 355920
This is a refactoring patch that removes the redundancy of performing operand reordering twice, once in buildTree() and later in vectorizeTree().
To achieve this we need to keep track of the operands within the TreeEntry struct while building the tree, and later in vectorizeTree() we are just accessing them from the TreeEntry in the right order.
This patch is the first in a series of patches that will allow for better operand reordering across chains of instructions (e.g., a chain of ADDs), as presented here: https://www.youtube.com/watch?v=gIEn34LvyNo
Patch by: @vporpo (Vasileios Porpodas)
Differential Revision: https://reviews.llvm.org/D59059
........
Reverted due to buildbot failures that I don't have time to track down.
llvm-svn: 355913
This is a refactoring patch that removes the redundancy of performing operand reordering twice, once in buildTree() and later in vectorizeTree().
To achieve this we need to keep track of the operands within the TreeEntry struct while building the tree, and later in vectorizeTree() we are just accessing them from the TreeEntry in the right order.
This patch is the first in a series of patches that will allow for better operand reordering across chains of instructions (e.g., a chain of ADDs), as presented here: https://www.youtube.com/watch?v=gIEn34LvyNo
Patch by: @vporpo (Vasileios Porpodas)
Differential Revision: https://reviews.llvm.org/D59059
llvm-svn: 355906
This is addressing the issue that we're not modeling the cost of clib functions
in TTI::getIntrinsicCosts and thus we're basically addressing this fixme:
// FIXME: This is wrong for libc intrinsics.
To enable analysis of clib functions, we not only need an intrinsic ID and
formal arguments, but also the actual user of that function so that we can e.g.
look at alignment and values of arguments. So, this is the initial plumbing to
pass the user of an intrinsic on to getCallCosts, which queries
getIntrinsicCosts.
Differential Revision: https://reviews.llvm.org/D59014
llvm-svn: 355901
These two values correspond to the 'Empty' and 'Tombstone' special
keys defined by DenseMapInfo<int64_t>, which means that neither one
can be used as a key in DenseMap<int64_t, anything>. Hence, if you try
to use either of those values as an int literal, IntInit::get() fails
an assertion when it tries to insert them into its static cache of
int-literal objects.
Fixed by replacing the DenseMap with a std::map, which doesn't intrude
on the space of legal values of the key type.
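A minimal sketch of the idea, assuming a simplified uniquing cache (the `IntLiteral` type and `getIntLiteral` function are hypothetical stand-ins, not the TableGen classes): an ordered `std::map` reserves no key values, so the extremes of `int64_t` can be cached like any other literal.

```cpp
#include <cstdint>
#include <map>
#include <memory>

// Hypothetical stand-in for a uniqued integer-literal object.
struct IntLiteral {
  int64_t value;
};

// Returns the unique object for `v`, creating it on first use.
// std::map has no sentinel keys, so every int64_t value is accepted.
const IntLiteral *getIntLiteral(int64_t v) {
  static std::map<int64_t, std::unique_ptr<IntLiteral>> cache;
  std::unique_ptr<IntLiteral> &slot = cache[v];
  if (!slot)
    slot = std::make_unique<IntLiteral>(IntLiteral{v});
  return slot.get();
}
```

The trade-off is the usual one: lookups become logarithmic rather than expected constant time, in exchange for not reserving any keys.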
Reviewers: nhaehnle, hfinkel, javedabsar, efriedma
Reviewed By: efriedma
Subscribers: fhahn, efriedma, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D59016
llvm-svn: 355900
Change from original commit: move test (that uses an X86 triple) into the X86
subdirectory.
Original description:
Gating vectorizing reductions on *all* fastmath flags seems unnecessary;
`reassoc` should be sufficient.
Reviewers: tvvikram, mkuper, kristof.beyls, sdesmalen, Ayal
Reviewed By: sdesmalen
Subscribers: dcaballe, huntergr, jmolloy, mcrosier, jlebar, bixia, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D57728
llvm-svn: 355889
Summary:
Swift now generates PDBs for debugging on Windows. llvm and lldb
need a language enumerator value to properly handle the output
emitted by swiftc.
Subscribers: jdoerfert, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D59231
llvm-svn: 355882
For the design in question, overloads seem to be a much simpler and less subtle solution.
This removes ODR issues, and errors of the kind where code that uses the
specialization in question will accidentally and erroneously specialize
the primary template. This only "works" by accident; the program is
ill-formed NDR.
(Found with -Wundefined-func-template.)
Patch by Thomas Köppe!
Differential Revision: https://reviews.llvm.org/D58998
llvm-svn: 355880
ProcFeatures was a class that just concatenated two feature lists together and gave it a name. We used it to inherit features between CPUs.
ProcModel took two CPU feature lists and concatenated them before deferring to ProcessorModel. This was to allow inherited features and specific features to be passed to each CPU.
Both of these allowed for only very rigid CPU inheritance rules.
With this patch we now store all of the lists we were using for inheritance in one object and do any list concatenation we want there. Then we just pass whatever list we want from this class into the ProcessorModel class for each CPU.
Hopefully this gives us more flexibility to build up feature lists in whatever ways we think make sense. Perhaps untangling ISA flags and tuning flags.
I've only touched the CPUs that were directly affected by the removal of the ProcModel and ProcFeatures classes. We should move more of the feature lists into ProcessorFeatures.
llvm-svn: 355872
After r355865, we should be able to safely select G_EXTRACT_VECTOR_ELT without
running into any problematic intrinsics.
Also add a fix for lane copies, which don't support index 0.
llvm-svn: 355871
AtomicCmpSwapWithSuccess is legalised into an AtomicCmpSwap plus a comparison.
This requires an extension of the value which, by default, is a
zero-extension. When we later lower AtomicCmpSwap into a PseudoCmpXchg32, which is then expanded in
RISCVExpandPseudoInsts.cpp, the lr.w instruction does a sign-extension.
This mismatch of extensions causes the comparison to fail when the compared
value is negative. This change overrides TargetLowering::getExtendForAtomicOps
for RISC-V so it does a sign-extension instead.
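The mismatch can be reproduced with plain integer arithmetic. This is a sketch that assumes 64-bit registers holding 32-bit values; the helper names are hypothetical.

```cpp
#include <cstdint>

// A 32-bit value as it appears in a 64-bit register after sign-extension
// (what the loaded value looks like after lr.w).
uint64_t signExtend32(int32_t v) {
  return static_cast<uint64_t>(static_cast<int64_t>(v));
}

// The same value after zero-extension (the default extension).
uint64_t zeroExtend32(int32_t v) {
  return static_cast<uint64_t>(static_cast<uint32_t>(v));
}

// Comparison with mismatched extensions: wrong for negative values.
bool equalMismatched(int32_t expected, int32_t loaded) {
  return zeroExtend32(expected) == signExtend32(loaded);
}

// Comparison with both sides sign-extended: correct for all values.
bool equalSignExtended(int32_t expected, int32_t loaded) {
  return signExtend32(expected) == signExtend32(loaded);
}
```

For -1, the zero-extended form is 0x00000000FFFFFFFF and the sign-extended form is 0xFFFFFFFFFFFFFFFF, so `equalMismatched(-1, -1)` is false even though the 32-bit values are identical.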
Differential Revision: https://reviews.llvm.org/D58829
Patch by Ferran Pallarès Roca.
llvm-svn: 355869
The RISC-V Assembly Programmer's Manual defines fp as another alias of x8.
However, our tablegen rules only recognise s0. This patch adds fp as another
alias of x8. GCC also accepts fp.
Differential Revision: https://reviews.llvm.org/D59209
Patch by Ferran Pallarès Roca.
llvm-svn: 355867
Overloaded intrinsics aren't necessarily safe for instruction selection. One
such intrinsic is aarch64.neon.addp.*.
This is a temporary workaround to ensure that we always fall back on that
intrinsic. Eventually this will be replaced with a proper solution.
https://bugs.llvm.org/show_bug.cgi?id=40968
Differential Revision: https://reviews.llvm.org/D59062
llvm-svn: 355865
It hasn't seen active development in years, and it hasn't reached a
state where it was useful.
Remove the code until someone is interested in working on it again.
Differential Revision: https://reviews.llvm.org/D59133
llvm-svn: 355862
Fixes https://bugs.llvm.org/show_bug.cgi?id=36796.
Implement basic legalizations (PromoteIntRes, PromoteIntOp,
ExpandIntRes, ScalarizeVecOp, WidenVecOp) for VECREDUCE opcodes.
There are more legalizations missing (esp float legalizations),
but there's no way to test them right now, so I'm not adding them.
This also includes a few more changes to make this work somewhat
reasonably:
* Add support for expanding VECREDUCE in SDAG. Usually
experimental.vector.reduce is expanded prior to codegen, but if the
target does have native vector reduce, it may of course still be
necessary to expand due to legalization issues. This uses a shuffle
reduction if possible, followed by a naive scalar reduction.
* Allow the result type of integer VECREDUCE to be larger than the
vector element type. For example we need to be able to reduce a v8i8
into a (nominally) i32 result type on AArch64.
* Use the vector operand type rather than the scalar result type to
determine the action, so we can control exactly which vector types are
supported. Also change the legalize vector op code to handle
operations that only have vector operands, but no vector results, as
is the case for VECREDUCE.
* Default VECREDUCE to Expand. On AArch64 (only target using VECREDUCE),
explicitly specify for which vector types the reductions are supported.
This does not handle anything related to VECREDUCE_STRICT_*.
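The expansion strategy mentioned above (a shuffle reduction where possible, then a naive scalar reduction) can be sketched on a plain array of lanes. This is an illustration only: the real expansion works on SelectionDAG nodes and consults the legality of the narrower vector types, whereas this hypothetical `expandReduceAdd` simply keeps halving while the lane count is even.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Integer add-reduction over `lanes` (wrapping unsigned arithmetic).
uint32_t expandReduceAdd(std::vector<uint32_t> lanes) {
  // Shuffle-style steps: fold the high half onto the low half.
  while (lanes.size() > 1 && lanes.size() % 2 == 0) {
    const std::size_t half = lanes.size() / 2;
    for (std::size_t i = 0; i < half; ++i)
      lanes[i] += lanes[i + half];
    lanes.resize(half);
  }
  // Naive scalar reduction over whatever lanes remain.
  uint32_t acc = 0;
  for (uint32_t lane : lanes)
    acc += lane;
  return acc;
}
```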
Differential Revision: https://reviews.llvm.org/D58015
llvm-svn: 355860
As a fix for https://bugs.llvm.org/show_bug.cgi?id=40986 ("excessive compile
time building opencollada"), this patch makes sure that no phys reg is hinted
more than once from getRegAllocationHints().
This handles the case where many virtual registers are assigned to the same
physreg. The previous compile time fix (r343686) in weightCalcHelper() only
made sure that physical/virtual registers are passed no more than once to
addRegAllocationHint().
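The deduplication can be sketched as an order-preserving filter. This is a hypothetical helper operating on plain register numbers, not the actual getRegAllocationHints() code.

```cpp
#include <unordered_set>
#include <vector>

// Returns `candidates` with repeated physical registers dropped, keeping
// the first occurrence of each so the hint priority order is preserved.
std::vector<unsigned> uniquePhysRegHints(const std::vector<unsigned> &candidates) {
  std::unordered_set<unsigned> seen;
  std::vector<unsigned> hints;
  for (unsigned reg : candidates)
    if (seen.insert(reg).second) // true only the first time `reg` is seen
      hints.push_back(reg);
  return hints;
}
```

When many virtual registers map to the same physical register, the output stays bounded by the number of distinct physical registers instead of growing with the number of virtual ones.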
Review: Dimitry Andric, Quentin Colombet
https://reviews.llvm.org/D59201
llvm-svn: 355854
Summary:
Depends on https://reviews.llvm.org/D59069.
https://bugs.llvm.org/show_bug.cgi?id=40979 describes a bug in which the
-coro-split pass would assert that a use was across a suspend point from
a definition. Normally this would mean that a value would "spill" across
a suspend point and thus need to be stored in the coroutine frame. However,
in this case the use was unreachable, and so it would not be necessary
to store the definition on the frame.
To prevent the assert, simply remove unreachable basic blocks from a
coroutine function before computing spills. This avoids the assert
reported in PR40979.
Reviewers: GorNishanov, tks2103
Reviewed By: GorNishanov
Subscribers: EricWF, jdoerfert, llvm-commits, lewissbaker
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D59068
llvm-svn: 355852
Summary:
llvm-objdump can be tricked into reading beyond valid memory and
segfaulting if LC_LINKER_COMMAND strings are not null terminated. libObject
does have code to validate the integrity of the LC_LINKER_COMMAND struct,
but this validator improperly assumes linker command strings are null
terminated.
The solution is to report an error if a string extends beyond the end of
the LC_LINKER_COMMAND struct.
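A minimal sketch of such a bounds-checked scan, assuming the strings are laid out back to back in the command payload (the `linkerStringsFit` helper and its parameters are hypothetical, not the libObject code):

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Returns true if `count` NUL-terminated strings fit entirely inside the
// `size` bytes starting at `data`. The search never reads past the end.
bool linkerStringsFit(const char *data, std::size_t size, unsigned count) {
  assert(data != nullptr);
  std::size_t offset = 0;
  for (unsigned i = 0; i < count; ++i) {
    // memchr is bounded by the remaining bytes, unlike strlen.
    const void *nul = std::memchr(data + offset, '\0', size - offset);
    if (nul == nullptr)
      return false; // string runs past the end of the command
    offset = static_cast<std::size_t>(static_cast<const char *>(nul) - data) + 1;
  }
  return true;
}
```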
Reviewers: lhames, pete
Reviewed By: pete
Subscribers: rupprecht, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D59179
llvm-svn: 355851