This patch adds functions to allow MachineLICM to hoist invariant stores.
Currently, MachineLICM does not hoist any store instructions, however
when storing the same value to a constant spot on the stack, the store
instruction should be considered invariant and be hoisted. The function
isInvariantStore iterates each operand of the store instruction and checks
that each register operand satisfies isCallerPreservedPhysReg. The store
may be fed by a copy, which is hoisted by isCopyFeedingInvariantStore.
This patch also adds the PowerPC changes needed to consider the stack
register as caller preserved.
Differential Revision: https://reviews.llvm.org/D40196
llvm-svn: 327856
Currently the WriteResPair style multi-classes take a single pipeline stage and latency, this patch generalizes this to make it easier to create complex schedules with ResourceCycles and NumMicroOps be overriden from their defaults.
This has already been done for the Jaguar scheduler to remove a number of custom schedule classes and adding it to the other x86 targets will make it much tidier as we add additional classes in the future to try and replace so many custom cases.
I've converted some instructions but a lot of the models need a bit of cleanup after the patch has been committed - memory latencies not being consistent, the class not actually being used when we could remove some/all customs, etc. I'd prefer to keep this as NFC as possible so later patches can be smaller and target specific.
Differential Revision: https://reviews.llvm.org/D44612
llvm-svn: 327855
Exit with a non-zero value in case any of the underlying clang-tidy
invocations exit with a non-zero value.
This is useful in case WarningsAsErrors is enabled for some of the
checks: if any of those checks find something, the exit status now
reflects that.
Also add the ability to use run-clang-tidy.py via lit, and assert that
the exit code is not 0 when modernize-use-auto is triggered
intentionally.
Reviewers: alexfh, aaron.ballman
Reviewed By: aaron.ballman
Differential Revision: https://reviews.llvm.org/D44366
llvm-svn: 327854
1. Given that we already have a classification bucket with 'nop' in the name,
that's where 'nop' belongs. Right now, it's only used for prefix bytes and 'pause'.
2. Make the latency of this class '1' for Jaguar to tell the scheduler (and presumably
llvm-mca) how to model the resource requirements better even though a nop has no
dependencies.
Differential Revision: https://reviews.llvm.org/D44608
llvm-svn: 327853
Summary:
This fixes a usage of createTemporaryFile in clang repo after
a change in llvm repo.
Reviewers: klimek, bkramer, krasimir, espindola, ilya-biryukov
Reviewed By: ilya-biryukov
Subscribers: cfe-commits
Differential Revision: https://reviews.llvm.org/D36828
llvm-svn: 327852
Summary:
This commit changes semantics of createUniqueFile and
createTemporaryFile variants that do not return file descriptors.
Previously they only checked if files exist, therefore being subject
to race conditions. Now they will create an empty file to avoid them.
Functions that do not create a file are now called
getPotentiallyUniqueTempFileName and getPotentiallyUniqueFileName.
Reviewers: klimek, bkramer, krasimir, JDevlieghere, espindola
Reviewed By: klimek
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D36827
llvm-svn: 327851
Summary:
Otherwise, patterns like in the test case produce cryptic error
messages about fields being resolved incompletely.
Change-Id: I713c0191f00fe140ad698675803ab1f8823dc5bd
Reviewers: arsenm, craig.topper, tra, MartinO
Subscribers: wdng, llvm-commits
Differential Revision: https://reviews.llvm.org/D44476
llvm-svn: 327850
Summary:
The docs already claim that this happens, but so far it hasn't. As a
consequence, existing TableGen files get this wrong a lot, but luckily
the fixes are all reasonably straightforward.
To make this work with all the existing forms of self-references (since
the true type of a record is only built up over time), the lookup of
self-references in !cast is delayed until the final resolving step.
Change-Id: If5923a72a252ba2fbc81a889d59775df0ef31164
Reviewers: arsenm, craig.topper, tra, MartinO
Subscribers: wdng, javed.absar, llvm-commits
Differential Revision: https://reviews.llvm.org/D44475
llvm-svn: 327849
Summary:
These are cases of self-references that exist today in practice. Let's
add tests for them to avoid regressions.
The self-references in PPCInstrInfo.td can be expressed in a simpler
way. Allowing this type of self-reference while at the same time
consistently doing late-resolve even for self-references is problematic
because there are references to fields that aren't in any class. Since
there's no need for this type of self-reference anyway, let's just
remove it.
Change-Id: I914e0b3e1ae7adae33855fac409b536879bc3f62
Reviewers: arsenm, craig.topper, tra, MartinO
Subscribers: nemanjai, wdng, kbarton, llvm-commits
Differential Revision: https://reviews.llvm.org/D44474
llvm-svn: 327848
Summary:
Make sure that we always fold immediately, so there's no point in
attempting to re-fold when nothing changes.
Change-Id: I069e1989455b6f2ca8606152f6adc1a5e817f1c8
Reviewers: arsenm, craig.topper, tra, MartinO
Subscribers: wdng, llvm-commits
Differential Revision: https://reviews.llvm.org/D44198
llvm-svn: 327847
Summary:
Cast-from-string for records isn't going away, but cast-from-string for
variables is a pretty dodgy feature to have, especially when referencing
template arguments. It's doubtful that this ever worked in a reliable
way, and nobody seems to be using it, so let's get rid of it and get
some related cleanups.
Change-Id: I395ac8a43fef4cf98e611f2f552300d21e99b66a
Reviewers: arsenm, craig.topper, tra, MartinO
Subscribers: wdng, llvm-commits
Differential Revision: https://reviews.llvm.org/D44195
llvm-svn: 327844
Normally DCE kills these, but at -O0 these get left behind
leaving suspicious looking illegal copies.
Replace with IMPLICIT_DEF to avoid iterator issues.
llvm-svn: 327842
This is the groundwork for adding the Armv8.2-A FP16 vector intrinsics, which
uses v4f16 and v8f16 vector operands and return values. All the moving parts
are tested with two intrinsics, a 1-operand v8f16 and a 2-operand v4f16
intrinsic. In a follow-up patch the rest of the intrinsics and tests will be
added.
Differential Revision: https://reviews.llvm.org/D44538
llvm-svn: 327839
This patch introduces a new class named HWStallEvent (see HWEventListener.h),
and updates the event listener interface. A HWStallEvent represents a pipeline
stall caused by the lack of hardware resources. Similarly to HWInstructionEvent,
the event type is an unsigned, and the exact meaning depends on the subtarget.
At the moment, HWStallEvent supports a few generic dispatch events.
The main goals of this patch is to remove the logic that counts dispatch stalls
from the DispatchUnit to the BackendStatistics view.
Previously, DispatchUnit was responsible for counting and classifying dispatch
stall events. With this patch, we delegate the task of counting and classifying
stall events to the listeners (i.e. in our case, it is view
"BackendStatistics"). So, the DispatchUnit doesn't have to do extra
(unnecessary) bookkeeping.
This patch also helps futher simplifying the Backend interface. Now class
BackendStatistics no longer has to query the Backend interface to obtain the
number of dispatch stalls. As a consequence, we can get rid of all the
'getNumXXX()' methods from class Backend.
The long term goal is to remove all the remaining dependencies between the
Backend and the BackendStatistics interface.
Differential Revision: https://reviews.llvm.org/D44621
llvm-svn: 327837
For generating NEON intrinsics, this determines the NEON data type, and whether
it should be a half type or an i16 type. I.e., we always pass a half type for
AArch64, this hasn't changed, but now also for ARM but only when FullFP16 is
enabled, and i16 otherwise.
This is intended to be non-functional change, but together with the backend
work in D44538 which adds support for f16 vectors, this enables adding the
AArch32 FP16 (vector) intrinsics.
Differential Revision: https://reviews.llvm.org/D44561
llvm-svn: 327836
I don't think anyone ever got this to work, what with getting exactly
the right Python dependency and so on. Removing it simplifies the
script, removes a number of hairy dependencies, and cuts ~30 MB off the
installer size.
llvm-svn: 327835
If DoneMBB becomes empty it must have CC added to its live-in list, since it
will fall-through into EndMBB. This happens when the CLC loop does the
complete range.
Review: Ulrich Weigand
llvm-svn: 327834
Summary:
Detects function calls where the return value is unused.
Checked functions can be configured.
Reviewers: alexfh, aaron.ballman, ilya-biryukov, hokein
Reviewed By: alexfh, aaron.ballman
Subscribers: hintonda, JonasToth, Eugene.Zelenko, mgorny, xazax.hun, cfe-commits
Tags: #clang-tools-extra
Patch by Kalle Huttunen!
Differential Revision: https://reviews.llvm.org/D41655
llvm-svn: 327833
Despite their names, RegSaveAreaPtrPtr and OverflowArgAreaPtrPtr
used to be i8* instead of i8**.
This is important, because these pointers are dereferenced twice
(first in CreateLoad(), then in getShadowOriginPtr()), but for some
reason MSan allowed this - most certainly because it was possible
to optimize getShadowOriginPtr() away at compile time.
Differential revision: https://reviews.llvm.org/D44520
llvm-svn: 327830
For MSan instrumentation with MS.ParamTLS and MS.ParamOriginTLS being
TLS variables, the CreateAdd() with ArgOffset==0 is a no-op, because
the compiler is able to fold the addition of 0.
But for KMSAN, which receives ParamTLS and ParamOriginTLS from a call
to the runtime library, this introduces a stray instruction which
complicates reading/testing the IR.
Differential revision: https://reviews.llvm.org/D44514
llvm-svn: 327829
This is a step towards the upcoming KMSAN implementation patch.
KMSAN is going to use a different warning function,
__msan_warning_32(uptr origin), so we'd better create the warning
calls in one place.
Differential Revision: https://reviews.llvm.org/D44513
llvm-svn: 327828
This is the same as 327248 except Arm defining _GLOBAL_OFFSET_TABLE_ to
be the base of the .got section as some existing code is relying upon it.
For most Targets the _GLOBAL_OFFSET_TABLE_ symbol is expected to be at
the start of the .got.plt section so that _GLOBAL_OFFSET_TABLE_[0] =
reserved value that is by convention the address of the dynamic section.
Previously we had defined _GLOBAL_OFFSET_TABLE_ as either the start or end
of the .got section with the intention that the .got.plt section would
follow the .got. However this does not always hold with the current
default section ordering so _GLOBAL_OFFSET_TABLE_[0] may not be consistent
with the reserved first entry of the .got.plt.
X86, X86_64 and AArch64 will use the .got.plt. Arm, Mips and Power use .got
Fixes PR36555
Differential Revision: https://reviews.llvm.org/D44259
llvm-svn: 327823
This is re-land of https://reviews.llvm.org/rL327362 with a fix
and regression test.
The crash was due to it is possible that for found MDL loop,
LHS or RHS may contain an invariant unknown SCEV which
does not dominate the MDL. Please see regression
test for an example.
Reviewers: sanjoy, mkazantsev, reames
Reviewed By: mkazantsev
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D44553
llvm-svn: 327822
Also move ADC8i8 and SBB8i8 in the Sandy Bridge model to the same class as ADC8ri and SBB8ri. That seems more accurate since its the 8i8 is just the register forced to AL instead of coming from modrm.
llvm-svn: 327820
Mostly ported from amd_builtins, uses only denormal path for fp32.
Passes CTS on carrizo and turks
Reviewer: Aaron Watry <awatry@gmail.com>
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
llvm-svn: 327818
This patch adds i128 division support by instruction LLVM to lower
128-bit divisions to the __udivmodti4 and __divmodti4 rtlib functions.
This also adds test for 64-bit division and 128-bit division.
Patch by Peter Nimmervoll.
llvm-svn: 327814
This matches what we do in SDNode.
This should allow SDValue::dump to be used in the debugger without getting an error if you don't pass an argument.
llvm-svn: 327811
Now the codebase can use the DWARFUnit superclass. It will make it later
seamlessly work also with DWARFPartialUnit for DWZ.
This patch is only a search-and-replace easily undone, nothing interesting
in it.
Differential revision: https://reviews.llvm.org/D42892
llvm-svn: 327810
DW_TAG_partial_unit for DWZ can be then presented by DWARFPartialUnit also
inherited from DWARFUnit.
Differential revision: https://reviews.llvm.org/D40466
llvm-svn: 327809
This is similar to the check later when we remap some of the instructions from one class to a new one. But if we reuse the class we don't get to do that check.
So many CPUs have violations of this check that I had to add a flag to the SchedMachineModel to allow it to be disabled. Hopefully we can get those cleaned up quickly and remove this flag.
A lot of the violations are due to overlapping regular expressions, but that's not the only kind of issue it found.
llvm-svn: 327808