Summary:
If the operands of a udiv/urem can be proved to fit within a smaller
power-of-two-sized type, reduce the width of the udiv/urem.
Backed out for failing an assert in clang bootstrap builds. Re-landing
with a fix for handling non-power-of-two inputs (e.g. udiv i24).
Original Differential Revision: https://reviews.llvm.org/D44102
llvm-svn: 326908
The purpose of this patch is to have LSR generate better code on Power.
This is done by overriding isLSRCostLess.
Differential Revision: https://reviews.llvm.org/D40855
llvm-svn: 326906
SampleProfReader assumes function names in the profile are all mangled names.
However, there are cases that few demangled names are somehow contained in
the profile (usually because of debug info problems), which may trigger parsing
error in SampleProfReader and cause the whole profile to be unusable. The patch
extends SampleProfReader to handle profiles with demangled names, so that those
profiles can still be useful.
Differential revision: https://reviews.llvm.org/D44161
llvm-svn: 326905
Instead of only printing the CU-relative offset in non-verbose mode, it
makes more sense to only printed the resolved address. In verbose mode
we still print both.
Differential revision: https://reviews.llvm.org/D44148
rdar://33525475
llvm-svn: 326903
Breaks bootstrap builds: clang built with this patch asserts while
building MCDwarf.cpp: Assertion `castIsValid(op, S, Ty) && "Invalid
cast!"' failed.
llvm-svn: 326900
Summary:
If the operands of a udiv/urem can be proved to fit within a smaller
power-of-two-sized type, reduce the width of the udiv/urem.
Reviewers: spatel, sanjoy
Subscribers: llvm-commits, hiraditya
Differential Revision: https://reviews.llvm.org/D44102
llvm-svn: 326898
These instructions are defined as taking a GPR register and a
coprocessor register for ISAs up to MIPS32. MIPS32 extended the
definition to allow a selector--a value from 0 to 32--to access
another register.
These instructions are now internally defined as being MIPS-I
instructions, but are rejected for pre-MIPS32 ISA's if they have
an explicit selector which is non-zero. This deviates slightly from
GAS's behaviour which rejects assembly instructions with an
explicit selector for pre-MIPS32 ISAs.
E.g:
mfc0 $4, $5, 0
is rejected by GAS for MIPS-I to MIPS-V but will be accepted
with this patch for MIPS-I to MIPS-V.
Reviewers: atanasyan
Differential Revision: https://reviews.llvm.org/D41662
llvm-svn: 326890
The LoadStoreVectorizer thought that <1 x T> and T were the same types
when merging stores, leading to a crash later.
Patch by Erik Hogeman.
Differential Revision: https://reviews.llvm.org/D44014
llvm-svn: 326884
getCurrCycleIdx() returns the decoder cycle index which the next candidate SU
will be placed on.
This patch improves this method by passing the candidate SU to it so that if
SU will begin a new group, the index of that group is returned instead.
Review: Ulrich Weigand
llvm-svn: 326880
Handle the not-taken branch in emitInstruction() where the TakenBranch
argument is available. This is cleaner than relying on EmitInstruction().
Review: Ulrich Weigand
llvm-svn: 326879
Summary:
Only IMUL16rri uses an extra P0156. IMUL32* and IMUL16rr only use
P1.
This was computed using https://github.com/google/EXEgesis/blob/master/exegesis/tools/compute_itineraries.cc
This can easily be validated by running perf on the following code:
```
int main(int argc, char**argv) {
int a = argc;
int b = argc;
int c = argc;
int d = argc;
for (int i = 0; i < LOOP_ITERATIONS; ++i) {
asm volatile(
R"(
.rept 10000
imull $0x2, %%edx, %%eax
imull $0x2, %%ecx, %%ebx
imull $0x2, %%eax, %%edx
imull $0x2, %%ebx, %%ecx
.endr
)"
: "+a"(a), "+b"(b), "+c"(c), "+d"(d)
:
:);
}
return a+b+c+d;
}
```
-> test.cc
perf stat -x, -e cycles --pfm-events=uops_executed_port:port_0:u,uops_executed_port:port_1:u,uops_executed_port:port_2:u,uops_executed_port:port_3:u,uops_executed_port:port_4:u,uops_executed_port:port_5:u,uops_executed_port:port_6:u,uops_executed_port:port_7:u test
Reviewers: craig.topper, RKSimon, gadi.haber
Subscribers: llvm-commits, gchatelet, chandlerc
Differential Revision: https://reviews.llvm.org/D43460
llvm-svn: 326877
Summary: This avoids crashing when a user tries to dump a pdb with the `-native` option.
Reviewers: zturner, llvm-commits, rnk
Reviewed By: zturner
Subscribers: mgrang
Differential Revision: https://reviews.llvm.org/D44117
llvm-svn: 326863
Summary: This helps to determine the line number for a PDB type with definition
Reviewers: zturner, llvm-commits, rnk
Reviewed By: zturner
Subscribers: rengolin, JDevlieghere
Differential Revision: https://reviews.llvm.org/D44119
llvm-svn: 326857
I think most of the Intel Core CPUs and recent AMD CPUs are unaffected. All the CPUs that have a "subtype" should work. The ones that were broken are the ones that are a "type" with no subtypes.
Fixes PR36619.
llvm-svn: 326840
Fixes the bug found by asan. Also XFAIL the new test for Darwin,
which is stuck on DWARF v2, and fix up other tests so they stop
failing on Windows.
llvm-svn: 326839
It's been quite some time the Dependence Analysis (DA) is broken,
as it uses the GEP representation to "identify" multi-dimensional arrays.
It even wrongly detects multi-dimensional arrays in single nested loops:
from test/Analysis/DependenceAnalysis/Coupled.ll, example @couple6
;; for (long int i = 0; i < 50; i++) {
;; A[i][3*i - 6] = i;
;; *B++ = A[i][i];
DA used to detect two subscripts, which makes no sense in the LLVM IR
or in C/C++ semantics, as there are no guarantees as in Fortran of
subscripts not overlapping into a next array dimension:
maximum nesting levels = 1
SrcPtrSCEV = %A
DstPtrSCEV = %A
using GEPs
subscript 0
src = {0,+,1}<nuw><nsw><%for.body>
dst = {0,+,1}<nuw><nsw><%for.body>
class = 1
loops = {1}
subscript 1
src = {-6,+,3}<nsw><%for.body>
dst = {0,+,1}<nuw><nsw><%for.body>
class = 1
loops = {1}
Separable = {}
Coupled = {1}
With the current patch, DA will correctly work on only one dimension:
maximum nesting levels = 1
SrcSCEV = {(-2424 + %A)<nsw>,+,1212}<%for.body>
DstSCEV = {%A,+,404}<%for.body>
subscript 0
src = {(-2424 + %A)<nsw>,+,1212}<%for.body>
dst = {%A,+,404}<%for.body>
class = 1
loops = {1}
Separable = {0}
Coupled = {}
This change removes all uses of GEP from DA, and we now only rely
on the SCEV representation.
The patch does not turn on -da-delinearize by default, and so the DA analysis
will be more conservative in the case of multi-dimensional memory accesses in
nested loops.
I disabled some interchange tests, as the DA is not able to disambiguate
the dependence anymore. To make DA stronger, we may need to
compute a bound on the number of iterations based on the access functions
and array dimensions.
The patch cleans up all the CHECKs in test/Transforms/LoopInterchange/*.ll to
avoid checking for snippets of LLVM IR: this form of checking is very hard to
maintain. Instead, we now check for output of the pass that are more meaningful
than dozens of lines of LLVM IR. Some tests now require -debug messages and thus
only enabled with asserts.
Patch written by Sebastian Pop and Aditya Kumar.
Differential Revision: https://reviews.llvm.org/D35430
llvm-svn: 326837
These two functions iterate over the list of statistics but don't take the lock
that protects the iterators from being invalidated by
StatisticInfo::addStatistic().
So far, this hasn't been an issue since (in-tree at least) these functions are
called by the StatisticInfo destructor so addStatistic() shouldn't be called
anymore. However, we do expose them in the public API.
Note that this only protects against iterator invalidation and does not protect
against ordering issues caused by statistic updates that race with
PrintStatistics()/PrintStatisticsJSON().
Thanks to Roman Tereshin for spotting it
llvm-svn: 326834
The code checks Level == AfterLegalizeDAG which is the fourth and last of the possible DAG combine stages that we have.
There is a Level called AfterLegalVectorOps, but that's the third DAG combine and it doesn't always run.
A function called isAfterLegalVectorOps should imply it returns true in either of the DAG combines that runs after the legalize vector ops stage, but that's not what this function does.
llvm-svn: 326832
In case if -mattr used to modify feature set bits in llvm-mc call
getIsaVersion can fail to identify specific ISA due to test mismatch.
Adding default fallback tests which will always correctly report at
least major version.
Differential Revision: https://reviews.llvm.org/D44163
llvm-svn: 326825
Summary:
- Emit UdtSourceLine information for enums to match MSVC
- Add a method to add UDTSrcLine and call it for all Class/Struct/Union/Enum
- Update test cases to verify the changes
Reviewers: zturner, llvm-commits, rnk
Reviewed By: rnk
Differential Revision: https://reviews.llvm.org/D44116
llvm-svn: 326824
Using cst_pred_ty in the definition allows us to match vectors with undef elements.
This is a continuation of an effort to make all pattern matchers allow undef elements in vectors:
rL325437
rL325466
D43792
Differential Revision: https://reviews.llvm.org/D44076
llvm-svn: 326823
Most of the folds based on SelectPatternResult belong in InstSimplify rather than
InstCombine, so the helper code should be available to other passes/analysis.
llvm-svn: 326812
Following the ARM-neon backend, define isExtractSubvectorCheap to return true
when extracting low and high part of a neon register.
The patch disables a test in llvm/test/CodeGen/AArch64/arm64-ext.ll This
testcase is fragile in the sense that it requires a BUILD_VECTOR to "survive"
all DAG transforms until ISelLowering. The testcase is supposed to check that
AArch64TargetLowering::ReconstructShuffle() works, and for that we need a
BUILD_VECTOR in ISelLowering. As we now transform the BUILD_VECTOR earlier into
an VEXT + vector_shuffle, we don't have the BUILD_VECTOR pattern when we get to
ISelLowering. As there is no way to disable the combiner to only exercise the
code in ISelLowering, the patch disables the testcase.
Differential revision: https://reviews.llvm.org/D43973
llvm-svn: 326811
AsmToken is in the MCParser library, so we can't use its dump function from
MCAsmMacro in the MC library. Instead, just print the string, which we don't
need the MCParser library for.
llvm-svn: 326810
One addrspacecast disappeared in clang emitted IR for
block invoke function due to adoption of the new
addr space mapping.
Differential Revision: https://reviews.llvm.org/D43785
llvm-svn: 326806
This patch handling:
Enable parsing of raw encodings of system registers .
Allows UNPREDICTABLE sysregs to be decoded to a raw number in the same way that disasslib does, rather than llvm crashing.
Disassemble msr/mrs with unpredictable sysregs as SoftFail.
Fix regression due to SoftFailing some encodings.
Patch by Chris Ryder
Differential revision:https://reviews.llvm.org/D43374
llvm-svn: 326803
This adds some debug printing (gated behind the "asm-macros" debug flag) which
can help tracing complicated assembly macros.
Differential revision: https://reviews.llvm.org/D43937
llvm-svn: 326795
* Move printing from llvm-mc to the AsmToken class, so that it can be used elsewhere.
* Add 5 cases which were missed: BigNum, Comment, HashDirective, Space and
BackSlash, and remove the default case so that -Wswitch will catch this error
in future.
This is almost NFC, except for the fact that llvm-mc can now print those 5
tokens in -as-lex mode.
Differential revision: https://reviews.llvm.org/D43936
llvm-svn: 326794
Change doCallSiteSplitting to iterate until we reach the terminator instruction.
tryToSplitCallSite can replace BB's terminator in case BB is a successor of
itself. Then IE will be invalidated and we also have to check the current
terminator.
Reviewers: junbuml, davidxl, davide, fhahn
Reviewed By: fhahn, junbuml
Differential Revision: https://reviews.llvm.org/D43824
llvm-svn: 326793