The sdivrem will emit its own MOVSX to move %ah to the low byte of a register. By using a MOVSX for an any_extend this allows a post-isel peephole to merge them.
llvm-svn: 346581
After the division %ah is being sign extended to move it to lower byte of a register while avoiding a partial register read. We then zero extend the low byte to the full 32 bit register. But we don't use any of the zero extended bits. In the DAG the zero extend was really an any_extend so the sign extend should have been enough.
llvm-svn: 346580
This gives shuffle lowering the freedom to use zero_extend_vector_inreg for the unpckl shuffle. Shuffle combining usually makes this swap later, but not when AVX512 is enabled it seems.
While there also use DAG.getConstant to create a 0 vector instead of using the helper the forces a specific BUILD_VECTOR. I don't think that helper is usually needed. We're basically free to create a constant build_vector anytime and it will be legalized on its own.
llvm-svn: 346574
This patch adds support for funclets in frame lowering and ISel
lowering. Together with D50288 and D50166, it enables C++ exception
handling.
Patch by Sanjin Sijaric, with some fixes by me.
Differential Revision: https://reviews.llvm.org/D51524
llvm-svn: 346568
In r346432 ("[DAGCombine] Improve alias analysis for chain of independent stores"),
the order of ldi/sts blocks changed.
The new IR is equivalent to the old IR.
This patch updates the test to fix the test suite.
llvm-svn: 346565
LDRcp should be deleted when the dest register is dead in register
coalescing. Without MemOp, dead LDRcp will cause dead constant pool
value which references to non-existing label.
Patch by Yin Ma.
Differential Revision: https://reviews.llvm.org/D54173
llvm-svn: 346563
ComputeValueKnownInPredecessors has a "visited" set to prevent infinite
loops, since a value can be visited more than once. However, the
implementation didn't prevent the algorithm from taking exponential
time. Instead of removing elements from the RecursionSet one at a time,
we should keep around the whole set until
ComputeValueKnownInPredecessors finishes, then discard it.
The testcase is synthetic because I was having trouble effectively
reducing the original. But it's basically the same idea.
Instead of failing, we could theoretically cache the result instead.
But I don't think it would help substantially in practice.
Differential Revision: https://reviews.llvm.org/D54239
llvm-svn: 346562
Summary:
These tests fail on 32-bit builds because NaN payload bits in floating point
immediates are not necessarily preserved through compilation. This is because
the MC layer uses native doubles to store these values. The tests will be
reenabled once this problem has been fixed or deleted if we decide we don't care
about lowering payload bits.
Reviewers: aheejin, dschuff
Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits
Differential Revision: https://reviews.llvm.org/D54353
llvm-svn: 346558
With avx512f but not avx512bw we need to extend to v16i32 then truncate that to to v16i8. Previously we emitted both nodes during lowering, but I'm trying to switch to using target independent nodes and with that switched the extend+truncate wou
This patch changes the implementation to what will be necessary with that patch which helps minimize test diffs.
llvm-svn: 346552
A few fp128 tests were omitted from test/CodeGen/SystemZ/fp-round-01.ll
since in early days, LLVM couldn't handle implicitly generated library
calls to functions with long double arguments on SystemZ.
This deficiency was actually long since fixed, but those tests are
still missing. This patch adds the missing tests. NFC.
llvm-svn: 346541
This makes X86ISD::VSEXT more similar to ISD::SIGN_EXTEND and ISD::ZERO_EXTEND.
I'm hoping to replace X86ISD::VSEXT/VZEXT with target independent nodes. Making the target specific nodes similar to the target independent nodes helps minimize test diffs in that patch.
llvm-svn: 346539
Eliminate the stack frame in functions with the noreturn nounwind
attributes, and when the noreturn-stack-elim target feature is
enabled. This reduces the code and stack space needed for noreturn
functions.
Differential Revision: https://reviews.llvm.org/D54210
llvm-svn: 346532
It's possible for vector op legalization to generate a shuffle. If that happens we should give a chance for DAG combine to combine that with a build_vector input.
I also fixed a bug in combineShuffleOfScalars that was considering the number of uses on a undef input to a shuffle. We don't care how many times undef is used.
Differential Revision: https://reviews.llvm.org/D54283
llvm-svn: 346530
Summary:
The current implementation prepends a space on every line, making it difficult to compare against GNU strings.
The space appears to have come from handling --radix in rL292707. The space is for making sure there's a space between the radix and the value; however the space is still emitted even when there is no radix. This change fixes that so the space is only emitted when there is a radix.
Reviewers: jhenderson
Reviewed By: jhenderson
Subscribers: llvm-commits, compnerd
Differential Revision: https://reviews.llvm.org/D54238
llvm-svn: 346529
Both -fPIC and -G0 disable placement of globals in small data section,
but if a global has an explicit section assigmnent placing it in small
data, it should go there anyway.
llvm-svn: 346523
Currently in llvm, CalleeSavedInfo can only assign a callee saved register to
stack frame index to be spilled in the prologue. We would like to enable
spilling gprs to vector registers. This patch adds the capability to spill to
other registers aside from just the stack. It also adds the changes for power9
to spill gprs to volatile vector registers when they are available.
This happens only for leaf functions when using the option
-ppc-enable-pe-vector-spills.
Differential Revision: https://reviews.llvm.org/D39386
llvm-svn: 346512
This reverts commit r345972. Need to update the description + possibly
to update the patch itself after discussion with Eric Christofer.
llvm-svn: 346508
Summary:
lcov tracefiles are used by various coverage reporting tools and build
systems (e.g., Bazel). It is a simple text-based format to parse and
more convenient to use than the JSON export format, which needs
additional processing to map regions/segments back to line numbers.
It's a little unfortunate that "text" format is now overloaded to refer
specifically to JSON for export, but I wanted to avoid making any
breaking changes to the UI of the llvm-cov tool at this time.
Patch by Tony Allevato (@allevato).
Reviewers: Dor1s, vsk
Reviewed By: Dor1s, vsk
Subscribers: mgorny, llvm-commits
Differential Revision: https://reviews.llvm.org/D54266
llvm-svn: 346506
A minor improvement of buildVector() that skips creating an
INSERT_VECTOR_ELT for a Value which has already been used for the
REPLICATE.
Review: Ulrich Weigand
https://reviews.llvm.org/D54315
llvm-svn: 346504
Now that we have mixed type sizes, i1 values need to be explicitly
handled as we want to avoid promoting these values.
Differential Revision: https://reviews.llvm.org/D54308
llvm-svn: 346499
I noticed that we weren't generating broadcasts as much I thought we would with
D54271, and this is part of the problem.
Widening the shuffle elements means adding bitcasts and hiding the relationship
between a splatted scalar and the vector. If we can form a broadcast, do that
before going through the rest of the shuffle lowering because broadcasts should
be cheap and can often be load-folded.
Differential Revision: https://reviews.llvm.org/D54280
llvm-svn: 346498
After D45330, Dominators are required for IPSCCP and can be preserved.
This patch preserves DominatorTreeAnalysis in the new pass manager. AFAIK the legacy pass manager cannot preserve function analysis required by a module analysis.
Reviewers: davide, dberlin, chandlerc, efriedma, kuhar, NutshellySima
Reviewed By: chandlerc, kuhar, NutshellySima
Differential Revision: https://reviews.llvm.org/D47259
llvm-svn: 346486
The DAGCombiner tries to SimplifySelectCC as follows:
select_cc(x, y, 16, 0, cc) -> shl(zext(set_cc(x, y, cc)), 4)
It can't cope with the situation of reordered operands:
select_cc(x, y, 0, 16, cc)
In that case we just need to swap the operands and invert the Condition Code:
select_cc(x, y, 16, 0, ~cc)
Differential Revision: https://reviews.llvm.org/D53236
llvm-svn: 346484
We can stop recording conditions once we reached the immediate dominator
for the block containing the call site. Conditions in predecessors of the
that node will be the same for all paths to the call site and splitting
is not beneficial.
This patch makes CallSiteSplitting dependent on the DT anlysis. because
the immediate dominators seem to be the easiest way of finding the node
to stop at.
I had to update some exiting tests, because they were checking for
conditions that were true/false on all paths to the call site. Those
should now be handled by instcombine/ipsccp.
Reviewers: davide, junbuml
Reviewed By: junbuml
Differential Revision: https://reviews.llvm.org/D44627
llvm-svn: 346483
In SimplifyCFG when given a conditional branch that goes to BB1 and BB2, the hoisted common terminator instruction in the two blocks, caused debug line records associated with subsequent select instructions to become ambiguous. It causes the debugger to display unreachable source lines.
Differential Revision: https://reviews.llvm.org/D53390
llvm-svn: 346481
Previously, during the search, all values had to have the same
'TypeSize', which is equal to number of bits of the integer type of
the icmp operand. All values in the tree had to match this size;
meaning that, if we searched from i16, we wouldn't accept i8s. A
change in type size requires zext and truncs to perform the casts so,
to allow mixed narrow types, the handling of these instructions is
now slightly different:
- we allow casts if their result or operand is <= TypeSize.
- zexts are sinks if their result > TypeSize.
- truncs are still sinks if their operand == TypeSize.
- truncs are still sources if their result == TypeSize.
The transformation bails on finding an icmp that operates on data
smaller than the current TypeSize.
Differential Revision: https://reviews.llvm.org/D54108
llvm-svn: 346480
CMake invokes rc using the joined spelling which appears to be supported
by Microsoft's rc implementation, so we should support it as well.
Differential Revision: https://reviews.llvm.org/D54191
llvm-svn: 346470
CMake generate manifests that contain absolute filenames and these
currently result in assertion error. This change ensures that we
handle these correctly.
Differential Revision: https://reviews.llvm.org/D54194
llvm-svn: 346450
Summary:
The pass incorrectly assumed if there's a longjmp declaration in the
module, there is also a setjmp function declaration. Fixed it, and now
the pass only converts longjmp and does not do any other transformation
when there's no setjmp declaration in the module.
Fixes PR39562.
Reviewers: jgravelle-google, sbc100
Subscribers: dschuff, sunfish, llvm-commits
Differential Revision: https://reviews.llvm.org/D54273
llvm-svn: 346445
This patch adds logic to detect reductions across the inner and outer
loop by following the incoming values of PHI nodes in the outer loop. If
the incoming values take part in a reduction in the inner loop or come
from outside the outer loop, we found a reduction spanning across inner
and outer loop.
With this change, ~10% more loops are interchanged in the LLVM
test-suite + SPEC2006.
Fixes https://bugs.llvm.org/show_bug.cgi?id=30472
Reviewers: mcrosier, efriedma, karthikthecool, davide, hfinkel, dmgreen
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D43245
llvm-svn: 346438
Summary:
This fixes PR 37422
In ELF, non-weak symbols can also be non-prevailing. In this particular
PR, the __llvm_profile_* symbols are non-prevailing but weren't getting
dropped - causing multiply-defined errors with lld.
Also add a test, strong_non_prevailing.ll, to ensure that multiple
copies of a strong symbol are dropped.
To fix the test regressions exposed by this fix,
- do not mark prevailing copies for symbols with 'appending' linkage.
There's no one prevailing copy for such symbols.
- fix the prevailing version in dead-strip-fulllto.ll
- explicitly pass exported symbols to llvm-lto in fumcimport.ll and
funcimport_var.ll
Reviewers: tejohnson, pcc
Subscribers: mehdi_amini, inglorion, eraman, steven_wu, dexonsmith,
dang, srhines, llvm-commits
Differential Revision: https://reviews.llvm.org/D54125
llvm-svn: 346436
As discussed in D54073, we have a potential regression from more aggressive vector narrowing here, so let's try to avoid that by changing build-vector lowering slightly.
Insert-vector-element lowering always does this since there's no "pinsr" for ymm/zmm:
// If the vector is wider than 128 bits, extract the 128-bit subvector, insert
// into that, and then insert the subvector back into the result.
...but we can sometimes do better for insert-into-constant-vector by using shuffle lowering.
Differential Revision: https://reviews.llvm.org/D54271
llvm-svn: 346433
FindBetterNeighborChains simulateanously improves the chain
dependencies of a chain of related stores avoiding the generation of
extra token factors. For chains longer than the GatherAllAliasDepths,
stores further down in the chain will necessarily fail, a potentially
significant waste and preventing otherwise trivial parallelization.
This patch directly parallelize the chains of stores before improving
each store. This generally improves DAG-level parallelism.
Reviewers: courbet, spatel, RKSimon, bogner, efriedma, craig.topper, rnk
Subscribers: sdardis, javed.absar, hiraditya, jrtc27, atanasyan, llvm-commits
Differential Revision: https://reviews.llvm.org/D53552
llvm-svn: 346432
As noted by Andrea Di Biagio in https://bugs.llvm.org/show_bug.cgi?id=39465
both the loads and stores occupy both the store and load queues.
This is clearly wrong.
llvm-svn: 346425
Summary:
When the 3rd argument to these intrinsics is zero, lowering them
to shift instructions produces poison values, since we end up with
shift amounts equal to the number of bits in the shifted value. This
means we can only lower these intrinsics if we can prove that the
3rd argument is not zero.
Reviewers: arsenm
Reviewed By: arsenm
Subscribers: bnieuwenhuizen, jvesely, wdng, nhaehnle, llvm-commits
Differential Revision: https://reviews.llvm.org/D53739
llvm-svn: 346422
This eliminates the outlining penalty for llvm.trap/unreachable, because
callers no longer have to emit cleanup/ret instructions after calling an
outlined `noreturn` function.
rdar://45523626
llvm-svn: 346421
LC_BUILD_VERSION contains platform information that is useful for LLDB
to match up dSYM bundles with binaries. This patch copies the load
command over into the dSYM.
rdar://problem/44145175
rdar://problem/45883463
Differential Revision: https://reviews.llvm.org/D54233
llvm-svn: 346412
It was discovered in randomized testing that the SystemZ implementation of
shouldCoalesce() could be caused to crash when subreg liveness was
enabled. This was because an undef use of the virtual register was copied
outside current MBB at the point of shouldCoalesce() being called. For more
details, see https://bugs.llvm.org/show_bug.cgi?id=39276.
This patch changes the check for MBB locality from livein/liveout checks to
do checks for all instructions of both intervals being inside MBB. This
avoids the cases with dead defs / undef uses outside MBB, which are not
affecting liveness in/out of MBB.
The original test case included as a reduced .mir test case.
Review: Ulrich Weigand
https://reviews.llvm.org/D54197
llvm-svn: 346406
During review it was noted that while it appears that
the Piledriver can do two [consecutive] loads per cycle,
it can only do one store per cycle. It was suggested
that the sched model incorrectly models that,
but it was opted to fix this afterwards.
These tests show that the two consecutive loads are
modelled correctly, and one consecutive stores is not
modelled incorrectly. Unless i'm missing the point.
https://bugs.llvm.org/show_bug.cgi?id=39465
llvm-svn: 346404
Generalize code in Thumb2InstrInfo::storeRegToStackSlot() and
loadRegToStackSlot() to allow the GPR class or any of its sub-classes
(including hGPR) to be stored/loaded by ARM::t2STRi12/ARM::t2LDRi12.
Differential Revision: https://reviews.llvm.org/D51927
llvm-svn: 346401
The patch has been reverted because it ended up prohibiting propagation
of a constant to exit value. For such values, we should skip all checks
related to hard uses because propagating a constant is always profitable.
Differential Revision: https://reviews.llvm.org/D53691
llvm-svn: 346397
LSR reassociates constants as unfolded offsets when the constants fit as
immediate add operands, which currently prevents such constants from being
combined later with loop invariant registers.
This patch modifies GenerateCombinations() to generate a second formula which
includes the unfolded offset in the combined loop-invariant register.
This commit fixes a bug in the original patch (committed at r345114, reverted
at r345123).
Differential Revision: https://reviews.llvm.org/D51861
llvm-svn: 346390
Summary:
MergeFunctions currently tries to process strong functions before
weak functions, because weak functions can simply call strong
functions, while a strong/weak function cannot call a weak function
(a backing strong function is needed).
This patch additionally tries to process external functions before
local functions, because we definitely have to keep the external
function, but may be able to drop the local one (and definitely
can if it is also unnamed_addr).
Unfortunately, this exposes an existing bug in the implementation:
The FnTree and FNodesInTree structures can currently go out of
sync in the case where two weak functions are merged, because the
function in FnTree/FNodesInTree is RAUWed. This leaves it behind in
FnTree (this is intended, as it is the strong backing function which
should be used for further merges), while it is replaced in
FNodesInTree (this is not intended).
This is fixed by switching FNodesInTree from using a ValueMap to
using a DenseMap of AssertingVH.
This exposes another minor issue: Currently FNodesInTree is not
cleared after MergeFunctions finishes running. Currently, this is
potentially dangerous (e.g. if something else wants to RAUW a function
with a non-function), but at the very least it is unnecessary/inefficient.
After the change to use AssertingVH it becomes more problematic,
because there are certainly passes that remove functions.
This issue is fixed by clearing FNodesInTree at the end of the pass.
Reviewers: jfb, whitequark
Reviewed By: whitequark
Subscribers: rkruppe, llvm-commits
Differential Revision: https://reviews.llvm.org/D53271
llvm-svn: 346386
Summary:
For unnamed_addr functions we RAUW instead of only replacing direct callers. However, functions in which replacements were performed currently are not added back to the worklist, resulting in missed merging opportunities.
Fix this by calling removeUsers() prior to RAUW.
Reviewers: jfb, whitequark
Reviewed By: whitequark
Subscribers: rkruppe, llvm-commits
Differential Revision: https://reviews.llvm.org/D53262
llvm-svn: 346385
Promote alloca can vectorize a small array by bitcasting it to a
vector type. Extend vectorization for the case when alloca is
already a vector type. We still want to replace GEPs with an
insert/extract element instructions in this case.
Differential Revision: https://reviews.llvm.org/D54219
llvm-svn: 346376
Summary:
This change implements assembler parser, code emitter, ELF object writer
and disassembler for the MSP430 ISA. Also, more instruction forms are added
to the target description.
Reviewers: asl
Reviewed By: asl
Subscribers: pftbest, krisb, mgorny, llvm-commits
Differential Revision: https://reviews.llvm.org/D53661
llvm-svn: 346374
Summary:
Port the GNU style printNotes method to the LLVMStyle subclass.
This is basically just a heavy refactor so that the note parsing/formatting logic from the GNUStyle::printNotes can be shared with LLVMStyle::printNotes.
Reviewers: MaskRay
Reviewed By: MaskRay
Subscribers: dschuff, fedor.sergeev, llvm-commits
Differential Revision: https://reviews.llvm.org/D54220
llvm-svn: 346371
If all the edge counts for a function are zero, skip count population and
annotation, as nothing will happen. This can save some compile time.
Differential Revision: https://reviews.llvm.org/D54212
llvm-svn: 346370
Like the comment says, this isn't the most efficient fix in terms of
codesize, but it works.
Differential Revision: https://reviews.llvm.org/D54129
llvm-svn: 346358
The lowering was missing live-ins in certain cases, like a sequence of
multiple tMOVCCr_pseudo instructions. This would lead to a verifier
failure, and on pre-v6 Thumb CPSR would be incorrectly clobbered.
For reasons I don't completely understand, it's hard to get a sequence
of multiple tMOVCCr_pseudo instructions; the issue only seems to show up
with 64-bit comparisons where the result is zero-extended. I added some
extra testcases in case that changes in the future. Probably some
optimization opportunities here if anyone is interested. (@test_slt_not
is the case that was getting miscompiled.)
The code to check the liveness of CPSR was stolen from
X86ISelLowering.cpp; maybe it could be refactored into common helper,
but I have no idea where to put it.
Differential Revision: https://reviews.llvm.org/D54192
llvm-svn: 346355
This allows testing AMDGPU alias analysis like any
other alias analysis pass. This fixes the existing
test pointlessly running opt -O3 when it really
just wants to run the one analysis.
Before there was no way to test this using -aa-eval
with opt, since the default constructed pass
is run. The wrapper subclass allows the
default constructor to pass the necessary callback.
llvm-svn: 346353
When partial unswitch operates on multiple conditions at once, .e.g:
if (Cond1 || Cond2 || NonInv) ...
it should infer (and replace) values for individual conditions only on one
side of unswitch and not another.
More precisely only these derivations hold true:
(Cond1 || Cond2) == false => Cond1 == Cond2 == false
(Cond1 && Cond2) == true => Cond1 == Cond2 == true
By the way we organize unswitching it means only replacing on "continue" blocks
and never on "unswitched" ones. Since trivial unswitch does not have "unswitched"
blocks it does not have this problem.
Fixes PR 39568.
Reviewers: chandlerc, asbirlea
Differential Revision: https://reviews.llvm.org/D54211
llvm-svn: 346350
Summary:
The conditional branch created to support -fsplit-stack for X86 is
left unbiased/unhinted, resulting in less than ideal block placement:
the __morestack call block is kept on the main hot path. Bias the
branch to insure that the stack allocation block is treated as a
"cold" block during machine basic block placement.
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D54123
llvm-svn: 346336
If we simplify an instruction to itself, we do not need to add a user to
itself. For congruence classes with a defining expression, we already
use a similar logic.
Fixes PR38259.
Reviewers: davide, efriedma, mcrosier
Reviewed By: davide
Differential Revision: https://reviews.llvm.org/D51168
llvm-svn: 346335
It seems that the PPC backend croaks when lowering a call to a
function with an argument of type [2 x i32].
Just modify the type slightly to avoid this -- I wasn't actually
intending to stress test the backend...
llvm/lib/Target/PowerPC/PPCISelLowering.cpp:6172: llvm::SDValue llvm::PPCTargetLowering::LowerCall_64SVR4(...): Assertion `(!HasParameterArea || NumBytesActuallyUsed == ArgOffset) && "mismatch in size of parameter area"' failed.
llvm-svn: 346334
By morphing the instruction rather than deleting and creating a new one,
we retain fast-math-flags and potentially other metadata (profile info?).
llvm-svn: 346331
Summary:
Add unit tests to check the support for each supported format to avoid
regressions such as the one in PR36906.
Reviewers: gchatelet
Subscribers: tschuett, lebedev.ri, llvm-commits
Differential Revision: https://reviews.llvm.org/D54144
llvm-svn: 346330
This adds the llvm-side support for post-inlining evaluation of the
__builtin_constant_p GCC intrinsic.
Also fixed SCCPSolver::visitCallSite to not blow up when seeing a call
to a function where canConstantFoldTo returns true, and one of the
arguments is a struct.
Updated from patch initially by Janusz Sobczak.
Differential Revision: https://reviews.llvm.org/D4276
llvm-svn: 346322
The sibling fold for 'oge' --> 'ord' was already here,
but this half was missing.
The result of fabs() must be positive or nan, so asking
if the result is negative or nan is the same as asking
if the result is nan.
This is another step towards fixing:
https://bugs.llvm.org/show_bug.cgi?id=39475
llvm-svn: 346321
Set operands order for G_MERGE_VALUES and G_UNMERGE_VALUES so
that least significant bits always go first, regardless of endianness.
Differential Revision: https://reviews.llvm.org/D54098
llvm-svn: 346305
Set `LiveReg::PhysReg` to zero when freeing a register instead of
removing it from the entry from `LiveRegMap`. This way no iterators get
invalidated and we can avoid passing around and updating iterators all
over the place.
This does not change any allocator decisions. It is not completely NFC
because the arbitrary iteration order through `LiveRegMap` in
`spillAll()` changes so we may get a different order in those spill
sequences (the amount of spills does not change).
This is in preparation of https://reviews.llvm.org/D52010.
llvm-svn: 346298
Summary:
This updates generated binaries and corresponding test cases up to date
after applying FixFunctionBitcasts pass.
Reviewers: sbc100
Subscribers: dschuff, jgravelle-google, sunfish, llvm-commits
Differential Revision: https://reviews.llvm.org/D54070
llvm-svn: 346286
This feature makes it easy to tune FileCheck diagnostic output when
running the test suite via ninja, a bot, or an IDE. For example:
```
$ FILECHECK_OPTS='-color -v -dump-input-on-failure' \
LIT_FILTER='OpenMP/for_codegen.cpp' ninja check-clang \
| less -R
```
Reviewed By: probinson
Differential Revision: https://reviews.llvm.org/D53517
llvm-svn: 346272
Add this option for debugging and providing workaround.
By default it is off so no behavior change in backend.
Differential Revision: https://reviews.llvm.org/D54158
llvm-svn: 346267
Summary:
The NotEligibleToImport flag on the GlobalValueSummary was set if it
isn't legal to import (e.g. because it references unpromotable locals)
and when it can't be inlined (in which case importing is pointless).
I split out the inlinable piece into a separate flag on the
FunctionSummary (doesn't make sense for aliases or global variables),
because in the future we may want to import for reasons other than
inlining.
Reviewers: davidxl
Subscribers: mehdi_amini, inglorion, eraman, steven_wu, dexonsmith, arphaman, llvm-commits
Differential Revision: https://reviews.llvm.org/D53345
llvm-svn: 346261
The lowering for a call to eh_typeid_for changes when it's moved from
one function to another.
There are several proposals for fixing this issue in llvm.org/PR39545.
Until some solution is in place, do not allow CodeExtractor to extract
calls to eh_typeid_for, as that results in serious miscompilations.
llvm-svn: 346256
When CodeExtractor moves instructions to a new function, debug
intrinsics referring to those instructions within the parent function
become invalid.
This results in the same verifier failure which motivated r344545, about
function-local metadata being used in the wrong function.
llvm-svn: 346255
It was causing a crash because we were trying to get the definition
of a target register. Fixed the issue by adding a check and added
a test case for that.
llvm-svn: 346251
Support the IS_SHARED bit in the memory limits flag word.
The compiler does not create object files with memory definitions,
but the field is used by the linker.
Differential Revision: https://reviews.llvm.org/D54131
llvm-svn: 346246
This is another part of solving PR39475:
https://bugs.llvm.org/show_bug.cgi?id=39475
This might be enough to fix that particular issue, but as noted
with the FIXME, we're still dropping FMF on other folds around here.
llvm-svn: 346234
The `sigrie` instruction signals a Reserved Instruction Exception.
This patch adds support for assembling / disassembling the instruction.
Differential Revision: http://reviews.llvm.org/D53861
llvm-svn: 346230
Summary:
This change cuts across LLVM and compiler-rt to add support for
rendering custom events in the XRayRecord type, to allow for including
user-provided annotations in the output YAML (as raw bytes).
This work enables us to add custom event and typed event records into
the `llvm::xray::Trace` type for user-provided events. This can then be
programmatically handled through the C++ API and can be included in some
of the tooling as well. For now we support printing the raw data we
encounter in the custom events in the converted output.
Future work will allow us to start interpreting these custom and typed
events through a yet-to-be-defined API for extending the trace analysis
library.
Reviewers: mboerger
Subscribers: hiraditya, llvm-commits
Differential Revision: https://reviews.llvm.org/D54139
llvm-svn: 346214
This patch makes LICM use `ICFLoopSafetyInfo` that is a smarter version
of LoopSafetyInfo that leverages power of Implicit Control Flow Tracking
to keep track of throwing instructions and give less pessimistic answers
to queries related to throws.
The ICFLoopSafetyInfo itself has been introduced in rL344601. This patch
enables it in LICM only.
Differential Revision: https://reviews.llvm.org/D50377
Reviewed By: apilipenko
llvm-svn: 346201
This reverts commit 2f425e9c7946b9d74e64ebbfa33c1caa36914402.
It seems that the check that we still should do the transform if we
know the result is constant is missing in this code. So the logic that
has been deleted by this change is still sometimes accidentally useful.
I revert the change to see what can be done about it. The motivating
case is the following:
@Y = global [400 x i16] zeroinitializer, align 1
define i16 @foo() {
entry:
br label %for.body
for.body: ; preds = %entry, %for.body
%i = phi i16 [ 0, %entry ], [ %inc, %for.body ]
%arrayidx = getelementptr inbounds [400 x i16], [400 x i16]* @Y, i16 0, i16 %i
store i16 0, i16* %arrayidx, align 1
%inc = add nuw nsw i16 %i, 1
%cmp = icmp ult i16 %inc, 400
br i1 %cmp, label %for.body, label %for.end
for.end: ; preds = %for.body
%inc.lcssa = phi i16 [ %inc, %for.body ]
ret i16 %inc.lcssa
}
We should be able to figure out that the result is constant, but the patch
breaks it.
Differential Revision: https://reviews.llvm.org/D51584
llvm-svn: 346198
Summary:
Improve the intrinsic bindings with operations for
- Retrieving and automatically inserting the declaration of an intrinsic by ID
- Retrieving the name of a non-overloaded intrinsic by ID
- Retrieving the name of an overloaded intrinsic by ID and overloaded parameter types
Improve the echo test to copy non-overloaded intrinsics by ID.
Reviewers: whitequark, deadalnix
Reviewed By: whitequark
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D53626
llvm-svn: 346195
This reverts rL345880. It caused some test failures on the
webassembly waterfall. e.g. binaryen2.test_mainenv fails due
the fact that `envp` ends up being undef rather than 0.
Differential Revision: https://reviews.llvm.org/D54117
llvm-svn: 346187
This is NFCI for InstCombine because it calls InstSimplify,
so I left the tests for this transform there. As noted in
the code comment, we can allow this fold more often by using
FMF and/or value tracking.
llvm-svn: 346169
I'm preparing a patch to avoid creating critical edges in cmov expansion. Updating these tests to make the changes by the next patch easier to see.
llvm-svn: 346161
Summary:
This patch prevents MergeICmps to performn the transformation if the address operand GEP of the load instruction has a use outside of the load's parent block. Without this patch, compiler crashes with the given test case because the use of `%first.i` is still around when the basic block is erased from https://github.com/llvm-mirror/llvm/blob/master/lib/Transforms/Scalar/MergeICmps.cpp#L620. I think checking `isUsedOutsideOfBlock` with `GEP` is the original intention of the code, as the checking for `LoadI` is already performed in the same function.
This patch is incomplete though, as this makes the pass overly conservative and fails the test `tuple-four-int8.ll`. I believe what needs to be done is checking if GEP has a use outside of block that is not the part of "Comparisons" chain. Submit the patch as of now to prevent compiler crash.
Reviewers: courbet, trentxintong
Reviewed By: courbet
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D54089
llvm-svn: 346151
There was no coverage for at least 2 out of the 4 patterns because
of fcmp canonicalization. The tests and code should be moved to
InstSimplify in a follow-up because this doesn't create any new values.
llvm-svn: 346150
On Power9, we don't have patterns to select the following intrinsics:
llvm.ppc.vsx.stxvw4x.be
llvm.ppc.vsx.stxvd2x.be
This patch adds support for these.
Differential Revision: https://reviews.llvm.org/D53581
llvm-svn: 346148
As stated in IEEE-754 and discussed in:
https://bugs.llvm.org/show_bug.cgi?id=38086
...the sign of zero does not affect any FP compare predicate.
Known regressions were fixed with:
rL346097 (D54001)
rL346143
The transform will help reduce pattern-matching complexity to solve:
https://bugs.llvm.org/show_bug.cgi?id=39475
...as well as improve CSE and codegen (a zero constant is almost always
easier to produce than 0x80..00).
llvm-svn: 346147
It looks like we correctly removed edge cases with 0.0 from D50714,
but we were a bit conservative because getBinOpIdentity() doesn't
distinguish between +0.0 and -0.0 and 'nsz' is effectively always
true for fcmp (see discussion in:
https://bugs.llvm.org/show_bug.cgi?id=38086
Without this change, we would get regressions by canonicalizing
to +0.0 in all fcmp, and that's a step towards solving:
https://bugs.llvm.org/show_bug.cgi?id=39475
llvm-svn: 346143
Summary:
LTO and ThinLTO optimizes the IR differently.
One source of differences is the amount of internalizations that
can happen.
Add an option to enable/disable internalization so that other
differences can be studied in isolation. e.g. inlining.
There are other things lto and thinlto do differently, I will add
flags to enable/disable them as needed.
Reviewers: tejohnson, pcc, steven_wu
Subscribers: mehdi_amini, inglorion, steven_wu, dexonsmith, dang, llvm-commits
Differential Revision: https://reviews.llvm.org/D53294
llvm-svn: 346140
The constrained intrinsic tests have grown in number. Split off
the FMA tests into their own file to reduce double coverage.
Differential Revision: https://reviews.llvm.org/D53932
llvm-svn: 346137
We currently seem to underestimate the size of functions with loops in them,
both in terms of absolute code size and in the difficulties of dealing with
such code. (Calls, for example, can be tail merged to further reduce
codesize). At -Oz, we can then increase code size by inlining small loops
multiple times.
This attempts to penalise functions with loops at -Oz by adding a CallPenalty
for each top level loop in the function. It uses LI (and hence DT) to calculate
the number of loops. As we are dealing with minsize, the inline threshold is
small and functions at this point should be relatively small, making the
construction of these cheap.
Differential Revision: https://reviews.llvm.org/D52716
llvm-svn: 346134
Expand on LONG_BRANCH_LUi and LONG_BRANCH_(D)ADDiu pseudo
instructions by creating variants which support
less operands/accept GPR64Opnds as their operand in order
to appease the machine verifier pass.
Differential Revision: https://reviews.llvm.org/D53977
llvm-svn: 346133
The new atomic optimizer I previously added in D51969 did not work
correctly when a pixel shader was using derivatives, and had helper
lanes active.
To fix this we add an llvm.amdgcn.ps.live call that guards a branch
around the entire atomic operation - ensuring that all helper lanes are
inactive within the wavefront when we compute our atomic results.
I've added a test case that can cause derivatives, and exposes the
problem.
Differential Revision: https://reviews.llvm.org/D53930
llvm-svn: 346128
Turn the assert in PrepareConstants into a conditon so that we can
handle mul instructions with negative immediates.
Differential Revision: https://reviews.llvm.org/D54094
llvm-svn: 346126
r345840 slightly changed the way promotion happens which could
result in zext and truncs having the same source and destination
types. This fixes that issue.
We can now also remove the zext and trunc in the following case:
(zext (trunc (promoted op)), i32)
This means that we can no longer treat a value, that is only used by
a sink, to be safe to promote.
I've also added in some extra asserts and replaced a cast for a
dyn_cast.
Differential Revision: https://reviews.llvm.org/D54032
llvm-svn: 346125
This patch fixes a bug in the AVR FRMIDX expansion logic.
The expansion would leave a leftover operand from the original FRMIDX,
but now attached to a MOVWRdRr instruction. The MOVWRdRr instruction
did not expect this operand and so LLVM rejected the machine
instruction.
This would trigger an assertion:
Assertion failed: ((isImpReg || Op.isRegMask() || MCID->isVariadic() ||
OpNo < MCID->getNumOperands() || isMetaDataOp) &&
"Trying to add an operand to a machine instr that is already done!"),
function addOperand, file llvm/lib/CodeGen/MachineInstr.cpp
Tim fixed this so that now the FRMIDX is expanded correctly into
a well-formed MOVWRdRr.
Patch by Tim Neumann
llvm-svn: 346117
v2i8/v2i16/v2i32 are promoted to v2i64. pmuludq takes a v2i64 input and produces a v2i64 output. Since we don't about the upper bits of the type legalized multiply we can use the pmuludq to produce the multiply result for the bits we do care about.
llvm-svn: 346115
This is an AVR-specific workaround for a limitation of the register
allocator that only exposes itself on targets with high register
contention like AVR, which only has three pointer registers.
The three pointer registers are X, Y, and Z.
In most nontrivial functions, Y is reserved for the frame pointer,
as per the calling convention. This leaves X and Z. Some instructions,
such as LPM ("load program memory"), are only defined for the Z
register. Sometimes this just leaves X.
When the backend generates a LDDWRdPtrQ instruction with Z as the
destination pointer, it usually trips up the register allocator
with this error message:
LLVM ERROR: ran out of registers during register allocation
This patch is a hacky workaround. We ban the LDDWRdPtrQ instruction
from ever using the Z register as an operand. This gives the
register allocator a bit more space to allocate, fixing the
regalloc exhaustion error.
Here is a description from the patch author Peter Nimmervoll
As far as I understand the problem occurs when LDDWRdPtrQ uses
the ptrdispregs register class as target register. This should work, but
the allocator can't deal with this for some reason. So from my testing,
it seams like (and I might be totally wrong on this) the allocator reserves
the Z register for the ICALL instruction and then the register class
ptrdispregs only has 1 register left and we can't use Y for source and
destination. Removing the Z register from DREGS fixes the problem but
removing Y register does not.
More information about the bug can be found on the avr-rust issue
tracker at https://github.com/avr-rust/rust/issues/37.
A bug has raised to track the removal of this workaround and a proper
fix; PR39553 at https://bugs.llvm.org/show_bug.cgi?id=39553.
Patch by Peter Nimmervoll
llvm-svn: 346114
Using TargetTransformInfo allows the splitting pass to factor in the
code size cost of instructions as it decides whether or not outlining is
profitable.
This did not regress the overall amount of outlining seen on the handful
of internal frameworks I tested.
Thanks to Jun Bum Lim for suggesting this!
Differential Revision: https://reviews.llvm.org/D53835
llvm-svn: 346108
Summary: This also enables some constant folding from KnownBits propagation. This helps on some cases vXi64 case in 32-bit mode where constant vectors appear as vXi32 and a bitcast. This can prevent getNode from constant folding sra/shl/srl.
Reviewers: RKSimon, spatel
Reviewed By: spatel
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D54069
llvm-svn: 346102
In PR39475:
https://bugs.llvm.org/show_bug.cgi?id=39475
..we may fail to recognize/simplify fabs() in some cases because we do not
canonicalize fcmp with a -0.0 operand.
Adding that canonicalization can cause regressions on min/max FP tests, so
that's this patch: for the purpose of determining whether something is min/max,
let the value returned by the select determine how we treat a 0.0 operand in the fcmp.
This patch doesn't actually change the -0.0 to +0.0. It just changes the analysis, so
we don't fail to recognize equivalent min/max patterns that only differ in the
signbit of 0.0.
Differential Revision: https://reviews.llvm.org/D54001
llvm-svn: 346097
This patch gives the IR ComputeNumSignBits the same functionality as the
DAG version (the code is derived from the existing code).
This an extension of the single input shuffle analysis added with D53659.
Differential Revision: https://reviews.llvm.org/D53987
llvm-svn: 346071
Use MachineFrameInfo's OffsetAdjustment field to pass this information
from the target to CodeViewDebug.cpp. The X86 backend doesn't use it for
any other purpose.
This fixes PR38857 in the case where there is a non-aligned quantity of
CSRs and a non-aligned quantity of locals.
llvm-svn: 346062
Adding functionality to the DWARF verifier for DWARF v5 strx* forms which
index into the string offsets table.
Differential Revision: https://reviews.llvm.org/D54049
llvm-svn: 346061
ModuleSummaryIndex::exportToDot crashes when linking the Linux kernel
under ThinLTO using LLVMgold.so. This is due to the exportToDot
function trying to get the GUID of an empty ValueInfo. The root cause
related to the fact that we attempt to get the GUID of an aliasee
via its OriginalGUID recorded in the aliasee summary, and that is not
always possible. Specifically, we cannot do this mapping when the value
is internal linkage and there were other internal linkage symbols with
the same name.
There are 2 fixes for the problem included here.
1) In all cases where we can currently print the dot file from the
command line (which is only via save-temps), we have a valid AliaseeGUID
in the AliasSummary. Use that when it is available, so that we can get
the correct aliasee GUID whenever possible.
2) However, if we were to invoke exportToDot from the debugger right
after it is built during the initial analysis step (i.e. the per-module
summary), we won't have the AliaseeGUID field populated. In that case,
we have a fallback fix that will simply print "@"+GUID when we aren't
able to get the GUID from the OriginalGUID. It simply checks if the VI
is valid or not before attempting to get the name. Additionally, since
getAliaseeGUID will assert that the AliaseeGUID is non-zero, guard the
earlier fix#1 by a new function hasAliaseeGUID().
Reviewers: pcc, tmroeder
Subscribers: evgeny777, mehdi_amini, inglorion, dexonsmith, arphaman, llvm-commits
Differential Revision: https://reviews.llvm.org/D53986
llvm-svn: 346055
The majority of the changes are because the rest of shuffle lowering/combining prefers to replace the undef input with the other operand. Using UNPCKL directly seemed to avoid this and just grabbed a randomish register for the undef which can create false dependencies.
llvm-svn: 346050
Summary:
The assembler was able to assemble and then dump back to .s, but
was failing to parse certain directives necessary for valid .o
output:
- .type directives are now recognized to distinguish function symbols
and others.
- .size is now parsed to provide function size.
- .globaltype (introduced in https://reviews.llvm.org/D54012) is now
recognized to ensure symbols like __stack_pointer have a proper type
set for both .s and .o output.
Also added tests for the above.
Reviewers: sbc100, dschuff
Subscribers: jgravelle-google, aheejin, dexonsmith, kristina, llvm-commits, sunfish
Differential Revision: https://reviews.llvm.org/D53842
llvm-svn: 346047
We already have custom lowering for the AVX case in LegalizeVectorOps. So its better to keep the regular extend op around as long as possible.
I had to qualify one place in DAG combine that created illegal vector extending load operations. This change by itself had no effect on any tests which is why its included here.
I've made a few cleanups to the custom lowering. The sign extend code no longer creates an identity shuffle with undef elements. The zero extend code now emits a zero_extend_vector_inreg instead of an unpckl with a zero vector.
For the high half of the custom lowering of zero_extend/any_extend, we're now using an unpckh with a zero vector or undef. Previously we used used a pshufd to move the upper 64-bits to the lower 64-bits and then used a zero_extend_vector_inreg. I think the zero vector should require less execution resources and be smaller code size.
Differential Revision: https://reviews.llvm.org/D54024
llvm-svn: 346043
Use getImageBase() helper to compute the image base. Fix various
offsets/addresses/masks so they're actually correct.
This allows decoding unwind info from DLLs, and unwind info from object
files containing multiple functions.
Differential Revision: https://reviews.llvm.org/D54015
llvm-svn: 346036
A number of intrinsics, such as llvm.sin.f32, would result in a failure to
select. This patch adds expansions for the relevant selection DAG nodes, as
well as exhaustive testing for all f32 and f64 intrinsics.
The codegen for FMA remains a TODO item, pending support for the various
RISC-V FMA instruction variants.
The llvm.minimum.f32.* and llvm.maximum.* tests are commented-out, pending
upstream support for target-independent expansion, as discussed in
http://lists.llvm.org/pipermail/llvm-dev/2018-November/127408.html.
Differential Revision: https://reviews.llvm.org/D54034
Patch by Luís Marques.
llvm-svn: 346034
This is necessary as I'm wanting to remove the 'Constant Pool' shuffle decoding from getTargetShuffleMask - but using getTargetShuffleMaskIndices allows the shuffle combiner to realize that these calls are really broadcasts.....
As with a lot of the X86ISD::VPERMV3 code this causes some vperm2i/vperm2t shuffles to flip depending on optimal commutation.
llvm-svn: 346032
Summary:
EH stack depth is incremented at `try` and decremented at `catch`. When
there are more than two catch instructions for a try instruction, we
shouldn't count non-first catches when calculating EH stack depths.
This patch fixes two bugs:
- CFGStackify: Exclude `catch_all` in the terminate catch pad when
calculating EH pad stack, because when we have multiple catches for a
try we should count only the first catch instruction when calculating
EH pad stack.
- InstPrinter: The initial intention was also to exclude non-first
catches, but it didn't account nested try-catches, so it failed on
this case:
```
try
try
catch
end
catch <-- (1)
end
```
In the example, when we are at the catch (1), the last seen EH
instruction is not `try` but `end_try`, violating the wrong assumption.
We don't need these after we switch to the second proposal because there
is gonna be only one `catch` instruction. But anyway before then these
bugfixes are necessary for keep trunk in working state.
Reviewers: dschuff
Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits
Differential Revision: https://reviews.llvm.org/D53819
llvm-svn: 346029
Summary:
-mldst-motion creates a new phi node without any debug info. Use the merged debug location from the incoming stores to fix this.
Fixes PR38177. The test case here is (somewhat) simplified from:
```
struct S {
int foo;
void fn(int bar);
};
void S::fn(int bar) {
if (bar)
foo = 1;
else
foo = 0;
}
```
Reviewers: dblaikie, gbedwell, aprantl, vsk
Reviewed By: vsk
Subscribers: vsk, JDevlieghere, llvm-commits
Tags: #debug-info
Differential Revision: https://reviews.llvm.org/D54019
llvm-svn: 346027
Running "llvm-pdbutil dump -all" on linux (using the native PDB reader),
over a few PDBs pulled from the Microsoft public symbol store uncovered
a few small issues:
- stripped PDBs might not have the strings stream (/names)
- stripped PDBs might not have the "module info" stream
Differential Revision: https://reviews.llvm.org/D54006
llvm-svn: 346010
Let i8/i16 uint/sint to fp conversions cost 1 if operand is a load.
Since the load already does the extension, there is no extra cost (previously
returned 2).
Review: Ulrich Weigand
https://reviews.llvm.org/D54028
llvm-svn: 346009
Summary:
The hot and cold count thresholds are derived from the summary, but for
debugging purposes it is convenient to provide the actual thresholds.
Reviewers: davidxl
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D54040
llvm-svn: 346005
Model this function more closely after the BasicTTIImpl version, with
separate handling of loads and stores. For loads, the set of actually loaded
vectors is checked.
This makes it more readable and just slightly more accurate generally.
Review: Ulrich Weigand
https://reviews.llvm.org/D53071
llvm-svn: 345998
As reported in PR38952, postra-machine-sink relies on DBG_VALUE insns being
adjacent to the def of the register that they reference. This is not always
true, leading to register copies being sunk but not the associated DBG_VALUEs,
which gives the debugger a bad variable location.
This patch collects DBG_VALUEs as we walk through a BB looking for copies to
sink, then passes them down to performSink. Compile-time impact should be
negligable.
Differential Revision: https://reviews.llvm.org/D53992
llvm-svn: 345996
Small-data (i.e. GP-relative) loads and stores allow 16-bit scaled
offset. For a load of a value of type T, the small-data area is
equivalent to an array "T sdata[65536]". This implies that objects
of smaller sizes need to be closer to the beginning of sdata,
while larger objects may be farther away, or otherwise the offset
may be insufficient to reach it. Similarly, an object of a larger
size should not be accessed via a load of a smaller size.
llvm-svn: 345975
Summary:
If the output of debug directives only is requested, we should drop
emission of ',debug' option from the target directive. Required for
supporting of nvprof profiler.
Reviewers: probinson, echristo, dblaikie
Subscribers: Hahnfeld, jholewinski, llvm-commits, JDevlieghere, aprantl
Differential Revision: https://reviews.llvm.org/D46061
llvm-svn: 345972
reduceBuildVecConvertToConvertBuildVec vectorizes int2float in the DAGCombiner, which means that even if the LV/SLP has decided to keep scalar code using the cost models, this will override this.
While there are cases where vectorization is necessary in the DAG (mainly due to legalization artefacts), I don't think this is the case here, we should assume that the vectorizers know what they are doing.
Differential Revision: https://reviews.llvm.org/D53712
llvm-svn: 345964
Fix PR39417, PR39497
The loop vectorizer may generate runtime SCEV checks for overflow and stride==1
cases, leading to execution of original scalar loop. The latter is forbidden
when optimizing for size. An assert introduced in r344743 triggered the above
PR's showing it does happen. This patch fixes this behavior by preventing
vectorization in such cases.
Differential Revision: https://reviews.llvm.org/D53612
llvm-svn: 345959
These tests are meant to test dwarf emission (or prolog/epilogue
generation) so we can convert them to .mir and only run the relevant
part of the pipeline.
This way they become independent of changes in earlier passes such as my
planned changes to RegAllocFast.
llvm-svn: 345919
Summary:
Assembly output can use globals like __stack_pointer implicitly,
but has no way of indicating the type of such a global, which makes
it hard for tools processing it (such as the MC Assembler) to
reconstruct this information.
The improved assembler directives parsing (in progress in
https://reviews.llvm.org/D53842) will make use of this information.
Also deleted code for the .import_global directive which was unused.
New test case in userstack.ll
Reviewers: dschuff, sbc100
Subscribers: jgravelle-google, aheejin, sunfish, llvm-commits
Differential Revision: https://reviews.llvm.org/D54012
llvm-svn: 345917
I'm having trouble creating a test case for the ISD::TRUNCATE part of this that shows any codegen differences. But I was able to test the setcc path which is what the test changes here cover.
llvm-svn: 345908
Instruction mapping in the outliner uses "illegal numbers" to signify that
something can't ever be part of an outlining candidate. This means that the
number is unique and can't be part of any repeated substring.
Because each of these is unique, we can use a single unique number to represent
a range of things we can't outline.
The outliner tries to leverage this using a flag which is set in an MBB when
the previous instruction we tried to map was "illegal". This patch improves
that logic to work across MBBs. As a bonus, this also simplifies the mapping
logic somewhat.
This also updates the machine-outliner-remarks test, which was impacted by the
order of Candidates on an OutlinedFunction changing. This order isn't
guaranteed, so I added a FIXME to fix that in a follow-up. The order of
Candidates on an OutlinedFunction isn't important, so this still is NFC.
llvm-svn: 345906
Summary: Different variants of idot8 codegen dag patterns are not generated by llvm-tablegen due to a huge
increase in the compile time. Support the pattern that clang FE generates after reordering the
additions in integer-dot8 source language pattern.
Author: FarhanaAleen
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D53937
llvm-svn: 345902
Summary:
Like `block` or `loop`, `try` can take an optional signature which can
be omitted. This patch allows `try`'s signature to be omitted. Also
added some tests for EH instructions.
Reviewers: aardappel
Subscribers: dschuff, sbc100, jgravelle-google, sunfish, llvm-commits
Differential Revision: https://reviews.llvm.org/D53873
llvm-svn: 345888