We were missing this fold in the DAG, which I've copied directly from llvm::ConstantFoldCastInstruction
Differential Revision: https://reviews.llvm.org/D62807
llvm-svn: 362397
LanaiMCCodeEmitter.cpp was not using any APIs from Lanai.h, and was only
including it for transitive dependencies. Doing so is problematic from
include-what-you-use perspective, but it is also a layering issue (it
creates a dependency cycle between the primary Lanai target library and
the MCTargetDesc library).
llvm-svn: 362394
HexagonInstPrinter.cpp was not using any APIs from HexagonAsmPrinter.h.
Doing so is problematic from include-what-you-use perspective, but it is
also a layering issue (it creates a dependency cycle between the primary
Hexagon target library and the MCTargetDesc library).
llvm-svn: 362389
HexagonMCInstrInfo.cpp was not using any APIs from Hexagon.h. Doing so
is problematic from include-what-you-use perspective, but it is also a
layering issue (it creates a dependency cycle between the primary
Hexagon target library and the MCTargetDesc library).
llvm-svn: 362387
HexagonMCCodeEmitter.cpp was not using any APIs from Hexagon.h. Doing
so is problematic from include-what-you-use perspective, but it is also
a layering issue (it creates a dependency cycle between the primary
Hexagon target library and the MCTargetDesc library).
llvm-svn: 362386
HexagonMCCompound.cpp was not using any APIs from Hexagon.h. Doing so
is problematic from include-what-you-use perspective, but it is also a
layering issue (it creates a dependency cycle between the primary
Hexagon target library and the MCTargetDesc library).
llvm-svn: 362385
HexagonShuffler.cpp was not using any APIs from Hexagon.h, and was only
including it for transitive dependencies. Doing so is problematic from
include-what-you-use perspective, but it is also a layering issue (it
creates a dependency cycle between the primary Hexagon target library
and the MCTargetDesc library).
llvm-svn: 362384
HexagonMCChecker.cpp was not using any APIs from Hexagon.h. Doing so is
problematic from include-what-you-use perspective, but it is also a
layering issue (it creates a dependency cycle between the primary
Hexagon target library and the MCTargetDesc library).
llvm-svn: 362383
HexagonMCTargetDesc.cpp was not using any APIs from Hexagon.h. Doing so
is problematic from include-what-you-use perspective, but it is also a
layering issue (it creates a dependency cycle between the primary
Hexagon target library and the MCTargetDesc library).
llvm-svn: 362382
HexagonMCShuffler.cpp was not using any APIs from Hexagon.h. Doing so
is problematic from include-what-you-use perspective, but it is also a
layering issue (it creates a dependency cycle between the primary
Hexagon target library and the MCTargetDesc library).
llvm-svn: 362381
The recent change D60691 introduced a bug in clang when handling
option combinations such as `-mcpu=cortex-m4 -mfpu=none`. Those
options together should select Cortex-M4 but disable all use of
hardware FP, but in fact, now hardware FP instructions can still be
generated in that mode.
The reason is because the handling of FPUVersion::NONE disables all
the same feature names it used to, of which the base one is `vfp2`.
But now there are further features below that, like `vfp2d16fp` and
(following D60694) `fpregs`, which also need to be turned off to
disable hardware FP completely.
Added a tiny test which double-checks that compiling a simple FP
function doesn't access the FP registers.
Reviewers: SjoerdMeijer, dmgreen
Reviewed By: dmgreen
Subscribers: lebedev.ri, javed.absar, kristof.beyls, hiraditya, cfe-commits, llvm-commits
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D62729
llvm-svn: 362380
Summary:
This patch adds tests for directives .arch, .arch_extension and .cpu for
all features defined in Arm SVE2 architecture extension.
Reviewed By: chill
Differential Revision: https://reviews.llvm.org/D62602
llvm-svn: 362378
gnu-sections.test currently use relocs.obj.elf-x86_64 and
relocs.obj.elf-i386 precompiled objects as an inputs.
These inputs actually initially were introduced to test the
dump of relocations and have almost nothing common with dumping
sections.
Patch converts the test to use yaml2obj. That allows to remove
relocs.obj.elf-i386 binary.
(relocs.obj.elf-x86_64 is still used by another test and can't be removed atm).
Differential revision: https://reviews.llvm.org/D62659
llvm-svn: 362377
HexagonELFObjectWriter.cpp was not using any APIs from Hexagon.h, and
was only including it for transitive dependencies. Doing so is
problematic from include-what-you-use perspective, but it is also a
layering issue (it creates a dependency cycle between the primary
Hexagon target library and the MCTargetDesc library).
llvm-svn: 362376
rL362089 introduced a set of yaml based reloc-types-*.test test cases
(instead of huge reloc-types.test that used a lot of precompiled binaries)
These test cases checks LLVM-styled dumping of the relocations.
gnu-relocations.test was a test case to check GNU styled relocations dumping.
It did that only for elf-x86 and elf-x86_64 targets. It did not test all of the
relocations though.
Now, after rL362089, it does not make sence to keep it.
This patch updates reloc-types-elf-i386.test and reloc-types-elf-x64.test tests
with llvm-readelf calls to check GNU styled output in one place.
It removes gnu-relocations.test completely.
One of intentions of doing this is also to get rid of relocs.obj.elf-i386 and
relocs.obj.elf-x86_64 precompiled objects completely (they are used in other tests still).
Differential revision: https://reviews.llvm.org/D62655
llvm-svn: 362374
When LiveDebugValues deduces new variable's location from spill, restore or
register copy instruction it should close old variable's location. Otherwise
we can have multiple block output locations for same variable. That could lead
to inserting two DBG_VALUEs for same variable to the beginning of the successor
block which results to ignoring of first DBG_VALUE.
Reviewers: aprantl, jmorse, wolfgangp, dstenb
Reviewed By: aprantl
Subscribers: probinson, asowda, ivanbaev, petarj, djtodoro
Tags: #debug-info
Differential Revision: https://reviews.llvm.org/D62196
llvm-svn: 362373
HexagonAsmBackend.cpp was not using any APIs from Hexagon.h. Doing so
is problematic from include-what-you-use perspective, but it is also a
layering issue (it creates a dependency cycle between the primary
Hexagon target library and the MCTargetDesc library).
llvm-svn: 362372
HexagonAsmParser.cpp was not using any APIs from Hexagon.h. Doing so is
problematic from include-what-you-use perspective, but it is also a
layering issue (it creates a dependency cycle between the primary
Hexagon target library and the AsmParser library).
llvm-svn: 362370
HexagonShuffler.h was not using any APIs from Hexagon.h, and was only
including it for transitive dependencies. Doing so is problematic from
include-what-you-use perspective, but it is also a layering issue (it
creates a dependency cycle between the primary Hexagon target library
and the MCTargetDesc library).
llvm-svn: 362369
BPFMCTargetDesc.cpp was not using any APIs from BPF.h. Doing so is
problematic from include-what-you-use perspective, but it is also a
layering issue (it creates a dependency cycle between the primary
BPF target library and the MCTargetDesc library).
llvm-svn: 362368
Summary:
- pr42062
When compiling for MinSize,
ARMTargetLowering::LowerCall decides to indirect
multiple calls to a same function. However,
it disconsiders the limitation that thumb1
indirect calls require the callee to be in a
register from r0 to r3 (llvm limiation).
If all those registers are used by arguments, the
compiler dies with "error: run out of registers
during register allocation".
This patch tells the function
IsEligibleForTailCallOptimization if we intend to
perform indirect calls, as to avoid tail call
optimization.
Reviewers: dmgreen, efriedma
Reviewed By: efriedma
Subscribers: javed.absar, kristof.beyls, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D62683
llvm-svn: 362366
A std::array is implemented as a template with an array inside a struct.
Older versions of clang, like 3.6, require an extra set of curly braces
around std::array initializations to avoid warnings.
The C++ language was changed regarding this by CWG 1270. So more modern
tool chains does not complain even if leaving out one level of braces.
llvm-svn: 362360
Summary:
LDWRdPtr would be expanded to ld+ldd. ldd only accepts the pointer register is Y or Z.
So the register class of pointer of LDWRdPtr should be PTRDISPREGS instead of PTRREGS.
Reviewers: dylanmckay
Reviewed By: dylanmckay
Subscribers: dylanmckay, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D62300
llvm-svn: 362351
If we hit the limit, we do expand the outstanding tokenfactors.
Otherwise, we might drop nodes with users in the unexpanded
tokenfactors. This fixes the crashes reported by Jordan Rupprecht.
Reviewers: niravd, spatel, craig.topper, rupprecht
Reviewed By: niravd
Differential Revision: https://reviews.llvm.org/D62633
llvm-svn: 362350
This changed updates the MSVC Visualizer to work with the recent change
of PointerUnion into a variadic template. As an extra bonus, we
fix some bit rot in the SmallPtrSet visualizer as well
llvm-svn: 362345
Also add two FC_Far that seem to be missing, by symmetry from
the public and protected cases. (But FC_Far isn't really a thing
anymore, so this doesn't really have an observable effect.)
llvm-svn: 362344
Recommit of r361790 that was temporarily reverted in r361793 due to bot breakage.
Summary:
The following changes were required to fix these tests:
1) Change LLVM_ENABLE_PLUGINS to an option and move it to
llvm/CMakeLists.txt with an appropriate default -- which matches
the original default behavior.
2) Move the plugins directory from clang/test/Analysis
clang/lib/Analysis. It's not enough to add an exclude to the
lit.local.cfg file because add_lit_testsuites recurses the tree and
automatically adds the appropriate `check-` targets, which don't
make sense for the plugins because they aren't tests and don't
have `RUN` statements.
Here's a list of the `clang-check-anlysis*` targets with this
change:
```
$ ninja -t targets all| sed -n "s/.*\/\(check[^:]*\):.*/\1/p" | sort -u | grep clang-analysis
check-clang-analysis
check-clang-analysis-checkers
check-clang-analysis-copypaste
check-clang-analysis-diagnostics
check-clang-analysis-engine
check-clang-analysis-exploration_order
check-clang-analysis-html_diagnostics
check-clang-analysis-html_diagnostics-relevant_lines
check-clang-analysis-inlining
check-clang-analysis-objc
check-clang-analysis-unified-sources
check-clang-analysis-z3
```
3) Simplify the logic and only include the subdirectories under
clang/lib/Analysis/plugins if LLVM_ENABLE_PLUGINS is set.
Reviewed By: NoQ
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D62445
llvm-svn: 362328
Move this combine from x86 into generic DAGCombine, which currently only manages cases where the bitcast is between types of the same scalarsize.
Differential Revision: https://reviews.llvm.org/D59188
llvm-svn: 362324
Add (opt-in) support for implicit truncation to isConstOrConstSplat, which allows us to match truncated 'all ones' cases in isBitwiseNot.
PR41020 compares against using ISD::isBuildVectorAllOnes() instead, but that predicate silently accepts any UNDEF elements in the build vector which might not be what we want in isBitwiseNot - so I've added an opt-in 'AllowUndefs' flag that is set to false by default but will allow us to enable it on individual cases where its safe.
Differential Revision: https://reviews.llvm.org/D62783
llvm-svn: 362323
The results of the dyn_casts were immediately dereferenced on the next line
so they had better not be null.
I don't think there's any way for these dyn_casts to fail, so use a cast
of adding null check.
llvm-svn: 362315
LLVM CMake build already uses libtool instead of ar when building
for Apple platform and we should be using the same when building
runtimes. To do so, this change extracts the logic for finding
libtool into a separate file and then uses it from both the LLVM
build as well as the LLVM runtimes build.
Differential Revision: https://reviews.llvm.org/D62769
llvm-svn: 362313
Over a year ago, MachineInstr gained a fourth boolean parameter that occurs
before the TII pointer. When this happened, several places started accidentally
passing TII into this boolean parameter instead of the TII parameter.
llvm-svn: 362312
The AVX512BW and AVX512VL checks were never used. And AVX512 is the same
as AVX on all tests that weren't already split for AVX1 and AVX2.
llvm-svn: 362308
Extract a willNotOverflow() helper function that is shared between
eliminateOverflowIntrinsic() and strengthenOverflowingOperation().
Use WithOverflowInst for the former.
We'll be able to reuse the same code for saturating intrinsics as
well.
llvm-svn: 362305
Summary: Fneg can be implemented with an xor rather than a function call so we don't need to add the function call overhead. This was pointed out in D62699
Reviewers: efriedma, cameron.mcinally
Reviewed By: efriedma
Subscribers: javed.absar, eraman, hiraditya, haicheng, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D62747
llvm-svn: 362304
The `cfcmsa` and `ctcmsa` instructions accept index of MSA control
register. The MIPS64 SIMD Architecture define eight MSA control
registers. But register index for `cfcmsa` and `ctcmsa` instructions
might be any number in 0..31 range. If the index is greater then 7,
`cfcmsa` writes zero to the destination registers and `ctcmsa` does
nothing [1].
[1] MIPS Architecture for Programmers Volume IV-j:
The MIPS64 SIMD Architecture Module
https://www.mips.com/?do-download=the-mips64-simd-architecture-module
Differential Revision: https://reviews.llvm.org/D62597
llvm-svn: 362299
If we would allow register coalescing on PTRDISPREGS class then register
allocator can lock Z register to some virtual register. Larger instructions
requiring a memory acces then fail during the register allocation phase since
there is no available register to hold a pointer if Y register was already
taken for a stack frame. This patch prevents it by keeping Z register
spillable. It does it by not allowing coalescer to lock it.
Original discussion on https://github.com/avr-rust/rust/issues/128.
llvm-svn: 362298
I have initially added it in for test to display both
whether the binop w/ constant is sinked or hoisted.
But as it can be seen from the 'sub (sub C, %x), %y'
test, that actually conceals the issues it is supposed to test.
At least two more patterns are unhandled:
* 'add (sub C, %x), %y' - D62266
* 'sub (sub C, %x), %y'
llvm-svn: 362295
Fix for https://bugs.llvm.org/show_bug.cgi?id=31181 and partial fix
for LFTR poison handling issues in general.
When LFTR moves a condition from pre-inc to post-inc, it may now
depend on value that is poison due to nowrap flags. To avoid this,
we clear any nowrap flag that SCEV cannot prove for the post-inc
addrec.
Additionally, LFTR may switch to a different IV that is dynamically
dead and as such may be arbitrarily poison. This patch will correct
nowrap flags in some but not all cases where this happens. This is
related to the adoption of IR nowrap flags for the pre-inc addrec.
(See some of the switch_to_different_iv tests, where flags are not
dropped or insufficiently dropped.)
Finally, there are likely similar issues with the handling of GEP
inbounds, but we don't have a test case for this yet.
Differential Revision: https://reviews.llvm.org/D60935
llvm-svn: 362292
This allows the DWARFExpression class to handle addresses without
crashing on targets with 16-bit pointers like AVR.
This is required in order to generate assembly from clang via the '-S'
flag.
This fixes an error with the following message:
clang: llvm/include/llvm/DebugInfo/DWARF/DWARFExpression.h:132: llvm::DWARFExpression::DWARFExpression(llvm::DataExtractor, uint16_t, uint8_t):
Assertion `AddressSize == 8 || AddressSize == 4' failed.
llvm-svn: 362290
Summary:
This was flagged in https://www.viva64.com/en/b/0629/ under "Snippet No.
33".
It seems that this statement is doing the standard bitwise trick for
adjusting a value to have a specific alignment.
The issue is that getStubAlignment() returns an unsigned, while DataSize
is declared a uint64_t. The right hand side of the expression is not
extended to 64b before bitwise negation, resulting in the top half of
the mask being 0s, which is not correct for realignment.
Reviewers: lhames, MaskRay
Reviewed By: MaskRay
Subscribers: RKSimon, MaskRay, hiraditya, llvm-commits, srhines
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D62227
llvm-svn: 362286
ARM64 CodeView test was incorrectly put under test/DebugInfo/COFF folder which
runs for all all architectures. This fix moves it to a subfolder AArch64 with
lit.local.cfg which specify it supports AArch64 only.
llvm-svn: 362283
At the moment, LoopPredication completely bails out if it sees a latch of the form:
%cmp = icmp ne %iv, %N
br i1 %cmp, label %loop, label %exit
OR
%cmp = icmp ne %iv.next, %NPlus1
br i1 %cmp, label %loop, label %exit
This is unfortunate since this is exactly the form that LFTR likes to produce. So, go ahead and recognize simple cases where we can.
For pre-increment loops, we leverage the fact that LFTR likes canonical counters (i.e. those starting at zero) and a (presumed) range fact on RHS to discharge the check trivially.
For post-increment forms, the key insight is in remembering that LFTR had to insert a (N+1) for the RHS. CVP can hopefully prove that add nsw/nuw (if there's appropriate range on N to start with). This leaves us both with the post-inc IV and the RHS involving an nsw/nuw add, and SCEV can discharge that with no problem.
This does still need to be extended to handle non-one steps, or other harder patterns of variable (but range restricted) starting values. That'll come later.
Differential Revision: https://reviews.llvm.org/D62748
llvm-svn: 362282
We were hashing the string pointer, not the string, so two instructions
could be identical (isIdenticalTo), but have different hash codes.
This showed up as a very rare, non-deterministic assertion failure
rehashing a DenseMap constructed by MachineOutliner. So there's no
"real" testcase, just a unittest which checks that the hash function
behaves correctly.
I'm a little scared fixing this is going to cause a regression in
outlining or MachineCSE, but hopefully we won't run into any issues.
Differential Revision: https://reviews.llvm.org/D61975
llvm-svn: 362281
CodeView has its own register map which is defined in cvconst.h. Missing this
mapping before saving register to CodeView causes debugger to show incorrect
value for all register based variables, like variables in register and local
variables addressed by register (stack pointer + offset).
This change added mapping between LLVM register and CodeView register so the
correct register number will be stored to CodeView/PDB, it aso fixed the
mapping from CodeView register number to register name based on current
CPUType but print PDB to yaml still assumes X86 CPU and needs to be fixed.
Differential Revision: https://reviews.llvm.org/D62608
llvm-svn: 362280
Summary:
It looks like since INLINEASM_BR was created off of INLINEASM (r353563),
a few checks for INLINEASM needed to be updated to check for either
case.
pr/41999
Reviewers: hfinkel
Reviewed By: hfinkel
Subscribers: nemanjai, hiraditya, kbarton, jsji, llvm-commits, craig.topper, srhines
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D62403
llvm-svn: 362278
Testing with debuggers shows that our previous behavior was correct.
The reason I thought MSVC did things differently is that MSVC prefers to
use the 0xB combined code offset and code length update opcode when
inline sites are discontiguous.
Keep the test changes, and update the llvm-pdbutil inline line table
dumper to account for this new interpretation of the opcodes.
llvm-svn: 362277
When the object size argument is -1, no checking can be done, so calling the
_chk variant is unnecessary. We already did this for a bunch of these
functions.
rdar://50797197
Differential revision: https://reviews.llvm.org/D62358
llvm-svn: 362272
Just copy all of the operands except the chain and call MorphNode on that.
This removes the IsUnary and IsTernary flags.
Also always get the result type from the result type of the original
nodes. Previously we got it from the operand except for two nodes
where that didn't work.
llvm-svn: 362269
Summary:
This was flagged in https://www.viva64.com/en/b/0629/ under "Snippet No.
7".
These statements are order independent, short of the use-after-move.
Reviewers: echristo, srhines, RKSimon
Reviewed By: RKSimon
Subscribers: dblaikie, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D62114
llvm-svn: 362267
Summary:
Fixes a warning produced from scan-build (llvm.org/reports/scan-build/),
further warnings found by annotation isMoveInstr [[nodiscard]].
isMoveInstr potentially does not assign to its parameters, so if they
were uninitialized, they will potentially stay uninitialized. It seems
most call sites pass references to uninitialized values, then use them
without checking the return value.
Reviewers: wmi
Reviewed By: wmi
Subscribers: MatzeB, qcolombet, hiraditya, tpr, llvm-commits, srhines
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D62109
llvm-svn: 362265
After improving the inline line table dumper in llvm-pdbutil and looking
at MSVC's inline line tables, it is clear that setting the length of the
inlined code region does not update the code offset. This means that the
delta to the beginning of a new discontiguous inlined code region should
be calculated relative to the last code offset, excluding the length.
Implementing this is a one line fix for MC: simply don't update
LastLabel.
While I'm updating these test cases, switch them to use llvm-objdump -d
and llvm-pdbutil. This allows us to show offsets of each instruction and
correlate the line table offsets to the actual code.
llvm-svn: 362264
If we can determine that a saturating add/sub will not overflow based
on range analysis, convert it into a simple binary operation. This is
a sibling transform to the existing with.overflow handling.
Reapplying this with an additional check that the saturating intrinsic
has integer type, as LVI currently does not support vector types.
Differential Revision: https://reviews.llvm.org/D62703
llvm-svn: 362263
Noticed on D62703. LVI only handles plain integers, not vectors of
integers. This was previously not an issue, because vector support
for with.overflow is only a relatively recent addition.
llvm-svn: 362261
This feeds the new llvm_codsign BUNDLE_PATH option through from the llvm target wrapper functions, so that you can specify the BUNDLE_PATH on the target's codesign.
llvm-svn: 362248
We don't want to create vregs if there is nothing to use them for. That causes
verifier errors.
Differential Revision: https://reviews.llvm.org/D62740
llvm-svn: 362247
The resource pressure distribution computation is now delegated by class
BottleneckAnalysis to an instance of class PressureTracker.
Class PressureTracker is also responsible for:
- tracking users of processor resource units.
- tracking the number of delay cycles caused by increases in backpressure.
BottleneckAnalysis internally initializes a dependency graph. Each nodes
represents an instruction in the input code sequence. Edges of the dependency
graph are critical register/memory/resource dependencies. Dependencies are only
added to the graph if they are seen as critical by backend pressure events.
The DependencyGraph is currently unused. It is possible to print the dependency
graph (see method DependencyGraph::dump()) for debugging purposes.
The long term goal is to use the information stored by the dependency graph in
order to do critical path computation.
llvm-svn: 362246
If we can determine that a saturating add/sub will not overflow
based on range analysis, convert it into a simple binary operation.
This is a sibling transform to the existing with.overflow handling.
Differential Revision: https://reviews.llvm.org/D62703
llvm-svn: 362242
[FPEnv] Added a special UnrollVectorOp method to deal with the chain on StrictFP opcodes
This change creates UnrollVectorOp_StrictFP. The purpose of this is to address a failure that consistently occurs when calling StrictFP functions on vectors whose number of elements is 3 + 2n on most platforms, such as PowerPC or SystemZ. The old UnrollVectorOp method does not expect that the vector that it will unroll will have a chain, so it has an assert that prevents it from running if this is the case. This new StrictFP version of the method deals with the chain while unrolling the vector. With this new function in place during vector widending, llc can run vector-constrained-fp-intrinsics.ll for SystemZ successfully.
Submitted by: Drew Wock <drew.wock@sas.com>
Reviewed by: Cameron McInally, Kevin P. Neal
Approved by: Cameron McInally
Differential Revision: https://reviews.llvm.org/D62546
llvm-svn: 362241
AMDGPU uses multiplier 9 for the inline cost. It is taken into account
everywhere except for inline hint threshold. As a result we are penalizing
functions with the inline hint making them less probable to be inlined
than those without the hint. Defaults are 225 for a normal function and
325 for a function with an inline hint. Currently we have effective
threshold 225 * 9 = 2025 for normal functions and just 325 for those with
the hint. That is fixed by this patch.
Differential Revision: https://reviews.llvm.org/D62707
llvm-svn: 362239
In PPCReduceCRLogicals after splitting the original MBB into 2, the 2 impacted branches still use original branch probability. This is unreasonable. Suppose we have following code, and the probability of each successor is 50%.
condc = conda || condb
br condc, label %target, label %fallthrough
It can be transformed to following,
br conda, label %target, label %newbb
newbb:
br condb, label %target, label %fallthrough
Since each branch has a probability of 50% to each successor, the total probability to %fallthrough is 25% now, and the total probability to %target is 75%. This actually changed the original profiling data. A more reasonable probability can be set to 70% to the false side for each branch instruction, so the total probability to %fallthrough is close to 50%.
This patch assumes the branch target with two incoming edges have same edge frequency and computes new probability fore each target, and keep the total probability to original targets unchanged.
Differential Revision: https://reviews.llvm.org/D62430
llvm-svn: 362237
These can take a significant amount of time in some builds.
Suggested by Andrea Di Biagio.
Differential Revision: https://reviews.llvm.org/D62666
llvm-svn: 362219
It looks this fold was already partially happening, indirectly
via some other folds, but with one-use limitation.
No other fold here has that restriction.
https://rise4fun.com/Alive/ftR
llvm-svn: 362217
Summary:
A three sources variant of the TBL instruction is added to the existing
SVE instruction in SVE2. This is implemented with minor changes to the
existing TableGen class. TBX is a new instruction with its own
definition.
The specification can be found here:
https://developer.arm.com/docs/ddi0602/latest
Reviewed By: chill
Differential Revision: https://reviews.llvm.org/D62600
llvm-svn: 362214
Test different operand types of callee and their behavior whether
relocation model is pic or not.
Possible operand types are:
Register (function pointer),
External symbol (used for libcalls e.g. __udivdi3 or memcpy),
Global address.
Global address has different handling depending on relocation model
and linkage type. Register and external symbol do not.
Differential Revision: https://reviews.llvm.org/D62590
llvm-svn: 362212
Fix the misleadingly indentation introduced in rL362064. This will get rid of
the compiler warning, and it was actually a bug. This change will be used and
tested in D62669.
llvm-svn: 362211
Handle position independent code for MIPS32.
When callee is global address, lower call will emit callee
as G_GLOBAL_VALUE and add target flag if needed.
Support $gp in getRegBankFromRegClass().
Select G_GLOBAL_VALUE, specially handle case when
there are target flags attached by lowerCall.
Differential Revision: https://reviews.llvm.org/D62589
llvm-svn: 362210
Move initGlobalBaseReg from MipsSEDAGToDAGISel to MipsFunctionInfo.
This way functions used for handling position independent code during
instruction selection, getGlobalBaseReg and initGlobalBaseReg,
end up in same class.
Differential Revision: https://reviews.llvm.org/D62586
llvm-svn: 362206
Lower call for callee that is register for MIPS32.
Register should contain callee function address.
Differential Revision: https://reviews.llvm.org/D62585
llvm-svn: 362204
These patterns can incorrectly narrow a volatile load from 128-bits to 64-bits.
Similar to PR42079.
Switch to using (v4i32 (bitcast (v2i64 (scalar_to_vector (loadi64))))) as the
load pattern used in the instructions.
This probably still has issues in 32-bit mode where loadi64 isn't legal. Maybe
we should use VZMOVL for widened loads even when we don't need the upper bits
as zeroes?
llvm-svn: 362203
DAG combine will usually fold fpextend+load to an fp extload anyway. So the
256 and 512 patterns were probably unnecessary. The 128 bit pattern was special
in that it looked for a v4f32 load, but then used it in an instruction that
only loads 64-bits. This is bad if the load happens to be volatile. We could
probably make the patterns volatile aware, but that's more work for something
that's probably rare. The peephole pass might kick in and save us anyway. We
might also be able to fix this with some additional DAG combines.
This also adds patterns for vselect+extload to enabled masked vcvtps2pd to be
used. Previously we looked for the unlikely vselect+fpextend+load.
llvm-svn: 362199
This consolidates the vreg skip code into one function (SkipVRegs()).
SkipVRegs() now knows if it should skip as if it is the first initialization or
subsequent skips.
The first skip is also done the first time createVirtualRegister is called by
the cursor instead of by the cursor's constructor. This prevents verifier
errors on machine functions that have no vregs (where the verifier will
complain that there are vregs when the function uses none).
Differential Revision: https://reviews.llvm.org/D62717
llvm-svn: 362195
This makes the 5 address operands come first. And the data operand comes last.
This matches the operand order the instruction is created with. It's also the
expected order in X86MCInstLower. So everything appeared to work, but the
operands didn't match their declared type.
Fixes a -verify-machineinstrs failure.
Also remove the isel patterns from these instructions since they should only
be used for stack spills and reloads. I'm not even sure what types the patterns
were looking for to match.
llvm-svn: 362193
This is am almost NFC, it does the following:
- If there is no register class for a COPY's src or dst, bail.
- Fixes uses iterator invalidation bug.
Differential Revision: https://reviews.llvm.org/D62713
llvm-svn: 362191