Many of these problems are because shuffle lowering widens element size and reduces element count when possible. This causes the shuffle to become separated from the select by a bitcast. Future patches will work to improve these cases by rewriting the shuffle back to a narrow element type if we think it can result in folding the mask.
llvm-svn: 287503
The change is part of RegCall calling convention support for LLVM.
Long double (f80) requires special treatment as the first f80 parameter is saved in FP0 (floating point stack).
This review present the change and the corresponding tests.
Differential Revision: https://reviews.llvm.org/D26151
llvm-svn: 287485
If a response file in construct `@file` was specified by relative name,
constructs `@file` nested within it were resolved incorrectly if the
flag RelativeNames in call to ExpandResponseFile was set to true.
This feature is used in configuration files, tests for it are in
respective change (see D24933).
llvm-svn: 287482
The casting based reading of the LSDA could attempt to read unsuitably aligned
data. Avoid that case by explicitly using a memcpy. A similar approach is used
in libc++abi to address the same UB.
llvm-svn: 287479
Rather than redeclaring the interfaces for exceptions, prefer using the
`unwind.h` header. This is vended by at least gcc and clang, and can also be
found by an external unwinding library (e.g. libunwind). Doing this simplifies
the example to the exception handling itself. Minor tweaks are the result of
_Unwind_Context_t not being defined, which is just a typedef for struct
_Unwind_Context *. NFC.
llvm-svn: 287478
add BPF disassembler, so tools like llvm-objdump can be used:
$ llvm-objdump -d -no-show-raw-insn ./sockex1_kern.o
./sockex1_kern.o: file format ELF64-BPF
Disassembly of section socket1:
bpf_prog1:
0: r6 = r1
8: r0 = *(u8 *)skb[23]
10: *(u32 *)(r10 - 4) = r0
18: r1 = *(u32 *)(r6 + 4)
20: if r1 != 4 goto 8
28: r2 = r10
30: r2 += -4
ld_imm64 (the only 16-byte insn) and special ld_abs/ld_ind instructions
had to be treated in a special way. The decoders for the rest of the insns
are automatically generated.
Add tests to cover new functionality.
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
llvm-svn: 287477
The demangler had stopped using a custom allocator but had not been updated to
remove the use of the explicit allocator passing. This removes that as we do
not need to do anything special here anymore. This just makes the code more
compact. NFC.
llvm-svn: 287472
We created a local typedef for `std::basic_string<char, std::char_traits<char>>`
which is just `std::string`. Remove the local typedef and propagate the type
information through the rest of the demangler. NFC.
llvm-svn: 287470
It seems that because ThinLTO does not import the full module,
some invariant of the type mapper are broken.
In Monolithic LTO, we import every globals: when calling
IRLinker::copyFunctionProto() on @foo(), we end-up calling
TypeMapTy::get(FTy) on the type of @foo(), which will map
%0 and record the destination as opaque.
ThinLTO skips this because @foo is not imported and goes directly
to the next stage.
Next we call computeTypeMapping() that map the types for each
globals, and ends up checking for type isomorphism, and may add
type mapping. However it doesn't record if there was an opaque
destination type that was resolved.
Instead of lazily "discovering" opaque type in the destination
module on the go, we change the TypeFinder to eagerly record all
types and not only the named ones.
Differential Revision: https://reviews.llvm.org/D26840
llvm-svn: 287453
Summary:
This will also be added to the LTO API, right now this will
bring ThinLTO on par with Monolithic LTO on Darwin.
Reviewers: anemet
Subscribers: tejohnson, llvm-commits
Differential Revision: https://reviews.llvm.org/D26886
llvm-svn: 287450
Summary:
This makes it explicit that ownership is taken. Also replace all `new`
with make_unique<> at call sites.
Reviewers: anemet
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D26884
llvm-svn: 287449
Summary:
* ARM is omitted from this patch because this check appears to expose bugs in this target.
* Mips is omitted from this patch because this check either detects bugs or deliberate
emission of instructions that don't satisfy their predicates. One deliberate
use is the SYNC instruction where the version with an operand is correctly
defined as requiring MIPS32 while the version without an operand is defined
as an alias of 'SYNC 0' and requires MIPS2.
* X86 is omitted from this patch because it doesn't use the tablegen-erated
MCCodeEmitter infrastructure.
Patches for ARM and Mips will follow.
Depends on D25617
Reviewers: tstellarAMD, jmolloy
Subscribers: wdng, jmolloy, aemerson, rengolin, arsenm, jyknight, nemanjai, nhaehnle, tstellarAMD, llvm-commits
Differential Revision: https://reviews.llvm.org/D25618
llvm-svn: 287439
llvm-lto2.cpp has the following include chain:
llvm/LTO/Caching.h
llvm/LTO/LTO.h
llvm/CodeGen/Analysis.h
llvm/IR/CallSite.h
llvm/IR/Attributes.h
llvm/IR/Attributes.gen
This means llvm-lto2 needs to depend on intrinsics_gen.
llvm-svn: 287434
AnalysisWrappers.cpp has the following include chain:
llvm/Analysis/CallGraph.h
llvm/IR/CallSite.h
llvm/IR/Attributes.h
llvm/IR/Attributes.gen
This means opt needs to depend on intrinsics_gen.
llvm-svn: 287433
llvm-nm.cpp has the following include chain:
llvm/IR/Function.h
llvm/IR/Argument.h
llvm/IR/Attributes.h
llvm/IR/Attributes.gen
This means llvm-nm needs to depend on intrinsics_gen.
llvm-svn: 287432
llvm-link.cpp has the following include chain:
llvm/Bitcode/BitcodeWriter.h
llvm/IR/ModuleSummaryIndex.h
llvm/IR/Module.h
llvm/IR/Function.h
llvm/IR/Argument.h
llvm/IR/Attributes.h
llvm/IR/Attributes.gen
This means llvm-link needs to depend on intrinsics_gen.
llvm-svn: 287431
llvm-extract.cpp has the following include chain:
llvm/Bitcode/BitcodeWriterPass.h
llvm/IR/PassManager.h
llvm/IR/Function.h
llvm/IR/Argument.h
llvm/IR/Attributes.h
llvm/IR/Attributes.gen
This means llvm-extract needs to depend on intrinsics_gen.
llvm-svn: 287430
llvm-dwp.cpp has the following include chain:
llvm/CodeGen/AsmPrinter.h
llvm/CodeGen/MachineFunctionPass.h
llvm/CodeGen/MachineFunction.h
llvm/CodeGen/MachineBasicBlock.h
llvm/CodeGen/MachineInstr.h
llvm/Analysis/AliasAnalysis.h
llvm/IR/CallSite.h
llvm/IR/Attributes.h
llvm/IR/Attributes.gen
This means llvm-dwp needs to depend on intrinsics_gen.
llvm-svn: 287429
llvm-dis.cpp has the following include chain:
llvm/Bitcode/BitcodeReader.h
llvm/IR/ModuleSummaryIndex.h
llvm/IR/Module.h
llvm/IR/Function.h
llvm/IR/Argument.h
llvm/IR/Attributes.h
llvm/IR/Attributes.gen
This means llvm-dis needs to depend on intrinsics_gen.
llvm-svn: 287428
llvm-diff.cpp has the following include chain:
llvm/IR/Module.h
llvm/IR/Function.h
llvm/IR/Argument.h
llvm/IR/Attributes.h
llvm/IR/Attributes.gen
This means llvm-diff needs to depend on intrinsics_gen.
llvm-svn: 287427
llvm-stress.cpp has the following include chain:
llvm/Analysis/CallGraphSCCPass.h
llvm/Analysis/CallGraph.h
llvm/IR/CallSite.h
llvm/IR/Attributes.h
llvm/IR/Attributes.gen
This means llvm-stress needs to depend on intrinsics_gen.
llvm-svn: 287426
TestPasses.cpp has the following include chain:
llvm/IR/InstVisitor.h
llvm/IR/CallSite.h
llvm/IR/Attributes.h
llvm/IR/Attributes.gen
This means bugpoint-passes needs to depend on intrinsics_gen.
llvm-svn: 287425
llvm-bcanalyzer.cpp has the following include chain:
llvm/Bitcode/BitcodeReader.h
llvm/IR/ModuleSummaryIndex.h
llvm/IR/Module.h
llvm/IR/Function.h
llvm/IR/Argument.h
llvm/IR/Attributes.h
llvm/IR/Attributes.gen
This means llvm-bcanalyzer needs to depend on intrinsics_gen.
llvm-svn: 287424
llvm-as.cpp has the following include chain:
llvm/Bitcode/BitcodeWriter.h
llvm/IR/ModuleSummaryIndex.h
llvm/IR/Module.h
llvm/IR/Function.h
llvm/IR/Argument.h
llvm/IR/Attributes.h
llvm/IR/Attributes.gen
This means llvm-as needs to depend on intrinsics_gen.
llvm-svn: 287423
llc.cpp has the following include chain:
llvm/Analysis/TargetLibraryInfo.h
llvm/IR/Function.h
llvm/IR/Argument.h
llvm/IR/Attributes.h
llvm/IR/Attributes.gen
This means llc needs to depend on intrinsics_gen.
llvm-svn: 287422
ChildTarget.cpp has the following include chain:
llvm/ExecutionEngine/Orc/OrcABISupport.h
llvm/ExecutionEngine/Orc/IndirectionUtils.h
llvm/IR/IRBuilder.h
llvm/IR/ConstantFolder.h
llvm/IR/InstrTypes.h
llvm/IR/Attributes.h
llvm/IR/Attributes.gen
This means lli needs to depend on intrinsics_gen.
llvm-svn: 287420
DwarfLinker.cpp has the following include chain:
llvm/CodeGen/AsmPrinter.h
llvm/CodeGen/MachineFunctionPass.h
llvm/CodeGen/MachineFunction.h
llvm/CodeGen/MachineBasicBlock.h
llvm/CodeGen/MachineInstr.h
llvm/Analysis/AliasAnalysis.h
llvm/IR/CallSite.h
llvm/IR/Attributes.h
llvm/IR/Attributes.gen
This means llvm-dsymutil needs to depend on intrinsics_gen.
llvm-svn: 287419
When LLVM_DEPENDENCY_DEBUGGING=On we should apply the sandbox only on the target, not the directory. This is important for directories that create more than one target, or for nested directories.
llvm-svn: 287415
verify-uselistorder.cpp has the following include chain:
llvm/Bitcode/BitcodeReader.h
llvm/IR/ModuleSummaryIndex.h
llvm/IR/Module.h
llvm/IR/Function.h
llvm/IR/Argument.h
llvm/IR/Attributes.h
llvm/IR/Attributes.gen
This means verify-uselistorder needs to depend on intrinsics_gen.
llvm-svn: 287405
sanstats.cpp has the following include chain:
llvm/Transforms/Utils/SanitizerStats.h
llvm/IR/IRBuilder.h
llvm/IR/ConstantFolder.h
llvm/IR/InstrTypes.h
llvm/IR/Attributes.h
llvm/IR/Attributes.gen
This means sanstats needs to depend on intrinsics_gen.
llvm-svn: 287404
CrashDebugger.cpp has the following include chain:
llvm/Analysis/TargetTransformInfo.h
llvm/IR/IntrinsicInst.h
llvm/IR/Function.h
llvm/IR/Argument.h
llvm/IR/Attributes.h
llvm/IR/Attributes.gen
This means bugpoint needs to depend on intrinsics_gen.
llvm-svn: 287402
This is a prerequisite patch for D26556:
https://reviews.llvm.org/D26556
...because there was no direct coverage for these folds (which in some cases are adding instructions).
llvm-svn: 287400
llvm-split.cpp has the following include chain:
llvm/Bitcode/BitcodeWriter.h
llvm/IR/ModuleSummaryIndex.h
llvm/IR/Module.h
llvm/IR/Function.h
llvm/IR/Argument.h
llvm/IR/Attributes.h
llvm/IR/Attributes.gen
This means llvm-split needs to depend on intrinsics_gen.
llvm-svn: 287399
llvm-lto.cpp has the following include chain:
llvm/Bitcode/BitcodeReader.h
llvm/IR/ModuleSummaryIndex.h
llvm/IR/Module.h
llvm/IR/Function.h
llvm/IR/Argument.h
llvm/IR/Attributes.h
llvm/IR/Attributes.gen
This means llvm-lto needs to depend on intrinsics_gen.
llvm-svn: 287398
llvm-ar.cpp has the following include chain:
llvm/IR/Module.h
llvm/IR/Function.h
llvm/IR/Argument.h
llvm/IR/Attributes.h
llvm/IR/Attributes.gen
This means llvm-ar needs to depend on intrinsics_gen.
llvm-svn: 287395
llvm-profdata.cpp has the following include chain:
llvm/ProfileData/SampleProfReader.h
llvm/IR/Function.h
llvm/IR/Argument.h
llvm/IR/Attributes.h
llvm/IR/Attributes.gen
This means llvm-profdata needs to depend on intrinsics_gen.
llvm-svn: 287394
lto.cpp has the following include chain:
llvm/Bitcode/BitcodeReader.h
llvm/IR/ModuleSummaryIndex.h
llvm/IR/Module.h
llvm/IR/Function.h
llvm/IR/Argument.h
llvm/IR/Attributes.h
llvm/IR/Attributes.gen
This means LTO needs to depend on intrinsics_gen.
llvm-svn: 287393
The previously used "names" are rather descriptions (they use multiple
words and contain spaces), use short programming language identifier
like strings for the "names" which should be used when exporting to
machine parseable formats.
Also removed a unused TimerGroup from Hexxagon.
Differential Revision: https://reviews.llvm.org/D25583
llvm-svn: 287369
It is used to drive this from the clang driver via -mllvm.
Same option name is used as in opt.
Differential Revision: https://reviews.llvm.org/D26832
llvm-svn: 287356
During Module linking, it's possible for SrcM->getIdentifiedStructTypes();
to return types that are actually defined in the destination module
(DstM). Depending on how the bitcode file was read,
getIdentifiedStructTypes() might do a walk over all values, including
metadata nodes, looking for types. In my case, a debug info metadata
node was shared between the two modules, and it referred to a type
defined in the destination module (see test case).
Differential Revision: https://reviews.llvm.org/D26212
llvm-svn: 287353
Summary:
LLVM will define a symbol, either EnableABIBreakingChecks or
DisableABIBreakingChecks depending on the configuration setting for
LLVM_ABI_BREAKING_CHECKS.
The llvm-config.h header will add weak references to these symbols in
every clients that includes this header. This should ensure that
a mismatch triggers a link failure (or a load time failure for DSO).
On MSVC, the pragma "detect_mismatch" is used instead.
Reviewers: rnk, jroelofs
Subscribers: llvm-commits, mgorny
Differential Revision: https://reviews.llvm.org/D26841
llvm-svn: 287352
The MIPS MSA ASE provides instructions to convert to and from half precision
floating point. This patch teaches the MIPS backend to treat f16 as a legal
type and how to promote such values to f32 for the usual set of operations.
As a result of this, the fexup[lr].w intrinsics no longer crash LLVM during
type legalization.
Reviewers: zoran.jovanvoic, vkalintiris
Differential Revision: https://reviews.llvm.org/D26398
llvm-svn: 287349
Summary:
The 32-bit instructions don't zero the high 16-bits like the 16-bit
instructions do.
Reviewers: arsenm
Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, llvm-commits, tony-tye
Differential Revision: https://reviews.llvm.org/D26828
llvm-svn: 287342
insertUniqueBackedgeBlock in lib/Transforms/Utils/LoopSimplify.cpp now
propagates existing llvm.loop metadata to newly the added backedge.
llvm::TryToSimplifyUncondBranchFromEmptyBlock in lib/Transforms/Utils/Local.cpp
now propagates existing llvm.loop metadata to the branch instructions in the
predecessor blocks of the empty block that is removed.
Differential Revision: https://reviews.llvm.org/D26495
llvm-svn: 287341
Summary:
The addr64-based legalization is incorrect for MUBUF instructions with idxen
set as well as for BUFFER_LOAD/STORE_FORMAT_* instructions. This affects
e.g. shaders that access buffer textures.
Since we never actually need the addr64-legalization in shaders, this patch
takes the easy route and keys off the calling convention. If this ever
affects (non-OpenGL) compute, the type of legalization needs to be chosen
based on some TSFlag.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98664
Reviewers: arsenm, tstellarAMD
Subscribers: kzhuravl, wdng, yaxunl, tony-tye, llvm-commits
Differential Revision: https://reviews.llvm.org/D26747
llvm-svn: 287339
When we see a SETCC whose only users are zero extend operations, we can replace
it with a subtraction. This results in doing all calculations in GPRs and
avoids CR use.
Currently we do this only for ULT, ULE, UGT and UGE condition codes. There are
ways that this can be extended. For example for signed condition codes. In that
case we will be introducing additional sign extend instructions, so more careful
profitability analysis may be required.
Another direction to extend this is for equal, not equal conditions. Also when
users of SETCC are any_ext or sign_ext, we might be able to do something
similar.
llvm-svn: 287329
This is a straightforward extension of the existing support for 32/64-bit element types. Just needed to add the additional instrinsics to the switches.
llvm-svn: 287316
The same thing was done to 32-bit and 64-bit element sizes previously.
This will allow us to support these shuffls in InstCombineCalls along with the other variable shift intrinsics.
llvm-svn: 287312
since bpf instruction set was introduced people learned to
read and understand kernel verifier output whereas llvm asm
output stayed obscure and unknown. Convert llvm to emit
assembler text similar to kernel to avoid this discrepancy
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
llvm-svn: 287300
Summary:
This extends FCOPYSIGN support to 512-bit vectors.
I've also added tests to show what the 128-bit and 256-bit cases look like with broadcast loads.
Reviewers: delena, zvi, RKSimon, spatel
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D26791
llvm-svn: 287298
Summary: This should provide the function similar to `--disable-libedit` with the autotools build system, which seems to be missing from the commit (r200595) that adds this.
Reviewers: pcc, beanz
Subscribers: mgorny, llvm-commits
Differential Revision: https://reviews.llvm.org/D26550
llvm-svn: 287293
Summary:
CompareSCEVComplexity goes too deep (50+ on a quite a big unrolled loop) and runs almost infinite time.
Added cache of "equal" SCEV pairs to earlier cutoff of further estimation. Recursion depth limit was also introduced as a parameter.
Reviewers: sanjoy
Subscribers: mzolotukhin, tstellarAMD, llvm-commits
Differential Revision: https://reviews.llvm.org/D26389
llvm-svn: 287232
vXi64 multiplication is lowered into 3 calls of vpmuludq with the upper/lower 32-bit halves.
If any of these halves are zero then we can remove individual calls. Although there was isBuildVectorAllZeros code to do this I don't think it ever worked (maybe just for constant folded cases that don't seem to be tested for any longer).
This requires additional X86ISD support for computeKnownBitsForTargetNode, so far I've just added support for X86ISD::VZEXT (VPMOVZX* - helping the AVX2+ cases).
Partial fix for PR30845
Differential Revision: https://reviews.llvm.org/D26590
llvm-svn: 287223
Summary:
The motivation for this is to enable correct detection of dlopen() on Android.
Android does not provide a static version of libdl, so if we add the -static flag
after performing the check, it will succeed even though subsequent link steps
will fail. With this change we correctly detect the absence of libdl in a
LLVM_BUILD_STATIC build on Android.
The link itself still does not succeed because the code does not check the result
of this check properly, but I plan to fix that in a separate change.
Reviewers: beanz
Subscribers: danalbert, mgorny, srhines, tberghammer, llvm-commits
Differential Revision: https://reviews.llvm.org/D26463
llvm-svn: 287220
Summary:
Variadic functions can be treated in the same way as normal functions
with respect to the number and types of parameters.
Reviewers: grosbach, olista01, t.p.northover, rengolin
Subscribers: javed.absar, aemerson, llvm-commits
Differential Revision: https://reviews.llvm.org/D26748
llvm-svn: 287219
Register Calling Convention defines a new behavior for v64i1 types.
This type should be saved in GPR.
However for 32 bit machine we need to split the value into 2 GPRs (because each is 32 bit).
Differential Revision: https://reviews.llvm.org/D26181
llvm-svn: 287217
ImplicitNullCheck keeps track of one instruction that the memory
operation depends on that it also hoists with the memory operation.
When hoisting this dependency, it would sometimes clobber a live-in
value to the basic block we were hoisting the two things out of. Fix
this by explicitly looking for such dependencies.
I also noticed two redundant checks on `MO.isDef()` in IsMIOperandSafe.
They're redundant since register MachineOperands are either Defs or Uses
-- there is no third kind. I'll change the checks to asserts in a later
commit.
llvm-svn: 287213
This patch adds an option to the build system LLVM_DEPENDENCY_DEBUGGING. Over time I plan to extend this to do more complex verifications, but the initial patch causes compile errors wherever there is missing a dependency on intrinsics_gen.
Because intrinsics_gen is a compile-time dependency not a link-time dependency, everything that relies on the headers generated in intrinsics_gen needs an explicit dependency.
llvm-svn: 287207
This patch updates a bunch of places where add_dependencies was being explicitly called to add dependencies on intrinsics_gen to instead use the DEPENDS named parameter. This cleanup is needed for a patch I'm working on to add a dependency debugging mode to the build system.
llvm-svn: 287206
Summary:
For flat loop, even if it is hot, it is not a good idea to unroll in runtime, thus we set a lower partial unroll threshold.
For hot loop, we set a higher unroll threshold and allows expensive tripcount computation to allow more aggressive unrolling.
Reviewers: davidxl, mzolotukhin
Subscribers: sanjoy, mehdi_amini, llvm-commits
Differential Revision: https://reviews.llvm.org/D26527
llvm-svn: 287186
This pass splits globals into elements using inrange annotations on
getelementptr indices.
Differential Revision: https://reviews.llvm.org/D22295
llvm-svn: 287178
We save an inter-register file move this way. If there's any CPU where
the FP logic is slower, we could transform this back to int-logic in
MachineCombiner.
This helps, but doesn't solve, PR6137:
https://llvm.org/bugs/show_bug.cgi?id=6137
The 'andn' test shows that we're missing a pattern match to
recognize the xor with -1 constant as a 'not' op.
llvm-svn: 287171
This should prevent stack overflows in non-optimized builds on
.ll files with lots of consecutive commented-out lines.
Instead of recursing into LexToken(), continue into a 'while (true)'.
llvm-svn: 287170
They're not SelectionDAG- or FunctionLoweringInfo-specific. They
are, however, specific to building MMI from IR.
We could make them members, but it's nice having MMI be a "simple" data
structure and this logic kept separate.
This also lets us reuse them from GlobalISel.
llvm-svn: 287167
No real functional change with this commit.
The problem with report_fatal_error() is it does not include the tool name
and the file name the for which the error message was generated.
Uses of report_fatal_error() were change to report_error() or error()
to get a better error and to make the code smaller and cleaner.
Also changed things like error(errorToErrorCode(SOrErr.takeError())) to
use report_error() with a file name and the llvm::Error (as well as the
ArchitectureName if available) so the error message is printed.
llvm-svn: 287163
Summary:
A lot of the pseudo instructions are required because LLVM assumes that
all integers of the same size as the pointer size are legal. This means
that it will not currently expand 16-bit instructions to their 8-bit
variants because it thinks 16-bit types are legal for the operations.
This also adds all of the CodeGen tests that required the pass to run.
Reviewers: arsenm, kparzysz
Subscribers: wdng, mgorny, modocache, llvm-commits
Differential Revision: https://reviews.llvm.org/D26577
llvm-svn: 287162
We only ever create TargetConstantPool, TargetJumpTable, TargetExternalSymbol,
TargetGlobalAddress, TargetGlobalTLSAddress, MCSymbol and TargetBlockAddress
nodes as operands of X86ISD::Wrapper nodes, so we can remove one check and
invert the other.
Also update the documentation comment for X86ISD::Wrapper.
Differential Revision: https://reviews.llvm.org/D26731
llvm-svn: 287160
We don't track callee clobbered registers correctly, so avoid hoisting
across calls.
Note: for this bug to trigger we need a `readonly` call target, since we
already have logic to not hoist across potentially storing instructions
either.
llvm-svn: 287159
One half of the shifts obviously needed conditional selection based on whether
the shift amount is more than 32-bits, but leaving the other half as the
natural shift isn't acceptable either: it's undefined behaviour to shift a
32-bit value by more than 31.
llvm-svn: 287149
We fail to produce bit-to-bit matching stage2 and stage3 compiler in PGO
bootstrap build. The reason is because LoopBlockSet is of SmallPtrSet type
whose iterating order depends on the pointer value.
This patch fixes this issue by changing to use SmallSetVector.
Differential Revision: http://reviews.llvm.org/D26634
llvm-svn: 287148
Summary:
Extend replaceZeroVectorStore to handle more vector type stores,
floating point zero vectors and set alignment more accurately on split
stores.
This is a follow-up change to r286875.
This change fixes PR31038.
Reviewers: MatzeB
Subscribers: mcrosier, aemerson, llvm-commits, rengolin
Differential Revision: https://reviews.llvm.org/D26682
llvm-svn: 287142
Summary:
1. Don't try to copy values to and from the same register class.
2. Replace copies with of registers with immediate values with v_mov/s_mov
instructions.
The main purpose of this change is to make MachineSink do a better job of
determining when it is beneficial to split a critical edge, since the pass
assumes that copies will become move instructions.
This prevents a regression in uniform-cfg.ll if we enable critical edge
splitting for AMDGPU.
Reviewers: arsenm
Subscribers: arsenm, kzhuravl, llvm-commits
Differential Revision: https://reviews.llvm.org/D23408
llvm-svn: 287131
We can replace "scalar" FP-bitwise-logic with other forms of bitwise-logic instructions.
Scalar SSE/AVX FP-logic instructions only exist in your imagination and/or the bowels of
compilers, but logically equivalent int, float, and double variants of bitwise-logic
instructions are reality in x86, and the float variant may be a shorter instruction
depending on which flavor (SSE or AVX) of vector ISA you have...so just prefer float all
the time.
This is a preliminary step towards solving PR6137:
https://llvm.org/bugs/show_bug.cgi?id=6137
Differential Revision:
https://reviews.llvm.org/D26712
llvm-svn: 287122
This unit test infinite-looped on s390x due to a thread_yield being optimized
out. I've updated the QueueChannel class (where thread_yield was called) to use
a condition variable instead. This should cause the unit test to behave
correctly.
llvm-svn: 287121
Both the (V)CVTDQ2PD (i32 to f64) and (V)CVTUDQ2PD (u32 to f64) conversion instructions are lossless and can be safely represented as generic SINT_TO_FP/UINT_TO_FP calls instead of x86 intrinsics without affecting final codegen.
LLVM counterpart to D26686
Differential Revision: https://reviews.llvm.org/D26736
llvm-svn: 287108
MipsFastISel uses a a class to represent addresses with a signed member
to represent the offset. MipsFastISel::emitStore, emitLoad and computeAddress
all treated the offset as being positive. In cases where the offset was
actually negative and a frame pointer was used, this would cause the constant
synthesis routine to crash as it would generate an unexpected instruction
sequence when frame indexes are replaced.
Reviewers: vkalintiris
Differential Revision: https://reviews.llvm.org/D26192
llvm-svn: 287099
This patch adds the single operand form of the not alias to microMIPS and
MIPS along with additional tests.
This partially resolves PR/30381.
Thanks to Sean Bruno for reporting the issue!
llvm-svn: 287097
Summary:
All uses have been replaced by appropriate std::chrono types, and the class is
now unused.
Reviewers: zturner, mehdi_amini
Subscribers: llvm-commits, mgorny
Differential Revision: https://reviews.llvm.org/D26447
llvm-svn: 287094
Summary: These intrinsics have been unused for clang for a while. This patch removes them. We auto upgrade them to extractelements, a scalar operation and then an insertelement. This matches the sequence used by clangs intrinsic file.
Reviewers: zvi, delena, RKSimon
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D26660
llvm-svn: 287083
This has two advantages:
1) We slowly move away from ErrorOr to the new handling interface,
in the hope of having an uniform error handling in LLVM, eventually.
2) We're starting to have *meaningful* error messages for invalid
object ELF files, rather than a generic "parse error". At some point
we should include also the offset to improve the quality of the
diagnostic.
llvm-svn: 287081
Doing this before register allocation reduces register pressure as we do
not even have to allocate a register for those dead definitions.
Differential Revision: https://reviews.llvm.org/D26111
llvm-svn: 287076
Summary:
We update the documentation to define what the requirements are for the
provided XRay log handler. This is to make it clear that the function
pointer provided must do internal synchronisation and that there are no
guarantees provided by XRay on when the function shall be invoked once
it has been installed as a log handler.
Reviewers: rSerge, rengolin
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D26651
llvm-svn: 287073
In https://reviews.llvm.org/D25347, Geoff noticed that we still have
useless copy that we can eliminate after register allocation. At the
time the allocation is chosen for those copies, they are not useless
but, because of changes in the surrounding code, later on they might
become useless.
The Greedy allocator already has a mechanism to deal with such cases
with a late recoloring. However, we missed to record the some of the
missed hints.
This commit fixes that.
llvm-svn: 287070