Commit Graph

354105 Commits

Author SHA1 Message Date
Simon Pilgrim 0387df7f02 [X86] combineX86ShuffleChain - use narrowShuffleMaskElts scale == 1 builtin handling. NFC.
narrowShuffleMaskElts already has the fast-path for scale == 1, no need to reimplement it here.
2020-05-12 13:45:40 +01:00
Yaxun (Sam) Liu e03394c6a6 [CUDA][HIP] Workaround for resolving host device function against wrong-sided function
recommit c77a4078e0 with fix

https://reviews.llvm.org/D77954 caused regressions due to diagnostics in implicit
host device functions.

For now, it seems the most feasible workaround is to treat implicit host device function and explicit host
device function differently. Basically in device compilation for implicit host device functions, keep the
old behavior, i.e. give host device candidates and wrong-sided candidates equal preference. For explicit
host device functions, favor host device candidates against wrong-sided candidates.

The rationale is that explicit host device functions are blessed by the user to be valid host device functions,
that is, they should not cause diagnostics in both host and device compilation. If diagnostics occur, user is
able to fix them. However, there is no guarantee that implicit host device function can be compiled in
device compilation, therefore we need to preserve its overloading resolution in device compilation.

Differential Revision: https://reviews.llvm.org/D79526
2020-05-12 08:27:50 -04:00
Sam Parker f1f8cffce4 [NFC][AArch64] More casts tests...
Don't use truncs are users because sometimes they're free too.
2020-05-12 13:06:17 +01:00
Simon Pilgrim 45aa1b8853 [X86][AVX] Use X86ISD::VPERM2X128 for blend-with-zero if optimizing for size
Last part of PR22984 - avoid the zero-register dependency if optimizing for size
2020-05-12 13:03:50 +01:00
Simon Pilgrim 24ac6a2d7d FuzzerCLI.h - reduce StringRef.h include to forward declaration. NFC. 2020-05-12 13:03:50 +01:00
Simon Pilgrim e143253fa8 DebugCounter.h - remove unused includes. NFC.
Added explicit StringRef.h include as we need the full definition for several inline functions in DebugCounter.h.
2020-05-12 13:03:49 +01:00
Pierre-vh 24bf8063d6 [Target][ARM] Replace outdated getARMVPTBlockMask function
getARMVPTBlockMask was an outdated function that only handled basic
block masks: T, TT, TTT and TTTT. This worked fine before the MVE
VPT Block Insertion Pass improvements as it was the only kind of
masks that it could generate, but now it can generate more complex
masks that uses E predicates, so it's dangerous to use that function
to calculate VPT/VPST block masks.

I replaced it with 2 different functions:
  - expandPredBlockMask, in ARMBaseInfo. This adds an "E" or "T" at
    the end of an existing PredBlockMask.
  - recomputeVPTBlockMask, in Thumb2InstrInfo. This takes an iterator
    to a VPT/VPST instruction and recomputes its block mask by looking
    at the predicated instructions that follows it. This should be
    used to recompute a block mask after removing/adding a predicated
    instruction to the block.

The expandPredBlockMask function is pretty much imported from the MVE
VPT Blocks pass.

I had to change the ARMLowOverheadLoops and MVEVPTBlocks passes as well
so they could use these new functions.

Differential Revision: https://reviews.llvm.org/D78201
2020-05-12 12:10:15 +01:00
Pierre-vh bf2183374a [Target][ARM] Replace re-uses of old VPR values with VPNOTs
Differential Revision: https://reviews.llvm.org/D76847
2020-05-12 12:09:57 +01:00
David Zarzycki 9e32bf550d [libcxx testing] Remove ALLOW_RETRIES from sleep_for.pass.cpp
Operating systems are best effort by default, so we cannot assume that
sleep-like APIs return as soon as we'd like.

Even if a sleep-like API returns when we want it to, the potential for
preemption means that attempts to measure time are subject to delays.
2020-05-12 06:55:11 -04:00
Sander de Smalen 077d2d6802 [CodeGen][SVE] Add patterns for whole vector predicate select
Added patterns to implement `select i1 %p, <vty> %a, <vty> %b`

Reviewed By: efriedma

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D79356
2020-05-12 11:47:39 +01:00
Jim Lin 9d6064ec49 Revert "[RISCV] Make CanLowerReturn protected for downstream maintenance"
This reverts commit d775841d7d.
2020-05-12 18:49:17 +08:00
Sam Parker e114bdf072 [NFC][AArch64] More cast cost tests
Add truncating stores and casts with users.
2020-05-12 11:32:52 +01:00
Sander de Smalen d6936be2ef [SveEmitter] Add builtins for svdup and svindex
Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D79357
2020-05-12 11:02:32 +01:00
Petre-Ionut Tudor 9682d0d5dc [ARM] Refactor lower to S[LR]I optimization
Summary:
The optimization has been refactored to fix certain bugs and
limitations. The condition for lowering to S[LR]I has been changed
to reflect the manual pseudocode description of SLI and SRI operation.
The optimization can now handle more cases of operand type and order.

Subscribers: kristof.beyls, hiraditya, danielkiss, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D79233
2020-05-12 11:00:13 +01:00
Sam Parker b4a8091a11 [ARM][CostModel] Improve getCastInstrCost
- Specifically check for sext/zext users which have 'long' form NEON
  instructions.
- Add more entries to the table for sext/zexts so that we can report
  more accurately the number of vmovls required for NEON.
- Pass the instruction to the pass implementation.

Differential Revision: https://reviews.llvm.org/D79561
2020-05-12 10:32:20 +01:00
Sam Parker 1952c86d61 [AArch64][CostModel] getCastInstrCost
Pass the instruction to the base implementation.

Differential Revision: https://reviews.llvm.org/D79562
2020-05-12 10:02:29 +01:00
Manoel Roemmer 6b9e43c67e [Openmp][VE] Libomptarget plugin for NEC SX-Aurora
This patch adds a libomptarget plugin for the NEC SX-Aurora TSUBASA Vector
Engine (VE target).  The code is largely based on the existing generic-elf
plugin and uses the NEC VEO and VEOSINFO libraries for offloading.

Differential Revision: https://reviews.llvm.org/D76843
2020-05-12 10:47:30 +02:00
Haojian Wu 40ef427460 get rid of the NDEBUG usage in RecoveryExpr, NFC.
use the llvm::all_of, per dblaikie's suggestion.
2020-05-12 10:19:58 +02:00
Sam Parker 494c7ecef9 [NFC][AArch64] Update tests
Add cost model tests for extending loads.
2020-05-12 08:49:05 +01:00
Eric Christopher a42e53cccf Fix typos encountered while working on pass pipeline for O1. 2020-05-12 00:45:15 -07:00
Djordje Todorovic 8b7b84e99d Revert "[NFC][DwarfDebug] Prefer explicit to auto type deduction"
This wasn't proposed by the LLVM Style Guide.
Please see https://reviews.llvm.org/D79624.

This reverts commit rG2552dc5317e0.
2020-05-12 09:44:31 +02:00
Djordje Todorovic 41ca605813 Revert "[NFC][DwarfDebug] Avoid default capturing when using lambdas"
Reverting this because we found it isn't that useful.
Please see https://reviews.llvm.org/D79616.

This reverts commit rG45e5a32a8bd3.
2020-05-12 09:37:28 +02:00
Jonas Paulsson 57feff93a8 [SystemZ] Improve foldMemoryOperandImpl: vec->FP conversions
Use FP-mem instructions when folding reloads into single lane (W..) vector
instructions.

Only do this when all other operands of the instruction have already been
allocated to an FP (F0-F15) register.

Review: Ulrich Weigand

Differential Revision: https://reviews.llvm.org/D76705
2020-05-12 09:21:24 +02:00
David Sherwood 42c7a6d52b [CodeGen] Fix incorrect uses of getVectorNumElements()
I have fixed up some places in SelectionDAG::getNode() where we
used to assert that the number of vector elements for two types
are the same. I have changed such cases to assert that the
element counts are the same instead. I've added new tests that
exercise the code paths for all the truncations. All the extend
operations are covered by this existing test:

  CodeGen/AArch64/sve-sext-zext.ll

For the ISD::SETCC case I fixed this code path is exercised by
these existing tests:

  CodeGen/AArch64/sve-fcmp.ll
  CodeGen/AArch64/sve-intrinsics-int-compares-with-imm.ll

Differential Revision: https://reviews.llvm.org/D79399
2020-05-12 07:50:37 +01:00
Muhammad Omair Javaid 054ed1fd0b [LLDB] Disable TestBasicEntryValues.py for arm
TestBasicEntryValues.py fails on arm 32 bit. Currently running on silent master here:
http://lab.llvm.org:8014/builders/lldb-arm-ubuntu/
2020-05-12 11:32:58 +05:00
Nathan Ridge 5a7276b354 [clangd] Have suppression comments take precedence over warning-as-error
Summary: This matches the clang-tidy behaviour.

Fixes https://github.com/clangd/clangd/issues/375

Subscribers: ilya-biryukov, MaskRay, jkorous, arphaman, kadircet, usaxena95, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D79691
2020-05-12 02:29:03 -04:00
Eric Christopher 84a9c72574 Temporarily Revert "[mlir][shape] Tidy up shape.shape_of" as it's breaking a few tests.
This reverts commit b604544886.

Followed up offline with a testcase.
2020-05-11 23:05:18 -07:00
Jim Lin d775841d7d [RISCV] Make CanLowerReturn protected for downstream maintenance
Summary: For the downstream RISCV maintenance, it would be easier to override and reuse CanLowerReturn for customizing.

Reviewers: asb, lenary, luismarques

Reviewed By: lenary

Subscribers: hiraditya, rbar, johnrusso, simoncook, sabuasal, niosHD, kito-cheng, shiva0217, jrtc27, MaskRay, zzheng, edward-jones, rogfer01, MartinMosbeck, brucehoult, the_o, rkruppe, PkmX, jocewei, psnobl, benna, s.egerton, pzheng, sameer.abuasal, apazos, evandro, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D78545
2020-05-12 13:50:42 +08:00
Qiu Chaofan e8d2ff22f0 [PowerPC] Add fma/fsqrt/fmax strict-fp intrinsics
This patch adds strict-fp intrinsics support for fma, fsqrt, fmaxnum and
fminnum on PowerPC.

Reviewed By: hfinkel

Differential Revision: https://reviews.llvm.org/D72749
2020-05-12 13:44:09 +08:00
zoecarver 5eb55483eb Revert "[libcxx] shared_ptr changes from library fundamentals (P0414R2)."
This reverts commit e8c13c182a.
2020-05-11 22:43:17 -07:00
Fangrui Song f98709a982 [gcov] Fix big-endian problems
In a big-endian .gcda file, the first four bytes are "gcda" instead of "adcg".
All 32-bit values are in big-endian.

With this change, libclang_rt.profile can hopefully produce gcov
compatible output.
2020-05-11 22:36:46 -07:00
Fangrui Song 4c684b91d5 Revert part of D49132 "[gcov] Fix gcov profiling on big-endian machines"
D49132 is partially correct. For 64-bit values, the lower 32-bit part comes
before the higher 32-bit part (in a little-endian manner).

For 32-bit values, libgcov reads/writes 32-bit values in native endianness.
2020-05-11 22:27:01 -07:00
Martin Storsjö 1f707cc990 Partially revert "[CMake] Fix building with -DBUILD_SHARED_LIBS=ON on mingw"
This reverts parts of commit 609ef94838,
as it caused build failures on windows if LLVM_BUILD_EXAMPLES was
enabled, due to Bye being added as a dependency of the lit tests.
2020-05-12 08:20:44 +03:00
Sourabh Singh Tomar 93aee9ca86 [DWARF5]: Added support for dumping strx forms in llvm-dwarfdump
This patch adds support for dumping DW_MACRO_define_strx,
DW_MACRO_undef_strx in llvm-dwarfdump. These forms are currently
supported only in debug_macro section.

Reviewed By: ikudrin, dblaikie

Differential Revision: https://reviews.llvm.org/D78736
2020-05-12 10:29:18 +05:30
Fangrui Song 013f06703e [gcov] Emit GCOV_TAG_OBJECT_SUMMARY/GCOV_TAG_PROGRAM_SUMMARY correctly and fix llvm-cov's decoding of runcount
gcov 9 (r264462) started to use GCOV_TAG_OBJECT_SUMMARY. Before,
GCOV_TAG_PROGRAM_SUMMARY was used.
libclang_rt.profile should emit just one tag according to the version.

Another bug introduced by rL194499 is that the wrong runcount field was
selected.

Fix the two bugs so that gcov can correctly decode "Runs:" from
libclang_rt.profile produced .gcda files, and llvm-cov gcov can
correctly decode "Runs:" from libgcov produced .gcda files.
2020-05-11 21:53:53 -07:00
Wang, Pengfei 2e9f1153c5 [x86/SLH][NFC] Add a test to produce a failed generation. 2020-05-12 11:43:20 +08:00
aartbik 40f56c8cf1 [mlir] [VectorOps] Replace zero-scalar + splat into direct zero vector constant
Summary:
The scalar zero + splat yields more intermediate code than the direct
dense zero constant, and ultimately is lowered to exactly the same
LLVM IR operations, so no point wasting the intermediate code.

Reviewers: nicolasvasilache, andydavis1, reidtatge

Reviewed By: nicolasvasilache

Subscribers: mehdi_amini, rriddle, jpienaar, shauheen, antiagainst, nicolasvasilache, arpith-jacob, mgester, lucyrfox, liufengdb, stephenneuendorffer, Joonsoo, grosul1, frgossen, Kayjukh, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D79758
2020-05-11 20:20:37 -07:00
Jason Molenda 2b8b783b1a Quote error string from qLaunchSuccess
If the error message from qLaunchSucess included a gdb RSP
metacharacter, it could crash lldb.  Apply the binary
escaping to the string before sending it to lldb; lldb
promiscuously applies the binary escaping protocol on
packets it receives.

Also fix a small bug in cstring_to_asciihex_string where
a high bit character (eg utf-8 chars) would not be
quoted correctly due to signed char fun.

Differential Revision: https://reviews.llvm.org/D79614

rdar://problem/62873581
2020-05-11 20:05:57 -07:00
Eric Christopher 59a299cbb3 Fix a release+noasserts werror for unused variable. 2020-05-11 20:03:23 -07:00
Eric Christopher eb81de2de4 Temporarily Revert "[lld-macho] Re-add dylink-lazy test" as it
appears to be still failing.

This reverts commit 723c46e645.
2020-05-11 19:47:21 -07:00
zoecarver e8c13c182a [libcxx] shared_ptr changes from library fundamentals (P0414R2).
Implements P0414R2:
  * Adds support for array types in std::shared_ptr.
  * Adds reinterpret_pointer_cast for shared_ptr.

Differential Revision: https://reviews.llvm.org/D62259
2020-05-11 18:46:29 -07:00
Joel E. Denny 2aa0217add [FileCheck] Make invalid prefix diagnostics more precise
This will prove especially helpful after D79276, which introduces
comment prefixes.  Specifically, identifying whether there's a
uniqueness violation will be helpful as prefixes will be required to
be unique across both check prefixes and comment prefixes.

Also, remove a related comment about `cl::list` that no longer seems
relevant now that FileCheck is also a library.

Reviewed By: jhenderson, thopre

Differential Revision: https://reviews.llvm.org/D79375
2020-05-11 21:11:58 -04:00
Austin Kerbow 1429e4c399 [AMDGPU][GlobalISel] Revise handling of wide loads in RegBankSelect
When splitting loads in RegBankSelect G_EXTRACT_VECTOR_ELT were being added
which could not be selected. Since invoking the legalizer will generate
instructions that split and combine wide loads, we can remove the redundant
repair instructions which are added by RegBankSelect.

Differential Revision: https://reviews.llvm.org/D75547
2020-05-11 18:10:16 -07:00
Nico Weber 91259bf9c6 [gn build] Use relative paths in generated lit.site.cfg.py files for llvm and clang.
This ports a16ba6fea2 to the GN build.

No intended behavior change.
2020-05-11 20:58:45 -04:00
Kazu Hirata 0205fabe5d [Inlining] Make shouldBeDeferred static (NFC)
Summary:
This patch makes shouldBeDeferred static because it is called only
from shouldInline in the same .cpp file.

Reviewers: davidxl, mtrofin

Reviewed By: mtrofin

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D79750
2020-05-11 17:43:31 -07:00
Eli Friedman c9c930ae67 [SelectionDAG] Don't promote the alignment of allocas beyond the stack alignment.
allocas in LLVM IR have a specified alignment. When that alignment is
specified, the alloca has at least that alignment at runtime.

If the specified type of the alloca has a higher preferred alignment,
SelectionDAG currently ignores that specified alignment, and increases
the alignment. It does this even if it would trigger stack realignment.
I don't think this makes sense, so this patch changes that.

I was looking into this for SVE in particular: for SVE, overaligning
vscale'ed types is extra expensive because it requires realigning the
stack multiple times, or using dynamic allocation. (This currently isn't
implemented.)

I updated the expected assembly for a couple tests; in particular, for
arg-copy-elide.ll, the optimization in question does not increase the
alignment the way SelectionDAG normally would. For the rest, I just
increased the specified alignment on the allocas to match what
SelectionDAG was inferring.

Differential Revision: https://reviews.llvm.org/D79532
2020-05-11 17:39:00 -07:00
Saiyedul Islam 117e5609e9 [AMDGPU] Reserving VGPR for future SGPR Spill
Summary: One VGPR register is allocated to handle a future spill of SGPR if "--amdgpu-reserve-vgpr-for-sgpr-spill" option is used

Reviewers: arsenm, rampitec, msearles, cdevadas

Reviewed By: arsenm

Subscribers: madhur13490, qcolombet, kerbowa, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, llvm-commits

Tags: #amdgpu, #llvm

Differential Revision: https://reviews.llvm.org/D70379
2020-05-12 00:33:00 +00:00
Eli Friedman a8874c76e8 [AArch64][SVE] Add patterns for VSELECT of immediates.
This covers forms involving "CPY (immediate, zeroing)".

This doesn't handle the case where the operands are reversed, and the
condition is freely invertible.  Not sure how to handle that.  Maybe a
DAGCombine.

Differential Revision: https://reviews.llvm.org/D79598
2020-05-11 17:04:22 -07:00
Rahul Joshi 5633813bf3 [MLIR] Fix several misc issues in in Toy tutorial
Summary:
- Fix comments in several places
- Eliminate extra ' in AST dump and adjust tests accordingly

Differential Revision: https://reviews.llvm.org/D78399
2020-05-11 16:56:47 -07:00
Austin Kerbow 09253b608a [AMDGPU] Allow spilling FP to memory
If there are no available lanes in a reserved VGPR, no free SGPR, and no unused CSR
VGPR when trying to save the FP it needs to be spilled to memory as a last
resort. This can be done in the prolog/epilog if we manually add the spill
and manage exec.

Differential Revision: https://reviews.llvm.org/D79610
2020-05-11 16:42:59 -07:00