Commit Graph

190961 Commits

Author SHA1 Message Date
Johannes Doerfert b6dbd0f71f [Attributor][NFC] Internalize helper function 2020-01-28 22:50:34 -06:00
Benjamin Kramer 49ad3f6143 One more bugpoitn fix for GCC5 2020-01-29 03:42:02 +01:00
Benjamin Kramer 42a25e7fe6 Try harder to fix bugpoint with GCC5 2020-01-29 03:30:47 +01:00
Benjamin Kramer cd87e207ec Make bugpoint work with gcc5 again. 2020-01-29 03:11:00 +01:00
Benjamin Kramer bd31243a34 Fix more implicit conversions. Getting closer to having clang working with gcc 5 again 2020-01-29 02:57:59 +01:00
Benjamin Kramer bb39b52950 Fix conversions in clang and examples 2020-01-29 02:48:15 +01:00
Nate Voorhies a40b3e3b61 [NFC] Fix unused variable warning.
Reviewers: dschuff

Reviewed By: dschuff

Subscribers: hiraditya, aheejin, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D73591
2020-01-28 17:19:23 -08:00
Vedant Kumar 8359511c62 [CodeExtractor] Remove stale llvm.assume calls from extracted region
During extraction, stale llvm.assume handles may be retained in the
original function. The setup is:

1) CodeExtractor unregisters assumptions in the blocks that are to be
   extracted.

2) Extraction happens. There are now two functions: f1 and f1.extracted.

3) Leftover assumptions in f1 (/not/ removed as they were not in the set of
   blocks to be extracted) now have affected-value llvm.assume handles in
   f1.extracted.

When assumptions for a value used in f1 are looked up, ValueTracking can assert
as some of the handles are in the wrong function. To fix this, simply erase the
llvm.assume calls in the extracted function.

Alternatives include flushing the assumption cache in the original function, or
walking all values used in the original function to prune stale affected-value
handles. Both seem more expensive.

Testing: check-llvm, LNT run with -mllvm -hot-cold-split enabled

rdar://58460728
2020-01-28 17:18:01 -08:00
Benjamin Kramer 2d92336db0 Another stab at making the gold plugin compile again 2020-01-29 02:12:53 +01:00
Benjamin Kramer a9bc7b83a4 Another round of GCC5 fixes. 2020-01-29 02:09:24 +01:00
Derek Schuff d966bf830f [WebAssembly] Preserve debug frame base information through register coloring
2 fixes:

Register coloring can re-assign virtual registers. When the frame base register
is colored, update the DwarfFrameBase accordingly When the frame base register
is stackified, do not attempt to encode DW_AT_frame_base as a local In the
future we will presumably want to handle this case better but for now we can
emit worse debug info rather than crashing.

Differential Revision: https://reviews.llvm.org/D73581
2020-01-28 16:58:15 -08:00
Benjamin Kramer 735f90fe42 Fix one round of implicit conversions found by g++5. 2020-01-29 01:52:48 +01:00
Craig Topper ca2abea29a [X86] Use SelectionDAG::getZExtOrTrunc to simplify some code. NFCI 2020-01-28 16:27:59 -08:00
Craig Topper 92ecc306af [X86] Add test case for llvm.flt.rounds 2020-01-28 16:27:59 -08:00
Nico Weber ce70eb76ea Fix AVR build after 777180a32b 2020-01-28 19:22:22 -05:00
Benjamin Kramer 8b6320c79d Address implicit conversions detected by g++ 5 only. 2020-01-29 01:01:09 +01:00
Benjamin Kramer ddf77f10a3 One more batch of things found by g++ 6 2020-01-29 00:50:34 +01:00
Eli Friedman 2f6b9edfa8 [AliasAnalysis] Add missing FMRB_* enums.
Previously, the enums didn't account for all the possible cases, which
could cause misleading results (particularly for a "switch" on
FunctionModRefBehavior).

Fixes regression in polly from recent patch to add writeonly to memset.

While I'm here, also fix a few dubious uses of the FMRB_* enum values.

Differential Revision: https://reviews.llvm.org/D73154
2020-01-28 15:47:08 -08:00
Benjamin Kramer 0d401fa36b Fix a couple more implicit conversions that Clang doesn't diagnose. 2020-01-29 00:42:56 +01:00
Benjamin Kramer 5976067d2c A bunch more implicit string conversions that my Clang didn't detect. 2020-01-29 00:30:16 +01:00
Benjamin Kramer 05c19705d8 [tblgen] Fix implicit conversion only diagnosed by g++ 6 2020-01-29 00:25:44 +01:00
Jonas Devlieghere 00d834e087 Fix more implicit conversions 2020-01-28 15:19:27 -08:00
Benjamin Kramer c9909c22fe Fix implicit conversions in example code. 2020-01-29 00:17:35 +01:00
Benjamin Kramer 159709f04f [Support] Fix implicit std::string conversions on Win32. 2020-01-29 00:02:26 +01:00
Benjamin Kramer 777180a32b [ADT] Make StringRef's std::string conversion operator explicit
This has the same behavior as converting std::string_view to
std::string. This is an expensive conversion, so explicit conversions
are helpful for avoiding unneccessary string copies.
2020-01-28 23:47:07 +01:00
Shoaib Meenai 2e745ba6b0 [runtimes] Fix passing lists to runtimes configures
We have to replace the ";" with "|" (since LLVMExternalProjectUtils uses
"|" as the `LIST_SEPARATOR` when invoking `ExternalProject_Add`) in
order for lists to be passed correctly to the runtimes CMake configures.
Remove the special case for `LLVM_ENABLE_RUNTIMES`, since it'll just get
handled by the general logic now.

Differential Revision: https://reviews.llvm.org/D73512
2020-01-28 14:36:14 -08:00
Benjamin Kramer adcd026838 Make llvm::StringRef to std::string conversions explicit.
This is how it should've been and brings it more in line with
std::string_view. There should be no functional change here.

This is mostly mechanical from a custom clang-tidy check, with a lot of
manual fixups. It uncovers a lot of minor inefficiencies.

This doesn't actually modify StringRef yet, I'll do that in a follow-up.
2020-01-28 23:25:25 +01:00
Danilo Carvalho Grael 1f85dfb2af [AArch64][SVE] Add SVE2 mla indexed intrinsics.
Summary:
Add SVE2 mla indexed intrinsics:
- smlalb, smalalt, umlalb, umlalt, smlslb, smlslt, umlslb, umlslt.

Reviewers: efriedma, sdesmalen, dancgr, cameron.mcinally, c-rhodes, rengolin

Subscribers: tschuett, kristof.beyls, hiraditya, rkruppe, arphaman, psnobl, llvm-commits, amehsan

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D73576
2020-01-28 17:15:33 -05:00
Jessica Paquette dba29f7c3b [AArch64][GlobalISel] Fold G_AND into G_BRCOND
When the G_BRCOND is fed by a eq or ne G_ICMP, it may be possible to fold a
G_AND into the branch by producing a tbnz/tbz instead.

This happens when

  1. We have a ne/eq G_ICMP feeding into the G_BRCOND
  2. The G_ICMP is a comparison against 0
  3. One of the operands of the G_AND is a power of 2 constant

This is very similar to the code in AArch64TargetLowering::LowerBR_CC.

Add opt-and-tbnz-tbz to test this.

Differential Revision: https://reviews.llvm.org/D73573
2020-01-28 14:00:31 -08:00
Mircea Trofin 7f434b91a9 [llvm] Ensure InlineCost-related fields are initialized
Summary: Small fix - never hurts to have things initialized.

Reviewers: davidxl, eraman

Reviewed By: davidxl

Subscribers: haicheng, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D73420
2020-01-28 13:38:45 -08:00
Michael Spang a2fb2c0ddc [GlobalMerge] Preserve symbol visibility when merging globals
Symbols created for merged external global variables have default
visibility. This can break programs when compiling with -Oz
-fvisibility=hidden as symbols that should be hidden will be exported at
link time.

Differential Revision: https://reviews.llvm.org/D73235
2020-01-28 13:26:18 -08:00
Nico Weber 776937c3e8 [gn build] (manually) port 90a10f00ff 2020-01-28 16:00:54 -05:00
Alina Sbirlea 4aa8cdfeeb [LoopUnrollAndJamPass] Clean unnecessary includes. [NFCI] 2020-01-28 12:31:08 -08:00
Whitney Tsang cd0cff4392 [NFCI][LoopUnrollAndJam] Minor changes.
Summary:
1. Add assertions.
2. Verify more analyses.
These changes are moved out of https://reviews.llvm.org/D73129 to
simplify that review.

Reviewer: dmgreen, jdoerfert, Meinersbur, kbarton, bmahjour, etiotto
Reviewed By: dmgreen
Subscribers: fhahn, hiraditya, zzheng, llvm-commits, prithayan, anhtuyen
Tag: LLVM
Differential Revision: https://reviews.llvm.org/D73204
2020-01-28 20:24:23 +00:00
Petr Hosek 127d3abf25 [Instrumentation] Set hidden visibility for the bias variable
We have to avoid using a GOT relocation to access the bias variable,
setting the hidden visibility achieves that.

Differential Revision: https://reviews.llvm.org/D73529
2020-01-28 12:07:03 -08:00
Alexandre Ganea b8c39e9462 Fix compiling with clang-cl inside a Visual Studio 2019 16.4 command prompt.
This was introduced by 0d17410e91 and was preventing from compiling with clang-cl on Windows.
The problem was that clang-cl detects the triple from the current env vars (was x86_64-pc-windows-msvc19.24.28315 for me, as I happen to always run inside a VS2019 cmd prompt).
2020-01-28 14:57:57 -05:00
Sanjay Patel 7a717d82ff [InstCombine] refactor foldVectorCmp(); NFC
We can handle other patterns here as shown in PR44588.
2020-01-28 14:40:48 -05:00
Petr Hosek 56b7f595d2 [CMake] Set ASM compiler for external projects
This is necessary on Windows, otherwise CMake fails. It's not
conventional on Windows to use cl for assembly (you'd use ml or ml64
instead), but CMake has a separate ASM_MASM mode for that, and clang-cl
works fine for assembly so we'll use that on Windows for consistency.

Differential Revision: https://reviews.llvm.org/D73522
2020-01-28 11:39:21 -08:00
Roland McGrath 2b0e6fe2e2 [Fuchsia] Remove aarch64-fuchsia target-specific -mcmodel=kernel
Under --target=aarch64-fuchsia, -mcmodel=kernel has the effect of
(the default) -mcmodel=small plus -mtp=el1 (which did not exist when
this behavior was added). Fuchsia's kernel now uses -mtp=el1
directly instead of -mcmodel=kernel, so remove this special support.

Patch By: mcgrathr

Differential Revision: https://reviews.llvm.org/D73409
2020-01-28 11:32:08 -08:00
David Blaikie b96e6859c9 llvm-symbolizer test: Add a bit of extra detail on how to compile/reproduce this
The details are also in the .test file, but doesn't hurt to make it a
bit clearer.
2020-01-28 11:07:47 -08:00
LLVM GN Syncbot b8461fc0c7 [gn build] Port 2c03c899d5 2020-01-28 18:59:31 +00:00
Hiroshi Yamauchi 2c03c899d5 [MBFI] Move BranchFolding::MBFIWrapper to its own files. NFC.
Summary:
To avoid header file circular dependency issues in passing updated MBFI (in
MBFIWrapper) to the interface of profile guided size optimizations.

A prep step for (and split off of) D73381.

Reviewers: davidxl

Subscribers: mgorny, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D73494
2020-01-28 10:58:46 -08:00
Kristina Bessonova 4b0a7fe008 [llvm-dwarfdump][Statistics] Make calculations of vars in global scope more accurate
It isn't known how many times we've seen the same variable or member in
the global scope (unlike in functions), but there still can be some duplicates
among different CUs.
So, this patch proposes to count variables in the global scope just as a sum of
the number of vars, constant members and artificial entities.

Reviewed by: aprantl

Differential Revision: https://reviews.llvm.org/D73004
2020-01-28 20:52:20 +02:00
Kristina Bessonova 5499e2f455 [llvm-dwarfdump][Statistics] Distinguish parameters with same name or w/o a name
A few DW_TAG_formal_parameter's of the same function may have the same
name (e.g. variadic (template) functions) or don't have a name at all
(if the parameter isn't used inside the function body), but we still
need to be able to distinguish between them to get correct number of 'total vars'
and 'availability' metric.

Reviewed by: aprantl

Differential Revision: https://reviews.llvm.org/D73003
2020-01-28 20:52:20 +02:00
Kristina Bessonova 57839e5178 [llvm-dwarfdump][Statistics] Count more than one conrete out-of-line instances of a function
Here may be more than one out-of-line instance of the same function
among different CUs. All of them should be accounted for to get an accurate
total number of variables/parameters.

Reviewed by: aprantl

Differential Revision: https://reviews.llvm.org/D73002
2020-01-28 20:52:19 +02:00
Sanjay Patel 276a6b8889 [InstCombine] add tests for cmp with splat operand and splat constant; NFC
See PR44588:
https://bugs.llvm.org/show_bug.cgi?id=44588
2020-01-28 13:42:20 -05:00
LLVM GN Syncbot e7d5a8d0b4 [gn build] Port a928d127a5 2020-01-28 18:39:09 +00:00
Amara Emerson 14c2cf8e18 [AArch64][GlobalISel] Don't bail out of the select(cmp(a, b)) -> csel optimization with multiple users.
It can still be beneficial to do the optimization if the result of the compare
is used by *another* select.

Differential Revision: https://reviews.llvm.org/D73511
2020-01-28 10:09:03 -08:00
Derek Schuff da6a896e6b [WebAssembly] Add WebAssembly support to llvm-symbolizer
The only thing missing for basic llvm-symbolizer support is the ability on
lib/Object to get a wasm symbol's section ID, which allows sorting and
computation of the symbols' sizes.

Also, when the WasmAsmParser switches sections on new functions, also add the
section to the list of Dwarf sections if Dwarf is being generated for assembly;
this allows writing of simple tests.

Reviewers: sbc100, jhenderson, aardappel

Differential Revision: https://reviews.llvm.org/D73246
2020-01-28 09:55:38 -08:00
Kristina Bessonova 2e5d20bd47 [llvm-dwarfdump][Statistics] Ignore declarations of global variables
Reviewed by: djtodoro

Differential Revision: https://reviews.llvm.org/D73001
2020-01-28 19:50:46 +02:00
Kristina Bessonova e76106e01c [llvm-dwarfdump][Statistics] Ignore DW_TAG_subroutine_type in statistics
DW_TAG_subroutine_type is not really useful for statistics purposes, as it never
has location information. But it may contain DW_TAG_formal_parameter
children that generate number of parameters w/o location and decrease
'availability' metric significantly.

Reviewed by: djtodoro

Differential Revision: https://reviews.llvm.org/D72983
2020-01-28 19:50:46 +02:00
Kristina Bessonova 9806b39dae [llvm-dwarfdump][Statistics] Distinguish functions/variables with same name across different CUs
Different variables and functions might have the same name in different CU.
To calculate 'Availability' metric more accurate (i.e. to avoid getting
availability above 100%), we need to have some additional logic to
distinguish between them.

The patch introduces a DIE identifier that consists of a function/variable name
and declaration information: a filename and a line number. This allows
distinguishing different functions/variables (different means declared in
different files/lines) with the same name, keeping duplicates counted
as duplicates.

Reviewed by: aprantl, djtodoro

Differential Revision: https://reviews.llvm.org/D72797
2020-01-28 19:50:46 +02:00
Derek Schuff a928d127a5 [llvm-objcopy] Initial support for wasm in llvm-objcopy
Currently only supports simple copying, other operations to follow.

Reviewers: sbc100, alexshap, jhenderson

Differential Revision: https://reviews.llvm.org/D70930
2020-01-28 09:47:16 -08:00
Florian Hahn 5d0ffbeb4d [Matrix] Mark expressions shared between multiple remarks.
This patch adds support for explicitly highlighting sub-expressions
shared by multiple leaf nodes. For example consider the following
code

  %shared.load = tail call <8 x double> @llvm.matrix.columnwise.load.v8f64.p0f64(double* %arg1, i32 %stride, i32 2, i32 4), !dbg !10, !noalias !10
  %trans = tail call <8 x double> @llvm.matrix.transpose.v8f64(<8 x double> %shared.load, i32 2, i32 4), !dbg !10
  tail call void @llvm.matrix.columnwise.store.v8f64.p0f64(<8 x double> %trans, double* %arg3, i32 10, i32 4, i32 2), !dbg !10
  %load.2 = tail call <30 x double> @llvm.matrix.columnwise.load.v30f64.p0f64(double* %arg3, i32 %stride, i32 2, i32 15), !dbg !10, !noalias !10
  %mult = tail call <60 x double> @llvm.matrix.multiply.v60f64.v8f64.v30f64(<8 x double> %trans, <30 x double> %load.2, i32 4, i32 2, i32 15), !dbg !11
  tail call void @llvm.matrix.columnwise.store.v60f64.p0f64(<60 x double> %mult, double* %arg2, i32 10, i32 4, i32 15), !dbg !11

We have two leaf nodes (the 2 stores) and the first store stores %trans
which is also used by the matrix multiply %mult. We generate separate
remarks for each leaf (stores). To denote that parts are shared, the
shared expressions are marked as shared (), with a reference to the
other remark that shares it. The operation summary also denotes the
shared operations separately.

Reviewers: anemet, Gerolf, thegameg, hfinkel, andrew.w.kaylor, LuoYuanke

Reviewed By: anemet

Differential Revision: https://reviews.llvm.org/D72526
2020-01-28 09:27:55 -08:00
Jonathan Roelofs 7f93ff58e1 [llvm] Fix broken cases of 'CHECK[^:]*$' in tests 2020-01-28 09:52:59 -07:00
Florian Hahn d1f849a284 [LV] Hoist code to mark conditional assumes as dead to caller (NFC).
This is a follow-up suggested in D73423. It is sufficient to just add
the conditional assumes to DeadInstructions once.
2020-01-28 08:50:44 -08:00
Konstantin Pyzhov 6d614a82a4 Summary:
This CL adds clang declarations of built-in functions for AMDGPU MFMA intrinsics and instructions.
OpenCL tests for new built-ins are included.

Differential Revision: https://reviews.llvm.org/D72723
2020-01-28 03:51:27 -05:00
Michael Liao 9c54b42338 Fix warning of `-Wcast-qual`. NFC. 2020-01-28 11:36:43 -05:00
Florian Hahn a911fef3dd [LV] Do not try to sink dead instructions.
Dead instructions do not need to be sunk. Currently we try and record
the recipies for them, but there are no recipes emitted for them and
there's nothing to sink. They can be removed from SinkAfter while
marking them for recording.

Fixes PR44634.

Reviewers: rengolin, hsaito, fhahn, Ayal, gilr

Reviewed By: gilr

Differential Revision: https://reviews.llvm.org/D73423
2020-01-28 08:28:03 -08:00
LLVM GN Syncbot dc5777e514 [gn build] Port a32f894f17 2020-01-28 15:50:50 +00:00
Jonathan Roelofs a32f894f17 [ADT] Remove more llvm::make_unique
https://reviews.llvm.org/D73316
2020-01-28 08:48:50 -07:00
Nico Weber 0d17410e91 Prevent building with MSVC 14.24
MSVC 14.24 miscompiles some of LLVM's code, which makes at least these tests fail:

    LLVM :: MC/MachO/gen-dwarf-cpp.s
    LLVM :: MC/MachO/gen-dwarf-macro-cpp.s
    LLVM :: MC/MachO/gen-dwarf-producer.s
    LLVM :: MC/MachO/gen-dwarf.s

It seems better to diagnose that at build time. Since both the previous
and the next version have a fix, this might be good enough and we might
not need a real workaround. (We ran into this at
https://crbug.com/1045948)

If you hit this, use either a newer or an older version of MSVC,
or use clang-cl as host compiler.

Differential Revision: https://reviews.llvm.org/D73550
2020-01-28 10:11:06 -05:00
Victor Huang 4b414d9ade [PowerPC][Future] Add pld and pstd to future CPU
Add the prefixed instructions pld and pstd to future CPU. These are load and
store instructions that require new operand types that are 34 bits. This patch
adds the two instructions as well as the operand types required.

Note that this patch also makes a minor change to tablegen to account for the
fact that some instructions are going to require shifts greater than 31 bits
for the new 34 bit instructions.

Differential Revision: https://reviews.llvm.org/D72574
2020-01-28 08:23:29 -06:00
Whitney Tsang 78dc64989c [CodeMoverUtils] Improve IsControlFlowEquivalent.
Summary:
Currently IsControlFlowEquivalent determine if two blocks are control
flow equivalent by checking if A dominates B and B post dominates A.
There exists blocks that are control flow equivalent even if they don't
satisfy the A dominates B and B post dominates A condition.
For example,

if (cond)
  A
if (cond)
  B
In the PR, we determine if two blocks are control flow equivalent by
also checking if the two sets of conditions A and B depends on are
equivalent.
Reviewer: jdoerfert, Meinersbur, dmgreen, etiotto, bmahjour, fhahn,
hfinkel, kbarton
Reviewed By: fhahn
Subscribers: hiraditya, llvm-commits
Tag: LLVM
Differential Revision: https://reviews.llvm.org/D71578
2020-01-28 14:18:00 +00:00
Wang, Pengfei dba8cd5438 Fix sphinx build bot failure. NFCI. 2020-01-28 22:07:34 +08:00
Sam Parker 7ad879caa0 [NFC][RDA] typedef SmallPtrSetImpl<MachineInstr*> 2020-01-28 13:15:44 +00:00
Benjamin Kramer 2e4977965b [ADT] Implicitly convert between StringRef and std::string_view when we have C++17
This makes the types almost seamlessly interchangeable in C++17
codebases. Eventually we want to replace StringRef with the standard
type, but that requires C++17 being the default and a huge refactoring
job as StringRef has a lot more functionality.
2020-01-28 13:56:12 +01:00
Wang, Pengfei 3239b5034e [FPEnv] Add pragma FP_CONTRACT support under strict FP.
Summary: Support pragma FP_CONTRACT under strict FP.

Reviewers: craig.topper, andrew.w.kaylor, uweigand, RKSimon, LiuChen3

Subscribers: hiraditya, jdoerfert, cfe-commits, llvm-commits, LuoYuanke

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D72820
2020-01-28 20:43:43 +08:00
Miloš Stojanović 4c8817cddf [mips][NFC] Remove unused instruction formats
`BranchBase` unused sice: rL170663
`FI` unsused since: rL170954
`FFI` unused since: rL190221

Differential revision: https://reviews.llvm.org/D73489
2020-01-28 13:30:59 +01:00
Wang, Pengfei 3d1f0ce3b9 [X86] Add combination for fma and fneg on X86 under strict FP.
Summary: X86 has instructions to calculate fma and fneg at the same time. But we combine the fneg and fma only when fneg is the source operand under strict FP.

Reviewers: craig.topper, andrew.w.kaylor, uweigand, RKSimon, LiuChen3

Subscribers: LuoYuanke, llvm-commits, cfe-commits, jdoerfert, hiraditya

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D72824
2020-01-28 20:09:56 +08:00
James Henderson 5c05165984 Revert "[DebugInfo] Make most debug line prologue errors non-fatal to parsing"
This reverts commit b94191fecd.

The change broke both an LLD test and the LLDB build.
2020-01-28 11:49:30 +00:00
James Henderson b94191fecd [DebugInfo] Make most debug line prologue errors non-fatal to parsing
Many of the debug line prologue errors are not inherently fatal. In most
cases, we can make reasonable assumptions and carry on. This patch does
exactly that. In the case of length problems, the approach of "the
claimed length is correct" is taken to be consistent with other
instances such as the SectionParser, which ignores the read length.

Reviewed by: dblaikie

Differential Revision: https://reviews.llvm.org/D72158
2020-01-28 11:29:50 +00:00
Benjamin Kramer fba7574cb9 [docs] Clarify llvm.used semantics with less awkward wording 2020-01-28 12:13:57 +01:00
Jay Foad 4a331beadc [AMDGPU] Fix vccz after v_readlane/v_readfirstlane to vcc_lo/hi
Summary:
Up to gfx9, writes to vcc_lo and vcc_hi by instructions like
v_readlane and v_readfirstlane do not update vccz to reflect the new
value of vcc. Fix it by reusing part of the existing vccz bug handling
code, which inserts an "s_mov_b64 vcc, vcc" instruction to restore vccz
just before an instruction that needs the correct value.

Subscribers: arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D69661
2020-01-28 10:52:17 +00:00
Kazushi (Jam) Marukawa 92600c2ec8 [VE] call isel with stack passing
Summary:
Function calls and stack-passing of function arguments.
Custom lowering, isel patterns and tests.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D73461
2020-01-28 10:55:47 +01:00
Georgii Rymar cff7c149de [llvm-readobj][test] - Remove --symbols --dyn-syms part from Object/readobj-shared-object.test.
The intention of Object/readobj-shared-object.test was to check the
general output for shared object.

I've added a case for testing dynamic objects to ELF/symbols.test.
Also we already test dynamic symbols printing in ELF/dyn-symbols.test +
I've added a case for `--dyn-syms` alias in D73164.

Hence we can remove this piece from Object/readobj-shared-object.test.

Differential revision: https://reviews.llvm.org/D73175
2020-01-28 12:36:29 +03:00
Guillaume Chatelet d9bff3be99 Update tests for @llvm.memcpy.inline intrinsics 2020-01-28 10:32:43 +01:00
Guillaume Chatelet 5f87510c37 Fix failing bot 2020-01-28 10:20:55 +01:00
Kazushi (Jam) Marukawa 422dfea577 [VE] enable unaligned load/store isel
Summary: Enable unaligned load/store isel for iN and fp32/64 and tests.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D73448
2020-01-28 09:53:37 +01:00
Guillaume Chatelet 879c825cb8 [instrinsics] Add @llvm.memcpy.inline instrinsics
Summary:
This is a follow up on D61634. It adds an LLVM IR intrinsic to allow better implementation of memcpy from C++.
A follow up CL will add the intrinsics in Clang.

Reviewers: courbet, theraven, t.p.northover, jdoerfert, tejohnson

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D71710
2020-01-28 09:42:01 +01:00
Florian Hahn 6f07f304a2 [Matrix] Mark remarks test as AArch64 specific. 2020-01-27 18:00:43 -08:00
Florian Hahn 62e228f8fd [Matrix] Add info about number of operations to remarks.
This patch updates the remark to also include a summary of the number of
vector operations generated for each matrix expression.

Reviewers: anemet, Gerolf, thegameg, hfinkel, andrew.w.kaylor, LuoYuanke

Reviewed By: anemet

Differential Revision: https://reviews.llvm.org/D72480
2020-01-27 17:43:39 -08:00
Shoaib Meenai 3a5acdc963 [llvm] Fix file ignoring inside directories
We have some ! patterns in the .gitignore (for the projects and runtimes
directories), and those patterns end up overriding the previous file
ignores, such that e.g. a .swp file inside the runtimes directory isn't
ignored. Move the file ignores last to ensure they take effect.

Differential Revision: https://reviews.llvm.org/D73253
2020-01-27 17:00:33 -08:00
Shoaib Meenai a308b98ecb [runtimes] Support install-*-stripped targets
This is needed to support including runtime targets in
LLVM_DISTRIBUTION_COMPONENTS.

Differential Revision: https://reviews.llvm.org/D73252
2020-01-27 17:00:24 -08:00
Shoaib Meenai b1da8eba60 [runtimes] Fix installation for LLVM_RUNTIME_DISTRIBUTION_COMPONENTS
The installation target we create should trigger the corresponding
installation target in the runtimes external project.

Differential Revision: https://reviews.llvm.org/D73251
2020-01-27 17:00:18 -08:00
Wei Mi f60671f049 [LV] Remove nondeterminacy by changing LoopVectorizationLegality::Reductions
from DenseMap to MapVector

The iteration order of LoopVectorizationLegality::Reductions matters for the
final code generation, so we better use MapVector instead of DenseMap for it
to remove the nondeterminacy. reduction-order.ll in the patch is an example
reduced from the case we saw. In the output of opt command, the order of the
select instructions in the vector.body block keeps changing from run to run
currently.

Differential Revision: https://reviews.llvm.org/D73490
2020-01-27 16:53:20 -08:00
Florian Hahn 949294f396 [Matrix] Add optimization remarks for matrix expression.
Generate remarks for matrix operations in a function. To generate remarks
for matrix expressions, the following approach is used:
1. Collect leafs of matrix expressions (done in
   RemarkGenerator::getExpressionLeafs).  Leafs are lowered matrix
   instructions without other matrix users (like stores).

2. For each leaf, create a remark containing a linearizied version of the
   matrix expression.

The following improvements will be submitted as follow-ups:
* Summarize number of vector instructions generated for each expression.
* Account for shared sub-expressions.
* Propagate matrix remarks up the inlining chain.

The information provided by the matrix remarks helps users to spot cases
where matrix expression got split up, e.g. due to inlining not
happening. The remarks allow users to address those issues, ensuring
best performance.

Reviewers: anemet, Gerolf, thegameg, hfinkel, andrew.w.kaylor, LuoYuanke

Reviewed By: anemet

Differential Revision: https://reviews.llvm.org/D72453
2020-01-27 16:39:29 -08:00
Fangrui Song c7c5da6df3 Reland "[StackColoring] Remap PseudoSourceValue frame indices via MachineFunction::getPSVManager()""
Reland 7a8b0b1595, with a fix that checks
`!E.value().empty()` to avoid inserting a zero to SlotRemap.

Debugged by rnk@ in https://bugs.chromium.org/p/chromium/issues/detail?id=1045650#c33

Reviewed By: rnk

Differential Revision: https://reviews.llvm.org/D73510
2020-01-27 15:58:49 -08:00
Reid Kleckner 9521c18438 [IR] Keep a double break between functions when printing a module
This behavior appears to have changed unintentionally in
b0e979724f.

Instead of printing the leading newline in printFunction, print it when
printing a module. This ensures that `OS << *Func` starts printing
immediately on the current line, but whole modules are printed nicely.

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D73505
2020-01-27 15:31:09 -08:00
Reid Kleckner c7feb6b36a [WinEH] Re-run stack coloring test for i686
This would've caught https://crbug.com/1045650, which resulted in the
revert of 7a8b0b1595.
2020-01-27 15:26:03 -08:00
Evgenii Stepanov 34ab56904e Support zero size types in StackSafetyAnalysis.
Reviewers: vitalybuka

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D73395
2020-01-27 15:22:59 -08:00
Evgenii Stepanov c3b80adcee Fix StackSafetyAnalysis crash with scalable vector types.
Summary:
Treat scalable allocas as if they have storage size of 0, and
scalable-typed memory accesses as if their range is unlimited.

This is not a proper support of scalable vector types in the analysis -
we can do better, but not today.

Reviewers: vitalybuka

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D73394
2020-01-27 15:22:59 -08:00
Florian Hahn 8e3f59b45a [AArch64] Add option to enable/disable load-store renaming.
This patch adds a new option to enable/disable register renaming in the
load-store optimizer. Defaults to disabled, as there is a potential
mis-compile caused by this.
2020-01-27 15:15:50 -08:00
Eric Schweitz aca68feaad remove a trailing space character (test commit) 2020-01-27 15:01:55 -08:00
Matt Arsenault d2a9739274 AMDGPU/GlobalISel: Eliminate SelectVOP3Mods_f32
Trivial type predicates should be moved into the tablegen pattern
itself, and not checked inside complex patterns. This eliminates a
redundant complex pattern, and fixes select source modifiers for
GlobalISel.

I have further patches which fully handle select in tablegen and
remove all of the C++ selection, although it requires the ugliness to
support the entire range of legal register types.
2020-01-27 17:53:54 -05:00
Jay Foad cbbbd5b5f6 [GlobalISel] Make use of KnownBits::computeForAddSub
Summary:
This is mostly NFC. computeForAddSub may give more precise results in
some cases, but that doesn't seem to affect any existing GlobalISel
tests.

Subscribers: rovka, hiraditya, volkan, Petar.Avramovic, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D73431
2020-01-27 22:22:56 +00:00
Stephen Neuendorffer 27f2e9ab1c [examples] Fix CMakefiles for JITLink and OrcError library refactoring
The examples need explicit library dependencies when building with
BUILD_SHARED_LIBS=on
2020-01-27 13:58:50 -08:00
Sanjay Patel 747242af8d [InstCombine] allow more narrowing of casted select
D47163 created a rule that we should not change the casted
type of a select when we have matching types in its compare condition.
That was intended to help vector codegen, but it also could create
situations where we miss subsequent folds as shown in PR44545:
https://bugs.llvm.org/show_bug.cgi?id=44545

By using shouldChangeType(), we can continue to get the vector folds
(because we always return false for vector types). But we also solve
the motivating bug because it's ok to narrow the scalar select in that
example.

Our canonicalization rules around select are a mess, but AFAICT, this
will not induce any infinite looping from the reverse transform (but
we'll need to watch for that possibility if committed).

Side note: there's a similar use of shouldChangeType() for phi ops
just below this diff, and the source and destination types appear to
be reversed.

Differential Revision: https://reviews.llvm.org/D72733
2020-01-27 16:35:50 -05:00
Simon Pilgrim e7e043724e [DAG] Enable ISD::EXTRACT_SUBVECTOR SimplifyMultipleUseDemandedBits handling
This allows SimplifyDemandedBits to call SimplifyMultipleUseDemandedBits to create a simpler ISD::EXTRACT_SUBVECTOR, which is particularly useful for cases where we're splitting into subvectors anyhow.

Differential Revision: This allows SimplifyDemandedBits to call SimplifyMultipleUseDemandedBits to create a simpler ISD::EXTRACT_SUBVECTOR, which is particularly useful for cases where we're splitting into subvectors anyhow.
2020-01-27 21:17:47 +00:00
Adrian Prantl a095d149c2 Fix an assertion failure in DwarfExpression's subregister composition
This patch fixes an assertion failure in DwarfExpression that is
triggered when a complex fragment has exactly the size of a
subregister of the register the DBG_VALUE points to *and* there is no
DWARF encoding for the super-register.

I took the opportunity to replace/document some magic values with
static constructor functions to make this code less confusing to read.

rdar://problem/58489125

Differential Revision: https://reviews.llvm.org/D72938
2020-01-27 12:44:37 -08:00
Roman Lebedev 7bca4a28f5
[NFC][LoopVectorize] Autogenerate tests affected by isHighCostExpansionHelper() cost modelling (PR44668) 2020-01-27 23:34:30 +03:00
Roman Lebedev 9c801c48ee
[NFC][IndVarSimplify] Autogenerate tests affected by isHighCostExpansionHelper() cost modelling (PR44668) 2020-01-27 23:34:29 +03:00
Matt Arsenault c3075e6171 AMDGPU/GlobalISel: Select buffer atomics
The cmpswap handling is incomplete and fails to select.
2020-01-27 15:16:44 -05:00
Matt Arsenault 0eb62d5b3f AMDGPU/GlobalISel: Select llvm.amdgcn.raw.tbuffer.store 2020-01-27 15:16:21 -05:00
Matt Arsenault a69c26a927 AMDGPU/GlobalISel: Select llvm.amdgcn.struct.buffer.store[.format] 2020-01-27 15:00:21 -05:00
Matt Arsenault 533d650e94 AMDGPU/GlobalISel: Move llvm.amdgcn.raw.buffer.store handling
Treat this the same way as loads. There's less value to the
intermediate nodes, but it's good to be consistent.
2020-01-27 14:59:30 -05:00
Sanjay Patel 242fed9d7f [InstCombine] convert fsub nsz with fneg operand to -(X + Y)
This was noted in D72521 - we need to match fneg specifically to
consistently handle that pattern along with (-0.0 - X).
2020-01-27 14:49:15 -05:00
Nikita Popov bcfa0f592f [InstCombine] Move negation handling into freelyNegateValue()
Followup to D72978. This moves existing negation handling in
InstCombine into freelyNegateValue(), which make it composable.
In particular, root negations of div/zext/sext/ashr/lshr/sub can
now always be performed through a shl/trunc as well.

Differential Revision: https://reviews.llvm.org/D73288
2020-01-27 20:46:23 +01:00
Nikita Popov 0957748cb7 [InstCombine] Add more negation tests; NFC
Additional test cases for pushing negations through various
instructions.
2020-01-27 20:46:23 +01:00
Matt Arsenault d2a9b87fee TableGen: Try to fix expensive checks failures 2020-01-27 14:42:04 -05:00
Matt Arsenault 75d66f8434 AMDGPU/GlobalISel: Select llvm.amdcn.struct.tbuffer.load 2020-01-27 14:42:04 -05:00
Petr Hosek 369ea47b92 [Symbolize] Handle error after the notes loop
We always have to check the error, even if we're going to ignore it.
2020-01-27 11:00:27 -08:00
Vedant Kumar e08f205f5c Reland (again): [DWARF] Allow cross-CU references of subprogram definitions
This is a revert-of-revert (i.e. this reverts commit 802bec89, which
itself reverted fa4701e1 and 79daafc9) with a fix folded in. The problem
was that call site tags weren't emitted properly when LTO was enabled
along with split-dwarf. This required a minor fix. I've added a reduced
test case in test/DebugInfo/X86/fission-call-site.ll.

Original commit message:

This allows a call site tag in CU A to reference a callee DIE in CU B
without resorting to creating an incomplete duplicate DIE for the callee
inside of CU A.

We already allow cross-CU references of subprogram declarations, so it
doesn't seem like definitions ought to be special.

This improves entry value evaluation and tail call frame synthesis in
the LTO setting. During LTO, it's common for cross-module inlining to
produce a call in some CU A where the callee resides in a different CU,
and there is no declaration subprogram for the callee anywhere. In this
case llvm would (unnecessarily, I think) emit an empty DW_TAG_subprogram
in order to fill in the call site tag. That empty 'definition' defeats
entry value evaluation etc., because the debugger can't figure out what
it means.

As a follow-up, maybe we could add a DWARF verifier check that a
DW_TAG_subprogram at least has a DW_AT_name attribute.

Update #1:

Reland with a fix to create a declaration DIE when the declaration is
missing from the CU's retainedTypes list. The declaration is left out
of the retainedTypes list in two cases:

1) Re-compiling pre-r266445 bitcode (in which declarations weren't added
   to the retainedTypes list), and
2) Doing LTO function importing (which doesn't update the retainedTypes
   list).

It's possible to handle (1) and (2) by modifying the retainedTypes list
(in AutoUpgrade, or in the LTO importing logic resp.), but I don't see
an advantage to doing it this way, as it would cause more DWARF to be
emitted compared to creating the declaration DIEs lazily.

Update #2:

Fold in a fix for call site tag emission in the split-dwarf + LTO case.

Tested with a stage2 ThinLTO+RelWithDebInfo build of clang, and with a
ReleaseLTO-g build of the test suite.

rdar://46577651, rdar://57855316, rdar://57840415, rdar://58888440

Differential Revision: https://reviews.llvm.org/D70350
2020-01-27 10:52:34 -08:00
Matt Arsenault 09ed0e44d9 AMDGPU/GlobalISel: Select llvm.amdgcn.raw.tbuffer.load 2020-01-27 13:40:37 -05:00
Stanislav Mekhanoshin 53eb0f8c07 [AMDGPU] Attempt to reschedule withou clustering
We want to have more load/store clustering but we also want
to maintain low register pressure which are oposit targets.
Allow scheduler to reschedule regions without mutations
applied if we hit a register limit.

Differential Revision: https://reviews.llvm.org/D73386
2020-01-27 10:27:16 -08:00
Matt Arsenault 97711228fd AMDGPU/GlobalISel: Select llvm.amdgcn.struct.buffer.load.format 2020-01-27 13:23:35 -05:00
Luke Drummond 482e890d1f [tablegen] Emit string literals instead of char arrays
This changes the generated (Instr|Asm|Reg|Regclass)Name tables from this
form:
    extern const char HexagonInstrNameData[] = {
      /* 0 */ 'G', '_', 'F', 'L', 'O', 'G', '1', '0', 0,
      /* 9 */ 'E', 'N', 'D', 'L', 'O', 'O', 'P', '0', 0,
      /* 18 */ 'V', '6', '_', 'v', 'd', 'd', '0', 0,
      /* 26 */ 'P', 'S', '_', 'v', 'd', 'd', '0', 0,
      [...]
    };

...to this:

    extern const char HexagonInstrNameData[] = {
      /* 0 */ "G_FLOG10\0"
      /* 9 */ "ENDLOOP0\0"
      /* 18 */ "V6_vdd0\0"
      /* 26 */ "PS_vdd0\0"
      [...]
    };

This should make debugging and exploration a lot easier for mortals,
while providing a significant compile-time reduction for common compilers.

To avoid issues with low implementation limits, this is disabled by
default for visual studio.

To force output one way or the other, pass
`--long-string-literals=<bool>` to `tablegen`

Reviewers: mstorsjo, rnk

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D73044

A variation of this patch was originally committed in ce23515f5a and
then reverted in e464b31c due to build failures.
2020-01-27 18:22:25 +00:00
Jonas Devlieghere 3ed88b052b [llvm][TextAPI/MachO] Support writing single macCatalyst platform
TAPI currently lacks a way to emit the macCatalyst platform. For TBD_V3
is does support zippered frameworks given that both macOS and
macCatalyst are part of the PlatformSet.

Differential revision: https://reviews.llvm.org/D73325
2020-01-27 10:21:06 -08:00
Matt Arsenault ce7ca2caf2 AMDGPU/GlobalISel: Select llvm.amdgcn.struct.buffer.load 2020-01-27 13:05:55 -05:00
Matt Arsenault 198624c39d AMDGPU/GlobalISel: Select llvm.amdgcn.raw.buffer.load.format 2020-01-27 13:02:19 -05:00
Matt Arsenault fc90222a91 AMDGPU/GlobalISel: Select llvm.amdgcn.raw.buffer.load
Use intermediate instructions, unlike with buffer stores. This is
necessary because of the need to have an internal way to distinguish
between signed and unsigned extloads. This introduces some duplication
and near duplication with the buffer store selection path. The store
handling should maybe be moved into legalization to match and
eliminate the duplication.
2020-01-27 12:49:23 -05:00
Matt Arsenault e60d658260 AMDGPU/GlobalISel: Handle VOP3NoMods 2020-01-27 09:03:44 -08:00
Matt Arsenault d309b4ebe4 AMDGPU/GlobalISel: Add baseline tests for fma/fmad selection 2020-01-27 09:02:13 -08:00
Matt Arsenault 0968234590 AMDGPU/GlobalISel: Minor refactor of MUBUF complex patterns
This will make it easier to support the small variants in the complex
patterns for atomics.
2020-01-27 09:00:00 -08:00
Matt Arsenault bef27175c7 AMDGPU: Fix not using f16 fsin/fcos
I noticed this because this accidentally started working for
GlobalISel.
2020-01-27 08:59:59 -08:00
Jay Foad e37997cc0d [AMDGPU] Simplify test and extend to gfx9 and gfx10
Summary:
This is in preparation for adding more test cases for D69661 and other
bug fixes in the same area.

Reviewers: tpr, dstuttard, critson, nhaehnle, arsenm

Subscribers: kzhuravl, jvesely, wdng, yaxunl, t-tye, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D70708
2020-01-27 16:56:40 +00:00
Simon Pilgrim 2d5e281b0f [X86][AVX] Add a more aggressive SimplifyMultipleUseDemandedBits to simplify masked store masks.
Fixes a poor codegen issue noticed in PR11210.
2020-01-27 16:44:25 +00:00
Matt Arsenault a1d33ce73a AMDGPU/GlobalISel: Custom legalize v2s16 G_SHUFFLE_VECTOR
Try to keep simple v2s16 cases as-is. This will more naturally map to
how the VOP3P op_sel modifiers work compared to the expansion
involving bitcasts and bitshifts.

This could maybe try harder with wider source vector types, although
that could be handled with a pre-legalize combine.
2020-01-27 08:28:05 -08:00
Christian Sigg 97431831e5 Add pretty printers for llvm::PointerIntPair and llvm::PointerUnion.
Reviewers: aprantl, dblaikie, jdoerfert, nicolasvasilache

Reviewed By: dblaikie

Subscribers: jpienaar, dexonsmith, merge_guards_bot, llvm-commits

Tags: #llvm, #clang, #lldb, #openmp

Differential Revision: https://reviews.llvm.org/D72557
2020-01-27 17:23:59 +01:00
Nico Weber 68051c1224 Revert "[StackColoring] Remap PseudoSourceValue frame indices via MachineFunction::getPSVManager()"
This reverts commit 7a8b0b1595.
It seems to break exception handling on 32-bit Windows, see
https://crbug.com/1045650
2020-01-27 11:22:33 -05:00
Matt Arsenault 4e69df091d Revert "AMDGPU: Temporary drop s_mul_hi_i/u32 patterns"
This reverts commit fe23ed2c68.

It was never really clear this was responsible for the performance
regressions that caused this to be reverted. It's been a long time,
and we need to have scalar patterns for this to get GlobalISel
working.
2020-01-27 08:07:21 -08:00
Teresa Johnson 2f63d549f1 Restore "[LTO/WPD] Enable aggressive WPD under LTO option"
This restores 59733525d3 (D71913), along
with bot fix 19c76989bb.

The bot failure should be fixed by D73418, committed as
af954e441a.

I also added a fix for non-x86 bot failures by requiring x86 in new test
lld/test/ELF/lto/devirt_vcall_vis_public.ll.
2020-01-27 07:55:05 -08:00
Matt Arsenault bc3d900fa5 AMDGPU/GlobalISel: Fix not using global atomics on gfx9+
For some reason the flat/global atomics end up in the generated
matcher table in a different order from SelectionDAG. Use
AddedComplexity to prefer checking for global atomics first.
2020-01-27 07:42:42 -08:00
Whitney Tsang 2b335e9aae [LoopUnroll] Remove remapInstruction().
Summary:
LoopUnroll can reuse the RemapInstruction() in ValueMapper, or
remapInstructionsInBlocks() in CloneFunction, depending on the needs.
There is no need to have its own version in LoopUnroll.

By calling RemapInstruction() without TypeMapper or Materializer and
with Flags (RF_NoModuleLevelChanges | RF_IgnoreMissingLocals), it does
the same as remapInstruction(). remapInstructionsInBlocks() calls
RemapInstruction() exactly as described.

Looking at the history, I cannot find any obvious reason to have its own
version.
Reviewer: dmgreen, jdoerfert, Meinersbur, kbarton, bmahjour, etiotto,
foad, aprantl
Reviewed By: jdoerfert
Subscribers: hiraditya, zzheng, llvm-commits, prithayan, anhtuyen
Tag: LLVM
Differential Revision: https://reviews.llvm.org/D73277
2020-01-27 15:42:13 +00:00
James Henderson c963b5fbd6 [test][llvm-dwarfdump] Add extra test case for invalid MD5 form
A subsequent patch will change how an invalid file name table is handled
to allow parsing to continue. This patch adds a test case that will
demonstrate a difference in behaviour with that change between invalid
file tables where the error is before the end of the stated prologue
length and where the error occurs after the stated length.

Reviewed by: dblaikie

Differential Revision: https://reviews.llvm.org/D72157
2020-01-27 15:33:34 +00:00
James Henderson f1be770ff6 [DebugInfo] Make incorrect debug line extended opcode length non-fatal
It is possible to try to keep parsing a debug line program even when the
length of an extended opcode does not match what is expected for that
opcode. This patch changes what was previously a fatal error to be
non-fatal. The parser now continues by assuming the the claimed length
is correct, even if it means moving the offset backwards.

Reviewed by: dblaikie

Differential Revision: https://reviews.llvm.org/D72155
2020-01-27 15:32:41 +00:00
Matt Arsenault ac0b9b4ccf AMDPGPU/GlobalISel: Select more MUBUF global addressing modes
The handling of the high bits of the resource descriptor seem weird to
me, where the 3rd dword changes based on the instruction.
2020-01-27 07:28:36 -08:00
Matt Arsenault fdaad485e6 AMDGPU/GlobalISel: Initial selection of MUBUF addr64 load/store
Fixes the main reason for compile failures on SI, but doesn't really
try to use the addressing modes yet.
2020-01-27 07:13:56 -08:00
Simon Pilgrim d89180972b [X86][AVX] Add test case from PR11210
Shows failure to remove sign bit comparison when the result has multiple uses
2020-01-27 15:08:21 +00:00
Dominik Montada 9965b12fd1 Use pointer type size for offset constant when lowering load/stores 2020-01-27 06:55:32 -08:00
Matt Arsenault 2214bc81d0 AMDGPU: Allow i16 shader arguments
Not allowing this just creates unnecessary complications when writing
simple tests.
2020-01-27 06:55:32 -08:00
Jay Foad 1bf00219fc [AMDGPU] Handle multiple base operands in areMemAccessesTriviallyDisjoint
Summary:
This is in preparation for getMemOperandsWithOffset returning more base
operands.

Depends on D73455.

Reviewers: arsenm, rampitec

Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, kerbowa, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D73456
2020-01-27 14:45:21 +00:00
Jay Foad 6461eadf8f [AMDGPU] Handle multiple base operands in shouldClusterMemOps
Summary:
This is in preparation for getMemOperandsWithOffset returning more base
operands.

Depends on D73454.

Reviewers: arsenm, rampitec

Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, kerbowa, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D73455
2020-01-27 14:45:21 +00:00
Jay Foad fcf5254fa7 [AMDGPU] Handle frame index base operands in memOpsHaveSameBasePtr
Summary:
This is in preparation for getMemOperandsWithOffset returning more base
operands.

Reviewers: arsenm, rampitec

Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, arphaman, kerbowa, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D73454
2020-01-27 14:45:21 +00:00
vpykhtin 4332f1a4c8 [AMDGPU] Fix GCN regpressure trackers for INLINEASM instructions.
Differential revision: https://reviews.llvm.org/D73338
2020-01-27 17:25:25 +03:00
Matt Arsenault 2a160ba5b0 GlobalISel: Reimplement widenScalar for G_UNMERGE_VALUES results
Only use shifts if the requested type exactly matches the source type,
and create sub-unmerges otherwise.
2020-01-27 06:18:26 -08:00
David Green 8a6b948eb5 [MVE] Fixup order of gather writeback intrinsic outputs
The MVE_VLDRWU32_qi_pre gather loads, like the other _pre/_post mve
loads returns the writeback as result 0, the value as result 1. The llvm
ir intrinsic seems to have this the other way around though, and so when
lowering from one to the other we need to switch the first two outputs.

I've also fixed up the types of _pre/_post on normal MVE loads. There we
were already getting the values the right way around, just not for the
types. I don't believe this was causing anything to go wrong, but it was
very confusing to read in the debug output.

Differential Revision: https://reviews.llvm.org/D73370
2020-01-27 14:08:06 +00:00
Matt Arsenault 06d9230fef GlobalISel: Translate vector GEPs 2020-01-27 05:35:05 -08:00
Russell Gallop 77e6bb3cba Re-land [Support] Extend TimeProfiler to support multiple threads
This makes TimeTraceProfilerInstance thread local. Added
timeTraceProfilerFinishThread() which moves the thread local instance to
a global vector of instances. timeTraceProfilerWrite() then writes
recorded data from all instances.

Threads are identified based on their thread ids. Totals are reported
with artificial thread ids higher than the real ones.

This fixes the previous version to work with __thread as well as
thread_local.

Differential Revision: https://reviews.llvm.org/D71059
2020-01-27 13:01:49 +00:00
Igor Kudrin 8f3d47c54a [DWARF] Do not pass Version to DWARFExpression. NFCI.
The Version was used only to determine the size of an operand of
DW_OP_call_ref. The size was 4 for all versions apart from 2, but
the DW_OP_call_ref operation was introduced only in DWARF3. Thus,
the code may be simplified and using of Version may be eliminated.

Differential Revision: https://reviews.llvm.org/D73264
2020-01-27 19:08:46 +07:00