Commit Graph

321116 Commits

Author SHA1 Message Date
Matt Arsenault 077df01918 AMDGPU: Fix test failing since r365512
llvm-svn: 365521
2019-07-09 17:54:34 +00:00
Jinsong Ji 06fef0b359 Revert "[HardwareLoops] NFC - move hardware loop checking code to isHardwareLoopProfitable()"
This reverts commit d955573065.

llvm-svn: 365520
2019-07-09 17:53:09 +00:00
Steven Wu 65f964c23e Add lit.local.cfg to llvm-objdump tests
Add configuration file to llvm-objdump tests to treat files with .yaml
extension as tests.

llvm-svn: 365519
2019-07-09 17:47:14 +00:00
Erik Pilkington abffae3a56 [ObjC] Add a warning for implicit conversions of a constant non-boolean value to BOOL
rdar://51954400

Differential revision: https://reviews.llvm.org/D63912

llvm-svn: 365518
2019-07-09 17:29:40 +00:00
Nico Weber 0efac296f1 Remove a comment that has been obsolete since r327679
llvm-svn: 365517
2019-07-09 17:19:47 +00:00
Michael Liao 329c032040 [unittest] Add bogus register info.
Reviewers: dstenb

Subscribers: llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D64421

llvm-svn: 365516
2019-07-09 17:19:01 +00:00
Nico Weber c9c55cf89b Rename llvm/test/tools/llvm-pdbdump to llvm/test/tools/llvm-pdbutil
llvm-pdbdump was renamed to llvm-pdbutil long ago. This updates the test
to be where you'd expect them to be.

llvm-svn: 365515
2019-07-09 17:14:24 +00:00
Nico Weber ce84e6ae8e Make pdbdump-objfilename test work again
- The test had extension .yaml, which lit doesn't execute in this
  directory. Rename to .test to make it run, and move the yaml bits
  into a dedicated file, like with all other tests in this dir.

- llvm-pdbdump got renamed to llvm-pdbutil long ago, update test.

- -dbi-module-info got renamed in r305032, update test for this too.

llvm-svn: 365514
2019-07-09 17:02:51 +00:00
Julian Lettner 521f77e635 [TSan] Improve handling of stack pointer mangling in {set,long}jmp, pt.8
Refine longjmp key management.  For Linux, re-implement key retrieval in
C (instead of assembly).  Removal of `InitializeGuardPtr` and a final
round of cleanups will be done in the next commit.

Reviewed By: dvyukov

Differential Revision: https://reviews.llvm.org/D64092

llvm-svn: 365513
2019-07-09 16:49:43 +00:00
Christudasan Devadasan b2d24bd540 [AMDGPU] Created a sub-register class for the return address operand in the return instruction.
Function return instruction lowering, currently uses the fixed register pair s[30:31] for holding
the return address. It can be any SGPR pair other than the CSRs. Created an SGPR pair sub-register class
exclusive of the CSRs, and used this regclass while lowering the return instruction.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D63924

llvm-svn: 365512
2019-07-09 16:48:42 +00:00
Sam Elliott 114d2db49b [RISCV] Fix ICE in isDesirableToCommuteWithShift
Summary:
There was an error being thrown from isDesirableToCommuteWithShift in
some tests. This was tracked down to the method being called before
legalisation, with an extended value type, not a machine value type.

In the case I diagnosed, the error was only hit with an instruction sequence
involving `i24`s in the add and shift. `i24` is not a Machine ValueType, it is
instead an Extended ValueType which was causing the issue.

I have added a test to cover this case, and fixed the error in the callback.

Reviewers: asb, luismarques

Reviewed By: asb

Subscribers: hiraditya, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, kito-cheng, shiva0217, jrtc27, MaskRay, zzheng, edward-jones, rogfer01, MartinMosbeck, brucehoult, the_o, rkruppe, PkmX, jocewei, psnobl, benna, Jim, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D64425

llvm-svn: 365511
2019-07-09 16:24:16 +00:00
Amara Emerson 6616e269a6 [AArch64][GlobalISel] Optimize conditional branches followed by unconditional branches
If we have an icmp->brcond->br sequence where the brcond just branches to the
next block jumping over the br, while the br takes the false edge, then we can
modify the conditional branch to jump to the br's target while inverting the
condition of the incoming icmp. This means we can eliminate the br as an
unconditional branch to the fallthrough block.

Differential Revision: https://reviews.llvm.org/D64354

llvm-svn: 365510
2019-07-09 16:05:59 +00:00
Hiroshi Yamauchi d088720eda Revert Revert Devirtualize destructor of final class.
Revert r364359 and recommit r364100.

r364100 was reverted as r364359 due to an internal test failure, but it was a
false alarm.

llvm-svn: 365509
2019-07-09 15:57:29 +00:00
Simon Atanasyan e3892d84e0 [mips] Show error in case of using FP64 mode on pre MIPS32R2 CPU
llvm-svn: 365508
2019-07-09 15:48:16 +00:00
Simon Atanasyan 623282f0dd [mips] Explicitly select `mips32r2` CPU for test cases require 64-bit FPU. NFC
Support for 64-bit coprocessors on a 32-bit architecture
was added in `MIPS32 R2`.

llvm-svn: 365507
2019-07-09 15:48:05 +00:00
David Bolvansky 901d91e5f0 [NFC] Fixed tests
llvm-svn: 365506
2019-07-09 15:31:36 +00:00
Mikhail Maltsev a448ed99df [libunwind] Fix Unwind-EHABI.cpp:getByte on big-endian targets
Summary:
The function getByte is dependent on endianness and the current
behavior is incorrect on big-endian targets.

This patch fixes the issue.

Reviewers: phosek, ostannard, dmgreen, christof, chill

Reviewed By: ostannard, chill

Subscribers: chill, christof, libcxx-commits

Tags: #libc

Differential Revision: https://reviews.llvm.org/D64402

llvm-svn: 365505
2019-07-09 15:29:06 +00:00
Simon Pilgrim 57603cbde8 [DAGCombine] LoadedSlice - keep getOffsetFromBase() uint64_t offset. NFCI.
Keep the uint64_t type from getOffsetFromBase() to stop truncation/extension overflow warnings in MSVC in alignment math.

llvm-svn: 365504
2019-07-09 15:28:57 +00:00
Yonghong Song d3d88d08b5 [BPF] Support for compile once and run everywhere
Introduction
============

This patch added intial support for bpf program compile once
and run everywhere (CO-RE).

The main motivation is for bpf program which depends on
kernel headers which may vary between different kernel versions.
The initial discussion can be found at https://lwn.net/Articles/773198/.

Currently, bpf program accesses kernel internal data structure
through bpf_probe_read() helper. The idea is to capture the
kernel data structure to be accessed through bpf_probe_read()
and relocate them on different kernel versions.

On each host, right before bpf program load, the bpfloader
will look at the types of the native linux through vmlinux BTF,
calculates proper access offset and patch the instruction.

To accommodate this, three intrinsic functions
   preserve_{array,union,struct}_access_index
are introduced which in clang will preserve the base pointer,
struct/union/array access_index and struct/union debuginfo type
information. Later, bpf IR pass can reconstruct the whole gep
access chains without looking at gep itself.

This patch did the following:
  . An IR pass is added to convert preserve_*_access_index to
    global variable who name encodes the getelementptr
    access pattern. The global variable has metadata
    attached to describe the corresponding struct/union
    debuginfo type.
  . An SimplifyPatchable MachineInstruction pass is added
    to remove unnecessary loads.
  . The BTF output pass is enhanced to generate relocation
    records located in .BTF.ext section.

Typical CO-RE also needs support of global variables which can
be assigned to different values to different hosts. For example,
kernel version can be used to guard different versions of codes.
This patch added the support for patchable externals as well.

Example
=======

The following is an example.

  struct pt_regs {
    long arg1;
    long arg2;
  };
  struct sk_buff {
    int i;
    struct net_device *dev;
  };

  #define _(x) (__builtin_preserve_access_index(x))
  static int (*bpf_probe_read)(void *dst, int size, const void *unsafe_ptr) =
          (void *) 4;
  extern __attribute__((section(".BPF.patchable_externs"))) unsigned __kernel_version;
  int bpf_prog(struct pt_regs *ctx) {
    struct net_device *dev = 0;

    // ctx->arg* does not need bpf_probe_read
    if (__kernel_version >= 41608)
      bpf_probe_read(&dev, sizeof(dev), _(&((struct sk_buff *)ctx->arg1)->dev));
    else
      bpf_probe_read(&dev, sizeof(dev), _(&((struct sk_buff *)ctx->arg2)->dev));
    return dev != 0;
  }

In the above, we want to translate the third argument of
bpf_probe_read() as relocations.

  -bash-4.4$ clang -target bpf -O2 -g -S trace.c

The compiler will generate two new subsections in .BTF.ext,
OffsetReloc and ExternReloc.
OffsetReloc is to record the structure member offset operations,
and ExternalReloc is to record the external globals where
only u8, u16, u32 and u64 are supported.

   BPFOffsetReloc Size
   struct SecLOffsetReloc for ELF section #1
   A number of struct BPFOffsetReloc for ELF section #1
   struct SecOffsetReloc for ELF section #2
   A number of struct BPFOffsetReloc for ELF section #2
   ...
   BPFExternReloc Size
   struct SecExternReloc for ELF section #1
   A number of struct BPFExternReloc for ELF section #1
   struct SecExternReloc for ELF section #2
   A number of struct BPFExternReloc for ELF section #2

  struct BPFOffsetReloc {
    uint32_t InsnOffset;    ///< Byte offset in this section
    uint32_t TypeID;        ///< TypeID for the relocation
    uint32_t OffsetNameOff; ///< The string to traverse types
  };

  struct BPFExternReloc {
    uint32_t InsnOffset;    ///< Byte offset in this section
    uint32_t ExternNameOff; ///< The string for external variable
  };

Note that only externs with attribute section ".BPF.patchable_externs"
are considered for Extern Reloc which will be patched by bpf loader
right before the load.

For the above test case, two offset records and one extern record
will be generated:
  OffsetReloc records:
        .long   .Ltmp12                 # Insn Offset
        .long   7                       # TypeId
        .long   242                     # Type Decode String
        .long   .Ltmp18                 # Insn Offset
        .long   7                       # TypeId
        .long   242                     # Type Decode String

  ExternReloc record:
        .long   .Ltmp5                  # Insn Offset
        .long   165                     # External Variable

  In string table:
        .ascii  "0:1"                   # string offset=242
        .ascii  "__kernel_version"      # string offset=165

The default member offset can be calculated as
    the 2nd member offset (0 representing the 1st member) of struct "sk_buff".

The asm code:
    .Ltmp5:
    .Ltmp6:
            r2 = 0
            r3 = 41608
    .Ltmp7:
    .Ltmp8:
            .loc    1 18 9 is_stmt 0        # t.c:18:9
    .Ltmp9:
            if r3 > r2 goto LBB0_2
    .Ltmp10:
    .Ltmp11:
            .loc    1 0 9                   # t.c:0:9
    .Ltmp12:
            r2 = 8
    .Ltmp13:
            .loc    1 19 66 is_stmt 1       # t.c:19:66
    .Ltmp14:
    .Ltmp15:
            r3 = *(u64 *)(r1 + 0)
            goto LBB0_3
    .Ltmp16:
    .Ltmp17:
    LBB0_2:
            .loc    1 0 66 is_stmt 0        # t.c:0:66
    .Ltmp18:
            r2 = 8
            .loc    1 21 66 is_stmt 1       # t.c:21:66
    .Ltmp19:
            r3 = *(u64 *)(r1 + 8)
    .Ltmp20:
    .Ltmp21:
    LBB0_3:
            .loc    1 0 66 is_stmt 0        # t.c:0:66
            r3 += r2
            r1 = r10
    .Ltmp22:
    .Ltmp23:
    .Ltmp24:
            r1 += -8
            r2 = 8
            call 4

For instruction .Ltmp12 and .Ltmp18, "r2 = 8", the number
8 is the structure offset based on the current BTF.
Loader needs to adjust it if it changes on the host.

For instruction .Ltmp5, "r2 = 0", the external variable
got a default value 0, loader needs to supply an appropriate
value for the particular host.

Compiling to generate object code and disassemble:
   0000000000000000 bpf_prog:
           0:       b7 02 00 00 00 00 00 00         r2 = 0
           1:       7b 2a f8 ff 00 00 00 00         *(u64 *)(r10 - 8) = r2
           2:       b7 02 00 00 00 00 00 00         r2 = 0
           3:       b7 03 00 00 88 a2 00 00         r3 = 41608
           4:       2d 23 03 00 00 00 00 00         if r3 > r2 goto +3 <LBB0_2>
           5:       b7 02 00 00 08 00 00 00         r2 = 8
           6:       79 13 00 00 00 00 00 00         r3 = *(u64 *)(r1 + 0)
           7:       05 00 02 00 00 00 00 00         goto +2 <LBB0_3>

    0000000000000040 LBB0_2:
           8:       b7 02 00 00 08 00 00 00         r2 = 8
           9:       79 13 08 00 00 00 00 00         r3 = *(u64 *)(r1 + 8)

    0000000000000050 LBB0_3:
          10:       0f 23 00 00 00 00 00 00         r3 += r2
          11:       bf a1 00 00 00 00 00 00         r1 = r10
          12:       07 01 00 00 f8 ff ff ff         r1 += -8
          13:       b7 02 00 00 08 00 00 00         r2 = 8
          14:       85 00 00 00 04 00 00 00         call 4

Instructions #2, #5 and #8 need relocation resoutions from the loader.

Signed-off-by: Yonghong Song <yhs@fb.com>

Differential Revision: https://reviews.llvm.org/D61524

llvm-svn: 365503
2019-07-09 15:28:41 +00:00
Simon Pilgrim d050e45631 [ADT] Remove MSVC-only "no two-phase name lookup" typename path.
Now that we've dropped VS2015 support (D64326) we can use the regular codepath as VS2017+ correctly handles it

llvm-svn: 365502
2019-07-09 15:24:19 +00:00
David Bolvansky e625eb9def [NFC] Added tests for D64285
llvm-svn: 365501
2019-07-09 15:12:01 +00:00
Marco Antognini d36e130a86 [OpenCL][Sema] Improve address space support for blocks
Summary:
This patch ensures that the following code is compiled identically with
-cl-std=CL2.0 and -fblocks -cl-std=c++.

    kernel void test(void) {
      void (^const block_A)(void) = ^{
        return;
      };
    }

A new test is not added because cl20-device-side-enqueue.cl will cover
this once blocks are further improved for C++ for OpenCL.

The changes to Sema::PerformImplicitConversion are based on
the parts of Sema::CheckAssignmentConstraints on block pointer
conversions.

Reviewers: rjmccall, Anastasia

Subscribers: yaxunl, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D64083

llvm-svn: 365500
2019-07-09 15:04:27 +00:00
Marco Antognini b00d5f732c [OpenCL][Sema] Fix builtin rewriting
This patch ensures built-in functions are rewritten using the proper
parent declaration.

Existing tests are modified to run in C++ mode to ensure the
functionality works also with C++ for OpenCL while not increasing the
testing runtime.

llvm-svn: 365499
2019-07-09 15:04:23 +00:00
Aaron Ballman b1e511bf5a Ignore trailing NullStmts in StmtExprs for GCC compatibility.
Ignore trailing NullStmts in compound expressions when determining the result type and value. This is to match the GCC behavior which ignores semicolons at the end of compound expressions.

Patch by Dominic Ferreira.

llvm-svn: 365498
2019-07-09 15:02:07 +00:00
Chen Zheng d955573065 [HardwareLoops] NFC - move hardware loop checking code to isHardwareLoopProfitable()
Differential Revision: https://reviews.llvm.org/D64197

llvm-svn: 365497
2019-07-09 14:56:17 +00:00
David Green 781e3aff8c [ARM] Add test for MVE and no floats. NFC
Adds a simple test that MVE with no floating point will be promoted correctly
to software float calls.

llvm-svn: 365496
2019-07-09 14:43:17 +00:00
Sanjay Patel fb453353da [InferFunctionAttrs] add more tests for derefenceable; NFC
llvm-svn: 365495
2019-07-09 14:43:03 +00:00
Petar Avramovic be20e36107 [MIPS GlobalISel] Register bank select for G_PHI. Select i64 phi
Select gprb or fprb when def/use register operand of G_PHI is
used/defined by either:
 copy to/from physical register or
 instruction with only one mapping available for that use/def operand.

Integer s64 phi is handled with narrowScalar when mapping is applied,
produced artifacts are combined away. Manually set gprb to all register
operands of instructions created during narrowScalar.

Differential Revision: https://reviews.llvm.org/D64351

llvm-svn: 365494
2019-07-09 14:36:17 +00:00
Matt Arsenault fdd761af15 AMDGPU/GlobalISel: Prepare some tests for store selection
Mostsly these would fail due to trying to use SI with a flat
operation. Implementing global loads with MUBUF is more work than
flat, so these won't be handled in the initial load selection.

Others fail because store of s64 won't initially work, as the current
set of patterns expect everything to be turned into v2i32.

llvm-svn: 365493
2019-07-09 14:30:57 +00:00
Petar Avramovic dbb6d01d34 [MIPS GlobalISel] Regbanks for G_SELECT. Select i64, f32 and f64 select
Select gprb or fprb when def/use register operand of G_SELECT is
used/defined by either:
 copy to/from physical register or
 instruction with only one mapping available for that use/def operand.

Integer s64 select is handled with narrowScalar when mapping is applied,
produced artifacts are combined away. Manually set gprb to all register
operands of instructions created during narrowScalar.

For selection of floating point s32 or s64 select it is enough to set
fprb of appropriate size and selectImpl will do the rest.

Differential Revision: https://reviews.llvm.org/D64350

llvm-svn: 365492
2019-07-09 14:30:29 +00:00
Matt Arsenault 85ad662dfd AMDGPU/GlobalISel: Fix test
llvm-svn: 365491
2019-07-09 14:30:02 +00:00
Emilio Cobos Alvarez 743754501b [libclang] Fix hang in release / assertion in debug when evaluating value-dependent types.
Expression evaluator doesn't work in value-dependent types, so ensure that the
precondition it asserts holds.

This fixes https://bugs.llvm.org/show_bug.cgi?id=42532

Differential Revision: https://reviews.llvm.org/D64409

llvm-svn: 365490
2019-07-09 14:27:01 +00:00
James Henderson e0a3ee79c5 [docs][llvm-dwarfdump] Fix wording
llvm-svn: 365489
2019-07-09 14:20:58 +00:00
Matt Arsenault 4dd5755d01 AMDGPU/GlobalISel: Legalize more concat_vectors
llvm-svn: 365488
2019-07-09 14:17:31 +00:00
Matt Arsenault 6bdb92d833 AMDGPU/GlobalISel: Improve regbankselect for icmp s16
Account for 64-bit scalar eq/ne when available.

llvm-svn: 365487
2019-07-09 14:13:09 +00:00
Matt Arsenault 8b8eee5904 AMDGPU/GlobalISel: Make s16 G_ICMP legal
llvm-svn: 365486
2019-07-09 14:10:43 +00:00
Alexey Bataev e509af3cd6 [OPENMP]Fix the float point semantics handling on the device.
The device should use the same float point representation as the host.
Previous patch fixed the handling of the sizes of the float point types,
but did not fixed the fp semantics. This patch makes target device to
use the host fp semantics. this is required for the correct data
transfer between host and device and correct codegen.

llvm-svn: 365485
2019-07-09 14:09:53 +00:00
Matt Arsenault e6d10f97dd AMDGPU/GlobalISel: Select G_SUB
llvm-svn: 365484
2019-07-09 14:05:11 +00:00
Matt Arsenault 872f38be7e AMDGPU/GlobalISel: Select G_UNMERGE_VALUES
llvm-svn: 365483
2019-07-09 14:02:26 +00:00
Matt Arsenault 9b7ffc4e55 AMDGPU/GlobalISel: Select G_MERGE_VALUES
llvm-svn: 365482
2019-07-09 14:02:20 +00:00
Nico Weber 6241035684 gn build: Merge r365453
llvm-svn: 365481
2019-07-09 13:58:18 +00:00
Fangrui Song 04615341e4 [ItaniumMangle] Refactor long double/__float128 mangling and fix the mangled code
In gcc PowerPC, long double has 3 mangling schemes:

-mlong-double-64: `e`
-mlong-double-128 -mabi=ibmlongdouble: `g`
-mlong-double-128 -mabi=ieeelongdouble: `u9__ieee128` (gcc <= 8.1: `U10__float128`)

The current useFloat128ManglingForLongDouble() bisection is not suitable
when we support -mlong-double-128 in clang (D64277). Replace
useFloat128ManglingForLongDouble() with getLongDoubleMangling() and
getFloat128Mangling() to allow 3 mangling schemes.

I also deleted the `getTriple().isOSBinFormatELF()` check (the Darwin
support has gone: https://reviews.llvm.org/D50988).

For x86, change the mangled code of __float128 from `U10__float128` to `g`. `U10__float128` was wrongly copied from PowerPC.
The test will be added to `test/CodeGen/x86-long-double.cpp` in D64277.

Reviewed By: erichkeane

Differential Revision: https://reviews.llvm.org/D64276

llvm-svn: 365480
2019-07-09 13:32:26 +00:00
Ilya Biryukov 51dad4196e [Syntax] Move roles into a separate enum
To align with reviewer's suggestions.

llvm-svn: 365479
2019-07-09 13:31:43 +00:00
Nico Weber e7a67bf8ce lld-link: Stop accepting /natvis and /fastfail in .drectve sections
link.exe doesn't accept them either.

Differential Revision: https://reviews.llvm.org/D64352

llvm-svn: 365478
2019-07-09 13:30:03 +00:00
Simon Pilgrim 480e8ad217 [CodeGen] AccelTable - remove non-constexpr (MSVC) Atom defs
Now that we've dropped VS2015 support (D64326) we can enable the constexpr variables on MSVC builds as VS2017+ correctly handles them

llvm-svn: 365477
2019-07-09 13:07:48 +00:00
Simon Atanasyan 2fa6b54635 [mips] Implement sge/sgeu pseudo instructions
The `sge/sgeu Dst, Src1, Src2/Imm` pseudo instructions set register
`Dst` to 1 if register `Src1` is greater than or equal `Src2/Imm` and
to 0 otherwise.

Differential Revision: https://reviews.llvm.org/D64314

llvm-svn: 365476
2019-07-09 12:55:55 +00:00
Simon Atanasyan 00df4d92ed [mips] Implement sgt/sgtu pseudo instructions with immediate operand
The `sgt/sgtu Dst, Src1, Src2/Imm` pseudo instructions set register
`Dst` to 1 if register `Src1` is greater than `Src2/Imm` and to 0 otherwise.

Differential Revision: https://reviews.llvm.org/D64313

llvm-svn: 365475
2019-07-09 12:55:42 +00:00
James Henderson 8447b419a7 [docs][llvm-objdump] Make some wording improvements/simplifications.
llvm-svn: 365474
2019-07-09 12:41:39 +00:00
Pengfei Wang a50bbfc470 [NFC] [X86] Fix scan-build complaining
Summary:
Remove unused variable. This fixes bug:
https://bugs.llvm.org/show_bug.cgi?id=42526

Signed-off-by: pengfei <pengfei.wang@intel.com>

Reviewers: RKSimon, xiangzhangllvm, craig.topper

Subscribers: cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D64389

llvm-svn: 365473
2019-07-09 12:41:12 +00:00
Tim Northover 13b204fee1 OpaquePtr: pass type to CreateLoad. NFC.
This is the one place in LLVM itself that used the deprecated API for
CreateLoad, so I just added the type in.

llvm-svn: 365472
2019-07-09 12:36:36 +00:00