Commit Graph

367957 Commits

Author SHA1 Message Date
Nicolas Vasilache cf9503c1b7 [mlir] Add subtensor_insert operation
Differential revision: https://reviews.llvm.org/D88657
2020-10-02 06:32:31 -04:00
Kadir Cetinkaya 54c03d8f7d
[clangd][lit] Update document-link.test to respect custom resource-dir locations
Differential Revision: https://reviews.llvm.org/D88721
2020-10-02 12:24:06 +02:00
Simon Pilgrim ec07ae2a83 [InstCombine] Add some basic vector bswap tests
We get the vNi16 cases already via matching as a rotate followed by the fshl -> bswap combines
2020-10-02 11:08:12 +01:00
Nicolas Vasilache 787bf5e383 [mlir] Add canonicalization for the `subtensor` op
Differential revision: https://reviews.llvm.org/D88656
2020-10-02 06:05:52 -04:00
Nicolas Vasilache e3de249a4c [mlir] Add a subtensor operation
This revision introduces a `subtensor` op, which is the counterpart of `subview` for a tensor operand. This also refactors the relevant pieces to allow reusing the `subview` implementation where appropriate.

This operation will be used to implement tiling for Linalg on tensors.
2020-10-02 05:35:30 -04:00
Simon Pilgrim 670e60c023 [InstCombine] Add partial bswap test from D88578 2020-10-02 10:34:30 +01:00
Meera Nakrani f7c0e2b8f2 [ARM] Prevent constants from iCmp instruction from being hoisted if part of a min(max()) pattern
Marks constants of an ICmp instruction as free if its only user is a select
instruction that is part of a min(max()) pattern. Ensures that in loops, in
particular when loop unrolling is turned on, SSAT will still be correctly generated.

Differential Revision: https://reviews.llvm.org/D88662
2020-10-02 09:28:35 +00:00
Hsiangkai Wang 067add7b5f [RISCV] Support vmsge.vx and vmsgeu.vx pseudo instructions in RVV.
Implement vmsge{u}.vx pseudo instruction.

According to RISC-V V specification, there are different scenarios for this
pseudo instruction. I list them below.

unmasked va >= x

  pseudoinstruction: vmsge{u}.vx vd, va, x
  expansion: vmslt{u}.vx vd, va, x; vmnand.mm vd, vd, vd

masked va >= x, vd != v0

  pseudoinstruction: vmsge{u}.vx vd, va, x, v0.t
  expansion: vmslt{u}.vx vd, va, x, v0.t; vmxor.mm vd, vd, v0

masked va >= x, vd == v0

  pseudoinstruction: vmsge{u}.vx vd, va, x, v0.t, vt
  expansion: vmslt{u}.vx vt, va, x;  vmandnot.mm vd, vd, vt

Use pseudo instruction to model vmsge{u}.vx. The pseudo instruction will convert
to different expansion according to the condition.

Differential Revision: https://reviews.llvm.org/D84732
2020-10-02 17:20:34 +08:00
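The three vmsge{u}.vx expansions listed in commit 067add7b5f above can be checked with a small element-wise model. This is only a sketch: the helper names are hypothetical, mask registers are modelled as lists of 0/1, and inactive elements of a masked compare are assumed to keep their old destination value.

```python
def vmslt(va, x, vd_old=None, mask=None):
    """Model vmslt.vx: vd[i] = va[i] < x for active elements.

    Assumption: with a mask, inactive elements keep their old value.
    """
    out = []
    for i, a in enumerate(va):
        if mask is None or mask[i]:
            out.append(int(a < x))
        else:
            out.append(vd_old[i])
    return out


def expand_unmasked(va, x):
    vd = vmslt(va, x)                          # vmslt.vx vd, va, x
    return [1 - (b & b) for b in vd]           # vmnand.mm vd, vd, vd


def expand_masked(va, x, vd_old, v0):
    vd = vmslt(va, x, vd_old, v0)              # vmslt.vx vd, va, x, v0.t
    return [d ^ m for d, m in zip(vd, v0)]     # vmxor.mm vd, vd, v0


def expand_masked_vd_is_v0(va, x, v0):
    vt = vmslt(va, x)                          # vmslt.vx vt, va, x
    return [d & (1 - t) for d, t in zip(v0, vt)]  # vmandnot.mm vd, vd, vt


va, x = [1, 5, 3, 7], 4
print(expand_unmasked(va, x))
print(expand_masked(va, x, vd_old=[1, 0, 1, 0], v0=[1, 1, 0, 0]))
print(expand_masked_vd_is_v0(va, x, v0=[1, 1, 0, 0]))
```

In each case the active elements end up holding va >= x. In the vd == v0 case the inactive elements are already 0 in v0, so the and-not leaves them unchanged.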
Sam McCall 17747d2ec8 [clangd] Remove Tweak::Intent, use CodeAction kind directly. NFC
Intent was a nice idea but it ends up being a bit awkward/heavyweight
without adding much.

In particular, it makes it hard to implement `CodeActionParams.only` properly
(there's an inheritance hierarchy for kinds).

Differential Revision: https://reviews.llvm.org/D88427
2020-10-02 11:14:23 +02:00
serge-sans-paille 9573c9f2a3 Fix limit behavior of dynamic alloca
When the allocation size is 0, we shouldn't probe. Within [1, PAGE_SIZE], we
should probe once, etc.

This fixes https://bugs.llvm.org/show_bug.cgi?id=47657

Differential Revision: https://reviews.llvm.org/D88548
2020-10-02 11:10:02 +02:00
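The limit behavior described in commit 9573c9f2a3 above can be illustrated as a probe count. This is a sketch under assumptions: the "etc." is read as one probe per started page, the 4096-byte page size is a placeholder, and `probe_count` is a hypothetical helper rather than anything in the patch.

```python
PAGE_SIZE = 4096  # assumed page size, for illustration only


def probe_count(alloc_size, page_size=PAGE_SIZE):
    """Number of probes for a dynamic alloca of alloc_size bytes.

    Assumption: 0 bytes needs no probe, [1, page_size] needs one,
    and each further started page needs one more (ceiling division).
    """
    if alloc_size < 0:
        raise ValueError("allocation size must be non-negative")
    return -(-alloc_size // page_size)


for size in (0, 1, PAGE_SIZE, PAGE_SIZE + 1):
    print(size, probe_count(size))
```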
Georgii Rymar 5829dc9250 [yaml2obj][elf2yaml] - Add support for the `EntSize` field for `SHT_HASH` sections.
The specification for the SHT_HASH table says (https://refspecs.linuxbase.org/elf/gabi4+/ch5.dynamic.html#hash)
that it contains Elf32_Word entries for both 32/64 bit objects.

Currently both GNU linkers and LLD set the `sh_entsize` field to `4`.

At the same time, `yaml2obj` ignores the `EntSize` field for SHT_HASH sections.
This patch fixes this and also adds support for obj2yaml: it will not
dump this field when the `sh_entsize` contains the default value (`4`).

Differential revision: https://reviews.llvm.org/D88652
2020-10-02 12:01:50 +03:00
Tres Popp bfd7ee92cc Handle unused variable without asserts 2020-10-02 10:22:55 +02:00
Sam McCall bc18d8d9b7 [clangd] Drop dependence on standard library in check.test 2020-10-02 09:53:06 +02:00
Thomas Lively 542523a61a [WebAssembly] Emulate v128.const efficiently
v128.const was recently implemented in V8, but until it rolls into Chrome
stable, we can't enable it in the WebAssembly backend without breaking origin
trial users. So far we have been lowering build_vectors that would otherwise
have been lowered to v128.const to splats followed by sequences of replace_lane
instructions to initialize each lane individually. That produces large and
inefficient code, so this patch introduces new logic to lower integer vector
constants to a single i64x2.splat where possible, with at most a single
i64x2.replace_lane following it if necessary.

Adapted from a patch authored by @omnisip.

Differential Revision: https://reviews.llvm.org/D88591
2020-10-02 00:28:06 -07:00
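The lowering described in commit 542523a61a above can be sketched as follows. This is an illustration, not the backend's code: the function names and the tuple encoding of instructions are hypothetical, lanes are assumed to be little-endian, and splatting the low half is an arbitrary choice made here.

```python
def pack_lanes(lanes, lane_bits):
    """Pack integer lanes (lane 0 lowest) into two unsigned 64-bit halves."""
    assert len(lanes) * lane_bits == 128
    lane_mask = (1 << lane_bits) - 1
    value = 0
    for i, lane in enumerate(lanes):
        value |= (lane & lane_mask) << (i * lane_bits)
    return value & (2**64 - 1), value >> 64


def lower_constant(lanes, lane_bits):
    """Return a splat, plus at most one replace_lane, for a 128-bit constant."""
    lo, hi = pack_lanes(lanes, lane_bits)
    ops = [("i64x2.splat", lo)]
    if hi != lo:
        # The halves differ, so patch lane 1 after the splat.
        ops.append(("i64x2.replace_lane", 1, hi))
    return ops


print(lower_constant([1, 2, 1, 2], 32))       # both halves equal: splat only
print(lower_constant([0] * 15 + [0xFF], 8))   # halves differ: splat + replace_lane
```

Whatever the lane width, the result is at most two instructions, compared with one replace_lane per lane in the older lowering.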
David Sherwood b0ce9f0f4c [SVE][CodeGen] Fix implicit TypeSize->uint64_t casts in TypePromotion
The TypePromotion pass only operates on scalar types so I've fixed up
all places where we were relying upon the implicit cast from
TypeSize->uint64_t.

Differential Revision: https://reviews.llvm.org/D88575
2020-10-02 08:12:11 +01:00
David Sherwood b8ce6a6756 [SVE][CodeGen] Add new EVT/MVT getFixedSizeInBits() functions
When we know that a particular type is always going to be fixed
width we have so far been writing code like this:

  getSizeInBits().getFixedSize()

Since we are doing this in quite a few places now it seems to make
sense to add a new helper function that allows us to replace
these calls with a single getFixedSizeInBits() call.

Differential Revision: https://reviews.llvm.org/D88649
2020-10-02 07:47:31 +01:00
Martin Storsjö afb4e0f289 [AArch64] Omit SEH directives for the epilogue if none are needed
For these cases, we already omit the prologue directives, if
(!AFI->hasStackFrame() && !windowsRequiresStackProbe && !NumBytes).

When writing the epilogue (after the prolog has been written), if
the function doesn't have the WinCFI flag set (i.e. if no prologue
was generated), assume that no epilogue will be needed either,
and don't emit any epilog start pseudo instruction. After completing
the epilogue, make sure that it actually matched the prologue.

Previously, when epilogue start/end was generated, but no prologue,
the unwind info for such functions actually was huge; 12 bytes xdata
(4 bytes header, 4 bytes for one non-folded epilogue header, 4 bytes
for padded opcodes) and 8 bytes pdata. Because the epilog consisted of
one opcode (end) but the prolog was empty (no .seh_endprologue), the
epilogue couldn't be folded into the prologue, and thus couldn't be
considered for packed form either.

On a 6.5 MB DLL with 110 KB pdata and 166 KB xdata, this gets rid of
38 KB pdata and 62 KB xdata.

Differential Revision: https://reviews.llvm.org/D88641
2020-10-02 09:12:56 +03:00
Stephen Neuendorffer 47df8c57e4 [MLIR] Updates around MemRef Normalization
The documentation for the NormalizeMemRefs pass and the associated MemRefsNormalizable
traits was confusing and not on the website.  This update clarifies the language
around the difference between a MemRef Type, an operation that accesses the value of
MemRef Type, and better documents the limitations of the current implementation.
This patch also includes some basic debugging information for the pass so people
might have a chance of figuring out why it doesn't work on their code.

Differential Revision: https://reviews.llvm.org/D88532
2020-10-01 21:11:41 -07:00
Max Kazantsev b8ac19cf1c [SCEV] Limited support for unsigned preds in isImpliedViaOperations
The logic there only considers `SLT/SGT` predicates. We can use the same logic
for proving `ULT/UGT` predicates if all involved values are non-negative.

Adding full-scale support for unsigned might be challenging because of code amount,
so we can consider this in the future.

Differential Revision: https://reviews.llvm.org/D88087
Reviewed By: reames
2020-10-02 10:20:57 +07:00
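The reasoning in commit b8ac19cf1c above rests on the fact that signed and unsigned comparisons agree when both operands are non-negative. A brute-force check over 4-bit values illustrates this. The helpers are hypothetical and unrelated to the SCEV API.

```python
BITS = 4
MASK = (1 << BITS) - 1


def to_signed(v):
    """Interpret the low BITS bits of v as a two's-complement integer."""
    v &= MASK
    return v - (1 << BITS) if v >> (BITS - 1) else v


def slt(a, b):
    return to_signed(a) < to_signed(b)


def ult(a, b):
    return (a & MASK) < (b & MASK)


# All bit patterns whose signed interpretation is non-negative.
non_negative = [v for v in range(1 << BITS) if to_signed(v) >= 0]
agree = all(slt(a, b) == ult(a, b) for a in non_negative for b in non_negative)
print(agree)

# With a negative operand the two predicates can disagree.
print(slt(0b1000, 1), ult(0b1000, 1))
```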
Philip Reames f29645e7af [gvn] Handle a corner case w/vectors of non-integral pointers
If we try to coerce a vector of non-integral pointers to a narrower type (either narrower vector or single pointer), we use inttoptr and violate the semantics of non-integral pointers.  In theory, we can handle many of these cases; we just need to use a different code idiom to convert without going through inttoptr and back.

This shows up as wrong code bugs, and in some cases, crashes due to failed asserts.  Modeled after a change which has lived downstream for a couple years, though completely rewritten to be more idiomatic.
2020-10-01 19:20:21 -07:00
Carl Ritson 2ef9d21e1a [AMDGPU] SIInsertSkips: Tidy block splitting to use splitAt
Convert to use new MachineBasicBlock splitAt function.
Place code in splitBlock function for reuse in future changes.
Should yield no functional change.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D88537
2020-10-02 11:10:55 +09:00
Jason Molenda a1e97923a0 Have kernel binary scanner load dSYMs as binary+dSYM if best thing found
lldb's PlatformDarwinKernel scans the local filesystem (well known
locations, plus user-specified directories) for kernels and kexts
when doing kernel debugging, and loads them automatically.  Sometimes
kernel developers want to debug with *only* a dSYM, in which case they
give lldb the DWARF binary + the dSYM as a binary and symbol file.
This patch adds code to lldb to do this automatically if that's the
best thing lldb can find.

A few other bits of cleanup in PlatformDarwinKernel that I undertook
at the same time:

1. Remove the 'platform.plugin.darwin-kernel.search-locally-for-kexts'
setting.  When I added the local filesystem index at start of kernel
debugging, I thought people might object to the cost of the search
and want a way to disable it.  No one has.

2. Change the behavior of
'plugin.dynamic-loader.darwin-kernel.load-kexts' setting so it does
not disable the local filesystem scan, or use of the local filesystem
binaries.

3. Split PlatformDarwinKernel::GetSharedModule into GetSharedModuleKext and
GetSharedModuleKernel for easier readability & maintenance.

4. Added accounting of .dSYM.yaa files (an archive format akin to tar)
that I come across during the scan.  I'm not using these for now; it
would be very expensive to expand the archives & see if the UUID matches
what I'm searching for.

<rdar://problem/69774993>
Differential Revision: https://reviews.llvm.org/D88632
2020-10-01 18:55:37 -07:00
Carl Ritson 5136f4748a CodeGen: Fix livein calculation in MachineBasicBlock splitAt
Fix and simplify computation of liveins for new block.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D88535
2020-10-02 10:45:04 +09:00
Esme-Yi c4690b0077 [PowerPC] Put the CR field in low bits of GRC during copying CRRC to GRC.
Summary: Currently we copy the CRRC to GRC using a single MFOCRF, which copies the contents of CR field n (CR bits 4×n+32:4×n+35) into bits 4×n+32:4×n+35 of register GRC. That’s not correct because we expect the value of the destination register to equal the source, so we have to put the contents of the CR field in the lowest 4 bits. This patch adds a RLWINM after MFOCRF to achieve that.
The problem came up when adding builtins for xvtdivdp, xvtdivsp, xvtsqrtdp, xvtsqrtsp, as posted in D88278. We need to move the outputs (in CR register) to GRC. However, the outputs of these instructions may not be in a fixed CR# register, so we can’t directly add a rotation instruction in the .td patterns, but need to wait until the CR register is determined. We then confirmed that this is a bug in POST-RA PSEUDO PASS.

Reviewed By: nemanjai, shchenz

Differential Revision: https://reviews.llvm.org/D88274
2020-10-02 01:26:18 +00:00
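The bit movement described in commit c4690b0077 above can be modelled on a 32-bit value. This is a sketch under assumptions: only the low 32 bits of the register are modelled, bits outside the copied field are treated as zero, and the rotate amount is derived here rather than taken from the patch. The function names are hypothetical.

```python
def rotl32(value, amount):
    """Rotate a 32-bit value left by amount bits."""
    amount %= 32
    return ((value << amount) | (value >> (32 - amount))) & 0xFFFFFFFF


def mfocrf(cr, n):
    """Model MFOCRF: keep CR field n in place (other bits modelled as zero).

    Field 0 is the most significant nibble of the 32-bit CR.
    """
    shift = 28 - 4 * n
    return cr & (0xF << shift)


def cr_field_to_low_bits(cr, n):
    """Model MFOCRF followed by a rotate-and-mask, as RLWINM would perform."""
    in_place = mfocrf(cr, n)
    return rotl32(in_place, 4 * (n + 1)) & 0xF


cr = 0x12345678
print([cr_field_to_low_bits(cr, n) for n in range(8)])
```

Without the rotate, the value read for field n would still sit at bit offset 28 - 4n, which is the mismatch the commit describes.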
Joseph Huber 82453e759c [OpenMP] Add Missing Runtime Call for Globalization Remarks
Summary:
Add a missing runtime call to perform data globalization checks.

Reviewers: jdoerfert

Subscribers: guansong hiraditya llvm-commits sstefan1 yaxunl

Tags: #LLVM #OpenMP

Differential Revision: https://reviews.llvm.org/D88621
2020-10-01 21:19:53 -04:00
Valentin Clement c1dcb573a8 [flang][openacc] Update loop construct lowering
Update the loop construct lowering to support multiple occurrences of the same clauses
such as private. Add some utility functions used by other constructs.

Upstreaming part of https://github.com/flang-compiler/f18-llvm-project/pull/438/

Reviewed By: schweitz

Differential Revision: https://reviews.llvm.org/D88253
2020-10-01 20:39:04 -04:00
peter klausler 3261aefc72 [flang] Extend runtime API for PAUSE to allow a stop code
Support integer and default character stop codes on PAUSE
statements.  Add length argument to STOP statement with a
character stop code.

Differential revision: https://reviews.llvm.org/D88692
2020-10-01 17:20:11 -07:00
peter klausler a94d943f1a [flang] Fix actions at end of output record
It turns out that unformatted fixed-size output records
do need to be padded out if short, in order to avoid a
spurious EOF crash on a short record at the end of the file.
While here in AdvanceRecord(), move the unformatted
variable-length record header/footer writing code to here
from EndIoStatement().

Differential revision: https://reviews.llvm.org/D88685
2020-10-01 17:18:20 -07:00
jasonliu 78a9e62aa6 [XCOFF] Enable -fdata-sections on AIX
Summary:
Some design decisions worth noting:

I've noticed a recent mailing list discussion about why string literals are
not affected by -fdata-sections for ELF targets:
http://lists.llvm.org/pipermail/llvm-dev/2020-September/145121.html

But on AIX, our linker cannot split mergeable strings like other targets can.
So I think it would make more sense for us to emit a separate csect for
every mergeable string in -fdata-sections mode,
as there might not be other ways for the linker to do garbage collection
on unused mergeable strings.

Reviewed By: daltenty, hubert.reinterpretcast

Differential Revision: https://reviews.llvm.org/D88339
2020-10-02 00:16:24 +00:00
peter klausler 61687f3a48 [flang] Fix buffering read->write transition
The buffer needs to be Reset() after a Flush(), since the
Flush() can be a no-op after a read->write transition.
And record numbers are 1-based, not 0-based.
This fixes a bug with rewrites of records that have been
recently read.

Differential revision: https://reviews.llvm.org/D88612
2020-10-01 16:57:38 -07:00
peter klausler 75a5ec1bad [flang][msvc] Rework a MSVC work-around to avoid clang warning
A recent MSVC work-around patch is eliciting unused variable
warnings from clang; package the lambda reference arguments
into a struct to avoid the warning.

Differential revision: https://reviews.llvm.org/D88695
2020-10-01 16:52:30 -07:00
Philip Reames bb0344644a [memcpyopt] Conservatively handle non-integral pointers
If we allow the non-integral pointers to become memset and memcpy, we lose the ability to reason about pointer propagation.  This patch is modeled on changes we've carried downstream for a long time, figured it was worth being equally conservative for other users.  There is room to refine the semantics and handling here if anyone is motivated.
2020-10-01 16:46:56 -07:00
Muhammad Asif Manzoor aab6f7db47 [AArch64][SVE] Add lowering for llvm fabs
Add the functionality to lower fabs for passthru variant

Reviewed By: paulwalker-arm

Differential Revision: https://reviews.llvm.org/D88679
2020-10-01 19:41:25 -04:00
Philip Reames de3cb9548d Fix a bug in memset formation with vectors of non-integral pointers
We were converting the non-integral store into an integer store which is not legal.
2020-10-01 16:11:11 -07:00
Stanislav Mekhanoshin caeb13aba8 [AMDGPU] Allow SOP asm mnemonic to differ
Allows the creation of real SOP1 instructions with
assembler mnemonics that differ from their
pseudo-instruction mnemonics. The default behavior
keeps the mnemonics matching.

Corrects a subtarget label typo in a comment.

Authored By: Joe_Nash

Differential Revision: https://reviews.llvm.org/D88708
2020-10-01 16:00:04 -07:00
peter klausler e99d184d54 [flang] Readability improvement in binary->decimal conversion
Tweak binary->decimal conversions to avoid an integer multiplication
in a hot loop to improve readability and get a minor (~5%) speed-up.
Use native integer division by constants for more readability, too,
since current build compilers seem to optimize it correctly now.
Delete the now needless temporary work-around facility in
Common/unsigned-const-division.h.

Differential revision: https://reviews.llvm.org/D88604
2020-10-01 15:49:27 -07:00
Jessica Paquette 5402d11b1d [GlobalISel][AArch64] Don't emit cset for G_FCMPs feeding into G_BRCONDs
Similar to the FP case in `AArch64TargetLowering::LowerBR_CC`.

Instead of emitting the csets + a tbnz, just emit a compare + bcc
(or two bccs, depending on the condition code)

This improves cases like this: https://godbolt.org/z/v8hebx

This is a 0.1% geomean code size improvement for CTMark at -O3.

Differential Revision: https://reviews.llvm.org/D88624
2020-10-01 15:34:16 -07:00
Jessica Paquette 8e8664e55e [AArch64][GlobalISel] Use emitTestBit in selection for G_BRCOND
Partially refactoring, partially fixing a bug.

- We shouldn't use TB(N)ZX unless the bit number is >= 32
- We can fold more than xor using emitTestBit

Also remove a check which isn't relevant anymore + update tests.

Rename select-brcond-of-not.mir to select-brcond-of-binop.mir, since it now
tests more than just G_XOR.

Differential Revision: https://reviews.llvm.org/D88702
2020-10-01 15:33:34 -07:00
Amara Emerson 017b871502 [AArch64][GlobalISel] Alias rules for G_FCMP to G_ICMP.
No need to be different here for the vast majority of rules.
2020-10-01 15:20:09 -07:00
Amara Emerson e28c5899a2 [AArch64][GlobalISel] Make <8 x s8> integer arithmetic ops legal. 2020-10-01 14:35:21 -07:00
Amara Emerson a97e97faed [AArch64][GlobalISel] Make <8 x s8> shifts legal and add selection support. 2020-10-01 14:21:18 -07:00
Amara Emerson 9a2b3bbc59 Revert "[AArch64][GlobalISel] Make <8 x s8> shifts legal."
Accidentally pushed this.
2020-10-01 14:15:57 -07:00
Amara Emerson 8071c2f5c6 [AArch64][GlobalISel] Make <8 x s8> shifts legal. 2020-10-01 14:10:10 -07:00
Alexandre Ganea 4140f0744f [LLD][COFF] Fix crash with /summary and PCH input files
Before this patch /summary was crashing with some .PCH.OBJ files, because tpiMap[srcIdx++] was reading at the wrong location. When the TpiSource depends on a .PCH.OBJ file, the types should be offset by the previously merged PCH.OBJ set of indices.

Differential Revision: https://reviews.llvm.org/D88678
2020-10-01 17:08:35 -04:00
Raphael Isemann 15ea45f16b [lldb] Skip unique_ptr import-std-module tests on Linux
This seems to fail on ubuntu 18.04.5 with Clang 9 due to:

Error output:
error: Couldn't lookup symbols:
  std::__1::default_delete<int>::operator()(int) const
2020-10-01 23:04:36 +02:00
Amara Emerson 9f6acb1358 [AArch64][GlobalISel] Merge G_SHL, G_ASHR and G_LSHR legalizer rules together.
There's no need for any difference between these.
2020-10-01 14:02:45 -07:00
Arthur Eubanks b29573b672 [gn build] Support building with ThinLTO
Differential Revision: https://reviews.llvm.org/D88584
2020-10-01 13:48:31 -07:00
Aaron Puchert 1c1a810558 libclc: Use find_package to find Python 3 and require it
The script's shebang wants Python 3, so we use FindPython3. The
original code didn't work when an unversioned python was not available.
This is explicitly allowed in PEP 394. ("Distributors may choose to set
the behavior of the python command as follows: python2, python3, not
provide python command, allow python to be configurable by an end user
or a system administrator.")

Also I think it's actually required, so let the configuration fail if we
can't find it.

Lastly remove the shebang, since the script is only run via interpreter
and doesn't have the executable bit set anyway.

Reviewed By: jvesely

Differential Revision: https://reviews.llvm.org/D88366
2020-10-01 22:31:33 +02:00
Amara Emerson 73457536ff [AArch64][GlobalISel] Use custom legalization for G_TRUNC for v8i8 vectors.
Truncating to v8i8 is a case where we want to split the source but also generate
intermediate truncates to reduce the size of the source vector before truncating
down to v8i8. This implements the same strategy as what SelectionDAG does, but
I'm not certain where if anywhere in generic code it should live.

Use it for legalization of v8s8 = G_ICMP v8s32.

Differential Revision: https://reviews.llvm.org/D88191
2020-10-01 13:22:00 -07:00
Amara Emerson 4c265ce665 [AArch64][GlobalISel] Clamp oversize v4s64 G_FPEXT operations. 2020-10-01 13:08:31 -07:00