Commit Graph

184494 Commits

Author SHA1 Message Date
Matt Arsenault 3e45c70288 GlobalISel: Support physical register inputs in patterns
llvm-svn: 371253
2019-09-06 20:32:37 +00:00
Reid Kleckner e0df2dce4c Remove dead .seh_stackalloc parsing method in X86AsmParser
The shared COFF asm parser code handles this directive, since it is
shared with AArch64. Spotted by Alexandre Ganea in review.

llvm-svn: 371251
2019-09-06 20:12:44 +00:00
Matt Arsenault 02eb6a44a8 AMDGPU: Fix typo
llvm-svn: 371249
2019-09-06 20:00:22 +00:00
Puyan Lotfi 5476bd9432 [llvm-ifs] Improving detection of PlatformKind from triple for TBD generation.
It was pointed out that I had hard-coded PlatformKind. This is rectifying that.

Differential Revision: https://reviews.llvm.org/D67255

llvm-svn: 371248
2019-09-06 19:59:59 +00:00
Sean Fertile eaf34a983c [PowerPC][XCOFF] Remove basic test. [NFC]
Test verified that we could compile an empty module and produce an XCOFF
object file. Newer tests superssed this coverage, its safe to remove.

llvm-svn: 371247
2019-09-06 19:55:44 +00:00
Evandro Menezes 88a98ea3f7 [ConstantFolding] Add new test cases for transcendentals (NFC)
llvm-svn: 371246
2019-09-06 19:41:49 +00:00
Lang Hames c1105111b3 [ORC] Make sure RPC channel-send is called in blocking calls and responses.
ORC-RPC batches calls by default, and the channel's send method must be called
to transfer any buffered calls to the remote. The call to send was missing on
responses and blocking calls in the SingleThreadedRPCEndpoint. This patch adds
the necessary calls and modifies the RPC unit test to check for them.

llvm-svn: 371245
2019-09-06 19:21:59 +00:00
Lang Hames 335676ee62 [llvm-jitlink] Add optional slab allocator for testing locality optimizations.
The llvm-jitlink utility now accepts a '-slab-allocate <size>' option. If given,
llvm-jitlink will use a slab-based memory manager rather than the default
InProcessMemoryManager. Using a slab allocator will allow reliable testing of
future locality based optimizations (e.g. PLT and GOT elimination) in JITLink.

The <size> argument is a number, optionally followed by a units specifier (Kb,
Mb, or Gb). If the units are not given then the number is assumed to be in Kb.

llvm-svn: 371244
2019-09-06 19:21:55 +00:00
Craig Topper 7bb433c87b [X86] Use MOVSX by default instead of CBW to extend i8 to AX for i8 sdivrem.
We can use a MOVSX16 here then rely on FixupBWInst to change to
MOVSX32 if the upper bits are dead. With a special case to
not promote if it could be turned into CBW.

Then we can rely on X86MCInstLower to turn the MOVSX into CBW
very late if register allocation worked out.

Using MOVSX gives an opportunity to use the MOVSX as a both a
copy and a sign extend since the input and output register aren't
tied together.

Differential Revision: https://reviews.llvm.org/D67192

llvm-svn: 371243
2019-09-06 19:17:02 +00:00
Craig Topper 22b35c4291 [X86] Use MOVZX16rr8/MOVZXrm8 when extending input for i8 udivrem.
We can rely on X86FixupBWInsts to turn these into MOVZX32. This
simplifies a follow up commit to use MOVSX for i8 sdivrem with
a late optimization to use CBW when register allocation works out.

llvm-svn: 371242
2019-09-06 19:15:04 +00:00
Craig Topper 0364d89b6d [X86] Teach FixupBWInsts to turn MOVSX16rr8/MOVZX16rr8/MOVSX16rm8/MOVZX16rm8 into their 32-bit dest equivalents when the upper part of the register is dead.
llvm-svn: 371240
2019-09-06 19:14:49 +00:00
Sean Fertile 74966aca35 [PowerPC][XCOFF] Verify symbol table in xcoff object files. [NFC]
Extend the common/local-common testing for object files to also verify the
symbol table now that the needed functionality has landed in llvm-readobj.

Differential Revision: https://reviews.llvm.org/D66944

llvm-svn: 371237
2019-09-06 18:56:14 +00:00
Evandro Menezes 7feb812ccd [ConstantFolding] Refactor functions not available before C99 (NFC)
Note the cases when calling a function at compile time may fail if the host
does not support the C99 run time library.

llvm-svn: 371236
2019-09-06 18:24:21 +00:00
Kevin P. Neal fab40fce3f [FPEnv] Teach the IRBuilder about constrained FPToSI and FPToUI.
The IRBuilder doesn't know that the two floating point to integer instructions
have constrained equivalents. This patch adds the support by building on
the strict FP mode now present in the IRBuilder.

Reviewed by:	John McCall
Approved by:	John McCall
Differential Revision:	https://reviews.llvm.org/D67291

llvm-svn: 371235
2019-09-06 18:04:34 +00:00
Francis Visoiu Mistrih e14c0c5ae0 [Remarks] Add support for internalizing a remark in a string table
In order to keep remarks around, we need to make them tied to a string
table.

Users then can delete the parser and rely on the string table to keep
the memory of the strings alive and deduplicated.

llvm-svn: 371233
2019-09-06 17:22:51 +00:00
Oliver Cruickshank a050307c05 [ARM] Add patterns for VSUB with q and r registers
Added patterns for VSUB to support q and r registers, which reduces
pressure on q registers.

llvm-svn: 371231
2019-09-06 17:02:42 +00:00
Oliver Cruickshank 3aed95af4e [ARM] Add patterns for VADD with q and r registers
Added support for VADD to use q and r registers, which reduces pressure
on q registers.

llvm-svn: 371230
2019-09-06 17:02:35 +00:00
Oliver Cruickshank 9bf27928e1 [ARM] Add patterns for VMUL with q and r registers
Added support for VMUL to use an r register, this reduces pressure on
the q registers.

llvm-svn: 371229
2019-09-06 17:02:21 +00:00
Evandro Menezes 6f1369755d [ConstantFolding] Refactor function match for better speed (NFC)
Use an `enum` instead of string comparison to match the candidate function.

llvm-svn: 371228
2019-09-06 16:49:49 +00:00
Jessica Paquette 121d9114f5 [AArch64][GlobalISel] Always fall back on tail calls with -tailcallopt
-tailcallopt requires that we perform different stack adjustments than with
sibling calls. For example, the `@caller_to0_from8` function in
test/CodeGen/AArch64/tail-call.ll requires that we adjust SP. Without
-tailcallopt, this adjustment does not happen. With it, however, it is expected.

So, to ensure that adding sibling call support doesn't break -tailcallopt,
make CallLowering always fall back on possible tail calls when -tailcallopt
is passed in.

Update test/CodeGen/AArch64/tail-call.ll with a GlobalISel line to make sure
that we don't differ from the SDAG implementation at any point.

Differential Revision: https://reviews.llvm.org/D67245

llvm-svn: 371227
2019-09-06 16:49:13 +00:00
JF Bastien 52614dfc7f [InstCombine] pow(x, +/- 0.0) -> 1.0
Summary:
This isn't an important optimization at all... We're already doing:
  pow(x, 0.0) -> 1.0
My patch merely teaches instcombine that -0.0 does the same.

However, doing this fixes an AMAZING bug! Compile this program:

  extern "C" double pow(double, double);
  double boom(double base) {
    return pow(base, -0.0);
  }

With:
  clang++ ~/Desktop/fast-math.cpp -ffast-math -O2 -S

And clang will crash with a signal. Wow, fast math is so fast it ICEs the
compiler! Arguably, the generated math is infinitely fast.

What's actually happening is that we recurse infinitely in getPow. In debug we
hit its assertion:
  assert(Exp != 0 && "Incorrect exponent 0 not handled");

We avoid this entire mess if we instead recognize that an exponent of positive
and negative zero yield 1.0.

A separate commit, r371221, fixed the same problem. This only contains the added
tests.

<rdar://problem/54598300>

Reviewers: scanon

Subscribers: hiraditya, jkorous, dexonsmith, ributzka, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D67248

llvm-svn: 371224
2019-09-06 16:26:59 +00:00
Sanjay Patel 4f0e429acc [SimplifyLibCalls] handle pow(x,-0.0) before it can assert (PR43233)
https://bugs.llvm.org/show_bug.cgi?id=43233

llvm-svn: 371221
2019-09-06 16:10:18 +00:00
Sam Tebbs f1cdd95a2f [ARM] Sink add/mul(shufflevector(insertelement())) for MVE instruction selection
This patch sinks add/mul(shufflevector(insertelement())) into the basic block in which they are used so that they can then be selected together.

This is useful for various MVE instructions, such as vmla and others that take R registers.

Loop tests have been added to the vmla test file to make sure vmlas are generated in loops.

Differential revision: https://reviews.llvm.org/D66295

llvm-svn: 371218
2019-09-06 16:01:32 +00:00
Valery Pykhtin e8ade89bb3 [AMDGPU] Enable constant offset promotion to immediate operand for VMEM stores
Differential revision: https://reviews.llvm.org/D66958

llvm-svn: 371214
2019-09-06 15:33:53 +00:00
Guillaume Chatelet ad1cea0dda [Alignment][NFC] Use Align with TargetLowering::setPrefFunctionAlignment
Summary:
This is patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Reviewers: courbet

Subscribers: nemanjai, javed.absar, hiraditya, kbarton, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, rogfer01, MartinMosbeck, brucehoult, the_o, PkmX, jocewei, jsji, s.egerton, pzheng, ychen, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D67267

llvm-svn: 371212
2019-09-06 15:03:49 +00:00
Cyndy Ishida 4f8d005831 [Object] remove struct constructor, NFC
Summary: make POD struct by removing ctors

Reviewers: avl, dblaikie

Reviewed By: dblaikie

Subscribers: ributzka, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D67251

llvm-svn: 371211
2019-09-06 15:02:22 +00:00
Guillaume Chatelet 9fcf066d0c [Alignment][NFC] Use Align with TargetLowering::setPrefLoopAlignment
Summary:
This is patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Reviewers: courbet

Subscribers: nemanjai, hiraditya, kbarton, MaskRay, jsji, ychen, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D67278

llvm-svn: 371210
2019-09-06 14:51:15 +00:00
Guillaume Chatelet 5d870c2ec0 [Alignment] fix dubious min function alignment
Summary:
This was discovered while introducing the llvm::Align type.
The original setMinFunctionAlignment used to take alignment as log2, looking at the comment it seems like instructions are to be 2-bytes aligned and not 4-bytes aligned.

Reviewers: uweigand

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D67271

llvm-svn: 371204
2019-09-06 13:54:09 +00:00
George Rimar edfd276cbc [llvm-readelf] - Print unknown st_other value if present in GNU output.
This is a fix for https://bugs.llvm.org/show_bug.cgi?id=40785.

llvm-readelf does not print the st_value of the symbol when
st_value has any non-visibility bits set.

This patch:

* Aligns "Ndx" row for the default and a new cases.
(it was 1 space character off for the case when "PROTECTED" visibility was printed)

* Prints "[<other>: 0x??]" for symbols which has an additional st_other bits set.
In compare with GNU, this logic is a bit simpler and seems to be more consistent.

For MIPS GNU can print named flags, though can't print a mix of them:
0: 00000000 0 NOTYPE LOCAL DEFAULT UND 
1: 00000000 0 NOTYPE GLOBAL DEFAULT [OPTIONAL] UND a1
2: 00000000 0 NOTYPE GLOBAL DEFAULT [MIPS PLT] UND a2
3: 00000000 0 NOTYPE GLOBAL DEFAULT [MIPS PIC] UND a3
4: 00000000 0 NOTYPE GLOBAL DEFAULT [MICROMIPS] UND a4
5: 00000000 0 NOTYPE GLOBAL DEFAULT [MIPS16] UND a5
6: 00000000 0 NOTYPE GLOBAL DEFAULT [<other>: c] UND b1
7: 00000000 0 NOTYPE GLOBAL DEFAULT [<other>: 28] UND b2

On PPC64 it can print a localentry value that is encoded in the high bits of st_other
63: 0000000000000850 208 FUNC GLOBAL DEFAULT [<localentry>: 8] 12

We chose to print the raw st_other field, prefixed with '0x'.

Differential revision: https://reviews.llvm.org/D67094

llvm-svn: 371201
2019-09-06 13:05:34 +00:00
Guillaume Chatelet 4fc3ad9e13 [Alignment][NFC] Use Align with TargetLowering::setMinFunctionAlignment
Summary:
This is patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Reviewers: courbet

Subscribers: jyknight, sdardis, nemanjai, javed.absar, hiraditya, kbarton, fedor.sergeev, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, PkmX, jocewei, jsji, s.egerton, pzheng, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D67229

llvm-svn: 371200
2019-09-06 12:48:34 +00:00
Djordje Todorovic d409408e31 [test] Update the name of the debug entry values option. NFC
llvm-svn: 371199
2019-09-06 12:23:37 +00:00
James Molloy db2fa06722 [DFAPacketizer] Track resources for packetized instructions
This patch allows the DFAPacketizer to be queried after a packet is formed to work out which
resources were allocated to the packetized instructions.

This is particularly important for targets that do their own bundle packing - it's not
sufficient to know simply that instructions can share a packet; which slots are used is
also required for encoding.

This extends the emitter to emit a side-table containing resource usage diffs for each
state transition. The packetizer maintains a set of all possible resource states in its
current state. After packetization is complete, all remaining resource states are
possible packetization strategies.

The sidetable is only ~500K for Hexagon, but the extra tracking is disabled by default
(most uses of the packetizer like MachinePipeliner don't care and don't need the extra
maintained state).

Differential Revision: https://reviews.llvm.org/D66936

llvm-svn: 371198
2019-09-06 12:20:08 +00:00
Jeremy Morse 5d9cd3b4ca [DebugInfo] LiveDebugValues: explicitly terminate overwritten stack locations
If a stack spill location is overwritten by another spill instruction,
any variable locations pointing at that slot should be terminated. We
cannot rely on spills always being restored to registers or variable
locations being moved by a DBG_VALUE: the register allocator is entitled
to spill a value and then forget about it when it goes out of liveness.

To address this, scan for memory writes to spill locations, even those we
don't consider to be normal "spills". isSpillInstruction and
isLocationSpill distinguish the two now. After identifying spill
overwrites, terminate the open range, and insert a $noreg DBG_VALUE for
that variable.

Differential Revision: https://reviews.llvm.org/D66941

llvm-svn: 371193
2019-09-06 10:08:22 +00:00
Jay Foad 6c0204c794 [AMDGPU] Mark s_barrier as having side effects but not accessing memory.
Summary:
This fixes poor scheduling in a function containing a barrier and a few
load instructions.

Without this fix, ScheduleDAGInstrs::buildSchedGraph adds an artificial
edge in the dependency graph from the barrier instruction to the exit
node representing live-out latency, with a latency of about 500 cycles.
Because of this it thinks the critical path through the graph also has
a latency of about 500 cycles. And because of that it does not think
that any of the load instructions are on the critical path, so it
schedules them with no regard for their (80 cycle) latency, which gives
poor results.

Reviewers: arsenm, dstuttard, tpr, nhaehnle

Subscribers: kzhuravl, jvesely, wdng, yaxunl, t-tye, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D67218

llvm-svn: 371192
2019-09-06 10:07:28 +00:00
Nico Weber 68df9dc098 gn build: Merge r371182
llvm-svn: 371191
2019-09-06 09:44:13 +00:00
Nico Weber 3dbb5c7e88 gn build: Merge r371179
llvm-svn: 371190
2019-09-06 09:44:10 +00:00
Sam Parker 29bf68fcfa [ARM] Fix for buildbot
llvm-svn: 371187
2019-09-06 09:36:23 +00:00
Fangrui Song d20c41dd31 [yaml2obj] Rename SHOffset (e_shoff) field to SHOff. NFC
`struct Elf*_Shdr` has a field `sh_offset`, named `ShOffset` in
llvm::ELFYAML::Section. Rename SHOffset (e_shoff) to SHOff to prevent confusion.

Reviewed By: grimar

Differential Revision: https://reviews.llvm.org/D67254

llvm-svn: 371185
2019-09-06 09:23:17 +00:00
Sam Parker 312409e464 [ARM] MVE Tail Predication
The MVE and LOB extensions of Armv8.1m can be combined to enable
'tail predication' which removes the need for a scalar remainder
loop after vectorization. Lane predication is performed implicitly
via a system register. The effects of predication is described in
Section B5.6.3 of the Armv8.1-m Arch Reference Manual, the key points
being:
- For vector operations that perform reduction across the vector and
  produce a scalar result, whether the value is accumulated or not.
- For non-load instructions, the predicate flags determine if the
  destination register byte is updated with the new value or if the
  previous value is preserved.
- For vector store instructions, whether the store occurs or not.
- For vector load instructions, whether the value that is loaded or
  whether zeros are written to that element of the destination
  register.

This patch implements a pass that takes a hardware loop, containing
masked vector instructions, and converts it something that resembles
an MVE tail predicated loop. Currently, if we had code generation,
we'd generate a loop in which the VCTP would generate the predicate
and VPST would then setup the value of VPR.PO. The loads and stores
would be placed in VPT blocks so this is not tail predication, but
normal VPT predication with the predicate based upon a element
counting induction variable. Further work needs to be done to finally
produce a true tail predicated loop.

Because only the loads and stores are predicated, in both the LLVM IR
and MIR level, we will restrict support to only lane-wise operations
(no horizontal reductions). We will perform a final check on MIR
during loop finalisation too.

Another restriction, specific to MVE, is that all the vector
instructions need operate on the same number of elements. This is
because predication is performed at the byte level and this is set
on entry to the loop, or by the VCTP instead.

Differential Revision: https://reviews.llvm.org/D65884

llvm-svn: 371179
2019-09-06 08:24:41 +00:00
Kang Zhang f879c68755 [CodeGen] Do the Simple Early Return in block-placement pass to optimize the blocks
Summary:

Fix a bug of not update the jump table and recommit it again.

In `block-placement` pass, it will create some patterns for unconditional we can do the simple early retrun.
But the `early-ret` pass is before `block-placement`, we don't want to run it again.
This patch is to do the simple early return to optimize the blocks at the last of `block-placement`.

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D63972

llvm-svn: 371177
2019-09-06 08:16:18 +00:00
David Zarzycki 412a8d7a83 [CMake] LLVM_COMPILE_FLAGS also applies to C files
LLVM_COMPILE_FLAGS also applies to C files, otherwise tuning flags,
etc. won't be picked up.

https://reviews.llvm.org/D67171

llvm-svn: 371173
2019-09-06 07:12:36 +00:00
Mikael Holmen dee0702b2a [MIR] Change test case to read from stdin instead of file
The

    ;CHECK: bb
    ;CHECK-NEXT: %namedVReg1353:_(p0) = COPY $d0

parts of the test case failed when the tests were placed in a directory
including "bb" in the path, since the full path of the file is then
output in the
 ; ModuleID = '/repo/bb/
line which the CHECK matched on and then the CHECK-NEXT failed.

llvm-svn: 371171
2019-09-06 06:55:54 +00:00
Craig Topper 463c8e5eeb [X86] Add tests for extending and truncating between v16i8 and v16i64 with min-legal-vector-width=256.
It looks like we might be able to do these in fewer steps, but
I'm not sure.

llvm-svn: 371170
2019-09-06 06:02:17 +00:00
Craig Topper 7739fbc9c3 [X86] Fix bad indentation. NFC
llvm-svn: 371167
2019-09-06 05:50:46 +00:00
Alex Brachet dfacf8851e Fix rL371162 again
llvm-svn: 371164
2019-09-06 03:31:42 +00:00
Alex Brachet 27d42af603 Fix failing test from rL371162
llvm-svn: 371163
2019-09-06 02:56:48 +00:00
Alex Brachet 0b69c59656 [yaml2obj] Make e_phoff and e_phentsize 0 if there are no program headers
Summary: It says [[ http://www.sco.com/developers/gabi/latest/ch4.eheader.html | here ]] that if there are no program headers than e_phoff should be 0, but currently it is always set after the header. GNU's `readelf` (but not `llvm-readelf`) complains about this: `readelf: Warning: possibly corrupt ELF header - it has a non-zero program header offset, but no program headers`.

Reviewers: jhenderson, grimar, MaskRay, rupprecht

Reviewed By: jhenderson, grimar, MaskRay

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D67054

llvm-svn: 371162
2019-09-06 02:27:55 +00:00
Nico Weber b1cf175271 gn build: Merge r371159
llvm-svn: 371161
2019-09-06 01:22:13 +00:00
Jonas Devlieghere bee0f7ddd7 [MC] Fix undefined behavior in MCInstPrinter::formatHex
Passing INT64_MIN to MCInstPrinter::formatHex triggers undefined
behavior because the negation of -9223372036854775808 cannot be
represented in type 'int64_t' (aka 'long long'). This patch puts a
workaround in place to just print the hex value directly.

A possible alternative involves using a small helper functions that uses
(implementation) defined conversions to achieve the desirable value:

  static int64_t helper(int64_t V) {
    auto U = static_cast<uint64_t>(V);
    return V < 0 ? -U : U;
  }

The underlying problem is that MCInstPrinter::formatHex(int64_t) returns
a format_object<int64_t> and should really return a
format_object<uint64_t>. However, that's not possible because formatImm
needs to be able to print both as decimal (where a signed is required)
and hex (where we'd prefer to always have an unsigned).

  format_object<int64_t> formatImm(int64_t Value) const {
    return PrintImmHex ? formatHex(Value) : formatDec(Value);
  }

Differential revision: https://reviews.llvm.org/D67236

llvm-svn: 371159
2019-09-06 01:13:32 +00:00
Alina Sbirlea 57fcb1d7fc Cleanup test.
llvm-svn: 371158
2019-09-06 00:58:03 +00:00