Commit Graph

3354 Commits

Author SHA1 Message Date
Sanjay Patel 6124cae8f7 [DAGCombine] (float)((int) f) --> ftrunc (PR36617)
fptosi / fptoui round towards zero, and that's the same behavior as ISD::FTRUNC, 
so replace a pair of casts with the equivalent node. We don't have to account for 
special cases (NaN, INF) because out-of-range casts are undefined.

Differential Revision: https://reviews.llvm.org/D44909

llvm-svn: 328921
2018-03-31 17:55:44 +00:00
Christof Douma a1e77c0e02 [ARM] Support float literals under XO
Follow up patch of r328313 to support the UseVMOVSR constraint. Removed
some unneeded instructions from the test and removed some stray
comments.

Differential Revision: https://reviews.llvm.org/D44941

llvm-svn: 328691
2018-03-28 10:02:26 +00:00
Artur Pilipenko ca1d849cd6 Fix a reoccuring typo in load-combine tests
%tmp = bitcast i32* %arg to i8*
   %tmp1 = getelementptr inbounds i8, i8* %tmp, i32 0
-  %tmp2 = load i8, i8* %tmp, align 1
+  %tmp2 = load i8, i8* %tmp1, align 1

This doesn't change the semantics of the tests but makes use of %tmp1 which was originally intended.

llvm-svn: 328642
2018-03-27 17:33:50 +00:00
Krzysztof Parzyszek 52396bb9c5 Use .set instead of = when printing assignment in assembly output
On Hexagon "x = y" is a syntax used in most instructions, and is not
treated as a directive.

Differential Revision: https://reviews.llvm.org/D44256

llvm-svn: 328635
2018-03-27 16:44:41 +00:00
Rafael Espindola 78fdca3cd5 Use local symbols for creating .stack-size.
llvm-svn: 328581
2018-03-26 20:40:22 +00:00
Christof Douma 4a025cc79d [ARM] Support float literals under XO
When targeting execute-only and fp-armv8, float constants in a compare
resulted in instruction selection failures. This is now fixed by using
vmov.f32 where possible, otherwise the floating point constant is
lowered into a integer constant that is moved into a floating point
register.

This patch also restores using fpcmp with immediate 0 under fp-armv8.

Change-Id: Ie87229706f4ed879a0c0cf66631b6047ed6c6443
llvm-svn: 328313
2018-03-23 13:02:03 +00:00
Rafael Espindola d947977e87 Run dos2unix on a test. NFC.
llvm-svn: 327934
2018-03-20 01:06:29 +00:00
Aaron Smith 6738960588 [SelectionDAG] Transfer DbgValues when integer operations are promoted
Summary:
DbgValue nodes were not transferred when integer DAG nodes were promoted. For example, if an i32 add node was promoted to an i64 add node by DAGTypeLegalizer::PromoteIntegerResult(), its DbgValue node was not transferred to the new node. The simple fix is to update SetPromotedInteger() to transfer DbgValues. 

Add AArch64/dbg-value-i8.ll to test this change and fix ARM/debug-info-d16-reg.ll which had the wrong DILocalVariable nodes with arg numbers even though they are not for function parameters.

Patch by Se Jong Oh!

Reviewers: vsk, JDevlieghere, aprantl

Reviewed By: JDevlieghere

Subscribers: javed.absar, kristof.beyls, llvm-commits

Tags: #debug-info

Differential Revision: https://reviews.llvm.org/D44546

llvm-svn: 327919
2018-03-19 22:58:50 +00:00
Martin Storsjo 9a55c1b0dc [ARM, AArch64] Check the no-stack-arg-probe attribute for dynamic stack probes
This extends the use of this attribute on ARM and AArch64 from
SVN r325900 (where it was only checked for fixed stack
allocations on ARM/AArch64, but for all stack allocations on X86).

This also adds a testcase for the existing use of disabling the
fixed stack probe with the attribute on ARM and AArch64.

Differential Revision: https://reviews.llvm.org/D44291

llvm-svn: 327897
2018-03-19 20:06:50 +00:00
Sjoerd Meijer d16037d9bb [ARM] Support for v4f16 and v8f16 vectors
This is the groundwork for adding the Armv8.2-A FP16 vector intrinsics, which
uses v4f16 and v8f16 vector operands and return values. All the moving parts
are tested with two intrinsics, a 1-operand v8f16 and a 2-operand v4f16
intrinsic. In a follow-up patch the rest of the intrinsics and tests will be
added.

Differential Revision: https://reviews.llvm.org/D44538

llvm-svn: 327839
2018-03-19 13:35:25 +00:00
Sjoerd Meijer d391a1a985 [ARM] FP16 codegen support for VSEL
This implements lowering of SELECT_CC for f16s, which enables
codegen of VSEL with f16 types.

Differential Revision: https://reviews.llvm.org/D44518

llvm-svn: 327695
2018-03-16 08:06:25 +00:00
Craig Topper 1b8cf49704 [SelectionDAG][ARM][X86] Teach PromoteIntRes_SETCC to do a better job picking the result type for the setcc.
Previously if getSetccResultType returned an illegal type we just fell back to using the default promoted type. This appears to have been to handle the case where for vectors getSetccResultType returns the input type, but the input type itself isn't legal and will need to be promoted. Without the legality check we would never reach a legal type.

But just picking the promoted type to be the setcc type can create strange setccs where the result type is 128 bits and the operand type is 256 bits. If for example the result type was promoted to v8i16 from v8i1, but the input type was promoted from v8i23 to v8i32. We currently handle this with custom lowering code in X86.

This legality check also caused us reject the getSetccResultType when the input type needed to be widened or split. Even though that result wouldn't have caused legalization to get stuck.

This patch tries to fix this by detecting the getSetccResultType needs to be promoted. If its input type also needs to be promoted we'll try a ask for a new setcc result type based on its eventual promoted value. Otherwise we fall back to default type to promote to.

For any other illegal values we might get back from the initial call to getSetccResultType we just keep and allow it to be re-legalized later via splitting or widening or scalarizing.

llvm-svn: 327683
2018-03-15 23:04:11 +00:00
Reid Kleckner 3a7a2e4a0a [FastISel] Sink local value materializations to first use
Summary:
Local values are constants, global addresses, and stack addresses that
can't be folded into the instruction that uses them. For example, when
storing the address of a global variable into memory, we need to
materialize that address into a register.

FastISel doesn't want to materialize any given local value more than
once, so it generates all local value materialization code at
EmitStartPt, which always dominates the current insertion point. This
allows it to maintain a map of local value registers, and it knows that
the local value area will always dominate the current insertion point.

The downside is that local value instructions are always emitted without
a source location. This is done to prevent jumpy line tables, but it
means that the local value area will be considered part of the previous
statement. Consider this C code:
  call1();      // line 1
  ++global;     // line 2
  ++global;     // line 3
  call2(&global, &local); // line 4

Today we end up with assembly and line tables like this:
  .loc 1 1
  callq call1
  leaq global(%rip), %rdi
  leaq local(%rsp), %rsi
  .loc 1 2
  addq $1, global(%rip)
  .loc 1 3
  addq $1, global(%rip)
  .loc 1 4
  callq call2

The LEA instructions in the local value area have no source location and
are treated as being on line 1. Stepping through the code in a debugger
and correlating it with the assembly won't make much sense, because
these materializations are only required for line 4.

This is actually problematic for the VS debugger "set next statement"
feature, which effectively assumes that there are no registers live
across statement boundaries. By sinking the local value code into the
statement and fixing up the source location, we can make that feature
work. This was filed as https://bugs.llvm.org/show_bug.cgi?id=35975 and
https://crbug.com/793819.

This change is obviously not enough to make this feature work reliably
in all cases, but I felt that it was worth doing anyway because it
usually generates smaller, more comprehensible -O0 code. I measured a
0.12% regression in code generation time with LLC on the sqlite3
amalgamation, so I think this is worth doing.

There are some special cases worth calling out in the commit message:
1. local values materialized for phis
2. local values used by no-op casts
3. dead local value code

Local values can be materialized for phis, and this does not show up as
a vreg use in MachineRegisterInfo. In this case, if there are no other
uses, this patch sinks the value to the first terminator, EH label, or
the end of the BB if nothing else exists.

Local values may also be used by no-op casts, which adds the register to
the RegFixups table. Without reversing the RegFixups map direction, we
don't have enough information to sink these instructions.

Lastly, if the local value register has no other uses, we can delete it.
This comes up when fastisel tries two instruction selection approaches
and the first materializes the value but fails and the second succeeds
without using the local value.

Reviewers: aprantl, dblaikie, qcolombet, MatzeB, vsk, echristo

Subscribers: dotdash, chandlerc, hans, sdardis, amccarth, javed.absar, zturner, llvm-commits, hiraditya

Differential Revision: https://reviews.llvm.org/D43093

llvm-svn: 327581
2018-03-14 21:54:21 +00:00
Francis Visoiu Mistrih e85b06d65f [CodeGen] Use MIR syntax for MachineMemOperand printing
Get rid of the "; mem:" suffix and use the one we use in MIR: ":: (load 2)".

rdar://38163529

Differential Revision: https://reviews.llvm.org/D42377

llvm-svn: 327580
2018-03-14 21:52:13 +00:00
Arnold Schwaighofer bf1638daa8 SjLjEHPrepare: Don't reg-to-mem swifterror values
swifterror llvm values model the swifterror register as memory at the
LLVM IR level. ISel will perform adhoc mem-to-reg on them. swifterror
values are constraint in how they can be used. Spilling them to memory
is not allowed.

SjLjEHPrepare tried to lower swifterror values to memory which is
unecessary since the back-end will spill and reload the register as
neccessary (as long as clobbering calls are marked as such which is the
case here) and further leads to invalid IR because swifterror values
can't be stored to memory.

rdar://38164004

llvm-svn: 327521
2018-03-14 15:44:07 +00:00
Sjoerd Meijer af30f06d5c [ARM] Fix for PR36577
Don't PerformSHLSimplify if the given node is used by a node that also uses a
constant because we may get stuck in an infinite combine loop.

bugzilla: https://bugs.llvm.org/show_bug.cgi?id=36577

Patch by Sam Parker.

Differential Revision: https://reviews.llvm.org/D44097

llvm-svn: 326882
2018-03-07 09:10:44 +00:00
Florian Hahn 9deef20b6c [ARM] Fix codegen for VLD3/VLD4/VST3/VST4 with WB
Code generation of VLD3, VLD4, VST3 and VST4 with register writeback is
broken due to 2 separate bugs:

1) VLD1d64TPseudoWB_register and VLD1d64QPseudoWB_register are missing
   rules to expand them to non pseudo MIR. These are selected for
   ARMISD::VLD3_UPD/VLD4_UPD with v1i64 vectors in SelectVLD.

2) Selection of the right VLD/VST instruction is broken for load and
   store of 3 and 4 v1i64 vectors. SelectVLD and SelectVST are called
   with MIR opcode for fixed writeback (ie increment is access size)
   and call getVLDSTRegisterUpdateOpcode() to select an opcode with
   register writeback if base register update is of a different size.
   Since getVLDSTRegisterUpdateOpcode() only knows about
   VLD1/VLD2/VST1/VST2 the call is currently conditional on the number
   of element in the vector.

   However, VLD1/VST1 is selected by SelectVLD/SelectVST's caller for
   load and stores of 3 or 4 v1i64 vectors. Therefore the opcode is not
   updated which later lead to a fixed writeback instruction being
   constructed with an extra operand for the register writeback.

This patch addresses the two issues as follows:
- it adds the necessary mapping from VLD1d64TPseudoWB_register and
  VLD1d64QPseudoWB_register to VLD1d64Twb_register and
  VLD1d64Qwb_register respectively. Like for the existing _fixed
  variants, the cost of these is bumped for unaligned access.
- it changes the logic in SelectVLD and SelectVSD to call isVLDfixed
  and isVSTfixed respectively to decide whether the opcode should be
  updated. It also reworks the logic and comments for pushing the
  writeback offset operand and r0 operand to clarify the logic:
  writeback offset needs to be pushed if it's a register writeback,
  r0 needs to be pushed if not and the instruction is a
  VLD1/VLD2/VST1/VST2.

Reviewers: rengolin, t.p.northover, samparker

Reviewed By: samparker

Patch by Thomas Preud'homme <thomas.preudhomme@arm.com>

Differential Revision: https://reviews.llvm.org/D42970

llvm-svn: 326570
2018-03-02 13:02:55 +00:00
Chih-Hung Hsieh 9f9e4681ac [TLS] use emulated TLS if the target supports only this mode
Emulated TLS is enabled by llc flag -emulated-tls,
which is passed by clang driver.
When llc is called explicitly or from other drivers like LTO,
missing -emulated-tls flag would generate wrong TLS code for targets
that supports only this mode.
Now use useEmulatedTLS() instead of Options.EmulatedTLS to decide whether
emulated TLS code should be generated.
Unit tests are modified to run with and without the -emulated-tls flag.

Differential Revision: https://reviews.llvm.org/D42999

llvm-svn: 326341
2018-02-28 17:48:55 +00:00
Pablo Barrio 512f7ee315 [ARM] Lower lower saturate to 0 and lower saturate to -1 using bit-operations
Summary:
Expressions of the form x < 0 ? 0 :  x; and x < -1 ? -1 : x can be lowered using bit-operations instead of branching or conditional moves

In thumb-mode this results in a two-instruction sequence, a shift followed by a bic or or while in ARM/thumb2 mode that has flexible second operand the shift can be folded into a single bic/or instructions. In most cases this results in smaller code and possibly less branches, and in no case larger than before.

Patch by Martin Svanfeldt

Reviewers: fhahn, pbarrio, rogfer01

Reviewed By: pbarrio, rogfer01

Subscribers: chrib, yroux, eugenis, efriedma, rogfer01, aemerson, javed.absar, kristof.beyls, llvm-commits

Differential Revision: https://reviews.llvm.org/D42574

llvm-svn: 326333
2018-02-28 17:13:07 +00:00
Sjoerd Meijer fc0d02cbbf [ARM] Another f16 litpool fix
We were always setting the block alignment to 2 bytes in Thumb mode
and 4-bytes in ARM mode (r325754, and r325012), but this could cause 
reducing the block alignment when it already had been aligned (e.g. 
in Thumb mode when the block is a CPE that was already 4-byte aligned).

Patch by Momchil Velikov, I've only added a test.

Differential Revision: https://reviews.llvm.org/D43777

llvm-svn: 326232
2018-02-27 19:26:02 +00:00
Geoff Berry a2b9011290 Re-enable "[MachineCopyPropagation] Extend pass to do COPY source forwarding"
Re-enable commit r323991 now that r325931 has been committed to make
MachineOperand::isRenamable() check more conservative w.r.t. code
changes and opt-in on a per-target basis.

llvm-svn: 326208
2018-02-27 16:59:10 +00:00
Francis Visoiu Mistrih e4fae4d5b6 [CodeGen] Don't omit any redundant information in -debug output
In r322867, we introduced IsStandalone when printing MIR in -debug
output. The default behaviour for that was:

1) If any of MBB, MI, or MO are -debug-printed separately, don't omit any
redundant information.

2) When -debug-printing a MF entirely, don't print any redundant
information.

3) When printing MIR, don't print any redundant information.

I'd like to change 2) to:

2) When -debug-printing a MF entirely, don't omit any redundant information.

Differential Revision: https://reviews.llvm.org/D43337

llvm-svn: 326094
2018-02-26 15:23:42 +00:00
Sjoerd Meijer d31a8c0595 Recommit: [ARM] f16 constant pool fix
This recommits r325754; the modified and failing test case
actually didn't need any modifications.

llvm-svn: 325765
2018-02-22 10:43:57 +00:00
Sjoerd Meijer 9a25247f80 Revert r325754 and r325755 (f16 literal pool) because buildbots were unhappy.
llvm-svn: 325756
2018-02-22 08:41:55 +00:00
Sjoerd Meijer f98e32cf53 Added a test that I forgot to svn add in my previous commit r325754.
llvm-svn: 325755
2018-02-22 08:20:50 +00:00
Sjoerd Meijer 7d5909eb0f [ARM] f16 constant pool fix
This is a follow up of r325012, that allowed half types in constant pools.
Proper alignment was enforced when a big basic block was split up, but not when
a CPE was placed before/after a block; the successor block had the wrong
alignment.

Differential Revision: https://reviews.llvm.org/D43580

llvm-svn: 325754
2018-02-22 08:16:05 +00:00
Sjoerd Meijer 4d5c40492a [ARM] Lower BR_CC for f16
This case wasn't handled yet.

Differential Revision: https://reviews.llvm.org/D43508

llvm-svn: 325616
2018-02-20 19:28:05 +00:00
Francis Visoiu Mistrih 7f0f8bb4bd [CodeGen] Fix tests breaking after r325505
llvm-svn: 325512
2018-02-19 15:51:17 +00:00
Sjoerd Meijer c9bde5404a [ARM] Add LLVM tests for the vcvtr builtins
Follow up of Clang commit r325351; this adds the LLVM tests, which
were also missing.

Differential Revision: https://reviews.llvm.org/D43395

llvm-svn: 325443
2018-02-17 19:59:29 +00:00
Quentin Colombet 48abac82b8 Revert "[MachineCopyPropagation] Extend pass to do COPY source forwarding"
This reverts commit r323991.

This commit breaks target that don't model all the register constraints
in TableGen. So far the workaround was to set the
hasExtraXXXRegAllocReq, but it proves that it doesn't cover all the
cases.
For instance, when mutating an instruction (like in the lowering of
COPYs) the isRenamable flag is not properly updated. The same problem
will happen when attaching machine operand from one instruction to
another.

Geoff Berry is working on a fix in https://reviews.llvm.org/D43042.

llvm-svn: 325421
2018-02-17 03:05:33 +00:00
Jonas Paulsson 995ba6e42c [ARM] Return true in enableMultipleCopyHints().
Enable multiple COPY hints to eliminate more COPYs during register allocation.

Note that this is something all targets should do, see
https://reviews.llvm.org/D38128.

Review: Eli Friedman
llvm-svn: 325327
2018-02-16 09:51:01 +00:00
Mikhail Maltsev 0a7e107e77 [LegalizeDAG] Fix legalization of SETCC
Summary:
Currently when expanding a SETCC node into a SELECT_CC, LLVM uses
an incorrect type for determining BooleanContent of the result. This
patch fixes the issue.

Fixes PR36079.

Reviewers: rogfer01, javed.absar, efriedma

Reviewed By: efriedma

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D43282

llvm-svn: 325325
2018-02-16 09:35:16 +00:00
Roger Ferrer Ibanez d41059a9f6 [ARM] Materialise some boolean values to avoid a branch
This patch combines some cases of ARMISD::CMOV for integers that arise in comparisons of the form

  a != b ? x : 0
  a == b ? 0 : x

and that currently (e.g. in Thumb1) are emitted as branches.

Differential Revision: https://reviews.llvm.org/D34515

llvm-svn: 325323
2018-02-16 09:23:59 +00:00
Pablo Barrio fa6f1c0130 [ARM] Fix redirect in inline assembly test
Summary: Fix silly mistake in a test

Reviewers: gkistanova, apilipenko

Subscribers: javed.absar, eraman, kristof.beyls, llvm-commits

Differential Revision: https://reviews.llvm.org/D43342

llvm-svn: 325283
2018-02-15 19:17:55 +00:00
Pablo Barrio e28cb8399a [ARM] Allow 64- and 128-bit types with 't' inline asm constraint
Summary:
In LLVM, 't' selects a floating-point/SIMD register and only supports
32-bit values. This is appropriately documented in the LLVM Language
Reference Manual. However, this behaviour diverges from that of GCC, where
't' selects the s0-s31 registers and its qX and dX variants depending on
additional operand modifiers (q/P).

For example, the following C code:

#include <arm_neon.h>
float32x4_t a, b, x;
asm("vadd.f32 %0, %1, %2" : "=t" (x) : "t" (a), "t" (b))

results in the following assembly if compiled with GCC:

vadd.f32 s0, s0, s1

whereas LLVM will show "error: couldn't allocate output register for
constraint 't'", since a, b, x are 128-bit variables, not 32-bit.

This patch extends the use of 't' to mean that of GCC, thus allowing
selection of the lower Q vector regs and their D/S variants. For example,
the earlier code will now compile as:

vadd.f32 q0, q0, q1

This behaviour still differs from that of GCC but I think it is actually
more correct, since LLVM picks up the right register type based on the
datatype of x, while GCC would need an extra operand modifier to achieve
the same result, as follows:

asm("vadd.f32 %q0, %q1, %q2" : "=t" (x) : "t" (a), "t" (b))

Since this is only an extension of functionality, existing code should not
be affected by this change. Note that operand modifiers q/P are already
supported by LLVM, so this patch should suffice to support inline
assembly with constraint 't' originally built for GCC.

Reviewers: grosbach, rengolin

Reviewed By: rengolin

Subscribers: rogfer01, efriedma, olista01, aemerson, javed.absar, eraman, kristof.beyls, llvm-commits

Differential Revision: https://reviews.llvm.org/D42962

llvm-svn: 325244
2018-02-15 14:44:22 +00:00
Sjoerd Meijer 9430c8cd1c [ARM] f16 vcmp fixes
This adds f16 VCMP match rules and fixes the test cases.

Differential Revision: https://reviews.llvm.org/D43291

llvm-svn: 325228
2018-02-15 10:33:07 +00:00
Sjoerd Meijer 3b4294edd2 [ARM] f16 stack spill/reloads
This adds support for handling f16 stack spills/reloads.

Differential Revision: https://reviews.llvm.org/D43280

llvm-svn: 325130
2018-02-14 15:09:09 +00:00
Francis Visoiu Mistrih f6ed795d0c [CodeGen] Print bundled instructions using the MIR syntax in -debug output
Old syntax:

BUNDLE implicit-def %r0, implicit-def %r1, implicit %r2
* %r0 = SOME_OP %r2
* %r1 = ANOTHER_OP internal %r0

New syntax:

BUNDLE implicit-def %r0, implicit-def %r1, implicit %r2 {
  %r0 = SOME_OP %r2
  %r1 = ANOTHER_OP internal %r0
}

llvm-svn: 325032
2018-02-13 18:08:26 +00:00
Sjoerd Meijer f4a7fa7bbe [ARM] Allow half types in ConstantPool
Change ARMConstantIslandPass to:
- accept f16 literals as litpool entries,
- if the litpool needs to be inserted in the middle of a big block, then we
  need to 4-byte align the next instruction in ARM mode.

Differential Revision: https://reviews.llvm.org/D42784

llvm-svn: 325012
2018-02-13 15:34:09 +00:00
Sjoerd Meijer 101ee43072 [Thumb] Handle addressing mode AddrMode5FP16
This addressing mode wasn't checked, so we were running in an assert.

Differential Revision: https://reviews.llvm.org/D43179

llvm-svn: 324996
2018-02-13 10:29:03 +00:00
Martin Storsjo 9ca8b57186 [GlobalMerge] Allow merging of dllexported variables
If merging them, the dllexport attribute needs to be brought along
to the new GlobalAlias.

Differential Revision: https://reviews.llvm.org/D43192

llvm-svn: 324937
2018-02-12 21:14:21 +00:00
David Green 6d9f8c9817 [CodeGen] Add a -trap-unreachable option for debugging
Add a common -trap-unreachable option, similar to the target
specific hexagon equivalent, which has been replaced. This
turns unreachable instructions into traps, which is useful for
debugging.

Differential Revision: https://reviews.llvm.org/D42965

llvm-svn: 324880
2018-02-12 11:06:27 +00:00
Sanjay Patel 3659e8c95c [ARM] preserve test intent by removing undef
D43141 proposes to correct undef folding in the DAG,
and this test would not survive that change.

llvm-svn: 324814
2018-02-10 15:14:00 +00:00
Rafael Espindola c052fa0bd3 Emit smaller exception tables for non-SJLJ mode.
* Use uleb128 for code offsets in the LSDA call site table.
* Omit the TTBase offset if the type table is empty.

This change can reduce the size of the DWARF/Itanium LSDA by about half.

Patch by Ryan Prichard!

llvm-svn: 324750
2018-02-09 17:13:37 +00:00
Rafael Espindola d09b416943 Use assembler expressions to lay out the EH LSDA.
Rely on the assembler to finalize the layout of the DWARF/Itanium
exception-handling LSDA. Rather than calculate the exact size of each
thing in the LSDA, use assembler directives:

    To emit the offset to the TTBase label:

.uleb128 .Lttbase0-.Lttbaseref0
.Lttbaseref0:

    To emit the size of the call site table:

.uleb128 .Lcst_end0-.Lcst_begin0
.Lcst_begin0:
... call site table entries ...
.Lcst_end0:

    To align the type info table:

... action table ...
.balign 4
.long _ZTIi
.long _ZTIl
.Lttbase0:

Using assembler directives simplifies the compiler and allows switching
the encoding of offsets in the call site table from udata4 to uleb128 for
a large code size savings. (This commit does not change the encoding.)

The combination of the uleb128 followed by a balign creates an unfortunate
dependency cycle that the assembler must sometimes resolve either by
padding an LEB or by inserting zero padding before the type table. See
PR35809 or GNU as bug 4029.

Patch by Ryan Prichard!

llvm-svn: 324749
2018-02-09 17:00:25 +00:00
Francis Visoiu Mistrih 39ec2e95ae [CodeGen] Unify the syntax of MBB successors in MIR and -debug output
Instead of:

Successors according to CFG: %bb.6(0x12492492 / 0x80000000 = 14.29%)

print:

successors: %bb.6(0x12492492); %bb.6(14.29%)
llvm-svn: 324685
2018-02-09 00:10:31 +00:00
Francis Visoiu Mistrih da89d1812a [CodeGen] Print MachineBasicBlock labels using MIR syntax in -debug output
Instead of:

%bb.1: derived from LLVM BB %for.body

print:

bb.1.for.body:

Also use MIR syntax for MBB attributes like "align", "landing-pad", etc.

llvm-svn: 324563
2018-02-08 05:02:00 +00:00
Sjoerd Meijer 8c0739347c [ARM] FP16 mov imm pattern
This is a follow up of r324321, adding a match pattern for mov with a FP16
immediate (also fixing operand vfp_f16imm that wasn't even compiling).

Differential Revision: https://reviews.llvm.org/D42973

llvm-svn: 324456
2018-02-07 08:37:17 +00:00
Eli Friedman cd07a3e2f9 Place undefined globals in .bss instead of .data
Following up on the discussion from
http://lists.llvm.org/pipermail/llvm-dev/2017-April/112305.html, undef
values are now placed in the .bss as well as null values. This prevents
undef global values taking up potentially huge amounts of space in the
.data section.

The following two lines now both generate equivalent .bss data:

@vals1 = internal unnamed_addr global [20000000 x i32] zeroinitializer, align 4
@vals2 = internal unnamed_addr global [20000000 x i32] undef, align 4 ; previously unaccounted for

This is primarily motivated by the corresponding issue in the Rust
compiler (https://github.com/rust-lang/rust/issues/41315).

Differential Revision: https://reviews.llvm.org/D41705

Patch by varkor!

llvm-svn: 324424
2018-02-06 23:22:14 +00:00
Eli Friedman 98f8bba283 [LivePhysRegs] Fix handling of return instructions.
See D42509 for the original version of this.

Basically, there are two significant changes to behavior here:

- addLiveOuts always adds all pristine registers (even if a block has
no successors).
- addLiveOuts and addLiveOutsNoPristines always add all callee-saved
registers for return blocks (including conditional return blocks).

I cleaned up the functions a bit to make it clear these properties hold.

Differential Revision: https://reviews.llvm.org/D42655

llvm-svn: 324422
2018-02-06 23:00:17 +00:00