Commit Graph

308 Commits

Author SHA1 Message Date
Justin Bogner d45849f703 Canonicalize a large number of mir tests using update_mir_test_checks
This converts a large and somewhat arbitrary set of tests to use
update_mir_test_checks. I ran the script on all of the tests I expect
to need to modify for an upcoming mir syntax change and kept the ones
that obviously didn't change the tests in ways that might make it
harder to understand.

llvm-svn: 316137
2017-10-18 23:18:12 +00:00
Mikael Holmen a079ef68e3 [RegisterCoalescer] Don't set read-undef in pruneValues, only clear
Summary:
The comments in the code said

 // Remove <def,read-undef> flags. This def is now a partial redef.

but the code didn't just remove read-undef, it could introduce new ones which
could cause errors.

E.g. if we have something like

%vreg1<def> = IMPLICIT_DEF
%vreg2:subreg1<def, read-undef> = op %vreg3, %vreg4
%vreg2:subreg2<def> = op %vreg6, %vreg7

and we merge %vreg1 and %vreg2 then we should not set undef on the second subreg
def, which the old code did.

Now we solve this by actually do what the code comment says. We remove
read-undef flags rather than remove or introduce them.

Reviewers: qcolombet, MatzeB

Reviewed By: MatzeB

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D38616

llvm-svn: 315564
2017-10-12 06:21:28 +00:00
Mikael Holmen a1a3f5c5e6 Recommit [UnreachableBlockElim] Use COPY if PHI input is undef
This time invoking llc with "-march=x86-64" in the testcase, so we don't assume
the default target is x86.

Summary:
If we have

    %vreg0<def> = PHI %vreg2<undef>, <BB#0>, %vreg3, <BB#2>; GR32:%vreg0,%vreg2,%vreg3
    %vreg3<def,tied1> = ADD32ri8 %vreg0<kill,tied0>, 1, %EFLAGS<imp-def>; GR32:%vreg3,%vreg0

then we can't just change %vreg0 into %vreg3, since %vreg2 is actually
undef. We would have to also copy the undef flag to be able to change the
register.

Instead we deal with this case like other cases where we can't just
replace the register: we insert a COPY. The code creating the COPY already
copied all flags from the PHI input, so the undef flag will be transferred
as it should.

Reviewers: kparzysz

Reviewed By: kparzysz

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D38235

llvm-svn: 314882
2017-10-04 07:42:45 +00:00
Mikael Holmen 75b1992f78 Revert r314879 "[UnreachableBlockElim] Use COPY if PHI input is undef"
Build-bots broke on the new testcase. I'll investigate and fix.

llvm-svn: 314880
2017-10-04 06:39:22 +00:00
Mikael Holmen 65eb2f394c [UnreachableBlockElim] Use COPY if PHI input is undef
Summary:
If we have

    %vreg0<def> = PHI %vreg2<undef>, <BB#0>, %vreg3, <BB#2>; GR32:%vreg0,%vreg2,%vreg3
    %vreg3<def,tied1> = ADD32ri8 %vreg0<kill,tied0>, 1, %EFLAGS<imp-def>; GR32:%vreg3,%vreg0

then we can't just change %vreg0 into %vreg3, since %vreg2 is actually
undef. We would have to also copy the undef flag to be able to change the
register.

Instead we deal with this case like other cases where we can't just
replace the register: we insert a COPY. The code creating the COPY already
copied all flags from the PHI input, so the undef flag will be transferred
as it should.

Reviewers: kparzysz

Reviewed By: kparzysz

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D38235

llvm-svn: 314879
2017-10-04 06:06:31 +00:00
Matthias Braun 5c3e8a450e MIR: Serialize CaleeSavedInfo Restored flag
llvm-svn: 314449
2017-09-28 18:52:14 +00:00
Mikael Holmen 06064d1bac [IfConversion] Add testcases [NFC]
These tests should have been included in r310697 / D34099 but apparently
I missed them.

llvm-svn: 313737
2017-09-20 08:23:29 +00:00
Quentin Colombet d652aeb144 [MIRPrinter] Print empty successor lists when they cannot be guessed
This re-applies commit r313685, this time with the proper updates to
the test cases.

Original commit message:
Unreachable blocks in the machine instr representation are these
weird empty blocks with no successors.
The MIR printer used to not print empty lists of successors. However,
the MIR parser now treats non-printed list of successors as "please
guess it for me". As a result, the parser tries to guess the list of
successors and given the block is empty, just assumes it falls through
the next block (if any).

For instance, the following test case used to fail the verifier.
The MIR printer would print

         entry
        /      \
   true (def)   false (no list of successors)
       |
 split.true (use)

The MIR parser would understand this:

         entry
        /      \
   true (def)   false
       |        /  <-- invalid edge
 split.true (use)

Because of the invalid edge, we get the "def does not
dominate all uses" error.

The fix consists in printing empty successor lists, so that the parser
knows what to do for unreachable blocks.

rdar://problem/34022159

llvm-svn: 313696
2017-09-19 23:34:12 +00:00
Quentin Colombet 6888dbcda7 Revert "[MIRPrinter] Print empty successor lists when they cannot be guessed"
This reverts commit r313685.

I thought I had ran ninja check, but apparently I didn't...
Need to update a bunch of mir tests.

llvm-svn: 313686
2017-09-19 22:03:50 +00:00
Quentin Colombet 7fdaa5e641 [MIRPrinter] Print empty successor lists when they cannot be guessed
Unreachable blocks in the machine instr representation are these
weird empty blocks with no successors.
The MIR printer used to not print empty lists of successors. However,
the MIR parser now treats non-printed list of successors as "please
guess it for me". As a result, the parser tries to guess the list of
successors and given the block is empty, just assumes it falls through
the next block (if any).

For instance, the following test case used to fail the verifier.
The MIR printer would print
          entry
         /      \
    true (def)   false (no list of successors)
        |
  split.true (use)

The MIR parser would understand this:
          entry
         /      \
    true (def)   false
        |        /  <-- invalid edge
  split.true (use)

Because of the invalid edge, we get the "def does not
dominate all uses" error.

The fix consists in printing empty successor lists, so that the parser
knows what to do for unreachable blocks.

rdar://problem/34022159

llvm-svn: 313685
2017-09-19 21:55:51 +00:00
Konstantin Zhuravlyov 5f5b586c99 AMDGPU: Handle non-temporal loads and stores
Differential Revision: https://reviews.llvm.org/D36862

llvm-svn: 312729
2017-09-07 17:14:54 +00:00
Konstantin Zhuravlyov c8c9d4a0a6 AMDGPU: Handle more than one memory operand in SIMemoryLegalizer
Differential Revision: https://reviews.llvm.org/D37397

llvm-svn: 312725
2017-09-07 16:14:21 +00:00
Reid Kleckner 08f5fd51cc [codeview] Generalize DIExpression parsing to handle load chains
Summary:
Hopefully this also clarifies exactly when and why we're rewriting
certiain S_LOCALs using reference types: We're using the reference type
to stand in for a zero-offset load.

Reviewers: inglorion

Subscribers: llvm-commits, hiraditya

Differential Revision: https://reviews.llvm.org/D37309

llvm-svn: 312247
2017-08-31 15:56:49 +00:00
Bob Haarman 223303c5a3 Reland r311957 [codeview] support more DW_OPs for more complete debug info
Summary:
Some variables show up in Visual Studio as "optimized out" even in -O0
-Od builds. This change fixes two issues that would cause this to
happen. The first issue is that not all DIExpressions we generate were
recognized by the CodeView writer. This has been addressed by adding
support for DW_OP_constu, DW_OP_minus, and DW_OP_plus. The second
issue is that we had no way to encode DW_OP_deref in CodeView. We get
around that by changinge the type we encode in the debug info to be
a reference to the type in the source code.

This fixes PR34261.

The reland adds two extra checks to the original: It checks if the
DbgVariableLocation is valid before checking any of its fields, and
it only emits ranges with nonzero registers.

Reviewers: aprantl, rnk, zturner

Reviewed By: rnk

Subscribers: mgorny, llvm-commits, aprantl, hiraditya

Differential Revision: https://reviews.llvm.org/D36907

llvm-svn: 312034
2017-08-29 20:59:25 +00:00
Bob Haarman 0bf0d66682 Revert "[codeview] support more DW_OPs for more complete debug info"
This reverts commit e160912f53f047bc97e572add179e08e33f4df48.

llvm-svn: 311977
2017-08-29 04:08:31 +00:00
Bob Haarman b2a04a1513 [codeview] support more DW_OPs for more complete debug info
Summary:
Some variables show up in Visual Studio as "optimized out" even in -O0
-Od builds. This change fixes two issues that would cause this to
happen. The first issue is that not all DIExpressions we generate were
recognized by the CodeView writer. This has been addressed by adding
support for DW_OP_constu, DW_OP_minus, and DW_OP_plus. The second
issue is that we had no way to encode DW_OP_deref in CodeView. We get
around that by changinge the type we encode in the debug info to be
a reference to the type in the source code.

This fixes PR34261.

Reviewers: aprantl, rnk, zturner

Reviewed By: rnk

Subscribers: mgorny, llvm-commits, aprantl, hiraditya

Differential Revision: https://reviews.llvm.org/D36907

llvm-svn: 311957
2017-08-29 00:06:59 +00:00
Reid Kleckner 6d353348e5 Parse and print DIExpressions inline to ease IR and MIR testing
Summary:
Most DIExpressions are empty or very simple. When they are complex, they
tend to be unique, so checking them inline is reasonable.

This also avoids the need for CodeGen passes to append to the
llvm.dbg.mir named md node.

See also PR22780, for making DIExpression not be an MDNode.

Reviewers: aprantl, dexonsmith, dblaikie

Subscribers: qcolombet, javed.absar, eraman, hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D37075

llvm-svn: 311594
2017-08-23 20:31:27 +00:00
Diana Picus d5a00b0ff6 [MIR] Print target-specific constant pools
This should enable us to test the generation of target-specific constant
pools, e.g. for ARM:

constants:
 - id:              0
   value:           'g(GOT_PREL)-(LPC0+8-.)'
   alignment:       4
   isTargetSpecific: true

I intend to use this to test PIC support in GlobalISel for ARM.

This is difficult to test outside of that context, since the existing
MIR tests usually rely on parser support as well, and that seems a bit
trickier to add. We could try to add a unit test, but the setup for that
seems rather convoluted and overkill.

We do test however that the parser reports a nice error when
encountering a target-specific constant pool.

Differential Revision: https://reviews.llvm.org/D36092

llvm-svn: 309806
2017-08-02 11:09:30 +00:00
Konstantin Zhuravlyov e9a5a77ee3 AMDGPU: Implement memory model
llvm-svn: 308781
2017-07-21 21:19:23 +00:00
Matt Arsenault db78273b6e Add an ID field to StackObjects
On AMDGPU SGPR spills are really spilled to another register.
The spiller creates the spills to new frame index objects,
which is used as a placeholder.

This will eventually be replaced with a reference to a position
in a VGPR to write to and the frame index deleted. It is
most likely not a real stack location that can be shared
with another stack object.

This is a problem when StackSlotColoring decides it should
combine a frame index used for a normal VGPR spill with
a real stack location and a frame index used for an SGPR.

Add an ID field so that StackSlotColoring has a way
of knowing the different frame index types are
incompatible.

llvm-svn: 308673
2017-07-20 21:03:45 +00:00
Nicolai Haehnle a253e4c028 AMDGPU: Fix crash when folding immediates into multiple uses
Summary:
When an immediate is folded by constant folding, we re-scan the entire
use list for two reasons:

1. The constant folding may have created a new use of the same reg.
2. The constant folding may have removed an additional use in the list
   we're currently traversing (e.g., constant folding an S_ADD_I32 c, c).

However, this could previously lead to a crash when an unrelated use was
added twice into the FoldList. Since we re-scan the whole list anyway, we
might as well just clear the FoldList again before we do so.

Using a MIR test to show this because real code seems to trigger the issue
only in connection with some really subtle control flow structures.

Fixes GL45-CTS.shading_language_420pack.binding_images on gfx9.

Reviewers: arsenm

Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, llvm-commits, t-tye

Differential Revision: https://reviews.llvm.org/D35416

llvm-svn: 308314
2017-07-18 14:54:41 +00:00
Geoff Berry b1e8714af9 [AArch64][Falkor] Avoid HW prefetcher tag collisions (step 1)
Summary:
This patch is the first step in reducing HW prefetcher instruction tag
collisions in inner loops for Falkor.  It adds a pass that annotates IR
loads with metadata to indicate that they are known to be strided loads,
and adds a target lowering hook that translates this metadata to a
target-specific MachineMemOperand flag.

A follow on change will use this MachineMemOperand flag to re-write
instructions to reduce tag collisions.

Reviewers: mcrosier, t.p.northover

Subscribers: aemerson, rengolin, mgorny, javed.absar, kristof.beyls, llvm-commits

Differential Revision: https://reviews.llvm.org/D34963

llvm-svn: 308059
2017-07-14 21:44:12 +00:00
Geoff Berry 6748abe24d [MIR] Add support for printing and parsing target MMO flags
Summary: Add target hooks for printing and parsing target MMO flags.
Targets may override getSerializableMachineMemOperandTargetFlags() to
return a mapping from string to flag value for target MMO values that
should be serialized/parsed in MIR output.

Add implementation of this hook for AArch64 SuppressPair MMO flag.

Reviewers: bogner, hfinkel, qcolombet, MatzeB

Subscribers: mcrosier, javed.absar, llvm-commits

Differential Revision: https://reviews.llvm.org/D34962

llvm-svn: 307877
2017-07-13 02:28:54 +00:00
Konstantin Zhuravlyov bb80d3e1d3 Enhance synchscope representation
OpenCL 2.0 introduces the notion of memory scopes in atomic operations to
  global and local memory. These scopes restrict how synchronization is
  achieved, which can result in improved performance.

  This change extends existing notion of synchronization scopes in LLVM to
  support arbitrary scopes expressed as target-specific strings, in addition to
  the already defined scopes (single thread, system).

  The LLVM IR and MIR syntax for expressing synchronization scopes has changed
  to use *syncscope("<scope>")*, where <scope> can be "singlethread" (this
  replaces *singlethread* keyword), or a target-specific name. As before, if
  the scope is not specified, it defaults to CrossThread/System scope.

  Implementation details:
    - Mapping from synchronization scope name/string to synchronization scope id
      is stored in LLVM context;
    - CrossThread/System and SingleThread scopes are pre-defined to efficiently
      check for known scopes without comparing strings;
    - Synchronization scope names are stored in SYNC_SCOPE_NAMES_BLOCK in
      the bitcode.

Differential Revision: https://reviews.llvm.org/D21723

llvm-svn: 307722
2017-07-11 22:23:00 +00:00
Matt Arsenault 6c29c5acfe AMDGPU: Allow SIShrinkInstructions to work in non-SSA
Immediates can be folded as long as the immediate is a vreg.

Also undo commuting instructions if it didn't fold an immediate.

llvm-svn: 307575
2017-07-10 19:53:57 +00:00
Krzysztof Parzyszek 0ac065f318 [Hexagon] Handle Hexagon-specific machine operand target flags in MIR
llvm-svn: 307564
2017-07-10 18:31:02 +00:00
Quentin Colombet 81551148b7 [RegAllocFast] Add the proper initialize method to use the .mir infrastructure
NFC

llvm-svn: 307427
2017-07-07 19:25:42 +00:00
Mikael Holmen 9c3e2eac6a [MachineVerifier] Add check that tied physregs aren't different.
Summary: Added MachineVerifier code to check register ties more thoroughly, especially so that physical registers that are tied are the same. This may help e.g. when creating MIR files.

Original patch by Jesper Antonsson

Reviewers: stoklund, sanjoy, qcolombet

Reviewed By: qcolombet

Subscribers: qcolombet, llvm-commits

Differential Revision: https://reviews.llvm.org/D34394

llvm-svn: 307259
2017-07-06 13:18:21 +00:00
Matt Arsenault 3f031e75aa AMDGPU: Add operand target flags serialization
llvm-svn: 306995
2017-07-02 23:21:48 +00:00
Taewook Oh 0e35ea3b7c Remove redundant copy in recurrences
Summary:
If there is a chain of instructions formulating a recurrence, commuting operands can help removing a redundant copy. In the following example code,

```
BB#1: ; Loop Header
  %vreg0<def> = COPY %vreg13<kill>; GR32:%vreg0,%vreg13
  ...

BB#6: ; Loop Latch
  %vreg2<def> = COPY %vreg15<kill>; GR32:%vreg2,%vreg15
  %vreg10<def,tied1> = ADD32rr %vreg1<kill,tied0>, %vreg0<kill>, %EFLAGS<imp-def,dead>; GR32:%vreg10,%vreg1,%vreg0
  %vreg3<def,tied1> = ADD32rr %vreg2<kill,tied0>, %vreg10<kill>, %EFLAGS<imp-def,dead>; GR32:%vreg3,%vreg2,%vreg10
  CMP32ri8 %vreg3, 10, %EFLAGS<imp-def>; GR32:%vreg3
  %vreg13<def> = COPY %vreg3<kill>; GR32:%vreg13,%vreg3
  JL_1 <BB#1>, %EFLAGS<imp-use,kill>
```

Existing two-address generation pass generates following code:

```
BB#1:
  %vreg0<def> = COPY %vreg13<kill>; GR32:%vreg0,%vreg13
  ...

BB#6:
    Predecessors according to CFG: BB#5 BB#4
  %vreg2<def> = COPY %vreg15<kill>; GR32:%vreg2,%vreg15
  %vreg10<def> = COPY %vreg1<kill>; GR32:%vreg10,%vreg1
  %vreg10<def,tied1> = ADD32rr %vreg10<tied0>, %vreg0<kill>, %EFLAGS<imp-def,dead>; GR32:%vreg10,%vreg0
  %vreg3<def> = COPY %vreg10<kill>; GR32:%vreg3,%vreg10
  %vreg3<def,tied1> = ADD32rr %vreg3<tied0>, %vreg2<kill>, %EFLAGS<imp-def,dead>; GR32:%vreg3,%vreg2
  CMP32ri8 %vreg3, 10, %EFLAGS<imp-def>; GR32:%vreg3
  %vreg13<def> = COPY %vreg3<kill>; GR32:%vreg13,%vreg3
  JL_1 <BB#1>, %EFLAGS<imp-use,kill>
  JMP_1 <BB#7>
```

This is suboptimal because the assembly code generated has a redundant copy at the end of #BB6 to feed %vreg13 to BB#1:

```
.LBB0_6:
  addl  %esi, %edi
  addl  %ebx, %edi
  cmpl  $10, %edi
  movl  %edi, %esi
  jl  .LBB0_1
```

This redundant copy can be elimiated by making instructions in the recurrence chain to compute the value "into" the register that actually holds the feedback value. In this example, this can be achieved by commuting %vreg0 and %vreg1 to compute %vreg10. With that change, code after two-address generation becomes

```
BB#1:
  %vreg0<def> = COPY %vreg13<kill>; GR32:%vreg0,%vreg13
  ...

BB#6: derived from LLVM BB %bb7
    Predecessors according to CFG: BB#5 BB#4
  %vreg2<def> = COPY %vreg15<kill>; GR32:%vreg2,%vreg15
  %vreg10<def> = COPY %vreg0<kill>; GR32:%vreg10,%vreg0
  %vreg10<def,tied1> = ADD32rr %vreg10<tied0>, %vreg1<kill>, %EFLAGS<imp-def,dead>; GR32:%vreg10,%vreg1
  %vreg3<def> = COPY %vreg10<kill>; GR32:%vreg3,%vreg10
  %vreg3<def,tied1> = ADD32rr %vreg3<tied0>, %vreg2<kill>, %EFLAGS<imp-def,dead>; GR32:%vreg3,%vreg2
  CMP32ri8 %vreg3, 10, %EFLAGS<imp-def>; GR32:%vreg3
  %vreg13<def> = COPY %vreg3<kill>; GR32:%vreg13,%vreg3
  JL_1 <BB#1>, %EFLAGS<imp-use,kill>
  JMP_1 <BB#7>
```

and the final assembly does not have redundant copy:

```
.LBB0_6:
  addl  %edi, %eax
  addl  %ebx, %eax
  cmpl  $10, %eax
  jl  .LBB0_1
```

Reviewers: qcolombet, MatzeB, wmi

Reviewed By: wmi

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D31821

llvm-svn: 306758
2017-06-29 23:11:24 +00:00
Matthias Braun 7e23fc05c1 llc: Add ability to parse mir from stdin
- Add -x <language> option to switch between IR and MIR inputs.
- Change MIR parser to read from stdin when filename is '-'.
- Add a simple mir roundtrip test.

llvm-svn: 304825
2017-06-06 20:06:57 +00:00
Matthias Braun 8b5f9e4438 MIRPrinter: Avoid assert() when printing empty INLINEASM strings.
CodeGen uses MO_ExternalSymbol to represent the inline assembly strings.
Empty strings for symbol names appear to be invalid. For now just
special case the output code to avoid hitting an `assert()` in
`printLLVMNameWithoutPrefix()`.

This fixes https://llvm.org/PR33317

llvm-svn: 304815
2017-06-06 19:00:58 +00:00
Vivek Pandya 56d87ef5d7 [Improve CodeGen Testing] This patch renables MIRPrinter print fields which have value equal to its default.
If -simplify-mir option is passed then MIRPrinter will not print such fields.
This change also required some lit test cases in CodeGen directory to be changed.

Reviewed By: MatzeB

Differential Revision: https://reviews.llvm.org/D32304

llvm-svn: 304779
2017-06-06 08:16:19 +00:00
Matthias Braun 7bda195812 CodeGen: Refactor MIR parsing
When parsing .mir files immediately construct the MachineFunctions and
put them into MachineModuleInfo.

This allows us to get rid of the delayed construction (and delayed error
reporting) through the MachineFunctionInitialzier interface.

Differential Revision: https://reviews.llvm.org/D33809

llvm-svn: 304758
2017-06-06 00:44:35 +00:00
Quentin Colombet ebbaed6d3c [RABasic] Properly initialize the pass
Use the initializeXXX method to initialize the RABasic pass in the
pipeline. This enables us to take advantage of the .mir infrastructure.

llvm-svn: 304602
2017-06-02 22:46:26 +00:00
Matthias Braun d6a36ae282 TargetMachine: Indicate whether machine verifier passes.
This adds a callback to the LLVMTargetMachine that lets target indicate
that they do not pass the machine verifier checks in all cases yet.

This is intended to be a temporary measure while the targets are fixed
allowing us to enable the machine verifier by default with
EXPENSIVE_CHECKS enabled!

Differential Revision: https://reviews.llvm.org/D33696

llvm-svn: 304320
2017-05-31 18:41:23 +00:00
Tobias Grosser 8cf785f6b1 Revert "[IfConversion] Keep the CFG updated incrementally in IfConvertTriangle"
The reverted change introdued assertions ala:

"MachineBasicBlock::succ_iterator
llvm::MachineBasicBlock::removeSuccessor(succ_iterator, bool): Assertion
`I != Successors.end() && "Not a current successor!"'

Mikael, the original committer, wrote me that he is working on a fix, but that
it likely will take some time to get this resolved. As this bug is one of the
last two issues that keep the AOSP buildbot from turning green, I revert the
original commit r302876.

I am looking forward to see this recommitted after the assertion has been
resolved.

llvm-svn: 304128
2017-05-29 06:12:18 +00:00
Krzysztof Parzyszek 6a0005d1b4 Move machine-cse-physreg.mir to test/CodeGen/Thumb
llvm-svn: 303778
2017-05-24 17:20:47 +00:00
Mikael Holmen 2676f8269a MachineCSE: Respect interblock physreg liveness
Summary:
This is a fix for PR32538. MachineCSE first looks at MO.isDead(), but
if it is not marked dead, MachineCSE still wants to do its own check
to see if it is trivially dead. This check for the trivial case
assumed that physical registers cannot be live out of a block.

Patch by Mattias Eriksson.

Reviewers: qcolombet, jbhateja

Reviewed By: qcolombet, jbhateja

Subscribers: jbhateja, llvm-commits

Differential Revision: https://reviews.llvm.org/D33408

llvm-svn: 303731
2017-05-24 09:35:23 +00:00
Mikael Holmen ce3ec4519b [IfConversion] Keep the CFG updated incrementally in IfConvertTriangle
Summary:
Instead of using RemoveExtraEdges (which uses analyzeBranch, which cannot
always be trusted) at the end to fixup the CFG we keep the CFG updated as
we go along and remove or add branches and merge blocks.

This way we won't have any problems if the involved MBBs contain
unanalyzable instructions.

This fixes PR32721.

In that case we had a triangle

   EBB
   | \
   |  |
   | TBB
   |  /
   FBB

where FBB didn't have any successors at all since it ended with an
unconditional return. Then TBB and FBB were be merged into EBB, but EBB
would still keep its successors, and the use of analyzeBranch and
CorrectExtraCFGEdges wouldn't help to remove them since the return
instruction is not analyzable (at least not on ARM).

Reviewers: kparzysz, iteratee, MatzeB

Reviewed By: iteratee

Subscribers: aemerson, rengolin, javed.absar, llvm-commits

Differential Revision: https://reviews.llvm.org/D33037

llvm-svn: 302876
2017-05-12 06:28:58 +00:00
Mikael Holmen 21c867c26e [IfConversion] Add missing check in IfConversion/canFallThroughTo
Summary:
When trying to figure out if MBB could fallthrough to ToMBB (possibly by
falling through a bunch of other MBBs) we didn't actually check if there
was fallthrough between the last two blocks in the chain.

Reviewers: kparzysz, iteratee, MatzeB

Reviewed By: kparzysz, iteratee

Subscribers: javed.absar, llvm-commits

Differential Revision: https://reviews.llvm.org/D32996

llvm-svn: 302650
2017-05-10 13:06:13 +00:00
Serge Pavlov d526b13e61 Add extra operand to CALLSEQ_START to keep frame part set up previously
Using arguments with attribute inalloca creates problems for verification
of machine representation. This attribute instructs the backend that the
argument is prepared in stack prior to  CALLSEQ_START..CALLSEQ_END
sequence (see http://llvm.org/docs/InAlloca.htm for details). Frame size
stored in CALLSEQ_START in this case does not count the size of this
argument. However CALLSEQ_END still keeps total frame size, as caller can
be responsible for cleanup of entire frame. So CALLSEQ_START and
CALLSEQ_END keep different frame size and the difference is treated by
MachineVerifier as stack error. Currently there is no way to distinguish
this case from actual errors.

This patch adds additional argument to CALLSEQ_START and its
target-specific counterparts to keep size of stack that is set up prior to
the call frame sequence. This argument allows MachineVerifier to calculate
actual frame size associated with frame setup instruction and correctly
process the case of inalloca arguments.

The changes made by the patch are:
- Frame setup instructions get the second mandatory argument. It
  affects all targets that use frame pseudo instructions and touched many
  files although the changes are uniform.
- Access to frame properties are implemented using special instructions
  rather than calls getOperand(N).getImm(). For X86 and ARM such
  replacement was made previously.
- Changes that reflect appearance of additional argument of frame setup
  instruction. These involve proper instruction initialization and
  methods that access instruction arguments.
- MachineVerifier retrieves frame size using method, which reports sum of
  frame parts initialized inside frame instruction pair and outside it.

The patch implements approach proposed by Quentin Colombet in
https://bugs.llvm.org/show_bug.cgi?id=27481#c1.
It fixes 9 tests failed with machine verifier enabled and listed
in PR27481.

Differential Revision: https://reviews.llvm.org/D32394

llvm-svn: 302527
2017-05-09 13:35:13 +00:00
Matthias Braun c1c5691686 Add missing target triple to test
llvm-svn: 302301
2017-05-05 21:50:26 +00:00
Matthias Braun 8940114f61 MIParser/MIRPrinter: Compute block successors if not explicitely specified
- MIParser: If the successor list is not specified successors will be
  added based on basic block operands in the block and possible
  fallthrough.

- MIRPrinter: Adds a new `simplify-mir` option, with that option set:
  Skip printing of block successor lists in cases where the
  parser is guaranteed to reconstruct it. This means we still print the
  list if some successor cannot be determined (happens for example for
  jump tables), if the successor order changes or branch probabilities
  being unequal.

Differential Revision: https://reviews.llvm.org/D31262

llvm-svn: 302289
2017-05-05 21:09:30 +00:00
Matthias Braun ab9438cb03 MachineFrameInfo: Track whether MaxCallFrameSize is computed yet; NFC
This tracks whether MaxCallFrameSize is computed yet. Ideally we would
assert and fail when the value is queried before it is computed, however
this fails various targets that need to be fixed first.

Differential Revision: https://reviews.llvm.org/D32570

llvm-svn: 301851
2017-05-01 22:32:25 +00:00
Justin Bogner 20dd36a48a MIR: Allow parsing of empty machine functions
If you run llc -stop-after=codegenprepare and feed the resulting MIR
to llc -start-after=codegenprepare, you'll have an empty machine
function since we haven't run any isel yet. Of course, this only works
if the MIRParser believes you that this is okay.

This is essentially a revert of r241862 with a fix for the problem it
was papering over.

llvm-svn: 299975
2017-04-11 19:32:41 +00:00
Matt Arsenault 754dd3eaef AMDGPU: Remove legacy bfe intrinsics
llvm-svn: 299372
2017-04-03 18:08:08 +00:00
Matt Arsenault 3dbeefa978 AMDGPU: Mark all unspecified CC functions in tests as amdgpu_kernel
Currently the default C calling convention functions are treated
the same as compute kernels. Make this explicit so the default
calling convention can be changed to a non-kernel.

Converted with perl -pi -e 's/define void/define amdgpu_kernel void/'
on the relevant test directories (and undoing in one place that actually
wanted a non-kernel).

llvm-svn: 298444
2017-03-21 21:39:51 +00:00
Oren Ben Simhon 75537b6566 [MIR] Test assumes x64 windows calling convention upon printing/parsing MIR output/input.
llvm-svn: 298212
2017-03-19 13:23:20 +00:00
Benjamin Kramer 6520f83ba4 [MIR] Add triple to test that assumes it runs on windows.
llvm-svn: 298211
2017-03-19 13:04:35 +00:00