Commit Graph

49568 Commits

Author SHA1 Message Date
Paul Robinson 8bd9d6ad83 Fix out-of-order stepping behavior in programs with sunk instructions.
MachineSink attempts to place instructions near the basic blocks where
they are needed.  Once an instruction has been sunk, its location
relative to other instructions no longer is consistent with the
original source code. In order to ensure correct stepping in the
debugger, the debug location for sunk instructions is either merged
with the insertion point or erased if the target successor block is
empty.

Originally submitted as r318679, revised to fix sanitizer failure and
improve testing.

Patch by Matthew Voss!

Differential Revision: https://reviews.llvm.org/D39933

llvm-svn: 320216
2017-12-09 00:17:01 +00:00
Adrian Prantl 01fb31cc89 dwarfdump: Add support for the --diff option.
--diff      Emit the output in a diff-friendly way by omitting offsets and
            addresses.

<rdar://problem/34502625>

llvm-svn: 320214
2017-12-08 23:32:47 +00:00
Duncan P. N. Exon Smith 9b8caf5bd7 Revert part of "Cleanup some GraphTraits iteration code"
This reverts part of r300656, which caused a regression in
propagateMassToSuccessors by counting edges n^2 times, where n is the
number of edges from the source basic block to the same successor basic
block. The result was both incorrect and very slow to compute for large
values of n (e.g. switches with multiple cases that go to the same basic
block).

Patch by Andrew Scheidecker!

llvm-svn: 320208
2017-12-08 22:42:43 +00:00
Vedant Kumar 195dfd10a6 [Debugify] Add a pass to test debug info preservation
The Debugify pass synthesizes debug info for IR. It's paired with a
CheckDebugify pass which determines how much of the original debug info
is preserved. These passes make it easier to create targeted tests for
debug info preservation.

Here is the Debugify algorithm:

  NextLine = 1
  for (Instruction &I : M)
    attach DebugLoc(NextLine++) to I

  NextVar = 1
  for (Instruction &I : M)
    if (canAttachDebugValue(I))
      attach dbg.value(NextVar++) to I

The CheckDebugify pass expects contiguous ranges of DILocations and
DILocalVariables. If it fails to find all of the expected debug info, it
prints a specific error to stderr which can be FileChecked.

This was discussed on llvm-dev in the thread:
"Passes to add/validate synthetic debug info"

Differential Revision: https://reviews.llvm.org/D40512

llvm-svn: 320202
2017-12-08 21:57:28 +00:00
Florian Hahn e5089e2e94 [CodeExtractor] Add debug locations for new call and branch instrs.
Summary:
If a partially inlined function has debug info, we have to add debug
locations to the call instruction calling the outlined function.
We use the debug location of the first instruction in the outlined
function, as the introduced call transfers control to this statement and
there is no other equivalent line in the source code.

We also use the same debug location for the branch instruction added
to jump from artificial entry block for the outlined function, which just
jumps to the first actual basic block of the outlined function.

Reviewers: davide, aprantl, rriddle, dblaikie, danielcdh, wmi

Reviewed By: aprantl, rriddle, danielcdh

Subscribers: eraman, JDevlieghere, llvm-commits

Differential Revision: https://reviews.llvm.org/D40413

llvm-svn: 320199
2017-12-08 21:49:03 +00:00
Dan Gohman 3a762bf9df [WebAssembly] Reapply r319186: "Support bitcasted function addresses with varargs."
This puts the functionality under control of a command-line option which is
off by default to avoid breaking existing setups.

llvm-svn: 320197
2017-12-08 21:27:00 +00:00
Dan Gohman 6736f59078 [WebAssemby] Re-apply r320041: "Support main functions with alternate signatures."
This includes a fix so that it doesn't transform declarations, and it
puts the functionality under control of a command-line option which is off
by default to avoid breaking existing setups.

llvm-svn: 320196
2017-12-08 21:18:21 +00:00
Konstantin Zhuravlyov c40d9f2e5d AMDGPU/GCN: Bring processors in sync with AMDGPUUsage
- Add gfx704
    - Change bonaire to gfx704
  - Remove gfx804
  - Remove gfx901
  - Remove gfx903

Differential Revision: https://reviews.llvm.org/D40046

llvm-svn: 320194
2017-12-08 20:52:28 +00:00
Craig Topper 7f0d456ef8 [X86] Teach lowering to only let through (insert_subvector (vXi1 zeros), subvec, 0) for vector sizes that have native KSHIFT support.
For narrow sizes we'll widen the zero vector and widen the insert. Then do an extract_subvector to get back down to correct size.

This allows us to remove some patterns from the isel table that had to COPY_TO_REGCLASS to an oversized register, do the shift and then COPY_TO_REGCLASS back to the narrow register. Now this is represented explicitly in the DAG.

This seems to have perturbed the register allocation in one of the tests, but the number of instructions didn't change.

llvm-svn: 320190
2017-12-08 20:10:33 +00:00
Simon Pilgrim 6415f56c79 [X86][X87] Tag x87 float compare instructions scheduler classes
llvm-svn: 320189
2017-12-08 20:10:31 +00:00
Matt Arsenault 856777d8c9 AMDGPU: image_getlod and image_getresinfo do not read memory
llvm-svn: 320187
2017-12-08 20:00:57 +00:00
Xinliang David Li d91057bf52 Revert r320104: infinite loop profiling bug fix
Causes unexpected memory issue with New PM this time.
The new PM invalidates BPI but not BFI, leaving the
reference to BPI from BFI invalid.

Abandon this patch.  There is a more general solution
which also handles runtime infinite loop (but not statically).

llvm-svn: 320180
2017-12-08 19:38:07 +00:00
Konstantin Zhuravlyov e30f88f3a9 AMDGPU: Report Arg's Value name in metadata if kernel_arg_name metadata is not available
Differential Revision: https://reviews.llvm.org/D40924

llvm-svn: 320176
2017-12-08 19:22:12 +00:00
Michael Trent ad840d2206 Reverting r320166 to fix test failures.
llvm-svn: 320174
2017-12-08 19:09:26 +00:00
Michael Trent de5209bdbd Updated llvm-objdump to display local relocations in Mach-O binaries
Summary:
llvm-objdump's Mach-O parser was updated in r306037 to display external
relocations for MH_KEXT_BUNDLE file types. This change extends the Macho-O
parser to display local relocations for MH_PRELOAD files. When used with
the -macho option relocations will be displayed in a historical format.

rdar://35778019

Reviewers: enderby

Reviewed By: enderby

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D40867

llvm-svn: 320166
2017-12-08 17:51:04 +00:00
Davide Italiano b5a62cc81a [DebugInfo] Use llc instead of llc_dwarf to fix this test.
We work around the fact that some platforms add a triple when
they expand llc_dwarf in lit.

llvm-svn: 320164
2017-12-08 17:15:50 +00:00
Simon Pilgrim 19d460b066 [X86][SHA] Tag SHA instructions scheduler classes
Put these under VecIMul itinerary classes for now - seems to be a good average value

llvm-svn: 320161
2017-12-08 16:38:41 +00:00
Alexey Bataev ec95c6cc0a [InstCombine] PR35354: Convert store(bitcast, load bitcast (select (Cond, &V1, &V2)) --> store (, load (select(Cond, load &V1, load &V2)))
Summary:
If we have the code like this:
```
float a, b;
a = std::max(a ,b);
```
it is converted into something like this:
```
%call = call dereferenceable(4) float* @_ZSt3maxIfERKT_S2_S2_(float* nonnull dereferenceable(4) %a.addr, float* nonnull dereferenceable(4) %b.addr)
%1 = bitcast float* %call to i32*
%2 = load i32, i32* %1, align 4
%3 = bitcast float* %a.addr to i32*
store i32 %2, i32* %3, align 4
```
After inlinning this code is converted to the next:
```
%1 = load float, float* %a.addr
%2 = load float, float* %b.addr
%cmp.i = fcmp fast olt float %1, %2
%__b.__a.i = select i1 %cmp.i, float* %a.addr, float* %b.addr
%3 = bitcast float* %__b.__a.i to i32*
%4 = load i32, i32* %3, align 4
%5 = bitcast float* %arrayidx to i32*
store i32 %4, i32* %5, align 4

```
This pattern is not recognized as minmax pattern.
Patch solves this problem by converting sequence
```
store (bitcast, (load bitcast (select ((cmp V1, V2), &V1, &V2))))
```
to a sequence
```
store (,load (select((cmp V1, V2), &V1, &V2)))
```
After this the code is recognized as minmax pattern.

Reviewers: RKSimon, spatel

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D40304

llvm-svn: 320157
2017-12-08 15:32:10 +00:00
Max Kazantsev 9c08b7a053 [SCEV] Fix predicate usage in computeExitLimitFromICmp
In this method, we invoke `SimplifyICmpOperands` which takes the `Cond` predicate
by reference and may change it along with `LHS` and `RHS` SCEVs. But then we invoke
`computeShiftCompareExitLimit` with Values from which the SCEVs have been derived,
these Values have not been modified while `Cond` could be.

One of possible outcomes of this is that we may falsely prove that an infinite loop ends
within some finite number of iterations.

In this patch, we save the original `Cond` and pass it along with original operands.
This logic may be removed in future once `computeShiftCompareExitLimit` works
with SCEVs instead of value operands.

Reviewed By: sanjoy
Differential Revision: https://reviews.llvm.org/D40953

llvm-svn: 320142
2017-12-08 12:19:45 +00:00
Gadi Haber 2cf601f28f [X86][Haswell]: Updating the scheduling information for the Haswell subtarget.
Updated the scheduling information for the Haswell subtarget with the following changes:

Regrouped the instructions after adding appropriate load + store latencies.
Added scheduling for missing instructions such as the GATHER instrs.
The changes were made after revisiting the latencies impact of all memory uOps.

Reviewers: RKSimon, zvi, craig.topper, apilipenko
Differential Revision: https://reviews.llvm.org/D40021

Change-Id: Iaf6c1f5169add1552845a8a566af4e5a359217a7
llvm-svn: 320137
2017-12-08 09:48:44 +00:00
Abderrazek Zaafrani 2c80e4c7c3 [AArch64] Avoid SIMD interleaved store instruction for Exynos.
Replace interleaved store instructions by equivalent and more efficient instructions based on latency cost model.
Https://reviews.llvm.org/D38196

llvm-svn: 320123
2017-12-08 00:58:49 +00:00
Derek Schuff 9e1baeda74 Revert "[WebAssemby] Support main functions with alternate signatures."
This reverts commit 959e37e669b0c3cfad4cb9f1f7c9261ce9f5e9ae.
That commit doesn't handle the case where main is declared rather than defined,
in particular the even-more special case where main is a prototypeless
declaration (which is of course the one actually used by musl currently).

llvm-svn: 320121
2017-12-08 00:39:54 +00:00
Craig Topper 323ba39f10 [X86] Handle alls version of vXi1 insert_vector_elt with a constant index without falling back to shuffles.
We previously only supported inserting to the LSB or MSB where it was easy to zero to perform an OR to insert.

This change effectively extracts the old value and the new value, xors them together and then xors that single bit with the correct location in the original vector. This will cancel out the old value in the first xor leaving the new value in the position.

The way I've implemented this uses 3 shifts and two xors and uses an additional register. We can avoid the additional register at the cost of another shift.

llvm-svn: 320120
2017-12-08 00:16:09 +00:00
Bill Seurer 957a076cce [PowerPC][asan] Update asan to handle changed memory layouts in newer kernels
In more recent Linux kernels with 47 bit VMAs the layout of virtual memory
for powerpc64 changed causing the address sanitizer to not work properly. This
patch adds support for 47 bit VMA kernels for powerpc64 and fixes up test
cases.

https://reviews.llvm.org/D40907

There is an associated patch for compiler-rt.

Tested on several 4.x and 3.x kernel releases.

llvm-svn: 320109
2017-12-07 22:53:33 +00:00
Eric Christopher a469acac03 Temporarily revert "[PowerPC] Allow tail calls of fastcc functions from C CallingConv functions."
It is causing sanitizer failures on llvm tests in a bootstrapped compiler. No bot link since it's currently down, but following up to get the bot up.

This reverts commit r319218.

llvm-svn: 320106
2017-12-07 22:26:19 +00:00
Xinliang David Li 4b0027f671 [PGO] detect infinite loop and form MST properly
Differential Revision: http://reviews.llvm.org/D40873

llvm-svn: 320104
2017-12-07 22:23:28 +00:00
Jessica Paquette 59948666fb [MachineOutliner] Fix offset overflow check
The offset overflow check before was incorrect. It would always give the
correct result, but it was comparing the SCALED potential fixed-up offset
against an UNSCALED minimum/maximum. As a result, the outliner was missing a
bunch of frame setup/destroy instructions that ought to have been safe to
outline. This fixes that, and adds an instruction to the .mir test that
failed the old test.
  

llvm-svn: 320090
2017-12-07 21:51:43 +00:00
Mark Searles 9ebdbb433a [AMDGPU] Revert "[AMDGPU] Add options for waitcnt pass debugging; add instr count in debug output."
Patch caused a buildbot failure; http://lab.llvm.org:8011/builders/lld-x86_64-darwin13/builds/15733/steps/build_Lld/logs/stdio :
lib/Target/AMDGPU/SIInsertWaitcnts.cpp:396:11: error: private field 'InstCnt' is not used [-Werror,-Wunused-private-field]
  int32_t InstCnt = 0;
          ^
1 error generated.
"
This reverts commit 71627f79010aafe74fdcba901bba28dd7caa0869.

llvm-svn: 320086
2017-12-07 21:14:41 +00:00
Mark Searles a84d23489a [AMDGPU] Add options for waitcnt pass debugging; add instr count in debug output.
-amdgpu-waitcnt-forcezero={1|0}  Force all waitcnt instrs to be emitted as s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
-amdgpu-waitcnt-forceexp=<n>  Force emit a s_waitcnt expcnt(0) before the first <n> instrs
-amdgpu-waitcnt-forcelgkm=<n> Force emit a s_waitcnt lgkmcnt(0) before the first <n> instrs
-amdgpu-waitcnt-forcevm=<n>   Force emit a s_waitcnt vmcnt(0) before the first <n> instrs

Differential Revision: https://reviews.llvm.org/D40091

llvm-svn: 320084
2017-12-07 20:36:39 +00:00
Mark Searles d29f24acfb [AMDGPU] Add GCNHazardRecognizer::checkInlineAsmHazards() and GCNHazardRecognizer::checkVALUHazardsHelper(). checkInlineAsmHazards() checks INLINEASM for hazards that we particularly care about (so not exhaustive); this patch adds a check for INLINEASM that defs vregs that hold data-to-be stored by immediately preceding store of more than 8 bytes. If the instr were not within an INLINEASM, this scenario would be handled by checkVALUHazard(). Add checkVALUHazardsHelper(), which will be called by both checkVALUHazards() and checkInlineAsmHazards().
Differential Revision: https://reviews.llvm.org/D40098

llvm-svn: 320083
2017-12-07 20:34:25 +00:00
Craig Topper dfc79c7c33 [X86] Fix InsertBitToMaskVector to only issue KSHIFTS of native size so that upper bits are properly zeroed.
There's no v2i1 or v4i1 kshift, and v8i1 is only supported with AVXDQ. Isel has fake patterns to extend these types to native shifts, but makes no guarantees about the value of any bits shifted in when shifting right.

This patch promotes the vector to a type that supports a native shift first and only allows inserting into the msb of a native sized shift.

I've constructed this in a way that doesn't do the promotion if we're going to fallback to using a xmm/ymm/zmm shuffle. I think I have a plan to remove the shuffle fall back entirely. In which case we this can be simplified, but I wanted to fix the correctness issue first.

llvm-svn: 320081
2017-12-07 20:10:04 +00:00
Sanjay Patel 6cfc136870 [InstCombine] add tests for abs using bit hackery; NFC
llvm-svn: 320068
2017-12-07 18:13:33 +00:00
Simon Pilgrim 386b23f1fa [X86] Tag BMI/BMI2/TBM instructions scheduler classes
Put these under UNARY/BINOP ALU itinerary classes for now - seems to be a good average value

llvm-svn: 320064
2017-12-07 17:37:39 +00:00
Krzysztof Parzyszek 039d4d9286 [Hexagon] Generate HVX code for basic arithmetic operations
Handle and, or, xor, add, sub, mul for vectors of i8, i16, and i32.

llvm-svn: 320063
2017-12-07 17:37:28 +00:00
Simon Pilgrim d2e93e76b8 [X86][TBM] Add TBM scheduling tests
llvm-svn: 320062
2017-12-07 17:23:00 +00:00
Craig Topper 5db260fca4 [X86] Rename function in recently added test case to not be 'main' returning 'void'. NFC
llvm-svn: 320059
2017-12-07 17:02:49 +00:00
Davide Italiano f6e180d523 [DebugInfo] Move this test to X86/ now that it specifies a triple.
Should bring back the arm/arm64 bots. Reported by Yvan Roux.

llvm-svn: 320057
2017-12-07 16:10:39 +00:00
Simon Pilgrim 2983b46973 [X86] Tag SALC instructions scheduler class
Treat these the same as LAHF/SAHF (although its not a x86_64 instruction)

llvm-svn: 320055
2017-12-07 16:07:06 +00:00
Simon Pilgrim ffce0d8fbc [X86] Add LAHF/SAHF scheduling test
llvm-svn: 320054
2017-12-07 16:04:20 +00:00
Simon Pilgrim a383f84233 [X86] Add SALC scheduling test
llvm-svn: 320052
2017-12-07 15:46:58 +00:00
Simon Pilgrim f1d599adb2 [X86] Tag LZCNT/TZCNT instructions scheduler classes
Tagged as IMUL instructions for a reasonable approximation (ALU tends to be a lot faster) - POPCNT is currently tagged as FAdd which I think should be replaced with IMUL as well

llvm-svn: 320051
2017-12-07 15:24:14 +00:00
Sanjay Patel 9012391af1 [DAGCombiner] eliminate shuffle of insert element
I noticed this pattern in D38316 / D38388. We failed to combine a shuffle that is either 
repeating a scalar insertion at the same position in a vector or translated to a different 
element index.

Like the earlier patch, this could be an instcombine too, but since we opted to make this 
a DAG transform earlier, I've made this one a DAG patch too.

We do not need any legality checking because the new insert is identical to the existing 
insert except that it may have a different constant insertion operand.

The constant insertion test in test/CodeGen/X86/vector-shuffle-combining.ll was the 
motivation for D38756.

Differential Revision: https://reviews.llvm.org/D40209

llvm-svn: 320050
2017-12-07 15:17:58 +00:00
Igor Laevsky 4a4f2e8c67 [InstCombine] Don't crash on out of bounds index in the insertelement
Differential Revision: https://reviews.llvm.org/D40390

llvm-svn: 320049
2017-12-07 15:00:52 +00:00
Simon Pilgrim ff5212091a [X86][FMA] Regenerate fma schedule tests
llvm-svn: 320048
2017-12-07 14:51:47 +00:00
Simon Pilgrim 60411d9a8c [X86] Tag RDRAND/RDSEED instruction scheduler classes
llvm-svn: 320045
2017-12-07 14:18:48 +00:00
Simon Pilgrim 9a2898ed22 [X86] Regenerate RDTSC codegen tests
llvm-svn: 320042
2017-12-07 13:50:29 +00:00
Dan Gohman cdaa87dd2e [WebAssemby] Support main functions with alternate signatures.
WebAssembly requires caller and callee signatures to match, so the usual
C runtime trick of calling main and having it just work regardless of
whether main is defined as '()' or '(int argc, char *argv[])' doesn't
work. Extend the FixFunctionBitcasts pass to rewrite main to use the
latter form.

llvm-svn: 320041
2017-12-07 13:49:27 +00:00
Simon Pilgrim 439679c085 [X86][RDSEED] Add rdseed scheduling tests
llvm-svn: 320040
2017-12-07 13:47:17 +00:00
Simon Pilgrim eb87fe62ec [X86][RDRAND] Add rdrand scheduling tests
llvm-svn: 320039
2017-12-07 13:46:47 +00:00
Alex Bradbury f8f4b90544 [RISCV] MC layer support for the jump/branch instructions of the RVC extension
Differential Revision: https://reviews.llvm.org/D40002
    
Patch by Shiva Chen.

llvm-svn: 320038
2017-12-07 13:19:57 +00:00
Alex Bradbury 9f6aec4b7a [RISCV] MC layer support for load/store instructions of the C (compressed) extension
Differential Revision: https://reviews.llvm.org/D40001
    
Patch by Shiva Chen.

llvm-svn: 320037
2017-12-07 12:50:32 +00:00
Nikolai Bozhenov 1cf9c54e5c [Nios2] final infrastructure to provide compilation of a return from a function
This patch includes all missing functionality needed to provide first
compilation of a simple program that just returns from a function.
I've added a test case that checks for "ret" instruction printed in assembly
output.

Patch by Andrei Grischenko (andrei.l.grischenko@intel.com)
Differential revision: https://reviews.llvm.org/D39688

llvm-svn: 320035
2017-12-07 12:35:02 +00:00
Andrew V. Tischenko 44cfc51415 Add proper BTVER2 sched support for MOV instr.
Differential Revision: https://reviews.llvm.org/D40345

llvm-svn: 320034
2017-12-07 11:19:49 +00:00
Jonas Devlieghere e385d00960 [dsymutil] Add -verify option to run DWARF verifier after linking.
This patch adds support for running the DWARF verifier on the linked
debug info files. If the -verify options is specified and verification
fails, dsymutil exists with abort with non-zero exit code. This behavior
is *not* enabled by default.

Differential revision: https://reviews.llvm.org/D40777

llvm-svn: 320033
2017-12-07 11:17:19 +00:00
Alex Bradbury 72281a228b [RISCV] Add missed tests for RV64D MC layer support
Add tests missed in r320029.

llvm-svn: 320031
2017-12-07 11:05:38 +00:00
Alex Bradbury 4dd94e0ccd [RISCV] MC layer support for the standard RV64F instruction set extension
llvm-svn: 320028
2017-12-07 11:02:55 +00:00
Alex Bradbury 48f95a655d [RISCV] MC layer support for the standard RV64A instruction set extension
llvm-svn: 320027
2017-12-07 10:59:12 +00:00
Alex Bradbury 81def7224e [RISCV] MC layer support for the standard RV64M instruction set extension
llvm-svn: 320026
2017-12-07 10:56:07 +00:00
Alex Bradbury a6e6248307 [RISCV] MC layer support for the standard RV64I instructions
llvm-svn: 320024
2017-12-07 10:53:48 +00:00
Alex Bradbury 7bc2a95bb9 [RISCV] MC layer support for the standard RV32D instruction set extension
As the FPR32 and FPR64 registers have the same names, use 
validateTargetOperandClass in RISCVAsmParser to coerce a parsed FPR32 to an 
FPR64 when necessary. The rest of this patch is very similar to the RV32F 
patch.

Differential Revision: https://reviews.llvm.org/D39895

llvm-svn: 320023
2017-12-07 10:46:23 +00:00
Francis Visoiu Mistrih a8a83d150f [CodeGen] Use MachineOperand::print in the MIRPrinter for MO_Register.
Work towards the unification of MIR and debug output by refactoring the
interfaces.

For MachineOperand::print, keep a simple version that can be easily called
from `dump()`, and a more complex one which will be called from both the
MIRPrinter and MachineInstr::print.

Add extra checks inside MachineOperand for detached operands (operands
with getParent() == nullptr).

https://reviews.llvm.org/D40836

* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" -o -name "*.s" \) -type f -print0 | xargs -0 sed -i '' -E 's/kill: ([^ ]+) ([^ ]+)<def> ([^ ]+)/kill: \1 def \2 \3/g'
* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" -o -name "*.s" \) -type f -print0 | xargs -0 sed -i '' -E 's/kill: ([^ ]+) ([^ ]+) ([^ ]+)<def>/kill: \1 \2 def \3/g'
* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" -o -name "*.s" \) -type f -print0 | xargs -0 sed -i '' -E 's/kill: def ([^ ]+) ([^ ]+) ([^ ]+)<def>/kill: def \1 \2 def \3/g'
* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" -o -name "*.s" \) -type f -print0 | xargs -0 sed -i '' -E 's/<def>//g'
* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" -o -name "*.s" \) -type f -print0 | xargs -0 sed -i '' -E 's/([^ ]+)<kill>/killed \1/g'
* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" -o -name "*.s" \) -type f -print0 | xargs -0 sed -i '' -E 's/([^ ]+)<imp-use,kill>/implicit killed \1/g'
* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" -o -name "*.s" \) -type f -print0 | xargs -0 sed -i '' -E 's/([^ ]+)<dead>/dead \1/g'
* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" -o -name "*.s" \) -type f -print0 | xargs -0 sed -i '' -E 's/([^ ]+)<def[ ]*,[ ]*dead>/dead \1/g'
* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" -o -name "*.s" \) -type f -print0 | xargs -0 sed -i '' -E 's/([^ ]+)<imp-def[ ]*,[ ]*dead>/implicit-def dead \1/g'
* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" -o -name "*.s" \) -type f -print0 | xargs -0 sed -i '' -E 's/([^ ]+)<imp-def>/implicit-def \1/g'
* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" -o -name "*.s" \) -type f -print0 | xargs -0 sed -i '' -E 's/([^ ]+)<imp-use>/implicit \1/g'
* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" -o -name "*.s" \) -type f -print0 | xargs -0 sed -i '' -E 's/([^ ]+)<internal>/internal \1/g'
* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" -o -name "*.s" \) -type f -print0 | xargs -0 sed -i '' -E 's/([^ ]+)<undef>/undef \1/g'

llvm-svn: 320022
2017-12-07 10:40:31 +00:00
Alex Bradbury 0d6cf90663 [RISCV] MC layer support for the standard RV32F instruction set extension
The most interesting part of this patch is probably the handling of 
rounding mode arguments. Sadly, the RISC-V assembler handles floating point 
rounding modes as a special "argument" when it would be more consistent to 
handle them like the atomics, opcode suffixes. This patch supports parsing 
this optional parameter, using InstAlias to allow parsing these floating point 
instructions when no rounding mode is specified.

Differential Revision: https://reviews.llvm.org/D39893

llvm-svn: 320020
2017-12-07 10:26:05 +00:00
Alex Bradbury d590c85753 [TableGen] Give the option of tolerating duplicate register names
A number of architectures re-use the same register names (e.g. for both 32-bit 
FPRs and 64-bit FPRs). They are currently unable to use the tablegen'erated 
MatchRegisterName and MatchRegisterAltName, as tablegen (when built with 
asserts enabled) will fail.

When the AllowDuplicateRegisterNames in AsmParser is set, duplicated register 
names will be tolerated. A backend can then coerce registers to the desired 
register class by (for instance) implementing validateTargetOperandClass.

At least the in-tree Sparc backend could benefit from this, as does RISC-V 
(single and double precision floating point registers).

Differential Revision: https://reviews.llvm.org/D39845

llvm-svn: 320018
2017-12-07 09:51:55 +00:00
Gadi Haber dd62ac49cb [X86][FMA][FMA4]: Adding full coverage of MC encoding for the FMA, FMA4 isa sets.<NFC>
NFC.
 Adding MC regressions tests to cover the FMA and FMA4 ISA sets.
 This patch is part of a larger task to cover MC encoding of all X86 ISA Sets starting revision https://reviews.llvm.org/D39952

Reviewers: craig.topper, RKSimon, zvi
Differential Revision: https://reviews.llvm.org/D40880

Change-Id: Ie39c0edce69ad647076b3d4e816948b2b6e1a9e4
llvm-svn: 320016
2017-12-07 09:16:34 +00:00
Gadi Haber e33a0cb8a8 [X86][X87]: Adding full coverage of MC encoding for all X87 ISA Sets.<NFC>
NFC.
 Currently, not all the X86 ISA Sets are covered by the MC regressions tests for X86.
 A full coverage needs to be added for each ISA set and for both 32bit and 64bit instructions + registers.
 This patch includes MC assembly tests for the X87 32bit and 64bit.

Reviewers: craigt, RKSimon, zvi
Differential Revision: https://reviews.llvm.org/D39952

Change-Id: I55e1719c09a70644a6a4073c720cb5341c80fee9
llvm-svn: 320015
2017-12-07 09:00:19 +00:00
Igor Laevsky 54d1ff0a58 [InstSimplify] Add tests for the rL319894
Differential Revision: https://reviews.llvm.org/D40650

llvm-svn: 320014
2017-12-07 08:52:24 +00:00
Mikael Holmen b5deac444d Skip DBG instr in OptimizePHIs when looking for dead PHI cycles
Summary:
Changed use_instructions() to use_nodbg_instructions() when
building an instruction set.

We don't want the presence of debug info to affect the code
we generate.

Reviewers: dblaikie, Eugene.Zelenko, chandlerc, aprantl

Reviewed By: aprantl

Subscribers: aprantl, llvm-commits

Differential Revision: https://reviews.llvm.org/D40882

llvm-svn: 320010
2017-12-07 07:01:21 +00:00
Leslie Zhai 8543d53fd9 [AVR] Override ParseDirective
Reviewers: dylanmckay, kparzysz

Reviewed By: dylanmckay

Differential Revision: https://reviews.llvm.org/D38029

llvm-svn: 320009
2017-12-07 06:56:09 +00:00
Sam Clegg 8460b26403 Revert "[WebAssembly] Import the linear memory and function table."
We need to a little time to prepare and lld-side change that
supports this.

Original change: https://reviews.llvm.org/D40875

llvm-svn: 320003
2017-12-07 03:05:45 +00:00
Sam Clegg e1694f9bf8 [WebAssembly] section kind can be code
Currently, when creating a named section, the Wasm
frontend forces it to use `SectionKind::Data`, whereas
in fact C++ does generate code sections with custom
names.

Patch by Nicholas Wilson

Differential Revision: https://reviews.llvm.org/D40906

llvm-svn: 320002
2017-12-07 02:55:51 +00:00
Davide Italiano 7495c762d1 [DebugInfo] Explicitly pass a triple to this test.
As we emit different linetables format on different operating
systems, this currently fails on linux. Speculative commit
to fix the bots.

llvm-svn: 319997
2017-12-07 01:22:10 +00:00
Davide Italiano 23b3f6da14 [MC/Dwarf] Use the older DWARF linetables format on Darwin.
dsymutil doesn't yet understand the new format and the change,
among others, breaks a large fraction of the debugger tests on
mac OS.

rdar://problem/35856354

llvm-svn: 319995
2017-12-07 00:57:25 +00:00
Dan Gohman 5cf6473903 [WebAssembly] Don't try to emit size information for unsized types
Patch by John Sully!

Fixes PR35164.

Differential Revision: https://reviews.llvm.org/D39519

llvm-svn: 319991
2017-12-07 00:14:30 +00:00
Dan Gohman 96d22e12a2 [WebAssembly] Import the linear memory and function table.
Instead of having .o files contain linear-memory and function table
definitions, use imports. This is more consistent with the stack pointer
being imported, and it's consistent with the linker being the one to
decide whether linear memory and function table are imported or defined
in the linked output. This implements tool-conventions #23.

Differential Revision: https://reviews.llvm.org/D40875

llvm-svn: 319989
2017-12-06 23:57:11 +00:00
Florian Hahn 5d6a4e43ba [AArch64] Add patterns to replace fsub fmul with fma fneg.
Summary:
This patch adds MachineCombiner patterns for transforming
(fsub (fmul x y) z) into (fma x y (fneg z)). This has a lower
latency on micro architectures where fneg is cheap.

Patch based on work by George Steed.

Reviewers: rengolin, joelkevinjones, joel_k_jones, evandro, efriedma

Reviewed By: evandro

Subscribers: aemerson, javed.absar, llvm-commits, kristof.beyls

Differential Revision: https://reviews.llvm.org/D40306

llvm-svn: 319980
2017-12-06 22:48:36 +00:00
Adam Nemet a502ee73c4 [LV] Interleaved access vectorization: fix computing new alias info
As a new access is generated spanning across multiple fields, we need to
propagate alias info from all the fields to form the most generic alias info.

rdar://35602528

Differential Revision: https://reviews.llvm.org/D40617

llvm-svn: 319979
2017-12-06 22:42:24 +00:00
Krzysztof Parzyszek d2967868be [Hexagon] Recognize vdealb, vdealh, vshuffb and vshuffh specifically
llvm-svn: 319978
2017-12-06 22:41:49 +00:00
Krzysztof Parzyszek 64533cf630 [Hexagon] Handle perfect shuffles on single vectors
llvm-svn: 319965
2017-12-06 21:25:03 +00:00
Sanjay Patel b6404a8ca6 [InstCombine] canonicalize constant-minus-boolean to select-of-constants
This restores the half of:
https://reviews.llvm.org/rL75531
that was reverted at:
https://reviews.llvm.org/rL159230

For the x86 case mentioned there, we now produce:
leal 1(%rdi), %eax
subl %esi, %eax

We have target hooks to invert this in DAGCombiner (and x86 is enabled) with:
https://reviews.llvm.org/rL296977
https://reviews.llvm.org/rL311731

AArch64 and possibly other targets would probably benefit from enabling those hooks too. 
See PR30327:
https://bugs.llvm.org/show_bug.cgi?id=30327#c2

Differential Revision: https://reviews.llvm.org/D40612

llvm-svn: 319964
2017-12-06 21:22:57 +00:00
Dan Gohman ad19047d83 [WebAssembly] Remove WASM_STACK_POINTER.
WASM_STACK_POINTER and the .stack_pointer directive are no longer needed
now that the stack pointer global is an import.

llvm-svn: 319956
2017-12-06 20:56:40 +00:00
Simon Pilgrim 9afbe77a91 [X86][AVX512] Tag mask reg op instruction scheduler classes
llvm-svn: 319945
2017-12-06 19:36:00 +00:00
Rui Ueyama efb5024e57 [COFF] Ignore semicolons in module definition identifiers
Patch by David Major.

The NSS project's .def files make heavy use of semicolons in a
frightening attempt at portability:
https://hg.mozilla.org/projects/nss/raw-file/tip/lib/ckfw/capi/nsscapi.def

lld-link was treating the semicolon as part of the export name,
resulting in unresolved symbols. This patch includes ';' in the list of
characters to split on.

Differential Revision: https://reviews.llvm.org/D39968

llvm-svn: 319933
2017-12-06 19:18:24 +00:00
Zachary Turner c221dc71b1 Update obj2yaml and yaml2obj for .debug$H section.
Differential Revision: https://reviews.llvm.org/D40842

llvm-svn: 319925
2017-12-06 18:58:48 +00:00
Simon Pilgrim 7724b03cde [X86][SSE] Regenerate vpmovm2*/vpmov*2m avx512 schedule tests
llvm-svn: 319921
2017-12-06 18:47:37 +00:00
Simon Pilgrim 809c024b3d [X86][AVX2] Tag MASKMOV instruction scheduler classes
llvm-svn: 319915
2017-12-06 18:24:48 +00:00
Craig Topper fa172a5251 [X86] Regenerate test for r319778
llvm-svn: 319914
2017-12-06 18:04:39 +00:00
Simon Pilgrim df05251921 [X86][AVX512] Tag aligned/unaligned move instruction scheduler classes
llvm-svn: 319913
2017-12-06 17:59:26 +00:00
Simon Pilgrim 3ee91ade9b [X86][AVX] Regenerate vpmovm2*/vpmov*2m avx512 schedule tests
llvm-svn: 319912
2017-12-06 17:57:18 +00:00
Zvi Rackover 2e6e88f689 InstructionSimplify: 'extractelement' with an undef index is undef
Summary:
An undef extract index can be arbitrarily chosen to be an
out-of-range index value, which would result in the instruction being undef.

This change closes a gap identified while working on lowering vector permute intrinsics
with variable index vectors to pure LLVM IR.

Reviewers: arsenm, spatel, majnemer

Reviewed By: arsenm, spatel

Subscribers: fhahn, nhaehnle, wdng, llvm-commits

Differential Revision: https://reviews.llvm.org/D40231

llvm-svn: 319910
2017-12-06 17:51:46 +00:00
Artem Belevich a659d2590e [NVPTX,CUDA] Added llvm.nvvm.fns intrinsic and matching __nvvm_fns builtin in clang.
Differential Revision: https://reviews.llvm.org/D40872

llvm-svn: 319909
2017-12-06 17:50:05 +00:00
Zvi Rackover ffaed72089 AMDGPU Tests: Change a case to be run with -O0
D40231 requires to run case with -O0 to prevent InstructionSimplify from
transforming an extractelement with undef index.

llvm-svn: 319907
2017-12-06 17:40:09 +00:00
Adam Nemet 9e5e51aeed [opt-viewer] Suppress noisy Swift remarks
Most likely, this is not how we want to handle this in the long term.  This
code should probably be in the Swift repo and somehow plugged into the
opt-viewer.  This is still however very experimental at this point so I don't
want to over-engineer it at this point.

llvm-svn: 319902
2017-12-06 16:50:50 +00:00
Krzysztof Parzyszek 7d37dd8902 [Hexagon] Generate HVX code for vector construction and access
Support for:
  - build vector,
  - extract vector element, subvector,
  - insert vector element, subvector,
  - shuffle.

llvm-svn: 319901
2017-12-06 16:40:37 +00:00
Simon Pilgrim aa902be158 [X86][AVX512] Tag BROADCAST instruction scheduler classes
llvm-svn: 319900
2017-12-06 15:48:40 +00:00
Nirav Dave 7d8f3e0c93 [ARM][AArch64][DAG] Reenable post-legalize store merge
Reenable post-legalize stores with constant merging computation and
corresponding test case.

 * Properly truncate store merge constants
 * Disable merging of truncated stores floating points
 * Ensure merges of constant stores into a single vector are
   constructed from legal elements.

Reviewers: eastig, efriedma

Reviewed By: eastig

Subscribers: spatel, rengolin, aemerson, javed.absar, kristof.beyls, hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D40701

llvm-svn: 319899
2017-12-06 15:30:13 +00:00
Simon Pilgrim a9282309e5 [X86][AVX512] Regenerate vpmovm2*/vpmov*2m avx512 schedule tests
llvm-svn: 319895
2017-12-06 14:07:38 +00:00
Igor Laevsky 03655c7636 [InstSimplify] Fold insertelement into undef if index is out of bounds
Differential Revision: https://reviews.llvm.org/D40650

llvm-svn: 319894
2017-12-06 14:04:45 +00:00
Jonas Paulsson 19380bae05 [SystemZ] Bugfix in expandRxSBG()
Csmith discovered a program that caused wrong code generation with -O0:

When handling a SIGN_EXTEND in expandRxSBG(), RxSBG.BitSize may be less than
the Input width (if a truncate was previously traversed), so maskMatters()
should be called with a masked based on the width of the sign extend result
instead.

Review: Ulrich Weigand
llvm-svn: 319892
2017-12-06 13:53:24 +00:00
Simon Dardis 515f42cbaa [mips] Fix definition of 'bc' instruction
llvm-svn: 319888
2017-12-06 12:42:49 +00:00
Hans Wennborg 146a9c3e51 Revert r319482 and r319483 "[memcpyopt] Teach memcpyopt to optimize across basic blocks"
This caused PR35519.

> [memcpyopt] Teach memcpyopt to optimize across basic blocks
>
> This teaches memcpyopt to make a non-local memdep query when a local query
> indicates that the dependency is non-local. This notably allows it to
> eliminate many more llvm.memcpy calls in common Rust code, often by 20-30%.
>
> Fixes PR28958.
>
> Differential Revision: https://reviews.llvm.org/D38374
>

> [memcpyopt] Commit file missed in r319482.
>
> This change was meant to be included with r319482 but was accidentally
> omitted.

llvm-svn: 319873
2017-12-06 01:47:55 +00:00
Vlad Tsyrklevich 0b40f21134 Revert "[DAGCombine] Move AND nodes to multiple load leaves"
This reverts commit r319773. It was causing some buildbots to hang, e.g.
http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-android/builds/5589

llvm-svn: 319867
2017-12-06 01:16:08 +00:00
Derek Schuff 3d5b86205c [WebAssembly] Fix test breakage from r319810
llvm-svn: 319865
2017-12-06 01:02:44 +00:00
Zachary Turner 330f48f2dc Regex out the local hash comparison test.
Since the local hash is a different number of bytes depending
on host architecture, we don't have a consistent value.  I
will need to re-do this test for both x86 and x64.  For now
it accepts any value for the local hash.

llvm-svn: 319864
2017-12-06 00:58:12 +00:00
Zachary Turner 376d437776 Teach llvm-pdbutil to dump types from object files.
llvm-svn: 319859
2017-12-05 23:58:18 +00:00
Xinliang David Li 45c819063a Revert r319794: [PGO] detect infinite loop and form MST properly: memory leak problem
llvm-svn: 319841
2017-12-05 21:54:01 +00:00
Anna Thomas 7df1a92543 [SafepointIRVerifier] Allow deriving pointers from unrelocated base
Summary:
This patch allows to use derived pointers (GEPs/bitcasts) of unrelocated
base pointers. We care only about the uses of these derived pointers.

It is acheived by two changes:
1. When we have enough information to say if the pointer is unrelocated at some
point or not, we walk all BBs to remove from their Contributions all valid defs
of unrelocated pointers (GEP with unrelocated base or bitcast of unrelocated
pointer).
2. When it comes to verification we just ignore instructions that were removed
at stage 1.

Patch by Daniil Suchkov!

Reviewers: anna, reames, apilipenko, mkazantsev

Reviewed By: anna, mkazantsev

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D40289

llvm-svn: 319838
2017-12-05 21:39:37 +00:00
Simon Pilgrim d495301414 [X86][AVX512] Tag BLENDM instruction scheduler classes
llvm-svn: 319833
2017-12-05 21:05:25 +00:00
Simon Pilgrim b69dae42e3 [X86][AVX512] Tag GATHER/SCATTER instruction scheduler classes
NOTE: At the moment these use the WriteLoad/WriteStore classes, which severely underestimates the costs. This needs to be reviewed.
llvm-svn: 319829
2017-12-05 20:47:11 +00:00
Paul Robinson 795ab0d94d [DWARFv5] Emit v5 line table header.
Differential Revision: https://reviews.llvm.org/D40741

llvm-svn: 319827
2017-12-05 20:35:00 +00:00
Matt Arsenault 8ae38bc0bd AMDGPU: Fix SDWA crash on inline asm
This was only searching for explicit defs,
and asserting for any implicit or variadic
instruction defs, like inline asm.

llvm-svn: 319826
2017-12-05 20:32:01 +00:00
Hans Wennborg 5df9f0878b Re-commit r319490 "XOR the frame pointer with the stack cookie when protecting the stack"
The patch originally broke Chromium (crbug.com/791714) due to its failing to
specify that the new pseudo instructions clobber EFLAGS. This commit fixes
that.

> Summary: This strengthens the guard and matches MSVC.
>
> Reviewers: hans, etienneb
>
> Subscribers: hiraditya, JDevlieghere, vlad.tsyrklevich, llvm-commits
>
> Differential Revision: https://reviews.llvm.org/D40622

llvm-svn: 319824
2017-12-05 20:22:20 +00:00
Ulrich Weigand 5bfed6cb7c [SystemZ] Validate shifted compare value in adjustForTestUnderMask
When folding a shift into a test-under-mask comparison, make sure that
there is no loss of precision when creating the shifted comparison
value.  This usually never happens, except for certain always-true
comparisons in unoptimized code.

Fixes PR35529.

llvm-svn: 319818
2017-12-05 19:42:07 +00:00
Simon Pilgrim 833c260a4b [X86][AVX512] Tag VPTRUNC/VPMOVSX/VPMOVZX instruction scheduler classes
llvm-svn: 319815
2017-12-05 19:21:28 +00:00
Rafael Espindola 74de7e0791 Simplify test.
It can use attrib instead of icacls.

llvm-svn: 319809
2017-12-05 18:26:23 +00:00
Matt Arsenault 7f0a527300 AMDGPU: Fix infinite loop with dbg_value
Surprisingly SIOptimizeExecMaskingPreRA can infinite loop
in some case with DBG_VALUE. Most tests using dbg_value are
run at -O0, so don't run this pass. This seems to only
happen when the value argument is undef.

llvm-svn: 319808
2017-12-05 18:23:17 +00:00
Joel Galenson ea0bafda8a [CVP] Remove some {s|u}sub.with.overflow checks.
This uses ConstantRange::makeGuaranteedNoWrapRegion's newly-added handling for subtraction to allow CVP to remove some subtraction overflow checks.

Differential Revision: https://reviews.llvm.org/D40039

llvm-svn: 319807
2017-12-05 18:14:24 +00:00
Simon Pilgrim 65f805fe30 [X86][X87] Tag FCMOV instruction scheduler classes
llvm-svn: 319804
2017-12-05 18:01:26 +00:00
Dan Gohman c2c997718d [WebAssembly] Implement WASM_STACK_POINTER.
Use the .stack_pointer directive to implement WASM_STACK_POINTER for
specifying a global variable to be the stack pointer.

llvm-svn: 319797
2017-12-05 17:23:43 +00:00
Dan Gohman f7172f4ab0 [WebAssembly] Don't emit .import_global for the wasm target.
.import_global is used by the ELF-based target and not needed by the wasm
target.

llvm-svn: 319796
2017-12-05 17:21:57 +00:00
Xinliang David Li cc35bc9efc [PGO] detect infinite loop and form MST properly
Differential Revision: http://reviews.llvm.org/D40702

llvm-svn: 319794
2017-12-05 17:19:41 +00:00
Rafael Espindola 20569e96e9 Delete temp file if rename fails.
Without this when lld failed to replace the output file it would leave
the temporary behind. The problem is that the existing logic is

- cancel the delete flag
- rename

We have to cancel first to avoid renaming and then crashing and
deleting the old version. What is missing then is deleting the
temporary file if the rename fails.

This can be an issue on both unix and windows, but I am not sure how
to cause the rename to fail reliably on unix. I think it can be done
on ZFS since it has an ACL system similar to what windows uses, but
adding support for checking that in llvm-lit is probably not worth it.

llvm-svn: 319786
2017-12-05 16:40:56 +00:00
Alexey Bataev d4683e6ef1 [InstCombine] Additional test for PR35354, NFC.
llvm-svn: 319783
2017-12-05 16:15:55 +00:00
Jina Nahias 51c1a627c2 [x86][AVX512] Lowering kunpack intrinsics to LLVM IR
This patch, together with a matching clang patch (https://reviews.llvm.org/D39719), implements the lowering of X86 kunpack intrinsics to IR.

Differential Revision: https://reviews.llvm.org/D39720

Change-Id: I4088d9428478f9457f6afddc90bd3d66b3daf0a1
llvm-svn: 319778
2017-12-05 15:42:56 +00:00
Bjorn Pettersson 5abbad7999 Add REQUIRES asserts in combine_loads_from_build_pair.ll
A fixup of r319771, that was causing buildbot failures.

llvm-svn: 319775
2017-12-05 15:26:01 +00:00
Sam Parker 0a436a9d62 [DAGCombine] Move AND nodes to multiple load leaves
Search from AND nodes to find whether they can be propagated back to
loads, so that the AND and load can be combined into a narrow load.
We search through OR, XOR and other AND nodes and all bar one of the
leaves are required to be loads or constants. The exception node then
needs to be masked off meaning that the 'and' isn't removed, but the
loads(s) are narrowed still.

Differential Revision: https://reviews.llvm.org/D39604

llvm-svn: 319773
2017-12-05 15:13:47 +00:00
Bjorn Pettersson 823b299fbc [DAGCombine] Handle big endian correctly in CombineConsecutiveLoads
Summary:
Found out, at code inspection, that there was a fault in
DAGCombiner::CombineConsecutiveLoads for big-endian targets.

A BUILD_PAIR is always having the least significant bits of
the composite value in element 0. So when we are doing the checks
for consecutive loads, for big endian targets, we should check
if the load to elt 1 is at the lower address and the load
to elt 0 is at the higher address.

Normally this bug only resulted in missed oppurtunities for
doing the load combine. I guess that in some rare situation it
could lead to faulty combines, but I've not seen that happen.

Note that this patch actually will trigger load combine for
some big endian regression tests.
One example is test/CodeGen/PowerPC/anon_aggr.ll where we now get
  t76: i64,ch = load<LD8[FixedStack-9]
instead of
  t37: i32,ch = load<LD4[FixedStack-10]>
  t35: i32,ch = load<LD4[FixedStack-9]>
  t41: i64 = build_pair t37, t35
before legalization. Then the legalization will split the LD8
into two loads, so the end result is the same. That should
verify that the transfomation is correct now.

Reviewers: niravd, hfinkel

Reviewed By: niravd

Subscribers: nemanjai, llvm-commits

Differential Revision: https://reviews.llvm.org/D40444

llvm-svn: 319771
2017-12-05 14:50:05 +00:00
Simon Pilgrim 71660c61e6 [X86][AVX512] Add missing scalar CMPSS/CMPSD logic scheduler classes
llvm-svn: 319770
2017-12-05 14:34:42 +00:00
Mikael Holmen 0a3e98062f Bail out of a SimplifyCFG switch table opt at undef values.
Summary:
A true or false result is expected from a comparison, but it seems the possibility of undef was overlooked, which could lead to a failed assert. This is fixed by this patch by bailing out if we encounter undef.

The bug is old and the assert has been there since the end of 2014, so it seems this is unusual enough to forego optimization.

Patch by JesperAntonsson.

Reviewers: spatel, eeckstein, hans

Reviewed By: hans

Subscribers: uabelho, llvm-commits

Differential Revision: https://reviews.llvm.org/D40639

llvm-svn: 319768
2017-12-05 14:14:00 +00:00
Simon Pilgrim b9b46394e3 [X86][AVX512] Cleanup bit logic scheduler classes
llvm-svn: 319767
2017-12-05 14:04:23 +00:00
Simon Pilgrim fd3a2632e5 [X86][AVX512] Tag scalar CVT and CMP instruction scheduler classes
llvm-svn: 319765
2017-12-05 13:49:44 +00:00
Igor Laevsky cec8f47e77 [InstCombine] Don't crash on out of bounds shifts
Differential Revision: https://reviews.llvm.org/D40649

llvm-svn: 319761
2017-12-05 12:18:15 +00:00
Simon Pilgrim aa91155960 [X86][AVX512] Tag VPCMP/VPCMPU instruction scheduler classes
Move hardcoded itinerary out to the instruction declarations. Not sure that IIC_SSE_ALU_F32P is the best schedule for integer comparisons, but I'm not going to change it right now.

llvm-svn: 319760
2017-12-05 12:14:36 +00:00
Simon Pilgrim a2b5862641 [X86][AVX512] Cleanup VPCMP scheduler classes
Move hardcoded itinerary out to the instruction declarations. Not sure that IIC_SSE_ALU_F32P is the best schedule for integer comparisons, but I'm not going to change it right now.

llvm-svn: 319758
2017-12-05 12:02:22 +00:00
Jonas Paulsson b5b91cd402 [SystemZ] set 'guessInstructionProperties = 0' and set flags as needed.
This has proven a healthy exercise, as many cases of incorrect instruction
flags were corrected in the process. As part of this, IntrWriteMem was added
to several SystemZ instrinsics.

Furthermore, a bug was exposed in TwoAddress with this change (as incorrect
hasSideEffects flags were removed and instructions could now be sunk), and
the test case for that bugfix (r319646) is included here as
test/CodeGen/SystemZ/twoaddr-sink.ll.

One temporary test regression (one extra copy) which will hopefully go away
in upcoming patches for similar cases:
test/CodeGen/SystemZ/vec-trunc-to-i1.ll

Review: Ulrich Weigand.
https://reviews.llvm.org/D40437

llvm-svn: 319756
2017-12-05 11:24:39 +00:00
Jonas Paulsson 86c40db49d [Regalloc] Generate and store multiple regalloc hints.
MachineRegisterInfo used to allow just one regalloc hint per virtual
register. This patch extends this to a vector of regalloc hints, which is
filled in by common code with sorted copy hints. Such hints will make for
more ID copies that can be removed.

NB! This improvement is currently (and hopefully temporarily) *disabled* by
default, except for SystemZ. The only reason for this is the big impact this
has on tests, which has unfortunately proven unmanageable. It was a long
while since all the tests were updated and just waiting for review (which
didn't happen), but now targets have to enable this themselves
instead. Several targets could get a head-start by downloading the tests
updates from the Phabricator review. Thanks to those who helped, and sorry
you now have to do this step yourselves.

This should be an improvement generally for any target!

The target may still create its own hint, in which case this has highest
priority and is stored first in the vector. If it has target-type, it will
not be recomputed, as per the previous behaviour.

The temporary hook enableMultipleCopyHints() will be removed as soon as all
targets return true.

Review: Quentin Colombet, Ulrich Weigand.
https://reviews.llvm.org/D38128

llvm-svn: 319754
2017-12-05 10:52:24 +00:00
Guy Blank f3cefdd350 [X86] Fix a bug in handling GRXX subclasses in Domain Reassignment pass
When trying to determine the correct Mask register class corresponding
to a GPR register class, not all register classes were handled.
This caused an assertion to be raised on some scenarios.

Differential Revision:
https://reviews.llvm.org/D40290

llvm-svn: 319745
2017-12-05 09:08:24 +00:00
Craig Topper a404ce955a [X86] Use vector widening to support sign extend from i1 when the dest type is not 512-bits and vlx is not enabled.
Previously we used a wider element type and truncated. But its more efficient to keep the element type and drop unused elements.

If BWI isn't supported and we have a i16 or i8 type, we'll extend it to be i32 and still use a truncate.

llvm-svn: 319740
2017-12-05 06:37:21 +00:00
Daniel Sanders 3c1c4c0ee0 Revert r319691: [globalisel][tablegen] Split atomic load/store into separate opcode and enable for AArch64.
Some concerns were raised with the direction. Revert while we discuss it and look into an alternative

llvm-svn: 319739
2017-12-05 05:52:07 +00:00
Craig Topper e1ba2450c2 [X86] Fix a crash if avx512bw and xop are both enabled when the IR contrains a v32i8 bitreverse.
llvm-svn: 319737
2017-12-05 04:47:12 +00:00
Matt Arsenault 9a60c3ea36 AMDGPU: Fix crash when scheduling DBG_VALUE
This calls handleMove with a DBG_VALUE instruction,
which isn't tracked by LiveIntervals. I'm not sure
this is the correct place to fix this. The generic
scheduler seems to have more deliberate region
selection that skips dbg_value.

The test is also really hard to reduce. I haven't been able
to figure out what exactly causes this particular case to
try moving the dbg_value.

llvm-svn: 319732
2017-12-05 03:09:23 +00:00
Craig Topper 276c770e57 [X86] Use vector widening to support zero extend from i1 when the dest type is not 512-bits and vlx is not enabled.
Previously we used a wider element type and truncated. But its more efficient to keep the element type and drop unused elements.

If BWI isn't supported and we have a i16 or i8 type, we'll extend it to be i32 and still use a truncate.

llvm-svn: 319728
2017-12-05 01:45:46 +00:00
Craig Topper 913b42b0e1 [X86] Don't use kunpck for vXi1 concat_vectors if the upper bits are undef.
This can be efficiently selected by a COPY_TO_REGCLASS without the need for an extra instruction.

llvm-svn: 319726
2017-12-05 01:28:06 +00:00
Craig Topper 6302012442 [X86] Use getZeroVector and remove an unnecessary creation of an APInt before calling getConstant. NFCI
The getConstant function can take care of creating the APInt internally.

getZeroVector will take care of using the correct type for the build vector to avoid re-lowering.

The test change here is because execution domain constraints apparently pass through undef inputs of a zeroing xor. So the different ordering of register allocation here caused the dependency to change.

llvm-svn: 319725
2017-12-05 01:28:04 +00:00
Jan Vesely 39aeab4f30 AMDGPU/EG: Add a new FeatureFMA and use it to selectively enable FMA instruction
Only used by pre-GCN targets
v2: fix predicate setting for FMA_Common

Differential Revision: https://reviews.llvm.org/D40692

llvm-svn: 319712
2017-12-04 23:07:28 +00:00
Jan Vesely d1c9b61e2b AMDGPU: Disable fp64 support on pre GCN asics
It's not implemented.
Passing +fp64-fp16-denormal feature enables fp64 even on asics that don't support it

v2: fix hasFP64 query

Differential Revision: https://reviews.llvm.org/D39931

llvm-svn: 319709
2017-12-04 22:57:29 +00:00
Hans Wennborg 361d4392cf Revert r319490 "XOR the frame pointer with the stack cookie when protecting the stack"
This broke the Chromium build (crbug.com/791714). Reverting while investigating.

> Summary: This strengthens the guard and matches MSVC.
>
> Reviewers: hans, etienneb
>
> Subscribers: hiraditya, JDevlieghere, vlad.tsyrklevich, llvm-commits
>
> Differential Revision: https://reviews.llvm.org/D40622
>
> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@319490 91177308-0d34-0410-b5e6-96231b3b80d8

llvm-svn: 319706
2017-12-04 22:21:15 +00:00
Matt Arsenault 68f0505263 AMDGPU: Fix creating invalid copy when adjusting dmask
Move the entire optimization to one place. Before it was possible
to adjust dmask without changing the register class of the output
instruction, since they were done in separate places. Fix all
lane sizes and move all of the optimization into the DAG folding.

llvm-svn: 319705
2017-12-04 22:18:27 +00:00
Paul Robinson 68ba772cc0 Re-submit r289925 (Update .debug_line section version to match DWARF version)
Set the .debug_line version to match the requested DWARF version,
except with a maximum of v4 because we don't support v5 yet.

Previously Chromium had issues with this patch; see PR31407.  Chromium
tool issues have been addressed, so hopefully this will go through
this time.

Patch by Katya Romanova!

Differential Revision: https://reviews.llvm.org/D38002

llvm-svn: 319699
2017-12-04 21:27:46 +00:00
Daniel Sanders c83977fc21 [globalisel][tablegen] Tests for r319691
I forgot to 'svn add' the test files.

llvm-svn: 319698
2017-12-04 21:14:34 +00:00
Hans Wennborg 7e61f24962 DAG: Match truncated rotation (PR35487)
If the truncation has been pushed past the or-node, look through it and
truncate afterwards.

Differential revision: https://reviews.llvm.org/D40792

llvm-svn: 319692
2017-12-04 20:39:57 +00:00
Daniel Sanders 04e4f47e93 [globalisel][tablegen] Split atomic load/store into separate opcode and enable for AArch64.
This patch splits atomics out of the generic G_LOAD/G_STORE and into their own
G_ATOMIC_LOAD/G_ATOMIC_STORE. This is a pragmatic decision rather than a
necessary one. Atomic load/store has little in implementation in common with
non-atomic load/store. They tend to be handled very differently throughout the
backend. It also has the nice side-effect of slightly improving the common-case
performance at ISel since there's no longer a need for an atomicity check in the
matcher table.

All targets have been updated to remove the atomic load/store check from the
G_LOAD/G_STORE path. AArch64 has also been updated to mark
G_ATOMIC_LOAD/G_ATOMIC_STORE legal.

There is one issue with this patch though which also affects the extending loads
and truncating stores. The rules only match when an appropriate G_ANYEXT is
present in the MIR. For example,
  (G_ATOMIC_STORE (G_TRUNC:s16 (G_ANYEXT:s32 (G_ATOMIC_LOAD:s16 X))))
will match but:
  (G_ATOMIC_STORE (G_ATOMIC_LOAD:s16 X))
will not. This shouldn't be a problem at the moment, but as we get better at
eliminating extends/truncates we'll likely start failing to match in some
cases. The current plan is to fix this in a patch that changes the
representation of extending-load/truncating-store to allow the MMO to describe
a different type to the operation.

llvm-svn: 319691
2017-12-04 20:39:32 +00:00
Matthias Braun de4c0ae205 Add missing triple args to tests
llvm-svn: 319686
2017-12-04 20:08:28 +00:00
Haicheng Wu 234eabaf07 [ConstantFold] Support vector index when factoring out GEP index into preceding dimensions
Follow-up of r316824. This patch supports the vector type for both current and
previous index when factoring out the current one into the previous one.

Differential Revision: https://reviews.llvm.org/D39556

llvm-svn: 319683
2017-12-04 19:56:33 +00:00
Sanjoy Das aa92cae14e [BypassSlowDivision] Improve our handling of divisions by constants
(This reapplies r314253.  r314253 was reverted on r314482 because of a
correctness regression on P100, but that regression was identified to be
something else.)

Summary:
Don't bail out on constant divisors for divisions that can be narrowed without
introducing control flow .  This gives us a 32 bit multiply instead of an
emulated 64 bit multiply in the generated PTX assembly.

Reviewers: jlebar

Subscribers: jholewinski, mcrosier, llvm-commits

Differential Revision: https://reviews.llvm.org/D38265

llvm-svn: 319677
2017-12-04 19:21:58 +00:00
Matthias Braun 7eae251bae MachineVerifier: undef phi arg doesn't need to be live-out from predecessor
Differential Revision: https://reviews.llvm.org/D40756

llvm-svn: 319674
2017-12-04 18:57:48 +00:00
Francis Visoiu Mistrih 25528d6de7 [CodeGen] Unify MBB reference format in both MIR and debug output
As part of the unification of the debug format and the MIR format, print
MBB references as '%bb.5'.

The MIR printer prints the IR name of a MBB only for block definitions.

* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" \) -type f -print0 | xargs -0 sed -i '' -E 's/BB#" << ([a-zA-Z0-9_]+)->getNumber\(\)/" << printMBBReference(*\1)/g'
* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" \) -type f -print0 | xargs -0 sed -i '' -E 's/BB#" << ([a-zA-Z0-9_]+)\.getNumber\(\)/" << printMBBReference(\1)/g'
* find . \( -name "*.txt" -o -name "*.s" -o -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" \) -type f -print0 | xargs -0 sed -i '' -E 's/BB#([0-9]+)/%bb.\1/g'
* grep -nr 'BB#' and fix

Differential Revision: https://reviews.llvm.org/D40422

llvm-svn: 319665
2017-12-04 17:18:51 +00:00
Pablo Barrio 2b4385846c Fix function pointer tail calls in armv8-M.base
Summary:
The compiler fails with the following error message:

fatal error: error in backend: ran out of registers during
register allocation

Tail call optimization for Armv8-M.base fails to meet all the required
constraints when handling calls to function pointers where the
arguments take up r0-r3. This is because the pointer to the
function to be called can only be stored in r0-r3, but these are
all occupied by arguments. This patch makes sure that tail call
optimization does not try to handle this type of calls.

Reviewers: chill, MatzeB, olista01, rengolin, efriedma

Reviewed By: olista01, efriedma

Subscribers: efriedma, aemerson, javed.absar, llvm-commits, kristof.beyls

Differential Revision: https://reviews.llvm.org/D40706

llvm-svn: 319664
2017-12-04 16:55:49 +00:00
Sam Kolton 5f7f32c382 [AMDGPU] SDWA: add support for PRESERVE into SDWA peephole.
Summary:

Reviewers: arsenm, vpykhtin, rampitec

Subscribers: kzhuravl, wdng, nhaehnle, mgorny, yaxunl, dstuttard, tpr, t-tye

Differential Revision: https://reviews.llvm.org/D37817

llvm-svn: 319662
2017-12-04 16:22:32 +00:00
Sam Parker 987b2c9966 [ARM] CodeGen test
Add another and + load DAG combine test.

llvm-svn: 319660
2017-12-04 15:14:59 +00:00
Anna Thomas 7b360434ff [Loop Predication] Teach LP about reverse loops
Summary:
Currently, we only support predication for forward loops with step
of 1.  This patch enables loop predication for reverse or
countdownLoops, which satisfy the following conditions:
   1. The step of the IV is -1.
   2. The loop has a singe latch as B(X) = X <pred>
latchLimit with pred as s> or u>
   3. The IV of the guard is the decrement
IV of the latch condition (Guard is: G(X) = X-1 u< guardLimit).

This patch was downstream for a while and is the last series of patches
that's from our LP implementation downstream.

Reviewers: apilipenko, mkazantsev, sanjoy

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D40353

llvm-svn: 319659
2017-12-04 15:11:48 +00:00
Jonas Hahnfeld 5db24d7c22 [NVPTX] Assign valid global names
PTX requires that identifiers consist only of [a-zA-Z0-9_$]. The
existing pass already ensured this for globals and this patch adds
the cleanup for functions with local linkage.

However, there was a different problem in the case of collisions
of the adjusted name: The ValueSymbolTable then automatically
appended ".N" with increasing Ns to get a unique name while helping
the ABI demangling. Special case this behavior to omit the dots and
append N directly. This will always give us legal names according
to the PTX requirements.

Differential Revision: https://reviews.llvm.org/D40573

llvm-svn: 319657
2017-12-04 14:19:33 +00:00
Oliver Stannard 7ab60605f8 Revert r319649 - [Asm, ARM] Add fallback diag for multiple invalid operands
This is causing a failure in the llvm-clang-x86_64-expensive-checks-win
buildbot, and I can't reproduce it locally, so reverting until I can work out
what is wrong.

llvm-svn: 319654
2017-12-04 13:42:22 +00:00
Oliver Stannard 7cd4db94f8 [Asm, ARM] Add fallback diag for multiple invalid operands
This adds a "invalid operands for instruction" diagnostic for
instructions where there is an instruction encoding with the correct
mnemonic and which is available for this target, but where multiple
operands do not match those which were provided. This makes it clear
that there is some combination of operands that is valid for the current
target, which the default diagnostic of "invalid instruction" does not.

Since this is a very general error, we only emit it if we don't have a
more specific error.

Differential revision: https://reviews.llvm.org/D36747

llvm-svn: 319649
2017-12-04 12:02:32 +00:00
Martin Storsjo eca862de07 [AArch64] Allow using emulated tls on platforms other than ELF
This matches how it is done on X86.

This allows using emulated tls on windows; in MinGW environments,
native tls isn't supported at the moment.

Set the right Data*bitsDirective for windows to match the existing
tests for other platforms. Make parts of the existing tests a regex,
to allow matching .section .rdata for windows, to avoid having to
duplicate the rest of the tests for windows.

Differential Revision: https://reviews.llvm.org/D40770

llvm-svn: 319644
2017-12-04 09:09:04 +00:00
Martin Storsjo c85cc41801 [ARM] Allow using emulated tls on platforms other than ELF
This matches how it is done on X86.

This allows using emulated tls on windows; in MinGW environments,
native tls isn't supported at the moment.

Differential Revision: https://reviews.llvm.org/D40769

llvm-svn: 319643
2017-12-04 09:08:55 +00:00
Craig Topper 4520d4f8ad [X86] Allow VPMAXUQ/VPMAXSQ/VPMINUQ/VPMINSQ to be used with 128/256 bit vectors when AVX512 is enabled.
These instructions can be used by widening to 512-bits and extracting back to 128/256. We do similar to several other instructions already.

llvm-svn: 319641
2017-12-04 07:21:01 +00:00
Craig Topper 67217d7eb4 [SelectionDAG] Teach computeKnownBits some improvements to ISD::SRL with a non-splat constant shift amount.
If we have a non-splat constant shift amount, the minimum shift amount can be used to infer the number of zero upper bits of the result. There's probably a lot more that we can do here, but this
fixes a case where I wanted to infer the sign bit as zero when all the shift amounts are non-zero.

llvm-svn: 319639
2017-12-04 05:38:42 +00:00
Simon Pilgrim 465a88bb92 [X86][AVX512] Tag packed F2I/I2F/F2F conversion instructions scheduler class
llvm-svn: 319636
2017-12-03 21:16:12 +00:00
Simon Pilgrim 9291053463 [X86][AVX512] Regenerate schedule tests.
llvm-svn: 319635
2017-12-03 21:07:36 +00:00
Yaxun Liu 30e4608cca CodeGen: Fix SelectionDAGISel::LowerArguments for sret addr space
SelectionDAGISel::LowerArguments assumes sret addr space is 0, which is
not true for amdgcn---amdgiz target.

This patch fixes that.

Differential Revision: https://reviews.llvm.org/D40255

llvm-svn: 319630
2017-12-03 03:31:45 +00:00
Sam Clegg a2b35dac03 Reland "[WebAssembly] Add visibility flag to Wasm symbol flags""
Original change was rL319488.

This was reverted rL319602 due to a gcc 7.1 warning.

Differential Revision: https://reviews.llvm.org/D40772

llvm-svn: 319626
2017-12-03 01:19:23 +00:00
Yaxun Liu 494770403a CodeGen: Fix pointer info in SplitVecOp_EXTRACT_VECTOR_ELT/SplitVecRes_INSERT_VECTOR_ELT
Two issues found when doing codegen for splitting vector with non-zero alloca addr space:

DAGTypeLegalizer::SplitVecRes_INSERT_VECTOR_ELT/SplitVecOp_EXTRACT_VECTOR_ELT uses dummy pointer info for creating
SDStore. Since one pointer operand contains multiply and add, InferPointerInfo is unable to
infer the correct pointer info, which ends up with a dummy pointer info for the target to lower
store and results in isel failure. The fix is to introduce MachinePointerInfo::getUnknownStack to
represent MachinePointerInfo which is known in alloca address space but without other information.

TargetLowering::getVectorElementPointer uses value type of pointer in addr space 0 for
multiplication of index and then add it to the pointer. However the pointer may be in an addr
space which has different size than addr space 0. The fix is to use the pointer value type for
index multiplication.

Differential Revision: https://reviews.llvm.org/D39758

llvm-svn: 319622
2017-12-02 22:13:22 +00:00
Simon Atanasyan d4b693bfb8 [llvm-readobj] Print static MIPS GOT
If a linked binary file contains a dynamic section, the GOT layout
defined by the dynamic section entries. In a statically linked file
the GOT is just a series of entries. This change teaches `llvm-readobj`
to print the GOT in that case. That provides a feature parity with GNU
`readelf`.

llvm-svn: 319616
2017-12-02 13:06:35 +00:00
Craig Topper 64469a18f3 [X86] Fix copy paste mistake in test case for r319612.
llvm-svn: 319613
2017-12-02 08:39:02 +00:00
Craig Topper 7d9a3b82c6 [X86] Teach the assembler to support %db8-%db15 as aliases for %dr8-%dr15.
llvm-svn: 319612
2017-12-02 08:27:46 +00:00
Craig Topper 3e846ecb5b [X86] Support %dr8-%dr15 in the assembler.
Apparently I failed to make this work when I fixed it in the disassembler way back in r224862.

llvm-svn: 319611
2017-12-02 08:27:45 +00:00
Tatyana Krasnukha f665f6a279 [ARC] Add instruction subset for the ARC backend.
Reviewers: petecoup, kparzysz

Reviewed By: petecoup

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D37983

llvm-svn: 319609
2017-12-02 05:25:17 +00:00
Nirav Dave 839ff79a8d [DAG][AArch64] Disable post-legalization store
Disable post-legalization store for AArch64 backend which is causing
errors out-of-tree.

llvm-svn: 319607
2017-12-02 04:01:26 +00:00
Heejin Ahn e74a864cec [WebAssembly] Revert r319488 "Add visibility flag to Wasm symbol flags"
This patch reportedly broke one of LLVM bots (ubuntu-gcc7.1-werror).

See http://lab.llvm.org:8011/builders/ubuntu-gcc7.1-werror/builds/3369 for
details.

llvm-svn: 319602
2017-12-02 02:05:06 +00:00
Matt Morehouse 9e658c974b Revert "[X86] Improvement in CodeGen instruction selection for LEAs."
This reverts r319543, due to ASan bot breakage.

llvm-svn: 319591
2017-12-01 22:20:26 +00:00
Nirav Dave 3e76e1e89e [DAG][ARM] Revert "Reenable post-legalize store merge"
due to failures in AArch and ARM code gen.

llvm-svn: 319587
2017-12-01 21:55:47 +00:00
Jake Ehrlich 3da7982cca [MC] Handle unknown literal register numbers in .cfi_* directives
r230670 introduced a step to map EH register numbers to standard
DWARF register numbers. This failed to consider the case when a
user .cfi_* directive uses an integer literal rather than a
register name, to specify a DWARF register number that has no
corresponding LLVM register number (e.g. a special register that
the compiler and assembler have no name for).

Fixes PR34028.

Patch by Roland McGrath

Differential Revision: https://reviews.llvm.org/D36493

llvm-svn: 319586
2017-12-01 21:44:27 +00:00
Philip Reames 6260cf71d3 [IndVars] Fix a bug introduced in r317012
Turns out we can have comparisons which are indirect users of the induction variable that we can make invariant.  In this case, there is no loop invariant value contributing and we'd fail an assert.

The test case was found by a java fuzzer and reduced.  It's a real cornercase.  You have to have a static loop which we've already proven only executes once, but haven't broken the backedge on, and an inner phi whose result can be constant folded by SCEV using exit count reasoning but not proven by isKnownPredicate.  To my knowledge, only the fuzzer has hit this case.

llvm-svn: 319583
2017-12-01 20:57:19 +00:00
Adam Nemet 9303f62255 [opt-remarks] If hotness threshold is set, ignore remarks without hotness
These are blocks that haven't not been executed during training.  For large
projects this could make a significant difference.  For the project, I was
looking at, I got an order of magnitude decrease in the size of the total YAML
files with this and r319235.

Differential Revision: https://reviews.llvm.org/D40678

Re-commit after fixing the failing testcase in rL319576, rL319577 and
rL319578.

llvm-svn: 319581
2017-12-01 20:41:38 +00:00
Fedor Sergeev 3b459c3847 IR printing improvement for loop passes - handle -print-module-scope
Summary:
Adding support for -print-module-scope similar to how it is
being done for function passes. This option causes loop-pass printer
to emit a whole-module IR instead of just a loop itself.

Reviewers: sanjoy, silvas, weimingz

Reviewed By: sanjoy

Subscribers: apilipenko, skatkov, llvm-commits

Differential Revision: https://reviews.llvm.org/D40247

llvm-svn: 319566
2017-12-01 18:33:58 +00:00
Paul Robinson ab69b477a9 [DebugInfo] Bail out if making no progress dumping line tables.
llvm-svn: 319564
2017-12-01 18:25:30 +00:00
Adam Nemet 57783730fd Revert "[opt-remarks] If hotness threshold is set, ignore remarks without hotness"
This reverts commit r319556.

Something is not working with this when used with sample-based profiling.
Investigating...

llvm-svn: 319562
2017-12-01 18:12:29 +00:00
Fedor Sergeev 94dca7c7ea IR printing improvement for function passes - introducing -print-module-scope
Summary:
When debugging function passes it happens to be rather useful to dump
the whole module before the transformation and then use this dump
to analyze this single transformation by running it separately
on that particular module state.

Introducing
    -print-module-scope
debugging option that forces all the function-level IR dumps
to become whole-module dumps.

This option builds on top of normal dumping controls like
   -print-before/after
   -filter-print-funcs

The plan is to eventually extend this option to cover other local passes
(at least loop passes) but that should go as a separate change.

Reviewers: sanjoy, weimingz, silvas, fedor.sergeev

Reviewed By: weimingz

Subscribers: apilipenko, skatkov, llvm-commits, mehdi_amini

Differential Revision: https://reviews.llvm.org/D40245

llvm-svn: 319561
2017-12-01 17:42:46 +00:00
Adam Nemet 8d1fc2b65b [opt-remarks] If hotness threshold is set, ignore remarks without hotness
These are blocks that haven't not been executed during training.  For large
projects this could make a significant difference.  For the project, I was
looking at, I got an order of magnitude decrease in the size of the total YAML
files with this and r319235.

Differential Revision: https://reviews.llvm.org/D40678

llvm-svn: 319556
2017-12-01 17:02:04 +00:00
Hans Wennborg e2470b95da Revert r319531 "[SLPVectorizer] Failure to beneficially vectorize 'copyable' elements in integer binary ops."
It causes builds to fail with "Instruction does not dominate all uses" (PR35497).

> Patch tries to improve vectorization of the following code:
>
> void add1(int * __restrict dst, const int * __restrict src) {
>   *dst++ = *src++;
>   *dst++ = *src++ + 1;
>   *dst++ = *src++ + 2;
>   *dst++ = *src++ + 3;
> }
> Allows to vectorize even if the very first operation is not a binary add, but just a load.
>
> Fixed issues related to previous commit.
>
> Reviewers: spatel, mzolotukhin, mkuper, hfinkel, RKSimon, filcab, ABataev
>
> Reviewed By: ABataev, RKSimon
>
> Subscribers: llvm-commits, RKSimon
>
> Differential Revision: https://reviews.llvm.org/D28907

llvm-svn: 319550
2017-12-01 16:17:24 +00:00
Sam Parker 45b5950f38 [ARM] and + load combine tests
Add a few more tests cases.

llvm-svn: 319548
2017-12-01 15:31:41 +00:00
Nirav Dave eb2b24fded [ARM][DAG] Reenable post-legalize store merge
Summary: Reenable post-legalize stores with constant merging computation and cofrresponding test case.

Reviewers: eastig, efriedma

Subscribers: aemerson, javed.absar, kristof.beyls, hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D40701

llvm-svn: 319547
2017-12-01 14:49:26 +00:00
Jatin Bhateja 328199ec26 [X86] Improvement in CodeGen instruction selection for LEAs.
Summary:
1/  Operand folding during complex pattern matching for LEAs has been extended, such that it promotes Scale to
     accommodate similar operand appearing in the DAG  e.g.
                 T1 = A + B
                 T2 = T1 + 10
                 T3 = T2 + A
    For above DAG rooted at T3, X86AddressMode will now look like
                Base = B , Index = A , Scale = 2 , Disp = 10

2/  During OptimizeLEAPass down the pipeline factorization is now performed over LEAs so that if there is an opportunity
     then complex LEAs (having 3 operands) could be factored out  e.g.
                 leal 1(%rax,%rcx,1), %rdx
                 leal 1(%rax,%rcx,2), %rcx
     will be factored as following
                 leal 1(%rax,%rcx,1), %rdx
                 leal (%rdx,%rcx)   , %edx

3/ Aggressive operand folding for AM based selection for LEAs is sensitive to loops, thus avoiding creation of any complex LEAs within a loop.

4/ Simplify LEA converts (lea (BASE,1,INDEX,0)  --> add (BASE, INDEX) which offers better through put.

PR32755 will be taken care of by this pathc.

Previous patch revisions : r313343 , r314886

Reviewers: lsaba, RKSimon, craig.topper, qcolombet, jmolloy, jbhateja

Reviewed By: lsaba, RKSimon, jbhateja

Subscribers: jmolloy, spatel, igorb, llvm-commits

Differential Revision: https://reviews.llvm.org/D35014

llvm-svn: 319543
2017-12-01 14:07:38 +00:00
Sam Parker 412a991b10 [ARM] and + load combine tests
Adding autogenerated tests for narrow load combines.

Differential Revision: https://reviews.llvm.org/D40709

llvm-svn: 319542
2017-12-01 13:42:39 +00:00
Simon Pilgrim 2dc4ff1cde [X86][AVX512] Tag vshift/vpermv/pshufd/pshufb instructions scheduler classes
llvm-svn: 319540
2017-12-01 13:25:54 +00:00
Mikael Holmen 9c13c8b6ec Revert r319537: Bail out of a SimplifyCFG switch table opt at undef values.
Broke build bots so reverting.

llvm-svn: 319539
2017-12-01 13:11:39 +00:00
Florian Hahn 30932a3c16 [InstSimplify] More fcmp cases when comparing against negative constants.
Summary:
For known positive non-zero value X:
    fcmp uge X, -C => true
    fcmp ugt X, -C => true
    fcmp une X, -C => true
    fcmp oeq X, -C => false
    fcmp ole X, -C => false
    fcmp olt X, -C => false


Patch by Paul Walker.

Reviewers: majnemer, t.p.northover, spatel, RKSimon

Reviewed By: spatel

Subscribers: fhahn, llvm-commits

Differential Revision: https://reviews.llvm.org/D40012

llvm-svn: 319538
2017-12-01 12:34:16 +00:00
Mikael Holmen 9f047795fb Bail out of a SimplifyCFG switch table opt at undef values.
Summary:
A true or false result is expected from a comparison, but it seems the possibility of undef was overlooked, which could lead to a failed assert. This is fixed by this patch by bailing out if we encounter undef.

The bug is old and the assert has been there since the end of 2014, so it seems this is unusual enough to forego optimization.

Patch by: JesperAntonsson

Reviewers: spatel, eeckstein, hans

Reviewed By: hans

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D40639

llvm-svn: 319537
2017-12-01 12:30:49 +00:00
Nemanja Ivanovic 4364513cb2 Follow-up to r319434 to turn the pass on by default
Now that the patch has gone through the buildbot cycle,
turn it on by default.

llvm-svn: 319535
2017-12-01 12:02:59 +00:00
Alexander Timofeev c1425c9d6b [AMDGPU] SiFixSGPRCopies should not modify non-divergent PHI
Differential revision: https://reviews.llvm.org/D40556

llvm-svn: 319534
2017-12-01 11:56:34 +00:00
Dinar Temirbulatov 29e86584c6 [SLPVectorizer] Failure to beneficially vectorize 'copyable' elements in integer binary ops.
Patch tries to improve vectorization of the following code:
    
            void add1(int * __restrict dst, const int * __restrict src) {
              *dst++ = *src++;
              *dst++ = *src++ + 1;
              *dst++ = *src++ + 2;
              *dst++ = *src++ + 3;
            }
            Allows to vectorize even if the very first operation is not a binary add, but just a load.
    
            Fixed issues related to previous commit.
    
            Reviewers: spatel, mzolotukhin, mkuper, hfinkel, RKSimon, filcab, ABataev
    
            Reviewed By: ABataev, RKSimon
    
            Subscribers: llvm-commits, RKSimon
    
            Differential Revision: https://reviews.llvm.org/D28907

llvm-svn: 319531
2017-12-01 11:10:47 +00:00
Volkan Keles a32ff00b00 GlobalISel: Enable the legalization of G_MERGE_VALUES and G_UNMERGE_VALUES
Summary: LegalizerInfo assumes all G_MERGE_VALUES and G_UNMERGE_VALUES instructions are legal, so it is not possible to legalize vector operations on illegal vector types. This patch fixes the problem by removing the related check and adding default actions for G_MERGE_VALUES and G_UNMERGE_VALUES.

Reviewers: qcolombet, ab, dsanders, aditya_nandakumar, t.p.northover, kristof.beyls

Reviewed By: dsanders

Subscribers: rovka, javed.absar, igorb, llvm-commits

Differential Revision: https://reviews.llvm.org/D39823

llvm-svn: 319524
2017-12-01 08:19:10 +00:00
Hiroshi Inoue 48e4c7aae6 Recommit rL319407: [SROA] enable splitting for non-whole-alloca loads and stores
Recommiting once reverted patch rL319407 after adding a check for bit vector size to avoid failures in some build bots.

llvm-svn: 319522
2017-12-01 06:05:05 +00:00
Craig Topper f8470a6399 [X86] Custom legalize v2i32 gathers via widening rather than promoting.
The default legalization for v2i32 is promotion to v2i64. This results in a gather that reads 64-bit elements rather than 32. If one of the elements is near a page boundary this can cause an illegal access that can fault.

We also miscalculate the scale for the gather which is an even worse problem, but we probably could have found a separate way to fix that.

llvm-svn: 319521
2017-12-01 06:02:02 +00:00
Craig Topper c261213abc [X86][SelectionDAG] Make sure we explicitly sign extend the index when type promoting the index of scatter and gather.
Type promotion makes no guarantee about the contents of the promoted bits. Since the gather/scatter instruction will use the bits to calculate addresses, we need to ensure they aren't garbage.

llvm-svn: 319520
2017-12-01 06:02:00 +00:00
Craig Topper 81201cee4a [X86] Add another v2i32 gather test case with v2i64 index that wasn't sign extended.
llvm-svn: 319519
2017-12-01 06:01:59 +00:00
Craig Topper 11f733df9b [X86] Add a DAG combine to simplify masks for AVX2 gather instructions.
AVX2 gathers only use the upper bit of the mask allowing us to simplify sign_extend_inreg to a shift left.

llvm-svn: 319514
2017-12-01 02:49:07 +00:00
Sam Clegg 223e2d4f04 [WebAssembly] Update MC tests now that hidden attr is supported
Summary:
Support was added in rL319488 but these tests were not
updated.

Subscribers: jfb, dschuff, jgravelle-google, aheejin, sunfish

Differential Revision: https://reviews.llvm.org/D40693

llvm-svn: 319510
2017-12-01 01:18:47 +00:00
Jake Ehrlich 1a468481c0 Add flag to ArchiveWriter to test GNU64 format more efficiently
Even with the sparse file optimizations the SYM64 test can still be painfully
slow. This unnecessarily slows down devs. It's critical that we test that the
switch to the SYM64 format occurs at 4GB but there isn't any better of a way to
fake the size of the file than sparse files. This change introduces a flag that
allows the cutoff to be arbitrarily set to whatever power of two is desired.
The flag is hidden as it really isn't meant to be used outside this one test.
This is unfortunate but appears necessary, at least until the average hard
drive is much faster.

The changes to the test require some explanation. Prior to this change we knew
that the SYM64 format was being used because the file was simply too large to
have validly handled this case if the SYM64 format were not used. To ensure
that the SYM64 format is still being used I am grepping the file for "SYM64".
Without changing the filename however this would be pointless because "SYM64"
would occur in the file either way. So the filename of the test is also changed
in order to avoid this issue.

Differential Revision: https://reviews.llvm.org/D40632

llvm-svn: 319507
2017-12-01 00:54:28 +00:00
Matt Arsenault 686d5c728f AMDGPU: Use carry-less adds in FI elimination
llvm-svn: 319501
2017-11-30 23:42:30 +00:00
Peter Collingbourne 1f03422610 ThinLTOBitcodeWriter: Try harder to discard unused references to the merged module.
If the thin module has no references to an internal global in the
merged module, we need to make sure to preserve that property if the
global is a member of a comdat group, as otherwise promotion can end
up adding global symbols to the comdat, which is not allowed.

This situation can arise if the external global in the thin module
has dead constant users, which would cause use_empty() to return
false and would cause us to try to promote it. To prevent this from
happening, discard the dead constant users before asking whether a
global is empty.

Differential Revision: https://reviews.llvm.org/D40593

llvm-svn: 319494
2017-11-30 23:05:52 +00:00
Matt Arsenault 84445dd13c AMDGPU: Use gfx9 carry-less add/sub instructions
llvm-svn: 319491
2017-11-30 22:51:26 +00:00
Reid Kleckner ba4014e9dc XOR the frame pointer with the stack cookie when protecting the stack
Summary: This strengthens the guard and matches MSVC.

Reviewers: hans, etienneb

Subscribers: hiraditya, JDevlieghere, vlad.tsyrklevich, llvm-commits

Differential Revision: https://reviews.llvm.org/D40622

llvm-svn: 319490
2017-11-30 22:41:21 +00:00
Sam Clegg 9138b7b005 Add visibility flag to Wasm symbol flags
The LLVM "hidden" flag needs to be passed through the Wasm
intermediate objects in order for the linker to apply
it to the final Wasm object.

The corresponding change in LLD is here: https://github.com/WebAssembly/lld/pull/14

Patch by Nicholas Wilson

Differential Revision: https://reviews.llvm.org/D40442

llvm-svn: 319488
2017-11-30 22:34:58 +00:00
Dan Gohman 59e4c0b938 [memcpyopt] Teach memcpyopt to optimize across basic blocks
This teaches memcpyopt to make a non-local memdep query when a local query
indicates that the dependency is non-local. This notably allows it to
eliminate many more llvm.memcpy calls in common Rust code, often by 20-30%.

Fixes PR28958.

Differential Revision: https://reviews.llvm.org/D38374

llvm-svn: 319482
2017-11-30 22:10:53 +00:00
Krzysztof Parzyszek 2dddd6004f [Hexagon] Fix wrong check in test/CodeGen/Hexagon/newvaluejump-solo.mir
llvm-svn: 319476
2017-11-30 21:23:19 +00:00
Krzysztof Parzyszek 8c67461859 [Hexagon] Fix wrong pass in testcase
llvm-svn: 319471
2017-11-30 20:39:15 +00:00
Krzysztof Parzyszek 44555225a6 [Hexagon] Solo instructions cannot be used with new value jumps
llvm-svn: 319470
2017-11-30 20:32:54 +00:00
Yaxun Liu 1db0f718b5 [AMDGPU] Convert test/tools/llvm-objdump/AMDGPU/source-lines.ll to amdgiz
Differential Revision: https://reviews.llvm.org/D40653

llvm-svn: 319469
2017-11-30 20:27:56 +00:00
Craig Topper d4257565cf [X86] Promote i8 CTPOP to i32 instead of i16 when we have the POPCNT instruction.
The 32-bit version is shorter to encode and the zext we emit for the promotion is likely going to be a 32-bit zero extend anyway.

llvm-svn: 319468
2017-11-30 20:15:31 +00:00
Jake Ehrlich ef3b80c57b [llvm-objcopy] Add support for --only-keep/-j and --keep
This change adds support for the --only-keep option and the -j alias as well.
A common use case for these being used together is to dump a specific section's
data. Additionally the --keep option is added (GNU objcopy doesn't have this)
to avoid removing a bunch of things. This allows people to err on the side of
stripping aggressively and then to keep the specific bits that they need for
their application.

Differential Revision: https://reviews.llvm.org/D39021

llvm-svn: 319467
2017-11-30 20:14:53 +00:00
Daniel Sanders aef1dfc690 [aarch64][globalisel] Legalize G_ATOMIC_CMPXCHG_WITH_SUCCESS and G_ATOMICRMW_*
G_ATOMICRMW_* is generally legal on AArch64. The exception is G_ATOMICRMW_NAND.

G_ATOMIC_CMPXCHG_WITH_SUCCESS needs to be lowered to G_ATOMIC_CMPXCHG with an
external comparison.

Note that IRTranslator doesn't generate these instructions yet.

llvm-svn: 319466
2017-11-30 20:11:42 +00:00
Amara Emerson d78d65c2a4 [GlobalISel][IRTranslator] Fix crash during translation of zero sized loads/stores/args/returns.
This fixes PR35358.

rdar://35619533

Differential Revision: https://reviews.llvm.org/D40604

llvm-svn: 319465
2017-11-30 20:06:02 +00:00
Xinliang David Li c23d2c6883 [PGO] Skip counter promotion for infinite loops
Differential Revision: http://reviews.llvm.org/D40662

llvm-svn: 319462
2017-11-30 19:16:25 +00:00
Daniel Sanders f499b2bf1f [globalisel][tablegen] Add support for specific immediates in the match pattern
This enables a few rules such as ARM's uxtb instruction.

llvm-svn: 319457
2017-11-30 18:48:35 +00:00
Dan Gohman 78c19d60a9 [WebAssembly] Revert r319186 "Support bitcasted function addresses with varargs."
The patch broke Emscripten's EM_ASM macros, which utiltize unprototyped
functions.

See https://bugs.llvm.org/show_bug.cgi?id=35385 for details.

llvm-svn: 319452
2017-11-30 18:16:49 +00:00
Francis Visoiu Mistrih c7832045d5 [MIR] Fix DebugInfo tests after r319445
llvm-svn: 319447
2017-11-30 16:48:53 +00:00
Francis Visoiu Mistrih c71cced0aa [CodeGen] Always use `printReg` to print registers in both MIR and debug
output

As part of the unification of the debug format and the MIR format,
always use `printReg` to print all kinds of registers.

Updated the tests using '_' instead of '%noreg' until we decide which
one we want to be the default one.

Differential Revision: https://reviews.llvm.org/D40421

llvm-svn: 319445
2017-11-30 16:12:24 +00:00
Alexey Bataev d60250dd9b [InstCombine] Additional test for PR35354, NFC.
llvm-svn: 319436
2017-11-30 14:33:58 +00:00
Nemanja Ivanovic db7e77047c [PowerPC] Recommit r314244 with refactoring and off by default
This re-commits everything that was pulled in r314244. The transformation
is off by default (patch to enable it to follow). The code is refactored
to have a single entry-point and provide fine-grained control over patterns
that it selects. This patch also fixes the bugs in the original code.

Everything that failed with the original patch has been re-tested with this
patch (with the transformation turned on). So the patch to turn this on is
soon to follow.

Differential Revision: https://reviews.llvm.org/D38575

llvm-svn: 319434
2017-11-30 13:39:10 +00:00
Simon Pilgrim bb791b3dbd [X86][AVX512] Tag fcmp/ptest/ternlog instructions scheduler classes
llvm-svn: 319433
2017-11-30 13:18:06 +00:00
Simon Pilgrim 1c7556fb29 [X86][AVX512] Regenerate avx512 schedule tests
llvm-svn: 319432
2017-11-30 13:09:21 +00:00
Sean Eveson a6bcd53d52 [MC] Function stack size section.
Re applying after fixing issues in the diff, sorry for any painful conflicts/merges!

Original RFC: http://lists.llvm.org/pipermail/llvm-dev/2017-August/117028.html

This change adds a '.stack-size' section containing metadata on function stack sizes to output ELF files behind the new -stack-size-section flag. The section contains pairs of function symbol references (8 byte) and stack sizes (unsigned LEB128).

The contents of this section can be used to measure changes to stack sizes between different versions of the compiler or a source base. The advantage of having a section is that we can extract this information when examining binaries that we didn't build, and it allows users and tools easy access to that information just by referencing the binary.

There is a follow up change to add an option to clang.

Thanks.

Reviewers: hfinkel, MatzeB

Reviewed By: MatzeB

Subscribers: thegameg, asb, llvm-commits

Differential Revision: https://reviews.llvm.org/D39788

llvm-svn: 319430
2017-11-30 13:05:14 +00:00
Sean Eveson 661e4fbf83 Revert r319423: [MC] Function stack size section.
I messed up the diff.

llvm-svn: 319429
2017-11-30 12:43:25 +00:00
Diana Picus f003d9ff95 [ARM GlobalISel] Bail out for byval
Fallback if we have a byval parameter or argument since we don't support
them yet.

llvm-svn: 319428
2017-11-30 12:23:44 +00:00
Francis Visoiu Mistrih 93ef145862 [CodeGen] Print "%vreg0" as "%0" in both MIR and debug output
As part of the unification of the debug format and the MIR format, avoid
printing "vreg" for virtual registers (which is one of the current MIR
possibilities).

Basically:

* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" \) -type f -print0 | xargs -0 sed -i '' -E "s/%vreg([0-9]+)/%\1/g"
* grep -nr '%vreg' . and fix if needed
* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" \) -type f -print0 | xargs -0 sed -i '' -E "s/ vreg([0-9]+)/ %\1/g"
* grep -nr 'vreg[0-9]\+' . and fix if needed

Differential Revision: https://reviews.llvm.org/D40420

llvm-svn: 319427
2017-11-30 12:12:19 +00:00
Sean Eveson f77b4d2f38 [MC] Function stack size section.
Summary:
Original RFC: http://lists.llvm.org/pipermail/llvm-dev/2017-August/117028.html

I wasn't sure who to put as reviewers, so please add/remove people as appropriate.

This change adds a '.stack-size' section containing metadata on function stack sizes to output ELF files behind the new -stack-size-section flag. The section contains pairs of function symbol references (8 byte) and stack sizes (unsigned LEB128).

The contents of this section can be used to measure changes to stack sizes between different versions of the compiler or a source base. The advantage of having a section is that we can extract this information when examining binaries that we didn't build, and it allows users and tools easy access to that information just by referencing the binary.

There is a follow up change to add an option to clang.

Thanks.

Reviewers: hfinkel, MatzeB

Reviewed By: MatzeB

Subscribers: thegameg, asb, llvm-commits

Differential Revision: https://reviews.llvm.org/D39788

llvm-svn: 319423
2017-11-30 12:01:16 +00:00
Serge Guelton 24386867b8 Support generic lowering of vector bswap
llvm-svn: 319419
2017-11-30 11:06:22 +00:00
Simon Pilgrim 3e5987cf8d [X86][AVX512] Tag RCP/RSQRT/GETEXP instructions scheduler classes
llvm-svn: 319418
2017-11-30 10:48:47 +00:00
Jonas Devlieghere c635376d7c [dsymutil] Upstream getBundleInfo implementation
This patch implements `getBundleInfo`, which uses CoreFoundation to
obtain information about the CFBundle. This information is needed to
populate the Plist in the dSYM bundle.

This change only applies to darwin and is an NFC as far as other
platforms are concerned.

Differential revision: https://reviews.llvm.org/D40244

llvm-svn: 319416
2017-11-30 10:25:28 +00:00
Hiroshi Inoue 21e8ded4d2 Revert rL319407: [SROA] enable splitting for non-whole-alloca loads and stores
This reverts commit rL319407 due to failures in some buildbot.

llvm-svn: 319410
2017-11-30 08:29:51 +00:00
Jonas Paulsson b9a2467501 [SystemZ] Bugfix in adjustSubwordCmp.
Csmith generated a program where a store after load to the same address did
not get chained after the new load created during DAG legalizing, and so
performed an illegal overwrite of the expected value.

When the new zero-extending load is created, the chain users of the original
load must be updated, which was not done previously.

A similar case was also found and handled in lowerBITCAST.

Review: Ulrich Weigand
https://reviews.llvm.org/D40542

llvm-svn: 319409
2017-11-30 08:18:50 +00:00
Hiroshi Inoue 422e80aee2 [SROA] enable splitting for non-whole-alloca loads and stores
Currently, SROA splits loads and stores only when they are accessing the whole alloca.
This patch relaxes this limitation to allow splitting a load/store if all other loads and stores to the alloca are disjoint to or fully included in the current load/store. If there is no other load or store that crosses the boundary of the current load/store, the current splitting implementation works as is.
The whole-alloca loads and stores meet this new condition and so they are still splittable.

Here is a simplified motivating example.

struct record {
    long long a;
    int b;
    int c;
};

int func(struct record r) {
    for (int i = 0; i < r.c; i++)
        r.b++;
    return r.b;
}

When updating r.b (or r.c as well), LLVM generates redundant instructions on some platforms (such as x86_64, ppc64); here, r.b and r.c are packed into one 64-bit GPR when the struct is passed as a method argument.

With this patch, the above example is compiled into only few instructions without loop.
Without the patch, unnecessary loop-carried dependency is introduced by SROA and the loop cannot be eliminated by the later optimizers.

Differential Revision: https://reviews.llvm.org/D32998

llvm-svn: 319407
2017-11-30 07:44:46 +00:00
Craig Topper a495744d2c [X86] Optimize avx2 vgatherqps for v2f32 with v2i64 index type.
Normal type legalization will widen everything. This requires forcing 0s into the mask register. We can instead choose the form that only reads 2 elements without zeroing the mask.

llvm-svn: 319406
2017-11-30 07:01:40 +00:00
Craig Topper 321a8b9b63 [X86] Make sure we don't remove sign extends of masks with AVX2 masked gathers.
We don't use k-registers and instead use the MSB so we need to make sure we sign extend the mask to the msb.

llvm-svn: 319405
2017-11-30 06:31:31 +00:00
Graham Yiu 70293fa27a - Removed unused lamba (IsReturnBlock) causing build bots to fail for r319398
- Added lit testcases that were supposed to be part of r319398

llvm-svn: 319399
2017-11-30 03:36:57 +00:00
Matt Arsenault caf0ed4d74 AMDGPU: Allow negative MUBUF vaddr for gfx9
GFX9 does not enable bounds checking for the resource descriptors
used for private access, so it should be OK to use vaddr with
a potentially negative value.

llvm-svn: 319393
2017-11-30 00:52:40 +00:00
Rafael Espindola a4e418f713 Check alignment in getSectionContentsAsArray.
While the ArrayRef can technically have unaligned data, it would be
extremely surprising if iterating over it caused undefined behavior
when a reference to the underlying type was bound.

llvm-svn: 319392
2017-11-30 00:44:22 +00:00
Vedant Kumar 80fbb85555 [Coverage] Use the most-recent completed region count (PR35437)
This is a fix for the coverage segment builder.

If multiple regions must be popped off the active stack at once, and
more than one of them end at the same location, emit a segment using the
count from the most-recent completed region.

Fixes PR35437, rdar://35760630

Testing: invoked llvm-cov on a stage2 build of clang, additional unit
tests, check-profile

llvm-svn: 319391
2017-11-30 00:28:23 +00:00
Joerg Sonnenberger 4b1acff9b3 First step towards more human-friendly PPC assembler output:
- add -ppc-reg-with-percent-prefix option to use %r3 etc as register
  names
- split off logic for Darwinish verbose conditional codes into a helper
  function
- be explicit about Darwin vs AIX vs GNUish assembler flavors

Based on the patch from Alexandre Yukio Yamashita

Differential Revision: https://reviews.llvm.org/D39016

llvm-svn: 319381
2017-11-29 23:05:56 +00:00