Commit Graph

56107 Commits

Author SHA1 Message Date
Diogo N. Sampaio 01b916e188 [ARM] Tighten f64<->f16 conversion requirements
Fix missing Requires fields.

Patch by Bernard Ogden (bogden)

Reviewers: SjoerdMeijer, javed.absar, t.p.northover	

Reviewed By: t.p.northover

Differential Revision: https://reviews.llvm.org/D51631

llvm-svn: 342061
2018-09-12 16:24:43 +00:00
Craig Topper 2262613532 [X86] Remove isel patterns for ADCX instruction
There's no advantage to this instruction unless you need to avoid touching other flag bits. Its encoding is longer, it can't fold an immediate, and it doesn't write all the flags.

I don't think gcc will generate this instruction either.

Fixes PR38852.

Differential Revision: https://reviews.llvm.org/D51754

llvm-svn: 342059
2018-09-12 15:47:34 +00:00
Wolfgang Pieb 233bc73047 Reverting r342048, which caused UBSan failures in dsymutil.
llvm-svn: 342056
2018-09-12 14:40:04 +00:00
Roman Lebedev 99359f391e [NFC][InstCombine] PR38708 - inefficient pattern for high-bits checking.
The simplest pattern for now:
https://rise4fun.com/Alive/LYjY
https://godbolt.org/z/o4RB8D

https://bugs.llvm.org/show_bug.cgi?id=38708

llvm-svn: 342054
2018-09-12 14:11:37 +00:00
Sander de Smalen 2d77e788f2 [AArch64] Implement aarch64_vector_pcs codegen support.
This patch adds codegen support for the saving/restoring
V8-V23 for functions specified with the aarch64_vector_pcs
calling convention attribute, as added in patch D51477.

Reviewers: t.p.northover, gberry, thegameg, rengolin, javed.absar, MatzeB

Reviewed By: thegameg

Differential Revision: https://reviews.llvm.org/D51479

llvm-svn: 342049
2018-09-12 12:10:22 +00:00
Wolfgang Pieb 3a8781cf6c [DWARF] Refactoring range list dumping to fold DWARF v4 functionality into v5 handling
Eliminating some duplication of rangelist dumping code at the expense of
some version-dependent code in dump and extract routines.

Reviewer: dblaikie, JDevlieghere, vleschuk

Differential revision: https://reviews.llvm.org/D51081

llvm-svn: 342048
2018-09-12 12:01:19 +00:00
David Green e27e87cdcb [CGP] Ensure splitgep gives deterministic output
The output of splitLargeGEPOffsets does not appear to be deterministic because
of the way that we iterate over a DenseMap. I've changed it to a MapVector for
consistent output.

The test here isn't particularly great, only showing a cosmetic difference in
output. The original reproducer is much larger but shows a difference in
instruction ordering, leading to different codegen.

Differential Revision: https://reviews.llvm.org/D51851

llvm-svn: 342043
2018-09-12 10:19:10 +00:00
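
As a rough illustration of the r342043 change above, the sketch below contrasts a pointer-hashed iteration order with an insertion-ordered map. It is only a toy model: `MapVector` here is a Python stand-in for the LLVM container of the same name, and `pointer_hashed_order`, its `buckets` parameter, and the sample addresses are hypothetical and do not reflect how DenseMap actually hashes.

```python
class MapVector:
    """Toy stand-in for llvm::MapVector: a key->index map plus a vector of pairs."""

    def __init__(self):
        self._index = {}   # key -> position in self._items
        self._items = []   # (key, value) pairs in insertion order

    def insert(self, key, value):
        """Insert key if absent; return True when a new entry was added."""
        if key in self._index:
            return False
        self._index[key] = len(self._items)
        self._items.append((key, value))
        return True

    def __iter__(self):
        # Iteration follows insertion order, never hash values.
        return iter(self._items)


def pointer_hashed_order(keys, addresses, buckets=7):
    """Hypothetical order a pointer-keyed hash table might yield.

    The bucket is derived from each key's address, so the order can change
    whenever the allocator hands out different addresses.
    """
    return sorted(keys, key=lambda k: addresses[k] % buckets)
```

With two different sets of made-up addresses, `pointer_hashed_order` can return the same keys in different orders, while iterating the `MapVector` always yields them in the order they were inserted.
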
David Green 2352b30c96 [SimplifyCFG] Put an alignment on generated switch tables
Previously the alignment on the newly created switch table data was not set,
meaning that DataLayout::getPreferredAlignment was free to overalign it to 16
bytes. This causes unnecessary code bloat.

Differential Revision: https://reviews.llvm.org/D51800

llvm-svn: 342039
2018-09-12 09:54:17 +00:00
Sam Parker a023c7a9cb [ARM] Exchange MAC operands in ARMParallelDSP
SMLAD and SMLALD instructions also come in the form of SMLADX and
SMLALDX which perform an exchange on their second operand. To support
this, more of the loads in the MAC candidates are compared for
sequential access and a boolean value has been added to BinOpChain.

AddMACCandidate has been refactored into a small pattern matching
state machine to reduce the amount of duplicated code, but also to
enable the matching to be more flexible. CreateParallelMACPairs now
iterates through all the candidates to find parallel ones.

Differential Revision: https://reviews.llvm.org/D51424

llvm-svn: 342033
2018-09-12 09:17:44 +00:00
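
To make the exchange in r342033 concrete, here is a minimal model of the 32-bit dual multiply-accumulate, assuming the usual ARM semantics (two signed 16x16 products added to an accumulator, with the X form swapping the halfwords of the second operand). The helper names `pack`, `_s16`, `_halves` and `smlad` are illustrative only and are not taken from the patch.

```python
def pack(lo, hi):
    """Pack two signed halfwords into one 32-bit word (lo in bits 15:0)."""
    return ((hi & 0xFFFF) << 16) | (lo & 0xFFFF)


def _s16(v):
    """Interpret the low 16 bits of v as a signed halfword."""
    v &= 0xFFFF
    return v - 0x10000 if v & 0x8000 else v


def _halves(word):
    """Return the (bottom, top) signed halfwords of a 32-bit word."""
    return _s16(word), _s16(word >> 16)


def smlad(rn, rm, acc, exchange=False):
    """Model SMLAD (exchange=False) or SMLADX (exchange=True)."""
    n_lo, n_hi = _halves(rn)
    m_lo, m_hi = _halves(rm)
    if exchange:
        # The X variant swaps the halfwords of the second operand.
        m_lo, m_hi = m_hi, m_lo
    return (acc + n_lo * m_lo + n_hi * m_hi) & 0xFFFFFFFF
```

For example, `smlad(pack(2, 3), pack(5, 7), 100)` computes `100 + 2*5 + 3*7`, while passing `exchange=True` computes `100 + 2*7 + 3*5`, which is the shape a pass would match when the second operand's loads are in swapped order.
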
Sam Parker 569b24549e [ARM] Allow bitcasts in ARMCodeGenPrepare
Allow bitcasts in the use-def chains, treating them as sources.

Differential Revision: https://reviews.llvm.org/D50758

llvm-svn: 342032
2018-09-12 09:11:48 +00:00
Sander de Smalen 4dbc512676 [AArch64] Add parsing of aarch64_vector_pcs attribute.
This patch adds parsing support for the 'aarch64_vector_pcs'
calling convention attribute to calls and function declarations.

More information describing the vector ABI and procedure call standard
can be found here:

  https://developer.arm.com/products/software-development-tools/\
                            hpc/arm-compiler-for-hpc/vector-function-abi

Reviewers: t.p.northover, rnk, rengolin, javed.absar, thegameg, SjoerdMeijer

Reviewed By: SjoerdMeijer

Differential Revision: https://reviews.llvm.org/D51477

llvm-svn: 342030
2018-09-12 08:54:06 +00:00
Florian Hahn 1086ce2397 [LV] Move InterleaveGroup and InterleavedAccessInfo to VectorUtils.h (NFC)
Move the 2 classes out of LoopVectorize.cpp to make it easier to re-use
them for VPlan outside LoopVectorize.cpp

Reviewers: Ayal, mssimpso, rengolin, dcaballe, mkuper, hsaito, hfinkel, xbolva00

Reviewed By: rengolin, xbolva00

Differential Revision: https://reviews.llvm.org/D49488

llvm-svn: 342027
2018-09-12 08:01:57 +00:00
Ilya Biryukov 95066496d0 Revert "AMDGPU: Move isa version and EF_AMDGPU_MACH_* determination into TargetParser."
This reverts commit r341982.

The change introduced a layering violation. Reverting to unbreak
our integrate.

llvm-svn: 342023
2018-09-12 07:05:30 +00:00
Craig Topper dc32e91bc6 [X86] Teach X86SelectionDAGInfo::EmitTargetCodeForMemcpy about GNUX32
Summary:
In GNUX32, is64BitMode returns true, but pointers are 32 bits. So we shouldn't copy pointer values into RSI/RDI since the widths don't match.

Fixes PR38865 despite what the title says. I think the llvm_unreachable in the copyPhysReg code tricked the optimizer and made the fatal error trigger.

Reviewers: rnk, efriedma, MatzeB, echristo

Reviewed By: efriedma

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D51893

llvm-svn: 342015
2018-09-12 01:57:22 +00:00
Jessica Paquette 2386eab360 [MachineOutliner] Add codegen size remarks to the MachineOutliner
Since the outliner is a module pass, it doesn't get codegen size remarks like
the other codegen passes do. This adds size remarks *to* the outliner.

This is kind of a workaround, so it's peppered with FIXMEs; size remarks
really ought to not ever be handled by the pass itself. However, since the
outliner is the only "MachineModulePass", this works for now. Since the
entire purpose of the MachineOutliner is to produce code size savings, it
really ought to be included in codegen size remarks.

If we ever go ahead and make a MachineModulePass (say, something similar to
MachineFunctionPass), then all of this ought to be moved there.

llvm-svn: 342009
2018-09-11 23:05:34 +00:00
Sanjay Patel 1cf0734b2f [InstCombine] add folds for unsigned-overflow compares
Name: op_ugt_sum
  %a = add i8 %x, %y
  %r = icmp ugt i8 %x, %a
  =>
  %notx = xor i8 %x, -1
  %r = icmp ugt i8 %y, %notx

Name: sum_ult_op
  %a = add i8 %x, %y
  %r = icmp ult i8 %a, %x
  =>
  %notx = xor i8 %x, -1
  %r = icmp ugt i8 %y, %notx

https://rise4fun.com/Alive/ZRxI

AFAICT, this doesn't interfere with any add-saturation patterns
because those have >1 use for the 'add'. But this should be
better for IR analysis and codegen in the basic cases.

This is another fold inspired by PR14613:
https://bugs.llvm.org/show_bug.cgi?id=14613

llvm-svn: 342004
2018-09-11 22:40:20 +00:00
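
The two r342004 folds above can be checked exhaustively for i8 outside of Alive. The sketch below models 8-bit unsigned arithmetic with plain Python integers; the function names mirror the fold names in the commit message, while `folded` and `folds_hold` are made up for this example.

```python
MASK = 0xFF  # model i8 values as integers in [0, 255]


def op_ugt_sum(x, y):
    """%a = add i8 %x, %y ; %r = icmp ugt i8 %x, %a"""
    a = (x + y) & MASK
    return x > a


def sum_ult_op(x, y):
    """%a = add i8 %x, %y ; %r = icmp ult i8 %a, %x"""
    a = (x + y) & MASK
    return a < x


def folded(x, y):
    """%notx = xor i8 %x, -1 ; %r = icmp ugt i8 %y, %notx"""
    notx = x ^ MASK
    return y > notx


def folds_hold():
    """Compare both source patterns with the folded form on every i8 pair."""
    return all(
        op_ugt_sum(x, y) == folded(x, y) == sum_ult_op(x, y)
        for x in range(256)
        for y in range(256)
    )
```

Both source patterns are true exactly when the add wraps, and `y > ~x` expresses the same condition without computing the sum.
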
Alexandros Lamprineas fe0512d575 Revert "[GVNHoist] Re-enable GVNHoist by default"
This reverts rL341954.

The builder `sanitizer-x86_64-linux-bootstrap-ubsan` has been
failing with timeouts at stage2 clang/ubsan:

[3065/3073] Linking CXX executable bin/lld
command timed out: 1200 seconds without output running python
../sanitizer_buildbot/sanitizers/buildbot_selector.py,
attempting to kill

llvm-svn: 342001
2018-09-11 22:10:57 +00:00
Reid Kleckner a6f64265ea [codeview] Decode and dump FP regs from S_FRAMEPROC records
Summary:
There are two registers encoded in the S_FRAMEPROC flags: one for locals
and one for parameters. The encoding is described by the
ExpandEncodedBasePointerReg function in cvinfo.h. Two bits are used to
indicate one of four possible values:

  0: no register - Used when there are no variables.
  1: SP / standard - Variables are stored relative to the standard SP
     for the ISA.
  2: FP - Variables are addressed relative to the ISA frame
     pointer, i.e. EBP on x86. If realignment is required, parameters
     use this. If a dynamic alloca is used, locals will be EBP relative.
  3: Alternative - Variables are stored relative to some alternative
     third callee-saved register. This is required to address highly
     aligned locals when there are dynamic stack adjustments. In this
     case, both the incoming SP saved in the standard FP and the current
     SP are at some dynamic offset from the locals. LLVM uses ESI in
     this case, MSVC uses EBX.

Most of the changes in this patch are to pass around the CPU so that we
can decode these into real, named architectural registers.

Subscribers: hiraditya

Differential Revision: https://reviews.llvm.org/D51894

llvm-svn: 341999
2018-09-11 22:00:50 +00:00
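
The two-bit encoding described in r341999 above can be sketched as a small decoder. The four value meanings come from the commit message; the bit positions (`local_shift=14`, `param_shift=16`) are an assumption about the flags layout and should be checked against cvinfo.h, and the function name is hypothetical.

```python
# Meaning of each two-bit value, as listed in the commit message.
BASE_POINTER_KINDS = {
    0: "none",         # no variables
    1: "SP",           # relative to the standard stack pointer
    2: "FP",           # relative to the ISA frame pointer
    3: "alternative",  # relative to a third callee-saved register
}


def decode_base_pointers(flags, local_shift=14, param_shift=16):
    """Return (local_kind, param_kind) decoded from S_FRAMEPROC flags.

    The default shifts are assumptions, not taken from the commit text.
    """
    local = (flags >> local_shift) & 0x3
    param = (flags >> param_shift) & 0x3
    return BASE_POINTER_KINDS[local], BASE_POINTER_KINDS[param]
```

Mapping a kind to a real register name then depends on the CPU, which is why the patch threads the CPU through the dumping code.
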
Sanjay Patel 26725bdc50 [InstCombine] add folds for icmp with xor mask constant
These are the folds in Alive:
Name: xor_ult
Pre: isPowerOf2(-C1)
%xor = xor i8 %x, C1
%r = icmp ult i8 %xor, C1
=>
%r = icmp ugt i8 %x, ~C1

Name: xor_ugt
Pre: isPowerOf2(C1+1)
%xor = xor i8 %x, C1
%r = icmp ugt i8 %xor, C1
=>
%r = icmp ugt i8 %x, C1

https://rise4fun.com/Alive/Vty

The ugt case in its simplest form was already handled by DemandedBits,
but that's not ideal as shown in the multi-use test.

I'm not sure if these are all of the symmetrical folds, but I adjusted 
the existing code for one of the folds to try to show the similarities.

There's no obvious connection, but this is another preliminary step 
for PR14613...
https://bugs.llvm.org/show_bug.cgi?id=14613

llvm-svn: 341997
2018-09-11 22:00:15 +00:00
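
As a cross-check of the r341997 folds above, this sketch brute-forces every i8 value of `%x` for a given constant `C1`. The preconditions are the ones quoted from Alive; the helper names are invented for the example, and `is_power_of_2` wraps its argument to 8 bits to mimic i8 arithmetic.

```python
MASK = 0xFF  # model i8 values as integers in [0, 255]


def is_power_of_2(v):
    """True when v, wrapped to 8 bits, has exactly one bit set."""
    v &= MASK
    return v != 0 and (v & (v - 1)) == 0


def xor_ult_holds(c1):
    """(x ^ C1) u< C1  ==  x u> ~C1 for every i8 x."""
    not_c1 = ~c1 & MASK
    return all(((x ^ c1) < c1) == (x > not_c1) for x in range(256))


def xor_ugt_holds(c1):
    """(x ^ C1) u> C1  ==  x u> C1 for every i8 x."""
    return all(((x ^ c1) > c1) == (x > c1) for x in range(256))
```

Looping `c1` over `range(256)` and calling `xor_ult_holds` when `is_power_of_2(-c1)` (or `xor_ugt_holds` when `is_power_of_2(c1 + 1)`) confirms each fold under its precondition; a constant such as `0x0F` shows that `xor_ult` does not hold without it.
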
Michael Berg c72a7259be add IR flags to MI
Summary: Initial support for nsw, nuw and exact flags in MI

Reviewers: spatel, hfinkel, wristow

Reviewed By: spatel

Subscribers: nlopes

Differential Revision: https://reviews.llvm.org/D51738

llvm-svn: 341996
2018-09-11 21:35:32 +00:00
Matt Morehouse f0d7daa972 Revert "[SanitizerCoverage] Create comdat for global arrays."
This reverts r341987 since it will cause trouble when there's a module
ID collision.

llvm-svn: 341995
2018-09-11 21:15:41 +00:00
Sanjay Patel c79d964fdd [InstCombine] add tests for icmp with xor; NFC
llvm-svn: 341993
2018-09-11 21:13:20 +00:00
Matt Morehouse 7ce6032432 [SanitizerCoverage] Create comdat for global arrays.
Summary:
Place global arrays in comdat sections with their associated functions.
This makes sure they are stripped along with the functions they
reference, even on the BFD linker.

Reviewers: eugenis

Reviewed By: eugenis

Subscribers: eraman, hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D51902

llvm-svn: 341987
2018-09-11 20:10:40 +00:00
Alina Sbirlea a496143c9e Update MemorySSA in LoopUnswitch.
Summary:
Update MemorySSA in old LoopUnswitch pass.
Actual dependency and update is disabled by default.

Subscribers: sanjoy, jlebar, Prazek, llvm-commits

Differential Revision: https://reviews.llvm.org/D45301

llvm-svn: 341984
2018-09-11 19:19:21 +00:00
Konstantin Zhuravlyov 941615e4c8 AMDGPU: Move isa version and EF_AMDGPU_MACH_* determination
into TargetParser.

Also switch away from target features to CPU string when
determining isa version. This fixes an issue when we
output wrong isa version in the object code when features
of a particular CPU are altered (i.e. gfx902 w/o xnack
used to result in gfx900).

Differential Revision: https://reviews.llvm.org/D51890

llvm-svn: 341982
2018-09-11 18:56:51 +00:00
Sanjay Patel 342c3bcf11 [InstCombine] enhance vector demanded elements to look at a vector select condition operand
I noticed that we were not back-propagating undef lanes to shuffle masks when we have a 
shuffle that reduces the vector width. This is part of investigating/solving PR38691:
https://bugs.llvm.org/show_bug.cgi?id=38691

The DAG equivalent was proposed with:
D51696

Differential Revision: https://reviews.llvm.org/D51433

llvm-svn: 341981
2018-09-11 18:49:00 +00:00
Sanjay Patel 44c1b3a331 [InstCombine] add tests for add-with-overflow compares; NFC
llvm-svn: 341979
2018-09-11 18:45:28 +00:00
Craig Topper 8238580aae [X86] Prefer unpckhpd over movhlps in isel for fake unary cases
In r337348, I changed lowering to prefer X86ISD::UNPCKL/UNPCKH opcodes over MOVLHPS/MOVHLPS for v2f64 {0,0} and {1,1} shuffles when we have SSE2. This enabled the removal of a bunch of weirdly bitcasted isel patterns in r337349. To avoid changing the tests I placed a gross hack in isel to still emit movhlps instructions for fake unary unpckh nodes. A similar hack was not needed for unpckl and movlhps because we do execution domain switching for those. But unpckh and movhlps have swapped operand order.

This patch removes the hack.

This is a code size increase since unpckhpd requires a 0x66 prefix and movhlps does not. But if that's a big concern we should be using movhlps for all unpckhpd opcodes and let commuteInstruction turn it into unpckhpd when it's an advantage.

Differential Revision: https://reviews.llvm.org/D49499

llvm-svn: 341973
2018-09-11 17:57:27 +00:00
Craig Topper cc9efaffad [X86] Teach X86FastISel::X86SelectRet to use EAX for the sret pointer in GNUX32
GNUX32 uses 32-bit pointers despite is64BitMode being true. So we should use EAX to return the value.

Fixes one of the failures from PR38865.

Differential Revision: https://reviews.llvm.org/D51940

llvm-svn: 341972
2018-09-11 17:57:23 +00:00
Craig Topper 4e63db8387 [InstCombine] Fix incorrect usage of getPrimitiveSizeInBits when we should be using the element size for vectors
For vectors, getPrimitiveSizeInBits returns the full vector width. This code should be using the element size for vectors. This could be fixed by calling getScalarSizeInBits, but it's even easier to just get it from the APInt we're checking.

Differential Revision: https://reviews.llvm.org/D51938

llvm-svn: 341971
2018-09-11 17:57:20 +00:00
Florian Hahn 5b7e21a6b7 [CallSiteSplitting] Add debug location to created PHI nodes.
There are 2 cases when we create PHI nodes:
 * For the result of the call that was duplicated in the split blocks.
   Those PHI nodes should have the debug location of the call.

 * For values produced before the call. Those instructions need to be
   duplicated in the split blocks and the PHI nodes should have the
   debug locations of those instructions.

Fixes PR37962.

Reviewers: junbuml, gbedwell, vsk

Reviewed By: junbuml

Tags: #debug-info

Differential Revision: https://reviews.llvm.org/D51919

llvm-svn: 341970
2018-09-11 17:55:58 +00:00
Josh Stone f446facab0 [GlobalISel] Lower dbg.declare into indirect DBG_VALUE
Summary:
D31439 changed the semantics of dbg.declare to take the address of a
variable as the first argument, making it indirect.  It specifically
updated FastISel for this change here:

https://reviews.llvm.org/D31439#change-WVArzi177jPl

GlobalISel needs to follow suit, or else it will be missing a level of
indirection in the generated debuginfo.  This problem was seen in a Rust
debuginfo test on aarch64, since GlobalISel is used at -O0 for aarch64.

https://github.com/rust-lang/rust/issues/49807
https://bugzilla.redhat.com/show_bug.cgi?id=1611597
https://bugzilla.redhat.com/show_bug.cgi?id=1625768

Reviewers: dblaikie, aprantl, t.p.northover, javed.absar, rnk

Reviewed By: rnk

Subscribers: #debug-info, rovka, kristof.beyls, JDevlieghere, llvm-commits, tstellar

Differential Revision: https://reviews.llvm.org/D51749

llvm-svn: 341969
2018-09-11 17:52:01 +00:00
Matt Morehouse 40fbdd0c4f Revert "[SanitizerCoverage] Create comdat for global arrays."
This reverts r341951 due to bot breakage.

llvm-svn: 341965
2018-09-11 17:20:14 +00:00
Craig Topper a57bb61a3e [InstCombine] Support (mul (sext x), cst) --> (sext (mul x, cst')) and (mul (zext x), cst) --> (zext (mul x, cst')) for vector constants.
Similar to D51236, but for mul instead of add.

Differential Revision: https://reviews.llvm.org/D51900

llvm-svn: 341961
2018-09-11 16:51:24 +00:00
Alexandros Lamprineas db18e972d7 [GVNHoist] Re-enable GVNHoist by default
Rebase rL340922 since https://bugs.llvm.org/show_bug.cgi?id=38807
has been fixed by rL341947.

llvm-svn: 341954
2018-09-11 15:55:45 +00:00
Roman Lebedev baf2628043 [DagCombine][NFC] Some more tests for X % C == 0 (UREM case) transform
For https://reviews.llvm.org/D50222

Patch by: hermord (Dmytro Shynkevych)!

llvm-svn: 341953
2018-09-11 15:34:26 +00:00
Simon Atanasyan 16c2311c59 [MIPS] Fix illegal type assert in single-float mode
An fp_to_sint node would be incorrectly lowered to a TruncIntFP node in
single-float mode. This would trigger an "Unexpected illegal type!"
assert.

Patch by Dan Ravensloft.

Differential revision: https://reviews.llvm.org/D51810

llvm-svn: 341952
2018-09-11 15:32:47 +00:00
Matt Morehouse eac270caf4 [SanitizerCoverage] Create comdat for global arrays.
Summary:
Place global arrays in comdat sections with their associated functions.
This makes sure they are stripped along with the functions they
reference, even on the BFD linker.

Reviewers: eugenis

Reviewed By: eugenis

Subscribers: eraman, hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D51902

llvm-svn: 341951
2018-09-11 15:23:14 +00:00
Alexandros Lamprineas 96762b37e1 [MemorySSAUpdater] Avoid creating self-referencing MemoryDefs
Fix for https://bugs.llvm.org/show_bug.cgi?id=38807, which occurred
while compiling SemaTemplateInstantiate.cpp with clang and GVNHoist
enabled. In the following example:

      1=def(entry)
      /        \
2=def(1)       4=def(1)
3=def(2)       5=def(4)

When removing the MemoryDef 2=def(1) from its basic block, and just
before adding it to the end of the parent basic block, we first
replace all its uses with the defining memory access:

3=def(2) -> 3=def(1)

Then we call insertDef for adding 2=def(1) to the parent basic block,
where we replace the uses of 1=def(entry) with 2=def(1). Doing so we
create a self reference:

2=def(1) -> 2=def(2)  (bad)
3=def(1) -> 3=def(2)  (ok)
4=def(1) -> 4=def(2)  (ok)

Differential Revision: https://reviews.llvm.org/D51801

llvm-svn: 341947
2018-09-11 14:29:59 +00:00
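
The sequence described in r341947 above can be replayed on a plain dictionary that maps each access to its defining access. This is only a model of the bookkeeping, not the MemorySSAUpdater API: `replace_uses` and `move_def_to_parent` are hypothetical helpers, and skipping the moved def during the second replacement is one plausible way to avoid the self-reference, not necessarily what the patch does.

```python
def replace_uses(defining, old, new, skip=None):
    """Point every access defined by `old` at `new`, optionally skipping one."""
    for access, current in defining.items():
        if current == old and access != skip:
            defining[access] = new


def move_def_to_parent(defining, moved, avoid_self_reference):
    """Replay the two replacement steps from the commit message."""
    parent_def = defining[moved]
    # Step 1: users of the moved def fall back to its defining access.
    replace_uses(defining, moved, parent_def)
    # Step 2: inserting the def in the parent block makes users of the
    # parent's def see the moved def instead.
    skip = moved if avoid_self_reference else None
    replace_uses(defining, parent_def, moved, skip=skip)
    return defining


def example_graph():
    """1=def(entry), 2=def(1), 3=def(2), 4=def(1), 5=def(4)."""
    return {1: "entry", 2: 1, 3: 2, 4: 1, 5: 4}
```

Calling `move_def_to_parent(example_graph(), 2, avoid_self_reference=False)` reproduces the bad `2=def(2)` state, while passing `True` leaves `2=def(1)` and still rewires accesses 3 and 4 to 2.
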
Nico Weber f48e961d23 Make malformed-machos.test pass on my Mac.
For some reason, llvm-objdump defaults to -arch=i386 on this system while
the test checks x86_64 output. Explicitly pass -arch=x86_64.

llvm-svn: 341944
2018-09-11 14:10:33 +00:00
Roman Lebedev de9d787131 [Hexagon] [Test] Remove undef and infinite loop from test
Summary:
The undef and the infinite loop at the end cause this test to be translated
unpredictably. In particular, the checked-for `mpy` disappears under
certain legal optimizations (e.g. the one in D50222).
Since the use of these constructs is not relevant to the behavior tested,
according to the header comment, this change, suggested by @kparzysz,
eliminates them.

Was initially committed in r341046, but was reverted.

Patch by: hermord (Dmytro Shynkevych)!

Reviewers: kparzysz

Reviewed By: kparzysz

Subscribers: lebedev.ri, llvm-commits, kparzysz

Differential Revision: https://reviews.llvm.org/D50944

llvm-svn: 341943
2018-09-11 14:06:14 +00:00
Sam Parker 01db2983cd [ARM] Add smlald support in ARMParallelDSP
Search from i64 reducing phis, as well as i32, to allow the
generation of smlald instructions.

Differential Revision: https://reviews.llvm.org/D51101

llvm-svn: 341941
2018-09-11 14:01:22 +00:00
Sanjay Patel e368f46788 [AArch64] test codegen for unsigned saturated add; NFC
This is identical to the tests added for x86 at rL341845.
A semi-generic DAGCombine should improve things universally.

llvm-svn: 341935
2018-09-11 13:21:28 +00:00
Petar Jovanovic 5abf4bb552 [MIPS] ORC JIT support
This patch adds support for ORC JIT for mips/mips64 architecture.
In common code $static is changed to __ORCstatic because on MIPS
architecture "$" is a reserved character.

Patch by Luka Ercegovcevic

Differential Revision: https://reviews.llvm.org/D49665

llvm-svn: 341934
2018-09-11 13:10:04 +00:00
Alexander Timofeev db7ee7660a [AMDGPU] Preliminary patch for divergence driven instruction selection. Immediate selection predicate changed
Differential revision: https://reviews.llvm.org/D51734
Reviewers: rampitec

llvm-svn: 341928
2018-09-11 11:56:50 +00:00
Johannes Doerfert ae3cfeb3ad [FuncAttrs] Remove "access range attributes" for read-none functions
The presence of readnone and an access range attribute (argmemonly,
inaccessiblememonly, inaccessiblemem_or_argmemonly) is considered an
error by the verifier. This seems strict but also not wrong. This
patch makes sure function attribute detection will remove all access
range attributes for readnone functions.

llvm-svn: 341927
2018-09-11 11:51:29 +00:00
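
The rule from r341927 above can be sketched with attribute sets. The three access range attribute names are the ones listed in the commit message; `infer_readnone` and `verifier_accepts` are hypothetical helpers that only model the interaction and are not LLVM APIs.

```python
# Access range attributes named in the commit message.
ACCESS_RANGE_ATTRS = {
    "argmemonly",
    "inaccessiblememonly",
    "inaccessiblemem_or_argmemonly",
}


def verifier_accepts(attrs):
    """Model the verifier rule: readnone must not coexist with an access range attribute."""
    return not ("readnone" in attrs and attrs & ACCESS_RANGE_ATTRS)


def infer_readnone(attrs):
    """Mark a function readnone and drop any access range attributes."""
    return (set(attrs) - ACCESS_RANGE_ATTRS) | {"readnone"}
```

Adding `readnone` without the subtraction would produce a set the modelled verifier rejects, which is the situation the patch avoids.
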
Simon Atanasyan 32d8d1bf04 [mips] Add a pattern for 64-bit GPR variant of the `rdhwr` instruction
MIPS ISAs support a third operand for the `rdhwr` instruction only
starting from Revision 6, but LLVM generates assembler code with the
three-operand version of this instruction on any MIPS64 ISA. The third
operand is always zero, so in the case of direct code generation we get
correct code.

This patch fixes the bug by adding an instruction alias. The same alias
already exists for 32-bit ISA.

Ideally, we should also reject the three-operand version of the `rdhwr`
instruction in assembler code if the ISA revision is less than 6. That is
a task for a separate patch.

This fixes PR38861 (https://bugs.llvm.org/show_bug.cgi?id=38861)

Differential revision: https://reviews.llvm.org/D51773

llvm-svn: 341919
2018-09-11 09:57:25 +00:00
Craig Topper 844f035e1e [X86] In combineMOVMSK, look through int->fp bitcasts before calling SimplifyDemandedBits.
MOVMSKPS and MOVMSKPD both take FP types, but likely the operations before it are on integer types with just an int->fp bitcast between them. If the bitcast isn't used by anything else and doesn't change the element width we can look through it to simplify the integer ops.

llvm-svn: 341915
2018-09-11 08:20:02 +00:00
Craig Topper 85210311ba [X86] Add test cases inspired by PR38840.
These are test cases inspired by sequences like below for extracting the same bit from every vector element and checking for all zeros/ones.

define i1 @and256_x8(<8 x i32>) {
    %a = trunc <8 x i32> %0 to <8 x i1>
    %b = bitcast <8 x i1> %a to i8
    %d = icmp eq i8 %b, -1
    ret i1 %d
}

This is what the above looks like after InstCombine.

define i1 @and256_x8_opt(<8 x i32>) {
  %2 = and <8 x i32> %0, <i32 1, i32 1, i32 1, i32 1, i32 1, i32 1, i32 1, i32 1>
  %a = icmp ne <8 x i32> %2, zeroinitializer
  %b = bitcast <8 x i1> %a to i8
  %d = icmp eq i8 %b, -1
  ret i1 %d
}

llvm-svn: 341908
2018-09-11 07:23:29 +00:00
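
The two IR functions in r341908 above compute the same predicate, which a small model can confirm. The sketch assumes that lane i of the `<8 x i1>` vector becomes bit i of the `i8` after the bitcast (a little-endian layout); the Python function names simply mirror the IR function names.

```python
def and256_x8(lanes):
    """trunc <8 x i32> to <8 x i1>, bitcast to i8, compare with -1."""
    bits = 0
    for i, lane in enumerate(lanes):
        bits |= (lane & 1) << i  # trunc keeps the low bit; lane i lands in bit i
    return bits == 0xFF


def and256_x8_opt(lanes):
    """and with splat 1, icmp ne 0, bitcast to i8, compare with -1."""
    bits = 0
    for i, lane in enumerate(lanes):
        masked = lane & 1
        bits |= int(masked != 0) << i
    return bits == 0xFF
```

Both return true only when bit 0 is set in every lane, which is the "same bit from every element, all ones" check the test cases target.
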
Dean Michael Berris d2c50408d4 [XRay] Add TSC to NewCPUId Records
Summary:
This more correctly reflects the data written by the FDR mode runtime.

This is a continuation of the work in D50441.

Reviewers: mboerger, eizan

Subscribers: hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D51911

llvm-svn: 341905
2018-09-11 06:36:51 +00:00
Max Kazantsev 9aacaffd98 [NFC] Specify test's option to reduce reliance on defaults
llvm-svn: 341904
2018-09-11 06:34:43 +00:00
Matt Arsenault d0cf1b26d4 AMDGPU: Fix r600 test
llvm-svn: 341898
2018-09-11 04:39:16 +00:00
Matt Arsenault 99c780159d AMDGPU: Don't error on out of bounds address spaces
We should never abort on valid IR. The most reasonable
interpretation of an arbitrary address space pointer is
probably some kind of special subset of global memory.

llvm-svn: 341894
2018-09-11 04:00:41 +00:00
David Blaikie 4ec5a9159b llvm-symbolizer: Fix bug related to TUs interfering with symbolizing
With the merge of TUs and CUs into a single container, some code that
relied on the CU range having an ordered range of contiguous addresses
(for locating a CU at a given offset) broke. But the units from
debug_info (currently only CUs, but CUs and TUs in DWARFv5) are in a
contiguous sub-range of that container - searching only through that
subrange is still valid & so do that.

llvm-svn: 341889
2018-09-11 02:04:45 +00:00
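
The idea in r341889 above, restricting the search to the contiguous debug_info sub-range, might look like the following. This is a simplified sketch: units are represented only by their start offsets, unit lengths are ignored, and the names `unit_for_offset`, `info_begin` and `info_end` are hypothetical rather than taken from the DWARF library.

```python
import bisect


def unit_for_offset(unit_offsets, info_begin, info_end, offset):
    """Index of the last debug_info unit starting at or before `offset`.

    Only unit_offsets[info_begin:info_end] is assumed to be sorted; entries
    outside that range (for example type units from another section) may
    carry unrelated offsets and are never inspected.
    """
    i = bisect.bisect_right(unit_offsets, offset, info_begin, info_end) - 1
    return i if i >= info_begin else None
```

For a container like `[0x0, 0x40, 0x90, 0x0, 0x30]`, where the first three entries are debug_info units, a binary search over the whole list would be invalid because it is not sorted, while searching indices 0 to 3 stays well defined.
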
Peter Collingbourne c7d281905b Prevent Constant Folding From Optimizing inrange GEP
This patch does the following things:

1. update SymbolicallyEvaluateGEP so that it bails out if it cannot preserve inrange attribute;
2. update llvm/test/Analysis/ConstantFolding/gep.ll to remove UB in it;
3. remove inaccurate comment above ConstantFoldInstOperandsImpl in llvm/lib/Analysis/ConstantFolding.cpp;
4. add a new regression test that makes sure that no optimizations change an inrange GEP in an unexpected way.

Patch by Zhaomo Yang!

Differential Revision: https://reviews.llvm.org/D51698

llvm-svn: 341888
2018-09-11 01:53:36 +00:00
Dean Michael Berris dd01efc56d [XRay] Add the `llvm-xray fdr-dump` implementation
Summary:
In this change, we implement a `BlockPrinter` which orders records in a
Block that's been indexed by the `BlockIndexer`. This is used in the
`llvm-xray fdr-dump` tool which ties together the various types and
utilities we've been working on, to allow for inspection of XRay FDR
mode traces both with and without verification.

This change is the final step of the refactoring of D50441.

Reviewers: mboerger, eizan

Subscribers: mgorny, hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D51846

llvm-svn: 341887
2018-09-11 00:22:53 +00:00
Jessica Paquette a80d6faa10 Add REQUIRES line to machine-size-remarks
Just was made aware that this is necessary for tests outside of
the X86 subdirectory. Add a REQUIRES line to make sure bots that
don't enable x86 are happy.

llvm-svn: 341885
2018-09-10 23:53:08 +00:00
Craig Topper 3de8d592d1 [InstCombine] Add testcases for (mul (sext x), cst) --> (sext (mul x, cst')) and (mul (zext x), cst) --> (zext (mul x, cst')) for vector constants.
If the multiply won't overflow in the original type we can use a smaller mul and sign extend afterwards. We don't currently support this for vector constants.

llvm-svn: 341884
2018-09-10 23:48:21 +00:00
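
The condition stated in r341884 above (the multiply must not overflow in the original type) can be illustrated for a scalar i8 to i16 sign extension. The helpers below are a sketch with made-up names; they model two's complement arithmetic with Python integers and say nothing about how InstCombine proves the absence of overflow.

```python
def sext(value, bits):
    """Interpret the low `bits` bits of value as a signed integer."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def fits(value, bits):
    """True when value is representable as a signed `bits`-bit integer."""
    return -(1 << (bits - 1)) <= value < (1 << (bits - 1))


def wide_mul(x, c, narrow=8, wide=16):
    """mul (sext x), c computed in the wide type."""
    return sext(sext(x, narrow) * c, wide)


def narrow_mul_then_sext(x, c, narrow=8, wide=16):
    """sext (mul x, c') computed in the narrow type."""
    return sext(sext(sext(x, narrow) * c, narrow), wide)
```

Whenever `fits(x * c, 8)` holds, both functions agree; with `x = 100` and `c = 2` the narrow multiply wraps to -56 while the wide one yields 200, which is why the no-overflow check is required.
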
Alina Sbirlea 116caa2920 [InstCombine] Partially revert rL341674 due to PR38897.
Summary:
Revert min/max changes in rL341674 due to high compile times causing timeouts (PR38897).
Checking in to unblock failing builds. Patch available for post-commit review and re-revert once resolved.
Working on a smaller reproducer for PR38897.

Reviewers: craig.topper, spatel

Subscribers: sanjoy, jlebar, llvm-commits

Differential Revision: https://reviews.llvm.org/D51897

llvm-svn: 341883
2018-09-10 23:47:21 +00:00
Jessica Paquette cd7bd8262a Explicitly state triple in machine-size-remarks.ll
A bot was unhappy with the x86 triple there before. Set it explicitly to
x86_64-apple-darwin just to get something consistent.

Example failure:
http://lab.llvm.org:8011/builders/llvm-hexagon-elf/builds/16846

llvm-svn: 341882
2018-09-10 23:30:53 +00:00
Philip Reames 9f09161290 [AST] Add test coverage of memsets
Immediately after posting https://reviews.llvm.org/D51895, I noticed a small bug.  These tests would have caught that.

llvm-svn: 341880
2018-09-10 23:14:30 +00:00
Jessica Paquette 54fbfaeace Add size remarks to MachineFunctionPass
This adds per-function size remarks to codegen, similar to what we have in the
IR layer as of r341588. This only impacts MachineFunctionPasses.

This does the same thing, but for `MachineInstr`s instead of just
`Instructions`. After this, when a `MachineFunctionPass` modifies the number of
`MachineInstr`s in the function it ran on, you'll get a remark.

To enable this, use the size-info analysis remark as before.

llvm-svn: 341876
2018-09-10 22:24:10 +00:00
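
The behaviour described in r341876 above, a remark whenever a pass changes the instruction count of the function it ran on, can be modelled roughly as follows. A function is represented as a list of instruction strings; `run_with_size_remark` and the shape of its return value are hypothetical and do not reproduce the real remark format.

```python
def run_with_size_remark(pass_name, run_pass, function):
    """Run a pass over `function` and report a size change, if any.

    Returns None when the instruction count is unchanged, otherwise a
    (pass_name, count_before, count_after) tuple standing in for a remark.
    """
    before = len(function)
    run_pass(function)  # the pass mutates the instruction list in place
    after = len(function)
    if after == before:
        return None
    return (pass_name, before, after)


def strip_nops(function):
    """Example pass: delete every NOP from the instruction list."""
    function[:] = [inst for inst in function if inst != "NOP"]
```

Running `strip_nops` through the wrapper on `["ADD", "NOP", "RET"]` yields `("strip-nops", 3, 2)`, while a pass that leaves the list untouched yields no remark.
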
Craig Topper 07889079fa [X86] Explicitly enable aes in aes-schedule.ll to fix failures after r341861.
llvm-svn: 341868
2018-09-10 21:49:01 +00:00
Sanjay Patel 7feb3ed78c [x86] test codegen for unsigned saturated add; NFC
All of the ISA holes are going to make this difficult,
but we can't canonicalize the IR and try to solve PR14613
until we have backend support to get this right.

https://bugs.llvm.org/show_bug.cgi?id=14613

https://rise4fun.com/Alive/Guv
https://rise4fun.com/Alive/AADG

llvm-svn: 341845
2018-09-10 17:40:15 +00:00
Alexander Timofeev 20cbe6f319 [AMDGPU] Preliminary patch for divergence driven instruction selection. Inline immediate move to V_MADAK_F32.
Differential revision: https://reviews.llvm.org/D51586

    Reviewer: rampitec

llvm-svn: 341843
2018-09-10 16:42:49 +00:00
Philip Reames 5660bd460b [AST] Visit memtransfer arguments in order
The only point to this change is the test diffs.  When I remove this code entirely (in favor of the recently added generic handling), I don't want there to be any confusion due to spurious test diffs.

As an aside, the fact our tests are AST construction order dependent is not great.  I thought about fixing that, but the reasonable schemes I might want (e.g. sort by name) need the test diffs anyways.

Philip

llvm-svn: 341841
2018-09-10 16:00:27 +00:00
Petar Jovanovic ce4dd0ae38 [MIPS GlobalISel] Select icmp
Select 32bit integer compare instructions for MIPS32.

Patch by Petar Avramovic.

Differential Revision: https://reviews.llvm.org/D51489

llvm-svn: 341840
2018-09-10 15:56:52 +00:00
Sebastian Pop d76177869a HotColdSplitting: fix test failing because of last commit
llvm-svn: 341839
2018-09-10 15:42:17 +00:00
Gil Rapaport d874c3a480 [LSR] Add tests for small constants; NFC
LSR reassociates small constants that fit into add immediate operands as
unfolded offset. Since unfolded offset is not combined with loop-invariant
registers, LSR does not consider solutions that bump invariant registers by
these constants outside the loop.

llvm-svn: 341835
2018-09-10 14:56:24 +00:00
Tim Northover 12c1f7675f InstCombine: move hasOneUse check to the top of foldICmpAddConstant
There were two combines not covered by the check before now, neither of which
actually differed from normal in the benefit analysis.

The most recent seems to be because it was just added at the top of the
function (naturally). The older is from way back in 2008 (r46687) when we just
didn't put those checks in so routinely, and has been diligently maintained
since.

llvm-svn: 341831
2018-09-10 14:26:44 +00:00
John Brawn 8967e18c4a [GVN] Invalidate cached info for values replaced by equality propagation
When GVN propagates an equality by replacing one value with another it also
needs to invalidate the cached information for the value being replaced.

Differential Revision: https://reviews.llvm.org/D51218

llvm-svn: 341820
2018-09-10 12:23:05 +00:00
Matt Arsenault 7f6dc597d3 AMDGPU: Stop reporting is-noop addrspacecast for constant 32-bit
This will require something to cast. Before this would eliminate
the cast, which would result in copies of $noreg.

llvm-svn: 341803
2018-09-10 11:59:27 +00:00
Matt Arsenault 57b5966dad DAG: Handle odd vector sizes in calling conv splitting
This already worked if only one register piece was used,
but didn't if a type was split into multiple, unequal
sized pieces.

Fixes not splitting v3i16/v3f16 into two registers for
AMDGPU.

This will also allow fixing the ABI for 16-bit vectors
in a future commit so that it's the same for all subtargets.

llvm-svn: 341801
2018-09-10 11:49:23 +00:00
Carl Ritson f898edd117 [AMDGPU] Prevent sequences of non-instructions disrupting GCNHazardRecognizer wait state counting
Summary:
This fixes a bug where a large number of implicit def instructions can fill the GCNHazardRecognizer lookahead buffer causing required NOPs to not be inserted.

Reviewers: nhaehnle, arsenm

Reviewed By: arsenm

Subscribers: sheredom, kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits

Differential Revision: https://reviews.llvm.org/D51726

Change-Id: Ie75338f94de704ee5816b05afd0c922c6748a95b
llvm-svn: 341798
2018-09-10 10:14:48 +00:00
Max Kazantsev 4d10ba37b9 [IndVars] Set Changed if sinkUnusedInvariants changes IR. PR38863
Currently, `sinkUnusedInvariants` does not set Changed flag even if it makes
changes in the IR. There is no clear evidence that it can cause a crash, but it
looks highly suspicious and likely invalid.

Differential Revision: https://reviews.llvm.org/D51777
Reviewed By: skatkov

llvm-svn: 341777
2018-09-10 06:32:00 +00:00
David Carlier 0efae196dd [XRay] Fix buildbot failure
llvm-svn: 341774
2018-09-10 05:29:49 +00:00
David Carlier 07cc5a8df9 [Xray] tooling allow MachO format support
Getting writable xray __DATA sections from MachO as well.

Reviewers: dberris

Reviewed By: dberris

Differential Revision: https://reviews.llvm.org/D51758

llvm-svn: 341772
2018-09-10 05:00:43 +00:00
Matt Arsenault 72d27f5525 AMDGPU: Fix tests using old number for constant address space
llvm-svn: 341770
2018-09-10 02:54:25 +00:00
Matt Arsenault d77fcc2a92 AMDGPU: Use GOT PSV since it has an address space now
llvm-svn: 341768
2018-09-10 02:23:39 +00:00
Matt Arsenault b998674610 AMDGPU: Don't abort on unknown addrspace argument
llvm-svn: 341767
2018-09-10 02:23:30 +00:00
Craig Topper 3823516103 [X86] Custom type legalize (v2i32 (fp_to_uint v2f64)) without avx512vl by widening to v4i32 and v4f64 instead of v8i32 and v8f64. Make it aware of x86-experimental-vector-widening-legalization
We have isel patterns for v4i32/v4f64 that artificially widen to v8i32/v8f64 so just use that.

If x86-experimental-vector-widening-legalization is enabled, we don't need any custom legalization and can just return. I've modified the test RUN lines to cover this case.

llvm-svn: 341765
2018-09-09 20:36:36 +00:00
Sanjay Patel 6ebf218e4c [SelectionDAG] enhance vector demanded elements to look at a vector select condition operand
This is the DAG equivalent of D51433.
If we know we're not using all vector lanes, use that knowledge to potentially simplify a vselect condition.

The reduction/horizontal tests show that we are eliminating AVX1 operations on the upper half of 256-bit 
vectors because we don't need those anyway.
I'm not sure what the pr34592 test is showing. That's run with -O0; is SimplifyDemandedVectorElts supposed 
to be running there?

Differential Revision: https://reviews.llvm.org/D51696

llvm-svn: 341762
2018-09-09 14:13:22 +00:00
Craig Topper 7af5e333e7 [X86] Create paddus/psubus from narrower vectors with i8/i16 element types.
Summary:
This patch allows vectors with a power of 2 number of elements and i8/i16 element type to select paddus/psubus instructions. ReplaceNodeResults has been updated to custom widen these operations up to 128 bits like we already do for PAVG.

Another step towards fixing PR38691

Reviewers: RKSimon, spatel

Reviewed By: RKSimon, spatel

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D51818

llvm-svn: 341753
2018-09-08 19:32:58 +00:00
Craig Topper a2c9694bc8 [X86] Mark the ADCX and ADOX instruction as commutable.
llvm-svn: 341752
2018-09-08 18:47:56 +00:00
Craig Topper 4677110348 [X86] Add test cases for commuting ADCX/ADOX instruction to avoid copies.
This is a MIR test so we can test ADOX which we have no isel patterns for. I also plan to remove ADCX isel patterns in the near future so this will help maintain coverage.

llvm-svn: 341751
2018-09-08 18:47:54 +00:00
Craig Topper c96305970d [X86] Add commuted isel pattern for the load form of ADCX instructions.
This prevents the legacy ADC instruction from being favored over ADCX when the load is in operand 0.

llvm-svn: 341745
2018-09-08 06:31:43 +00:00
Craig Topper 22a6f51646 [X86] Add load folding test cases for the addcarryx intrinsic.
We are currently only able to fold a load in operand 1 to ADCX. A load in operand 0 will use the legacy ADC instruction.

Ultimately I want to remove isel support for ADCX, but first I'm going to fix the shortcomings I know of so I can write proper MIR tests to maintain coverage later.

llvm-svn: 341744
2018-09-08 06:31:41 +00:00
Craig Topper 761e88d1d4 [X86] Add stack folding MIR test for ADCX/ADOX.
We currently have no way to isel ADOX and I plan to remove isel patterns for ADCX. This test will ensure we still have stack folding support for these instructions if we need them in the future.

llvm-svn: 341743
2018-09-08 05:08:18 +00:00
Adrian Prantl 609bf36952 Remove addBlockByrefAddress(), it is dead code as far as clang is concerned.
This patch removes addBlockByrefAddress(), it is dead code as far as
clang is concerned: Every byref block capture is emitted with a
complex expression that is equivalent to what this function does.

rdar://problem/31629055

Differential Revision: https://reviews.llvm.org/D51763

llvm-svn: 341737
2018-09-08 00:21:55 +00:00
Zachary Turner 0119e38491 Fix some of the PDB tests.
They were unintentionally calling DIA directly, which requires
Windows.  We need to pass the -native flag, and this then required
fixing up one or two tests.

llvm-svn: 341731
2018-09-07 23:36:08 +00:00
Zachary Turner da4b63ab9a [PDB] Support pointer types in the native reader.
In order to start testing this, I've added a new mode to
llvm-pdbutil which is only really useful for writing tests.
It just dumps the value of raw fields in record format.
This isn't really ideal and it won't allow us to test some
important cases, but it's better than nothing for now.

llvm-svn: 341729
2018-09-07 23:21:33 +00:00
Reid Kleckner f803b23879 [COFF] Implement llvm.global_ctors priorities for MSVC COFF targets
Summary:
MSVC and LLD sort sections ASCII-betically, so we need to use section
names that sort between .CRT$XCA (the start) and .CRT$XCU (the default
priority).

In the general case, use .CRT$XCT12345 as the section name, and let the
linker sort the zero-padded digits.

Users with low priorities typically want to initialize as early as
possible, so use .CRT$XCA00199 for priorities less than 200. This number
is arbitrary.
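
The naming scheme above can be sketched as follows. This is only an illustration of one reading of the description, not the code from the patch: the helper name `crt_section_name` is hypothetical, the default priority of 65535 is an assumption, and low priorities are assumed to keep their own zero-padded digits after the `A`.

```python
# Hypothetical helper; the name and the default value are assumptions for illustration.
DEFAULT_PRIORITY = 65535  # assumed default priority of llvm.global_ctors entries


def crt_section_name(priority: int) -> str:
    """Return a .CRT$XC* section name whose ASCII order follows the priority."""
    if priority == DEFAULT_PRIORITY:
        return ".CRT$XCU"
    # Low priorities sort right after .CRT$XCA; all others sort just before .CRT$XCU.
    letter = "A" if priority < 200 else "T"
    # Zero-padding to five digits makes string order match numeric order.
    return f".CRT$XC{letter}{priority:05d}"


names = [crt_section_name(p) for p in (65535, 12345, 199, 400)]
print(sorted(names))
```

Because the linker only compares section names as strings, the fixed-width digits are what keep priority 400 ahead of priority 12345.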

Implements PR38552.

Reviewers: majnemer, mstorsjo

Subscribers: hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D51820

llvm-svn: 341727
2018-09-07 23:07:55 +00:00
Abderrazek Zaafrani c30dfb2dfc [SimplifyIndVar] Avoid generating truncate instructions with non-hoisted Load operand.
Differential Revision: https://reviews.llvm.org/D49151

llvm-svn: 341726
2018-09-07 22:41:57 +00:00
Piotr Padlewski 9a925ba616 Set cost of invariant group intrinsics to 0
Summary:
Like with other similar intrinsics, presence of strip or
launder.invariant.group should not change the result of inlining cost.
This is because they are just markers and do not perform any computation.

Reviewers: amharc, rsmith, reames, kuhar

Subscribers: eraman, llvm-commits

Differential Revision: https://reviews.llvm.org/D51814

llvm-svn: 341725
2018-09-07 22:29:48 +00:00
Thomas Lively a0d25815a0 [WebAssembly] v8x16.shuffle
Summary:
Since the shuffle mask is not exposed as an operand in the native ISel
DAG, create a new WebAssembly ISD node exposing the mask. The mask is
lowered as sixteen immediate byte indices no matter what type the
original vector shuffle was operating on.

This CL depends on D51656

Reviewers: aheejin, dschuff

Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits

Differential Revision: https://reviews.llvm.org/D51659

llvm-svn: 341718
2018-09-07 21:54:46 +00:00
Sanjay Patel caa4de72a2 [InstCombine][x86] add tests for possible blendv transform (PR38814); NFC
llvm-svn: 341715
2018-09-07 21:40:41 +00:00
Philip Reames cb8b3278e5 [AST] Generalize argument specific aliasing
AliasSetTracker has special case handling for memset, memcpy and memmove which pre-existed argmemonly on functions and readonly and writeonly on arguments. This patch generalizes it using the AA infrastructure to any call correctly annotated.

The motivation here is to cut down on confusion, not performance per se. For most instructions, there is a direct mapping to an alias set. However, this is not guaranteed by the interface and was not in fact true for these three intrinsics *and only these three intrinsics*. I kept getting myself confused about this invariant, so I figured it would be good to clearly distinguish between instructions and alias sets. Calls happened to be an easy target.

The nice side effect is that custom implementations of memset/memcpy/memmove - including wrappers discovered by IPO - can now be optimized the same as builtins by LICM.

Note: The actual removal of the memset/memtransfer specific handling will happen in a follow-on NFC patch. It was originally part of this one, but was separated for ease of review and rebase.

Differential Revision: https://reviews.llvm.org/D50730

llvm-svn: 341713
2018-09-07 21:36:11 +00:00
Reid Kleckner 06d02d0306 [codeview] Add .cv_string directive for testing purposes
The main use case for this directive is to allow assembly writers to
write their own FPO data strings without going through the .cv_fpo*
directive family.

I'm experimenting with different RPN programs to fix PR38857, and I
figured I should go ahead and make this directive permanent.

llvm-svn: 341712
2018-09-07 21:30:52 +00:00
Craig Topper fa535c027e [X86] Add codegen tests for narrow PADDUS/PSUBUS patterns for PR38691.
llvm-svn: 341711
2018-09-07 21:28:46 +00:00
Sanjay Patel c1416b60f2 [InstCombine] narrow vector select with padded condition and extracted result (PR38691)
shuf (sel (shuf NarrowCond, undef, WideMask), X, Y), undef, NarrowMask) -->
sel NarrowCond, (shuf X, undef, NarrowMask), (shuf Y, undef, NarrowMask)

The motivating case from:
https://bugs.llvm.org/show_bug.cgi?id=38691
...is the last regression test. In that case, we're just left with the narrow select.

Note that if we do create new shuffles, they use the existing extraction identity mask, 
so there's no danger that this transform creates arbitrary shuffles.
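
As a rough model of the transform described above, the sketch below treats vectors as Python lists and undef lanes as `None`. The helpers `shuffle` and `select` are hypothetical stand-ins for the IR operations, and the masks are example values chosen for illustration.

```python
UNDEF = None  # stand-in for an undef lane


def shuffle(vec, mask):
    """One-input shuffle: lane i takes vec[mask[i]]; an UNDEF index yields UNDEF."""
    return [UNDEF if m is UNDEF else vec[m] for m in mask]


def select(cond, x, y):
    """Lane-wise select; an UNDEF condition lane yields an UNDEF result lane."""
    return [UNDEF if c is UNDEF else (a if c else b) for c, a, b in zip(cond, x, y)]


narrow_cond = [True, False]
x = [10, 11, 12, 13]
y = [20, 21, 22, 23]
wide_mask = [0, 1, UNDEF, UNDEF]  # pads the 2-lane condition out to 4 lanes
narrow_mask = [0, 1]              # extracts the low 2 lanes again

# Original form: widen the condition, select on wide vectors, then extract.
before = shuffle(select(shuffle(narrow_cond, wide_mask), x, y), narrow_mask)
# Narrowed form: extract the operands first, then select on narrow vectors.
after = select(narrow_cond, shuffle(x, narrow_mask), shuffle(y, narrow_mask))
print(before, after)
```

Both forms produce the same two lanes here because the extraction mask only reads lanes whose condition was defined.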

Differential Revision: https://reviews.llvm.org/D51496

llvm-svn: 341708
2018-09-07 21:03:34 +00:00
Nick Desaulniers 287a3be379 [AArch64] Support reserving x1-7 registers.
Summary:
Reserving registers x1-7 is used to support CONFIG_ARM64_LSE_ATOMICS in Linux kernel. This change adds support for reserving registers x1 through x7.

Reviewers: javed.absar, phosek, srhines, nickdesaulniers, efriedma

Reviewed By: nickdesaulniers, efriedma

Subscribers: niravd, jfb, manojgupta, nickdesaulniers, jyknight, efriedma, kristof.beyls, llvm-commits

Differential Revision: https://reviews.llvm.org/D48580

llvm-svn: 341706
2018-09-07 20:58:57 +00:00
Craig Topper 5cbce81c91 [X86] Don't create ZERO_EXTEND_INREG/SIGN_EXTEND_INREG for v1iX vectors.
The generic type legalizer will scalarize v1iX instructions getting rid of the vector entirely. Creating wider vector instructions is just going to prevent that.

llvm-svn: 341705
2018-09-07 20:56:03 +00:00
Craig Topper 39f48fdcbc [X86] Don't create X86ISD::AVG nodes from v1iX vectors.
The type legalizer will try to scalarize this and fail.

It looks like there are some other v1iX oddities out there too since we still generated some vector instructions.

llvm-svn: 341704
2018-09-07 20:56:01 +00:00
Craig Topper 4863313b35 [X86] Modify the rdtscp intrinsic to return values instead of taking a pointer argument
Similar to what was recently done for addcarry/subborrow and has been done for rdrand/rdseed for a while. It's better to use two results and an explicit store in IR when the store isn't part of the semantics of the instruction. This allows store->load forwarding to happen in the middle end. It also allows the store to be removed if it's never loaded.

Differential Revision: https://reviews.llvm.org/D51803

llvm-svn: 341698
2018-09-07 19:14:15 +00:00
Reid Kleckner ee0e8bab2a [codeview] Improve readobj FPO dumper and pdbutil register names
The improved dumping helps me investigate PR38857.

llvm-svn: 341695
2018-09-07 18:48:27 +00:00
Ana Pazos b2ed11a086 [RISCV] Fix crash in decoding instruction with unknown floating point rounding mode
Summary:
Instead of crashing in printFRMArg, decode and warn about invalid instruction.

This bug was uncovered by a LLVM MC Disassembler Protocol Buffer Fuzzer
for the RISC-V assembly language.
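
A minimal sketch of the idea, assuming the standard RISC-V rounding-mode encodings. The function name `rounding_mode_name` is hypothetical and is not the printFRMArg code from the patch.

```python
# Standard RISC-V rounding-mode field encodings; 0b101 and 0b110 are reserved.
ROUNDING_MODES = {
    0b000: "rne",
    0b001: "rtz",
    0b010: "rdn",
    0b011: "rup",
    0b100: "rmm",
    0b111: "dyn",
}


def rounding_mode_name(frm: int):
    """Return the mnemonic for a 3-bit rounding mode, or None if it is reserved."""
    return ROUNDING_MODES.get(frm)


for frm in range(8):
    name = rounding_mode_name(frm)
    # A reserved value is reported to the caller instead of triggering an assertion.
    print(f"{frm:03b}", name if name is not None else "<invalid>")
```

Returning a "not valid" result lets the disassembler reject the instruction, which is the behaviour a fuzzer-facing decoder needs.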

Reviewers: asb

Reviewed By: asb

Subscribers: rbar, johnrusso, simoncook, sabuasal, niosHD, kito-cheng, shiva0217, zzheng, edward-jones, mgrang, rogfer01, MartinMosbeck, brucehoult, the_o, rkruppe, PkmX, jocewei, asb

Differential Revision: https://reviews.llvm.org/D51705

llvm-svn: 341691
2018-09-07 18:43:43 +00:00
Fangrui Song 91c95a35c1 [llvm-dwp] Clean up tests X86/*.test
llvm-svn: 341688
2018-09-07 18:29:20 +00:00
Ana Pazos b97d18945b [RISCV] Fix AddressSanitizer heap-buffer-overflow in disassembling
Summary:
RISCVDisassembler should check number of bytes available before reading them.
Crash noticed when enabling -DLLVM_USE_SANITIZER=Address.

This bug was uncovered by a LLVM MC Disassembler Protocol Buffer Fuzzer for the RISC-V assembly language.
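
The bounds check can be sketched like this. It is a simplified model that only handles the 16-bit and 32-bit encodings and ignores longer instruction formats; `read_instruction` is a hypothetical name, not the RISCVDisassembler API.

```python
def instruction_length(first_byte: int) -> int:
    """Low two bits equal to 0b11 mean a 32-bit instruction, otherwise 16-bit."""
    return 4 if (first_byte & 0b11) == 0b11 else 2


def read_instruction(data: bytes, offset: int = 0):
    """Return the little-endian instruction word, or None if too few bytes remain."""
    remaining = len(data) - offset
    if remaining < 2:
        return None
    size = instruction_length(data[offset])
    # Check the available byte count before reading, so truncated input is rejected.
    if remaining < size:
        return None
    return int.from_bytes(data[offset:offset + size], "little")


print(read_instruction(bytes([0x13, 0x00, 0x00, 0x00])))  # full 32-bit word
print(read_instruction(bytes([0x13, 0x00])))              # truncated 32-bit word
```

The second call returns `None` rather than reading past the end of the buffer, which is the class of bug AddressSanitizer reported.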

Reviewers: asb

Reviewed By: asb

Subscribers: rbar, johnrusso, simoncook, sabuasal, niosHD, kito-cheng, shiva0217, zzheng, edward-jones, mgrang, rogfer01, MartinMosbeck, brucehoult, the_o, rkruppe, PkmX, jocewei, asb

Differential Revision: https://reviews.llvm.org/D51708

llvm-svn: 341686
2018-09-07 18:23:19 +00:00
Craig Topper 72964ae99e [X86] Change the addcarry and subborrow intrinsics to return 2 results and remove the pointer argument.
We should represent the store directly in IR instead. This gives the middle end a chance to remove it if it can see a load from the same address.
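
To illustrate the two-result shape, here is a small model of a 32-bit add-with-carry that returns both the carry-out and the sum instead of writing the sum through a pointer. The names `addcarry_u32` and `add_u64_as_limbs` are hypothetical.

```python
MASK32 = 0xFFFFFFFF


def addcarry_u32(carry_in: int, a: int, b: int):
    """Return (carry_out, sum) for a 32-bit add with carry-in."""
    total = (a & MASK32) + (b & MASK32) + (1 if carry_in else 0)
    return total >> 32, total & MASK32


def add_u64_as_limbs(a: int, b: int) -> int:
    """Add two 64-bit values using two chained 32-bit add-with-carry steps."""
    carry, low = addcarry_u32(0, a & MASK32, b & MASK32)
    carry, high = addcarry_u32(carry, a >> 32, b >> 32)
    return (high << 32) | low


print(hex(add_u64_as_limbs(0x00000001FFFFFFFF, 0x0000000000000001)))
```

With both results available as plain values, the caller decides whether the sum is ever stored, which is what lets later passes forward or drop that store.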

Differential Revision: https://reviews.llvm.org/D51769

llvm-svn: 341677
2018-09-07 16:58:39 +00:00
Craig Topper 51e11788a4 [X86] Use regular expressions to make test immune to register allocation changes.
llvm-svn: 341676
2018-09-07 16:58:36 +00:00
Craig Topper 313d09af51 [X86] Teach X86DAGToDAGISel::foldLoadStoreIntoMemOperand to handle loads in operand 1 of commutable operations.
Previously we only handled loads in operand 0, but nothing guarantees the load will be operand 0 for commutable operations.

Differential Revision: https://reviews.llvm.org/D51768

llvm-svn: 341675
2018-09-07 16:27:55 +00:00
Craig Topper 040c2b0acf [InstCombine] Fold (min/max ~X, Y) -> ~(max/min X, ~Y) when Y is freely invertible
If the ~X wasn't able to simplify above the max/min, we might be able to simplify it by moving it below the max/min.

I had to modify the ~(min/max ~X, Y) transform to prevent getting stuck in a loop when we saw the new ~(max/min X, ~Y) before the ~Y had been folded away to remove the new not.
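
The identity behind the fold can be checked by brute force. The sketch below uses Python's signed integers, where `~v` equals `-v - 1` and therefore reverses ordering; `fold_holds` is a hypothetical helper, not part of InstCombine.

```python
def fold_holds(x: int, y: int) -> bool:
    """Check min(~x, y) == ~max(x, ~y) and max(~x, y) == ~min(x, ~y)."""
    min_ok = min(~x, y) == ~max(x, ~y)
    max_ok = max(~x, y) == ~min(x, ~y)
    return min_ok and max_ok


# Exhaustively check a small signed range.
all_ok = all(fold_holds(x, y) for x in range(-8, 8) for y in range(-8, 8))
print(all_ok)
```

The fold only pays off when `~Y` is free to form, since otherwise it trades one `not` for another.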

Differential Revision: https://reviews.llvm.org/D51398

llvm-svn: 341674
2018-09-07 16:19:50 +00:00
Anna Thomas 110df11a1a [LV] Fix code gen for conditionally executed loads and stores
Fix a latent bug in the loop vectorizer which generates incorrect code for
memory accesses that are executed conditionally. As pointed out in review,
this bug definitely affects uniform loads (and may affect conditional
stores that should have turned into scatters as well).

The code gen for conditionally executed uniform loads on architectures
that support masked gather instructions is broken.

Without this patch, we were unconditionally executing the *conditional*
load in the vectorized version.

This patch does the following:
1. Uniform conditional loads on architectures with gather support will
   have correct code generated. In particular, the cost model
   (setCostBasedWideningDecision) is fixed.
2. For the recipes which are handled after the widening decision is set,
   we use the isScalarWithPredication(I, VF) form which is added in the
   patch.

3. Fix the vectorization cost model for scalarization
   (getMemInstScalarizationCost): implement and use isPredicatedInst to
   identify *all* predicated instructions, not just scalar+predicated. So,
   now the cost for scalarization will be increased for maskedloads/stores
   and gather/scatter operations. In short, we should be choosing the
   gather/scatter in place of scalarization on archs where it is
   profitable.
4. We needed to weaken the assert in useEmulatedMaskMemRefHack.

Reviewers: Ayal, hsaito, mkuper

Differential Revision: https://reviews.llvm.org/D51313

llvm-svn: 341673
2018-09-07 15:53:48 +00:00
Aditya Kumar 801394a3d7 Hot cold splitting pass
Find cold blocks based on profile information (or optionally with static analysis).
Forward propagate profile information to all cold-blocks.
Outline a cold region.
Set calling conv and prof hint for the callsite of the outlined function.

Worked in collaboration with: Sebastian Pop <s.pop@samsung.com>
Differential Revision: https://reviews.llvm.org/D50658

llvm-svn: 341669
2018-09-07 15:03:49 +00:00
Florian Hahn e32ff4b28a [InstCombine] Do not fold scalar ops over select with vector condition.
If OtherOpT or OtherOpF have scalar types and the condition is a vector,
we would create an invalid select.

Reviewers: spatel, john.brawn, mssimpso, craig.topper

Reviewed By: spatel

Differential Revision: https://reviews.llvm.org/D51781

llvm-svn: 341666
2018-09-07 14:40:06 +00:00
David Stenberg 45acc9610b [DebugInfo] Handle stack slot offsets for spilled sub-registers in LDV
Summary:
Extend LDV so that stack slot offsets for spilled sub-registers
are added to the emitted debug locations. This is accomplished
by querying InstrInfo::getStackSlotRange().

With this change, LDV will add a DW_OP_plus_uconst operation to
the expression if a sub-register is spilled. Later on, PEI will
add an offset operation for the stack slot, meaning that we will
get expressions of the forms:

 * {DW_OP_constu #fp-offset, DW_OP_minus,
    DW_OP_plus_uconst #subreg-offset}

 * {DW_OP_plus_const #fp-offset,
    DW_OP_minus, DW_OP_plus_uconst #subreg-offset}

The two offset operations should ideally be merged.

Reviewers: rnk, aprantl, stoklund

Reviewed By: aprantl

Subscribers: dblaikie, bjope, nemanjai, JDevlieghere, llvm-commits

Tags: #debug-info

Differential Revision: https://reviews.llvm.org/D51612

llvm-svn: 341659
2018-09-07 13:54:07 +00:00
Sid Manning 9ad0f02749 Add support for getRegisterByName.
Support required to build the Hexagon Linux kernel.

Differential Revision: https://reviews.llvm.org/D51363

llvm-svn: 341658
2018-09-07 13:36:21 +00:00
Simon Pilgrim 04d0748417 [X86][SSE] Add additional fadd/fsub(x, bitcast_fneg(y)) tests with different integer bitwidths
llvm-svn: 341657
2018-09-07 13:27:07 +00:00
Simon Pilgrim 96d6b9c2e2 [DAGCombiner] foldBitcastedFPLogic - Add basic vector support
Add support for bitcasts from float type to an integer type of the same element bitwidth.

There may be cases where we need to support different widths (e.g. as SSE __m128i is treated as v2i64) - but I haven't seen cases of this in the wild yet.
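
For background, the kind of bitcasted logic involved is a float negation written as an integer xor of the sign bit. The sketch below shows that equivalence for 32-bit floats; the helper names are hypothetical and this is not the DAGCombiner code.

```python
import struct

SIGN_BIT = 0x80000000  # sign bit of an IEEE-754 single-precision value


def float_to_bits(value: float) -> int:
    """Bitcast a 32-bit float to its integer representation."""
    return struct.unpack("<I", struct.pack("<f", value))[0]


def bits_to_float(bits: int) -> float:
    """Bitcast a 32-bit integer back to a float."""
    return struct.unpack("<f", struct.pack("<I", bits))[0]


def fneg_via_xor(value: float) -> float:
    """Negate by flipping the sign bit in the integer domain."""
    return bits_to_float(float_to_bits(value) ^ SIGN_BIT)


lanes = [1.5, -0.25, 8.0, 0.0]
print([fneg_via_xor(v) for v in lanes])
```

Applying the same xor to every lane of a vector with matching element width is the vector case the message describes.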

llvm-svn: 341652
2018-09-07 12:13:45 +00:00
Florian Hahn b30f7aeeeb [NewGVN] Mark function as changed if we erase instructions.
Currently eliminateInstructions only returns true if any instruction got
replaced. In the test case for this patch, we eliminate the trivially
dead calls, for which eliminateInstructions does not do a replacement and the
function is not marked as changed, which is why the inliner crashes
while traversing the call graph.

Alternatively we could also change eliminateInstructions to return true
in case we mark instructions for deletion, but that's slightly more code
and doing it at the place where the replacement happens seems safer.

Fixes PR37517.

Reviewers: davide, mcrosier, efriedma, bjope

Reviewed By: bjope

Differential Revision: https://reviews.llvm.org/D51169

llvm-svn: 341651
2018-09-07 11:41:34 +00:00
Simon Pilgrim a2aef22a72 [X86][SSE] Add fadd/fsub(x, bitcast_fneg(y)) tests
Show missing vector support

llvm-svn: 341650
2018-09-07 11:24:43 +00:00
Tim Northover bb7d7b3d33 ARM: fix Thumb2 CodeGen for ldrex with folded frame-index.
Because t2LDREX (& t2STREX) were marked as AddrModeNone, but did allow a
FrameIndex operand, rewriteT2FrameIndex asserted. This gives them a
proper addressing-mode and tells the rewriter about it so that encodable
offsets are exploited and others are rejected.

Should fix PR38828.

llvm-svn: 341642
2018-09-07 09:21:25 +00:00
Alexander Potapenko 8fe99a0ef2 [MSan] Add KMSAN instrumentation to MSan pass
Introduce the -msan-kernel flag, which enables the kernel instrumentation.

The main differences between KMSAN and MSan instrumentations are:

- KMSAN implies msan-track-origins=2, msan-keep-going=true;
- there're no explicit accesses to shadow and origin memory.
  Shadow and origin values for a particular X-byte memory location are
  read and written via pointers returned by
  __msan_metadata_ptr_for_load_X(u8 *addr) and
  __msan_store_shadow_origin_X(u8 *addr, uptr shadow, uptr origin);
- TLS variables are stored in a single struct in per-task storage. A call
  to a function returning that struct is inserted into every instrumented
  function before the entry block;
- __msan_warning() takes a 32-bit origin parameter;
- local variables are poisoned with __msan_poison_alloca() upon function
  entry and unpoisoned with __msan_unpoison_alloca() before leaving the
  function;
- the pass doesn't declare any global variables or add global constructors
  to the translation unit.

llvm-svn: 341637
2018-09-07 09:10:30 +00:00
Alexander Timofeev a805c96c65 [AMDGPU] Preliminary patch for divergence driven instruction selection. Fold immediate SMRD offset.
Differential revision: https://reviews.llvm.org/D51610

Reviewer: rampitec
llvm-svn: 341636
2018-09-07 09:05:34 +00:00
Puyan Lotfi 99124cc082 [llvm-objcopy] Dwarf .debug section compression support (zlib, zlib-gnu).
Third Attempt:
    - Alignment issues resolved.
    - zlib::isAvailable() detected.
    - ArrayRef misuse fixed.

  Usage:

  llvm-objcopy --compress-debug-sections=zlib foo.o
  llvm-objcopy --compress-debug-sections=zlib-gnu foo.o

  In both cases the debug section contents are compressed with zlib. In the GNU
  style case the header is the "ZLIB" magic string followed by the uint64 big-
  endian decompressed size. In the non-GNU mode the header is the
  Elf(32|64)_Chdr.

  Decompression support is coming soon.
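
The GNU-style layout described above can be sketched with Python's standard library. This models only the header format (the "ZLIB" magic plus a big-endian uint64 size); the function names are hypothetical and this is not the llvm-objcopy implementation.

```python
import struct
import zlib

MAGIC = b"ZLIB"


def compress_gnu_style(data: bytes) -> bytes:
    """Prefix zlib-compressed data with "ZLIB" and the big-endian decompressed size."""
    return MAGIC + struct.pack(">Q", len(data)) + zlib.compress(data)


def decompress_gnu_style(blob: bytes) -> bytes:
    """Validate the 12-byte header and return the original section contents."""
    if len(blob) < 12 or blob[:4] != MAGIC:
        raise ValueError("not a GNU-style compressed section")
    (size,) = struct.unpack(">Q", blob[4:12])
    data = zlib.decompress(blob[12:])
    if len(data) != size:
        raise ValueError("decompressed size does not match header")
    return data


section = b"debug info " * 100
blob = compress_gnu_style(section)
print(len(section), len(blob))
```

The non-GNU mode stores the same information in an Elf(32|64)_Chdr structure instead of this ad hoc header.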

  Differential Revision: https://reviews.llvm.org/D49678

llvm-svn: 341635
2018-09-07 08:10:22 +00:00
QingShan Zhang abbb894ff5 [PowerPC] Combine ADD to ADDZE
On the ppc64le platform, if ir has the following form,

define i64 @addze1(i64 %x, i64 %z) local_unnamed_addr #0 {
entry:
  %cmp = icmp ne i64 %z, CONSTANT      (-32767 <= CONSTANT <= 32768)
  %conv1 = zext i1 %cmp to i64
  %add = add nsw i64 %conv1, %x
  ret i64 %add
}
we can optimize it to the form below.

                                when C == 0
                            --> addze X, (addic Z, -1))
                           /
add X, (zext(setne Z, C))--
                           \    when -32768 <= -C <= 32767 && C != 0
                            --> addze X, (addic (addi Z, -C), -1)
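
The diagram relies on addic Z, -1 producing a carry exactly when Z is non-zero. A small 64-bit model of that reasoning follows; the Python helpers are hypothetical stand-ins for the instructions, not generated code.

```python
MASK64 = (1 << 64) - 1


def addic(a: int, imm: int):
    """Add immediate carrying: return (64-bit result, carry out)."""
    total = (a & MASK64) + (imm & MASK64)
    return total & MASK64, total >> 64


def addze(x: int, carry: int) -> int:
    """Add the carry bit to x, wrapping to 64 bits."""
    return (x + carry) & MASK64


def add_setne_reference(x: int, z: int, c: int) -> int:
    """add X, (zext (setne Z, C)) computed directly."""
    return (x + (1 if (z & MASK64) != (c & MASK64) else 0)) & MASK64


def add_setne_addze(x: int, z: int, c: int) -> int:
    """The same value computed through addi / addic / addze."""
    t = z & MASK64 if c == 0 else (z - c) & MASK64  # addi Z, -C (skipped when C == 0)
    _, carry = addic(t, -1)                          # carry is 1 iff t != 0
    return addze(x, carry)


print(add_setne_addze(10, 5, 5), add_setne_addze(10, 6, 5))
```

Adding -1 (all ones) to any non-zero 64-bit value overflows, so the carry bit is exactly the `Z != C` result that the original code zero-extended.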

Patch By: HLJ2009 (Li Jia He)
Differential Revision: https://reviews.llvm.org/D51403
Reviewed By: Nemanjai 

llvm-svn: 341634
2018-09-07 07:56:05 +00:00
Max Kazantsev 9e6845d8e1 [IndVars] Set Changed when we delete dead instructions. PR38855
IndVars does not set the `Changed` flag when it eliminates dead instructions. As a result,
it may make IR modifications and report that it has done nothing. This leads to inconsistent
preserved analyses results.
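
The contract being fixed can be shown with a toy pass. This is a generic sketch: `remove_dead` and the list-of-strings "IR" are hypothetical and do not reflect the IndVars code.

```python
def remove_dead(instructions: list, is_dead) -> bool:
    """Delete dead instructions in place and report whether anything changed."""
    changed = False
    kept = []
    for inst in instructions:
        if is_dead(inst):
            # Any deletion is an IR modification, so it must be reported.
            changed = True
        else:
            kept.append(inst)
    instructions[:] = kept
    return changed


body = ["%a = add", "%dead = mul", "ret %a"]
changed = remove_dead(body, lambda inst: inst.startswith("%dead"))
print(changed, body)
```

A pass manager that trusts the returned flag will keep cached analyses when it is false, which is why an unreported deletion leaves those analyses stale.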

Differential Revision: https://reviews.llvm.org/D51770
Reviewed By: skatkov

llvm-svn: 341633
2018-09-07 07:23:39 +00:00
Craig Topper 30e129f256 [X86] Add more test cases for missed opportunities for using RMW form of ADC.
llvm-svn: 341630
2018-09-07 02:39:56 +00:00
Jordan Rupprecht 470f745275 [llvm-strip] -p test fix for windows buildbots
Windows ls prints dates as "1997-05-05" instead of "May 05 1997", so only check for a leading space.

llvm-svn: 341614
2018-09-07 00:28:54 +00:00
Puyan Lotfi 5be060e341 Revert: [llvm-objcopy] Dwarf .debug section compression (Second Attempt).
Various bots still fail for unknown reasons.

llvm-svn: 341613
2018-09-07 00:28:25 +00:00
Puyan Lotfi f0954dd275 [llvm-objcopy] Dwarf .debug section compression support (zlib, zlib-gnu).
Second Attempt. Alignment issues resolved. zlib::isAvailable() detected.

  Usage:

  llvm-objcopy --compress-debug-sections=zlib foo.o
  llvm-objcopy --compress-debug-sections=zlib-gnu foo.o

  In both cases the debug section contents are compressed with zlib. In the GNU
  style case the header is the "ZLIB" magic string followed by the uint64 big-
  endian decompressed size. In the non-GNU mode the header is the
  Elf(32|64)_Chdr.

  Decompression support is coming soon.

  Differential Revision: https://reviews.llvm.org/D49678

llvm-svn: 341607
2018-09-06 23:59:50 +00:00
Craig Topper 2c9dede9cb [X86] Add RMW ADC patterns with load in operand 1.
ADC is commutable and the load could be in either operand, but we were only checking operand 0.

Ideally we'd mark X86adc_flag as commutable and tablegen would automatically do this, but the EFLAGS register mention is preventing it.

llvm-svn: 341606
2018-09-06 23:55:36 +00:00
Craig Topper 37d68e4599 [X86] Add a test case showing failure to use the RMW form of ADC when the load is in operand 1 going into isel.
The ADC instruction is commutable, but we only have RMW isel patterns with a load on the left hand side. Nothing will canonicalize loads to the LHS on these ops. So we need two patterns.

llvm-svn: 341605
2018-09-06 23:55:34 +00:00
Jordan Rupprecht 29f1ce7dcc [llvm-strip] Fix -p test to check for explicit spaces around dates, to avoid false matches when the filename happens to contain 1995/1997.
llvm-svn: 341595
2018-09-06 22:34:48 +00:00
Eric Christopher fe83270ee9 The initial .text section generated in object files was missing the
SHF_ARM_PURECODE flag when being built with the -mexecute-only flag.
All code sections of an ELF must have the flag set for the final .text
section to be execute-only, otherwise the flag gets removed.

A HasData flag is added to MCSection to aid in the determination that
the section is empty. A virtual setTargetSectionFlags is added to
MCELFObjectTargetWriter to allow subclasses to set target specific
section flags to be added to sections which we then use in the ARM
backend to set SHF_ARM_PURECODE.

Patch by Ivan Lozano!

Reviewed By: echristo

Differential Revision: https://reviews.llvm.org/D48792

llvm-svn: 341593
2018-09-06 22:09:31 +00:00
Wei Mi 94d44c97bc [SampleFDO] Make sample profile loader unaware of compact format change.
The patch tries to make sample profile loader independent of profile format
change. It moves compact format related code into FunctionSamples and
SampleProfileReader classes, and sample profile loader only has to interact
with those two classes and will be unaware of profile format changes.

The cleanup also contains some fixes to further remove the difference between
compactbinary format and binary format. After the cleanup using different
formats originated from the same profile will generate the same binaries,
which we verified by compiling two large server benchmarks w/wo thinlto.

Differential Revision: https://reviews.llvm.org/D51643

llvm-svn: 341591
2018-09-06 22:03:37 +00:00
Scott Linder 834cbc645c Revert r341413
Causes a regression in expensive checks.

llvm-svn: 341589
2018-09-06 21:38:56 +00:00
Jessica Paquette a0aa5b35e7 Output per-function size-info remarks
This patch adds per-function size information remarks. Previously, passing
-Rpass-analysis=size-info would only give you per-module changes. By adding
the ability to do this per-function, it's easier to see which functions
contributed the most to size changes.

https://reviews.llvm.org/D51467

llvm-svn: 341588
2018-09-06 21:19:54 +00:00
Fangrui Song a373582169 Reland rL341509: "[llvm-dwp] Use buffer_stream if output file is not seekable (e.g. "-")"
It caused ambiguity between llvm::cl::Optional and llvm::Optional, which
has been fixed by dropping `using namespace cl;` in favor of explicit
cl:: qualified names.

llvm-svn: 341586
2018-09-06 20:26:54 +00:00
Tatyana Krasnukha b5f42976ad [ARC] Prevent InstPrinter from crashing on unknown condition codes.
Summary:
Instruction printer shouldn't crash with assertions due to incorrect input data. llvm_unreachable is not intended for runtime error handling.

Reviewers: petecoup

Reviewed By: petecoup

Differential Revision: https://reviews.llvm.org/D51728

llvm-svn: 341581
2018-09-06 19:58:26 +00:00
Sanjay Patel 9e5c163154 [x86] add tests for pow --> cbrt; NFC
llvm-svn: 341575
2018-09-06 18:42:55 +00:00
Martin Storsjo 1e8edd13ee [llvm-ar] Support * as comment char in MRI scripts
MRI scripts have two comment chars, * and ;, but only the latter was
supported before.

Also allow leading spaces before comment chars (and before any command
string), and allow comments after a command.
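
The comment rules described above can be sketched as a small line-cleaning helper. This is an illustration only; `strip_mri_comment` is a hypothetical name and the real parser may differ in details.

```python
def strip_mri_comment(line: str) -> str:
    """Drop everything from the first '*' or ';' and trim surrounding whitespace."""
    for marker in ("*", ";"):
        line = line.split(marker, 1)[0]
    return line.strip()


script = [
    "* build the archive",
    "   ; indented comment",
    "  create out.a",
    "addlib in.a * trailing comment",
    "save",
]
commands = [cmd for cmd in map(strip_mri_comment, script) if cmd]
print(commands)
```

Lines that become empty after stripping are skipped, so pure-comment lines and indented commands are handled by the same path.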

Differential Revision: https://reviews.llvm.org/D51338

llvm-svn: 341571
2018-09-06 18:10:45 +00:00
Michael Berg 1b34b01a8e [NFC] - in preparation for adding nsw, nuw and exact as flags to MI
llvm-svn: 341565
2018-09-06 17:07:29 +00:00
Sanjay Patel 93bd15a005 [InstCombine] add xor+not folds
This fold is needed to avoid a regression when we try
to recommit rL300977. 
We can't see the most basic win currently because 
demanded bits changes the patterns:
https://rise4fun.com/Alive/plpp

llvm-svn: 341559
2018-09-06 16:23:40 +00:00
JF Bastien 2920061105 ARM64: improve non-zero memset isel by ~2x
Summary:
I added a few ARM64 memset codegen tests in r341406 and r341493, and annotated
where the generated code was bad. This patch fixes the majority of the issues by
requesting that a 2xi64 vector be used for memset of 32 bytes and above.

The patch leaves the former request for f128 unchanged, despite f128
materialization being suboptimal: doing otherwise runs into other asserts in
isel and makes this patch too broad.

This patch hides the issue that was present in bzero_40_stack and bzero_72_stack
because the code now generates in a better order which doesn't have the store
offset issue. I'm not aware of that issue appearing elsewhere at the moment.

<rdar://problem/44157755>

Reviewers: t.p.northover, MatzeB, javed.absar

Subscribers: eraman, kristof.beyls, chrib, dexonsmith, llvm-commits

Differential Revision: https://reviews.llvm.org/D51706

llvm-svn: 341558
2018-09-06 16:03:32 +00:00
Sanjay Patel 99d732052f [InstCombine] add tests for xor-not; NFC
These tests demonstrate a missing fold that would
also be needed to avoid a regression when we try 
to recommit rL300977.

llvm-svn: 341557
2018-09-06 15:35:01 +00:00
Alexander Potapenko 7f270fcf0a [MSan] store origins for variadic function parameters in __msan_va_arg_origin_tls
Add the __msan_va_arg_origin_tls TLS array to keep the origins for variadic function parameters.
Change the instrumentation pass to store parameter origins in this array.

This is a reland of r341528.

test/msan/vararg.cc doesn't work on Mips, PPC and AArch64 (because this
patch doesn't touch them), XFAIL these arches.
Also turned out Clang crashed on i80 vararg arguments because of
incorrect origin type returned by getOriginPtrForVAArgument() - fixed it
and added a test.

llvm-svn: 341554
2018-09-06 15:14:36 +00:00
Alex Bradbury fea4ac01c5 [RISCV][NFC] Rework test/MC/RISCV/rv{32,64}* to allow testing of symbol operands
Standardise on check lines:
* CHECK-ASM
* CHECK-OBJ
* CHECK-ASM-AND-OBJ

This allows for the addition of tests involving symbol operands, which will
not result in identical instructions in both assembly and disassembled object 
output.

This commit doesn't exploit this reworking to increase test coverage of symbol
operands - that will come in a future patch.

llvm-svn: 341546
2018-09-06 13:41:04 +00:00
Alexander Potapenko ac6595bd53 [MSan] revert r341528 to unbreak the bots
llvm-svn: 341541
2018-09-06 12:19:27 +00:00
Alexander Potapenko 1a10ae0def [MSan] store origins for variadic function parameters in __msan_va_arg_origin_tls
Add the __msan_va_arg_origin_tls TLS array to keep the origins for
variadic function parameters.
Change the instrumentation pass to store parameter origins in this array.

llvm-svn: 341528
2018-09-06 08:50:11 +00:00
David Green e6918ca2b3 [SLC] Add an alignment to CreateGlobalString
Previously the alignment on the newly created global strings was not set,
meaning that DataLayout::getPreferredAlignment was free to overalign it
to 16 bytes. This caused unnecessary code bloat with the padding between
variables.

The main example of this happening was the printf->puts optimisation in
SimplifyLibCalls, but as the change here is made in
IRBuilderBase::CreateGlobalString, other globals using this will now be
aligned too.

Differential Revision: https://reviews.llvm.org/D51410

llvm-svn: 341527
2018-09-06 08:42:17 +00:00
Alexander Potapenko d518c5fc87 [MSan] Make sure variadic function arguments do not overflow __msan_va_arg_tls
Turns out that calling a variadic function with too many (e.g. >100 i64's)
arguments overflows __msan_va_arg_tls, which leads to smashing other TLS
data with function argument shadow values.

getShadow() already checks for kParamTLSSize and returns clean shadow if
the argument does not fit, so just skip storing argument shadow for such
arguments.

llvm-svn: 341525
2018-09-06 08:21:54 +00:00
Max Kazantsev eb410f79b3 Revert rL341509 to fix massive failures on buildbots
llvm-svn: 341515
2018-09-06 04:40:49 +00:00
Hsiangkai Wang 760c1ab199 [DebugInfo] Do not generate label debug info if it has been processed.
In DwarfDebug::collectEntityInfo(), if the label entity is processed in
DbgLabels list, it means the label is not optimized out. There is no
need to generate debug info for it with null position.

llvm-svn: 341513
2018-09-06 02:22:06 +00:00
Craig Topper 5a53760f65 [X86][Assembler] Allow %eip as a register in 32-bit mode for .cfi directives.
This basically reverts a change made in r336217, but improves the text of the error message for not allowing IP-relative addressing in 32-bit mode.

Fixes PR38826.

Patch by Iain Sandoe.

llvm-svn: 341512
2018-09-06 02:03:14 +00:00
Fangrui Song 26f23f8c25 [llvm-dwp] Fix `UN:` lines (supposed to be `RUN:`) in X86/simple.test and adjust check lines for TYPES:
Reviewers: dblaikie, aprantl

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D51704

llvm-svn: 341510
2018-09-06 00:46:30 +00:00
Fangrui Song 57575e11d1 [llvm-dwp] Use buffer_stream if output file is not seekable (e.g. "-")
Reviewers: dblaikie, pcc

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D51707

llvm-svn: 341509
2018-09-06 00:06:25 +00:00
JF Bastien ec812ce3d6 NFC: more memset inline arm64 coverage
I'm looking at some codegen optimization in this area and want to make sure I understand the current codegen and don't regress it. This patch further expands the tests (which I already expanded in r341406) to capture more of the current code generation when it comes to stack-based small non-zero memset on arm64. This patch annotates some potential fixes.

llvm-svn: 341493
2018-09-05 20:35:06 +00:00
Sanjay Patel dbf52837fe [DAGCombiner] try to convert pow(x, 0.25) to sqrt(sqrt(x))
This was proposed as an IR transform in D49306, but it was not clearly justifiable as a canonicalization. 
Here, we only do the transform when the target tells us that sqrt can be lowered with inline code.

This is the basic case. Some potential enhancements are in the TODO comments:

1. Generalize the transform for other exponents (allow more than 2 sqrt calcs if that's really cheaper).
2. If we have fewer fast-math-flags, generate code to avoid -0.0 and/or INF.
3. Allow the transform when optimizing/minimizing size (might require a target hook to get that right).

Note that by default, x86 converts single-precision sqrt calcs into sqrt reciprocal estimate with 
refinement. That codegen is controlled by CPU attributes and can be manually overridden. We have plenty 
of test coverage for that already, so I didn't bother to include extra testing for that here. AArch64 uses 
its full-precision ops in all cases (not sure if that's the intended behavior or not, but that should 
also be covered by existing tests).
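
As a rough illustration of the identity behind this transform (not the DAGCombiner code itself), the sketch below compares the two forms on ordinary positive inputs. The helper names `powQuarterViaSqrt` and `agreesWithPow` are hypothetical and exist only for this example.

```cpp
#include <cmath>

// Hypothetical helper: computes x^(1/4) as two nested square roots.
// For inputs such as -0.0 or -inf the result differs from pow(x, 0.25)
// (sign of zero, NaN instead of +inf), which is consistent with the TODO
// above about needing fast-math-flags.
double powQuarterViaSqrt(double x) {
  return std::sqrt(std::sqrt(x));
}

// Returns true when both forms agree within a small relative tolerance.
bool agreesWithPow(double x) {
  double viaPow = std::pow(x, 0.25);
  double viaSqrt = powQuarterViaSqrt(x);
  return std::fabs(viaPow - viaSqrt) <= 1e-12 * std::fabs(viaPow);
}
```

Calling `agreesWithPow` on values such as 2.0 or 16.0 is one way to sanity-check the equivalence for the basic case.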

Differential Revision: https://reviews.llvm.org/D51630 

llvm-svn: 341481
2018-09-05 17:01:56 +00:00
Jordan Rupprecht 591d889006 [llvm-strip] Support stripping multiple input files
Summary:
Allow strip to be called on multiple input files, which is interpreted as stripping N files in place. Using multiple input files is incompatible with -o.

To allow this, create a `DriverConfig` struct which just wraps a list of `CopyConfigs`. objcopy will only ever have a single `CopyConfig`, but strip will have N (where N >= 1) CopyConfigs.

Reviewers: alexshap, jakehehrlich

Reviewed By: alexshap, jakehehrlich

Subscribers: MaskRay, jakehehrlich, llvm-commits

Differential Revision: https://reviews.llvm.org/D51660

llvm-svn: 341464
2018-09-05 13:10:03 +00:00
Jonas Devlieghere 965b598b2a [DebugInfo] Normalize common kinds of DWARF sub-expressions.
Normalize common kinds of DWARF sub-expressions to make debug info
encoding a bit more compact:

  DW_OP_constu [X < 32] -> DW_OP_litX
  DW_OP_constu [all ones] -> DW_OP_lit0, DW_OP_not (64-bit only)
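
The two rewrites above can be sketched as a small encoder. This is an illustration under assumptions, not the code from the patch: `encodeConstant` is a hypothetical name, the opcode values are taken from the DWARF specification, and the fallback path emits a plain ULEB128 operand.

```cpp
#include <cstdint>
#include <vector>

// Opcode values as defined by the DWARF specification.
constexpr std::uint8_t DW_OP_constu = 0x10;
constexpr std::uint8_t DW_OP_not = 0x20;
constexpr std::uint8_t DW_OP_lit0 = 0x30;

// Hypothetical helper: returns the expression bytes that push `value`.
std::vector<std::uint8_t> encodeConstant(std::uint64_t value) {
  if (value < 32) // DW_OP_litX: a single byte, no operand
    return {static_cast<std::uint8_t>(DW_OP_lit0 + value)};
  if (value == UINT64_MAX) // all ones: push 0, then bitwise-not it
    return {DW_OP_lit0, DW_OP_not};
  // Fallback: DW_OP_constu followed by the ULEB128 encoding of value.
  std::vector<std::uint8_t> bytes{DW_OP_constu};
  do {
    std::uint8_t byte = value & 0x7f;
    value >>= 7;
    if (value != 0)
      byte |= 0x80; // more bytes follow
    bytes.push_back(byte);
  } while (value != 0);
  return bytes;
}
```

An all-ones 64-bit constant would otherwise need ten ULEB128 bytes after the opcode, so the two-byte form is where most of the saving comes from.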

Differential revision: https://reviews.llvm.org/D51640

llvm-svn: 341457
2018-09-05 10:18:36 +00:00
Max Kazantsev e157cea3ec [NFC] Add test on full IV widening
llvm-svn: 341456
2018-09-05 10:10:59 +00:00
Hsiangkai Wang b2b7f5f6d7 [DebugInfo] Fix bug in LiveDebugVariables.
In lib/CodeGen/LiveDebugVariables.cpp, it uses std::prev(MBBI) to
get DebugValue's SlotIndex. However, the previous instruction may be
also a debug instruction. It could not use a debug instruction to query
SlotIndex in mi2iMap.

Scan all debug instructions and use the first debug instruction to query
SlotIndex for following debug instructions. Only handle DBG_VALUE in
handleDebugValue().

Differential Revision: https://reviews.llvm.org/D50621

llvm-svn: 341446
2018-09-05 05:58:53 +00:00
Sanjay Patel 63cf26cf01 [InstCombine] fix xor-or-xor fold to check uses and handle commutes
I'm probably missing some way to use m_Deferred to remove the code
duplication, but that can be a follow-up.

The improvement in demand_shrink_nsw.ll is an example of missing
the fold because the pattern matching was deficient. I didn't try
to follow the bits in that test, but Alive says it's correct:
https://rise4fun.com/Alive/ugc

llvm-svn: 341426
2018-09-04 23:22:13 +00:00
Sanjay Patel 018ce562a9 [InstCombine] update tests checks; NFC
llvm-svn: 341424
2018-09-04 23:08:23 +00:00
Jordan Rupprecht ec277a8278 [llvm-strip] Allow copying relocation sections without symbol tables.
Summary:
Fixes the error "Link field value 0 in section .rela.plt is invalid" when copying/stripping certain binaries. Minimal repro:

```
$ cat /tmp/a.c
int main() { return 0; }
$ clang -static /tmp/a.c -o /tmp/a
$ llvm-strip /tmp/a -o /tmp/b
llvm-strip: error: Link field value 0 in section .rela.plt is invalid.
```

Reviewers: jakehehrlich, alexshap

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D51493

llvm-svn: 341419
2018-09-04 22:28:49 +00:00
Zhaoshi Zheng a0aa41d793 Revert "Revert r341269: [Constant Hoisting] Hoisting Constant GEP Expressions"
Reland r341269. Use std::stable_sort when sorting constant candidates.

Reverting commit, r341365:

  Revert r341269: [Constant Hoisting] Hoisting Constant GEP Expressions

  One of the tests is failing 50% of the time when expensive checks are
  enabled. Not sure how deep the problem is so just reverting while the
  author can investigate so that the bots stop repeatedly failing and
  blaming things incorrectly. Will respond with details on the original
  commit.

Original commit, r341269:

  [Constant Hoisting] Hoisting Constant GEP Expressions

  Leverage existing logic in constant hoisting pass to transform constant GEP
  expressions sharing the same base global variable. Multi-dimensional GEPs are
  rewritten into single-dimensional GEPs.

  https://reviews.llvm.org/D51396

Differential Revision: https://reviews.llvm.org/D51654

llvm-svn: 341417
2018-09-04 22:17:03 +00:00
Anna Thomas dbacea188b [LV] First order recurrence phis should not be treated as uniform
This is a fix for PR38786.
First order recurrence phis were incorrectly treated as uniform,
which caused them to be vectorized as uniform instructions.

Patch by Ayal Zaks and Orivej Desh!

Reviewed by: Anna

Differential Revision: https://reviews.llvm.org/D51639

llvm-svn: 341416
2018-09-04 22:12:23 +00:00
Sanjay Patel 5bbe8cd7ef [InstCombine] add tests for xor-or-xor fold; NFC
There are 2 bugs shown here that were untested before:
1. We fail to perform the fold in 1/2 the possible commuted variants.
2. When the fold is done, it disregards extra uses.

llvm-svn: 341415
2018-09-04 22:10:23 +00:00
Thomas Lively cfab8b4b76 [WebAssembly][NFC] Add colon to label in test
llvm-svn: 341414
2018-09-04 21:51:32 +00:00
Scott Linder dfe089dfd1 [AMDGPU] Legalize VGPR Rsrc operands for MUBUF instructions
Emit a waterfall loop in the general case for a potentially-divergent Rsrc
operand. When practical, avoid this by using Addr64 instructions.

Differential Revision: https://reviews.llvm.org/D50982

llvm-svn: 341413
2018-09-04 21:50:47 +00:00
Thomas Lively 1b55b2be7e [WebAssembly][NFC] Fix formatting and tests
Summary: Small fixes

Reviewers: aheejin, dschuff

Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits

Differential Revision: https://reviews.llvm.org/D51656

llvm-svn: 341411
2018-09-04 21:26:17 +00:00
Sanjay Patel 0f70f86ce0 [InstCombine] make ((X & C) ^ C) form consistent for vectors
It would be better to create a 'not' here, but that's not possible yet.
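
For context, the 'not' form mentioned here relies on the bitwise identity `(X & C) ^ C == ~X & C`. The sketch below checks it exhaustively for 8-bit values; the function name is hypothetical and the check says nothing about how InstCombine represents the pattern.

```cpp
#include <cstdint>

// Exhaustively verifies (X & C) ^ C == ~X & C for all 8-bit X and C.
bool xorMaskEqualsNotAnd() {
  for (unsigned x = 0; x < 256; ++x) {
    for (unsigned c = 0; c < 256; ++c) {
      unsigned xorForm = (x & c) ^ c; // form named in the commit title
      unsigned notForm = ~x & c;      // 'not' followed by 'and'
      if (xorForm != notForm)
        return false;
    }
  }
  return true;
}
```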

llvm-svn: 341410
2018-09-04 21:17:14 +00:00
Krzysztof Parzyszek f4ad2cb24f [Hexagon] Don't packetize new-value stores with any other stores
llvm-svn: 341409
2018-09-04 21:07:27 +00:00
JF Bastien fd458fe205 NFC: expand memset inline arm64 coverage
I'm looking at some codegen optimization in this area and want to make sure I understand the current codegen and don't regress it. This patch simply expands the two existing tests to capture more of the current code generation when it comes to heap-based and stack-based small memset on arm64. The tested code is already pretty good, notably when it comes to using STP, FP stores, FP immediate generation, and folding one of the stores into a stack spill when possible. The uses of STUR could be improved, and some more pairing could occur. Straying from bzero patterns currently yields suboptimal code, and I expect a variety of small changes could make things way better.

llvm-svn: 341406
2018-09-04 21:02:00 +00:00
Martin Storsjo fed420d6b6 [MinGW] [AArch64] Add stubs for potential automatic dllimported variables
The runtime pseudo relocations can't handle the AArch64 format PC
relative addressing in adrp+add/ldr pairs. By using stubs, the potentially
dllimported addresses can be touched up by the runtime pseudo relocation
framework.

Differential Revision: https://reviews.llvm.org/D51452

llvm-svn: 341401
2018-09-04 20:56:21 +00:00
Fedor Sergeev 8b6effd969 [SimpleLoopUnswitch] remove a chain of dead blocks at once
Recent change to deleteDeadBlocksFromLoop was not enough to
fix all the problems related to dead blocks after nontrivial
unswitching of switches.

We need to delete all the dead blocks that were created during
unswitching, otherwise we will keep having problems with phi's
or dead blocks.

This change removes all the dead blocks that are reachable from the loop,
not trying to track whether these blocks are newly created by unswitching
or not. While not completely correct, we are unlikely to get loose but
reachable dead blocks that do not belong to our loop nest.

It does fix all the failures currently known, in particular PR38778.

Reviewed By: asbirlea
Differential Revision: https://reviews.llvm.org/D51519

llvm-svn: 341398
2018-09-04 20:19:41 +00:00
Sanjay Patel 664b2e3bd6 [InstCombine] improve xor+and/or tests
The tests attempted to check for commuted variants
of these folds, but complexity-based canonicalization
meant we had no coverage for at least 1/2 of the cases.

Also, the folds correctly check hasOneUse(), but there
was no coverage for that.

llvm-svn: 341394
2018-09-04 19:06:46 +00:00
Matt Arsenault 813613c494 AMDGPU: Fix DAG divergence not reporting flat loads
Match behavior in DAG of r340343

llvm-svn: 341393
2018-09-04 18:58:19 +00:00
Dan Gohman 045a217bee [WebAssembly] Fix operand rewriting in inline asm lowering.
Use MachineOperand::ChangeToImmediate rather than reassigning
MachineOperands to new values created from MachineOperand::CreateImm,
so that their parent pointers are preserved.

This fixes "Instruction has operand with wrong parent set" errors
reported by the MachineVerifier.

llvm-svn: 341389
2018-09-04 17:46:12 +00:00
Hiroshi Yamauchi 9775a620b0 [PGO] Control Height Reduction
Summary:
Control height reduction merges conditional blocks of code and reduces the
number of conditional branches in the hot path based on profiles.

if (hot_cond1) { // Likely true.
  do_stg_hot1();
}
if (hot_cond2) { // Likely true.
  do_stg_hot2();
}

->

if (hot_cond1 && hot_cond2) { // Hot path.
  do_stg_hot1();
  do_stg_hot2();
} else { // Cold path.
  if (hot_cond1) {
    do_stg_hot1();
  }
  if (hot_cond2) {
    do_stg_hot2();
  }
}

This speeds up some internal benchmarks up to ~30%.

Reviewers: davidxl

Reviewed By: davidxl

Subscribers: xbolva00, dmgreen, mehdi_amini, llvm-commits, mgorny

Differential Revision: https://reviews.llvm.org/D50591

llvm-svn: 341386
2018-09-04 17:19:13 +00:00
Francis Visoiu Mistrih 2d3f01c5dc [MachO] Fix inconsistency between error messages when validating LC_DYSYMTAB
llvm-svn: 341379
2018-09-04 16:31:53 +00:00
Francis Visoiu Mistrih 7690af4da9 [MachO] Fix LC_DYSYMTAB validation for external symbols
We were validating the same index (ilocalsym) twice, while iextdefsym
was never validated.

llvm-svn: 341378
2018-09-04 16:31:48 +00:00
Jonas Devlieghere 881452384a [dwarfdump] Improve -diff option by hiding more data.
The -diff option makes it easy to diff dwarf by hiding addresses and
offsets. However not all of them were hidden, which should be fixed by
this patch.

Differential revision: https://reviews.llvm.org/D51593

llvm-svn: 341377
2018-09-04 16:21:37 +00:00
Chandler Carruth 6cb12444cc Revert r341269: [Constant Hoisting] Hoisting Constant GEP Expressions
One of the tests is failing 50% of the time when expensive checks are
enabled. Not sure how deep the problem is so just reverting while the
author can investigate so that the bots stop repeatedly failing and
blaming things incorrectly. Will respond with details on the original
commit.

llvm-svn: 341365
2018-09-04 13:36:44 +00:00
Chandler Carruth 664aa868f5 [x86/SLH] Add a real Clang flag and LLVM IR attribute for Speculative
Load Hardening.

Wires up the existing pass to work with a proper IR attribute rather
than just a hidden/internal flag. The internal flag continues to work
for now, but I'll likely remove it soon.

Most of the churn here is adding the IR attribute. I talked about this
with Kristof Beyls and he seemed at least initially OK with this direction.
The idea of using a full attribute here is that we *do* expect at least
some forms of this for other architectures. There isn't anything
*inherently* x86-specific about this technique, just that we only have
an implementation for x86 at the moment.

While we could potentially expose this as a Clang-level attribute as
well, that seems like a good question to defer for the moment as it
isn't 100% clear whether that or some other programmer interface (or
both?) would be best. We'll defer the programmer interface side of this
for now, but at least get to the point where the feature can be enabled
without relying on implementation details.

This also allows us to do something that was really hard before: we can
enable *just* the indirect call retpolines when using SLH. For x86, we
don't have any other way to mitigate indirect calls. Other architectures
may take a different approach of course, and none of this is surfaced to
user-level flags.

Differential Revision: https://reviews.llvm.org/D51157

llvm-svn: 341363
2018-09-04 12:38:00 +00:00
Chandler Carruth 163222f569 Revert r341342: Dwarf .debug section compression support (zlib, zlib-gnu).
Also reverts follow-up commits r341343 and r341344.

The primary commit continues to break some build bots even after the
fixes in r341343 for UBSan issues:
http://lab.llvm.org:8011/builders/clang-cmake-aarch64-full/builds/5823

It is also failing for me locally (linux, x86-64).

llvm-svn: 341360
2018-09-04 11:55:57 +00:00
Chandler Carruth 219888d1b2 [x86/SLH] Teach SLH to harden against the "ret2spec" attack by
implementing the proposed mitigation technique described in the original
design document.

The idea is to check after calls that the return address used to arrive
at that location is in fact the correct address. In the event of
a mis-predicted return which reaches a *valid* return but not the
*correct* return, this will detect the mismatch much like it would
a mispredicted conditional branch.

This is the last published attack vector that I am aware of in the
Spectre v1 space which is not mitigated by SLH+retpolines. However,
don't read *too* much into that: this is an area of ongoing research
where we expect more issues to be discovered in the future, and it also
makes no attempt to mitigate Spectre v4. Still, this is an important
completeness bar for SLH.

The change here is of course delightfully simple. It was predicated on
cutting support for post-instruction symbols into LLVM which was not at
all simple. Many thanks to Hal Finkel, Reid Kleckner, and Justin Bogner
who helped me figure out how to do a bunch of the complex changes
involved there.

Differential Revision: https://reviews.llvm.org/D50837

llvm-svn: 341358
2018-09-04 10:59:10 +00:00
Chandler Carruth 8d8489f513 [x86/SLH] Teach SLH to harden indirect branches and switches without
retpolines.

This implements the core design of tracing the intended target into the
target, checking it, and using that to update the predicate state. It
takes advantage of a few interesting aspects of SLH to make it a bit
easier to implement:
- We already split critical edges with conditional branches, so we can
assume those are gone.
- We already unfolded any memory access in the indirect branch
instruction itself.

I've left hard errors in place to catch if any of these somewhat subtle
invariants get violated.

There is some code that I can factor out and share with D50837 when it
lands, but I didn't want to couple landing the two patches, so I'll do
that in a follow-up cleanup commit if alright.

Factoring out the code to handle different scenarios of materializing an
address remains frustratingly hard. In a bunch of cases you want to fold
one of the cases into an immediate operand of some other instruction,
and you also have both symbols and basic blocks being used which require
different methods on the MI builder (and different operand kinds).
Still, I'll take a stab at sharing at least some of this code in
a follow-up if I can figure out how.

Differential Revision: https://reviews.llvm.org/D51083

llvm-svn: 341356
2018-09-04 10:44:21 +00:00
Nicola Zaghen 9588ad9611 [InstCombine] Fold icmp ugt/ult (add nuw X, C2), C --> icmp ugt/ult X, (C - C2)
Support for sgt/slt was added in rL294898, this adds the same cases also for unsigned compares.

This is the Alive proof: https://rise4fun.com/Alive/nyY
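
Independently of the Alive proof, the fold can be sanity-checked by brute force on 8-bit values. This is a sketch under two stated assumptions: the add does not wrap (the `nuw` flag), and `C - C2` does not wrap either; the latter condition is assumed for this example and is not stated in the message above. The function name is hypothetical.

```cpp
#include <cstdint>

// Checks, for every 8-bit X, C2 and C that satisfy the assumptions, that
//   (X + C2) >u C  ==  X >u (C - C2)   and
//   (X + C2) <u C  ==  X <u (C - C2).
bool unsignedAddCmpFoldHoldsForI8() {
  for (unsigned x = 0; x < 256; ++x) {
    for (unsigned c2 = 0; c2 < 256; ++c2) {
      if (x + c2 > 255)
        continue; // the 8-bit add would wrap, so 'nuw' would not hold
      for (unsigned c = 0; c < 256; ++c) {
        if (c < c2)
          continue; // C - C2 would wrap (assumption of this sketch)
        if (((x + c2) > c) != (x > (c - c2)))
          return false;
        if (((x + c2) < c) != (x < (c - c2)))
          return false;
      }
    }
  }
  return true;
}
```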

Differential Revision: https://reviews.llvm.org/D50972

llvm-svn: 341353
2018-09-04 10:29:48 +00:00
Fedor Sergeev 961811f3e1 [NFC] correcting patterns in time-passes test to fix buildbot
llvm-svn: 341348
2018-09-04 08:21:37 +00:00
Fedor Sergeev f2d4372e0e [PassTiming] reporting time-passes separately for multiple pass instances of the same pass
Summary:
The refactoring done by rL340872 accidentally turned out to be non-NFC, changing how
multiple instances of the same pass are handled: aggregation of results by PassName
forced data for multiple instances to be merged together and reported as one line.

Getting back to creating/reporting timers per pass instance.
Reporting was a bit enhanced by counting pass instances and adding #<num> suffix
to the pass description. Note that it is instances that are being counted,
not invocations of them.

time-passes test updated to account for multiple passes being run.

Reviewers: paquette, jhenderson, MatzeB, skatkov

Reviewed By: skatkov

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D51535

llvm-svn: 341346
2018-09-04 06:12:28 +00:00
Max Kazantsev 2cbba56337 [IndVars] Fix usage of SCEVExpander to not mess with SCEVConstant. PR38674
This patch removes the function `expandSCEVIfNeeded`, which does not behave as
intended. This function tries to make a lookup for exact existing expansion
and only goes to normal expansion via `expandCodeFor` if this lookup hasn't found
anything. As a result of this, if some instruction above the loop has a `SCEVConstant`
SCEV, this logic will return this instruction when asked for this `SCEVConstant` rather
than return a constant value. This is both non-profitable and in some cases leads to
breach of LCSSA form (as in PR38674).

Whether or not it is possible to break LCSSA with this algorithm and with some
non-constant SCEVs is still in question, this is still being investigated. I wasn't
able to construct such a test so far, so maybe this situation is impossible. If it is,
it will go as a separate fix.

Rather than do it, it is always correct to just invoke `expandCodeFor` unconditionally:
it behaves smarter about insertion points, and as side effect of this it will choose a
constant value for SCEVConstants. For other SCEVs it may end up finding a better insertion
point. So it should not be worse in any case.

NOTE: So far the only known case for which this transform may break LCSSA is mapping
of SCEVConstant to an instruction. However there is a suspicion that the entire algorithm
can compromise LCSSA form for other cases as well (yet not proved).

Differential Revision: https://reviews.llvm.org/D51286
Reviewed By: etherzhhb

llvm-svn: 341345
2018-09-04 05:01:35 +00:00
Puyan Lotfi 5a40cd5b50 [llvm-objcopy] Dwarf .debug section compression support (zlib, zlib-gnu).
Usage:

  llvm-objcopy --compress-debug-sections=zlib foo.o
  llvm-objcopy --compress-debug-sections=zlib-gnu foo.o

  In both cases the debug section contents are compressed with zlib. In the GNU
  style case the header is the "ZLIB" magic string followed by the uint64 big-
  endian decompressed size. In the non-GNU mode the header is the
  Elf(32|64)_Chdr.
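
  The GNU-style header layout described above (4 magic bytes plus an 8-byte
  big-endian size) can be sketched as follows. `makeGnuZlibHeader` is a
  hypothetical helper for illustration only, not a function from llvm-objcopy.

```cpp
#include <array>
#include <cstdint>

// Hypothetical helper: builds the 12-byte header that precedes the
// zlib-compressed payload in the GNU-style format.
std::array<std::uint8_t, 12> makeGnuZlibHeader(std::uint64_t decompressedSize) {
  std::array<std::uint8_t, 12> header{'Z', 'L', 'I', 'B'};
  for (int i = 0; i < 8; ++i) {
    // Big-endian: most significant byte of the size comes first.
    header[4 + i] =
        static_cast<std::uint8_t>(decompressedSize >> (8 * (7 - i)));
  }
  return header;
}
```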

  Decompression support is coming soon.


  Differential Revision: https://reviews.llvm.org/D49678

llvm-svn: 341342
2018-09-03 22:25:56 +00:00
Sanjay Patel 0945959869 [AArch64][x86] add tests for pow(x, 0.25); NFC
Folds for this were proposed in D49306, but we
decided the transform is better suited for the backend.

llvm-svn: 341341
2018-09-03 22:11:47 +00:00
Simon Atanasyan 4d13cb0a8a [mips] Disable the selection of mixed microMIPS/MIPS code
This patch modifies hasStandardEncoding() / inMicroMipsMode() /
inMips16Mode() methods of the MipsSubtarget class so only one can be
true at any one time. That prevents the selection of microMIPS and MIPS
instructions and patterns that are defined in TableGen files at the same
time. A few new patterns and instruction definitions have been added to
keep test cases passing.

Differential revision: https://reviews.llvm.org/D51483

llvm-svn: 341338
2018-09-03 20:48:55 +00:00
Sanjay Patel d75064e6d5 [InstCombine] allow add+not --> sub for arbitrary vector constants.
llvm-svn: 341335
2018-09-03 18:21:59 +00:00
Sanjay Patel faa02b1abb [InstCombine] consolidate tests for ~(X+C); NFC
llvm-svn: 341332
2018-09-03 18:04:21 +00:00
Sid Manning 220f288720 Revert [Hexagon] Add support for getRegisterByName.
Support required to build the Hexagon Linux kernel.

llvm-svn: 341331
2018-09-03 17:59:10 +00:00
Florian Hahn cc9dc599ba [SLC] Support expanding pow(x, n+0.5) to x * x * ... * sqrt(x)
Reviewers: evandro, efriedma, spatel

Reviewed By: spatel

Differential Revision: https://reviews.llvm.org/D51435

llvm-svn: 341330
2018-09-03 17:37:39 +00:00
Andrea Di Biagio fb3d9e1449 [X86] Remove wrong ReadAdvance from multiclass sse_fp_unop_s.
A ReadAdvance was incorrectly added to the SchedReadWrite list associated with
the following SSE instructions:

sqrtss
sqrtsd
rsqrtss
rcpss

As a consequence, a wrong operand latency was computed for the register operand
used as the base address of the folded load operand.

This patch removes the wrong ReadAdvance, and updates the llvm-mca test cases.
There is still a problem with correctly modeling partial register writes on XMM
registers. This other problem is currently tracked here:
https://bugs.llvm.org/show_bug.cgi?id=38813

Differential Revision: https://reviews.llvm.org/D51542

llvm-svn: 341326
2018-09-03 16:47:34 +00:00
Matt Arsenault ca25b58957 DAG: Handle extract_vector_elt in isKnownNeverNaN
llvm-svn: 341317
2018-09-03 14:01:03 +00:00
Jonas Devlieghere 6e5c7e6037 [DebugInfo] Have the verifier accept missing linkage names.
According to the standard, for the .debug_names (the "dwarf accelerator
tables"):

> If a subprogram or inlined subroutine is included, and has a
> DW_AT_linkage_name attribute, there will be an additional index entry
> for the linkage name.

For Swift we generate DW_structure_types with a linkage name and the
verifier was incorrectly rejecting this. This patch fixes that by only
considering the linkage name in those particular cases. The test is the
"reduced" debug info of the failing swift test on swift.org.

Differential revision: https://reviews.llvm.org/D51420

llvm-svn: 341311
2018-09-03 12:12:17 +00:00
Daniel Cederman e9e38c207e [Sparc] allow tls_add/tls_call syntax in assembler parser
Summary: Removing unneeded isCodeGenOnly from tls-specific
instructions - TLS_ADD/TLS_LD/TLS_LDX/TLS_CALL.

Author: fedor.sergeev

Reviewers: jyknight, fedor.sergeev

Reviewed By: jyknight

Subscribers: dcederman, brad, llvm-commits

Differential Revision: https://reviews.llvm.org/D36463

llvm-svn: 341308
2018-09-03 10:38:12 +00:00
Carlos Alberto Enciso eaf2c1f449 Test commit.
Revert change done in r341297. NFC.

Differential Revision: https://reviews.llvm.org/D51583

llvm-svn: 341302
2018-09-03 09:41:43 +00:00
Sander de Smalen 6cab60fa06 Extend hasStoreToStackSlot with list of FI accesses.
For instructions that spill/fill to and from multiple frame-indices
in a single instruction, hasStoreToStackSlot and hasLoadFromStackSlot
should return an array of accesses, rather than just the first encounter
of such an access.

This better describes FI accesses for AArch64 (paired) LDP/STP
instructions.

Reviewers: t.p.northover, gberry, thegameg, rengolin, javed.absar, MatzeB

Reviewed By: MatzeB

Differential Revision: https://reviews.llvm.org/D51537

llvm-svn: 341301
2018-09-03 09:15:58 +00:00
Carlos Alberto Enciso f03e049234 Test commit - adding a new line.
llvm-svn: 341297
2018-09-03 08:26:37 +00:00
QingShan Zhang c2b6c547dc [PowerPC] Add Itineraries of IIC_IntRotateDI for P7/P8
When doing some instruction scheduling work, we noticed some missing itineraries.
Before we switch to the machine scheduler, those missing itineraries might not have an impact on actual scheduling,
because we can still get the same latency due to default values.

With the machine scheduler, however, itineraries will have an impact on scheduling.
E.g. NumMicroOps will default to 0 if there are no itineraries for a specific instruction class,
while most instruction classes with itineraries will have NumMicroOps default to 1.

This affects the count of RetiredMOps and the Pending/Available queues,
which in turn causes different or suboptimal scheduling.

Patch by jsji (Jinsong Ji)
Differential Revision: https://reviews.llvm.org/D51506

llvm-svn: 341293
2018-09-03 03:14:29 +00:00
Sanjay Patel 17e709b66a [InstCombine] allow not+sub fold for arbitrary vector constants
The fold was implemented for the general case with a use limitation,
but the later constant version, which didn't check uses, was only
matching splat constants.

llvm-svn: 341292
2018-09-02 19:31:45 +00:00
Sanjay Patel 04ab22b3f4 [InstCombine] move/add tests for not+sub; NFC
llvm-svn: 341291
2018-09-02 19:18:13 +00:00
Hsiangkai Wang e0dcc28a4d Revert "[DebugInfo] Fix bug in LiveDebugVariables."
This reverts commit 8f548ff2a1819e1bc051e8218584f1a3d2cf178a.

buildbot failure in LLVM on clang-ppc64be-linux
http://lab.llvm.org:8011/builders/clang-ppc64le-linux/builds/19765

llvm-svn: 341290
2018-09-02 16:35:42 +00:00
Hsiangkai Wang 1368434b49 [DebugInfo] Fix bug in LiveDebugVariables.
In lib/CodeGen/LiveDebugVariables.cpp, it uses std::prev(MBBI) to
get DebugValue's SlotIndex. However, the previous instruction may be
also a debug instruction. It could not use a debug instruction to query
SlotIndex in mi2iMap.

Scan all debug instructions and use the first debug instruction to query
SlotIndex for following debug instructions. Only handle DBG_VALUE in
handleDebugValue().

Differential Revision: https://reviews.llvm.org/D50621

llvm-svn: 341289
2018-09-02 15:57:22 +00:00
Sanjay Patel ca36eb4e33 [Reassociate] swap binop operands to increase factoring potential
If we have a pair of binops feeding another pair of binops, rearrange the operands so 
the matching pair are together because that allows easy factorization folds to happen 
in instcombine:
((X << S) & Y) & (Z << S) --> ((X << S) & (Z << S)) & Y (reassociation)

--> ((X & Z) << S) & Y (factorize shift from 'and' ops optimization)
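
The two forms can be written out directly to see that they compute the same value. This is a minimal sketch on 32-bit unsigned integers with hypothetical function names; it assumes `s` is less than 32 so the shifts are well defined.

```cpp
#include <cstdint>

// The expression as it appears before reassociation: two shifts, two ands.
std::uint32_t beforeReassociate(std::uint32_t x, std::uint32_t y,
                                std::uint32_t z, unsigned s) {
  return ((x << s) & y) & (z << s);
}

// After grouping the matching shifts and factoring the shift out of the
// 'and': one shift, two ands.
std::uint32_t afterFactoring(std::uint32_t x, std::uint32_t y,
                             std::uint32_t z, unsigned s) {
  return ((x & z) << s) & y;
}
```

The equivalence follows from `(x << s) & (z << s) == (x & z) << s`, since a left shift moves every bit by the same amount.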

This is part of solving PR37098:
https://bugs.llvm.org/show_bug.cgi?id=37098

Note that there's an instcombine version of this patch attached there, but we're trying
to make instcombine have less responsibility to improve compile-time efficiency.

For reasons I still don't completely understand, reassociate does this kind of transform
sometimes, but misses everything in my motivating cases.

This patch on its own is gluing an independent cleanup chunk to the end of the existing 
RewriteExprTree() loop. We can build on it and do something stronger to better order the 
full expression tree like D40049. That might be an alternative to the proposal to add a 
separate reassociation pass like D41574.

Differential Revision: https://reviews.llvm.org/D45842

llvm-svn: 341288
2018-09-02 14:22:54 +00:00
Roman Lebedev d7a6244475 [DAGCombine] optimizeSetCCOfSignedTruncationCheck(): handle inverted pattern
Summary:
A follow-up for D49266 / rL337166 + D49497 / rL338044.

This is still the same pattern to check for the [lack of]
signed truncation, but in this case the constants and the predicate
are negated.

https://rise4fun.com/Alive/BDV
https://rise4fun.com/Alive/n7Z

Reviewers: spatel, craig.topper, RKSimon, javed.absar, efriedma, dmgreen

Reviewed By: spatel

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D51532

llvm-svn: 341287
2018-09-02 13:56:22 +00:00
Dylan McKay 454258671d [AVR] Redefine the 'LSL' instruction as an alias of 'ADD'
The 'LSL Rd' instruction is equivalent to 'ADD Rd, Rd'.
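
The equivalence of the result values can be checked with a small model of an 8-bit register. This sketch only models the value written to Rd, not the status flags, and the function names are hypothetical.

```cpp
#include <cstdint>

// Logical shift left by one bit, truncated to 8 bits.
std::uint8_t lslByOne(std::uint8_t rd) {
  return static_cast<std::uint8_t>(rd << 1);
}

// Adding the register to itself, truncated to 8 bits.
std::uint8_t addToSelf(std::uint8_t rd) {
  return static_cast<std::uint8_t>(rd + rd);
}

// Returns true if both operations agree for every possible register value.
bool lslMatchesAddForAllBytes() {
  for (unsigned v = 0; v < 256; ++v) {
    std::uint8_t rd = static_cast<std::uint8_t>(v);
    if (lslByOne(rd) != addToSelf(rd))
      return false;
  }
  return true;
}
```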

llvm-svn: 341278
2018-09-01 12:23:00 +00:00
Dylan McKay 97daa142f4 [AVR] Redefine the 'SBR' instruction as an alias
This fixes a TableGen warning about duplicate bit patterns.

SBR
===

This is an alias of 'ORI Rd, K'.

llvm-svn: 341277
2018-09-01 12:22:54 +00:00
Dylan McKay 8b0f9d2e58 [AVR] Define the ROL instruction as an alias of ADC
The 'rol Rd' instruction is equivalent to 'adc Rd, Rd'.

This caused compile warnings from tablegen because of conflicting bits
shared between each instruction.

llvm-svn: 341275
2018-09-01 12:22:07 +00:00
Tom Stellard ffc6bd6f3d AMDGPU/GlobalISel: Define instruction mapping for G_SELECT
Reviewers: arsenm

Reviewed By: arsenm

Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, rovka, kristof.beyls, dstuttard, tpr, llvm-commits, t-tye

Differential Revision: https://reviews.llvm.org/D49737

llvm-svn: 341271
2018-09-01 02:41:19 +00:00
Zhaoshi Zheng f5297fb24b [Constant Hoisting] Hoisting Constant GEP Expressions
Leverage existing logic in constant hoisting pass to transform constant GEP
expressions sharing the same base global variable. Multi-dimensional GEPs are
rewritten into single-dimensional GEPs.
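
To illustrate the idea of collapsing a multi-dimensional constant index into one offset from the base global, here is a minimal sketch. It assumes a row-major array of 4-byte elements with 8 columns; the layout, sizes, and function name are hypothetical and do not reflect the pass's actual implementation.

```python
ELEM_SIZE = 4  # assumed element size in bytes (e.g. i32)
COLS = 8       # assumed inner dimension of the array

def flat_byte_offset(row, col, cols=COLS, elem_size=ELEM_SIZE):
    """Collapse a two-dimensional constant index into a single byte offset."""
    return (row * cols + col) * elem_size

# Two accesses into the same global differ only by a small constant delta,
# so one hoisted base address plus the delta can serve both.
first = flat_byte_offset(2, 3)
second = flat_byte_offset(2, 5)
delta = second - first
```

Once every access is expressed as a single offset, accesses that share a base global can be grouped in the same way as ordinary integer constants.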

Differential Revision: https://reviews.llvm.org/D51396

llvm-svn: 341269
2018-09-01 00:04:56 +00:00
Jessica Paquette a69696dca6 Fix typo in size remarks for module passes
ModuleCount = InstrCount was incorrect. It should have been
InstrCount = ModuleCount. This was making it emit an extra, incorrect remark
for Print Module IR.

The test didn't catch this, because it didn't ensure that the only remark
output was from the desired pass. So, it was possible to have an extra remark
come through and not fail. Updated the test so that we ensure that the last
remark that's output comes from the desired pass. This is done by ensuring
that whatever is being read after the last remark is YAML output rather than
some incorrect garbage.

llvm-svn: 341267
2018-08-31 22:43:41 +00:00
Stanislav Mekhanoshin 44451b3344 [AMDGPU] Split v32i32 loads
Differential Revision: https://reviews.llvm.org/D51555

llvm-svn: 341266
2018-08-31 22:43:36 +00:00
Craig Topper caf6672779 [X86] Add intrinsics for KTEST instructions.
These intrinsics use the same implementation as PTEST intrinsics, but use vXi1 vectors.

New clang builtins will be accompanying them shortly.

llvm-svn: 341259
2018-08-31 21:31:53 +00:00
Sid Manning b1c9813042 [Hexagon] Add support for getRegisterByName.
Support required to build the Hexagon Linux kernel.

Differential Revision: https://reviews.llvm.org/D51363

llvm-svn: 341238
2018-08-31 19:08:23 +00:00
Alexandre Ganea 6a7efef4af [DebugInfo] Common behavior for error types
Following D50807, and heading towards D50664, this intermediary change does the following:

1. Upgrade all custom Error types in llvm/trunk/lib/DebugInfo/ to use the new StringError behavior (D50807).
2. Implement std::is_error_code_enum and make_error_code() for DebugInfo error enumerations.
3. Rename GenericError -> PDBError (the file will be renamed in a subsequent commit)
4. Update custom error messages to follow the same formatting: (\w\s*)+\.
5. Keep generic "file not found" (ENOENT) errors as they are in PDB code. Previously, there used to be a custom enumeration for that purpose.
6. Remove a few extraneous LF in log() implementations. Printing LF is a responsibility at a higher level, not at the error level.
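
The formatting rule in item 4 can be tried out directly. The sketch below applies that pattern with Python's `re.fullmatch`; the sample messages are made up for illustration and are not taken from the patch.

```python
import re

# Pattern quoted in item 4: word characters and whitespace, then a final period.
MESSAGE_FORMAT = re.compile(r"(\w\s*)+\.")

def follows_format(message):
    """Return True when the whole message matches the pattern."""
    return MESSAGE_FORMAT.fullmatch(message) is not None

samples = [
    "The stream is corrupt.",   # words plus trailing period: matches
    "The stream is corrupt",    # missing period: does not match
    "Bad record, wrong size.",  # comma is not a word character: does not match
]
results = [follows_format(s) for s in samples]
```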

Differential Revision: https://reviews.llvm.org/D51499

llvm-svn: 341228
2018-08-31 17:41:58 +00:00
Craig Topper b7bb9f0078 [X86] Add support for turning vXi1 shuffles into KSHIFTL/KSHIFTR.
This patch recognizes shuffles that shift elements and fill with zeros. I've copied and modified the shift matching code we use for normal vector registers to do this. I'm not sure if there's a good way to share more of this code without making the existing function more complex than it already is.
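
The kind of matching described above can be sketched as follows. This is not the code from the patch: it assumes a simplified mask representation in which each lane holds either a source element index or a hypothetical `ZERO` marker, and it ignores undefined lanes.

```python
ZERO = -1  # hypothetical marker meaning "this lane is filled with zero"

def match_shift(mask):
    """Return (name, amount) if mask is a whole-vector shift with zero fill."""
    n = len(mask)
    for amount in range(1, n):
        # Left shift: element i moves to lane i + amount, low lanes become zero.
        left = [ZERO] * amount + list(range(n - amount))
        # Right shift: element i + amount moves to lane i, high lanes become zero.
        right = list(range(amount, n)) + [ZERO] * amount
        if mask == left:
            return ("KSHIFTL", amount)
        if mask == right:
            return ("KSHIFTR", amount)
    return None

left_by_two = match_shift([ZERO, ZERO, 0, 1, 2, 3, 4, 5])
right_by_three = match_shift([3, 4, 5, 6, 7, ZERO, ZERO, ZERO])
not_a_shift = match_shift([1, 0, 2, 3])
```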

This will be used to enable kshift intrinsics in clang.

Differential Revision: https://reviews.llvm.org/D51401

llvm-svn: 341227
2018-08-31 17:17:21 +00:00
Andrea Di Biagio a59ec4efa0 [X86][BtVer2] Remove wrong ReadAdvance from AVX vbroadcast(ss|sd|f128) instructions.
The presence of a ReadAdvance for input operand #0 is problematic
because it changes the input latency of the register used as the base address
for the folded load.

A broadcast cannot start executing if the load address hasn't been computed yet.

In the llvm-mca example, the VBROADCASTSS is dependent on the address generated
by the LEAQ.  That means, it cannot start until LEAQ reaches the write-back
stage. If we apply ReadAdvance, then we wrongly assume that the load can start 3
cycles in advance.
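
The effect can be worked through with a simplified model in which a read advance shortens the producer latency seen by one operand. The function name and the cycle numbers below are hypothetical and chosen only to make the arithmetic visible; they are not BtVer2 scheduling data.

```python
def earliest_start(producer_issue, producer_latency, read_advance=0):
    """Earliest cycle a consumer may start, under a simplified latency model."""
    effective_latency = max(producer_latency - read_advance, 0)
    return producer_issue + effective_latency

# Hypothetical numbers: the address producer issues at cycle 0 with latency 4.
without_advance = earliest_start(0, 4)               # waits for the full latency
with_advance = earliest_start(0, 4, read_advance=3)  # assumed 3 cycles too early
```

In this model the consumer with a read advance of 3 is scheduled three cycles before the address is actually available, which is the kind of mismatch the commit message describes.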

Differential Revision: https://reviews.llvm.org/D51534

llvm-svn: 341222
2018-08-31 16:05:48 +00:00
Simon Atanasyan 3785e84cf2 [mips] Fix `mtc1` and `mfc1` definitions for microMIPS R6
The `mtc1` and `mfc1` definitions in the MipsInstrFPU.td have MMRel,
but do not have StdMMR6Rel tags. When these instructions are emitted
for microMIPS R6 targets, neither `Mips::MipsR62MicroMipsR6` nor
`Mips::Std2MicroMipsR6` can find the correct opcodes, and as a result the
backend uses the mips32 variant of the instruction encoding.

The patch fixes this problem by adding the StdMMR6Rel tag and checks the
instruction encoding in the test case.

Differential revision: https://reviews.llvm.org/D51482

llvm-svn: 341221
2018-08-31 15:57:17 +00:00
Matt Arsenault bf07a50a98 AMDGPU: Restrict extract_vector_elt combine to loads
The intention is to enable the extract_vector_elt load combine,
and doing this for other operations interferes with more
useful optimizations on vectors.

Handle any type of load since in principle we should do the
same combine for the various load intrinsics.

llvm-svn: 341219
2018-08-31 15:39:52 +00:00
Matt Arsenault 6f35f0c212 AMDGPU: Actually commit re-run of update_llc_test_checks
llvm-svn: 341218
2018-08-31 15:05:06 +00:00
Matt Arsenault c807ce0ee4 SLPVectorizer: Fix assert with different sized address spaces
llvm-svn: 341215
2018-08-31 14:34:53 +00:00
Matt Arsenault 28c16bd534 AMDGPU: Fix broken generated check lines
This was incorrectly using the same check prefix for multiple lines

llvm-svn: 341214
2018-08-31 14:34:22 +00:00
Andrea Di Biagio 69da3f3df6 [X86] Add llvm-mca tests that show how operand latency is wrongly computed for SSE sqrtss/sd and rcpss.
According to the timeline view, sqrtss/sd/rcpss start executing before the load
address for the memory operand is available.
This problem is caused by the presence of a ReadAfterLd (a ReadAdvance). Those
unary operations should not specify a ReadAdvance at all.

llvm-svn: 341213
2018-08-31 14:12:13 +00:00
Francis Visoiu Mistrih 8e864be70a [llvm-objdump] Keep the memory buffer from the dSYM alive when using -g -dsym
When using -g and -dsym, llvm-objdump opens the dsym file and keeps the
MachOObjectFile alive, while the memory buffer that the MachOObjectFile
was based on gets destroyed.

Differential Revision: https://reviews.llvm.org/D51365

llvm-svn: 341209
2018-08-31 13:10:54 +00:00
Alexander Ivchenko 9d053074a1 [GlobalISel][X86] Add the support for G_FPTRUNC
Differential Revision: https://reviews.llvm.org/D49855

llvm-svn: 341202
2018-08-31 11:26:51 +00:00
Alexander Ivchenko 9b0b492653 [GlobalISel][X86_64] Support for G_FPTOSI
Differential Revision: https://reviews.llvm.org/D49183

llvm-svn: 341200
2018-08-31 11:16:58 +00:00
Alexander Ivchenko 58a5d6fde7 [GlobalIsel][X86] Support for llvm.trap intrinsic
Differential Revision: https://reviews.llvm.org/D49180

llvm-svn: 341199
2018-08-31 11:05:13 +00:00
Andrea Di Biagio 0e21ca1278 [X86][BtVer2] Add an llvm-mca test that shows how the read latency of AVX broadcastss on ymm registers is incorrectly set.
llvm-svn: 341197
2018-08-31 10:39:33 +00:00
Alexander Ivchenko a26a364e75 [GlobalIsel][X86] Support for G_FCMP
Differential Revision: https://reviews.llvm.org/D49172

llvm-svn: 341193
2018-08-31 09:38:27 +00:00
Roman Lebedev 75c2961b76 [NFC][X86][AArch64] A few more patterns for [lack of] signed truncation check pattern.
llvm-svn: 341188
2018-08-31 08:52:03 +00:00
Andrea Di Biagio b998eae2f2 [X86][BtVer2] Fix WriteFShuffle256 schedule write info.
This patch fixes the number of micro opcodes, and processor resource cycles for
the following AVX instructions:

vinsertf128rr/rm
vperm2f128rr/rm
vbroadcastf128

Tests have been regenerated using the usual scripts in the llvm/utils directory.

Differential Revision: https://reviews.llvm.org/D51492

llvm-svn: 341185
2018-08-31 08:30:47 +00:00
Martin Storsjo 2dcaa41e1e [MinGW] [ARM] Add stubs for potential automatic dllimported variables
The runtime pseudo relocations can't handle the ARM format embedded
addresses in movw/movt pairs. By using stubs, the potentially
dllimported addresses can be touched up by the runtime pseudo relocation
framework.

Differential Revision: https://reviews.llvm.org/D51450

llvm-svn: 341176
2018-08-31 08:00:25 +00:00
Craig Topper 7073f03f70 [X86] Add a -x86-experimental-vector-widening command line to vec_fp_to_int.ll.
llvm-svn: 341173
2018-08-31 07:05:38 +00:00
Craig Topper 2140a8e307 [X86] Add -x86-experimental-vector-widening-legalization run line to avx512-cvt.ll
This will cover the (v2i32 (setcc v2f32)) case in replaceNodeResults. That code shouldn't be needed at all in this mode. A future patch will skip it.

llvm-svn: 341171
2018-08-31 07:05:36 +00:00
Matt Arsenault 65e43cade8 AMDGPU: Remove obsolete tests
llvm-svn: 341169
2018-08-31 06:07:45 +00:00
Matt Arsenault 988df63525 AMDGPU: Stop forcing internalize at -O0
This doesn't really matter if clang is always emitting
the visibility as hidden by default.

llvm-svn: 341168
2018-08-31 06:02:36 +00:00
Matt Arsenault 0da6350dc8 AMDGPU: Remove remnants of old address space mapping
llvm-svn: 341165
2018-08-31 05:49:54 +00:00
Fangrui Song 780dfe11fc Import lit.llvm after rL341135
llvm-svn: 341149
2018-08-31 00:22:20 +00:00
Michael Berg 7b9e86445c [NFC] adding initial intersect test for Node to Instruction association
llvm-svn: 341138
2018-08-30 22:43:34 +00:00
Krzysztof Parzyszek d51f7b3b43 [Hexagon] Check validity of register class when generating bitsplit
llvm-svn: 341137
2018-08-30 22:26:43 +00:00
Eli Friedman d5d0a4d27f [ARM] Enable GEP offset splitting for 32-bit ARM.
It has essentially the same benefit it has on 64-bit ARM: it
substantially reduces the number of constants used by large GEP
operations. Seems to be generally helpful across a few different
codebases I've tried.

Differential Revision: https://reviews.llvm.org/D51462

llvm-svn: 341136
2018-08-30 22:18:27 +00:00