Commit Graph

60324 Commits

Author SHA1 Message Date
Sam Clegg 492f752969 [WebAssembly] Initial implementation of PIC code generation
This change implements lowering of references global symbols in PIC
mode.

This change implements lowering of global references in PIC mode using a
new @GOT reference type. @GOT references can be used with function or
data symbol names combined with the get_global instruction. In this case
the linker will insert the wasm global that stores the address of the
symbol (either in memory for data symbols or in the wasm table for
function symbols).

For now I'm continuing to use the R_WASM_GLOBAL_INDEX_LEB relocation
type for this type of reference which means that this relocation type
can refer to either a global or a function or data symbol. We could
choose to introduce specific relocation types for GOT entries in the
future.  See the current dynamic linking proposal:

https://github.com/WebAssembly/tool-conventions/blob/master/DynamicLinking.md

Differential Revision: https://reviews.llvm.org/D54647

llvm-svn: 357022
2019-03-26 19:46:15 +00:00
Ali Tamur 2f5cd03a3f [llvm] Reapply "Prevent duplicate files in debug line header in dwarf 5."
Reapply rL356941 after regenerating the object file in the failing test
llvm/test/tools/llvm-objdump/embedded-source.test from source.

Original commit message:

[llvm] Prevent duplicate files in debug line header in dwarf 5.

Motivation: In previous dwarf versions, file name indexes started from 1, and
the primary source file was not explicit. Dwarf 5 standard (6.2.4) prescribes
the primary source file to be explicitly given an entry with an index number 0.

The current implementation honors the specification by just duplicating the
main source file, once with index number 0, and later maybe with another
index number. While this is compliant with the letter of the standard, the
duplication causes problems for consumers of this information such as lldb.
(Some files are duplicated, where only some of them have a line table although
all refer to the same file)

With this change, dwarf 5 debug line section files always start from 0, and
the zeroth entry is not duplicated whenever possible. This requires different
handling of dwarf 4 and dwarf 5 during generation (e.g. when a function returns
an index zero for a file name, it signals an error in dwarf 4, but not in dwarf 5)
However, I think the minor complication is worth it, because it enables all
consumers (lldb, gdb, dwarfdump, objdump, and so on) to treat all files in the
file name list homogenously.

Tags: #llvm, #debug-info

Differential Revision: https://reviews.llvm.org/D59515

llvm-svn: 357018
2019-03-26 18:53:23 +00:00
George Rimar 279898b315 [llvm-objcopy] - Strip sections before symbols.
This is a fix for https://bugs.llvm.org/show_bug.cgi?id=40007.

Idea is to swap the order of stripping. So that we strip sections before
symbols what allows us to strip relocation sections without emitting
the error about relative symbols.

Differential revision: https://reviews.llvm.org/D59763

llvm-svn: 357017
2019-03-26 18:42:15 +00:00
Heejin Ahn 54551c1df7 [WebAssembly] Don't analyze branches after CFGStackify
Summary:
`WebAssembly::analyzeBranch` now does not analyze anything if the
function is CFG stackified. We were previously doing similar things by
checking if a branch's operand is whether an integer or an MBB, but this
failed to bail out when a BB did not have any terminators.

Consider this case:
```
bb0:
  try $label0
  call @foo    // unwinds to %ehpad
bb1:
  ...
  br $label0   // jumps to %cont. can be deleted
ehpad:
  catch
  ...
cont:
  end_try
```
Here `br $label0` will be deleted in CFGStackify's
`removeUnnecessaryInstrs` function, because we jump to the %cont block
even without the branch. But in this case, MachineVerifier fails to
verify this, because `ehpad` is not a successor of `bb1` even if `bb1`
does not have any terminators. MachineVerifier incorrectly thinks `bb1`
falls through to the next block.

This pass now consistently rejects all analysis after CFGStackify
whether a BB has terminators or not, also making the MachineVerifier
work. (MachineVerifier does not try to verify relationships between BBs
if `analyzeBranch` fails, the behavior we want after CFGStackify.)

This also adds a new option `-wasm-disable-ehpad-sort` for testing. This
option helps create the sorted order we want to test, and without the
fix in this patch, the tests in cfg-stackify-eh.ll fail at
MachineVerifier with `-wasm-disable-ehpad-sort`.

Reviewers: dschuff

Subscribers: sunfish, sbc100, jgravelle-google, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D59740

llvm-svn: 357015
2019-03-26 18:21:20 +00:00
Nikita Popov 7f15dd097e [InstCombine] Add tests for ssubo X, C -> saddo X, -C; NFC
Add baseline tests for canonicalization of
ssubo X, C -> saddo X, -C.

Patch by Dan Robertson.

Differential Revision: https://reviews.llvm.org/D59653

llvm-svn: 357013
2019-03-26 18:05:43 +00:00
Sanjay Patel 81e8d76f5b [InstCombine] form uaddsat from add+umin (PR14613)
This is the last step towards solving the examples shown in:
https://bugs.llvm.org/show_bug.cgi?id=14613

With this change, x86 should end up with psubus instructions
when those are available.

All known codegen issues with expanding the saturating intrinsics
were resolved with:
D59006 / rL356855

We also have some early evidence in D58872 that using the intrinsics
will lead to better perf. If some target regresses from this, custom
lowering of the intrinsics (as in the above for x86) may be needed.

llvm-svn: 357012
2019-03-26 17:50:08 +00:00
Heejin Ahn 1aaa481fc1 [WebAssembly] Add CFGStacikfied field to WebAssemblyFunctionInfo
Summary:
This adds `CFGStackified` field and its serialization to
WebAssemblyFunctionInfo.

Reviewers: dschuff

Subscribers: sunfish, sbc100, jgravelle-google, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D59747

llvm-svn: 357011
2019-03-26 17:46:14 +00:00
Heejin Ahn 52221d56bc [WebAssembly] Support WebAssemblyFunctionInfo serialization
Summary:
The framework for supporting target-specific MachineFunctionInfo was
added in r356215. This adds serialization support for
WebAssemblyFunctionInfo on top of that. This patch only adds the
framework and does not actually serialize anything at this point; we
have to add YAML mapping later for the fields in WebAssemblyFunctionInfo
we want to serialize if necessary.

Reviewers: dschuff, arsenm

Subscribers: sunfish, wdng, sbc100, jgravelle-google, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D59737

llvm-svn: 357009
2019-03-26 17:35:35 +00:00
Heejin Ahn 222718fdd2 [WebAssembly] Fix a bug when mixing TRY/LOOP markers
Summary:
When TRY and LOOP markers are in the same BB and END_TRY and END_LOOP
markers are in the same BB, END_TRY should be _before_ END_LOOP, because
LOOP is always before TRY if they are in the same BB. (TRY is placed in
the latest possible position, whereas LOOP is in the earliest possible
position.)

Reviewers: dschuff

Subscribers: sunfish, sbc100, jgravelle-google, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D59751

llvm-svn: 357008
2019-03-26 17:29:55 +00:00
Heejin Ahn 44a5a4b107 [WebAssembly] Fix bugs in BLOCK/TRY placement
Summary:
Before we placed all TRY/END_TRY markers before placing BLOCK/END_BLOCK
markers. This couldn't handle this case:
```
bb0:
  br bb2
bb1:          // nearest common dominator of bb3 and bb4
  br_if ... bb3
  br bb4
bb2:
  ...
bb3:
  call @foo   // unwinds to ehpad
bb4:
  call @bar   // unwinds to ehpad
ehpad:
  catch
  ...
```

When we placed TRY markers, we placed it in bb1 because it is the
nearest common dominator of bb3 and bb4. But because bb0 jumps to bb2,
when we placed block markers, we ended up with interleaved scopes like
```
block
try
end_block
catch
end_try
```
which was not correct.

This patch fixes the bug by placing BLOCK and TRY markers in one pass
while iterating BBs in a function. This also adds some more routines to
`placeTryMarkers`, because we now have to assume that there can be
previously placed BLOCK and END_BLOCK.

Reviewers: dschuff

Subscribers: sunfish, sbc100, jgravelle-google, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D59739

llvm-svn: 357007
2019-03-26 17:15:55 +00:00
Sanjay Patel 0dd67ed462 [InstCombine] add tests for uaddsat using min; NFC
llvm-svn: 357005
2019-03-26 16:19:13 +00:00
Sanjay Patel 418ee7b7bb [InstCombine] update tests to use FileCheck; NFC
llvm-svn: 357004
2019-03-26 15:58:33 +00:00
Clement Courbet 52da938cd0 [llvm-exegesis] Allow the target to disable the selection of some registers.
Summary:
This prevents "Cannot encode high byte register in REX-prefixed instruction"
from happening on instructions that require REX encoding when AH & co
get selected.
On the down side, these 4 registers can no longer be selected
automatically, but this avoids having to expose all the X86 encoding
complexity.

Reviewers: gchatelet

Subscribers: tschuett, jdoerfert, llvm-commits, bdb

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D59821

llvm-svn: 357003
2019-03-26 15:44:57 +00:00
Luis Marques 72734fc7b5 [RISCV] Update setcc-logic.ll codegen test
This should have been updated as part of D59753.

llvm-svn: 357002
2019-03-26 15:41:45 +00:00
Nirav Dave a28c514581 [DAG] Avoid smart constructor-based dangling nodes.
Various SelectionDAG non-combine operations (e.g. the getNode smart
constructor and legalization) may leave dangling nodes by applying
optimizations or not fully pruning unused result values. This can
result in nodes that are never added to the worklist and therefore can
not be pruned.

Add a node inserter as the current node deleter to make sure such
nodes have the chance of being pruned.

Many minor changes, mostly positive.

llvm-svn: 356996
2019-03-26 15:08:14 +00:00
Luis Marques 614fd9d830 [RISCV] Improve codegen for icmp {ne,eq} with a constant
Adds two patterns to improve the codegen of GPR value comparisons with small
constants. Instead of first loading the constant into another register and then
doing an XOR of those registers, these patterns directly use the constant as an
XORI immediate.

llvm-svn: 356990
2019-03-26 12:55:00 +00:00
Simon Pilgrim e24441aab0 [TargetLowering] Add SimplifyDemandedBits support for ISD::INSERT_VECTOR_ELT
This helps us relax the extension of a lot of scalar elements before they are inserted into a vector.

Its exposes an issue in DAGCombiner::convertBuildVecZextToZext as some/all the zero-extensions may be relaxed to ANY_EXTEND, so we need to handle that case to avoid a couple of AVX2 VPMOVZX test regressions.

Once this is in it should be easier to fix a number of remaining failures to fold loads into VBROADCAST nodes.

Differential Revision: https://reviews.llvm.org/D59484

llvm-svn: 356989
2019-03-26 12:32:01 +00:00
Yi Kong 74b874ac4c Fix nondeterminism introduced in r353954
DenseMap iteration order is not guaranteed, use MapVector instead.

Fix provided by srhines.

Differential Revision: https://reviews.llvm.org/D59807

llvm-svn: 356988
2019-03-26 12:18:08 +00:00
Javed Absar c85cb2fb5d [TableGen] Let list elements have a trailing comma
Let lists have an trailing comma to allow cleaner diffs e.g:
  def : Features<[FeatureA,
                  FeatureB,
                 ]>;
Reviewed By: hfinkel
Differential Revision: https://reviews.llvm.org/D59247

llvm-svn: 356986
2019-03-26 11:16:01 +00:00
Javed Absar 33888ff66b [TableGen] Give meaningful msg for def use in multiclass
When one mistakenly specifies 'def' instead of using 'defm',
the error message is quite misleading: 'Couldn't find class..'
Instead, it should recommend using defm if the multiclass of
same name exists.

Reviewed By: hfinkel

Differential Revision: https://reviews.llvm.org/D59294 

llvm-svn: 356985
2019-03-26 10:49:09 +00:00
Oliver Stannard 5c90238479 [ARM][Asm] Accept upper case coprocessor number and registers
Differential revision: https://reviews.llvm.org/D59760

llvm-svn: 356984
2019-03-26 10:24:03 +00:00
Martin Storsjo 146db4405c [llvm-dlltool] Set a proper machine type for weak symbol object files
This makes GNU binutils not reject the libraries outright.

GNU ld handles weak externals slightly differently though, so it
can't use them for aliases in import libraries, but this makes GNU
ld able to use the rest of the import libraries.

LLD accepted object files with machine type 0 aka
IMAGE_FILE_MACHINE_UNKNOWN.

Differential Revision: https://reviews.llvm.org/D59742

llvm-svn: 356982
2019-03-26 09:02:44 +00:00
Craig Topper 3dce29b8e9 X86AsmParser: Do not process a non-existent token
This error can only happen if an unfinished operation is at Eof.

Patch by Brandon Jones

Differential Revision: https://reviews.llvm.org/D57379

llvm-svn: 356972
2019-03-26 03:12:41 +00:00
Eli Friedman 1e5d569c8c [ARM] Add missing memory operands to a bunch of instructions.
This should hopefully lead to minor improvements in code generation, and
more accurate spill/reload comments in assembly.

Also fix isLoadFromStackSlotPostFE/isStoreToStackSlotPostFE so they
don't lead to misleading assembly comments for merged memory operands;
this is technically orthogonal, but in practice the relevant memory
operand lists don't show up without this change.

Differential Revision: https://reviews.llvm.org/D59713

llvm-svn: 356963
2019-03-25 22:42:30 +00:00
Sanjay Patel 9bcb0766eb [x86] add tests for vector cmps; NFC
llvm-svn: 356959
2019-03-25 22:08:45 +00:00
Matt Arsenault b008b37b61 AMDGPU: Make collapse-endcf test more useful
Without a VALU instruction in the return block, these were mostly
testing the path to delete exec mask code before s_endpgm rather than
the end cf handling.

llvm-svn: 356955
2019-03-25 21:28:51 +00:00
Eli Friedman 92d0d13366 [AArch64] Prefer "mov" over "orr" to materialize constants.
This is generally more readable due to the way the assembler aliases
work.

(This causes a lot of test changes, but it's not really as scary as it
looks at first glance; it's just mechanically changing a bunch of checks
for orr to check for mov instead.)

Differential Revision: https://reviews.llvm.org/D59720

llvm-svn: 356954
2019-03-25 21:25:28 +00:00
Ali Tamur fdce82a814 Revert "[llvm] Prevent duplicate files in debug line header in dwarf 5."
This reverts commit 312ab05887.

My commit broke the build; I will revert and find out what happened.

llvm-svn: 356951
2019-03-25 21:09:07 +00:00
Konstantin Zhuravlyov 51809cbc98 AMDGPU: Add support for cross address space synchronization scopes
Differential Revision: https://reviews.llvm.org/D59517

llvm-svn: 356946
2019-03-25 20:50:21 +00:00
Ali Tamur 312ab05887 [llvm] Prevent duplicate files in debug line header in dwarf 5.
Summary:

Motivation: In previous dwarf versions, file name indexes started from 1, and
the primary source file was not explicit. Dwarf 5 standard (6.2.4) prescribes
the primary source file to be explicitly given an entry with an index number 0.

The current implementation honors the specification by just duplicating the
main source file, once with index number 0, and later maybe with another
index number. While this is compliant with the letter of the standard, the
duplication causes problems for consumers of this information such as lldb.
(Some files are duplicated, where only some of them have a line table although
all refer to the same file)

With this change, dwarf 5 debug line section files always start from 0, and
the zeroth entry is not duplicated whenever possible. This requires different
handling of dwarf 4 and dwarf 5 during generation (e.g. when a function returns
an index zero for a file name, it signals an error in dwarf 4, but not in dwarf 5)
However, I think the minor complication is worth it, because it enables all
consumers (lldb, gdb, dwarfdump, objdump, and so on) to treat all files in the
file name list homogenously.

Reviewers: dblaikie, probinson, aprantl, espindola

Reviewed By: probinson

Subscribers: emaste, jvesely, nhaehnle, aprantl, javed.absar, arichardson, hiraditya, MaskRay, rupprecht, jdoerfert, llvm-commits

Tags: #llvm, #debug-info

Differential Revision: https://reviews.llvm.org/D59515

llvm-svn: 356941
2019-03-25 20:08:00 +00:00
Simon Pilgrim 6f96795b88 [SLPVectorizer] Merge reorderAltShuffleOperands into reorderInputsAccordingToOpcode
As discussed on D59738, this generalizes reorderInputsAccordingToOpcode to handle multiple + non-commutative instructions so we can get rid of reorderAltShuffleOperands and make use of the extra canonicalizations that reorderInputsAccordingToOpcode brings.

Differential Revision: https://reviews.llvm.org/D59784

llvm-svn: 356939
2019-03-25 20:05:27 +00:00
Simon Pilgrim 167af1bafb [SelectionDAG] Add icmp UNDEF handling to SelectionDAG::FoldSetCC
First half of PR40800, this patch adds DAG undef handling to icmp instructions to match the behaviour in llvm::ConstantFoldCompareInstruction and SimplifyICmpInst, this permits constant folding of vector comparisons where some elements had been reduced to UNDEF (by SimplifyDemandedVectorElts etc.).

This involved a lot of tweaking to reduced tests as bugpoint loves to reduce icmp arguments to undef........

Differential Revision: https://reviews.llvm.org/D59363

llvm-svn: 356938
2019-03-25 18:51:57 +00:00
Sanjay Patel f49e33e252 [x86] add another vector zext test; NFC
Goes with the proposal in D59777

llvm-svn: 356930
2019-03-25 17:53:56 +00:00
Matt Arsenault b27e4974d0 MISched: Don't schedule regions with 0 instructions
I think this is correct, but may not necessarily be the correct fix
for the assertion I'm really trying to solve. If a scheduling region
was found that only has dbg_value instructions, the RegPressure
tracker would end up in an inconsistent state because it would skip
over any debug instructions and point to an instruction outside of the
scheduling region. It may still be possible for this to happen if
there are some real schedulable instructions between dbg_values, but I
haven't managed to break this.

The testcase is extremely sensitive and I'm not sure how to make it
more resistent to future scheduler changes that would avoid stressing
this situation.

llvm-svn: 356926
2019-03-25 17:15:44 +00:00
James Henderson 1f44814952 [llvm-objcopy]Preserve data in segments not covered by sections
llvm-objcopy previously knew nothing about data in segments that wasn't
covered by section headers, meaning that it wrote zeroes instead of what
was there. As it is possible for this data to be useful to the loader,
this patch causes llvm-objcopy to start preserving this data. Data in
sections that are explicitly removed continues to be written as zeroes.

This fixes https://bugs.llvm.org/show_bug.cgi?id=41005.

Reviewed by: jakehehrlich, rupprecht

Differential Revision: https://reviews.llvm.org/D59483

llvm-svn: 356919
2019-03-25 16:36:26 +00:00
Simon Pilgrim 77749567a1 [SLPVectorizer] Update file missed in rL356913
Differential Revision: https://reviews.llvm.org/D59738

llvm-svn: 356915
2019-03-25 16:14:21 +00:00
Sanjay Patel 76c1ef3d07 [x86] add tests for vector zext; NFC
The AVX1 lowering is poor.

llvm-svn: 356914
2019-03-25 15:54:34 +00:00
Simon Pilgrim ff3abef395 [SLPVectorizer] reorderInputsAccordingToOpcode - remove non-Instruction canonicalization
Remove attempts to commute non-Instructions to the LHS - the codegen changes appear to rely on chance more than anything else and also have a tendency to fight existing instcombine canonicalization which moves constants to the RHS of commutable binary ops.

This is prep work towards:
(a) reusing reorderInputsAccordingToOpcode for alt-shuffles and removing the similar reorderAltShuffleOperands
(b) improving reordering to optimized cases with commutable and non-commutable instructions to still find splat/consecutive ops.

Differential Revision: https://reviews.llvm.org/D59738

llvm-svn: 356913
2019-03-25 15:53:55 +00:00
Jonas Paulsson 0e75e21eb3 [RegAlloc] Simplify MIR test
Remove the IR part from test/CodeGen/X86/regalloc-copy-hints.mir (added by
r355854).

To make the test remain functional, the parts of the MBB names referring to
BB names have been removed, as well as all machine memory operands.

llvm-svn: 356899
2019-03-25 14:28:32 +00:00
Petar Avramovic a034a64f84 [MIPS GlobalISel] Select copy for arguments from FPRBRegBank
Move selectCopy into MipsInstructionSelector class.
Select copy for arguments from FPRBRegBank for MIPS32.

Differential Revision: https://reviews.llvm.org/D59644

llvm-svn: 356886
2019-03-25 11:38:06 +00:00
Petar Avramovic 3dfa368d5d [MIPS GlobalISel] Add floating point register bank
Add floating point register bank for MIPS32.
Implement getRegBankFromRegClass for float register classes.

Differential Revision: https://reviews.llvm.org/D59643

llvm-svn: 356883
2019-03-25 11:30:46 +00:00
Petar Avramovic 5a457e08f6 [MIPS GlobalISel] Lower float and double arguments in registers
Lower float and double arguments in registers for MIPS32.
When float/double argument is passed through gpr registers
select appropriate move instruction.

Differential Revision: https://reviews.llvm.org/D59642

llvm-svn: 356882
2019-03-25 11:23:41 +00:00
Xing GUO ea16be1ca7 [llvm-readobj] Separate `Symbol Version` dumpers into `LLVM style` and `GNU style`
Summary:
Currently, llvm-readobj can dump symbol version sections only in LLVM style. In this patch, I would like to separate these dumpers into GNU style and 
LLVM style for future implementation.

Reviewers: grimar, jhenderson, mattd, rupprecht

Reviewed By: jhenderson, rupprecht

Subscribers: ormris, dyung, RKSimon, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D59186

llvm-svn: 356881
2019-03-25 11:02:49 +00:00
Diana Picus 254b11a0fd [ARM GlobalISel] 64-bit memops should be aligned
We currently use only VLDR/VSTR for all 64-bit loads/stores, so the
memory operands must be word-aligned. Mark aligned operations as legal
and narrow non-aligned ones to 32 bits.

While we're here, also mark non-power-of-2 loads/stores as unsupported.

llvm-svn: 356872
2019-03-25 08:54:29 +00:00
Craig Topper 7c2554dd92 Revert r356688 "[X86] Don't avoid folding multiple use sign extended 8-bit immediate into instructions under optsize."
Looking back over how the one use optimization works, I don't think this is the right way to fix this.

llvm-svn: 356866
2019-03-25 01:25:32 +00:00
Simon Pilgrim 87d4ab8b92 [X86][SSE41] Start shuffle combining from ZERO_EXTEND_VECTOR_INREG (PR40685)
Enable SSE41 ZERO_EXTEND_VECTOR_INREG shuffle combines - for the PMOVZX(PSHUFD(V)) -> UNPCKH(V,0) pattern we reduce the shuffles (port5-bottleneck on Intel) at the expense of creating a zero (pxor v,v) and an extra register move - which is a good trade off as these are pretty cheap and in most cases it doesn't increase register pressure.

This also exposed a missed opportunity to use combine to ZERO_EXTEND_VECTOR_INREG with folded loads - even if we're in the float domain.

llvm-svn: 356864
2019-03-24 19:06:35 +00:00
Simon Pilgrim 4465a765ee [X86] Remove icmp undef from reduced tests
Pre-commit for D59363 (Add icmp UNDEF handling to SelectionDAG::FoldSetCC)

Approved by @spatel (Sanjay Patel)

llvm-svn: 356859
2019-03-24 17:02:08 +00:00
Simon Pilgrim a71c0ed471 [X86][AVX] Start shuffle combining from ZERO_EXTEND_VECTOR_INREG (PR40685)
Just enable this for AVX for now as SSE41 introduces extra register moves for the PMOVZX(PSHUFD(V)) -> UNPCKH(V,0) pattern (but otherwise helps reduce port5 usage on Intel targets).

Only AVX support is required for PR40685 as the issue is due to 8i8->8i32 zext shuffle leftovers.

llvm-svn: 356858
2019-03-24 16:30:35 +00:00
George Rimar 272571718c Recommit r356738 "[llvm-objcopy] - Implement replaceSectionReferences for GroupSection class."
Fix: r356853 + set AddressAlign to 4 in 
Inputs/compress-debug-sections.yaml for the new group section introduced.

Original commit message:

Currently, llvm-objcopy incorrectly handles compression and decompression of the
sections from COMDAT groups, because we do not implement the
replaceSectionReferences for this type of the sections.

The patch does that.

Differential revision: https://reviews.llvm.org/D59638

llvm-svn: 356856
2019-03-24 14:41:45 +00:00
Sanjay Patel 7d676dfd86 [x86] improve the default expansion of uaddsat/usubsat
This is yet another step towards solving PR14613:
https://bugs.llvm.org/show_bug.cgi?id=14613

uaddsat X, Y --> (X >u (X + Y)) ? -1 : X + Y
usubsat X, Y --> (X >u Y) ? X - Y : 0

We can't count on a sane vector ISA, so override the default (umin/umax)
expansion of unsigned add/sub saturate in cases where we do not have umin/umax.

Differential Revision: https://reviews.llvm.org/D59006

llvm-svn: 356855
2019-03-24 13:55:54 +00:00