llvm-project/llvm/test/CodeGen/X86/merge-consecutive-stores.ll

32 lines
1.0 KiB
LLVM
Raw Normal View History

; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
; RUN: llc < %s -mtriple=i686-unknown-unknown | FileCheck %s
; Make sure that we are zeroing one memory location at a time using xorl and
; not both using XMM registers.
define i32 @foo (i64* %so) nounwind uwtable ssp {
; CHECK-LABEL: foo:
; CHECK: # BB#0:
; CHECK-NEXT: movl {{[0-9]+}}(%esp), %eax
; CHECK-NEXT: movl $0, 28(%eax)
; CHECK-NEXT: movl $0, 24(%eax)
; CHECK-NEXT: movl 20(%eax), %ecx
; CHECK-NEXT: movl $0, 20(%eax)
; CHECK-NEXT: xorl %edx, %edx
; CHECK-NEXT: cmpl 16(%eax), %edx
; CHECK-NEXT: movl $0, 16(%eax)
; CHECK-NEXT: sbbl %ecx, %edx
[x86] use more shift or LEA for select-of-constants (2nd try) The previous rev (r310208) failed to account for overflow when subtracting the constants to see if they're suitable for shift/lea. This version add a check for that and more test were added in r310490. We can convert any select-of-constants to math ops: http://rise4fun.com/Alive/d7d For this patch, I'm enhancing an existing x86 transform that uses fake multiplies (they always become shl/lea) to avoid cmov or branching. The current code misses cases where we have a negative constant and a positive constant, so this is just trying to plug that hole. The DAGCombiner diff prevents us from hitting a terrible inefficiency: we can start with a select in IR, create a select DAG node, convert it into a sext, convert it back into a select, and then lower it to sext machine code. Some notes about the test diffs: 1. 2010-08-04-MaskedSignedCompare.ll - We were creating control flow that didn't exist in the IR. 2. memcmp.ll - Choose -1 or 1 is the case that got me looking at this again. We could avoid the push/pop in some cases if we used 'movzbl %al' instead of an xor on a different reg? That's a post-DAG problem though. 3. mul-constant-result.ll - The trade-off between sbb+not vs. setne+neg could be addressed if that's a regression, but those would always be nearly equivalent. 4. pr22338.ll and sext-i1.ll - These tests have undef operands, so we don't actually care about these diffs. 5. sbb.ll - This shows a win for what is likely a common case: choose -1 or 0. 6. select.ll - There's another borderline case here: cmp+sbb+or vs. test+set+lea? Also, sbb+not vs. setae+neg shows up again. 7. select_const.ll - These are motivating cases for the enhancement; replace cmov with cheaper ops. Assembly differences between movzbl and xor to avoid a partial reg stall are caused later by the X86 Fixup SetCC pass. Differential Revision: https://reviews.llvm.org/D35340 llvm-svn: 310717
2017-08-11 23:44:14 +08:00
; CHECK-NEXT: setl %al
; CHECK-NEXT: movzbl %al, %eax
; CHECK-NEXT: negl %eax
; CHECK-NEXT: retl
%used = getelementptr inbounds i64, i64* %so, i32 3
store i64 0, i64* %used, align 8
%fill = getelementptr inbounds i64, i64* %so, i32 2
%L = load i64, i64* %fill, align 8
store i64 0, i64* %fill, align 8
%cmp28 = icmp sgt i64 %L, 0
%R = sext i1 %cmp28 to i32
ret i32 %R
}