llvm-project/llvm/test/CodeGen/X86/2010-08-04-MaskedSignedComp...

; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
; RUN: llc < %s -mtriple=x86_64-unknown-unknown | FileCheck %s
; PR7814

@g_16 = global i64 -3738643449681751625, align 8
@g_38 = global i32 0, align 4
@.str = private constant [4 x i8] c"%d\0A\00"

define i32 @main() nounwind {
; CHECK-LABEL: main:
; CHECK:       # %bb.0: # %entry
; CHECK-NEXT:    xorl %eax, %eax
; CHECK-NEXT:    cmpq {{.*}}(%rip), %rax
; CHECK-NEXT:    sbbb %al, %al
; CHECK-NEXT:    testb $-106, %al
; CHECK-NEXT:    jle .LBB0_1
; CHECK-NEXT:  # %bb.2: # %if.then
; CHECK-NEXT:    movl $1, {{.*}}(%rip)
; CHECK-NEXT:    movl $1, %esi
; CHECK-NEXT:    jmp .LBB0_3
; CHECK-NEXT:  .LBB0_1: # %entry.if.end_crit_edge
; CHECK-NEXT:    movl {{.*}}(%rip), %esi
; CHECK-NEXT:  .LBB0_3: # %if.end
; CHECK-NEXT:    pushq %rax
; CHECK-NEXT:    movl $.L.str, %edi
; CHECK-NEXT:    xorl %eax, %eax
; CHECK-NEXT:    callq printf
; CHECK-NEXT:    xorl %eax, %eax
; CHECK-NEXT:    popq %rcx
; CHECK-NEXT:    retq
entry:
  %tmp = load i64, i64* @g_16
  %not.lnot = icmp ne i64 %tmp, 0
  %conv = sext i1 %not.lnot to i64
  %and = and i64 %conv, 150
  %conv.i = trunc i64 %and to i8
  %cmp = icmp sgt i8 %conv.i, 0
  br i1 %cmp, label %if.then, label %entry.if.end_crit_edge

entry.if.end_crit_edge:
  %tmp4.pre = load i32, i32* @g_38
  br label %if.end

if.then:
  store i32 1, i32* @g_38
  br label %if.end

if.end:
  %tmp4 = phi i32 [ %tmp4.pre, %entry.if.end_crit_edge ], [ 1, %if.then ] ; <i32> [#uses=1]
  %call5 = tail call i32 (i8*, ...) @printf(i8* getelementptr inbounds ([4 x i8], [4 x i8]* @.str, i64 0, i64 0), i32 %tmp4) nounwind ; <i32> [#uses=0]
  ret i32 0
}

declare i32 @printf(i8* nocapture, ...) nounwind
[x86] auto-generate checks; NFC llvm-svn: 296629 2017-03-01 22:46:59 +08:00			`; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py`
			`; RUN: llc < %s -mtriple=x86_64-unknown-unknown \| FileCheck %s`
PR7814: Truncates cannot be ignored for signed comparisons. llvm-svn: 110268 2010-08-05 06:40:58 +08:00			`; PR7814`

[x86] auto-generate checks; NFC llvm-svn: 296629 2017-03-01 22:46:59 +08:00			`@g_16 = global i64 -3738643449681751625, align 8`
			`@g_38 = global i32 0, align 4`
			`@.str = private constant [4 x i8] c"%d\0A\00"`
PR7814: Truncates cannot be ignored for signed comparisons. llvm-svn: 110268 2010-08-05 06:40:58 +08:00
			`define i32 @main() nounwind {`
[x86] auto-generate checks; NFC llvm-svn: 296629 2017-03-01 22:46:59 +08:00			`; CHECK-LABEL: main:`
[CodeGen] Unify MBB reference format in both MIR and debug output As part of the unification of the debug format and the MIR format, print MBB references as '%bb.5'. The MIR printer prints the IR name of a MBB only for block definitions. * find . \( -name ".mir" -o -name ".cpp" -o -name ".h" -o -name ".ll" \) -type f -print0 \| xargs -0 sed -i '' -E 's/BB#" << ([a-zA-Z0-9_]+)->getNumber\(\)/" << printMBBReference(\1)/g' find . \( -name ".mir" -o -name ".cpp" -o -name ".h" -o -name ".ll" \) -type f -print0 \| xargs -0 sed -i '' -E 's/BB#" << ([a-zA-Z0-9_]+)\.getNumber\(\)/" << printMBBReference(\1)/g' * find . \( -name ".txt" -o -name ".s" -o -name ".mir" -o -name ".cpp" -o -name ".h" -o -name ".ll" \) -type f -print0 \| xargs -0 sed -i '' -E 's/BB#([0-9]+)/%bb.\1/g' * grep -nr 'BB#' and fix Differential Revision: https://reviews.llvm.org/D40422 llvm-svn: 319665 2017-12-05 01:18:51 +08:00			`; CHECK: # %bb.0: # %entry`
[x86] auto-generate checks; NFC llvm-svn: 296629 2017-03-01 22:46:59 +08:00			`; CHECK-NEXT: xorl %eax, %eax`
[x86] use more shift or LEA for select-of-constants (2nd try) The previous rev (r310208) failed to account for overflow when subtracting the constants to see if they're suitable for shift/lea. This version add a check for that and more test were added in r310490. We can convert any select-of-constants to math ops: http://rise4fun.com/Alive/d7d For this patch, I'm enhancing an existing x86 transform that uses fake multiplies (they always become shl/lea) to avoid cmov or branching. The current code misses cases where we have a negative constant and a positive constant, so this is just trying to plug that hole. The DAGCombiner diff prevents us from hitting a terrible inefficiency: we can start with a select in IR, create a select DAG node, convert it into a sext, convert it back into a select, and then lower it to sext machine code. Some notes about the test diffs: 1. 2010-08-04-MaskedSignedCompare.ll - We were creating control flow that didn't exist in the IR. 2. memcmp.ll - Choose -1 or 1 is the case that got me looking at this again. We could avoid the push/pop in some cases if we used 'movzbl %al' instead of an xor on a different reg? That's a post-DAG problem though. 3. mul-constant-result.ll - The trade-off between sbb+not vs. setne+neg could be addressed if that's a regression, but those would always be nearly equivalent. 4. pr22338.ll and sext-i1.ll - These tests have undef operands, so we don't actually care about these diffs. 5. sbb.ll - This shows a win for what is likely a common case: choose -1 or 0. 6. select.ll - There's another borderline case here: cmp+sbb+or vs. test+set+lea? Also, sbb+not vs. setae+neg shows up again. 7. select_const.ll - These are motivating cases for the enhancement; replace cmov with cheaper ops. Assembly differences between movzbl and xor to avoid a partial reg stall are caused later by the X86 Fixup SetCC pass. Differential Revision: https://reviews.llvm.org/D35340 llvm-svn: 310717 2017-08-11 23:44:14 +08:00			`; CHECK-NEXT: cmpq {{.*}}(%rip), %rax`
[DAGCombiner] narrow truncated binops The motivating case for this is shown in: https://bugs.llvm.org/show_bug.cgi?id=32023 and the corresponding rot16.ll regression tests. Because x86 scalar shift amounts are i8 values, we can end up with trunc-binop-trunc sequences that don't get folded in IR. As the TODO comments suggest, there will be regressions if we extend this (for x86, we mostly seem to be missing LEA opportunities, but there are likely vector folds missing too). I think those should be considered existing bugs because this is the same transform that we do as an IR canonicalization in instcombine. We just need more tests to make those visible independent of this patch. Differential Revision: https://reviews.llvm.org/D54640 llvm-svn: 347917 2018-11-30 04:58:26 +08:00			`; CHECK-NEXT: sbbb %al, %al`
Recommit r343498 "[X86] Improve test instruction shrinking when the sign flag is used and the output of the and is truncated." This includes a fix to prevent i16 compares with i32/i64 ands from being shrunk if bit 15 of the and is set and the sign bit is used. Original commit message: Currently we skip looking through truncates if the sign flag is used. But that's overly restrictive. It's safe to look through the truncate as long as we ensure one of the 3 things when we shrink. Either the MSB of the mask at the shrunken size isn't set. If the mask bit is set then either the shrunk size needs to be equal to the compare size or the sign There are still missed opportunities to shrink a load and fold it in here. This will be fixed in a future patch. llvm-svn: 343539 2018-10-02 05:35:26 +08:00			`; CHECK-NEXT: testb $-106, %al`
[x86] use more shift or LEA for select-of-constants (2nd try) The previous rev (r310208) failed to account for overflow when subtracting the constants to see if they're suitable for shift/lea. This version add a check for that and more test were added in r310490. We can convert any select-of-constants to math ops: http://rise4fun.com/Alive/d7d For this patch, I'm enhancing an existing x86 transform that uses fake multiplies (they always become shl/lea) to avoid cmov or branching. The current code misses cases where we have a negative constant and a positive constant, so this is just trying to plug that hole. The DAGCombiner diff prevents us from hitting a terrible inefficiency: we can start with a select in IR, create a select DAG node, convert it into a sext, convert it back into a select, and then lower it to sext machine code. Some notes about the test diffs: 1. 2010-08-04-MaskedSignedCompare.ll - We were creating control flow that didn't exist in the IR. 2. memcmp.ll - Choose -1 or 1 is the case that got me looking at this again. We could avoid the push/pop in some cases if we used 'movzbl %al' instead of an xor on a different reg? That's a post-DAG problem though. 3. mul-constant-result.ll - The trade-off between sbb+not vs. setne+neg could be addressed if that's a regression, but those would always be nearly equivalent. 4. pr22338.ll and sext-i1.ll - These tests have undef operands, so we don't actually care about these diffs. 5. sbb.ll - This shows a win for what is likely a common case: choose -1 or 0. 6. select.ll - There's another borderline case here: cmp+sbb+or vs. test+set+lea? Also, sbb+not vs. setae+neg shows up again. 7. select_const.ll - These are motivating cases for the enhancement; replace cmov with cheaper ops. Assembly differences between movzbl and xor to avoid a partial reg stall are caused later by the X86 Fixup SetCC pass. Differential Revision: https://reviews.llvm.org/D35340 llvm-svn: 310717 2017-08-11 23:44:14 +08:00			`; CHECK-NEXT: jle .LBB0_1`
[CodeGen] Unify MBB reference format in both MIR and debug output As part of the unification of the debug format and the MIR format, print MBB references as '%bb.5'. The MIR printer prints the IR name of a MBB only for block definitions. * find . \( -name ".mir" -o -name ".cpp" -o -name ".h" -o -name ".ll" \) -type f -print0 \| xargs -0 sed -i '' -E 's/BB#" << ([a-zA-Z0-9_]+)->getNumber\(\)/" << printMBBReference(\1)/g' find . \( -name ".mir" -o -name ".cpp" -o -name ".h" -o -name ".ll" \) -type f -print0 \| xargs -0 sed -i '' -E 's/BB#" << ([a-zA-Z0-9_]+)\.getNumber\(\)/" << printMBBReference(\1)/g' * find . \( -name ".txt" -o -name ".s" -o -name ".mir" -o -name ".cpp" -o -name ".h" -o -name ".ll" \) -type f -print0 \| xargs -0 sed -i '' -E 's/BB#([0-9]+)/%bb.\1/g' * grep -nr 'BB#' and fix Differential Revision: https://reviews.llvm.org/D40422 llvm-svn: 319665 2017-12-05 01:18:51 +08:00			`; CHECK-NEXT: # %bb.2: # %if.then`
[x86] auto-generate checks; NFC llvm-svn: 296629 2017-03-01 22:46:59 +08:00			`; CHECK-NEXT: movl $1, {{.*}}(%rip)`
			`; CHECK-NEXT: movl $1, %esi`
[x86] use more shift or LEA for select-of-constants (2nd try) The previous rev (r310208) failed to account for overflow when subtracting the constants to see if they're suitable for shift/lea. This version add a check for that and more test were added in r310490. We can convert any select-of-constants to math ops: http://rise4fun.com/Alive/d7d For this patch, I'm enhancing an existing x86 transform that uses fake multiplies (they always become shl/lea) to avoid cmov or branching. The current code misses cases where we have a negative constant and a positive constant, so this is just trying to plug that hole. The DAGCombiner diff prevents us from hitting a terrible inefficiency: we can start with a select in IR, create a select DAG node, convert it into a sext, convert it back into a select, and then lower it to sext machine code. Some notes about the test diffs: 1. 2010-08-04-MaskedSignedCompare.ll - We were creating control flow that didn't exist in the IR. 2. memcmp.ll - Choose -1 or 1 is the case that got me looking at this again. We could avoid the push/pop in some cases if we used 'movzbl %al' instead of an xor on a different reg? That's a post-DAG problem though. 3. mul-constant-result.ll - The trade-off between sbb+not vs. setne+neg could be addressed if that's a regression, but those would always be nearly equivalent. 4. pr22338.ll and sext-i1.ll - These tests have undef operands, so we don't actually care about these diffs. 5. sbb.ll - This shows a win for what is likely a common case: choose -1 or 0. 6. select.ll - There's another borderline case here: cmp+sbb+or vs. test+set+lea? Also, sbb+not vs. setae+neg shows up again. 7. select_const.ll - These are motivating cases for the enhancement; replace cmov with cheaper ops. Assembly differences between movzbl and xor to avoid a partial reg stall are caused later by the X86 Fixup SetCC pass. Differential Revision: https://reviews.llvm.org/D35340 llvm-svn: 310717 2017-08-11 23:44:14 +08:00			`; CHECK-NEXT: jmp .LBB0_3`
			`; CHECK-NEXT: .LBB0_1: # %entry.if.end_crit_edge`
[x86] auto-generate checks; NFC llvm-svn: 296629 2017-03-01 22:46:59 +08:00			`; CHECK-NEXT: movl {{.*}}(%rip), %esi`
[x86] use more shift or LEA for select-of-constants (2nd try) The previous rev (r310208) failed to account for overflow when subtracting the constants to see if they're suitable for shift/lea. This version add a check for that and more test were added in r310490. We can convert any select-of-constants to math ops: http://rise4fun.com/Alive/d7d For this patch, I'm enhancing an existing x86 transform that uses fake multiplies (they always become shl/lea) to avoid cmov or branching. The current code misses cases where we have a negative constant and a positive constant, so this is just trying to plug that hole. The DAGCombiner diff prevents us from hitting a terrible inefficiency: we can start with a select in IR, create a select DAG node, convert it into a sext, convert it back into a select, and then lower it to sext machine code. Some notes about the test diffs: 1. 2010-08-04-MaskedSignedCompare.ll - We were creating control flow that didn't exist in the IR. 2. memcmp.ll - Choose -1 or 1 is the case that got me looking at this again. We could avoid the push/pop in some cases if we used 'movzbl %al' instead of an xor on a different reg? That's a post-DAG problem though. 3. mul-constant-result.ll - The trade-off between sbb+not vs. setne+neg could be addressed if that's a regression, but those would always be nearly equivalent. 4. pr22338.ll and sext-i1.ll - These tests have undef operands, so we don't actually care about these diffs. 5. sbb.ll - This shows a win for what is likely a common case: choose -1 or 0. 6. select.ll - There's another borderline case here: cmp+sbb+or vs. test+set+lea? Also, sbb+not vs. setae+neg shows up again. 7. select_const.ll - These are motivating cases for the enhancement; replace cmov with cheaper ops. Assembly differences between movzbl and xor to avoid a partial reg stall are caused later by the X86 Fixup SetCC pass. Differential Revision: https://reviews.llvm.org/D35340 llvm-svn: 310717 2017-08-11 23:44:14 +08:00			`; CHECK-NEXT: .LBB0_3: # %if.end`
[x86] auto-generate checks; NFC llvm-svn: 296629 2017-03-01 22:46:59 +08:00			`; CHECK-NEXT: pushq %rax`
			`; CHECK-NEXT: movl $.L.str, %edi`
			`; CHECK-NEXT: xorl %eax, %eax`
			`; CHECK-NEXT: callq printf`
			`; CHECK-NEXT: xorl %eax, %eax`
			`; CHECK-NEXT: popq %rcx`
			`; CHECK-NEXT: retq`
PR7814: Truncates cannot be ignored for signed comparisons. llvm-svn: 110268 2010-08-05 06:40:58 +08:00			`entry:`
[x86] auto-generate checks; NFC llvm-svn: 296629 2017-03-01 22:46:59 +08:00			`%tmp = load i64, i64* @g_16`
			`%not.lnot = icmp ne i64 %tmp, 0`
			`%conv = sext i1 %not.lnot to i64`
			`%and = and i64 %conv, 150`
			`%conv.i = trunc i64 %and to i8`
			`%cmp = icmp sgt i8 %conv.i, 0`
PR7814: Truncates cannot be ignored for signed comparisons. llvm-svn: 110268 2010-08-05 06:40:58 +08:00			`br i1 %cmp, label %if.then, label %entry.if.end_crit_edge`

[x86] auto-generate checks; NFC llvm-svn: 296629 2017-03-01 22:46:59 +08:00			`entry.if.end_crit_edge:`
			`%tmp4.pre = load i32, i32* @g_38`
PR7814: Truncates cannot be ignored for signed comparisons. llvm-svn: 110268 2010-08-05 06:40:58 +08:00			`br label %if.end`

[x86] auto-generate checks; NFC llvm-svn: 296629 2017-03-01 22:46:59 +08:00			`if.then:`
PR7814: Truncates cannot be ignored for signed comparisons. llvm-svn: 110268 2010-08-05 06:40:58 +08:00			`store i32 1, i32* @g_38`
			`br label %if.end`

[x86] auto-generate checks; NFC llvm-svn: 296629 2017-03-01 22:46:59 +08:00			`if.end:`
PR7814: Truncates cannot be ignored for signed comparisons. llvm-svn: 110268 2010-08-05 06:40:58 +08:00			`%tmp4 = phi i32 [ %tmp4.pre, %entry.if.end_crit_edge ], [ 1, %if.then ] ; <i32> [#uses=1]`
[opaque pointer type] Add textual IR support for explicit type parameter to the call instruction See r230786 and r230794 for similar changes to gep and load respectively. Call is a bit different because it often doesn't have a single explicit type - usually the type is deduced from the arguments, and just the return type is explicit. In those cases there's no need to change the IR. When that's not the case, the IR usually contains the pointer type of the first operand - but since typed pointers are going away, that representation is insufficient so I'm just stripping the "pointerness" of the explicit type away. This does make the IR a bit weird - it /sort of/ reads like the type of the first operand: "call void () %x(" but %x is actually of type "void ()" and will eventually be just of type "ptr". But this seems not too bad and I don't think it would benefit from repeating the type ("void (), void () %x(" and then eventually "void (), ptr %x(") as has been done with gep and load. This also has a side benefit: since the explicit type is no longer a pointer, there's no ambiguity between an explicit type and a function that returns a function pointer. Previously this case needed an explicit type (eg: a function returning a void() function was written as "call void () () * @x(" rather than "call void () * @x(" because of the ambiguity between a function returning a pointer to a void() function and a function returning void). No ambiguity means even function pointer return types can just be written alone, without writing the whole function's type. This leaves /only/ the varargs case where the explicit type is required. Given the special type syntax in call instructions, the regex-fu used for migration was a bit more involved in its own unique way (as every one of these is) so here it is. Use it in conjunction with the apply.sh script and associated find/xargs commands I've provided in rr230786 to migrate your out of tree tests. Do let me know if any of this doesn't cover your cases & we can iterate on a more general script/regexes to help others with out of tree tests. About 9 test cases couldn't be automatically migrated - half of those were functions returning function pointers, where I just had to manually delete the function argument types now that we didn't need an explicit function type there. The other half were typedefs of function types used in calls - just had to manually drop the * from those. import fileinput import sys import re pat = re.compile(r'((?:=\|:\|^\|\s)call\s(?:[^@]?))(\s$\|\s(?:(?:\[\[[a-zA-Z0-9_]+\]\]\|[@%](?:(")?[\\\?@a-zA-Z0-9_.]?(?(3)"\|)\|{{.}}))(?:\(\|$)\|undef\|inttoptr\|bitcast\|null\|asm).$)') addrspace_end = re.compile(r"addrspace\(\d+\)\s\$") func_end = re.compile("(?:void.\|\)\s)\$") def conv(match, line): if not match or re.search(addrspace_end, match.group(1)) or not re.search(func_end, match.group(1)): return line return line[:match.start()] + match.group(1)[:match.group(1).rfind('')].rstrip() + match.group(2) + line[match.end():] for line in sys.stdin: sys.stdout.write(conv(re.search(pat, line), line)) llvm-svn: 235145 2015-04-17 07:24:18 +08:00			`%call5 = tail call i32 (i8, ...) @printf(i8 getelementptr inbounds ([4 x i8], [4 x i8]* @.str, i64 0, i64 0), i32 %tmp4) nounwind ; <i32> [#uses=0]`
PR7814: Truncates cannot be ignored for signed comparisons. llvm-svn: 110268 2010-08-05 06:40:58 +08:00			`ret i32 0`
			`}`

			`declare i32 @printf(i8* nocapture, ...) nounwind`
[x86] auto-generate checks; NFC llvm-svn: 296629 2017-03-01 22:46:59 +08:00