llvm-project/llvm/test/MC/X86/align-branch-64-3a.s

Align branches within 32-Byte boundary (NOP padding)

WARNING: If you're looking at this patch because you're looking for a full performance mitigation of the Intel JCC Erratum, this is not it!

This is a preliminary patch on the path towards mitigating the performance regressions caused by Intel's microcode update for Jump Conditional Code Erratum. For context, see: https://www.intel.com/content/www/us/en/support/articles/000055650.html

The patch adds the required assembler infrastructure and command line options needed to exercise the logic for INTERNAL TESTING. These are NOT public flags, and should not be used for anything other than LLVM's own testing/debugging purposes. They are likely to change both in spelling and meaning.

WARNING: This patch is knowingly incorrect in some corner cases. We need, and do not yet provide, a mechanism to selectively enable/disable the padding. Conversation on this will continue in parallel with work on extending this infrastructure to support prefix padding.

The goal here is to have the assembler align specific instructions such that they neither cross nor end at a 32 byte boundary. The impacted instructions are:
a. Conditional jump.
b. Fused conditional jump.
c. Unconditional jump.
d. Indirect jump.
e. Ret.
f. Call.

The new options for llvm-mc are:
-x86-align-branch-boundary=NUM aligns branches within NUM byte boundary.
-x86-align-branch=TYPE[+TYPE...] specifies types of branches to align.

A new MCFragment type, MCBoundaryAlignFragment, is added, which may emit NOP to align the fused/unfused branch. alignBranchesBegin inserts MCBoundaryAlignFragment before instructions, alignBranchesEnd marks the end of the branch to be aligned, and relaxBoundaryAlign grows or shrinks sizes of NOP to align the target branch.

Nop padding is disabled when the instruction may be rewritten by the linker, such as TLS Call.

Process Note: I am landing a patch by skan as it has been LGTMed, and continuing to iterate on the review is simply slowing us down at this point. We can and will continue to iterate in tree.

Patch By: skan
Differential Revision: https://reviews.llvm.org/D70157
2019-12-21 02:51:05 +08:00
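
The alignment goal stated in the commit message (a branch should neither cross nor end at a 32 byte boundary) can be sketched in a few lines of Python. This is an illustration only, not LLVM's implementation: the function names `crosses_boundary`, `ends_at_boundary` and `nop_padding` are hypothetical, and the sketch assumes every instruction is no longer than the boundary.

```python
# Illustrative sketch only; these helper names are hypothetical, not LLVM APIs.
# Offsets and sizes are in bytes. Assumes 0 < size <= boundary.

def crosses_boundary(offset, size, boundary=32):
    """True if the first and last byte of the instruction fall in different windows."""
    return offset // boundary != (offset + size - 1) // boundary


def ends_at_boundary(offset, size, boundary=32):
    """True if the byte just past the instruction lies exactly on a boundary."""
    return (offset + size) % boundary == 0


def nop_padding(offset, size, boundary=32):
    """Number of NOP bytes to place before the instruction (0 if none are needed)."""
    if crosses_boundary(offset, size, boundary) or ends_at_boundary(offset, size, boundary):
        # Move the instruction to the start of the next window.
        return (-offset) % boundary
    return 0


if __name__ == "__main__":
    print(nop_padding(0x1E, 5))  # 2: a 5-byte call at 0x1e would cross 0x20
    print(nop_padding(0x1A, 6))  # 6: a 6-byte branch at 0x1a would end at 0x20
    print(nop_padding(0x10, 5))  # 0: already fits inside one window
```

Under this rule, padding is only ever inserted in front of the branch, so the branch moves to the start of the next window and everything after it shifts by the same amount.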
# Check that NOP padding is disabled before an instruction that has a variant symbol operand.
# RUN: llvm-mc -filetype=obj -triple x86_64-unknown-unknown --x86-align-branch-boundary=32 --x86-align-branch=jmp+call %s | llvm-objdump -d - | FileCheck %s
# CHECK: 0000000000000000 foo:
# CHECK-COUNT-3: : 64 89 04 25 01 00 00 00 movl %eax, %fs:1
# CHECK-COUNT-2: : 48 89 e5 movq %rsp, %rbp
# CHECK: 1e: e8 00 00 00 00 callq {{.*}}
# CHECK-COUNT-3: : 64 89 04 25 01 00 00 00 movl %eax, %fs:1
# CHECK: 3b: 55 pushq %rbp
# CHECK-NEXT: 3c: 89 75 f4 movl %esi, -12(%rbp)
# CHECK-NEXT: 3f: ff 15 00 00 00 00 callq *(%rip)
# CHECK-COUNT-3: : 64 89 04 25 01 00 00 00 movl %eax, %fs:1
# CHECK: 5d: ff 15 00 00 00 00 callq *(%rip)
# CHECK-COUNT-3: : 64 89 04 25 01 00 00 00 movl %eax, %fs:1
# CHECK: 7b: ff 25 00 00 00 00 jmpq *(%rip)
.text
.globl foo
.p2align 4
foo:
.rept 3
movl %eax, %fs:0x1
.endr
.rept 2
movq %rsp, %rbp
.endr
call __tls_get_addr@PLT
.rept 3
movl %eax, %fs:0x1
.endr
pushq %rbp
movl %esi, -12(%rbp)
call *__tls_get_addr@GOTPCREL(%rip)
.rept 3
movl %eax, %fs:0x1
.endr
call *foo@GOTPCREL(%rip)
.rept 3
movl %eax, %fs:0x1
.endr
jmp *foo@GOTPCREL(%rip)
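
The offsets in the CHECK lines can be cross-checked with a small Python sketch. It lays the instructions out back to back with no padding, using the encoded lengths visible in the CHECK lines (8, 3, 5, 1, 3 and 6 bytes), and reports where each call or jump lands. The names `LAYOUT`, `needs_padding` and `branch_offsets` are hypothetical and exist only for this illustration. Every branch comes out at the expected address and straddles a 32-byte boundary, which is what shows that padding was suppressed for these variant-symbol operands.

```python
# Illustrative sketch only; names here are hypothetical, not part of LLVM.
BOUNDARY = 32

# (label, encoded size in bytes, is_branch), in the order the test emits them.
MOVL_FS = ("movl %eax, %fs:1", 8, False)
LAYOUT = (
    [MOVL_FS] * 3
    + [("movq %rsp, %rbp", 3, False)] * 2
    + [("call __tls_get_addr@PLT", 5, True)]
    + [MOVL_FS] * 3
    + [("pushq %rbp", 1, False), ("movl %esi, -12(%rbp)", 3, False)]
    + [("call *__tls_get_addr@GOTPCREL(%rip)", 6, True)]
    + [MOVL_FS] * 3
    + [("call *foo@GOTPCREL(%rip)", 6, True)]
    + [MOVL_FS] * 3
    + [("jmp *foo@GOTPCREL(%rip)", 6, True)]
)


def needs_padding(offset, size, boundary=BOUNDARY):
    """True if the instruction crosses a boundary or ends exactly on one."""
    return offset // boundary != (offset + size) // boundary


def branch_offsets(layout):
    """Return (offset, label, would_need_padding) for each branch, with no padding applied."""
    offset, result = 0, []
    for label, size, is_branch in layout:
        if is_branch:
            result.append((offset, label, needs_padding(offset, size)))
        offset += size
    return result


if __name__ == "__main__":
    for off, label, flagged in branch_offsets(LAYOUT):
        print(f"{off:#04x}  {label:40s} would need padding: {flagged}")
```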