llvm-project/lld/test/MachO/stabs.s

Ignoring revisions in .git-blame-ignore-revs. Click here to bypass and see the normal blame view.

230 lines
9.8 KiB
ArmAsm
Raw Normal View History

# REQUIRES: x86, shell
[lld-macho] Emit STABS symbols for debugging, and drop debug sections Debug sections contain a large amount of data. In order not to bloat the size of the final binary, we remove them and instead emit STABS symbols for `dsymutil` and the debugger to locate their contents in the object files. With this diff, `dsymutil` is able to locate the debug info. However, we need a few more features before `lldb` is able to work well with our binaries -- e.g. having `LC_DYSYMTAB` accurately reflect the number of local symbols, emitting `LC_UUID`, and more. Those will be handled in follow-up diffs. Note also that the STABS we emit differ slightly from what ld64 does. First, we emit the path to the source file as one `N_SO` symbol instead of two. (`ld64` emits one `N_SO` for the dirname and one of the basename.) Second, we do not emit `N_BNSYM` and `N_ENSYM` STABS to mark the start and end of functions, because the `N_FUN` STABS already serve that purpose. @clayborg recommended these changes based on his knowledge of what the debugging tools look for. Additionally, this current implementation doesn't accurately reflect the size of function symbols. It uses the size of their containing sectioins as a proxy, but that is only accurate if `.subsections_with_symbols` is set, and if there isn't an `N_ALT_ENTRY` in that particular subsection. I think we have two options to solve this: 1. We can split up subsections by symbol even if `.subsections_with_symbols` is not set, but include constraints to ensure those subsections retain their order in the final output. This is `ld64`'s approach. 2. We could just add a `size` field to our `Symbol` class. This seems simpler, and I'm more inclined toward it, but I'm not sure if there are use cases that it doesn't handle well. As such I'm punting on the decision for now. Reviewed By: clayborg Differential Revision: https://reviews.llvm.org/D89257
2020-12-02 06:45:01 +08:00
# UNSUPPORTED: system-windows
# RUN: rm -rf %t; split-file %s %t
[lld-macho] Emit STABS symbols for debugging, and drop debug sections Debug sections contain a large amount of data. In order not to bloat the size of the final binary, we remove them and instead emit STABS symbols for `dsymutil` and the debugger to locate their contents in the object files. With this diff, `dsymutil` is able to locate the debug info. However, we need a few more features before `lldb` is able to work well with our binaries -- e.g. having `LC_DYSYMTAB` accurately reflect the number of local symbols, emitting `LC_UUID`, and more. Those will be handled in follow-up diffs. Note also that the STABS we emit differ slightly from what ld64 does. First, we emit the path to the source file as one `N_SO` symbol instead of two. (`ld64` emits one `N_SO` for the dirname and one of the basename.) Second, we do not emit `N_BNSYM` and `N_ENSYM` STABS to mark the start and end of functions, because the `N_FUN` STABS already serve that purpose. @clayborg recommended these changes based on his knowledge of what the debugging tools look for. Additionally, this current implementation doesn't accurately reflect the size of function symbols. It uses the size of their containing sectioins as a proxy, but that is only accurate if `.subsections_with_symbols` is set, and if there isn't an `N_ALT_ENTRY` in that particular subsection. I think we have two options to solve this: 1. We can split up subsections by symbol even if `.subsections_with_symbols` is not set, but include constraints to ensure those subsections retain their order in the final output. This is `ld64`'s approach. 2. We could just add a `size` field to our `Symbol` class. This seems simpler, and I'm more inclined toward it, but I'm not sure if there are use cases that it doesn't handle well. As such I'm punting on the decision for now. Reviewed By: clayborg Differential Revision: https://reviews.llvm.org/D89257
2020-12-02 06:45:01 +08:00
# RUN: llvm-mc -filetype=obj -triple=x86_64-apple-darwin %t/test.s -o %t/test.o
# RUN: llvm-mc -filetype=obj -triple=x86_64-apple-darwin %t/foo.s -o %t/foo.o
# RUN: llvm-mc -filetype=obj -triple=x86_64-apple-darwin %t/no-debug.s -o %t/no-debug.o
## Set modtimes of the files for deterministic test output.
# RUN: env TZ=UTC touch -t "197001010000.16" %t/test.o
# RUN: env TZ=UTC touch -t "197001010000.32" %t/foo.o
# RUN: llvm-ar rcsU %t/foo.a %t/foo.o
[lld-macho] Emit STABS symbols for debugging, and drop debug sections Debug sections contain a large amount of data. In order not to bloat the size of the final binary, we remove them and instead emit STABS symbols for `dsymutil` and the debugger to locate their contents in the object files. With this diff, `dsymutil` is able to locate the debug info. However, we need a few more features before `lldb` is able to work well with our binaries -- e.g. having `LC_DYSYMTAB` accurately reflect the number of local symbols, emitting `LC_UUID`, and more. Those will be handled in follow-up diffs. Note also that the STABS we emit differ slightly from what ld64 does. First, we emit the path to the source file as one `N_SO` symbol instead of two. (`ld64` emits one `N_SO` for the dirname and one of the basename.) Second, we do not emit `N_BNSYM` and `N_ENSYM` STABS to mark the start and end of functions, because the `N_FUN` STABS already serve that purpose. @clayborg recommended these changes based on his knowledge of what the debugging tools look for. Additionally, this current implementation doesn't accurately reflect the size of function symbols. It uses the size of their containing sectioins as a proxy, but that is only accurate if `.subsections_with_symbols` is set, and if there isn't an `N_ALT_ENTRY` in that particular subsection. I think we have two options to solve this: 1. We can split up subsections by symbol even if `.subsections_with_symbols` is not set, but include constraints to ensure those subsections retain their order in the final output. This is `ld64`'s approach. 2. We could just add a `size` field to our `Symbol` class. This seems simpler, and I'm more inclined toward it, but I'm not sure if there are use cases that it doesn't handle well. As such I'm punting on the decision for now. Reviewed By: clayborg Differential Revision: https://reviews.llvm.org/D89257
2020-12-02 06:45:01 +08:00
# RUN: %lld -lSystem %t/test.o %t/foo.o %t/no-debug.o -o %t/test
# RUN: (llvm-objdump --section-headers %t/test; dsymutil -s %t/test) | \
# RUN: FileCheck %s -DDIR=%t -DFOO_PATH=%t/foo.o
## Check that we emit the right modtime even when the object file is in an
## archive.
# RUN: %lld -lSystem %t/test.o %t/foo.a %t/no-debug.o -o %t/test
# RUN: (llvm-objdump --section-headers %t/test; dsymutil -s %t/test) | \
# RUN: FileCheck %s -DDIR=%t -DFOO_PATH=%t/foo.a\(foo.o\)
[lld-macho] Emit STABS symbols for debugging, and drop debug sections Debug sections contain a large amount of data. In order not to bloat the size of the final binary, we remove them and instead emit STABS symbols for `dsymutil` and the debugger to locate their contents in the object files. With this diff, `dsymutil` is able to locate the debug info. However, we need a few more features before `lldb` is able to work well with our binaries -- e.g. having `LC_DYSYMTAB` accurately reflect the number of local symbols, emitting `LC_UUID`, and more. Those will be handled in follow-up diffs. Note also that the STABS we emit differ slightly from what ld64 does. First, we emit the path to the source file as one `N_SO` symbol instead of two. (`ld64` emits one `N_SO` for the dirname and one of the basename.) Second, we do not emit `N_BNSYM` and `N_ENSYM` STABS to mark the start and end of functions, because the `N_FUN` STABS already serve that purpose. @clayborg recommended these changes based on his knowledge of what the debugging tools look for. Additionally, this current implementation doesn't accurately reflect the size of function symbols. It uses the size of their containing sectioins as a proxy, but that is only accurate if `.subsections_with_symbols` is set, and if there isn't an `N_ALT_ENTRY` in that particular subsection. I think we have two options to solve this: 1. We can split up subsections by symbol even if `.subsections_with_symbols` is not set, but include constraints to ensure those subsections retain their order in the final output. This is `ld64`'s approach. 2. We could just add a `size` field to our `Symbol` class. This seems simpler, and I'm more inclined toward it, but I'm not sure if there are use cases that it doesn't handle well. As such I'm punting on the decision for now. Reviewed By: clayborg Differential Revision: https://reviews.llvm.org/D89257
2020-12-02 06:45:01 +08:00
## Check that we emit absolute paths to the object files in our OSO entries
## even if our inputs are relative paths.
# RUN: cd %t && %lld -lSystem test.o foo.o no-debug.o -o test
# RUN: (llvm-objdump --section-headers %t/test; dsymutil -s %t/test) | \
# RUN: FileCheck %s -DDIR=%t -DFOO_PATH=%t/foo.o
# RUN: cd %t && %lld -lSystem test.o foo.a no-debug.o -o %t/test
# RUN: (llvm-objdump --section-headers %t/test; dsymutil -s %t/test) | \
# RUN: FileCheck %s -DDIR=%t -DFOO_PATH=%t/foo.a\(foo.o\)
# CHECK: Sections:
# CHECK-NEXT: Idx Name
# CHECK-NEXT: [[#TEXT_ID:]] __text
# CHECK-NEXT: [[#DATA_ID:]] __data
# CHECK-NEXT: [[#MORE_DATA_ID:]] more_data
# CHECK-NEXT: [[#COMM_ID:]] __common
# CHECK-NEXT: [[#MORE_TEXT_ID:]] more_text
# CHECK: (N_SO ) 00 0000 0000000000000000 '/tmp/test.cpp'
# CHECK-NEXT: (N_OSO ) 03 0001 0000000000000010 '[[DIR]]/test.o'
# CHECK-NEXT: (N_STSYM ) [[#%.2d,MORE_DATA_ID + 1]] 0000 [[#%.16x,STATIC:]] '_static_var'
# CHECK-NEXT: (N_FUN ) [[#%.2d,TEXT_ID + 1]] 0000 [[#%.16x,MAIN:]] '_main'
# CHECK-NEXT: (N_FUN ) 00 0000 0000000000000006{{$}}
# CHECK-NEXT: (N_FUN ) [[#%.2d,TEXT_ID + 1]] 0000 [[#%.16x,BAR:]] '_bar'
# CHECK-NEXT: (N_FUN ) 00 0000 0000000000000000{{$}}
# CHECK-NEXT: (N_FUN ) [[#%.2d,TEXT_ID + 1]] 0000 [[#%.16x,BAR2:]] '_bar2'
# CHECK-NEXT: (N_FUN ) 00 0000 0000000000000001{{$}}
# CHECK-NEXT: (N_FUN ) [[#%.2d,TEXT_ID + 1]] 0000 [[#%.16x,BAZ:]] '_baz'
# CHECK-NEXT: (N_FUN ) 00 0000 0000000000000000{{$}}
# CHECK-NEXT: (N_FUN ) [[#%.2d,TEXT_ID + 1]] 0000 [[#%.16x,BAZ2:]] '_baz2'
# CHECK-NEXT: (N_FUN ) 00 0000 0000000000000002{{$}}
# CHECK-NEXT: (N_FUN ) [[#%.2d,TEXT_ID + 1]] 0000 [[#%.16x,QUX:]] '_qux'
# CHECK-NEXT: (N_FUN ) 00 0000 0000000000000003{{$}}
# CHECK-NEXT: (N_FUN ) [[#%.2d,TEXT_ID + 1]] 0000 [[#%.16x,QUUX:]] '_quux'
# CHECK-NEXT: (N_FUN ) 00 0000 0000000000000004{{$}}
# CHECK-NEXT: (N_GSYM ) [[#%.2d,DATA_ID + 1]] 0000 [[#%.16x,GLOB:]] '_global_var'
# CHECK-NEXT: (N_GSYM ) [[#%.2d,COMM_ID + 1]] 0000 [[#%.16x,ZERO:]] '_zero'
# CHECK-NEXT: (N_FUN ) [[#%.2d,MORE_TEXT_ID + 1]] 0000 [[#%.16x,FUN:]] '_fun'
# CHECK-NEXT: (N_FUN ) 00 0000 0000000000000001{{$}}
# CHECK-NEXT: (N_SO ) 01 0000 0000000000000000{{$}}
# CHECK-NEXT: (N_SO ) 00 0000 0000000000000000 '/foo.cpp'
# CHECK-NEXT: (N_OSO ) 03 0001 0000000000000020 '[[FOO_PATH]]'
# CHECK-NEXT: (N_FUN ) [[#%.2d,TEXT_ID + 1]] 0000 [[#%.16x,FOO:]] '_foo'
# CHECK-NEXT: (N_FUN ) 00 0000 0000000000000001{{$}}
# CHECK-NEXT: (N_SO ) 01 0000 0000000000000000{{$}}
# CHECK-DAG: ( SECT ) [[#%.2d,MORE_DATA_ID + 1]] 0000 [[#STATIC]] '_static_var'
# CHECK-DAG: ( SECT EXT) [[#%.2d,TEXT_ID + 1]] 0000 [[#MAIN]] '_main'
# CHECK-DAG: ( ABS EXT) 00 0000 {{[0-9af]+}} '_abs'
# CHECK-DAG: ( SECT EXT) [[#%.2d,TEXT_ID + 1]] 0000 [[#FOO]] '_foo'
# CHECK-DAG: ( SECT EXT) [[#%.2d,TEXT_ID + 1]] 0000 [[#BAR]] '_bar'
# CHECK-DAG: ( SECT EXT) [[#%.2d,TEXT_ID + 1]] 0000 [[#BAR2]] '_bar2'
# CHECK-DAG: ( SECT EXT) [[#%.2d,TEXT_ID + 1]] 0000 [[#BAZ]] '_baz'
# CHECK-DAG: ( SECT EXT) [[#%.2d,TEXT_ID + 1]] 0000 [[#BAZ2]] '_baz2'
# CHECK-DAG: ( SECT EXT) [[#%.2d,TEXT_ID + 1]] 0000 [[#QUX]] '_qux'
# CHECK-DAG: ( SECT EXT) [[#%.2d,TEXT_ID + 1]] 0000 [[#QUUX]] '_quux'
# CHECK-DAG: ( SECT EXT) [[#%.2d,DATA_ID + 1]] 0000 [[#GLOB]] '_global_var'
# CHECK-DAG: ( SECT EXT) [[#%.2d,COMM_ID + 1]] 0000 [[#ZERO]] '_zero'
# CHECK-DAG: ( SECT EXT) [[#%.2d,MORE_TEXT_ID + 1]] 0000 [[#FUN]] '_fun'
# CHECK-DAG: ( SECT EXT) [[#%.2d,TEXT_ID + 1]] 0000 {{[0-9a-f]+}} '_no_debug'
# CHECK-DAG: ( {{.*}}) {{[0-9]+}} 0000 {{[0-9a-f]+}} '__mh_execute_header'
# CHECK-EMPTY:
[lld-macho] Emit STABS symbols for debugging, and drop debug sections Debug sections contain a large amount of data. In order not to bloat the size of the final binary, we remove them and instead emit STABS symbols for `dsymutil` and the debugger to locate their contents in the object files. With this diff, `dsymutil` is able to locate the debug info. However, we need a few more features before `lldb` is able to work well with our binaries -- e.g. having `LC_DYSYMTAB` accurately reflect the number of local symbols, emitting `LC_UUID`, and more. Those will be handled in follow-up diffs. Note also that the STABS we emit differ slightly from what ld64 does. First, we emit the path to the source file as one `N_SO` symbol instead of two. (`ld64` emits one `N_SO` for the dirname and one of the basename.) Second, we do not emit `N_BNSYM` and `N_ENSYM` STABS to mark the start and end of functions, because the `N_FUN` STABS already serve that purpose. @clayborg recommended these changes based on his knowledge of what the debugging tools look for. Additionally, this current implementation doesn't accurately reflect the size of function symbols. It uses the size of their containing sectioins as a proxy, but that is only accurate if `.subsections_with_symbols` is set, and if there isn't an `N_ALT_ENTRY` in that particular subsection. I think we have two options to solve this: 1. We can split up subsections by symbol even if `.subsections_with_symbols` is not set, but include constraints to ensure those subsections retain their order in the final output. This is `ld64`'s approach. 2. We could just add a `size` field to our `Symbol` class. This seems simpler, and I'm more inclined toward it, but I'm not sure if there are use cases that it doesn't handle well. As such I'm punting on the decision for now. Reviewed By: clayborg Differential Revision: https://reviews.llvm.org/D89257
2020-12-02 06:45:01 +08:00
## Check that we don't attempt to emit rebase opcodes for the debug sections
## when building a PIE (since we have filtered the sections out).
# RUN: %lld -lSystem -pie %t/test.o %t/foo.a %t/no-debug.o -o %t/test
# RUN: llvm-objdump --macho --rebase %t/test | FileCheck %s --check-prefix=PIE
# PIE: Rebase table:
# PIE-NEXT: segment section address type
# PIE-EMPTY:
[lld-macho] Emit STABS symbols for debugging, and drop debug sections Debug sections contain a large amount of data. In order not to bloat the size of the final binary, we remove them and instead emit STABS symbols for `dsymutil` and the debugger to locate their contents in the object files. With this diff, `dsymutil` is able to locate the debug info. However, we need a few more features before `lldb` is able to work well with our binaries -- e.g. having `LC_DYSYMTAB` accurately reflect the number of local symbols, emitting `LC_UUID`, and more. Those will be handled in follow-up diffs. Note also that the STABS we emit differ slightly from what ld64 does. First, we emit the path to the source file as one `N_SO` symbol instead of two. (`ld64` emits one `N_SO` for the dirname and one of the basename.) Second, we do not emit `N_BNSYM` and `N_ENSYM` STABS to mark the start and end of functions, because the `N_FUN` STABS already serve that purpose. @clayborg recommended these changes based on his knowledge of what the debugging tools look for. Additionally, this current implementation doesn't accurately reflect the size of function symbols. It uses the size of their containing sectioins as a proxy, but that is only accurate if `.subsections_with_symbols` is set, and if there isn't an `N_ALT_ENTRY` in that particular subsection. I think we have two options to solve this: 1. We can split up subsections by symbol even if `.subsections_with_symbols` is not set, but include constraints to ensure those subsections retain their order in the final output. This is `ld64`'s approach. 2. We could just add a `size` field to our `Symbol` class. This seems simpler, and I'm more inclined toward it, but I'm not sure if there are use cases that it doesn't handle well. As such I'm punting on the decision for now. Reviewed By: clayborg Differential Revision: https://reviews.llvm.org/D89257
2020-12-02 06:45:01 +08:00
#--- test.s
## Make sure we don't create STABS entries for absolute symbols.
.globl _abs
_abs = 0x123
.section __DATA, __data
.globl _global_var
_global_var:
.quad 123
.section __DATA, more_data
_static_var:
.quad 123
.globl _zero
.zerofill __DATA,__common,_zero,4,2
[lld-macho] Emit STABS symbols for debugging, and drop debug sections Debug sections contain a large amount of data. In order not to bloat the size of the final binary, we remove them and instead emit STABS symbols for `dsymutil` and the debugger to locate their contents in the object files. With this diff, `dsymutil` is able to locate the debug info. However, we need a few more features before `lldb` is able to work well with our binaries -- e.g. having `LC_DYSYMTAB` accurately reflect the number of local symbols, emitting `LC_UUID`, and more. Those will be handled in follow-up diffs. Note also that the STABS we emit differ slightly from what ld64 does. First, we emit the path to the source file as one `N_SO` symbol instead of two. (`ld64` emits one `N_SO` for the dirname and one of the basename.) Second, we do not emit `N_BNSYM` and `N_ENSYM` STABS to mark the start and end of functions, because the `N_FUN` STABS already serve that purpose. @clayborg recommended these changes based on his knowledge of what the debugging tools look for. Additionally, this current implementation doesn't accurately reflect the size of function symbols. It uses the size of their containing sectioins as a proxy, but that is only accurate if `.subsections_with_symbols` is set, and if there isn't an `N_ALT_ENTRY` in that particular subsection. I think we have two options to solve this: 1. We can split up subsections by symbol even if `.subsections_with_symbols` is not set, but include constraints to ensure those subsections retain their order in the final output. This is `ld64`'s approach. 2. We could just add a `size` field to our `Symbol` class. This seems simpler, and I'm more inclined toward it, but I'm not sure if there are use cases that it doesn't handle well. As such I'm punting on the decision for now. Reviewed By: clayborg Differential Revision: https://reviews.llvm.org/D89257
2020-12-02 06:45:01 +08:00
.text
.globl _main, _bar, _bar2, _baz, _baz2, _qux, _quux
.alt_entry _baz
.alt_entry _qux
_bar:
_bar2:
.space 1
_baz:
_baz2:
.space 2
[lld-macho] Emit STABS symbols for debugging, and drop debug sections Debug sections contain a large amount of data. In order not to bloat the size of the final binary, we remove them and instead emit STABS symbols for `dsymutil` and the debugger to locate their contents in the object files. With this diff, `dsymutil` is able to locate the debug info. However, we need a few more features before `lldb` is able to work well with our binaries -- e.g. having `LC_DYSYMTAB` accurately reflect the number of local symbols, emitting `LC_UUID`, and more. Those will be handled in follow-up diffs. Note also that the STABS we emit differ slightly from what ld64 does. First, we emit the path to the source file as one `N_SO` symbol instead of two. (`ld64` emits one `N_SO` for the dirname and one of the basename.) Second, we do not emit `N_BNSYM` and `N_ENSYM` STABS to mark the start and end of functions, because the `N_FUN` STABS already serve that purpose. @clayborg recommended these changes based on his knowledge of what the debugging tools look for. Additionally, this current implementation doesn't accurately reflect the size of function symbols. It uses the size of their containing sectioins as a proxy, but that is only accurate if `.subsections_with_symbols` is set, and if there isn't an `N_ALT_ENTRY` in that particular subsection. I think we have two options to solve this: 1. We can split up subsections by symbol even if `.subsections_with_symbols` is not set, but include constraints to ensure those subsections retain their order in the final output. This is `ld64`'s approach. 2. We could just add a `size` field to our `Symbol` class. This seems simpler, and I'm more inclined toward it, but I'm not sure if there are use cases that it doesn't handle well. As such I'm punting on the decision for now. Reviewed By: clayborg Differential Revision: https://reviews.llvm.org/D89257
2020-12-02 06:45:01 +08:00
_main:
Lfunc_begin0:
callq _foo
[lld-macho] Emit STABS symbols for debugging, and drop debug sections Debug sections contain a large amount of data. In order not to bloat the size of the final binary, we remove them and instead emit STABS symbols for `dsymutil` and the debugger to locate their contents in the object files. With this diff, `dsymutil` is able to locate the debug info. However, we need a few more features before `lldb` is able to work well with our binaries -- e.g. having `LC_DYSYMTAB` accurately reflect the number of local symbols, emitting `LC_UUID`, and more. Those will be handled in follow-up diffs. Note also that the STABS we emit differ slightly from what ld64 does. First, we emit the path to the source file as one `N_SO` symbol instead of two. (`ld64` emits one `N_SO` for the dirname and one of the basename.) Second, we do not emit `N_BNSYM` and `N_ENSYM` STABS to mark the start and end of functions, because the `N_FUN` STABS already serve that purpose. @clayborg recommended these changes based on his knowledge of what the debugging tools look for. Additionally, this current implementation doesn't accurately reflect the size of function symbols. It uses the size of their containing sectioins as a proxy, but that is only accurate if `.subsections_with_symbols` is set, and if there isn't an `N_ALT_ENTRY` in that particular subsection. I think we have two options to solve this: 1. We can split up subsections by symbol even if `.subsections_with_symbols` is not set, but include constraints to ensure those subsections retain their order in the final output. This is `ld64`'s approach. 2. We could just add a `size` field to our `Symbol` class. This seems simpler, and I'm more inclined toward it, but I'm not sure if there are use cases that it doesn't handle well. As such I'm punting on the decision for now. Reviewed By: clayborg Differential Revision: https://reviews.llvm.org/D89257
2020-12-02 06:45:01 +08:00
retq
Lfunc_end0:
_qux:
.space 3
_quux:
.space 4
[lld-macho] Emit STABS symbols for debugging, and drop debug sections Debug sections contain a large amount of data. In order not to bloat the size of the final binary, we remove them and instead emit STABS symbols for `dsymutil` and the debugger to locate their contents in the object files. With this diff, `dsymutil` is able to locate the debug info. However, we need a few more features before `lldb` is able to work well with our binaries -- e.g. having `LC_DYSYMTAB` accurately reflect the number of local symbols, emitting `LC_UUID`, and more. Those will be handled in follow-up diffs. Note also that the STABS we emit differ slightly from what ld64 does. First, we emit the path to the source file as one `N_SO` symbol instead of two. (`ld64` emits one `N_SO` for the dirname and one of the basename.) Second, we do not emit `N_BNSYM` and `N_ENSYM` STABS to mark the start and end of functions, because the `N_FUN` STABS already serve that purpose. @clayborg recommended these changes based on his knowledge of what the debugging tools look for. Additionally, this current implementation doesn't accurately reflect the size of function symbols. It uses the size of their containing sectioins as a proxy, but that is only accurate if `.subsections_with_symbols` is set, and if there isn't an `N_ALT_ENTRY` in that particular subsection. I think we have two options to solve this: 1. We can split up subsections by symbol even if `.subsections_with_symbols` is not set, but include constraints to ensure those subsections retain their order in the final output. This is `ld64`'s approach. 2. We could just add a `size` field to our `Symbol` class. This seems simpler, and I'm more inclined toward it, but I'm not sure if there are use cases that it doesn't handle well. As such I'm punting on the decision for now. Reviewed By: clayborg Differential Revision: https://reviews.llvm.org/D89257
2020-12-02 06:45:01 +08:00
.section __DWARF,__debug_str,regular,debug
.asciz "test.cpp" ## string offset=0
.asciz "/tmp" ## string offset=9
.section __DWARF,__debug_abbrev,regular,debug
Lsection_abbrev:
.byte 1 ## Abbreviation Code
.byte 17 ## DW_TAG_compile_unit
.byte 1 ## DW_CHILDREN_yes
.byte 3 ## DW_AT_name
.byte 14 ## DW_FORM_strp
.byte 27 ## DW_AT_comp_dir
.byte 14 ## DW_FORM_strp
.byte 17 ## DW_AT_low_pc
.byte 1 ## DW_FORM_addr
.byte 18 ## DW_AT_high_pc
.byte 6 ## DW_FORM_data4
.byte 0 ## EOM(1)
.section __DWARF,__debug_info,regular,debug
.set Lset0, Ldebug_info_end0-Ldebug_info_start0 ## Length of Unit
.long Lset0
Ldebug_info_start0:
.short 4 ## DWARF version number
.set Lset1, Lsection_abbrev-Lsection_abbrev ## Offset Into Abbrev. Section
.long Lset1
.byte 8 ## Address Size (in bytes)
.byte 1 ## Abbrev [1] 0xb:0x48 DW_TAG_compile_unit
.long 0 ## DW_AT_name
.long 9 ## DW_AT_comp_dir
.quad Lfunc_begin0 ## DW_AT_low_pc
.set Lset3, Lfunc_end0-Lfunc_begin0 ## DW_AT_high_pc
.long Lset3
.byte 0 ## End Of Children Mark
Ldebug_info_end0:
.subsections_via_symbols
.section __DWARF,__debug_line,regular,debug
.section OTHER,more_text,regular,pure_instructions
.globl _fun
_fun:
ret
[lld-macho] Emit STABS symbols for debugging, and drop debug sections Debug sections contain a large amount of data. In order not to bloat the size of the final binary, we remove them and instead emit STABS symbols for `dsymutil` and the debugger to locate their contents in the object files. With this diff, `dsymutil` is able to locate the debug info. However, we need a few more features before `lldb` is able to work well with our binaries -- e.g. having `LC_DYSYMTAB` accurately reflect the number of local symbols, emitting `LC_UUID`, and more. Those will be handled in follow-up diffs. Note also that the STABS we emit differ slightly from what ld64 does. First, we emit the path to the source file as one `N_SO` symbol instead of two. (`ld64` emits one `N_SO` for the dirname and one of the basename.) Second, we do not emit `N_BNSYM` and `N_ENSYM` STABS to mark the start and end of functions, because the `N_FUN` STABS already serve that purpose. @clayborg recommended these changes based on his knowledge of what the debugging tools look for. Additionally, this current implementation doesn't accurately reflect the size of function symbols. It uses the size of their containing sectioins as a proxy, but that is only accurate if `.subsections_with_symbols` is set, and if there isn't an `N_ALT_ENTRY` in that particular subsection. I think we have two options to solve this: 1. We can split up subsections by symbol even if `.subsections_with_symbols` is not set, but include constraints to ensure those subsections retain their order in the final output. This is `ld64`'s approach. 2. We could just add a `size` field to our `Symbol` class. This seems simpler, and I'm more inclined toward it, but I'm not sure if there are use cases that it doesn't handle well. As such I'm punting on the decision for now. Reviewed By: clayborg Differential Revision: https://reviews.llvm.org/D89257
2020-12-02 06:45:01 +08:00
#--- foo.s
.text
.globl _foo
_foo:
Lfunc_begin0:
retq
Lfunc_end0:
.section __DWARF,__debug_str,regular,debug
.asciz "foo.cpp" ## string offset=0
.asciz "" ## string offset=8
.section __DWARF,__debug_abbrev,regular,debug
Lsection_abbrev:
.byte 1 ## Abbreviation Code
.byte 17 ## DW_TAG_compile_unit
.byte 1 ## DW_CHILDREN_yes
.byte 3 ## DW_AT_name
.byte 14 ## DW_FORM_strp
.byte 27 ## DW_AT_comp_dir
.byte 14 ## DW_FORM_strp
.byte 17 ## DW_AT_low_pc
.byte 1 ## DW_FORM_addr
.byte 18 ## DW_AT_high_pc
.byte 6 ## DW_FORM_data4
.byte 0 ## EOM(1)
.section __DWARF,__debug_info,regular,debug
.set Lset0, Ldebug_info_end0-Ldebug_info_start0 ## Length of Unit
.long Lset0
Ldebug_info_start0:
.short 4 ## DWARF version number
.set Lset1, Lsection_abbrev-Lsection_abbrev ## Offset Into Abbrev. Section
.long Lset1
.byte 8 ## Address Size (in bytes)
.byte 1 ## Abbrev [1] 0xb:0x48 DW_TAG_compile_unit
.long 0 ## DW_AT_name
.long 8 ## DW_AT_comp_dir
.quad Lfunc_begin0 ## DW_AT_low_pc
.set Lset3, Lfunc_end0-Lfunc_begin0 ## DW_AT_high_pc
.long Lset3
.byte 0 ## End Of Children Mark
Ldebug_info_end0:
.subsections_via_symbols
.section __DWARF,__debug_line,regular,debug
.section __DWARF,__debug_aranges,regular,debug
ltmp1:
.byte 0
#--- no-debug.s
## This file has no debug info.
.text
.globl _no_debug
_no_debug:
ret