# REQUIRES: x86, shell
[lld-macho] Emit STABS symbols for debugging, and drop debug sections
Debug sections contain a large amount of data. In order not to bloat the size
of the final binary, we remove them and instead emit STABS symbols for
`dsymutil` and the debugger to locate their contents in the object files.
With this diff, `dsymutil` is able to locate the debug info. However, we need
a few more features before `lldb` is able to work well with our binaries --
e.g. having `LC_DYSYMTAB` accurately reflect the number of local symbols,
emitting `LC_UUID`, and more. Those will be handled in follow-up diffs.
Note also that the STABS we emit differ slightly from what `ld64` does. First, we
emit the path to the source file as one `N_SO` symbol instead of two. (`ld64`
emits one `N_SO` for the dirname and one for the basename.) Second, we do not
emit `N_BNSYM` and `N_ENSYM` STABS to mark the start and end of functions,
because the `N_FUN` STABS already serve that purpose. @clayborg recommended
these changes based on his knowledge of what the debugging tools look for.
Additionally, the current implementation doesn't accurately reflect the size
of function symbols. It uses the size of their containing sections as a proxy,
but that is only accurate if `.subsections_via_symbols` is set, and if there
isn't an `N_ALT_ENTRY` in that particular subsection. I think we have two
options to solve this:
1. We can split up subsections by symbol even if `.subsections_via_symbols`
is not set, but include constraints to ensure those subsections retain
their order in the final output. This is `ld64`'s approach.
2. We could just add a `size` field to our `Symbol` class. This seems simpler,
and I'm more inclined toward it, but I'm not sure if there are use cases
that it doesn't handle well. As such I'm punting on the decision for now.
Reviewed By: clayborg
Differential Revision: https://reviews.llvm.org/D89257
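The stab layout described in this commit message can be sketched roughly as follows. This is a minimal illustration, not lld's actual implementation: the `Stab` and `Function` structs and the `emitCompileUnit` helper are hypothetical names made up for this example. Only the stab type values (as defined in `<mach-o/stab.h>`) and the ordering of entries are taken from the description and from the CHECK lines of the test; the `sect` and `desc` values of the `N_OSO` and closing `N_SO` entries simply mirror those CHECK lines.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Stab type values as defined in <mach-o/stab.h>.
constexpr uint8_t N_FUN = 0x24;
constexpr uint8_t N_SO = 0x64;
constexpr uint8_t N_OSO = 0x66;

// Hypothetical, simplified stand-in for a symbol table (nlist_64) entry.
struct Stab {
  uint8_t type;
  uint8_t sect;
  uint16_t desc;
  uint64_t value;
  std::string name;
};

// Hypothetical description of one function in an object file.
struct Function {
  std::string name;
  uint8_t sect;  // 1-based section index
  uint64_t addr; // address in the output binary
  uint64_t size; // size in bytes
};

// Builds the stab sequence for one compile unit in the order the test checks:
// N_SO (source path), N_OSO (object path and modtime), one pair of N_FUN
// entries per function, and a closing N_SO with an empty name.
std::vector<Stab> emitCompileUnit(const std::string &sourcePath,
                                  const std::string &objectPath,
                                  uint64_t modTime,
                                  const std::vector<Function> &funcs) {
  std::vector<Stab> out;
  // A single N_SO carries the full source path (no dirname/basename split).
  out.push_back({N_SO, 0, 0, 0, sourcePath});
  // N_OSO points at the object file; its value is the file's modtime.
  out.push_back({N_OSO, 3, 1, modTime, objectPath});
  for (const Function &f : funcs) {
    // The first N_FUN names the function and gives its address ...
    out.push_back({N_FUN, f.sect, 0, f.addr, f.name});
    // ... and the second, unnamed N_FUN gives its size. No N_BNSYM/N_ENSYM.
    out.push_back({N_FUN, 0, 0, f.size, ""});
  }
  // An N_SO with an empty name closes the compile unit.
  out.push_back({N_SO, 1, 0, 0, ""});
  return out;
}
```

For instance, calling `emitCompileUnit("/tmp/test.cpp", "/out/test.o", 16, {{"_main", 1, 0x1000, 6}})` (with a made-up object path and address) would yield five entries, with the `N_OSO` value being 0x10.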
# UNSUPPORTED: system-windows
# RUN: rm -rf %t; split-file %s %t
# RUN: llvm-mc -filetype=obj -triple=x86_64-apple-darwin %t/test.s -o %t/test.o
# RUN: llvm-mc -filetype=obj -triple=x86_64-apple-darwin %t/foo.s -o %t/foo.o
# RUN: llvm-mc -filetype=obj -triple=x86_64-apple-darwin %t/no-debug.s -o %t/no-debug.o
## Set modtimes of the files for deterministic test output. The timestamps are
## 16 and 32 seconds past the epoch, matching the N_OSO values 0x10 and 0x20
## checked below.
# RUN: env TZ=UTC touch -t "197001010000.16" %t/test.o
# RUN: env TZ=UTC touch -t "197001010000.32" %t/foo.o
# RUN: llvm-ar rcsU %t/foo.a %t/foo.o

# RUN: %lld -lSystem %t/test.o %t/foo.o %t/no-debug.o -o %t/test
# RUN: (llvm-objdump --section-headers %t/test; dsymutil -s %t/test) | \
# RUN: FileCheck %s -DDIR=%t -DFOO_PATH=%t/foo.o

## Check that we emit the right modtime even when the object file is in an
## archive.
# RUN: %lld -lSystem %t/test.o %t/foo.a %t/no-debug.o -o %t/test
# RUN: (llvm-objdump --section-headers %t/test; dsymutil -s %t/test) | \
# RUN: FileCheck %s -DDIR=%t -DFOO_PATH=%t/foo.a\(foo.o\)

## Check that we emit absolute paths to the object files in our OSO entries
## even if our inputs are relative paths.
# RUN: cd %t && %lld -lSystem test.o foo.o no-debug.o -o test
# RUN: (llvm-objdump --section-headers %t/test; dsymutil -s %t/test) | \
# RUN: FileCheck %s -DDIR=%t -DFOO_PATH=%t/foo.o

# RUN: cd %t && %lld -lSystem test.o foo.a no-debug.o -o %t/test
# RUN: (llvm-objdump --section-headers %t/test; dsymutil -s %t/test) | \
# RUN: FileCheck %s -DDIR=%t -DFOO_PATH=%t/foo.a\(foo.o\)

# CHECK: Sections:
# CHECK-NEXT: Idx Name
# CHECK-NEXT: [[#TEXT_ID:]] __text
# CHECK-NEXT: [[#DATA_ID:]] __data
# CHECK-NEXT: [[#MORE_DATA_ID:]] more_data
# CHECK-NEXT: [[#COMM_ID:]] __common
# CHECK-NEXT: [[#MORE_TEXT_ID:]] more_text

# CHECK: (N_SO ) 00 0000 0000000000000000 '/tmp/test.cpp'
# CHECK-NEXT: (N_OSO ) 03 0001 0000000000000010 '[[DIR]]/test.o'
# CHECK-NEXT: (N_STSYM ) [[#%.2d,MORE_DATA_ID + 1]] 0000 [[#%.16x,STATIC:]] '_static_var'
# CHECK-NEXT: (N_FUN ) [[#%.2d,TEXT_ID + 1]] 0000 [[#%.16x,MAIN:]] '_main'
# CHECK-NEXT: (N_FUN ) 00 0000 0000000000000006{{$}}
# CHECK-NEXT: (N_FUN ) [[#%.2d,TEXT_ID + 1]] 0000 [[#%.16x,BAR:]] '_bar'
# CHECK-NEXT: (N_FUN ) 00 0000 0000000000000000{{$}}
# CHECK-NEXT: (N_FUN ) [[#%.2d,TEXT_ID + 1]] 0000 [[#%.16x,BAR2:]] '_bar2'
# CHECK-NEXT: (N_FUN ) 00 0000 0000000000000001{{$}}
# CHECK-NEXT: (N_FUN ) [[#%.2d,TEXT_ID + 1]] 0000 [[#%.16x,BAZ:]] '_baz'
# CHECK-NEXT: (N_FUN ) 00 0000 0000000000000000{{$}}
# CHECK-NEXT: (N_FUN ) [[#%.2d,TEXT_ID + 1]] 0000 [[#%.16x,BAZ2:]] '_baz2'
# CHECK-NEXT: (N_FUN ) 00 0000 0000000000000002{{$}}
# CHECK-NEXT: (N_FUN ) [[#%.2d,TEXT_ID + 1]] 0000 [[#%.16x,QUX:]] '_qux'
# CHECK-NEXT: (N_FUN ) 00 0000 0000000000000003{{$}}
# CHECK-NEXT: (N_FUN ) [[#%.2d,TEXT_ID + 1]] 0000 [[#%.16x,QUUX:]] '_quux'
# CHECK-NEXT: (N_FUN ) 00 0000 0000000000000004{{$}}
# CHECK-NEXT: (N_GSYM ) [[#%.2d,DATA_ID + 1]] 0000 [[#%.16x,GLOB:]] '_global_var'
# CHECK-NEXT: (N_GSYM ) [[#%.2d,COMM_ID + 1]] 0000 [[#%.16x,ZERO:]] '_zero'
# CHECK-NEXT: (N_FUN ) [[#%.2d,MORE_TEXT_ID + 1]] 0000 [[#%.16x,FUN:]] '_fun'
# CHECK-NEXT: (N_FUN ) 00 0000 0000000000000001{{$}}
# CHECK-NEXT: (N_SO ) 01 0000 0000000000000000{{$}}
# CHECK-NEXT: (N_SO ) 00 0000 0000000000000000 '/foo.cpp'
# CHECK-NEXT: (N_OSO ) 03 0001 0000000000000020 '[[FOO_PATH]]'
# CHECK-NEXT: (N_FUN ) [[#%.2d,TEXT_ID + 1]] 0000 [[#%.16x,FOO:]] '_foo'
# CHECK-NEXT: (N_FUN ) 00 0000 0000000000000001{{$}}
# CHECK-NEXT: (N_SO ) 01 0000 0000000000000000{{$}}
# CHECK-DAG: ( SECT ) [[#%.2d,MORE_DATA_ID + 1]] 0000 [[#STATIC]] '_static_var'
# CHECK-DAG: ( SECT EXT) [[#%.2d,TEXT_ID + 1]] 0000 [[#MAIN]] '_main'
# CHECK-DAG: ( ABS EXT) 00 0000 {{[0-9a-f]+}} '_abs'
# CHECK-DAG: ( SECT EXT) [[#%.2d,TEXT_ID + 1]] 0000 [[#FOO]] '_foo'
# CHECK-DAG: ( SECT EXT) [[#%.2d,TEXT_ID + 1]] 0000 [[#BAR]] '_bar'
# CHECK-DAG: ( SECT EXT) [[#%.2d,TEXT_ID + 1]] 0000 [[#BAR2]] '_bar2'
# CHECK-DAG: ( SECT EXT) [[#%.2d,TEXT_ID + 1]] 0000 [[#BAZ]] '_baz'
# CHECK-DAG: ( SECT EXT) [[#%.2d,TEXT_ID + 1]] 0000 [[#BAZ2]] '_baz2'
# CHECK-DAG: ( SECT EXT) [[#%.2d,TEXT_ID + 1]] 0000 [[#QUX]] '_qux'
# CHECK-DAG: ( SECT EXT) [[#%.2d,TEXT_ID + 1]] 0000 [[#QUUX]] '_quux'
# CHECK-DAG: ( SECT EXT) [[#%.2d,DATA_ID + 1]] 0000 [[#GLOB]] '_global_var'
# CHECK-DAG: ( SECT EXT) [[#%.2d,COMM_ID + 1]] 0000 [[#ZERO]] '_zero'
# CHECK-DAG: ( SECT EXT) [[#%.2d,MORE_TEXT_ID + 1]] 0000 [[#FUN]] '_fun'
# CHECK-DAG: ( SECT EXT) [[#%.2d,TEXT_ID + 1]] 0000 {{[0-9a-f]+}} '_no_debug'
# CHECK-DAG: ( {{.*}}) {{[0-9]+}} 0000 {{[0-9a-f]+}} '__mh_execute_header'
# CHECK-EMPTY:

## Check that we don't attempt to emit rebase opcodes for the debug sections
## when building a PIE (since we have filtered the sections out).
# RUN: %lld -lSystem -pie %t/test.o %t/foo.a %t/no-debug.o -o %t/test
# RUN: llvm-objdump --macho --rebase %t/test | FileCheck %s --check-prefix=PIE
# PIE: Rebase table:
# PIE-NEXT: segment section address type
# PIE-EMPTY:

#--- test.s

## Make sure we don't create STABS entries for absolute symbols.
.globl _abs
_abs = 0x123

.section __DATA, __data
.globl _global_var
_global_var:
.quad 123

.section __DATA, more_data
_static_var:
.quad 123

.globl _zero
.zerofill __DATA,__common,_zero,4,2

.text
.globl _main, _bar, _bar2, _baz, _baz2, _qux, _quux
.alt_entry _baz
.alt_entry _qux

_bar:
_bar2:
.space 1

_baz:
_baz2:
.space 2

_main:
Lfunc_begin0:
callq _foo
retq
Lfunc_end0:

_qux:
.space 3

_quux:
.space 4

.section __DWARF,__debug_str,regular,debug
.asciz "test.cpp" ## string offset=0
.asciz "/tmp" ## string offset=9
.section __DWARF,__debug_abbrev,regular,debug
Lsection_abbrev:
.byte 1 ## Abbreviation Code
.byte 17 ## DW_TAG_compile_unit
.byte 1 ## DW_CHILDREN_yes
.byte 3 ## DW_AT_name
.byte 14 ## DW_FORM_strp
.byte 27 ## DW_AT_comp_dir
.byte 14 ## DW_FORM_strp
.byte 17 ## DW_AT_low_pc
.byte 1 ## DW_FORM_addr
.byte 18 ## DW_AT_high_pc
.byte 6 ## DW_FORM_data4
.byte 0 ## EOM(1)
.section __DWARF,__debug_info,regular,debug
.set Lset0, Ldebug_info_end0-Ldebug_info_start0 ## Length of Unit
.long Lset0
Ldebug_info_start0:
.short 4 ## DWARF version number
.set Lset1, Lsection_abbrev-Lsection_abbrev ## Offset Into Abbrev. Section
.long Lset1
.byte 8 ## Address Size (in bytes)
.byte 1 ## Abbrev [1] 0xb:0x48 DW_TAG_compile_unit
.long 0 ## DW_AT_name
.long 9 ## DW_AT_comp_dir
.quad Lfunc_begin0 ## DW_AT_low_pc
.set Lset3, Lfunc_end0-Lfunc_begin0 ## DW_AT_high_pc
.long Lset3
.byte 0 ## End Of Children Mark
Ldebug_info_end0:
.subsections_via_symbols
.section __DWARF,__debug_line,regular,debug

.section OTHER,more_text,regular,pure_instructions
.globl _fun
_fun:
ret

#--- foo.s
.text
.globl _foo
_foo:
Lfunc_begin0:
retq
Lfunc_end0:

.section __DWARF,__debug_str,regular,debug
.asciz "foo.cpp" ## string offset=0
.asciz "" ## string offset=8
.section __DWARF,__debug_abbrev,regular,debug
Lsection_abbrev:
.byte 1 ## Abbreviation Code
.byte 17 ## DW_TAG_compile_unit
.byte 1 ## DW_CHILDREN_yes
.byte 3 ## DW_AT_name
.byte 14 ## DW_FORM_strp
.byte 27 ## DW_AT_comp_dir
.byte 14 ## DW_FORM_strp
.byte 17 ## DW_AT_low_pc
.byte 1 ## DW_FORM_addr
.byte 18 ## DW_AT_high_pc
.byte 6 ## DW_FORM_data4
.byte 0 ## EOM(1)
.section __DWARF,__debug_info,regular,debug
.set Lset0, Ldebug_info_end0-Ldebug_info_start0 ## Length of Unit
.long Lset0
Ldebug_info_start0:
.short 4 ## DWARF version number
.set Lset1, Lsection_abbrev-Lsection_abbrev ## Offset Into Abbrev. Section
.long Lset1
.byte 8 ## Address Size (in bytes)
.byte 1 ## Abbrev [1] 0xb:0x48 DW_TAG_compile_unit
.long 0 ## DW_AT_name
.long 8 ## DW_AT_comp_dir
.quad Lfunc_begin0 ## DW_AT_low_pc
.set Lset3, Lfunc_end0-Lfunc_begin0 ## DW_AT_high_pc
.long Lset3
.byte 0 ## End Of Children Mark
Ldebug_info_end0:
.subsections_via_symbols
.section __DWARF,__debug_line,regular,debug

.section __DWARF,__debug_aranges,regular,debug
ltmp1:
.byte 0

#--- no-debug.s
## This file has no debug info.
.text
.globl _no_debug
_no_debug:
ret