Implement pass 3 of bind opcodes from ld64 (which supports both 32-bit and 64-bit).
Pass 3 implementation condenses BIND_OPCODE_DO_BIND_ADD_ADDR_ULEB opcode
to BIND_OPCODE_DO_BIND_ADD_ADDR_IMM_SCALED. This change is already behind an
O2 flag so it shouldn't impact current performance. I verified ld64's output with x86_64 LLD
and they were both emitting the same optimized bind opcodes (although in a slightly different
order). Tested with arm64_32 LLD and compared that with x86 LLD that the order of the bind
opcodes are the same (offset values are different which should be expected).
Reviewed By: int3, #lld-macho, MaskRay
Differential Revision: https://reviews.llvm.org/D106128
This reverts commit 321b2bef09.
`for (BindIR *p = &opcodes[0]; p->opcode != BIND_OPCODE_DONE; ++p) {` has a heap-buffer-overflow with test/MachO/bind-opcodes.
Implement pass 3 of bind opcodes from ld64 (which supports both 32-bit and 64-bit).
Pass 3 implementation condenses BIND_OPCODE_DO_BIND_ADD_ADDR_ULEB opcode
to BIND_OPCODE_DO_BIND_ADD_ADDR_IMM_SCALED. This change is already behind an
O2 flag so it shouldn't impact current performance. I verified ld64's output with x86_64 LLD
and they were both emitting the same optimized bind opcodes (although in a slightly different
order). Tested with arm64_32 LLD and compared that with x86 LLD that the order of the bind
opcodes are the same (offset values are different which should be expected).
Reviewed By: int3, #lld-macho
Differential Revision: https://reviews.llvm.org/D106128
In D105866, we used an intermediate container to store a list of opcodes. Here,
we use that data structure to help us perform optimization passes that would allow
a more efficient encoding of bind opcodes. Currently, the functionality mirrors the
optimization pass {1,2} done in ld64 for bind opcodes under optimization gate
to prevent slight regressions.
Reviewed By: int3, #lld-macho
Differential Revision: https://reviews.llvm.org/D105867
Size-wise, BIND_OPCODE_SET_SYMBOL_TRAILING_FLAGS_IMM is the most
expensive opcode, since it comes with an associated symbol string. We
were previously emitting it once per binding, instead of once per
symbol. This diff groups all bindings for a given symbol together and
ensures we only emit one such opcode per symbol. This matches ld64's
behavior.
While this is a relatively small win on chromium_framework (-72KiB), for
programs that have more dynamic bindings, the difference can be quite
large.
This change is perf-neutral when linking chromium_framework.
Reviewed By: #lld-macho, thakis
Differential Revision: https://reviews.llvm.org/D105075