Tests were working on my system because the old correct files were left over
and the new bug was that the output files were not being output at all.
Consequently the test work on my system but fail on any other system.
This reverts commit r323484.
llvm-svn: 323486
While writing code for input and output formats in llvm-objcopy it became
apparent that there was a code health problem. This change attempts to solve
that problem by refactoring the code to use Reader and Writer objects that can
read in different objects in different formats, convert them to a single shared
internal representation, and then write them to any other representation.
New classes:
Reader: the base class used to construct instances of the internal
representation
Writer: the base class used to write out instances of the internal
representation
ELFBuilder: a helper class for ELFWriter that takes an ELFFile and converts it
to a Object
SectionVisitor: it became necessary to remove writeSection from SectionBase
because, under the new Reader/Writer scheme, it's possible to convert between
ELF Types such as ELF32LE and ELF32BE. This isn't possible with writeSection
because it (dynamically) depends on the underlying section type *and*
(statically) depends on the ELF type. Bad things would happen if the underlying
sections for ELF32LE were used for writing to ELF64BE. To avoid this code smell
(which would have compiled, run, and output some nonsesnse) I decoupled writing
of sections from a class.
SectionWriter: This is just the ELFT templated implementation of
SectionVisitor. Many classes now have this class as a friend so that the
writing methods in this class can write out private data.
ELFWriter: This is the Writer that outputs to ELF
BinaryWriter: This is the Writer that outputs to Binary
ElfType: Because the ELF Type is not a part of the Object anymore we need a way
to construct the correct default Writer based on properties of the Reader. This
enum just keeps track of the ELF type of the input so it can be used as the
default output type as well.
Object has correspondingly undergone some serious changes as well. It now has
more generic methods for building and manipulating ELF binaries. This interface
makes ELFBuilder easy enough to use and will make the BinaryReader/Builder easy
to create as well. Most changes in this diff are cosmetic and deal with the
fact that a method has been moved from one class to another or a change from a
pointer to a reference. Almost no changes should result in a functional
difference (this is after all a refactor). One minor functional change was made
and the result can be seen in remove-shstrtab-error.test. The fact that it
fails hasn't changed but the error message has changed because that failure is
detected at a later point in the code now (because WriteSectionHeaders is a
property of the ElfWriter *not* a property of the Object). I'd say roughly
80-90% of this code is cosmetically different, 10-19% is different but
functionally the same, and 1-5% is functionally different despite not causing a
change in tests.
Differential Revision: https://reviews.llvm.org/D42222
llvm-svn: 323480
It was reverted after buildbot regressions.
Original commit message:
This allows relative block frequency of call edges to be passed
to the thinlink stage where it will be used to compute synthetic
entry counts of functions.
llvm-svn: 323460
Summary:
This allows relative block frequency of call edges to be passed to the
thinlink stage where it will be used to compute synthetic entry counts
of functions.
Reviewers: tejohnson, pcc
Subscribers: mehdi_amini, llvm-commits, inglorion
Differential Revision: https://reviews.llvm.org/D42212
llvm-svn: 323349
This is needed in order to use our StringPool entries in the Apple
accelerator tables.
As this is NFC we rely on the existing tests for correctness.
llvm-svn: 323339
Combine expression patterns to form expressions with fewer, simple instructions.
This pass does not modify the CFG.
For example, this pass reduce width of expressions post-dominated by TruncInst
into smaller width when applicable.
It differs from instcombine pass in that it contains pattern optimization that
requires higher complexity than the O(1), thus, it should run fewer times than
instcombine pass.
Differential Revision: https://reviews.llvm.org/D38313
llvm-svn: 323321
Summary:
Currently, there is no way to extract a basic block from a function easily. This patch
extends llvm-extract to extract the specified basic block(s).
Reviewers: loladiro, rafael, bogner
Reviewed By: bogner
Subscribers: hintonda, mgorny, qcolombet, llvm-commits
Differential Revision: https://reviews.llvm.org/D41638
llvm-svn: 323266
Opt's "-enable-debugify" mode adds an instance of Debugify at the
beginning of the pass pipeline, and an instance of CheckDebugify at the
end.
You can enable this mode with lit using: -Dopt="opt -enable-debugify".
Note that running test suites in this mode will result in many failures
due to strict FileCheck commands, etc.
It can be more useful to look for assertion failures which arise only
when Debugify is enabled, e.g to prove that we have (or do not have)
test coverage for some code path with debug info present.
Differential Revision: https://reviews.llvm.org/D41793
llvm-svn: 323256
We were a bit too trusting about the offsets encoded in MachO compact unwind
sections, so this passes every access through a bounds check just in case. It
prevents a few segfaults on malformed object files, if one should ever come
along.
Mostly to silence fuzzers in the vague hope they might be able to produce
something useful without the noise.
llvm-svn: 323198
This applies to most pipelines except the LTO and ThinLTO backend
actions - it is for use at the beginning of the overall pipeline.
This extension point will be used to add the GCOV pass when enabled in
Clang.
llvm-svn: 323166
Summary:
First, we need to explain the core of the vulnerability. Note that this
is a very incomplete description, please see the Project Zero blog post
for details:
https://googleprojectzero.blogspot.com/2018/01/reading-privileged-memory-with-side.html
The basis for branch target injection is to direct speculative execution
of the processor to some "gadget" of executable code by poisoning the
prediction of indirect branches with the address of that gadget. The
gadget in turn contains an operation that provides a side channel for
reading data. Most commonly, this will look like a load of secret data
followed by a branch on the loaded value and then a load of some
predictable cache line. The attacker then uses timing of the processors
cache to determine which direction the branch took *in the speculative
execution*, and in turn what one bit of the loaded value was. Due to the
nature of these timing side channels and the branch predictor on Intel
processors, this allows an attacker to leak data only accessible to
a privileged domain (like the kernel) back into an unprivileged domain.
The goal is simple: avoid generating code which contains an indirect
branch that could have its prediction poisoned by an attacker. In many
cases, the compiler can simply use directed conditional branches and
a small search tree. LLVM already has support for lowering switches in
this way and the first step of this patch is to disable jump-table
lowering of switches and introduce a pass to rewrite explicit indirectbr
sequences into a switch over integers.
However, there is no fully general alternative to indirect calls. We
introduce a new construct we call a "retpoline" to implement indirect
calls in a non-speculatable way. It can be thought of loosely as
a trampoline for indirect calls which uses the RET instruction on x86.
Further, we arrange for a specific call->ret sequence which ensures the
processor predicts the return to go to a controlled, known location. The
retpoline then "smashes" the return address pushed onto the stack by the
call with the desired target of the original indirect call. The result
is a predicted return to the next instruction after a call (which can be
used to trap speculative execution within an infinite loop) and an
actual indirect branch to an arbitrary address.
On 64-bit x86 ABIs, this is especially easily done in the compiler by
using a guaranteed scratch register to pass the target into this device.
For 32-bit ABIs there isn't a guaranteed scratch register and so several
different retpoline variants are introduced to use a scratch register if
one is available in the calling convention and to otherwise use direct
stack push/pop sequences to pass the target address.
This "retpoline" mitigation is fully described in the following blog
post: https://support.google.com/faqs/answer/7625886
We also support a target feature that disables emission of the retpoline
thunk by the compiler to allow for custom thunks if users want them.
These are particularly useful in environments like kernels that
routinely do hot-patching on boot and want to hot-patch their thunk to
different code sequences. They can write this custom thunk and use
`-mretpoline-external-thunk` *in addition* to `-mretpoline`. In this
case, on x86-64 thu thunk names must be:
```
__llvm_external_retpoline_r11
```
or on 32-bit:
```
__llvm_external_retpoline_eax
__llvm_external_retpoline_ecx
__llvm_external_retpoline_edx
__llvm_external_retpoline_push
```
And the target of the retpoline is passed in the named register, or in
the case of the `push` suffix on the top of the stack via a `pushl`
instruction.
There is one other important source of indirect branches in x86 ELF
binaries: the PLT. These patches also include support for LLD to
generate PLT entries that perform a retpoline-style indirection.
The only other indirect branches remaining that we are aware of are from
precompiled runtimes (such as crt0.o and similar). The ones we have
found are not really attackable, and so we have not focused on them
here, but eventually these runtimes should also be replicated for
retpoline-ed configurations for completeness.
For kernels or other freestanding or fully static executables, the
compiler switch `-mretpoline` is sufficient to fully mitigate this
particular attack. For dynamic executables, you must compile *all*
libraries with `-mretpoline` and additionally link the dynamic
executable and all shared libraries with LLD and pass `-z retpolineplt`
(or use similar functionality from some other linker). We strongly
recommend also using `-z now` as non-lazy binding allows the
retpoline-mitigated PLT to be substantially smaller.
When manually apply similar transformations to `-mretpoline` to the
Linux kernel we observed very small performance hits to applications
running typical workloads, and relatively minor hits (approximately 2%)
even for extremely syscall-heavy applications. This is largely due to
the small number of indirect branches that occur in performance
sensitive paths of the kernel.
When using these patches on statically linked applications, especially
C++ applications, you should expect to see a much more dramatic
performance hit. For microbenchmarks that are switch, indirect-, or
virtual-call heavy we have seen overheads ranging from 10% to 50%.
However, real-world workloads exhibit substantially lower performance
impact. Notably, techniques such as PGO and ThinLTO dramatically reduce
the impact of hot indirect calls (by speculatively promoting them to
direct calls) and allow optimized search trees to be used to lower
switches. If you need to deploy these techniques in C++ applications, we
*strongly* recommend that you ensure all hot call targets are statically
linked (avoiding PLT indirection) and use both PGO and ThinLTO. Well
tuned servers using all of these techniques saw 5% - 10% overhead from
the use of retpoline.
We will add detailed documentation covering these components in
subsequent patches, but wanted to make the core functionality available
as soon as possible. Happy for more code review, but we'd really like to
get these patches landed and backported ASAP for obvious reasons. We're
planning to backport this to both 6.0 and 5.0 release streams and get
a 5.0 release with just this cherry picked ASAP for distros and vendors.
This patch is the work of a number of people over the past month: Eric, Reid,
Rui, and myself. I'm mailing it out as a single commit due to the time
sensitive nature of landing this and the need to backport it. Huge thanks to
everyone who helped out here, and everyone at Intel who helped out in
discussions about how to craft this. Also, credit goes to Paul Turner (at
Google, but not an LLVM contributor) for much of the underlying retpoline
design.
Reviewers: echristo, rnk, ruiu, craig.topper, DavidKreitzer
Subscribers: sanjoy, emaste, mcrosier, mgorny, mehdi_amini, hiraditya, llvm-commits
Differential Revision: https://reviews.llvm.org/D41723
llvm-svn: 323155
For sections with different virtual and physical addresses, alignment and
placement in the output binary should be based on the physical address.
Ran into this problem with a bare metal ARM project where llvm-objcopy added a
lot of zero-padding before the .data section that had differing addresses. GNU
objcopy did not add the padding, and after this fix, neither does llvm-objcopy.
Update a test case so a section has different physical and virtual addresses.
Fixes B35708
Authored By: Owen Shaw (owenpshaw)
Differential Revision: https://reviews.llvm.org/D41619
llvm-svn: 323144
This frees up the first name to be used as an base class for the
apple table and the dwarf5 .debug_names accel table. The rename was
split off from D42297 (adding of debug_names support), which is still
under review.
llvm-svn: 323113
Summary:
Rename LLVM_CONFIG_EXE to LLVM_CONFIG_PATH, and avoid building it if
passed in by user. This is the same way CLANG_TABLEGEN and
LLVM_TABLEGEN are handled, e.g., when -DLLVM_OPTIMIZED_TABLEGEN=ON is
passed.
Differential Revision: https://reviews.llvm.org/D41806
llvm-svn: 323053
ExternalSymbolMap now stores the string key (rather than using a StringRef),
as the object file backing the key may be removed at any time.
llvm-svn: 323001
Bulk queries reduce IPC/RPC overhead for cross-process JITing and expose
opportunities for parallel compilation.
The two new query methods are lookupFlags, which finds the flags for each of a
set of symbols; and lookup, which finds the address and flags for each of a
set of symbols. (See doxygen comments for more details.)
The existing JITSymbolResolver class is renamed LegacyJITSymbolResolver, and
modified to extend the new JITSymbolResolver class using the following scheme:
- lookupFlags is implemented by calling findSymbolInLogicalDylib for each of the
symbols, then returning the result of calling getFlags() on each of these
symbols. (Importantly: lookupFlags does NOT call getAddress on the returned
symbols, so lookupFlags will never trigger materialization, and lookupFlags will
never call findSymbol, so only symbols that are part of the logical dylib will
return results.)
- lookup is implemented by calling findSymbolInLogicalDylib for each symbol and
falling back to findSymbol if findSymbolInLogicalDylib returns a null result.
Assuming a symbol is found its getAddress method is called to materialize it and
the result (if getAddress succeeds) is stored in the result map, or the error
(if getAddress fails) is returned immediately from lookup. If any symbol is not
found then lookup returns immediately with an error.
This change will break any out-of-tree derivatives of JITSymbolResolver. This
can be fixed by updating those classes to derive from LegacyJITSymbolResolver
instead.
llvm-svn: 322913
Get rid of DEBUG_FUNCTION_NAME symbols. When we actually debug
data, maybe we'll want somewhere to put it... but having a symbol
that just stores the name of another symbol seems odd.
It means you have multiple Symbols with the same name, one
containing the actual function and another containing the name!
Store the names in a vector on the WasmObjectFile when reading
them in. Also stash them on the WasmFunctions themselves.
The names are //not// "symbol names" or aliases or anything,
they're just the name that a debugger should show against the
function body itself. NB. The WasmObjectFile stores them so that
they can be exported in the YAML losslessly, and hence the tests
can be precise.
Enforce that the CODE section has been read in before reading
the "names" section. Requires minor adjustment to some tests.
Patch by Nicholas Wilson!
Differential Revision: https://reviews.llvm.org/D42075
llvm-svn: 322741
Summary:
- Fix a bug in PrettyBuiltinDumper that returns "void" as the name for
an unspecified builtin type. Since the unspecified param of a variadic
function is considered a builtin of unspecified type in PDBs, we set
"..." for its name.
- Provide a method to determine if a PDBSymbolFunc is variadic in
PrettyFunctionDumper since PDBSymbolFunc::getArgument() doesn't return the
last unspecified-type param.
- Add a pretty-func-dumper.test to test pretty dumping of variadic
functions.
Reviewers: zturner, llvm-commits
Reviewed By: zturner
Differential Revision: https://reviews.llvm.org/D41801
llvm-svn: 322608
Summary:
This speeds up export "summary-only" execution by an order of magnitude or two,
depending on number of threads used for prepareFileReports execution.
Also includes minor refactoring for splitting render of summary and detailed data
in two independent methods.
Reviewers: vsk, morehouse
Reviewed By: vsk
Subscribers: llvm-commits, kcc
Differential Revision: https://reviews.llvm.org/D42000
llvm-svn: 322397
There were a few places where outs() was being used
directly rather than the ScopedPrinter object.
Differential Revision: https://reviews.llvm.org/D41370
llvm-svn: 322141
These indexes are useful because they are not always zero based and
functions and globals are referenced elsewhere by their index.
This matches what we already do for the type index space.
Differential Revision: https://reviews.llvm.org/D41877
llvm-svn: 322121
llc, opt, and clang can all autodetect the CPU and supported features. lli cannot as far as I could tell.
This patch uses the getCPUStr() and introduces a new getCPUFeatureList() and uses those in lli in place of MCPU and MAttrs.
Ideally, we would merge getCPUFeatureList and getCPUFeatureStr, but opt and llc need a string and lli wanted a list. Maybe we should just return the SubtargetFeature object and let the caller decide what it needs?
Differential Revision: https://reviews.llvm.org/D41833
llvm-svn: 322100
This change adds support in llvm-objcopy for GNU objcopy's --localize-hidden
option. This option changes every hidden or internal symbol into a local symbol.
llvm-svn: 321884
This is not a record type that clang currently generates,
but it is a record that is encountered in object files generated
by cl. This record is unusual in that it refers directly to
the string table instead of indirectly to the string table via
the FileChecksums table. Because of this, it was previously
overlooked and we weren't remapping the string indices at all.
This would lead to crashes in MSVC when trying to display a
variable whose debug info involved an S_FILESTATIC.
Original bug report by Alexander Ganea
Differential Revision: https://reviews.llvm.org/D41718
llvm-svn: 321883