Commit Graph

122 Commits

Author SHA1 Message Date
Arthur Eubanks 7cbb6e9a8f [llvm-reduce] Assert that the number of chunks does not change with reductions
Followup to D113537.

Reviewed By: Meinersbur

Differential Revision: https://reviews.llvm.org/D113816
2021-12-01 15:40:05 -08:00
Florian Hahn fb46e64a01
Revert "[ThreadPool] Do not return shared futures."
This reverts commit a5fff58781.

The offending commit broke building with LLVM_ENABLE_THREADS=OFF.
2021-11-24 19:01:47 +00:00
Florian Hahn 8ef460fc51
[llvm-reduce] Add parallel chunk processing.
This patch adds parallel processing of chunks. When reducing very large
inputs, e.g. functions with 500k basic blocks, processing chunks in
parallel can significantly speed up the reduction.

To allow modifying clones of the original module in parallel, each clone
needs its own LLVMContext object. To achieve this, each job parses the
input module with its own LLVMContext. If a job successfully reduces the
input, it serializes the result module as bitcode into a result array.

To ensure parallel reduction produces the same results as serial
reduction, only the first successfully reduced result is used, and
results of other successful jobs are dropped. Processing resumes after
the chunk that was successfully reduced.

The number of threads to use can be configured using the -j option.
It defaults to 1, which means serial processing.
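
A minimal sketch of the "first successful result wins" rule described above, not the code from the patch. It assumes a hypothetical `ChunkJob` callback that tests one chunk and reports whether it could be reduced; `JobResult`, `firstReducedChunk` and the use of `std::async` are illustrative choices, and a plain string stands in for the serialized bitcode.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <future>
#include <string>
#include <vector>

// Outcome of testing one chunk. Serialized stands in for the bitcode buffer.
struct JobResult {
  bool Reduced = false;
  std::string Serialized;
};

// Hypothetical job type: try to reduce the given chunk in a private context.
using ChunkJob = std::function<JobResult(int)>;

// Tests chunks [FirstChunk, NumChunks) in batches of NumJobs. Returns the
// index of the lowest-numbered chunk that reduced successfully and stores its
// serialized module in Out, or returns -1 if no chunk could be reduced.
inline int firstReducedChunk(int FirstChunk, int NumChunks, int NumJobs,
                             const ChunkJob &Job, std::string &Out) {
  assert(NumJobs >= 1 && "need at least one job");
  for (int Begin = FirstChunk; Begin < NumChunks; Begin += NumJobs) {
    const int End = std::min(Begin + NumJobs, NumChunks);
    std::vector<std::future<JobResult>> Pending;
    for (int C = Begin; C < End; ++C)
      Pending.push_back(std::async(std::launch::async, Job, C));

    // Wait for the whole batch, scanning in chunk order so that the winner
    // is the one a serial run would have found first.
    int Winner = -1;
    for (int C = Begin; C < End; ++C) {
      JobResult R = Pending[C - Begin].get();
      if (R.Reduced && Winner < 0) {
        Winner = C;
        Out = R.Serialized;
      }
      // Results of later successful jobs are dropped.
    }
    if (Winner >= 0)
      return Winner;
  }
  return -1;
}
```

A caller would resume with `FirstChunk` set to the returned index plus one, which mirrors "processing resumes after the chunk that was successfully reduced".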

Reviewed By: Meinersbur

Differential Revision: https://reviews.llvm.org/D113857
2021-11-24 09:23:52 +00:00
Florian Hahn be56ece918
[llvm-reduce] Move code to check chunk to function, to enable reuse (NFC).
This patch moves the logic to clone and check a new chunk into a new
function, to allow re-use in a follow-up patch that implements parallel
reductions.

Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D113856
2021-11-16 15:39:13 +00:00
Florian Hahn 97b9b6f565
[llvm-reduce] Add new BitWriter dependency after 28d95a2610. 2021-11-16 12:48:21 +00:00
Florian Hahn 28d95a2610
[llvm-reduce] Allow writing temporary files as bitcode.
Textual LLVM IR files are much bigger and take longer to write to disk.
To avoid the extra cost incurred by serializing to text, this patch adds
an option to save temporary files as bitcode instead.

Reviewed By: aeubanks

Differential Revision: https://reviews.llvm.org/D113858
2021-11-16 12:39:42 +00:00
Arthur Eubanks 0b5051cede [llvm-reduce] Don't reuse SmallVector across calls to getAllMetadata()
The SmallVector is not cleared in calls to getAllMetadata().

Reviewed By: Meinersbur

Differential Revision: https://reviews.llvm.org/D113808
2021-11-15 14:53:48 -08:00
Florian Hahn 4081df43b6
[llvm-reduce] Remove unnecessary loop.
After cd8aa234fd, there's no need to collect a vector of basic blocks
to keep first. Remove the first loop.
2021-11-14 21:03:21 +00:00
Arthur Eubanks 87687b4ff7 [llvm-reduce] Fix build after D113537
Forgot to amend D113537 with these changes before committing.
2021-11-11 18:53:34 -08:00
Arthur Eubanks 6f288bd772 [llvm-reduce] Count chunks by running a preliminary reduction
Having a separate counting method runs the risk of a mismatch between
the actual reduction method and the counting method.

Instead, create an Oracle that always returns true for shouldKeep(), run
the reduction, and count how many times shouldKeep() was called. The
module should not be modified if shouldKeep() always returns true.
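
The counting idea above can be sketched roughly as follows. This is a toy model, not the llvm-reduce implementation: `CountingOracle`, `reduceElements` and `countFeatures` are hypothetical names, and a vector of integers stands in for the module.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for the Oracle: it answers "keep this feature?" and
// records how many times it was asked.
class CountingOracle {
  int NumQueries = 0;

public:
  // Always keep the feature, so the reduction leaves its input untouched.
  bool shouldKeep() {
    ++NumQueries;
    return true;
  }
  int count() const { return NumQueries; }
};

// Toy reduction: drops every element the oracle does not want to keep.
template <typename OracleT>
void reduceElements(OracleT &O, std::vector<int> &Elements) {
  std::vector<int> Kept;
  for (int E : Elements)
    if (O.shouldKeep())
      Kept.push_back(E);
  Elements = Kept;
}

// Counts reducible features by running the real reduction with an oracle
// that keeps everything, instead of maintaining a separate counting routine.
inline int countFeatures(std::vector<int> &Elements) {
  CountingOracle O;
  const std::vector<int> Before = Elements;
  reduceElements(O, Elements);
  assert(Elements == Before && "counting run must not modify the input");
  return O.count();
}
```

Because the count comes from the same code path as the reduction, the two cannot drift apart.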

Reviewed By: Meinersbur

Differential Revision: https://reviews.llvm.org/D113537
2021-11-11 18:46:09 -08:00
Arthur Eubanks be0b47d530 [llvm-reduce] Skip replacing metadata and callee operands
Metadata operands tend to require special conditions, especially on dbg
intrinsics. We also don't have a zero value for metadata.

Replacing callee operands is a little weird, since calling undef/null
doesn't make sense. It also causes tons of invalid reductions when
reducing calls to intrinsics since only arguments to intrinsics can be
of the metadata type.

Reviewed By: Meinersbur

Differential Revision: https://reviews.llvm.org/D113532
2021-11-11 18:42:16 -08:00
Michael Kruse c15f930e96 [llvm-reduce] Introduce operands-skip pass.
Add a new "operands-skip" pass whose goal is to remove instructions in the middle of dependency chains. For instance:
```
  %baseptr = alloca i32
  %arrayidx = getelementptr i32, i32* %baseptr, i32 %idxprom
  store i32 42, i32* %arrayidx
```
might be reducible to
```
  %baseptr = alloca i32
  %arrayidx = getelementptr ...  ; now dead, together with the computation of %idxprom
  store i32 42, i32* %baseptr
```
Other passes would either replace `%baseptr` with undef (operands, instructions) or move it to become a function argument (operands-to-args), both of which might fail the interestingness check.

In principle the implementation allows operand replacement with any value or instruction in the function that passes the filter constraints (same type, dominance, "more reduced"), but is limited in this patch to values that are directly or indirectly used to compute the current operand value, motivated by the example above. Additionally, function arguments are added to the candidate set, which helps reduce the number of relevant arguments and mitigates the concern about too many arguments mentioned in https://reviews.llvm.org/D110274#3025013.

Possible future extensions:
 * Instead of requiring the same type, bitcast/trunc/zext could be automatically inserted for some more flexibility.
 * If undef is added to the candidate set, "operands-skip" is able to produce any reduction that "operands" can do. Additional candidates might be zero and one, where the "reductive power" classification can prefer one over the other. If undefined behaviour should not be introduced, undef can be removed from the candidate set.

Recommit after resolving conflict with D112651 and reusing
shouldReduceOperand from D113532.

Reviewed By: aeubanks

Differential Revision: https://reviews.llvm.org/D111818
2021-11-11 20:16:34 -06:00
Michael Kruse ed7b37155b Revert "[llvm-reduce] Introduce operands-skip pass."
This reverts commit fa4210a9a0.

It causes compile failures, presumably because it conflicts with another
patch that landed after I checked locally.
2021-11-11 19:25:39 -06:00
Michael Kruse fa4210a9a0 [llvm-reduce] Introduce operands-skip pass.
Add a new "operands-skip" pass whose goal is to remove instructions in the middle of dependency chains. For instance:
```
  %baseptr = alloca i32
  %arrayidx = getelementptr i32, i32* %baseptr, i32 %idxprom
  store i32 42, i32* %arrayidx
```
might be reducible to
```
  %baseptr = alloca i32
  %arrayidx = getelementptr ...  ; now dead, together with the computation of %idxprom
  store i32 42, i32* %baseptr
```
Other passes would either replace `%baseptr` with undef (operands, instructions) or move it to become a function argument (operands-to-args), both of which might fail the interestingness check.

In principle the implementation allows operand replacement with any value or instruction in the function that passes the filter constraints (same type, dominance, "more reduced"), but is limited in this patch to values that are directly or indirectly used to compute the current operand value, motivated by the example above. Additionally, function arguments are added to the candidate set, which helps reduce the number of relevant arguments and mitigates the concern about too many arguments mentioned in https://reviews.llvm.org/D110274#3025013.

Possible future extensions:
 * Instead of requiring the same type, bitcast/trunc/zext could be automatically inserted for some more flexibility.
 * If undef is added to the candidate set, "operands-skip" is able to produce any reduction that "operands" can do. Additional candidates might be zero and one, where the "reductive power" classification can prefer one over the other. If undefined behaviour should not be introduced, undef can be removed from the candidate set.

Reviewed By: aeubanks

Differential Revision: https://reviews.llvm.org/D111818
2021-11-11 18:54:01 -06:00
Florian Hahn cd8aa234fd
[llvm-reduce] Use DenseSet instead of std::set (NFC).
When reducing functions with a very large number of basic blocks (almost
1 million BBs), the majority of time is spent maintaining the order in the
std::set for the basic blocks to keep.

In those cases, DenseSet<> is much more efficient. Use it instead.
2021-11-10 13:56:22 +00:00
Arthur Eubanks b394ba5d7f [llvm-reduce] Print extra newline when encountering unknown pass 2021-11-09 15:20:16 -08:00
Dwight Guth 16c3db8def [llvm-reduce] Fix invalid reduction in basic-blocks delta pass
Previously, if the basic-blocks delta pass tried to remove a basic block
that was the last basic block in a function that did not have external
or weak linkage, the resulting IR would become invalid. Since removing
the last basic block in a function is effectively identical to removing
the function body itself, we check explicitly for this case and if we
detect it, we run the same logic as in ReduceFunctionBodies.cpp

Reviewed By: aeubanks

Differential Revision: https://reviews.llvm.org/D113486
2021-11-09 10:43:38 -08:00
Dwight Guth fbfd327fdf [llvm-reduce] Add flag to start at finer granularity
Sometimes if llvm-reduce is interrupted in the middle of a delta pass on
a large file, it can take quite some time for the tool to start actually
doing new work if it is restarted again on the partially-reduced file. A
lot of time ends up being spent testing large chunks when these large
chunks are very unlikely to actually pass the interestingness test. In
cases like this, the tool will complete faster if the starting
granularity is reduced to a finer amount. Thus, we introduce a command
line flag that automatically divides the chunks into smaller subsets a
fixed, user-specified number of times prior to beginning the core loop.
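
As a rough illustration of starting at a finer granularity, the sketch below splits the initial chunk a fixed number of times before any testing happens. The `Chunk` struct, `splitChunks`, `initialChunks` and the parameter name `StartingLevel` are assumptions made for this example, not the names used by the tool.

```cpp
#include <cassert>
#include <vector>

// Inclusive range of feature indices, numbered from 1.
struct Chunk {
  int Begin;
  int End;
};

// Splits every chunk that holds more than one feature into two halves.
inline std::vector<Chunk> splitChunks(const std::vector<Chunk> &Chunks) {
  std::vector<Chunk> Out;
  for (const Chunk &C : Chunks) {
    if (C.Begin == C.End) {
      Out.push_back(C); // a single feature cannot be split further
      continue;
    }
    const int Half = (C.End - C.Begin + 1) / 2;
    Out.push_back({C.Begin, C.Begin + Half - 1});
    Out.push_back({C.Begin + Half, C.End});
  }
  return Out;
}

// Builds the chunk list the core loop starts from: one chunk covering all
// features, pre-split StartingLevel times.
inline std::vector<Chunk> initialChunks(int NumFeatures, int StartingLevel) {
  assert(NumFeatures >= 1 && "nothing to reduce");
  std::vector<Chunk> Chunks = {{1, NumFeatures}};
  for (int I = 0; I < StartingLevel; ++I)
    Chunks = splitChunks(Chunks);
  return Chunks;
}
```

With 8 features and a starting level of 2, the core loop would begin with four chunks of two features each rather than one chunk of eight.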

Reviewed By: aeubanks

Differential Revision: https://reviews.llvm.org/D112651
2021-11-09 10:14:08 -08:00
Arthur Eubanks f54a8759f0 [llvm-reduce] Reduce more GlobalValue properties
Reviewed By: hans

Differential Revision: https://reviews.llvm.org/D112885
2021-11-02 08:47:41 -07:00
Arthur Eubanks 80ba72b07b [llvm-reduce] Reduce some GlobalObject properties
Specifically, the section and the alignment.

Reviewed By: hans

Differential Revision: https://reviews.llvm.org/D112884
2021-11-02 08:47:32 -07:00
Markus Lavin fd41738e2c Recommit "[llvm-reduce] Add MIR support"
(Second try. Need to link against CodeGen and MC libs.)

The llvm-reduce tool has been extended to operate on MIR (import, clone and
export). Current limitation is that only a single machine function is
supported. A single reducer pass that operates on machine instructions (while
on SSA-form) has been added. Additional MIR specific reducer passes can be
added later as needed.

Differential Revision: https://reviews.llvm.org/D110527
2021-11-02 10:16:42 +01:00
Markus Lavin aee7f3384b Revert "[llvm-reduce] Add MIR support"
This reverts commit bc2773cb1b.

Broke the clang-ppc64le-linux-multistage build. Reverting while I
investigate.
2021-11-02 09:41:02 +01:00
Markus Lavin bc2773cb1b [llvm-reduce] Add MIR support
The llvm-reduce tool has been extended to operate on MIR (import, clone and
export). Current limitation is that only a single machine function is
supported. A single reducer pass that operates on machine instructions (while
on SSA-form) has been added. Additional MIR specific reducer passes can be
added later as needed.

Differential Revision: https://reviews.llvm.org/D110527
2021-11-02 09:14:56 +01:00
Dwight Guth 2f16173627 [llvm-reduce] optimize extractFromModule functions
The extractBasicBlocksFromModule, extractInstrFromModule, and other
similar functions previously performed very poorly when the number of
such elements in the program to reduce was very high. Previously, we
were creating the set which caches elements to keep by looping through
all elements in the module and adding them to the set. However, since
std::set is an ordered set, this introduces a massive amount of
rebalancing if the order of elements in the program and the order of
their pointers in memory are not the same.

The solution is straightforward: first put all the elements to be kept
in a vector, then use the constructor for std::set which takes a pair of
iterators over a collection. This constructor is optimized to avoid
doing unnecessary work when initializing large sets.
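
The pattern described above, sketched with standard containers only. `collectToKeep` and its predicate parameter are hypothetical; how much faster the range constructor is than repeated insertion depends on the standard library implementation and on the order of the pointers.

```cpp
#include <cassert>
#include <set>
#include <vector>

// Collects pointers to the elements to keep in a vector first, then builds
// the std::set in one step from the iterator pair instead of inserting the
// elements one at a time.
template <typename T, typename Pred>
std::set<const T *> collectToKeep(const std::vector<T> &Elements,
                                  Pred ShouldKeep) {
  std::vector<const T *> Kept;
  for (const T &E : Elements)
    if (ShouldKeep(E))
      Kept.push_back(&E);
  return std::set<const T *>(Kept.begin(), Kept.end());
}
```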

Also in this change, we pass BBsToKeep set to functions
replaceBranchTerminator and removeUninterestingBBsFromSwitch as a const
reference rather than passing it by value. This ought to prevent the
need to copy the collection each time these functions are called, which
is expensive if the collection is large.

Reviewed By: aeubanks

Differential Revision: https://reviews.llvm.org/D112757
2021-10-29 10:06:26 -07:00
Arthur Eubanks 177a703710 [llvm-reduce] Actually skip invalid candidates in operands-to-args
This was checked while counting but not actually when doing the reduction, resulting in crashes.

Reviewed By: Meinersbur

Differential Revision: https://reviews.llvm.org/D112766
2021-10-29 09:14:18 -07:00
Arthur Eubanks 9660563950 [llvm-reduce] Add reduction passes to reduce operands to undef/1/0
Having non-undef constants in a final llvm-reduce output is nicer than
having undefs.

This splits the existing reduce-operands pass into three, one which does
the same as the current pass of reducing to undef, and two more to
reduce to the constant 1 and the constant 0. Do not reduce to undef if
the operand is a ConstantData, and do not reduce 0s to 1s.

Reducing GEP operands very frequently causes invalid IR (since types may
not match up if we index differently into a struct), so don't touch GEPs.
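
The rules stated above can be modelled as a small predicate. This sketch covers only what the message says (no undef for ConstantData, no 0 to 1, no GEP operands) plus the assumption that an operand already equal to the target is skipped; `OperandKind`, `ReduceTo` and `mayReplaceOperand` are hypothetical names, not the pass's API.

```cpp
#include <cassert>

// Coarse classification of an operand for the purpose of this sketch.
enum class OperandKind { NonConstant, Zero, One, OtherConstantData };

// The value each of the three passes tries to substitute.
enum class ReduceTo { Undef, One, Zero };

// Returns true if the operand may be replaced by the target value.
inline bool mayReplaceOperand(OperandKind Op, ReduceTo Target, bool UserIsGEP) {
  if (UserIsGEP)
    return false; // reducing GEP operands frequently yields invalid IR
  const bool IsConstantData = Op != OperandKind::NonConstant;
  switch (Target) {
  case ReduceTo::Undef:
    return !IsConstantData; // constants are nicer than undef
  case ReduceTo::One:
    // Never turn a 0 into a 1; a 1 is already reduced (assumption).
    return Op != OperandKind::Zero && Op != OperandKind::One;
  case ReduceTo::Zero:
    return Op != OperandKind::Zero; // a 0 is already reduced (assumption)
  }
  return false;
}
```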

Reviewed By: Meinersbur

Differential Revision: https://reviews.llvm.org/D111765
2021-10-19 15:25:21 -07:00
Michael Kruse dd71b65ca8 [llvm-reduce] Introduce operands-to-args pass.
Instead of setting operands to undef as the "operands" pass does,
convert the operands to a function argument. This avoids having to
introduce undef values into the IR which have some unpredictability
during optimizations.

For instance,

    define void @func() {
    entry:
      %val = add i32 32, 21
      store i32 %val, i32* null
      ret void
    }

is reduced to

    define void @func(i32 %val) {
    entry:
      %val1 = add i32 32, 21
      store i32 %val, i32* null
      ret void
    }

(note that the instruction %val is renamed to %val1 when printing
the IR to avoid ambiguity; ideally %val1 would be removed by dce or the
instruction reduction pass)

Any call to @func is replaced with a call to the function with the
new signature and filled with undef. This is not ideal for IPA passes,
but those are out of scope for now.

Reviewed By: aeubanks

Differential Revision: https://reviews.llvm.org/D111503
2021-10-13 09:54:03 -05:00
Arthur Eubanks 77bc3ba365 [NFC][llvm-reduce] Cleanup types
Use Module& wherever possible.
Since every reduction immediately turns Chunks into an Oracle, directly pass Oracle instead.

Reviewed By: hans

Differential Revision: https://reviews.llvm.org/D111122
2021-10-10 18:07:28 -07:00
Arthur Eubanks a7b4ce9cfd [NFC][AttributeList] Replace index_begin/end with an iterator
We expose the fact that we rely on unsigned wrapping to iterate through
all indexes. This can be confusing. Rather, keeping it as an
implementation detail through an iterator is less confusing and is less
code.
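
To illustrate why exposing the wrapping loop can be confusing, here is a sketch under the assumption that the first index is the maximum `unsigned` value, so incrementing it wraps to 0. `FirstIndex`, `visitWithWrapping` and `IndexRange` are made-up names for this example and do not reproduce the AttributeList interface.

```cpp
#include <cassert>
#include <vector>

// Assumed layout: the first index is ~0U, followed by 0, 1, 2, ...
constexpr unsigned FirstIndex = ~0U;

// Old style: every caller writes the loop and silently relies on ++I
// wrapping from ~0U to 0.
inline std::vector<unsigned> visitWithWrapping(unsigned NumSets) {
  std::vector<unsigned> Visited;
  for (unsigned I = FirstIndex, E = NumSets - 1; I != E; ++I)
    Visited.push_back(I);
  return Visited;
}

// New style: the wrapping is an implementation detail of the iterator, and
// callers use a plain range-based for loop.
class IndexRange {
  unsigned NumSets;

public:
  explicit IndexRange(unsigned N) : NumSets(N) {}

  class iterator {
    unsigned Idx;

  public:
    explicit iterator(unsigned I) : Idx(I) {}
    unsigned operator*() const { return Idx; }
    iterator &operator++() {
      ++Idx; // unsigned arithmetic wraps, which is well defined
      return *this;
    }
    bool operator!=(const iterator &Other) const { return Idx != Other.Idx; }
  };

  iterator begin() const { return iterator(FirstIndex); }
  iterator end() const { return iterator(NumSets - 1); }
};
```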

Reviewed By: rnk

Differential Revision: https://reviews.llvm.org/D110885
2021-10-01 10:17:41 -07:00
Florian Hahn 57fbb9ed0e
[llvm-reduce] Skip updating calls where OldF isn't the called fn.
When replacing function calls, skip call instructions where the old
function is not the called function, but e.g. the old function is passed
as an argument.
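
A toy model of the check described above, with strings standing in for functions and values. `Call` and `replaceCallee` are hypothetical and only show the distinction between "is the callee" and "appears as an argument".

```cpp
#include <cassert>
#include <string>
#include <vector>

// Simplified call instruction: the called function plus its arguments.
struct Call {
  std::string Callee;
  std::vector<std::string> Args;
};

// Redirects calls from OldF to NewF and returns how many were updated.
// Calls that merely pass OldF as an argument are skipped.
inline int replaceCallee(std::vector<Call> &Calls, const std::string &OldF,
                         const std::string &NewF) {
  int NumUpdated = 0;
  for (Call &C : Calls) {
    if (C.Callee != OldF)
      continue; // OldF is used here, but it is not the called function
    C.Callee = NewF;
    ++NumUpdated;
  }
  return NumUpdated;
}
```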

This fixes a crash due to trying to construct invalid IR for the test
case.

Reviewed By: aeubanks

Differential Revision: https://reviews.llvm.org/D109759
2021-10-01 10:52:48 +01:00
Michael Kruse d9562a8e45 [llvm-reduce] Reduce metadata references.
The ReduceMetadata pass before this patch removed metadata on a per-MDNode (or NamedMDNode) basis. Either all references to an MDNode are kept, or all of them are removed. However, MDNodes are uniqued, meaning that references to MDNodes with the same data become references to the same MDNode. As a consequence, e.g. tbaa references to the same type will all have the same MDNode reference, which makes it impossible to keep metadata only on those memory accesses for which it is interesting.
Moreover, MDNodes can also be referenced by some intrinsics or other MDNodes. These references were not considered for removal leading to the possibility that MDNodes are not actually removed even if selected to be removed by the oracle.

This patch changes ReduceMetadata to reduce based on removable metadata references instead. MDNodes without references are implicitly dropped anyway. References by intrinsic calls should be removed by ReduceOperands or ReduceInstructions. References in other MDNodes cannot be removed as it would violate the immutability of MDNodes.

Additionally, ReduceMetadata pass before this patch used `setMetadata(I, NULL)` to remove references, where `I` is the index in the array returned by `getAllMetadata`. However, `setMetadata` expects a MDKind (such as `MD_tbaa`) as first argument. `getAllMetadata` does not return those in consecutive order (otherwise it would not need to be a `std::pair` with `first` representing the MDKind).
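
The index-versus-kind mix-up can be reproduced in a toy model. `Instr` below is a hypothetical stand-in that keys metadata by a kind ID; node ID 0 plays the role of a null node. It is not the LLVM `Instruction` class.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <utility>
#include <vector>

// Toy instruction: maps a metadata kind ID to a node ID.
struct Instr {
  std::map<unsigned, int> MD;

  // Returns (kind, node) pairs; the kinds are not consecutive in general.
  std::vector<std::pair<unsigned, int>> getAllMetadata() const {
    return std::vector<std::pair<unsigned, int>>(MD.begin(), MD.end());
  }

  // Sets the node for Kind; node 0 removes the attachment.
  void setMetadata(unsigned Kind, int Node) {
    if (Node == 0)
      MD.erase(Kind);
    else
      MD[Kind] = Node;
  }
};

// Buggy variant: treats the position in the array as if it were the kind.
inline void dropByPosition(Instr &I) {
  const std::size_t N = I.getAllMetadata().size();
  for (std::size_t Pos = 0; Pos < N; ++Pos)
    I.setMetadata(static_cast<unsigned>(Pos), 0);
}

// Fixed variant: uses the kind stored in the first member of each pair.
inline void dropByKind(Instr &I) {
  for (const auto &KindAndNode : I.getAllMetadata())
    I.setMetadata(KindAndNode.first, 0);
}
```

With attachments of kinds 1 and 7, the positional variant removes only kind 1 and leaves kind 7 in place, while the kind-based variant removes both.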

Reviewed By: aeubanks, swamulism

Differential Revision: https://reviews.llvm.org/D110534
2021-09-29 11:25:35 -05:00
Samuel f18c0739b3 [llvm-reduce] Add reduce operands pass
Add reduction to set operands to default values

Reviewed By: aeubanks

Differential Revision: https://reviews.llvm.org/D108903
2021-09-17 12:32:15 -07:00
Arthur Eubanks 2d8a2a91b1 [llvm-reduce] Check if module data strings are empty before attempting to reduce 2021-08-24 10:23:00 -07:00
Arthur Eubanks d2e103644b [llvm-reduce] Remove various module data
This removes the data layout, target triple, source filename, and module
identifier when possible.

Reviewed By: swamulism

Differential Revision: https://reviews.llvm.org/D108568
2021-08-24 09:45:31 -07:00
Timm Bäder 924d62ca4a [llvm][tools] Hide remaining unrelated llvm- tool options
Differential Revision: https://reviews.llvm.org/D106430
2021-07-22 09:47:55 +02:00
Guillaume Chatelet d6da02d952 [llvm] Add enum iteration to Sequence
This patch allows iterating typed enum via the ADT/Sequence utility.

It also changes the original design to better separate concerns:
 - `StrongInt` only deals with safe `intmax_t` operations,
 - `SafeIntIterator` presents the iterator and reverse iterator
 interface but only deals with safe `StrongInt` internally.
 - `iota_range` only deals with `SafeIntIterator` internally.

 This design ensures that operations are always valid. In particular,
 "Out of bounds" assertions fire when:
  - the `value_type` is not representable as an `intmax_t`
  - iterator operations make internal computation underflow/overflow
  - the internal representation cannot be converted back to `value_type`
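
A small sketch of the layering idea: do the arithmetic on `intmax_t` with an explicit overflow check, and build enum iteration on top of it. `checkedAdd`, `enumRangeInclusive` and the `Weekday` enum are hypothetical names for this example and are not the ADT/Sequence interface; the real utility asserts on overflow, whereas this sketch simply stops.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <vector>

// Adds B to A without overflowing. Returns false and leaves Out untouched
// if the result is not representable as an intmax_t.
inline bool checkedAdd(std::intmax_t A, std::intmax_t B, std::intmax_t &Out) {
  const std::intmax_t Max = std::numeric_limits<std::intmax_t>::max();
  const std::intmax_t Min = std::numeric_limits<std::intmax_t>::min();
  if ((B > 0 && A > Max - B) || (B < 0 && A < Min - B))
    return false;
  Out = A + B;
  return true;
}

// Example enum used only for illustration.
enum class Weekday { Mon, Tue, Wed };

// Lists all enumerators from Begin to End (inclusive), stepping through the
// underlying values with the checked addition above. Assumes the enumerators
// are contiguous.
template <typename E> std::vector<E> enumRangeInclusive(E Begin, E End) {
  std::vector<E> Out;
  std::intmax_t V = static_cast<std::intmax_t>(Begin);
  const std::intmax_t Last = static_cast<std::intmax_t>(End);
  while (V <= Last) {
    Out.push_back(static_cast<E>(V));
    if (!checkedAdd(V, 1, V))
      break; // stepping further would overflow
  }
  return Out;
}
```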

Differential Revision: https://reviews.llvm.org/D106279
2021-07-21 12:48:53 +00:00
Guillaume Chatelet 2c47b8847e Revert "[llvm] Add enum iteration to Sequence"
This reverts commit a006af5d6e.
2021-07-13 16:44:42 +00:00
Guillaume Chatelet a006af5d6e [llvm] Add enum iteration to Sequence
This patch allows iterating typed enum via the ADT/Sequence utility.

Differential Revision: https://reviews.llvm.org/D103900
2021-07-13 16:22:19 +00:00
Langston Barrett a240358833 [llvm-reduce] Don't delete arguments of intrinsics
The argument reduction pass shouldn't remove arguments of
intrinsics, because the resulting module is ill-formed, and so
inherently uninteresting.

Reviewed By: aeubanks

Differential Revision: https://reviews.llvm.org/D103129
2021-06-21 12:43:58 -07:00
Langston Barrett 472c009139 [llvm-reduce] Exit when input module is malformed
The parseInputFile function returns an empty unique_ptr to signal an
error, like when the input file doesn't exist, or is malformed. In this
case, the tool should exit immediately rather than segfault by
dereferencing the unique_ptr later.
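
The fix amounts to checking the returned pointer before using it. The sketch below uses a hypothetical `parseInput` and a toy `Module` type to show the shape of that check; it is not the tool's actual parsing code.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Toy module type for this example.
struct Module {
  std::string Text;
};

// Hypothetical parser: returns an empty unique_ptr to signal an error.
inline std::unique_ptr<Module> parseInput(const std::string &Text) {
  if (Text.empty())
    return nullptr;
  std::unique_ptr<Module> M(new Module());
  M->Text = Text;
  return M;
}

// Returns the exit code the tool would use.
inline int runTool(const std::string &Text) {
  std::unique_ptr<Module> M = parseInput(Text);
  if (!M)
    return 1; // exit immediately instead of dereferencing a null pointer
  return M->Text.empty() ? 1 : 0;
}
```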

Reviewed By: aeubanks

Differential Revision: https://reviews.llvm.org/D102891
2021-05-25 10:01:12 -07:00
Arthur Eubanks 511f2cecf7 [llvm-reduce] Don't unset dso_local on implicitly dso_local GVs
This introduces a flag that aborts if we ever reduce to IR that fails
the verifier.

Reviewed By: swamulism, arichardson

Differential Revision: https://reviews.llvm.org/D101279
2021-04-30 11:57:22 -07:00
Arthur Eubanks 545a8177ea [llvm-reduce] Add flag to only run specific passes
Reviewed By: fhahn, hans

Differential Revision: https://reviews.llvm.org/D101278
2021-04-30 11:51:01 -07:00
Arthur Eubanks 9c8b28a69b [llvm-reduce] Remove unwanted module inline asm
We can clear line by line, but that's likely not very important.

Reviewed By: hans

Differential Revision: https://reviews.llvm.org/D99921
2021-04-06 09:35:37 -07:00
Samuel 56fa1b4ff2 [llvm-reduce] Add header guards and fix clang-tidy warnings
Add header guards and fix other clang-tidy warnings in .h files.
Also align misaligned header docs

Reviewed By: aeubanks

Differential Revision: https://reviews.llvm.org/D99634
2021-04-01 20:38:49 -07:00
Samuel 24339056c8 [llvm-reduce] Remove dso_local when possible
Add a new delta pass to llvm-reduce that removes dso_local when possible

Reviewed By: aeubanks

Differential Revision: https://reviews.llvm.org/D98673
2021-03-29 12:00:10 -07:00
Roman Lebedev 8dee0b4bd6
[llvm-reduce] ReduceGlobalVarInitializers delta pass: fix handling of globals w/ comdat/non-external linkage
Much like with ReduceFunctionBodies delta pass,
we need to remove comdat and set linkage to external,
else verifier will complain, and our deltas are invalid.
2021-01-07 18:05:03 +03:00
Roman Lebedev 5799fc79c3
[llvm-reduce] Refactor global variable delta pass
The current pass's limitation of skipping initializer-less GVs seems
arbitrary: in all the reduced cases I (personally) looked at,
the globals weren't needed, yet they were kept.

So let's do two things:
1. allow reducing initializer-less globals
2. before reducing globals, reduce their initializers, much like we do function bodies
2021-01-03 01:45:47 +03:00
Roman Lebedev 19ab1817b6
[llvm-reduce] Fix removal of unused llvm intrinsics declarations
ee6e25e439 changed
the delta pass to skip intrinsics, which means we may end up being
left with declarations of intrinsics that aren't otherwise referenced
in the module. This is obviously unwanted, so drop them.
2021-01-03 01:45:47 +03:00
serge-sans-paille 9218ff50f9 llvmbuildectomy - replace llvm-build by plain cmake
No longer rely on an external tool to build the llvm component layout.

Instead, leverage the existing `add_llvm_componentlibrary` cmake function and
introduce `add_llvm_component_group` to accurately describe component behavior.

These functions store extra properties in the created targets. These properties
are processed once all components are defined to resolve library dependencies
and produce the header expected by llvm-config.

Differential Revision: https://reviews.llvm.org/D90848
2020-11-13 10:35:24 +01:00
Florian Hahn 250de7388b
[llvm-reduce] Add reduction for special globals like llvm.used.
This patch adds a reduction of 'special' globals that lead to further
reductions (e.g. alias or regular globals reduction) being less efficient
because there are special constraints on values referenced in those
special globals. For example, values in @llvm.used and
@llvm.compiler.used need to be named, so replacing all uses of an
alias/global with undef or a different unnamed constant results in
invalid IR.

More details:
https://llvm.org/docs/LangRef.html#intrinsic-global-variables

Reviewed By: lebedev.ri

Differential Revision: https://reviews.llvm.org/D90302
2020-11-11 11:25:05 +00:00