This is the last step needed to avoid regressions for x86 before we flip the switch to allow
expansion of the smallest set of memcpy() via CGP. The DAG version checks for constant strings,
so we need to do that here too.
FWIW, the 2 constant test is not handled by LibCallSimplifier::optimizeMemCmp() because that
code is limited to 8-bit constant arrays. LibCallSimplifier will also fail to optimize some 1
constant tests because its alignment requirements are too strict (shouldn't require alignment
for a constant operand).
Differential Revision: https://reviews.llvm.org/D34071
llvm-svn: 305734
In order to reduce swift binary sizes, Apple is now stripping swift symbols
from the nlist symbol table. llvm-nm currently only looks at the nlist symbol
table and misses symbols that are present in dyld info. This makes it hard to
know the set of symbols for a binary using just llvm-nm. Unless you know to
run llvm-objdump -exports-trie that can output the exported symbols in the dyld
info from the export trie, which does so but in a different format.
Also moving forward the time may come a when a fully linked Mach-O file that
uses dyld will no longer have an nlist symbol table to avoid duplicating the
symbol information.
This change adds three flags to llvm-nm, -add-dyldinfo, -no-dyldinfo, and
-dyldinfo-only.
The first, -add-dyldinfo, has the same effect as when the new bit in the Mach-O
header, MH_NLIST_OUTOFSYNC_WITH_DYLDINFO, appears in a binary. In that it
looks through the dyld info from the export trie and adds symbols to be printed
that are not already in its internal SymbolList variable. The -no-dyldinfo
option turns this behavior off.
The -dyldinfo-only option only looks at the dyld information and recreates the
symbol table from the dyld info from the export trie and binding information.
As if it the Mach-O file had no nlist symbol table.
Also fixed a few bugs with Mach-O N_INDR symbols not correctly printing the
indirect name, or in the same format as the old nm-classic program.
rdar://32021551
llvm-svn: 305733
Summary:
This is required for standalone LSan to work with libdispatch worker threads,
and is a slimmed down version of the functionality provided for ASan
in asan_mac.cc.
Re-commit of r305695 with use_stacks=0 to get around a racy lingering pointer.
Reviewers: alekseyshl, kubamracek, glider, kcc
Subscribers: mgorny, llvm-commits
Differential Revision: https://reviews.llvm.org/D34247
llvm-svn: 305732
Summary:
Existing heuristic uses the ratio between the function entry
frequency and the loop invocation frequency to find cold loops. However,
even if the loop executes frequently, if it has a small trip count per
each invocation, vectorization is not beneficial. On the other hand,
even if the loop invocation frequency is much smaller than the function
invocation frequency, if the trip count is high it is still beneficial
to vectorize the loop.
This patch uses estimated trip count computed from the profile metadata
as a primary metric to determine coldness of the loop. If the estimated
trip count cannot be computed, it falls back to the original heuristics.
Reviewers: Ayal, mssimpso, mkuper, danielcdh, wmi, tejohnson
Reviewed By: tejohnson
Subscribers: tejohnson, mzolotukhin, llvm-commits
Differential Revision: https://reviews.llvm.org/D32451
llvm-svn: 305729
Summary:
Disable generation of counting-function attribute if no_instrument_function
attribute is present in function.
Interaction between -pg and no_instrument_function is the desired behavior
and matches gcc as well.
This is required for fixing a crash in Linux kernel when function tracing
is enabled.
Fixes PR33515.
Reviewers: hfinkel, rengolin, srhines, hans
Reviewed By: hfinkel
Subscribers: cfe-commits
Differential Revision: https://reviews.llvm.org/D34357
llvm-svn: 305728
Summary:
Some optimizations in AddReachableCodeToWorklist did not update
the MadeIRChange state. This could happen both when removing
trivially dead instructions (DCE) and at constant folds.
It is essential that changes to the IR is reported correctly,
since for example InstCombinePass::run() will indicate that all
analyses are preserved otherwise.
And the CGPassManager determines if the CallGraph is up-to-date
based on status from InstructionCombiningPass::runOnFunction().
The new test case early_dce_clobbers_callgraph.ll is a reproducer
for some asserts that started to trigger after changes in the
inliner in r305245. With this patch the test case passes again.
Reviewers: sanjoy, craig.topper, dblaikie
Reviewed By: craig.topper
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D34346
llvm-svn: 305725
This seems to be interacting badly with ASan somehow, causing false reports of
heap-buffer overflows: PR33514.
> Summary:
> The patch makes instruction count the highest priority for
> LSR solution for X86 (previously registers had highest priority).
>
> Reviewers: qcolombet
>
> Differential Revision: http://reviews.llvm.org/D30562
>
> From: Evgeny Stupachenko <evstupac@gmail.com>
llvm-svn: 305720
This change avoid a crash that occurred when skipping to EOF while parsing an
ObjC interface/implementation.
rdar://31963299
Differential Revision: https://reviews.llvm.org/D34185
llvm-svn: 305719
Summary: This patch cleans up GenericDomTreeConstruction by replacing typedefs with usings and replaces `typename GraphT::NodeRef` with `NodePtr` to make the file more readable.
Reviewers: sanjoy, dberlin, chandlerc
Reviewed By: dberlin
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D34254
llvm-svn: 305715
Summary:
This is a first step towards getting line info to show up in VS and
windbg. So far, only llvm-pdbutil can parse the PDBs that we produce.
cvdump doesn't like something about our file checksum tables. I'll have
to dig into that next.
This patch adds a new DebugSubsectionRecordBuilder which takes bytes
directly from some other producer, such as a linker, and sticks it into
the PDB. Line tables only need to be relocated. No data needs to be
rewritten.
File checksums and string tables, on the other hand, need to be re-done.
Reviewers: zturner, ruiu
Subscribers: llvm-commits, hiraditya
Differential Revision: https://reviews.llvm.org/D34257
llvm-svn: 305713
In C++ all variables are in default address space. Previously change has been
made to cast automatic variables to default address space. However that is
not sufficient since all temporary variables need to be casted to default
address space.
This patch casts all temporary variables to default address space except those
for passing indirect arguments since they are only used for load/store.
This patch only affects target having non-zero alloca address space.
Differential Revision: https://reviews.llvm.org/D33706
llvm-svn: 305711
Summary:
This patch cleans up GenericDomTree.h by:
- removing unnecessary <NodeT> in DomTreeNodeBase
- removing unnecessary std::move on bools
- changing type of DFSNumIn/DFSNumOut from int to unsigned (since the members were used as unsigned anyway)
The changes don't affect behavior -- everything works as before.
Reviewers: sanjoy, dberlin, chandlerc
Reviewed By: dberlin
Subscribers: davide, llvm-commits
Differential Revision: https://reviews.llvm.org/D34229
llvm-svn: 305710
Summary:
These 4 patterns have the same one use check repeated twice for each. Once without a cast and one with. But the cast has no effect on what method is called.
For the OR case I believe it is always profitable regardless of the number of uses since we'll never increase the instruction count.
For the AND case I believe it is profitable if the pair of xors has one use such that we'll get rid of it completely. Or if the C value is something freely invertible, in which case the not doesn't cost anything.
Reviewers: spatel, majnemer
Reviewed By: spatel
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D34308
llvm-svn: 305705
Summary:
Currently we don't try to do anything with vector xors.
This patch adds support for removing duplicate pairs from a chain of vector xors as its pretty easy to support. We still dont' try to combine the xors with and/ors, but I might try that in a future patch.
Reviewers: mcrosier, davide, resistor
Reviewed By: mcrosier
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D34338
llvm-svn: 305704
A new Gfx9 dasm test added with approx 29000 cases.
Existing tests extended by (approx.):
* Gfx7 asm: 5000 test cases
* Gfx8 asm: 5000 test cases
* Gfx9 asm: 14400 test cases
* Gfx8 dasm: 5200 test cases
llvm-svn: 305702
As all store merges checks are based on the memory operation
performed, allow use of truncated stores and extended loads as valid
input candidates for merging.
Relanding after fixing selection between truncated and normal store.
llvm-svn: 305701
This patch adds support for segment NONE in linker scripts which enables the
specification that a section should not be assigned to any segment.
Note that GNU ld does not disallow the definition of a segment named NONE, which
if defined, effectively overrides the behaviour described above. This feature
has been copied.
Differential Revision: https://reviews.llvm.org/D34203
llvm-svn: 305700
Summary:
After a single predecessor is merged into a basic block, we need to invalidate
the LVI information for the new merged block, when LVI is not provably true for
all of instructions in the new block.
The test cases added show the correct LVI information using the LVI printer
pass.
Reviewers: reames, dberlin, davide, sanjoy
Reviewed by: dberlin, davide
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D34108
llvm-svn: 305699
Summary: use AA to tell whether a load can be moved before a call that writes to memory.
Reviewers: dberlin, davide, sanjoy, hfinkel
Reviewed By: hfinkel
Subscribers: hfinkel, llvm-commits
Differential Revision: https://reviews.llvm.org/D34115
llvm-svn: 305698
Summary:
This fixes the missing space before the designated initializer when `Cpp11BracedListStyle=false` :
const struct A a = { .a = 1, .b = 2 };
^
Also, wrapping between opening brace and designated array initializers used to have an excessive penalty (like breaking between an expression and the subscript operator), leading to unexpected wrapping:
const struct Aaaaaaaaaaaaaaaa aaaaaaaaaaaaaaaaaa =
{[1] = aaaaaaaaaaaaaaaaaaaaaaaaaaa,
[2] = bbbbbbbbbbbbbbbbbbbbbbbbbbb};
instead of:
const struct Aaaaaaaaaaaaaaaa aaaaaaaaaaaaaaaaaa = {
[1] = aaaaaaaaaaaaaaaaaaaaaaaaaaa,
[2] = bbbbbbbbbbbbbbbbbbbbbbbbbbb};
Finally, designated array initializers are not binpacked, just like designated member initializers.
Reviewers: djasper
Reviewed By: djasper
Subscribers: cfe-commits, krasimir, klimek
Differential Revision: https://reviews.llvm.org/D33491
llvm-svn: 305696
Summary:
This is required for standalone LSan to work with libdispatch worker threads,
and is a slimmed down version of the functionality provided for ASan
in asan_mac.cc.
Reviewers: alekseyshl, kubamracek, glider, kcc
Subscribers: mgorny, llvm-commits
Differential Revision: https://reviews.llvm.org/D34247
llvm-svn: 305695
Summary: Implement some of the simplest addressing modes.It should help to test ABI.
Reviewers: zvi, guyblank
Reviewed By: guyblank
Subscribers: rovka, llvm-commits, kristof.beyls
Differential Revision: https://reviews.llvm.org/D33888
llvm-svn: 305691
Use llvm::make_unique to avoid ambiguity with MSVC.
This patch adds a generic MacroFusion pass, that is used on X86 and
AArch64, which both define target-specific shouldScheduleAdjacent
functions. This generic pass should make it easier for other targets to
implement macro fusion and I intend to add macro fusion for ARM shortly.
Differential Revision: https://reviews.llvm.org/D34144
llvm-svn: 305690
Summary:
A number of places were trying to decode the result of wait(). Add a simple
utility function that does that and a struct that encapsulates the
decoded result. Then also provide a pretty-printer for that class.
Reviewers: zturner, krytarowski, eugene
Subscribers: lldb-commits, mgorny
Differential Revision: https://reviews.llvm.org/D33998
llvm-svn: 305689