Since r268966 the modern Verifier pass defaults to stripping invalid debug info
in non-asserts builds. This patch ports this behavior back to the legacy
Verifier pass as well. The primary motivation is that the clang frontend
accepts bitcode files as input but is still using the legacy pass pipeline.
Background: The problem I'm trying to solve with this sequence of patches is
that historically we've done a really bad job at verifying debug info. We want
to be able to make the verifier stricter without having to worry about breaking
bitcode compatibility with existing producers. For example, we don't necessarily
want IR produced by an older version of clang to be rejected by an LTO link just
because of malformed debug info; we would rather provide an option to strip it. Note
that merely outdated (but well-formed) debug info would continue to be
auto-upgraded in this scenario.
http://reviews.llvm.org/D20629
<rdar://problem/26448800>
llvm-svn: 270768
As a result of D18634 we no longer infer certain attributes on linkonce_odr
functions at compile time, and may only infer them at LTO time. The readnone
attribute in particular is required for virtual constant propagation (part
of whole-program virtual call optimization) to work correctly.
This change moves the whole-program virtual call optimization pass after
the function attribute inference passes, and enables the attribute inference
passes at opt level 1, so that virtual constant propagation has a chance to
work correctly for linkonce_odr functions.
Differential Revision: http://reviews.llvm.org/D20643
llvm-svn: 270765
It may materialize a declaration, or a definition. The name could
be misleading. This is following a merge of materializeInitFor()
into materializeDeclFor().
Differential Revision: http://reviews.llvm.org/D20593
llvm-svn: 270759
They were originally separated to handle the co-recursion between
the ValueMapper and the ValueMaterializer. This recursion does not
exist anymore: the ValueMapper now uses a Worklist and the
ValueMaterializer schedules jobs on the Worklist.
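As an illustration of the worklist shape described above (not the actual ValueMapper code), here is a minimal sketch. The function mapAll and the Deps table are hypothetical stand-ins: Deps models "materializing a value reveals more values to map", and new work is queued instead of being handled by a recursive call.

```cpp
#include <map>
#include <set>
#include <vector>

// Hypothetical sketch: visit Root and everything reachable from it without
// recursion. Discovered work is pushed on a worklist instead of re-entering
// the mapper, and each value is processed at most once.
std::vector<int> mapAll(int Root, const std::map<int, std::vector<int>> &Deps) {
  std::vector<int> Order;        // values in the order they were mapped
  std::set<int> Seen;            // guards against mapping a value twice
  std::vector<int> Worklist{Root};
  while (!Worklist.empty()) {
    int V = Worklist.back();
    Worklist.pop_back();
    if (!Seen.insert(V).second)
      continue;                  // already mapped, skip
    Order.push_back(V);
    auto It = Deps.find(V);
    if (It == Deps.end())
      continue;
    for (int D : It->second)     // schedule follow-up jobs
      Worklist.push_back(D);
  }
  return Order;
}
```

Even with a cycle in Deps, every value ends up in the result exactly once.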
Differential Revision: http://reviews.llvm.org/D20593
From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 270758
This test was hitting an assertion in the value mapper because
the IRLinker was trying to map @A twice while materializing
the initializer for @C.
Fixes http://llvm.org/PR27850
Differential Revision: http://reviews.llvm.org/D20586
llvm-svn: 270757
We need to reuse this functionality, including making
additional generic stream types that are smarter about how and
when they copy memory versus referencing the original memory.
So all of these structures belong in the common library
rather than being pdb specific.
llvm-svn: 270751
There was a typo in r267758. It caused invalid accesses when
given something like "void @free(...)", as NumParams == 0, and
we then try to look at the 0th parameter.
Turns out, most of these were untested; add both attribute
and missing-prototype checks for all libc libfuncs.
Differential Revision: http://reviews.llvm.org/D20543
llvm-svn: 270750
This is probably correct for all uses except cross-module IR linking,
where we need to move the comdat from the source module to the
destination module.
Fixes PR27870.
Reviewers: majnemer
Differential Revision: http://reviews.llvm.org/D20631
llvm-svn: 270743
f32 vectors would use a sequence of BFI instructions instead
of unrolled cmp + select. This was better in the case of a VALU
select with SGPR inputs, but we don't have a way of dealing with that
in the DAG.
llvm-svn: 270731
By making pointer extraction from a vector more expensive in the cost model,
we avoid the vectorization of a loop that is very likely to be memory-bound:
https://llvm.org/bugs/show_bug.cgi?id=27826
There are still bugs related to this, so we may need a more general solution
to avoid vectorizing obviously memory-bound loops when we don't have HW gather
support.
Differential Revision: http://reviews.llvm.org/D20601
llvm-svn: 270729
This should actually address PR27855. This results in adding references to the system libs inside generated dylibs so that they get correctly pulled in when linking against the dylib.
llvm-svn: 270723
LegalizeIntegerTypes does not have a way to expand multiplications for large
integer types (i.e. larger than twice the native bit width). There's no
standard runtime call to use in that case, and so we'd just assert.
Unfortunately, as it turns out, it is possible to hit this case from
standard-ish C code in rare cases. A particular case a user ran into yesterday
involved an __int128 induction variable and a loop with a quadratic (not
linear) recurrence which triggered some backend logic using SCEVExpander. In
this case, the BinomialCoefficient code in SCEV generates some i129 variables,
which get widened to i256. At a high level, this is not actually good (i.e. the
underlying optimization, PPCLoopPreIncPrep, should not be transforming the loop
in question for performance reasons), but regardless, the backend shouldn't
crash because of cost-modeling issues in the optimizer.
This is a straightforward implementation of the multiplication expansion, based
on the algorithm in Hacker's Delight. I validated it against the code for the
mul256b function from http://locklessinc.com/articles/256bit_arithmetic/ using
random inputs. There should be no functional change for previously-working code
(the new expansion code only replaces an assert).
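For illustration only (this is not the code from the patch): the general idea of expanding a wide multiplication into narrower pieces can be sketched as a 64x64 -> 128-bit multiply built from 32-bit halves. The names Wide128 and mulWide are hypothetical.

```cpp
#include <cstdint>

// Hypothetical helper type: a 128-bit product held as two 64-bit halves.
struct Wide128 {
  uint64_t Lo;
  uint64_t Hi;
};

// Multiply two 64-bit values using only 32-bit halves and 64-bit arithmetic:
// four partial products plus explicit carry propagation.
Wide128 mulWide(uint64_t A, uint64_t B) {
  const uint64_t Mask = 0xFFFFFFFFull;
  uint64_t AL = A & Mask, AH = A >> 32;
  uint64_t BL = B & Mask, BH = B >> 32;

  uint64_t LL = AL * BL; // contributes at bit 0
  uint64_t LH = AL * BH; // contributes at bit 32
  uint64_t HL = AH * BL; // contributes at bit 32
  uint64_t HH = AH * BH; // contributes at bit 64

  // Middle column: three values below 2^32 each, so the sum fits in 64 bits.
  uint64_t Mid = (LL >> 32) + (LH & Mask) + (HL & Mask);

  Wide128 R;
  R.Lo = (Mid << 32) | (LL & Mask);
  R.Hi = HH + (LH >> 32) + (HL >> 32) + (Mid >> 32);
  return R;
}
```

The same splitting can be applied again to wider operands, which is how one would get from a native-width multiply to something like i256.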
Fixes PR19797.
llvm-svn: 270720
As noted in the review, there are still problems, so this doesn't fix the bug completely.
Differential Revision: http://reviews.llvm.org/D20529
llvm-svn: 270718
searching for external symbols, and fall back to the SymbolResolver::findSymbol
method if the former returns null.
This makes RuntimeDyld behave more like a static linker: Symbol definitions
from within the current module's "logical dylib" will be preferred to
external definitions. We can build on this behavior in the future to properly
support weak symbol handling.
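The lookup order described above can be sketched as follows. ToyResolver is a hypothetical stand-in and does not reproduce the real SymbolResolver interface; only the two method names come from this message, and an address of 0 stands in for a null symbol.

```cpp
#include <cstdint>
#include <map>
#include <string>

// Hypothetical stand-in for a resolver with two symbol tables.
struct ToyResolver {
  std::map<std::string, uint64_t> LogicalDylib; // definitions in the same logical dylib
  std::map<std::string, uint64_t> External;     // everything else

  // Returns 0 when the symbol is not found.
  uint64_t findSymbolInLogicalDylib(const std::string &Name) const {
    auto It = LogicalDylib.find(Name);
    return It == LogicalDylib.end() ? 0 : It->second;
  }

  // Returns 0 when the symbol is not found.
  uint64_t findSymbol(const std::string &Name) const {
    auto It = External.find(Name);
    return It == External.end() ? 0 : It->second;
  }
};

// Prefer the logical dylib; fall back to the external lookup only on a miss.
uint64_t resolve(const ToyResolver &R, const std::string &Name) {
  if (uint64_t Addr = R.findSymbolInLogicalDylib(Name))
    return Addr;
  return R.findSymbol(Name);
}
```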
Custom symbol resolvers that override the findSymbolInLogicalDylib method may
notice changes due to this patch. Clients who have not overridden this method
should generally be unaffected, however users of the OrcMCJITReplacement class
may notice changes.
llvm-svn: 270716
Also, rename recognizeBitReverseOrBSwapIdiom to recognizeBSwapOrBitReverseIdiom,
so the ordering of the MatchBSwaps and MatchBitReversals arguments is
consistent with the function name.
llvm-svn: 270715
Move the now index-based ODR resolution and internalization routines out
of ThinLTOCodeGenerator.cpp and into either LTO.cpp (index-based
analysis) or FunctionImport.cpp (index-driven optimizations).
This is to enable usage by other linkers.
llvm-svn: 270698
Summary:
**Description**
This makes `WidenIV::widenIVUse` (IndVarSimplify.cpp) fail to widen narrow IV uses in some cases. As a result, IndVarSimplify may not eliminate narrow IVs even when it is possible to do so, which produces inefficient code.
When `WidenIV::widenIVUse` gets a NarrowUse such as `{(-2 + %inc.lcssa),+,1}<nsw><%for.body3>`, it first tries to get a wide recurrence for it via the `getWideRecurrence` call.
`getWideRecurrence` returns recurrence like this: `{(sext i32 (-2 + %inc.lcssa) to i64),+,1}<nsw><%for.body3>`.
Then a wide use operation is generated by `cloneIVUser`. The generated wide use is evaluated to `{(-2 + (sext i32 %inc.lcssa to i64))<nsw>,+,1}<nsw><%for.body3>`, which is different from the `getWideRecurrence` result. `cloneIVUser` sees the difference and returns nullptr.
This patch also fixes the broken LLVM tests by adding missing <nsw> entries introduced by the correction.
**Minimal reproducer:**
```
int foo(int a, int b, int c);
int baz();
void bar()
{
int arr[20];
int i = 0;
for (i = 0; i < 4; ++i)
arr[i] = baz();
for (; i < 20; ++i)
arr[i] = foo(arr[i - 4], arr[i - 3], arr[i - 2]);
}
```
**Clang command line:**
```
clang++ -mllvm -debug -S -emit-llvm -O3 --target=aarch64-linux-elf test.cpp -o test.ir
```
**Expected result:**
The ` -mllvm -debug` log shows that all the IV's for the second `for` loop have been eliminated.
Reviewers: sanjoy
Subscribers: atrick, asl, aemerson, mzolotukhin, llvm-commits
Differential Revision: http://reviews.llvm.org/D20058
llvm-svn: 270695
There's already an ARMTargetParser; this adds a similar one for AArch64,
so that we can use it to do ARCH/CPU/FPU parsing in clang and llvm instead of
string comparison.
Patch by Jojo Ma.
llvm-svn: 270687
Followup to D20528 clang patch, this removes the (V)CVTDQ2PD(Y) and (V)CVTPS2PD(Y) llvm intrinsics and auto-upgrades to sitofp/fpext instead.
Differential Revision: http://reviews.llvm.org/D20568
llvm-svn: 270678
A volatile load has side effects beyond what callers expect readonly to
signify. For example, it is not safe to reorder two function calls
which each perform a volatile load to the same memory location.
llvm-svn: 270671
Ensure that the unused fields are explicitly stated when defining the types.
Add some compile time assertions about the size requirements for the structure
types.
llvm-svn: 270663
name_ids() did not return all IDs but only the first NameCount items.
The number of non-zero entries in IDs vector is NameCount, but it
does not mean that all non-zero entries are at the beginning of IDs
vector.
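A minimal sketch of the difference, using hypothetical free functions rather than the real name_ids() implementation: taking the first NameCount slots is not the same as collecting every non-zero entry.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Shape of the old behavior: return the first N slots, zero or not.
std::vector<uint32_t> firstSlots(const std::vector<uint32_t> &IDs, std::size_t N) {
  std::size_t Count = std::min(N, IDs.size());
  return std::vector<uint32_t>(IDs.begin(), IDs.begin() + Count);
}

// Shape of the fixed behavior: return every non-zero ID, wherever it sits.
std::vector<uint32_t> nonZeroIds(const std::vector<uint32_t> &IDs) {
  std::vector<uint32_t> Result;
  for (uint32_t ID : IDs)
    if (ID != 0)
      Result.push_back(ID);
  return Result;
}
```

With a table such as {0, 7, 0, 9} and a count of 2, the first function yields {0, 7} while the second yields {7, 9}.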
Differential Revision: http://reviews.llvm.org/D20611
llvm-svn: 270656
This is a supported COFF feature. Ensure that we can display the weak externals
auxiliary symbol. It contains useful information (such as the default binding
and how to resolve the symbol).
llvm-svn: 270648
[AMDGPU] emitPrologue looks for an unused unallocated SGPR that is not
the scratch descriptor. Continue the search if the unused register that was
found fails the other requirements.
Reviewers: arsenm, tstellarAMD, nhaehnle
Subscribers: arsenm, llvm-commits, kzhuravl
Differential Revision: http://reviews.llvm.org/D20526
llvm-svn: 270646
We have to modify V2SU before inserting new elements into the
CurrentVRegDefs set because that may move V2SU in memory invalidating
the reference.
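The ordering constraint can be sketched with a plain std::vector. Entry and bumpThenAppend are hypothetical and unrelated to the real V2SU and CurrentVRegDefs types; the point is only that writes through a reference must happen before the container is allowed to grow.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical element type for the sketch.
struct Entry {
  int Reg;
  int Uses;
};

// Safe ordering: finish writing through the reference, then grow the
// container. After push_back the vector may have reallocated, so E must
// not be touched again.
void bumpThenAppend(std::vector<Entry> &Defs, std::size_t Idx, Entry New) {
  Entry &E = Defs[Idx]; // reference into the vector's current storage
  E.Uses += 1;          // modify first, while the reference is still valid
  Defs.push_back(New);  // may move the elements and invalidate E
}
```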
llvm-svn: 270644
Summary:
Adds fastpath instrumentation for esan's working set tool. The
instrumentation for an intra-cache-line load or store consists of an
inlined write to shadow memory bits for the corresponding cache line.
Adds a basic test for this instrumentation.
Reviewers: aizatsky
Subscribers: vitalybuka, zhaoqin, kcc, eugenis, llvm-commits
Differential Revision: http://reviews.llvm.org/D20483
llvm-svn: 270640
This patch adds support for:
S_EXPORT
LF_BITFIELD
With this patch, I have run through a couple of gigabytes of PDB
files and cannot find a type or symbol that we do not understand.
llvm-svn: 270637
Instead of this:
i32.const $push10=, __stack_pointer
i32.load $push11=, 0($pop10)
Emit this:
i32.const $push10=, 0
i32.load $push11=, __stack_pointer($pop10)
It's not currently clear which is better, though there's a chance the second
form may be better at overall compression. We can revisit this when we have
more data; for now it makes sense to make PEI consistent with isel.
Differential Revision: http://reviews.llvm.org/D20411
llvm-svn: 270635
This adds support for parsing and dumping the following
symbol types:
S_LPROCREF
S_ENVBLOCK
S_COMPILE2
S_REGISTER
S_COFFGROUP
S_SECTION
S_THUNK32
S_TRAMPOLINE
As of this patch, the test PDB files no longer have any unknown
symbol types.
llvm-svn: 270628
Summary:
Adds createEsanInitToolGV for creating a tool-specific variable passed
to the runtime library.
Adds dtor "esan.module_dtor" and inserts calls from the dtor to
"__esan_exit" in the runtime library.
Updates the EfficiencySanitizer test.
Patch by Qin Zhao.
Reviewers: aizatsky
Subscribers: bruening, kcc, vitalybuka, eugenis, llvm-commits
Differential Revision: http://reviews.llvm.org/D20488
llvm-svn: 270627
The benefits of this patch are
-- We call AnalyzeBranch() to optimize unanalyzable branches, but the result of
AnalyzeBranch() is not used. Now the result is useful.
-- Before the layout of all the MBBs is set, the result of AnalyzeBranch() is
not correct and needs to be fixed before using it to optimize the branch
conditions. Now that this optimization runs after layout, the code used
to fix the result of AnalyzeBranch() is no longer needed.
-- The branch condition of the last block was not optimized before. Now it is
optimized.
Differential Revision: http://reviews.llvm.org/D20177
llvm-svn: 270623
These attributes aren't used by other debuggers (& may be confused with
other DWARF extensions) so they just waste space (about 1.5% on .dwo
file size on a random large program I tested).
We could remove the ObjC property ones too, but I figured they were
probably more necessary when trying to understand ObjC (I could be wrong
though) & so any debugger interested in working with ObjC would use
them, perhaps? (also, there are some legacy tests in Clang that test for
them - making it one of those annoying cross-project commits and/or
cleanup to refactor those tests)
llvm-svn: 270613
When dumping huge PDB files, too many of the options were grouped
together, so you would get a never-ending spew of output. This patch
introduces more granular display options so you can dump only the
fields you actually care about.
llvm-svn: 270607
This should fix PR27855. We have some terrible hacks in the CMake to add linking SYSTEM_LIBS to all tools. I think we need a better way to do this in the future.
llvm-svn: 270605
This makes use of the newly introduced `CVSymbolVisitor` to dump details
of each type of symbol record in the symbol streams. Future patches will
bring this visitor based dumping to the publics stream, as well as
creating a `SymbolDumpDelegate` to print more information about
relocations etc.
Differential Revision: http://reviews.llvm.org/D20545
Reviewed By: ruiu
llvm-svn: 270585
Summary:
This patch changes the ODR resolution and internalization to be based on
updates to the Index, which are consumed by the backend portion of the
transformations.
It will be followed by an NFC change to move these out of libLTO's
ThinLTOCodeGenerator so that it can be used by other linkers
(gold and lld) and by ThinLTO distributed backends.
The global summary-based portions use callbacks so that the client can
determine the prevailing copy and other information in a client-specific
way. Eventually, with the API being developed in D20268, these may be
modified to use information such as symbol resolutions, supplied by the
clients to the API.
Reviewers: joker-eph
Subscribers: joker.eph, pcc, llvm-commits
Differential Revision: http://reviews.llvm.org/D20290
llvm-svn: 270584
Now that r270560, r270557 and r270320 have landed, it is the proper time.
Original commit message:
[llvm-mc] - Teach llvm-mc to generate compressed debug sections in zlib style.
Before this patch llvm-mc generated zlib-gnu styled sections.
That means no SHF_COMPRESSED flag was set, magic 'zlib' signature
was used in combination with full size field. Sections were renamed to "*.z*".
This patch switches the compression style to the zlib one, as zlib-gnu appears
to be deprecated everywhere.
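For reference, a minimal sketch of recognizing the older zlib-gnu header. It assumes the conventional GNU layout, which this message does not spell out in detail: the section data starts with the four ASCII bytes "ZLIB" followed by the uncompressed size as an 8-byte big-endian integer. The function name is hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper. Returns true and sets Size if Data starts with an
// assumed zlib-gnu style header: "ZLIB" + 8-byte big-endian uncompressed size.
bool parseGnuStyleHeader(const std::string &Data, uint64_t &Size) {
  const std::string Magic = "ZLIB";
  if (Data.size() < Magic.size() + 8)
    return false;
  if (Data.compare(0, Magic.size(), Magic) != 0)
    return false;
  uint64_t Value = 0;
  for (std::size_t I = 0; I < 8; ++I)
    Value = (Value << 8) | static_cast<unsigned char>(Data[Magic.size() + I]);
  Size = Value;
  return true;
}
```

In the zlib style there is no such magic in the data; the section is instead marked with the SHF_COMPRESSED flag, as noted above.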
Differential revision: http://reviews.llvm.org/D20331
llvm-svn: 270569
Similar in spirit to D20497 :
If all elements of a constant vector are known non-zero, then we can say that the
whole vector is known non-zero.
It seems like we could extend this to FP scalar/vector too, but isKnownNonZero()
says it only works for integers and pointers for now.
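A minimal sketch of the rule, outside of LLVM's value-tracking code: the function below is hypothetical and works on a plain vector of integers rather than on LLVM constants. Treating an empty vector as "not known non-zero" is a conservative choice made for this sketch.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical helper: a constant vector is known non-zero only if every
// element is non-zero. An empty vector is conservatively rejected.
bool allElementsNonZero(const std::vector<int64_t> &Elts) {
  if (Elts.empty())
    return false;
  return std::all_of(Elts.begin(), Elts.end(),
                     [](int64_t E) { return E != 0; });
}
```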
Differential Revision: http://reviews.llvm.org/D20544
llvm-svn: 270562
The main problem was that the .debug_info
section was used to check that llvm-dwarfdump is able to decompress
data that was compressed with the llvm-mc tool. This section was not actually
compressed, because it would consume more space in compressed form.
I changed the testcase to use the .debug_str section, which is one that
really is compressed. So the test now does what it was probably expected to do:
it checks that "data"->llvm-mc->llvm-dwarfdump dumps back the initial "data".
Differential revision: http://reviews.llvm.org/D20466
llvm-svn: 270560
Replace bidirectional flow analysis to compute liveness with forward
analysis pass. Treat lifetimes as starting when there is a first
reference to the stack slot, as opposed to starting at the point of the
lifetime.start intrinsic, so as to increase the number of stack
variables we can overlap.
Reviewers: gbiv, qcolumbet, wmi
Differential Revision: http://reviews.llvm.org/D18827
Bug: 25776
llvm-svn: 270559
Fix was:
1) Had to regenerate dwarfdump-test-zlib.elf-x86-64, dwarfdump-test-zlib-gnu.elf-x86-64
(because llvm-symbolizer-zlib.test uses those inputs for its purposes and failed).
2) Updated llvm-symbolizer-zlib.test (updated the function call address used to match the new files +
added one more check for the newly created dwarfdump-test-zlib-gnu.elf-x86-64 binary input).
3) Updated comment in dwarfdump-test-zlib.cc.
Original commit message:
[llvm-dwarfdump] - Teach dwarfdump to decompress debug sections in zlib style.
Before this llvm-dwarfdump only recognized zlib-gnu compression style of headers,
this patch adds support for zlib style.
It looks reasonable to support both styles for dumping,
even if we are not going to support generating the deprecated gnu one.
Differential revision: http://reviews.llvm.org/D20470
llvm-svn: 270557
Summary:
Change the process of parsing optional operands. All optional operands now use the same parsing method, parseOptionalOperand().
No default values are added to OperandsVector.
Get rid of WORKAROUND_USE_DUMMY_OPERANDS_INSTEAD_MUTIPLE_DEFAULT_OPERANDS.
Reviewers: tstellarAMD, vpykhtin, artem.tamazov, nhaustov
Subscribers: arsenm, kzhuravl
Differential Revision: http://reviews.llvm.org/D20527
llvm-svn: 270556
fix: forgot to commit the updated dwarfdump-test-zlib.elf-x86-64
Original commit message:
[llvm-dwarfdump] - Teach dwarfdump to decompress debug sections in zlib style.
Before this llvm-dwarfdump only recognized zlib-gnu compression style of headers,
this patch adds support for zlib style.
It looks reasonable to support both styles for dumping,
even if we are not going to support generating the deprecated gnu one.
Differential revision: http://reviews.llvm.org/D20470
llvm-svn: 270543
Patch by Nitesh Jain.
Summary: The type of Imm in MipsDisassembler.cpp was incorrect, since SignExtend64 returns int64_t. As per the MIPSr6 doc, the offset is added to the address of the instruction following the branch (not the branch itself) to form a PC-relative effective target address, hence “4” is added to the offset. The offsets of some test cases are updated to reflect the “+ 4” change, and new test cases for negative offsets are added.
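The address arithmetic can be sketched as below. This is not the disassembler code: signExtend and branchTarget are hypothetical helpers, and scaling the encoded offset by 4 (word-aligned instructions) is an assumption that is not stated in this summary.

```cpp
#include <cstdint>

// Hypothetical helper: sign-extend the low Bits bits of Value (1 <= Bits <= 63).
int64_t signExtend(uint64_t Value, unsigned Bits) {
  uint64_t SignBit = 1ull << (Bits - 1);
  uint64_t Low = Value & ((SignBit << 1) - 1);
  return static_cast<int64_t>(Low ^ SignBit) - static_cast<int64_t>(SignBit);
}

// Hypothetical helper: the target is relative to the instruction after the
// branch, so 4 is added to the branch address before applying the offset.
int64_t branchTarget(uint64_t BranchAddr, uint64_t RawOffset, unsigned Bits) {
  int64_t ByteOffset = signExtend(RawOffset, Bits) * 4; // assumed word scaling
  return static_cast<int64_t>(BranchAddr) + 4 + ByteOffset;
}
```

Keeping the intermediate value signed is what lets a negative offset, such as an encoded 0xFFFF in a 16-bit field, move the target backwards.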
Reviewers: dsanders, vkalintiris
Differential Revision: http://reviews.llvm.org/D17540
llvm-svn: 270542
Before this llvm-dwarfdump only recognized zlib-gnu compression style of headers,
this patch adds support for zlib style.
It looks reasonable to support both styles for dumping,
even if we are not going to support generating the deprecated gnu one.
Differential revision: http://reviews.llvm.org/D20470
llvm-svn: 270540
The logic that sets up lit features for sanitizers is largely copied
between here and clang, except clang's was fixed some time ago to
handle multiple sanitizers (i.e., Asan + Ubsan). This just makes the
code in LLVM consistent with how it's done in clang to avoid any
gotchas by users of this.
llvm-svn: 270510
Moved the ModuleLoader and supporting helper loadModuleFromBuffer out of
ThinLTOCodeGenerator and into new LTO.h/LTO.cpp files. This is in
preparation for a patch that will utilize these in the gold-plugin.
Note that there are some other pending patches (D20268 and D20290) that
also plan to refactor common interfaces and functionality into this same
pair of new files.
llvm-svn: 270509
This changes IRCE to optimize uses, and not branches. This change is
NFCI since the uses we do inspect are in practice only ever going to be
the condition use in conditional branches; but this flexibility will
later allow us to analyze more complex expressions than just a direct
branch on a range check.
llvm-svn: 270500
Before r269750 we did the comparisons in this loop in signed ints so
that it DTRT when MinCSFrameIndex was 0. This was changed because it's
now possible for MinCSFrameIndex to be UINT_MAX, but that introduced a
bug when we were comparing `>= 0` - this is tautological in unsigned.
Rework the comparisons here to avoid issues with unsigned wrapping.
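A minimal sketch of the pitfall and one way around it. The function is hypothetical and is not the code from this change; treating Min == UINT_MAX together with a smaller Max as an empty range is an assumption made for the sketch.

```cpp
#include <climits>
#include <vector>

// Hypothetical helper: collect the indices in [Min, Max], highest first,
// using only unsigned arithmetic.
std::vector<unsigned> indicesDescending(unsigned Min, unsigned Max) {
  std::vector<unsigned> Out;
  if (Min > Max) // empty range, e.g. Min == UINT_MAX
    return Out;
  // A loop written as `for (unsigned I = Max; I >= Min; --I)` never ends
  // when Min == 0: `I >= 0` is always true and --I wraps to UINT_MAX.
  // Testing for equality before decrementing avoids the wrap.
  for (unsigned I = Max;; --I) {
    Out.push_back(I);
    if (I == Min)
      break;
  }
  return Out;
}
```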
No test. I couldn't find a way to get any of the StackGrowsUp in-tree
targets to reach the code that sets MinCSFrameIndex.
llvm-svn: 270492
to llvm-objdump. This section is created with the -fembed-bitcode option.
This requires the use of libxar; the CMake and lit support was crafted by
Chris Bieneman!
rdar://26202242
llvm-svn: 270491
This is a work in progress - the chapter text is incomplete, though
the example code compiles and runs.
Feedback and patches are, as usual, most welcome.
llvm-svn: 270487
They were accidentally using the 32-bit load/store instruction for
8/16-bit operations, due to incorrect patterns.
(8/16-bit cmpxchg and atomicrmw will be fixed in subsequent changes.)
llvm-svn: 270486
This effectively reverts commit r270389 and re-lands r270106, but it's
almost a rewrite.
The behavior change in r270106 was that we could no longer assume that
each LF_FUNC_ID record got its own type index. This patch adds a map
from DINode* to TypeIndex, so we can stop making that assumption.
This change also emits padding bytes between type records similar to the
way MSVC does. The size of the type record includes the padding bytes.
llvm-svn: 270485
When an aggregate contains an opaque type its size cannot be
determined. This triggers an "Invalid GetElementPtrInst indices for type" assert
in function checkGEPType. The fix suppresses the conversion in this case.
http://reviews.llvm.org/D20319
llvm-svn: 270479
Summary:
This patch turns on LoopUnrollAnalyzer by default. To mitigate compile
time regressions, I chose very conservative thresholds for now. Later we
can make them more aggressive, but it might require being smarter in
which loops we're optimizing. E.g. currently the biggest issue is that
with more aggressive thresholds we unroll many cold loops, which
increases compile time for no performance benefit (performance of those
loops is improved, but it doesn't matter since they are cold).
Test results for compile time (using 4 samples to reduce noise):
```
MultiSource/Benchmarks/VersaBench/ecbdes/ecbdes 5.19%
SingleSource/Benchmarks/Polybench/medley/reg_detect/reg_detect 4.19%
MultiSource/Benchmarks/FreeBench/fourinarow/fourinarow 3.39%
MultiSource/Applications/JM/lencod/lencod 1.47%
MultiSource/Benchmarks/Fhourstones-3_1/fhourstones3_1 -6.06%
```
I didn't see any performance changes in the testsuite, but it improves
some internal tests.
Reviewers: hfinkel, chandlerc
Subscribers: llvm-commits, mzolotukhin
Differential Revision: http://reviews.llvm.org/D20482
llvm-svn: 270478
Summary:
MBBs don't necessarily have a name (in my experience, they almost never
do), in which case this logging is quite unhelpful. The number seems to
work well.
Reviewers: iteratee
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D20533
llvm-svn: 270477
This will pave the way to introduce a full fledged symbol visitor
similar to how we have a type visitor, thus allowing the same
dumping code to be used in llvm-readobj and llvm-pdbdump.
Differential Revision: http://reviews.llvm.org/D20384
Reviewed By: rnk
llvm-svn: 270475