This became no longer necessary after D19462 landed, and will be incompatible
with an upcoming change to the summary data structures that changes how we
represent references.
llvm-svn: 301660
We found that some part of code for the -Map option takes O(m*n)
where m is the number of input sections in some file and n is
the number of symbols in the same file. If you do LTO, we usually
have only a few object files as inputs for the -Map option
feature, so this performance characteristic was worse than I
expected.
This patch rewrites the -Map option feature to speed it up.
I eliminated the O(m*n) bottleneck and also used multi-threading.
As a result, clang link time with the -Map option improved from
18.7 seconds to 11.2 seconds. Without -Map, it takes 7.7 seconds,
so the -Map option is now about 3x faster than before for this
test case (from 11.0 seconds to 3.5 seconds.) The generated output
file size was 223 MiB, and the file contains 1.2M lines.
Differential Revision: https://reviews.llvm.org/D32631
llvm-svn: 301659
CONSTANT imports expect both the `_imp_` prefixed and non-prefixed
symbols should be added to the symbol table. This allows for linking
symbols like _NSConcreteGlobalBlock in WinObjC. The previous change
would generate the import library properly by handling the option but
would not consume the generated entry properly.
llvm-svn: 301657
. swap 4-bit register encoding, 16-bit offset and 32-bit imm to support big endian archs
. add a test
Reported-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
llvm-svn: 301653
- I removed doxygen comments for the intrinsics that "alias" the other existing documented intrinsics and that only sligtly differ in spelling (single underscores vs. double underscores).
#define _tzcnt_u16(a) (__tzcnt_u16((a)))
It will be very hard to keep the documentation for these "aliases" in sync with the documentation for the intrinsics they alias to. Out of sync documentation will be more confusing than no documentation.
I got an OK from Eric Christopher to commit doxygen comments without prior code
review upstream.
llvm-svn: 301652
Summary:
When there is a push_back with a call to make_pair, turn it into emplace_back and remove the unnecessary make_pair call.
Eg.
```
std::vector<std::pair<int, int>> v;
v.push_back(std::make_pair(1, 2)); // --> v.emplace_back(1, 2);
```
make_pair doesn't get removed when explicit template parameters are provided, because of potential problems with type conversions.
Reviewers: Prazek, aaron.ballman, hokein, alexfh
Reviewed By: Prazek, alexfh
Subscribers: JDevlieghere, JonasToth, cfe-commits
Tags: #clang-tools-extra
Differential Revision: https://reviews.llvm.org/D32395
llvm-svn: 301651
Summary:
The motivation example is like below which has 13 cases but only 2 distinct targets
```
lor.lhs.false2: ; preds = %if.then
switch i32 %Status, label %if.then27 [
i32 -7012, label %if.end35
i32 -10008, label %if.end35
i32 -10016, label %if.end35
i32 15000, label %if.end35
i32 14013, label %if.end35
i32 10114, label %if.end35
i32 10107, label %if.end35
i32 10105, label %if.end35
i32 10013, label %if.end35
i32 10011, label %if.end35
i32 7008, label %if.end35
i32 7007, label %if.end35
i32 5002, label %if.end35
]
```
which is compiled into a balanced binary tree like this on AArch64 (similar on X86)
```
.LBB853_9: // %lor.lhs.false2
mov w8, #10012
cmp w19, w8
b.gt .LBB853_14
// BB#10: // %lor.lhs.false2
mov w8, #5001
cmp w19, w8
b.gt .LBB853_18
// BB#11: // %lor.lhs.false2
mov w8, #-10016
cmp w19, w8
b.eq .LBB853_23
// BB#12: // %lor.lhs.false2
mov w8, #-10008
cmp w19, w8
b.eq .LBB853_23
// BB#13: // %lor.lhs.false2
mov w8, #-7012
cmp w19, w8
b.eq .LBB853_23
b .LBB853_3
.LBB853_14: // %lor.lhs.false2
mov w8, #14012
cmp w19, w8
b.gt .LBB853_21
// BB#15: // %lor.lhs.false2
mov w8, #-10105
add w8, w19, w8
cmp w8, #9 // =9
b.hi .LBB853_17
// BB#16: // %lor.lhs.false2
orr w9, wzr, #0x1
lsl w8, w9, w8
mov w9, #517
and w8, w8, w9
cbnz w8, .LBB853_23
.LBB853_17: // %lor.lhs.false2
mov w8, #10013
cmp w19, w8
b.eq .LBB853_23
b .LBB853_3
.LBB853_18: // %lor.lhs.false2
mov w8, #-7007
add w8, w19, w8
cmp w8, #2 // =2
b.lo .LBB853_23
// BB#19: // %lor.lhs.false2
mov w8, #5002
cmp w19, w8
b.eq .LBB853_23
// BB#20: // %lor.lhs.false2
mov w8, #10011
cmp w19, w8
b.eq .LBB853_23
b .LBB853_3
.LBB853_21: // %lor.lhs.false2
mov w8, #14013
cmp w19, w8
b.eq .LBB853_23
// BB#22: // %lor.lhs.false2
mov w8, #15000
cmp w19, w8
b.ne .LBB853_3
```
However, the inline cost model estimates the cost to be linear with the number
of distinct targets and the cost of the above switch is just 2 InstrCosts.
The function containing this switch is then inlined about 900 times.
This change use the general way of switch lowering for the inline heuristic. It
etimate the number of case clusters with the suitability check for a jump table
or bit test. Considering the binary search tree built for the clusters, this
change modifies the model to be linear with the size of the balanced binary
tree. The model is off by default for now :
-inline-generic-switch-cost=false
This change was originally proposed by Haicheng in D29870.
Reviewers: hans, bmakam, chandlerc, eraman, haicheng, mcrosier
Reviewed By: hans
Subscribers: joerg, aemerson, llvm-commits, rengolin
Differential Revision: https://reviews.llvm.org/D31085
llvm-svn: 301649
This allows users to query the target triple and target pointer width, which
would make me able to fix https://github.com/servo/rust-bindgen/issues/593 and
other related bugs in an elegant way (without having to manually parse the
target triple in the command line arguments).
Differential Revision: https://reviews.llvm.org/D32389
llvm-svn: 301648
diagnostic in #pragma diagnostic
This matches the warning group that's specified for the unknown warning options
that are passed-in as command line arguments.
rdar://29526025
llvm-svn: 301647
Summary:
Skip memops if the total value profiled count is 0, we can't correctly
scale up the counts and there is no point anyway.
Reviewers: davidxl
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D32624
llvm-svn: 301645
Reapplied r299221 after fix for nondeterminism in ThinLTO builder (rL301599), with extra check for implicit truncation of inserted element.
llvm-svn: 301644
Summary: ConstStrings are immutable, so there is no need to grab even a reader lock in order to read the length field.
Reviewers: #lldb, labath
Reviewed By: labath
Subscribers: zturner, labath, lldb-commits
Differential Revision: https://reviews.llvm.org/D32306
Patch by Scott Smith <scott.smith@purestorage.com>
llvm-svn: 301642
generation.
This needs changes to GPURuntime to expose synchronization between host
and device.
1. Needs better function naming, I want a better name than
"getOrCreateManagedDeviceArray"
2. DeviceAllocations is used by both the managed memory and the
non-managed memory path. This exploits the fact that the two code paths
are never run together. I'm not sure if this is the best design decision
Reviewed by: PhilippSchaad
Tags: #polly
Differential Revision: https://reviews.llvm.org/D32215
llvm-svn: 301640
Summary:
It turns out that even though ppoll is available on all the android
devices we support, it does not seem to be working properly on all of
them -- MainLoop just does a busy loop with ppoll returning EINTR and
not making any progress.
This brings back the pselect implementation and makes it available on
android. I could not do any cmake checks for this as the ppoll symbol is
actually avaiable -- it just does not work.
Reviewers: beanz, eugene
Subscribers: srhines, lldb-commits
Differential Revision: https://reviews.llvm.org/D32600
llvm-svn: 301636
Declare the ARMInstructionSelector in an anonymous namespace, to make it
more in line with the other targets which were migrated to this in
r299637 in order to avoid TableGen'erated headers being included in
non-GlobalISel builds.
llvm-svn: 301632
There was a garbage character in output introduced by myself in
r290040 "[DWARF] - Introduce DWARFDebugPubTable class for dumping pub* sections."
llvm-svn: 301631
This is a follow up to the fix in r298360 to improve the handling of debug
values when redundant LEAs are removed. The fix in r298360 effectively
discarded the debug values. This patch now attempts to preserve the debug
values by using the DWARF DW_OP_stack_value operation via prependDIExpr.
Moved functions appendOffset and prependDIExpr from Local.cpp to
DebugInfoMetadata.cpp and made them available as static member functions of
DIExpression.
Differential Revision: https://reviews.llvm.org/D31604
llvm-svn: 301630
EarlyCSE should not just ignore assumes. It should use the fact that its condition is true for all dominated instructions.
Reviewers: sanjoy, reames, apilipenko, anna, skatkov
Reviewed By: reames, sanjoy
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D32482
llvm-svn: 301625
If a condition is calculated only once, and there are multiple guards on this condition, we should be able
to remove all guards dominated by the first of them. This patch allows EarlyCSE to try to find the condition
of a guard among the known values, and if it is true, remove the guard. Otherwise we keep the guard and
mark its condition as 'true' for future consideration.
Reviewers: sanjoy, reames, apilipenko, skatkov, anna, dberlin
Reviewed By: reames, sanjoy
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D32476
llvm-svn: 301623
This patch replaces the separate APInts for KnownZero/KnownOne with a single KnownBits struct. This is similar to what was done to ValueTracking's version recently.
This is largely a mechanical transformation from KnownZero to Known.Zero.
Differential Revision: https://reviews.llvm.org/D32569
llvm-svn: 301620
This patch uses various APInt methods to reduce the number of temporary APInts. These were all found while working through converting SelectionDAG's computeKnownBits to also use the KnownBits struct recently added to the ValueTracking version.
llvm-svn: 301618
Tests that run on the iOS simulator require the dlopen'd dylibs are codesigned. This patch adds the "iossim_compile.py" wrapper that codesigns any produces dylib.
Differential Revision: https://reviews.llvm.org/D32561
llvm-svn: 301617
Summary:
In some cases LLVM (especially the SLP vectorizer) will create vectors
that are 256 bytes (or larger). Given that this is intentional[0] is
likely to get more common, this patch updates the StackMap binary
format to deal with the spill locations for said vectors.
This change also bumps the stack map version from 2 to 3.
[0]: https://reviews.llvm.org/D32533#738350
Reviewers: reames, kavon, skatkov, javed.absar
Subscribers: mcrosier, nemanjai, llvm-commits
Differential Revision: https://reviews.llvm.org/D32629
llvm-svn: 301615