When we have dynamic allocas we have a frame pointer, and
when we're lowering frame indexes we should make sure we use it.
Patch by Jacob Gravelle
Differential Revision: https://reviews.llvm.org/D24889
llvm-svn: 282442
other load commands that use the Mach::linkedit_data_command type
but not used in llvm libObject code but used in llvm tool code.
This includes LC_FUNCTION_STARTS, LC_SEGMENT_SPLIT_INFO
and LC_DYLIB_CODE_SIGN_DRS load commands.
llvm-svn: 282441
Summary:
`std::move` and `std::forward` were not marked constexpr in C++11. This can be very damaging because it makes otherwise constant expressions non-constant. For example:
```
#include <utility>
template <class T>
struct Foo {
constexpr Foo(T&& tx) : t(std::move(tx)) {}
T t;
};
[[clang::require_constant_initialization]] Foo<int> f(42); // Foo should be constant initialized but C++11 move is not constexpr. As a result `f` is an unsafe global.
```
This patch applies `constexpr` to `move` and `forward` as an extension in C++11. Normally the library is not allowed to add `constexpr` because it may be observable to the user. In particular adding constexpr may cause valid code to stop compiling. However these problems only happen in more complex situations, like making `__invoke(...)` constexpr. `forward` and `move` are simply enough that applying `constexpr` is safe.
Note that libstdc++ has offered this extension since at least 4.8.1.
Most of the changes in this patch are simply test cleanups or additions. The main changes in the tests are:
* Fold all `forward_N.fail.cpp` tests into a single `forward.fail.cpp` test using -verify.
* Delete most `move_only_N.fail.cpp` tests because they weren't actually testing anything.
* Fold `move_copy.pass.cpp` and `move_only.pass.cpp` into a single `move.pass.cpp` test.
* Add return type and noexcept tests for `forward` and `move`.
Reviewers: rsmith, mclow.lists, EricWF
Subscribers: K-ballo, loladiro
Differential Revision: https://reviews.llvm.org/D24637
llvm-svn: 282439
Summary:
This patch improves thinlto importer
by importing 3x larger functions that are called from hot block.
I compared performance with the trunk on spec, and there
were about 2% on povray and 3.33% on milc. These results seems
to be consistant and match the results Teresa got with her simple
heuristic. Some benchmarks got slower but I think they are just
noisy (mcf, xalancbmki, omnetpp)- running the benchmarks again with
more iterations to confirm. Geomean of all benchmarks including the noisy ones
were about +0.02%.
I see much better improvement on google branch with Easwaran patch
for pgo callsite inlining (the inliner actually inline those big functions)
Over all I see +0.5% improvement, and I get +8.65% on povray.
So I guess we will see much bigger change when Easwaran patch will land
(it depends on new pass manager), but it is still worth putting this to trunk
before it.
Implementation details changes:
- Removed CallsiteCount.
- ProfileCount got replaced by Hotness
- hot-import-multiplier is set to 3.0 for now,
didn't have time to tune it up, but I see that we get most of the interesting
functions with 3, so there is no much performance difference with higher, and
binary size doesn't grow as much as with 10.0.
Reviewers: eraman, mehdi_amini, tejohnson
Subscribers: mehdi_amini, llvm-commits
Differential Revision: https://reviews.llvm.org/D24638
llvm-svn: 282437
This is the Linux counterpart to the sampling support I added
on the macOS side.
This change also introduces zip-file compression if the size of
the sample output is greater than 10 KB. The Linux side can be
quite large and the textual content is averaging over a 10x
compression factor on tests that I force to time out. When
compression takes place, the filename becomes:
{session_dir}/{TestFilename.py}-{pid}.sample.zip
This support relies on the linux 'perf' tool. If it isn't
present, the behavior is to ignore pre-kill processing of
the timed out test process.
Note calling the perf tool under the timeout command appears
to nuke the profiled process. This was causing the timeout
kill logic to fail due to the process having disappeared.
I modified the kill logic to catch the case of the process
not existing, and I have it ignore the kill request in that
case. Any other exception is still raised.
Reviewers: labath
Subscribers: lldb-commits
Differential Revision: https://reviews.llvm.org/D24890
llvm-svn: 282436
Declare __STDC_FORMAT_MACROS, __STDC_LIMIT_MACROS and
__STDC_CONSTANT_MACROS before including real inttypes.h/stdint.h when
the wrapper-header is included in C++11, in order to enable
the necessary macros in C99-compliant libc.
The C99 standard defined that the format macros in inttypes.h should be
defined by the C++ implementations only when __STDC_FORMAT_MACROS is
defined, and the limit and constant macros in stdint.h should be defined
only when __STDC_LIMIT_MACROS and __STDC_CONSTANT_MACROS are defined
appropriately. Following this specification, multiple old versions of
glibc up to 2.17 do not define those macros by default for C++,
rendering the libc++ headers non-compliant to the C++11 standard.
In order to achieve the necessary compliance, __STDC_FORMAT_MACROS is
defined in wrapped inttypes.h just before including the system
inttypes.h, when C++11 or newer is used. Both __STDC_LIMIT_MACROS
and __STDC_CONSTANT_MACROS are defined in newly-wrapped stdint.h. This
fixes the C++11 compliance while preserving the current behavior for
C++03.
Differential Revision: https://reviews.llvm.org/D24903
llvm-svn: 282435
This allows debugging of the JIT and other analyses of the internals of the
expression parser. I've also added a testcase that verifies that the setting
works correctly when off and on.
llvm-svn: 282434
CommandData breakpoint commands didn't know whether they were
Python or Command line commands, so they couldn't serialize &
deserialize themselves properly. Fix that.
I also changed the "breakpoint list" command to note in the output
when the commands are Python commands. Fortunately only one test
was relying on this explicit bit of text output.
llvm-svn: 282432
The BYTE, SHORT, LONG, and QUAD commands store one, two, four, and eight bytes (respectively).
After storing the bytes, the location counter is incremented by the number of bytes
stored.
Previously our scripts handles these commands incorrectly. For example:
SECTIONS {
.foo : {
*(.foo.1)
BYTE(0x11)
...
We accepted the script above treating BYTE as input section description.
These commands are used in the wild though.
Differential revision: https://reviews.llvm.org/D24830
llvm-svn: 282429
PR30521 was about linking shared library. After r282295 code when linking -shared produced
"entry symbol not found" warning, what in combination with --fatal-errors failed linkage.
Patch fixes logic (and adds testcases) to follow next rules:
1) If entry was specified and not found report warning.
2) If entry was not specified then:
a) Emit warning if not -shared.
b) Do not emit warning if -shared.
Differential revision: https://reviews.llvm.org/D24913
llvm-svn: 282427
This option behaves in a similar spirit as -save-temps and writes
internal llvm statistics in json format to a file.
Differential Revision: https://reviews.llvm.org/D24820
llvm-svn: 282426
Previously enabling the statistics with EnableStatistics() would lead to
them getting printed to stderr/-info-output-file on exit. However
frontends may want a way to enable statistics and do the printing on
their own instead of the forced printing on exit.
This changes the code so that only the -stats option enables printing on
exit, EnableStatistics() only enables the tracking but requires invoking
one of the PrintStatistics() variants.
Differential Revision: https://reviews.llvm.org/D24819
llvm-svn: 282425
Rework getLongestCommonPrefixLen() so that it doesn't access string null
terminators. The old version with std::mismatch would do this:
|
v
Strings[0] = ['a', nil]
Strings[1] = ['a', 'a', nil]
^
|
This should silence a warning from the MSVC runtime (PR30515). As
before, I tested this out by preparing a coverage report for FileCheck.
Thanks to Yaron Keren for the report!
llvm-svn: 282422
This patch ensures that we actually scalarize instructions marked scalar after
vectorization. Previously, such instructions may have been vectorized instead.
Differential Revision: https://reviews.llvm.org/D23889
llvm-svn: 282418
Summary:
If coroutine has no suspend points, remove heap allocation and turn a coroutine into a normal function.
Also, if a pattern is detected that coroutine resumes or destroys itself prior to coro.suspend call, turn the suspend point into a simple jump to resume or cleanup label. This pattern occurs when coroutines are used to propagate errors in functions that return expected<T>.
Reviewers: majnemer
Subscribers: mehdi_amini, llvm-commits
Differential Revision: https://reviews.llvm.org/D24408
llvm-svn: 282414
Don't match the UXTW extended reg forms of ADD/ADDS/SUB/SUBS if the
32-bit to 64-bit zero-extend can be done for free by taking advantage
of the 32-bit defining instruction zeroing the upper 32-bits of the X
register destination. This enables better instruction selection in a
few cases, such as:
sub x0, xzr, x8
instead of:
mov x8, xzr
sub x0, x8, w9, uxtw
madd x0, x1, x1, x8
instead of:
mul x9, x1, x1
add x0, x9, w8, uxtw
cmp x2, x8
instead of:
sub x8, x2, w8, uxtw
cmp x8, #0
add x0, x8, x1, lsl #3
instead of:
lsl x9, x1, #3
add x0, x9, w8, uxtw
Reviewers: t.p.northover, jmolloy
Subscribers: mcrosier, aemerson, llvm-commits, rengolin
Differential Revision: https://reviews.llvm.org/D24747
llvm-svn: 282413
Many high-performance processors have a dedicated branch predictor for
indirect branches, commonly used with jump tables. As sophisticated as such
branch predictors are, they tend to have well defined limits beyond which
their effectiveness is hampered or even nullified. One such limit is the
number of possible destinations for a given indirect branches that such
branch predictors can handle.
This patch considers a limit that a target may set to the number of
destination addresses in a jump table.
Patch by: Evandro Menezes <e.menezes@samsung.com>, Aditya Kumar
<aditya.k7@samsung.com>, Sebastian Pop <s.pop@samsung.com>.
Differential revision: https://reviews.llvm.org/D21940
llvm-svn: 282412
This is a follow up to r282152.
A more extensive testing on real apps revealed a subtle bug in r282152.
The revision made shadow mapping non-linear even within a single
user region. But there are lots of code in runtime that processes
memory ranges and assumes that mapping is linear. For example,
region memory access handling simply increments shadow address
to advance to the next shadow cell group. Similarly, DontNeedShadowFor,
java memory mover, search of heap memory block header, etc
make similar assumptions.
To trigger the bug user range would need to cross 0x008000000000 boundary.
This was observed for a module data section.
Make shadow mapping linear within a single user range again.
Add a startup CHECK for linearity.
llvm-svn: 282405
Don't xor user address with kAppMemXor in meta mapping.
The only purpose of kAppMemXor is to raise shadow for ~0 user addresses,
so that they don't map to ~0 (which would cause overlap between
user memory and shadow).
For meta mapping we explicitly add kMetaShadowBeg offset,
so we don't need to additionally raise meta shadow.
llvm-svn: 282403
The index of the new insertelement instruction was evaluated in the
wrong way, it was considered as the index of the inserted value instead
of index of the position, where the value should be inserted.
llvm-svn: 282401
This patch fixes PR30366.
Function foldUDivShl() worked under the assumption that one of the values
in input to the function was always an instance of llvm::Instruction.
However, function visitUDivOperand() (the only user of foldUDivShl) was
clearly violating that precondition; internally, visitUDivOperand() uses pattern
matches to check the operands of a udiv. Pattern matchers for binary operators
know how to handle both Instruction and ConstantExpr values.
This patch fixes the problem in foldUDivShl(). Now we use pattern matchers
instead of explicit casts to Instruction. The reduced test case from PR30366
has been added to test file InstCombine/udiv-simplify.ll.
Differential Revision: https://reviews.llvm.org/D24565
llvm-svn: 282398