Summary:
Add musttail to any resume instructions that is immediately followed by a
suspend (i.e. ret). We do this even in -O0 to support guaranteed tail call
for symmetrical coroutine control transfer (C++ Coroutines TS extension).
This transformation is done only in the resume part of the coroutine that has
identical signature and calling convention as the coro.resume call.
Reviewers: GorNishanov
Reviewed By: GorNishanov
Subscribers: EricWF, majnemer, llvm-commits
Differential Revision: https://reviews.llvm.org/D37125
llvm-svn: 311751
to handle other x86 pseudos that carry flags and thus can't be matched
by our ISel patterns with fused memory accesses.
Differential Revision: https://reviews.llvm.org/D37088
llvm-svn: 311749
This extracts the code out of a giant switch in preparation for expanding it to
handle operations other thin `inc` and `dec`. Add a FIXME indicating what's
coming here.
Differential Revision: https://reviews.llvm.org/D37045
llvm-svn: 311748
Because isOperationCustom was only checking for custom
lowering on illegal types, this was behaving inconsistently
with the other isOperation* functions, so that
isOperationLegalOrCustom != (isOperationLegal || isOperationCustom)
Luckily this is only used in one place which already checks the
type legality on its own.
llvm-svn: 311743
Summary: The expected order of pointer-like keys is hash-function-dependent which in turn depends on the platform/environment. Need to come up with a better way to test reverse iteration of containers with pointer-like keys.
Reviewers: dblaikie, mehdi_amini, efriedma, mgrang
Reviewed By: mgrang
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D37128
llvm-svn: 311741
FeatureSlowUAMem32.
The idea was to mark things that are slow on widely available processors
as slow in the generic CPU so that the code generated for that CPU would
be fast across those processors. However, for this feature that doesn't
work out very well at all.
The problem here is that you can very easily enable AVX or AVX2 on top
of this generic CPU. For example, this can happen just by using AVX2
intrinsics from Clang within a region of code guarded by a dynamic CPU
feature test. When you do that, the generated code with SlowUAMem32 set
is ... amazingly slower. The problem is that there really aren't very
good alternatives to the unaligned loads, and so our vector codegen
regresses significantly.
The other issue is that there are plenty of AMD CPUs with AVX1 that
don't set FeatureSlowUAMem32 and so we shouldn't just check for AVX2
instead of this special feature. =/
It would be nice to have the target attriute logic be able to
enable/disable more than just one feature at a time and control this in
a more fine grained and useful way, but that doesn't seem easy. Given
that it is only Sandybridge and Ivybridge that set this feature, for now
I'm just backing it out of the generic CPU. That has the additional
advantage of going back to the previous state that people seemed vaguely
happy with.
llvm-svn: 311740
Summary:
If assertions are disabled, but LLVM_ABI_BREAKING_CHANGES is enabled,
this will cause an issue with an unchecked Success. Switching to
consumeError() is the correct way to bypass the check. This patch also
includes disabling 2 tests that can't work without assertions enabled,
since llvm_unreachable() with NDEBUG won't crash.
Reviewers: llvm-commits, lhames
Reviewed By: lhames
Subscribers: lhames, pirama
Differential Revision: https://reviews.llvm.org/D36729
llvm-svn: 311739
The comment for this code indicated that it should work similar to our
handling of add lowering above: if we see uses of an instruction other
than flag usage and store usage, it tries to avoid the specialized
X86ISD::* nodes that are designed for flag+op modeling and emits an
explicit test.
Problem is, only the add case actually did this. In all the other cases,
the logic was incomplete and inverted. Any time the value was used by
a store, we bailed on the specialized X86ISD node. All of this appears
to have been historical where we had different logic here. =/
Turns out, we have quite a few patterns designed around these nodes. We
should actually form them. I fixed the code to match what we do for add,
and it has quite a positive effect just within some of our test cases.
The only thing close to a regression I see is using:
notl %r
testl %r, %r
instead of:
xorl -1, %r
But we can add a pattern or something to fold that back out. The
improvements seem more than worth this.
I've also worked with Craig to update the comments to no longer be
actively contradicted by the code. =[ Some of this still remains
a mystery to both Craig and myself, but this seems like a large step in
the direction of consistency and slightly more accurate comments.
Many thanks to Craig for help figuring out this nasty stuff.
Differential Revision: https://reviews.llvm.org/D37096
llvm-svn: 311737
Patch by Patricio Villalobos.
I discovered that lld for darwin is generating the wrong code for lazy
bindings in the __stub_helper section (at least for osx 10.12). This is
the way i can reproduce this problem, using this program:
#include <stdio.h>
int main(int argc, char **argv) {
printf("C: printf!\n");
puts("C: puts!\n");
return 0;
}
Then I link it using i have tested it in 3.9, 4.0 and 4.1 versions:
$ clang -c hello.c
$ lld -flavor darwin hello.o -o h1 -lc
When i execute the binary h1 the system gives me the following error:
C: printf!
dyld: lazy symbol binding failed:
BIND_OPCODE_SET_SEGMENT_AND_OFFSET_ULEB
has segment 4 which is too large (0..3)
dyld: BIND_OPCODE_SET_SEGMENT_AND_OFFSET_ULEB has segment 4 which is too
large (0..3)
Trace/BPT trap: 5
Investigating the code, it seems that the problem is that the asm code
generated in the file StubPass.cpp, specifically in the line 323,when it
adds, what it seems an arbitrary number (12) to the offset into the lazy
bind opcodes section, but it should be calculated depending on the
MachONormalizedFileBinaryWrite::lazyBindingInfo result.
I confirmed this bug by patching the code manually in the binary and
writing the right offset in the asm code (__stub_helper).
This patch fixes the content of the atom that contains the assembly code
when the offset is known.
Differential Revision: https://reviews.llvm.org/D35387
llvm-svn: 311734
The problem is that CMake is mostly imperative and the result of
processing "if (TARGET blah)" checks depends on the order of import of
CMake files.
In this case, "projects" folder is registered before "tools",
and calling "CheckClangHeaders" [renamed to have a better name]
errors out without even giving Clang a chance to be built.
This, in turn, leads to libFuzzer bot failures in some circumstances on
some machines (depends on whether LIT or UNIT tests are scheduled
first).
Differential Revision: https://reviews.llvm.org/D37126
llvm-svn: 311733
This goes back to a discussion about IR canonicalization. We'd like to preserve and convert
more IR to 'select' than we currently do because that's likely the best choice in IR:
http://lists.llvm.org/pipermail/llvm-dev/2016-September/105335.html
...but that's often not true for codegen, so we need to account for this pattern coming in
to the backend and transform it to better DAG ops.
Steps in this patch:
1. Add an EVT param to the existing convertSelectOfConstantsToMath() TLI hook to more finely
enable this transform. Other targets will probably want that anyway to distinguish scalars
from vectors. We're using that here to exclude AVX512 targets, but it may not be necessary.
2. Convert a vselect to ext+add. This eliminates a constant load/materialization, and the
vector ext is often free.
Implementing a more general fold using xor+and can be a follow-up for targets that don't have
a legal vselect. It's also possible that we can remove the TLI hook for the special case fold
implemented here because we're eliminating a constant, but it needs to be tested on other
targets.
Differential Revision: https://reviews.llvm.org/D36840
llvm-svn: 311731
There are 3 small independent changes here:
1. Account for multiple uses in the pattern matching: avoid the transform if it increases the instruction count.
2. Add a missing fold for the case where the numerator is the constant: http://rise4fun.com/Alive/E2p
3. Enable all folds for vector types.
There's still one more potential change - use "shouldChangeType()" to keep from transforming to an illegal integer type.
Differential Revision: https://reviews.llvm.org/D36988
llvm-svn: 311726
linker script SECTION rules. This patch extends it to use a fully specified
file name as it appears in --trace output to match agains, i.e,
"<path>/<objname>.o" or "<path>/<libname>.a(<objname>.o)".
Differential Revision: https://reviews.llvm.org/D37031
llvm-svn: 311713
Currently LLD reads the R_MIPS_HI16's addends in the `computeMipsAddend`
function, the R_MIPS_LO16's addends in both `computeMipsAddend` and
`getImplicitAddend` functions. This patch moves reading all addends to
the `getImplicitAddend` function. As a side effect it fixes a "paired"
HI16/LO16 addend calculation if "LO16" part of a pair is not found.
llvm-svn: 311711
Summary: With accurate sample profile, we can do more aggressive size optimization. For some size-critical application, this can reduce the text size by 20%
Reviewers: davidxl, rsmith
Reviewed By: davidxl, rsmith
Subscribers: mehdi_amini, eraman, sanjoy, cfe-commits
Differential Revision: https://reviews.llvm.org/D37091
llvm-svn: 311707
Summary: We need to have accurate-sample-profile in function attribute so that it works with LTO.
Reviewers: davidxl, rsmith
Reviewed By: davidxl
Subscribers: sanjoy, mehdi_amini, javed.absar, llvm-commits, eraman
Differential Revision: https://reviews.llvm.org/D37113
llvm-svn: 311706
test/std/containers/Emplaceable.h
test/std/containers/NotConstructible.h
test/support/counting_predicates.hpp
Replace unary_function/binary_function inheritance with typedefs.
test/std/depr/depr.function.objects/depr.base/binary_function.pass.cpp
test/std/depr/depr.function.objects/depr.base/unary_function.pass.cpp
test/std/utilities/function.objects/func.require/binary_function.pass.cpp
test/std/utilities/function.objects/func.require/unary_function.pass.cpp
Mark these tests as requiring 98/03/11/14 because 17 removed unary_function/binary_function.
test/std/thread/futures/futures.task/futures.task.members/ctor_func_alloc.pass.cpp
test/std/thread/futures/futures.task/futures.task.nonmembers/uses_allocator.pass.cpp
Mark these tests as requiring 11/14 because 17 removed packaged_task allocator support.
test/std/utilities/function.objects/func.wrap/func.wrap.func/derive_from.pass.cpp
This test doesn't need to be skipped in C++17 mode. Only the construction of
std::function from an allocator needs to be skipped in C++17 mode.
test/std/utilities/function.objects/refwrap/refwrap.access/conversion.pass.cpp
test/std/utilities/function.objects/refwrap/refwrap.assign/copy_assign.pass.cpp
test/std/utilities/function.objects/refwrap/refwrap.const/copy_ctor.pass.cpp
test/std/utilities/function.objects/refwrap/refwrap.const/type_ctor.pass.cpp
When testing these reference_wrapper features, unary_function inheritance is totally irrelevant.
test/std/utilities/function.objects/refwrap/weak_result.pass.cpp
Define and use my_unary_function/my_binary_function to test the weak result type machinery
(which is still present in C++17, although deprecated).
test/support/msvc_stdlib_force_include.hpp
Now we can test C++17 strictly, without enabling removed features.
Fixes D36503.
llvm-svn: 311705
Do not sanitize the 'this' pointer of a member call operator for a lambda with
no capture-default, since that call operator can legitimately be called with a
null this pointer from the static invoker function. Any actual call with a null
this pointer should still be caught in the caller (if it is being sanitized).
This reinstates r311589 (reverted in r311680) with the above fix.
llvm-svn: 311695
Summary: Currently FastISel lowers constexpr calls as indirect calls.
We'd like those to direct calls, and falling back to SelectionDAGISel
handles that.
Reviewers: dschuff, sunfish
Subscribers: jfb, sbc100, llvm-commits, aheejin
Differential Revision: https://reviews.llvm.org/D37073
llvm-svn: 311693
Summary:
Update GCC test suite failure expectations as we add -O0 to the bare tests in
WebAssembly waterfall. There are still several untriaged lld failures.
Reviewers: sbc100, jgravelle-google, dschuff
Reviewed By: dschuff
Subscribers: jfb
Differential Revision: https://reviews.llvm.org/D37100
llvm-svn: 311691
This fixes a warning when there are zero defined predicates and also fixes an
unnoticed bug where the first predicate in the table was unusable.
llvm-svn: 311684
Discovered due to a goofy git setup, the test system-headerline-directive.c
(and a few others) failed because the token-consumption will consume only the
'\r' in CRLF, making the preprocessor's printed value give the wrong line number
when returning from an include. For example:
(line 1):#include <noline.h>\r\n
The "file exit" code causes the printer to try to print the 'returned to the
main file' line. It looks up what the current line number is. However, since the
current 'token' is the '\n' (since only the \r was consumed), it will give the
line number as '1", not '2'. This results in a few failed tests, but more
importantly, results in error messages being incorrect when compiling a
previously preprocessed file.
Differential Revision: https://reviews.llvm.org/D37079
llvm-svn: 311683
Before this patch, lld printed out something like
error: -O: number expected, but got
After this patch, it prints out the same error message like this:
error: -O: number expected, but got ''
Fixes https://bugs.llvm.org/show_bug.cgi?id=34311
llvm-svn: 311681