When we see a SETCC whose only users are zero extend operations, we can replace
it with a subtraction. This results in doing all calculations in GPRs and
avoids CR use.
Currently we do this only for ULT, ULE, UGT and UGE condition codes. There are
ways that this can be extended. For example for signed condition codes. In that
case we will be introducing additional sign extend instructions, so more careful
profitability analysis may be required.
Another direction to extend this is for equal, not equal conditions. Also when
users of SETCC are any_ext or sign_ext, we might be able to do something
similar.
llvm-svn: 287329
libc++ no longer supports C++11 compilers that don't implement `= default`.
This patch removes all instances of the feature test macro
_LIBCPP_HAS_NO_DEFAULTED_FUNCTIONS as well as the potentially dead code it hides.
llvm-svn: 287321
readVersionDeclaration was to read anonymous version definition and
named version definition. Splitting it into two functions should
improve readability as the two cases are different enough.
I also changed a few helper functions to return values instead of
mutating given references.
llvm-svn: 287319
The doxygen comments are automatically generated based on Sony's intrinsics document.
I got an OK from Eric Christopher to commit doxygen comments without prior code
review upstream. This patch was internally reviewed by Charles Li.
llvm-svn: 287317
This is a straightforward extension of the existing support for 32/64-bit element types. Just needed to add the additional instrinsics to the switches.
llvm-svn: 287316
This argument was only used in one place in the codebase, and
it was in a non-critical log statement and can be easily
substituted for an equally meaningful field instead. The
payoff of computing this value is not worth the added
complexity.
llvm-svn: 287315
MergeOutputSection class was a bit hard to use because it provdes
a series of finalize functions that have to be called in a right way
at a right time. It also intereacted with MergeInputSection, and the
logic was somewhat entangled between the two classes.
This patch simplifies it by providing only one finalize function.
Now, all you have to do is to call MergeOutputSection::finalize
when you have added all sections to the output section. Then, it
internally merges strings and initliazes StringPiece objects.
I think this is much easier to understand.
This patch also adds comments.
llvm-svn: 287314
The same thing was done to 32-bit and 64-bit element sizes previously.
This will allow us to support these shuffls in InstCombineCalls along with the other variable shift intrinsics.
llvm-svn: 287312
Fix a typo in the conditional. Caught by going through list of removed
symbols when building with hidden visibility.
Differential Revision: https://reviews.llvm.org/D26825
llvm-svn: 287309
Apparently these two enormous functions were dead. Which is
good, since one was largely a copy of another function with
only a few minor tweaks.
llvm-svn: 287308
Originally I converted this entire function and all dependents
to use StringRef, but there were some test failures that
were tricky to track down, as this is a complicated function.
So I'm starting over, this time in smaller increments.
llvm-svn: 287307
since bpf instruction set was introduced people learned to
read and understand kernel verifier output whereas llvm asm
output stayed obscure and unknown. Convert llvm to emit
assembler text similar to kernel to avoid this discrepancy
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
llvm-svn: 287300
Summary:
This extends FCOPYSIGN support to 512-bit vectors.
I've also added tests to show what the 128-bit and 256-bit cases look like with broadcast loads.
Reviewers: delena, zvi, RKSimon, spatel
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D26791
llvm-svn: 287298
The doxygen comments are automatically generated based on Sony's intrinsics document.
I got an OK from Eric Christopher to commit doxygen comments without prior code
review upstream. This patch was internally reviewed by Paul Robinson and Charles Li.
llvm-svn: 287295
Currently sym_check almost all names found in the binary, including those
which are defined in other libraries. This makes our ABI lists harder to maintain.
This patch adds a --only-stdlib-symbols option to sym_check which removes
all symbols which aren't possibly provided by libc++. It also re-generates
the linux ABI list after making this change.
llvm-svn: 287294
Summary: This should provide the function similar to `--disable-libedit` with the autotools build system, which seems to be missing from the commit (r200595) that adds this.
Reviewers: pcc, beanz
Subscribers: mgorny, llvm-commits
Differential Revision: https://reviews.llvm.org/D26550
llvm-svn: 287293
StartFunction enters a release cleanup for ns_consumed arguments in
ARC, so we need to balance that somehow. We could teach StartFunction
that it's emitting a delegating function, so that the cleanup is
unnecessary, but that would be invasive and somewhat fraught. We could
balance the consumed argument with an extra retain, but clearing the
original variable should be easier to optimize and avoid some extra work
at -O0. And there shouldn't be any difference as long as nothing else
uses the argument, which should always be true for the places we emit
delegate arguments.
Fixes PR 27887.
llvm-svn: 287291
Summary:
This used to work because system headers are found in a (somewhat)
predictable set of locations on Linux. But this is not the case on
MacOS; without this change, we don't look in the right places for our
headers when doing device-side compilation on Mac.
Reviewers: tra
Subscribers: cfe-commits
Differential Revision: https://reviews.llvm.org/D26776
llvm-svn: 287286
Summary:
Compiling CUDA device code requires us to know the host toolchain,
because CUDA device-side compiles pull in e.g. host headers.
When we only supported Linux compilation, this worked because
CudaToolChain, which is responsible for device-side CUDA compilation,
inherited from the Linux toolchain. But in order to support MacOS,
CudaToolChain needs to take a HostToolChain pointer.
Because a CUDA toolchain now requires a host TC, we no longer will
create a CUDA toolchain from Driver::getToolChain -- you have to go
through CreateOffloadingDeviceToolChains. I am *pretty* sure this is
correct, and that previously any attempt to create a CUDA toolchain
through getToolChain() would eventually have resulted in us throwing
"error: unsupported use of NVPTX for host compilation".
In any case hacking getToolChain to create a CUDA+host toolchain would
be wrong, because a Driver can be reused for multiple compilations,
potentially with different host TCs, and getToolChain will cache the
result, causing us to potentially use a stale host TC.
So that's the main change in this patch.
In addition, we have to pull CudaInstallationDetector out of Generic_GCC
and into a top-level class. It's now used by the Generic_GCC and MachO
toolchains.
Reviewers: tra
Subscribers: rryan, hfinkel, sfantao
Differential Revision: https://reviews.llvm.org/D26774
llvm-svn: 287285
These are part of the EHABI specification and are exported to be available to
users. Mark them as `_LIBUNWIND_EXPORT` like the rest of the unwind interfaces.
llvm-svn: 287283
Since the output has a section table too, it is meaningful to compute
the sh_link. In a more practical note, the binutils' strip crashes if
sh_link is not set for SHT_ARM_EXIDX.
llvm-svn: 287280