When computing base pointers, we introduce new instructions to propagate the base of existing instructions which might not be bases. However, the algorithm doesn't make any effort to recognize when the new instruction to be inserted is the same as an existing one already in the IR. Since this is happening immediately before rewriting, we don't really have a chance to fix it after the pass runs without teaching loop passes about statepoints.
I'm really not thrilled with this patch. I've rewritten it 4 different ways now, but this is the best I've come up with. The case where the new instruction is just the original base defining value could be merged into the existing algorithm with some complexity. The problem is that we might have something like an extractelement from a phi of two vectors. It may be trivially obvious that the base of the 0th element is an existing instruction, but I can't see how to make the algorithm itself figure that out. Thus, I resort to the call to SimplifyInstruction instead.
Note that we can only adjust the instructions we've inserted ourselves. The live sets are still being tracked in side structures at this point in the code. We can't easily muck with instructions which might be in them. Long term, I'm really thinking we need to materialize the live pointer sets explicitly in the IR somehow rather than using side structures to track them.
Differential Revision: http://reviews.llvm.org/D12004
llvm-svn: 246133
This patch ensures that every analysis diagnostic produced by the vectorizer
will be printed if the loop has a vectorization hint on it. The condition has
also been improved to prevent printing when a disabling hint is specified.
llvm-svn: 246132
This will do things like,
given mylibrary,
return
libmylibrary.dylib on OSX
mylibrary.dll on Windows
and so on for other platforms
It is currently implemented for Windows, Darwin, and Linux. Other platforms should fill in accordingly
llvm-svn: 246131
This is a one-line-change patch that moves the update to UnhandledWeights to the correct position: it should be updated for all clusters instead of just range clusters.
Differential Revision: http://reviews.llvm.org/D12391
llvm-svn: 246129
with multiple uses of feature map construction.
Note: We could make this a static function on TargetInfo if we
fix the x86 port needing to check the triple in an isolated case.
llvm-svn: 246128
As Sanjoy pointed out over in http://reviews.llvm.org/D11819, a switch on an icmp should always be able to become a branch instruction. This patch generalizes that notion slightly to prove that the default case of a switch is unreachable if the cases completely cover all possible bit patterns in the condition. Once that's done, the switch to branch conversion kicks in just fine.
Note: Duplicate case values are disallowed by the LangRef and verifier.
Differential Revision: http://reviews.llvm.org/D11995
llvm-svn: 246125
DeclarationName (because all ctor names are considered the same, and so on).
Reflect this in the type used as the lookup table key. As a side-effect, remove
one copy of the duplicated code used to compute the hash of the key.
llvm-svn: 246124
Summary:
Let NVPTX backend detect integer min and max patterns during isel and emit intrinsics that enable hardware support.
Reviewers: jholewinski, meheff, jingyue
Subscribers: arsenm, llvm-commits, meheff, jingyue, eliben, jholewinski
Differential Revision: http://reviews.llvm.org/D12377
llvm-svn: 246107
Previously in isProfitableToIfCvt() in ARMBaseInstrInfo.cpp, the multiplication between an integer and a branch probability is done manually in an unsafe way that may lead to overflow. This patch corrects those cases by using BranchProbability's member function scale() to avoid overflow (which stores the intermediate result in int64).
Differential Revision: http://reviews.llvm.org/D12295
llvm-svn: 246106
Currently, when lowering switch statement and a new basic block is built for jump table / bit test header, the edge to this new block is not assigned with a correct weight. This patch collects the edge weight from all its successors and assign this sum of weights to the edge (and also the other fall-through edge). Test cases are adjusted accordingly.
Differential Revision: http://reviews.llvm.org/D12166#fae6eca7
llvm-svn: 246104
Expand the information on base pointers to include an example, the assumptions a collector is allowed to make, legal optimizations over gc.relocates, and the assumptions made by RewriteStatepointsForGC. This is the result of a recent conversation with folks from LLIC and the confusions that came to light therein.
llvm-svn: 246103
Summary: This is the first step in a multi-step refactoring to move add_sanitizer_rt_symbols in the direction of other add_* functions in compiler-rt.
Reviewers: filcab, bogner, kubabrecka, zaks.anna, glider, samsonov
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D12386
llvm-svn: 246102
Added a new class called DWARFDIE that contains a DWARFCompileUnit and DWARFDebugInfoEntry so that these items always stay together.
There were many places where we just handed out DWARFDebugInfoEntry pointers and then use them with a compile unit that may or may not be the correct one. Clients outside of DWARFCompileUnit and DWARFDebugInfoEntry should all be dealing with DWARFDIE instances instead of playing with DWARFCompileUnit/DWARFDebugInfoEntry pairs manually.
This paves to the way for some modifications that are coming for DWO.
llvm-svn: 246100
Change `DIBuilder` always to produce 'distinct' nodes when creating
`DISubprogram` definitions. I measured a ~5% memory improvement in the
link step (of ld64) when using `-flto -g`.
`DISubprogram`s are used in two ways in the debug info graph.
Some are definitions, point at actual functions, and can't really be
shared between compile units. With full debug info, these point down at
their variables, forming uniquing cycles. These uniquing cycles are
expensive to link between modules, since all unique nodes that reference
them transitively need to be duplicated (see commit message for r244181
for more details).
Others are declarations, primarily used for member functions in the type
hierarchy. Definitions never show up there; instead, a definition
points at its corresponding declaration node.
I started by making all subprograms 'distinct'. However, that was too
big a hammer: memory usage *increased* ~5% (net increase vs. this patch
of ~10%) because the 'distinct' declarations undermine LTO type
uniquing. This is a targeted fix for the definitions (where uniquing is
an observable problem).
A couple of notes:
- There's an accompanying commit to update IRGen testcases in clang.
- ^ That's what I'm using to test this commit.
- In a follow-up, I'll change the verifier to require 'distinct' on
definitions and add an upgrade to `BitcodeReader`.
llvm-svn: 246098
I stared at `false /*declaration*/` for quite some time before giving up
and checking the actual function to see what it meant. Replacing with
`/* isDefinition = */ false` to save myself effort later.
llvm-svn: 246095
Things of note:
- Other linkage types aren't handled yet. We'll figure it out with dynamic linking.
- Special LLVM globals are either ignored, or error out for now.
- TLS isn't supported yet (WebAssembly will have threads later).
- There currently isn't a syntax for alignment, I left it in a comment so it's easy to hook up.
- Undef is convereted to whatever the type's appropriate null value is.
- assert versus report_fatal_error: follow what other AsmPrinters do, and assert only on what should have been caught elsewhere.
llvm-svn: 246092
A corresponding clang change will make it so that clang can consume part
of an assembler token. The assembler treats '.' as an identifier
character while clang does not, so it's view of the token stream is a
little different.
llvm-svn: 246089
We removed access to the DataLayout on the TargetMachine and
deprecated the C API function LLVMGetTargetMachineData() in r243114.
However the way I tried to be backward compatible was broken: I
changed the wrapper of the TargetMachine to be a structure that
includes the DataLayout as well. However the TargetMachine is also
wrapped by the ExecutionEngine, in the more classic way. A client
using the TargetMachine wrapped by the ExecutionEngine and trying
to get the DataLayout would break.
It seems tricky to solve the problem completely in the C API
implementation. This patch tries to address this backward
compatibility in a more lighter way in the C++ API. The C API is
restored in its original state and the removed C++ API is
reintroduced, but privately. The C API is friended to the
TargetMachine and should be the only consumer for this API.
Reviewers: ributzka
Differential Revision: http://reviews.llvm.org/D12263
From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 246082
There is no context where s_mov_b64 is emitted
and could potentially be moved to the VALU.
It is currently only emitted for materializing
immediates, which can't be dependent on vector sources.
The immediate splitting is already done when selecting
constants. I'm not sure what contexts if any the register
splitting would have been used before.
Also clean up using s_mov_b64 in place of v_mov_b64_pseudo,
although this isn't required and just skips the extra step
of eliminating the copy from the SReg_64.
llvm-svn: 246080
When splitting 64-bit operations, create the correct
VALU instructions immediately.
This was splitting things like s_or_b64 into the two
s_or_b32s and then pushing the new instructions
onto the worklist. There's no reason we need
to do this intermediate step.
llvm-svn: 246077
This was causing problems when some functions use a GuardReg and some
don't as can happen when mixing SelectionDAG and FastISel generated
functions.
llvm-svn: 246075
This takes the existing static function hasLiveCondCodeDef and makes it a member function of the X86InstrInfo class. This is a useful utility function that an upcoming change would like to use. NFC.
Patch by: Kevin B. Smith
Differential Revision: http://reviews.llvm.org/D12371
llvm-svn: 246073
Summary:
On Mac OS X overwriting `/usr/lib/libc++.dylib` can cause your computer to fail to boot. This patch tries to make it harder to do that accidentally.
If `CMAKE_SYSTEM_NAME` is `Darwin` and `CMAKE_INSTALL_PREFIX` is `/usr` don't generate installation rules unless the user explicitly provides `LIBCXX_OVERRIDE_DARWIN_INSTALL=ON`. Note that `CMAKE_INSTALL_PREFIX` is always absolute so we don't need to worry about things like `/usr/../usr`.
Reviewers: mclow.lists, beanz, jroelofs
Subscribers: cfe-commits
Differential Revision: http://reviews.llvm.org/D12209
llvm-svn: 246070
Summary: Detecting `-Wno-<warning>` flags can be tricky with GCC (See https://gcc.gnu.org/onlinedocs/gcc/Warning-Options.html). This patch adds a special `addWarningFlagIfSupported(<flag>)` method to the test compiler object that can be used to add warning flags. The goal of this patch is to help get the test suite running with more warnings.
Reviewers: danalbert, jroelofs
Subscribers: cfe-commits
Differential Revision: http://reviews.llvm.org/D11333
llvm-svn: 246069
Summary:
This patch rewrites the C++03 `__invoke` and related meta-programming. There are a number of major changes.
`__invoke` in C++03 now has a fallback overload for when the invoke expression is ill-formed (similar to C++11). This means that the `__invoke_return` traits will return `__nat` when `__invoke(...)` is ill formed. This would previously cause a compile error.
Bullets 1-4 of `__invoke` have been rewritten. In the old version `__invoke` had 32 overloads for bullets 1 and 2,
one for each possible cv-qualified function signature with arities 0-3. 64 overloads would be needed to support member functions
with varargs. Currently these overloads were fundamentally broken. An example overload looked like:
```
template <class Rp, class Tp, class T1, class A0>
Rp __invoke(Rp (Tp::*pm)(A0) const, T1&, A0&)
```
Because `A0` appeared in two different deducible contexts it would have to deduce to be an exact match or the overload
would be rejected. This is made even worse because `A0` appears without a reference qualifier in the member function signature
and with a reference qualifier as an `__invoke` parameter. This means that only member functions that took all
of their arguments by value could be matched.
One possible fix would be to make the second occurrence of `A0` appear in a non-deducible context. This way
any type convertible to `A0` could be passed as the first parameter. The benefit of this approach is that the
signature of the member function enforces the arity and types taken by the `__invoke` signature it generates. However
nothing in the `INVOKE` specification requires this behavior.
My solution is to use a `__invoke_enable_if<PM_Type, Tp>` metafunction to selectively enable the `__invoke` overloads for bullets 1, 2, 3 and 4. It uses `__member_function_traits` to inspect and extract the return type and class type of the pointer to member. Using `__member_function_traits` to inspect `PM_Type` also allows us to reduce the number of `__invoke` overloads from 32 to 8 and add
varargs support at the same time.
Because `__invoke_enable_if` knows the exact return type of `__invoke` for bullets 1-4 we no longer need to use `decltype(__invoke(...))` to
compute the return type in the `__invoke_return*` traits. This will reduce the problems caused by `#define decltype(X) __typeof__(X)` in C++03.
Tests for this change have already been committed. All tests in `test/std/utilities/function.objects` now pass in C++03, previously there were 20 failures.
Reviewers: K-ballo, howard.hinnant, mclow.lists
Subscribers: cfe-commits
Differential Revision: http://reviews.llvm.org/D11553
llvm-svn: 246068