Currently, in MachineBlockPlacement pass the loop is rotated to let the best exit to be the last BB in the loop chain, to maximize the fall-through from the loop to outside. With profile data, we can determine the cost in terms of missed fall through opportunities when rotating a loop chain and select the best rotation. Basically, there are three kinds of cost to consider for each rotation:
1. The possibly missed fall through edge (if it exists) from BB out of the loop to the loop header.
2. The possibly missed fall through edges (if they exist) from the loop exits to BB out of the loop.
3. The missed fall through edge (if it exists) from the last BB to the first BB in the loop chain.
Therefore, the cost for a given rotation is the sum of costs listed above. We select the best rotation with the smallest cost. This is only for PGO mode when we have more precise edge frequencies.
Differential revision: http://reviews.llvm.org/D10717
llvm-svn: 250754
A REPL takes over the command line and typically treats input as source code.
REPLs can also do code completion. The REPL class allows its subclasses to
implement the language-specific functionality without having to know about the
IOHandler-specific internals.
Also added a PluginManager-based way of getting to a REPL given a language and
a target.
Also brought in some utility code and expression options that are useful for
REPLs, such as line offsets for expressions, ANSI terminal coloring of errors,
and a few IOHandler convenience functions.
llvm-svn: 250753
In r248047, I attempted to fix a build breakage introduced by using
llvm's regex support from lldb-mi. However, my approach was flawed
when LLVM and lldb are dynamically linked, in which case two copies
of LLVMSupport would end up in memory, causing crashes on lldb start up.
Instead, use LINK_COMPONENTS to make sure lldb-mi has access to the
LLVMSupport symbols without causing duplication in the dynamic library
case.
llvm-svn: 250751
As usual, this is a polymorphic hierarchy without polymorphic ownership,
so simply make the dtor protected non-virtual, protected default copy
ctor/assign, and make derived classes final. The derived classes will
pick up correct default public copy ops (and dtor) implicitly.
(wish I could add -Wdeprecated to the build, but last time I tried it
triggered on some system headers I still need to look into/figure out)
llvm-svn: 250747
Allow LLVM to optimize the sequence like the following:
%inc = add nsw i32 %i, 1
%cmp = icmp slt %n, %inc
into:
%cmp = icmp sle i32 %n, %i
The case is not handled previously due to the complexity of compuation of %n.
Hence, LLVM cannot swap operands of icmp accordingly.
llvm-svn: 250746
Besides the usual, I finally added an overload to
`BasicBlock::splitBasicBlock()` that accepts an `Instruction*` instead
of `BasicBlock::iterator`. Someone can go back and remove this overload
later (after updating the callers I'm going to skip going forward), but
the most common call seems to be
`BB->splitBasicBlock(BB->getTerminator(), ...)` and I'm not sure it's
better to add `->getIterator()` to every one than have the overload.
It's pretty hard to get the usage wrong.
llvm-svn: 250745
This was originally checked in at r250527, but reverted at r250570 because of PR25222.
There were at least 2 problems:
1. The cost check was checking for an instruction with an exact cost of TCC_Expensive;
that should have been >=.
2. The cause of the clang stage 1 failures was illegally sinking 'call' instructions;
we can't sink instructions that may have side effects / are not safe to execute speculatively.
Fixed those conditions in sinkSelectOperand() and added test cases.
Original commit message:
This is a follow-up to the discussion in D12882.
Ideally, we would like SimplifyCFG to be able to form select instructions even when the operands
are expensive (as defined by the TTI cost model) because that may expose further optimizations.
However, we would then like a later pass like CodeGenPrepare to undo that transformation if the
target would likely benefit from not speculatively executing an expensive op (this patch).
Once we have this safety mechanism in place, we can adjust SimplifyCFG to restore its
select-formation behavior that changed with r248439.
Differential Revision: http://reviews.llvm.org/D13297
llvm-svn: 250743
The two names are similar enough that they might lead to confusion.
The output of readobj clarifies but I missed it when I originally
committed this. Found while linking FreeBSD userland with lld.
llvm-svn: 250739
This allows open source MacOSX clients to not have to build debugserver and the current LLDB can find debugserver inside the selected Xcode.app on your system.
<rdar://problem/23167253>
llvm-svn: 250735
Given the name, it is natural for this function to compute the full target.
This will simplify SHF_MERGE handling by allowing getLocalRelTarget to
centralize the addend logic.
llvm-svn: 250731
Without this fix, cancellation requests in one parallel region cause
cancellation of the second region even though the second one was
not intended to be cancelled.
llvm-svn: 250727
Convert two halfword loads into a single 32-bit word load with bitfield extract
instructions. For example :
ldrh w0, [x2]
ldrh w1, [x2, #2]
becomes
ldr w0, [x2]
ubfx w1, w0, #16, #16
and w0, w0, #ffff
llvm-svn: 250719
- Isolate the check for the existence of a stack frame into hasFP.
- Implement getFrameIndexReference for DWARF address computation.
- Use getFrameIndexReference for offset computation in eliminateFrameIndex.
- Preserve debug information for dynamically allocated stack objects.
- Prefer FP to access local objects at -O0.
- Add experimental code to skip allocframe when not strictly necessary
(disabled by default).
llvm-svn: 250718
Emit the CFI instructions after all code transformation have been done.
This will avoid any interference between CFI instructions and packetization.
llvm-svn: 250714
memory, rather than representing the stubs in IR. Update the CompileOnDemand
layer to use this functionality.
Directly emitting stubs is much cheaper than building them in IR and codegen'ing
them (see below). It also plays well with remote JITing - stubs can be emitted
directly in the target process, rather than having to send them over the wire.
The downsides are:
(1) Care must be taken when resolving symbols, as stub symbols are held in a
separate symbol table. This is only a problem for layer writers and other
people using this API directly. The CompileOnDemand layer hides this detail.
(2) Aliases of function stubs can't be symbolic any more (since there's no
symbol definition in IR), but must be converted into a constant pointer
expression. This means that modules containing aliases of stubs cannot be
cached. In practice this is unlikely to be a problem: There's no benefit to
caching such a module anyway.
On balance I think the extra performance is more than worth the trade-offs: In a
simple stress test with 10000 dummy functions requiring stubs and a single
executed "hello world" main function, directly emitting stubs reduced user time
for JITing / executing by over 90% (1.5s for IR stubs vs 0.1s for direct
emission).
llvm-svn: 250712
The reason of collecting all undefines in vector is that during reading files we already need to have Symtab created. Or like was done in that patch - to put undefines from scripts somewhere to delay Symtab.addUndefinedOpt() call.
Differential Revision: http://reviews.llvm.org/D13870
llvm-svn: 250711
Even though these are under examples/, they actually get loaded
when LLDB starts up during initialization of ScriptInterpreterPython.
There's obviously some kind of layering issue here (and comments
in the code even point to that as well), but for now just make them
py3 compatible.
llvm-svn: 250710
Newer versions of CMake include a "smarter" FindLibxml2 package.
In theory this is a good thing, but on Windows it's now smart
enough to find the version that comes with Gnuwin32, which doesn't
appear to be a valid libxml2 distribution. Or at the very least,
LLDB currently uses some header files from libxml2 that are not
part of this distribution.
Nobody on Windows is using any of this functionality right now
anyway, so just disable it.
llvm-svn: 250709
warnings similar to the following:
runtime/src/kmp_global.c:117:35: warning: implicit conversion from
'unsigned long' to 'int' changes value from 18446744073709551615 to -1
[-Wconstant-conversion]
int __kmp_sys_max_nth = KMP_MAX_NTH;
~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~
runtime/src/kmp.h:849:34: note: expanded from macro 'KMP_MAX_NTH'
# define KMP_MAX_NTH PTHREAD_THREADS_MAX
^~~~~~~~~~~~~~~~~~~
Clamp KMP_MAX_NTH to INT_MAX to avoid these warnings. Also use INT_MAX
whenever PTHREAD_THREADS_MAX is not defined at all.
Differential Revision: http://reviews.llvm.org/D13827
llvm-svn: 250708
This makes the format tests look more like most other FileCheck tests in clang.
The multiple-inputs tests still use temp files, to make sure that the file
input code in clang-format stays tested.
Stop stripping out the comment lines in style-on-command-line.cpp as they don't
get in the way and it makes the test simpler. Also remove 2>&1s on the tests in
that file that don't need it.
http://reviews.llvm.org/D13852
llvm-svn: 250706