Scaling factors are not free on X86 because every "complex" addressing mode
breaks the related instruction into 2 allocations instead of 1.
<rdar://problem/16730541>
llvm-svn: 207301
a helper function. Also factor the other two places where we did the
same thing into the helper function. =] Much cleaner this way. NFC.
llvm-svn: 207300
right intrinsics.
A packed logical shift right with a shift count bigger than or equal to the
element size always produces a zero vector. In all other cases, it can be
safely replaced by a 'lshr' instruction.
llvm-svn: 207299
I'm a bit surprised that I have not implemented this yet. This is
definitely needed to handle real-world module definition files.
This patch contains a unit test for r207294.
llvm-svn: 207297
I'm fixing another bug in the parser, and I wanted to submit this
fix as a separate change as it's logically independent from the other.
I'll add a test for this shortly.
llvm-svn: 207294
Summary:
If we're doing a v4f32/v4i32 shuffle on x86 with SSE4.1, we can lower
certain shufflevectors to an insertps instruction:
When most of the shufflevector result's elements come from one vector (and
keep their index), and one element comes from another vector or a memory
operand.
Added tests for insertps optimizations on shufflevector.
Added support and tests for v4i32 vector optimization.
Reviewers: nadav
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D3475
llvm-svn: 207291
This reverts commit r207286. It causes an ICE on the
cmake-llvm-x86_64-linux buildbot [1]:
llvm/lib/Analysis/BlockFrequencyInfo.cpp: In lambda function:
llvm/lib/Analysis/BlockFrequencyInfo.cpp:182:1: internal compiler error: in get_expr_operands, at tree-ssa-operands.c:1035
[1]: http://bb.pgr.jp/builders/cmake-llvm-x86_64-linux/builds/12093/steps/build_llvm/logs/stdio
llvm-svn: 207287
Previously, irreducible backedges were ignored. With this commit,
irreducible SCCs are discovered on the fly, and modelled as loops with
multiple headers.
This approximation specifies the headers of irreducible sub-SCCs as its
entry blocks and all nodes that are targets of a backedge within it
(excluding backedges within true sub-loops). Block frequency
calculations act as if we insert a new block that intercepts all the
edges to the headers. All backedges and entries to the irreducible SCC
point to this imaginary block. This imaginary block has an edge (with
even probability) to each header block.
The result is now reasonable enough that I've added a number of
testcases for irreducible control flow. I've outlined in
`BlockFrequencyInfoImpl.h` ways to improve the approximation.
<rdar://problem/14292693>
llvm-svn: 207286
This also avoids the need for subtly side-effecting calls to manifest
strings in the string table at the point where items are added to the
accelerator tables.
llvm-svn: 207281
Prior to this patch, CGRecordLower assumed that virtual bases could not
be placed before the nvsize of an object. This isn't true in Itanium
mode, virtual bases are placed at dsize rather than vnsize and in the
case of zero sized non-virtual bases nvsize can be larger than dsize.
This patch fixes CGRecordLowering to avoid an assert and to clip
bitfields properly in this case. A test case is included.
llvm-svn: 207280
This adds support for an -mattr option to the gold plugin and to llvm-lto. This
allows the caller to specify details of the subtarget architecture, like +aes,
or +ssse3 on x86. Note that this requires a change to the include/llvm-c/lto.h
interface: it adds a function lto_codegen_set_attr and it increments the
version of the interface.
llvm-svn: 207279
Pulls out some more code from some of the rather monolithic DWARF
classes. Unlike the address table, the string table won't move up into
DwarfDebug - each DWARF file has its own string table (but there can be
only one address table).
llvm-svn: 207277
The __yield intrinsic generates a hint instruction to indicate that the thread
is not performing any useful operations at the moment. This is for
compatibility with MSVC, although, the intrinsic is also part of the ACLE, and
is enabled globally as a result.
llvm-svn: 207275
Adds try/except blocks around clean-up code. Prevents a race between gdb
remote kill command reception by llgs (which leads llgs to shut down)
and the pexpect server kill (which can fail if the kill command handling
completes first). Warnings are emitted on the logger for any clean-up
code that fails.
llvm-svn: 207273
optional path prior to the file base name.
On Linux x86_64 (Ubuntu 12.04) I am sometimes getting a full path
on the stderr.txt. This changes the test for target.error-path and
target.output-path settings to ignore any optional directory before
the expected file base name.
llvm-svn: 207272
Consider this use from the new testcase:
LSR Use: Kind=ICmpZero, Offsets={0}, widest fixup type: i32
reg({1000,+,-1}<nw><%for.body>)
-3003 + reg({3,+,3}<nw><%for.body>)
-1001 + reg({1,+,1}<nuw><nsw><%for.body>)
-1000 + reg({0,+,1}<nw><%for.body>)
-3000 + reg({0,+,3}<nuw><%for.body>)
reg({-1000,+,1}<nw><%for.body>)
reg({-3000,+,3}<nsw><%for.body>)
This is the last use we consider for a solution in SolveRecurse, so CurRegs is
a large set. (CurRegs is the set of registers that are needed by the
previously visited uses in the in-progress solution.)
ReqRegs is {
{3,+,3}<nw><%for.body>,
{1,+,1}<nuw><nsw><%for.body>
}
This is the intersection of the regs used by any of the formulas for the
current use and CurRegs.
Now, the code requires a formula to contain *all* these regs (the comment is
simply wrong), otherwise the formula is immediately disqualified. Obviously,
no formula for this use contains two regs so they will all get disqualified.
The fix modifies the check to allow the formula in this case. The idea is
that neither of these formulae is introducing any new registers which is the
point of this early pruning as far as I understand.
In terms of set arithmetic, we now allow formulas whose used regs are a subset
of the required regs not just the other way around.
There are few more loops in the test-suite that are now successfully LSRed. I
have benchmarked those and found very minimal change.
Fixes <rdar://problem/13965777>
llvm-svn: 207271
Actually use the `reference` typedef, and remove the private
redefinition of `pointer` since it has no users.
Using `reference` exposes a problem with r207257, which specified the
wrong `value_type` to `iterator_facade_base` (fixed that too).
llvm-svn: 207270
buildbot - do not insert debug intrinsics before phi nodes.
Debug info for optimized code: Support variables that are on the stack and
described by DBG_VALUEs during their lifetime.
Previously, when a variable was at a FrameIndex for any part of its
lifetime, this would shadow all other DBG_VALUEs and only a single
fbreg location would be emitted, which in fact is only valid for a small
range and not the entire lexical scope of the variable. The included
dbg-value-const-byref testcase demonstrates this.
This patch fixes this by
Local
- emitting dbg.value intrinsics for allocas that are passed by reference
- dropping all dbg.declares (they are now fully lowered to dbg.values)
SelectionDAG
- renamed constructors for SDDbgValue for better readability.
- fix UserValue::match() to handle indirect values correctly
- not inserting an MMI table entries for dbg.values that describe allocas.
- lowering dbg.values that describe allocas into *indirect* DBG_VALUEs.
CodeGenPrepare
- leaving dbg.values for an alloca were they are (see comment)
Other
- regenerated/updated instcombine.ll testcase and included source
rdar://problem/16679879
http://reviews.llvm.org/D3374
llvm-svn: 207269
We never aka vector types because our attributed syntax for it is less
comprehensible than the typedefs. This leaves the user in the dark when
the typedef isn't named that well.
Example:
v2s v; v4f w;
w = v;
The naming in this cases isn't even that bad, but the error we give is
useless without looking up the actual typedefs.
t.c:6:5: error: assigning to 'v4f' from incompatible type 'v2s'
Now:
t.c:6:5: error: assigning to 'v4f' (vector of 4 'float' values) from
incompatible type 'v2s' (vector of 2 'int' values)
We do this for all diagnostics that print a vector type.
llvm-svn: 207267
Patch by Kostya Serebryany.
unique_ptr would be nice, but it's a bit too much work for an area I'm
not familiar with, nor invested in, unfortunately.
llvm-svn: 207265
This should reduce the chance of memory leaks like those fixed in
r207240.
There's still some unclear ownership of DIEs happening in DwarfDebug.
Pushing unique_ptr and references through more APIs should help expose
the cases where ownership is a bit fuzzy.
llvm-svn: 207263
Since this doesn't return ownership (the DIE has been added to the
specified parent already) nor return null, just return by reference.
llvm-svn: 207259
Use the fancy new `iterator_facade_base` to add
`scc_iterator::operator->()`. Remove other definitions where
`iterator_facade_base` does the right thing.
<rdar://problem/14292693>
llvm-svn: 207257
This'll make changing to unique_ptr ownership of DIEs easier since the
usages will now have '*' on them making them textually compatible
between unique_ptr and raw pointer.
llvm-svn: 207253
It's fishy to be changing the `std::vector<>` owned by the iterator, and
no one actual does it, so I'm going to remove the ability in a
subsequent commit. First, update the users.
<rdar://problem/14292693>
llvm-svn: 207252