At the moment we expect rotates to have the form:
(or (shl X, Y), (shr X, Z))
where Y == bitsize(X) - Z or Z == bitsize(X) - Y. This form means that
the (or ...) is undefined for Y == 0 or Z == 0. This undefinedness can
be avoided by using Y == (C * bitsize(X) - Z) & (bitsize(X) - 1) or
Z == (C * bitsize(X) - Y) & (bitsize(X) - 1) for any integer C
(including 0, the most natural choice).
llvm-svn: 198861
InstCombine converts (sub 32, (add X, C)) into (sub 32-C, X),
so a rotate left of a 32-bit Y by X+C could appear as either:
(or (shl Y, (add X, C)), (shr Y, (sub 32, (add X, C))))
without InstCombine or:
(or (shl Y, (add X, C)), (shr Y, (sub 32-C, X)))
with it.
We already matched the first form. This patch handles the second too.
llvm-svn: 198860
operand into the Value interface just like the core print method is.
That gives a more conistent organization to the IR printing interfaces
-- they are all attached to the IR objects themselves. Also, update all
the users.
This removes the 'Writer.h' header which contained only a single function
declaration.
llvm-svn: 198836
are part of the core IR library in order to support dumping and other
basic functionality.
Rename the 'Assembly' include directory to 'AsmParser' to match the
library name and the only functionality left their -- printing has been
in the core IR library for quite some time.
Update all of the #includes to match.
All of this started because I wanted to have the layering in good shape
before I started adding support for printing LLVM IR using the new pass
infrastructure, and commandline support for the new pass infrastructure.
llvm-svn: 198688
There is a wrong assumption that the vector element type and the
type of each ConstantSDNode in the build_vector were the same.
However, when promoting the integer operand of a legally typed
build_vector, the operand type and the vector element type do not
need to be the same
(See method 'DAGTypeLegalizer::PromoteIntOp_BUILD_VECTOR' in
LegalizeIntegerTypes.cpp).
in AArch64 backend, the following dag sequence:
C0: i1 = Constant<0>
C1: i1 = Constant<-1>
V: v8i1 = BUILD_VECTOR C1, C1, C0, C0, C0, C0, C0, C0
is type-legalized into:
NewC0: i32 = Constant<0>
NewC1: i32 = Constant<1>
V: v8i8 = BUILD_VECTOR NewC1, NewC1, NewC0, NewC0, NewC0, NewC0, NewC0, NewC0
Forcing a getZeroExtend to VTBits to ensure that the new constant
is correctly.
llvm-svn: 198582
This moves the check up into the parent class so that all targets can use it
without having to copy (and keep in sync) the same error message.
llvm-svn: 198579
For AArch64 backend, if DAGCombiner see "sext(setcc)", it will
combine them together to a single setcc with extended value type.
Then if it see "zext(setcc)", it assumes setcc is Vxi1, and try to
create "(and (vsetcc), (1, 1, ...)". While setcc isn't Vxi1,
DAGcombiner will create wrong node and get wrong code emitted.
llvm-svn: 198190
ConstantSDNodes (or UNDEFs) into a simple BUILD_VECTOR.
For example, given the following sequence of dag nodes:
i32 C = Constant<1>
v4i32 V = BUILD_VECTOR C, C, C, C
v4i32 Result = SIGN_EXTEND_INREG V, ValueType:v4i1
The SIGN_EXTEND_INREG node can be folded into a build_vector since
the vector in input is a BUILD_VECTOR of constants.
The optimized sequence is:
i32 C = Constant<-1>
v4i32 Result = BUILD_VECTOR C, C, C, C
llvm-svn: 198084
This changes the MachineFrameInfo API to use the new SSPLayoutKind information
produced by the StackProtector pass (instead of a boolean flag) and updates a
few pass dependencies (to preserve the SSP analysis).
The stack layout follows the same approach used prior to this change - i.e.,
only LargeArray stack objects will be placed near the canary and everything
else will be laid out normally. After this change, structures containing large
arrays will also be placed near the canary - a case previously missed by the
old implementation.
Out of tree targets will need to update their usage of
MachineFrameInfo::CreateStackObject to remove the MayNeedSP argument.
The next patch will implement the rules for sspstrong and sspreq. The end goal
is to support ssp-strong stack layout rules.
WIP.
Differential Revision: http://llvm-reviews.chandlerc.com/D2158
llvm-svn: 197653
This optional register liveness analysis pass can be enabled with either
-enable-stackmap-liveness, -enable-patchpoint-liveness, or both. The pass
traverses each basic block in a machine function. For each basic block the
instructions are processed in reversed order and if a patchpoint or stackmap
instruction is encountered the current live-out register set is encoded as a
register mask and attached to the instruction.
Later on during stackmap generation the live-out register mask is processed and
also emitted as part of the stackmap.
This information is optional and intended for optimization purposes only. This
will enable a client of the stackmap to reason about the registers it can use
and which registers need to be preserved.
Reviewed by Andy
llvm-svn: 197317
This reverts commit r197254.
This was an accidental merge of Juergen's patch. It will be checked in
shortly, but wasn't meant to go in quite yet.
Conflicts:
include/llvm/CodeGen/StackMaps.h
lib/CodeGen/StackMaps.cpp
test/CodeGen/X86/stackmap-liveness.ll
llvm-svn: 197260
DAGCombiner could fold (truncate (load)) -> smaller load if the original
load was the width of the truncation result or wider. This patch extends
it to handle cases where the original load was narrower (and so the
extension type stays the same).
llvm-svn: 197030
This re-lands commit r196876, which was reverted in r196879.
The tests have been fixed to pass on platforms with a stack alignment
larger than 4.
Update to clang side tests will land shortly.
llvm-svn: 196939
One unusual feature of the z architecture is that the result of a
previous load can be reused indefinitely for subsequent loads, even if
a cache-coherent store to that location is performed by another CPU.
A special serializing instruction must be used if you want to force
a load to be reattempted.
Since volatile loads are not supposed to be omitted in this way,
we should insert a serializing instruction before each such load.
The same goes for atomic loads.
The patch implements this at the IR->DAG boundary, in a similar way
to atomic fences. It is a no-op for targets other than SystemZ.
llvm-svn: 196905
For stack frames requiring realignment, three pointers may be needed:
- ebp to address incoming arguments
- esi (could be any callee-saved register) to address locals
- esp to address outgoing arguments
We would use esi unconditionally without verifying that it did not
conflict with inline assembly.
This change doesn't do the verification, it simply emits a fatal error
on functions that use stack realignment, dynamic SP adjustments, and
inline assembly.
Because stack realignment is common on Windows, we also no longer assume
that MS inline assembly clobbers esp. Instead, we analyze the inline
instructions for implicit definitions and check if esp is there. If so,
we require the use of a base pointer and consider it in the condition
above.
Mostly fixes PR16830, but we could try harder to find a non-conflicting
base pointer.
Reviewers: sunfish
Differential Revision: http://llvm-reviews.chandlerc.com/D1317
llvm-svn: 196876
This just extends the existing hack. It should be enough to get a reproducible bootstrap
on 32 bits.
I will open a bug to track getting a real fix for this.
llvm-svn: 196462
A Direct stack map location records the address of frame index. This
address is itself the value that the runtime requested. This differs
from IndirectMemRefOp locations, which refer to a stack locations from
which the requested values must be loaded. Direct locations can
directly communicate the address if an alloca, while IndirectMemRefOp
handle register spills.
For example:
entry:
%a = alloca i64...
llvm.experimental.stackmap(i32 <ID>, i32 <shadowBytes>, i64* %a)
Since both the alloca and stackmap intrinsic are in the entry block,
and the intrinsic takes the address of the alloca, the runtime can
assume that LLVM will not substitute alloca with any intervening
value. This must be verified by the runtime by checking that the stack
map's location is a Direct location type. The runtime can then
determine the alloca's relative location on the stack immediately after
compilation, or at any time thereafter. This differs from Register and
Indirect locations, because the runtime can only read the values in
those locations when execution reaches the instruction address of the
stack map.
llvm-svn: 195712
Summary:
Moved the requirement for SelectionDAG::getConstant() to return legally
typed nodes slightly earlier. There were two optional DAGCombine passes
that were missed out and were required to produce type-legal DAGs.
Simplified a code-path in tryFoldToZero() to use SelectionDAG::getConstant().
This provides support for both promoted and expanded vector types whereas the
previous code only supported promoted vector types.
Fixes a "Type for zero vector elements is not legal" assertion detected by
an llvm-stress generated test.
Reviewers: resistor
CC: llvm-commits
Differential Revision: http://llvm-reviews.chandlerc.com/D2251
llvm-svn: 195635
Improvements over r195317:
- Set/restore EnableFastISel flag instead of just running FastISel within
SelectAllBasicBlocks; the flag is checked in various places, and
FastISel won't run properly if those places don't do the right thing.
- Test looks for normal ISel versus FastISel behavior, and not
something more subtle that doesn't work everywhere.
Based on work by Andrea Di Biagio.
llvm-svn: 195491
The legalizer can now do this type of expansion for more
type combinations without loading and storing to and
from the stack.
NOTE: This is a candidate for the 3.4 branch.
llvm-svn: 195398
This patch is a rewrite of the original patch commited in r194542. Instead of
relying on the type legalizer to do the splitting for us, we now peform the
splitting ourselves in the DAG combiner. This is necessary for the case where
the vector mask is a legal type after promotion and still wouldn't require
splitting.
Patch by: Juergen Ributzka
NOTE: This is a candidate for the 3.4 branch.
llvm-svn: 195397
Summary:
LegalizeSetCCCondCode can now legalize SETEQ and SETNE by returning the inverse
condition and requesting that the caller invert the result of the condition.
The caller of LegalizeSetCCCondCode must handle the inverted CC, and they do
so as follows:
SETCC, BR_CC:
Invert the result of the SETCC with SelectionDAG::getNOT()
SELECT_CC:
Swap the true/false operands.
This is necessary for MSA which lacks an integer SETNE instruction.
Reviewers: resistor
CC: llvm-commits
Differential Revision: http://llvm-reviews.chandlerc.com/D2229
llvm-svn: 195355
It broke, at least, i686 target. It is reproducible with "llc -mtriple=i686-unknown".
FYI, it didn't appear to add either "-O0" or "-fast-isel".
llvm-svn: 195339