except that the result may not be a constant. Switch jump threading to
use it so that it gets things like (X & 0) -> 0, which occur when phi preds
are deleted and the remaining phi pred was a zero.
llvm-svn: 86637
This patch forbids implicit conversion of DenseMap::const_iterator to
DenseMap::iterator which was possible because DenseMapIterator inherited
(publicly) from DenseMapConstIterator. Conversion the other way around is now
allowed as one may expect.
The template DenseMapConstIterator is removed and the template parameter
IsConst which specifies whether the iterator is constant is added to
DenseMapIterator.
Actually IsConst parameter is not necessary since the constness can be
determined from KeyT but this is not relevant to the fix and can be addressed
later.
Patch by Victor Zverovich!
llvm-svn: 86636
takes decimated instructions and applies identities to them. This
is pretty minimal at this point, but I plan to pull some instcombine
logic out into these and similar routines.
llvm-svn: 86613
was generated. This caused code like this:
## The asm code for the function
.section __TEXT,__const
.align 2
lJTI11_0:
LJTI11_0:
.long LBB11_16
.long LBB11_4
.long LBB11_5
.long LBB11_6
.long LBB11_7
.long LBB11_8
.long LBB11_9
.long LBB11_10
.long LBB11_11
.long LBB11_12
.long LBB11_13
.long LBB11_14
Leh_func_end11: ## <---now in the wrong section!
The `Leh_func_end11' would then end up in the wrong section, causing the
resulting EH frame information to be wrong:
__ZL11CheckRightsjPKcbRbRP6NSData.eh:
.set Lset500eh,Leh_frame_end11-Leh_frame_begin11
.long Lset500eh ; Length of Frame Information Entry
Leh_frame_begin11:
.long Leh_frame_begin11-Leh_frame_common
.long Leh_func_begin11-.
.set Lset501eh,Leh_func_end11-Leh_func_begin11
.long Lset501eh ; FDE address range
`Lset501eh' is now something huge instead of the real value.
The X86 back-end generates the jump table after the EH information is
emitted. Do the same here.
llvm-svn: 86588
the loop. This is needed because with indirectbr it may not be possible
for LoopSimplify to guarantee that all loop exit predecessors are
inside the loop. This fixes PR5437.
LCCSA no longer actually requires LoopSimplify form, but for now it
must still have the dependency because the PassManager doesn't know
how to schedule LoopSimplify otherwise.
llvm-svn: 86569
here:
1) We need to avoid processing sigma nodes as phi nodes for constraint generation.
2) We need to generate constraints for comparisons against constants properly.
This includes our first working ABCD test!
llvm-svn: 86498
graphs being produced. The cause was that we were incorrectly marking sigma instructions as
processed after handling the sigma-specific constraints for them, potentially neglecting to
process them as normal instructions as well.
Unfortunately, the testcase that inspired this still doesn't work because of a bug in the solver,
which is next on the list to debug.
llvm-svn: 86486
when both the source and dest are illegal types, since it would cause
the phi to grow (for example, we shouldn't transform test14b's phi to
a phi on i320). This fixes an infinite loop on i686 bootstrap with
phi slicing turned on, so turn it back on.
llvm-svn: 86483
not turn a PHI in a legal type into a PHI of an illegal type, and
add a new optimization that breaks up insane integer PHI nodes into
small pieces (PR3451).
llvm-svn: 86443
1. rename the movhp patfrag to movlhps, since thats what it actually matches
2. eliminate the bogus movhps load and store patterns, they were incorrect. The load transforms are already handled (correctly) by shufps/unpack.
3. revert a recent test change to its correct form.
llvm-svn: 86415
(eliminating some extends) if the new type of the
computation is legal or if both the source and dest
are illegal. This prevents instcombine from changing big
chains of computation into i64 on 32-bit targets for
example.
llvm-svn: 86398
datatypes on a given CPU. This is intended to allow instcombine and other
transformations to avoid converting big sequences of operations to an
inconvenient width, and will help clean up after SRoA. See also "Adding
legal integer sizes to TargetData" on Feb 1, 2009 on llvmdev, and PR3451.
Comments welcome.
llvm-svn: 86370
MachineRelocations, "stub" always refers to a far-call stub or a
load-a-faraway-global stub, so this patch adds "Far" to the term. (Other stubs
are used for lazy compilation and dlsym address replacement.) The variable was
also inconsistent between the positive and negative sense, and the positive
sense ("NeedStub") was more demanding than is accurate (since a nearby-enough
function can be called directly even if the platform often requires a stub).
Since the negative sense causes double-negatives, I switched to
"MayNeedFarStub" globally.
llvm-svn: 86363
except it doesn't care if the definitions' virtual registers differ. This is
used by machine LICM and other MI passes to perform CSE.
- Teach Thumb2InstrInfo::isIdentical() to check two t2LDRpci_pic are identical.
Since pc relative constantpool entries are always different, this requires it
it check if the values can actually the same.
llvm-svn: 86328
A non-identity copy cannot be coalesced when the phi join destination register
is live at the copy site.
Also verify the condition that the PHI join source register is only used in
the PHI join. Otherwise the coalescing is invalid.
llvm-svn: 86322
was wrong and too aggressive in the sense that DPSoRegFrm includes both constant
shifts (with Inst{4} = 0) and register controlled shifts (with Inst{4} = 1 and
Inst{7} = 0). The 'rr' fragment of the multiclass definitions actually means
register/register with no shift, see A8-11.
llvm-svn: 86319
Here is the original commit message:
This commit updates malloc optimizations to operate on malloc calls that have constant int size arguments.
Update CreateMalloc so that its callers specify the size to allocate:
MallocInst-autoupgrade users use non-TargetData-computed allocation sizes.
Optimization uses use TargetData to compute the allocation size.
Now that malloc calls can have constant sizes, update isArrayMallocHelper() to use TargetData to determine the size of the malloced type and the size of malloced arrays.
Extend getMallocType() to support malloc calls that have non-bitcast uses.
Update OptimizeGlobalAddressOfMalloc() to optimize malloc calls that have non-bitcast uses. The bitcast use of a malloc call has to be treated specially here because the uses of the bitcast need to be replaced and the bitcast needs to be erased (just like the malloc call) for OptimizeGlobalAddressOfMalloc() to work correctly.
Update PerformHeapAllocSRoA() to optimize malloc calls that have non-bitcast uses. The bitcast use of the malloc is not handled specially here because ReplaceUsesOfMallocWithGlobal replaces through the bitcast use.
Update OptimizeOnceStoredGlobal() to not care about the malloc calls' bitcast use.
Update all globalopt malloc tests to not rely on autoupgraded-MallocInsts, but instead use explicit malloc calls with correct allocation sizes.
llvm-svn: 86311
of going through the global TheJIT variable. This makes it easier to use
features of JITEmitter that aren't in JITCodeEmitter for fixing PR5201.
llvm-svn: 86305
load of a GV from constantpool and then add pc. It allows the code sequence to
be rematerializable so it would be hoisted by machine licm.
- Add a late pass to break these pseudo instructions into a number of real
instructions. Also move the code in Thumb2 IT pass that breaks up t2MOVi32imm
to this pass. This is done before post regalloc scheduling to allow the
scheduler to proper schedule these instructions. It also allow them to be
if-converted and shrunk by later passes.
llvm-svn: 86304
will not accept negative values for these. LLVM's default operand printing
sign extends values, so that valid unsigned values appear as negative
immediates. Print all VMOV immediate operands as hex values to resolve this.
Radar 7372576.
llvm-svn: 86301
predicates. This allows us to jump thread things like:
_ZN12StringSwitchI5ColorE4CaseILj7EEERS1_RAT__KcRKS0_.exit119:
%tmp1.i24166 = phi i8 [ 1, %bb5.i117 ], [ %tmp1.i24165, %_Z....exit ], [ %tmp1.i24165, %bb4.i114 ]
%toBoolnot.i87 = icmp eq i8 %tmp1.i24166, 0 ; <i1> [#uses=1]
%tmp4.i90 = icmp eq i32 %tmp2.i, 6 ; <i1> [#uses=1]
%or.cond173 = and i1 %toBoolnot.i87, %tmp4.i90 ; <i1> [#uses=1]
br i1 %or.cond173, label %bb4.i96, label %_ZN12...
Where it is "obvious" that when coming from %bb5.i117 that the 'and' is always
false. This triggers a surprisingly high number of times in the testsuite,
and gets us closer to generating good code for doug's strswitch testcase.
This also make a bunch of other code in jump threading redundant, I'll rip
out in the next patch. This survived an enable-checking llvm-gcc bootstrap.
llvm-svn: 86264
unsplittable critical edges, which means the introduction of
loops which cannot be transformed to LoopSimplify form. Fix
LoopSimplify to avoid transforming such loops into invalid
code.
llvm-svn: 86176