%RAX<def> = ...
%RAX<def> = SUBREG_TO_REG 0, %EAX:3<kill>, 3
The first def is defining RAX, not EAX so the top bits were not zero-extended.
llvm-svn: 67511
- Make type declarations match the struct/class keyword of the definition.
- Move AddSignalHandler into the namespace where it belongs.
- Correctly call functions from template base.
- Some other small changes.
With this patch, LLVM and Clang should build properly and with far less noise under VS2008.
llvm-svn: 67347
size by the array amount as an i32 value instead of promoting from
i32 to i64 then doing the multiply. Not doing this broke wrap-around
assumptions that the optimizers (validly) made. The ultimate real
fix for this is to introduce i64 version of alloca and remove mallocinst.
This fixes PR3829
llvm-svn: 67093
vector shuffle mask. Forced the mask to be built using i32. Note: this will
be irrelevant once vector_shuffle no longer takes a build vector for the
shuffle mask.
llvm-svn: 67076
U test/CodeGen/X86/2009-03-13-PHIElimBug.ll
D test/CodeGen/X86/2009-03-16-PHIElimInLPad.ll
U lib/CodeGen/PHIElimination.cpp
r67049 was causing this failure:
Running /Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvm.src/test/CodeGen/X86/dg.exp ...
FAIL: /Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvm.src/test/CodeGen/X86/2009-03-13-PHIElimBug.ll for PR3784
Failed with exit(1) at line 1
while running: llvm-as < /Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvm.src/test/CodeGen/X86/2009-03-13-PHIElimBug.ll | llc -march=x86 | /usr/bin/grep -A 2 {call f} | /usr/bin/grep movl
child process exited abnormally
llvm-svn: 67051
how invokes are set up. The fix could be disturbed by
register copies coming after the EH_LABEL, and also didn't
behave quite right when it was the invoke result that
was used in a phi node. Also (see new testcase) fix
another phi elimination bug while there: register copies
in the landing pad need to come after the EH_LABEL, because
that's where execution branches to when unwinding. If they
come before the EH_LABEL then they will never be executed...
Also tweak the original testcase so it doesn't use a no-longer
existing counter.
The accumulated phi elimination changes fix two of seven Ada
testsuite failures that turned up after landing pad critical
edge splitting was turned off. So there's probably more to come.
llvm-svn: 67049
if FPConstant is legal because if the FPConstant doesn't need to be stored
in a constant pool, the transformation is unlikely to be profitable.
llvm-svn: 66994
ptrtoint and inttoptr in X86FastISel. These casts aren't always
handled in the generic FastISel code because X86 sometimes needs
custom code to do truncation and zero-extension.
llvm-svn: 66988
by inserting explicit zero extensions where necessary. Included
is a testcase where SelectionDAG produces a virtual register
holding an i1 value which FastISel previously mistakenly assumed
to be zero-extended.
llvm-svn: 66941
1. ConstantPoolSDNode alignment field is log2 value of the alignment requirement. This is not consistent with other SDNode variants.
2. MachineConstantPool alignment field is also a log2 value.
3. However, some places are creating ConstantPoolSDNode with alignment value rather than log2 values. This creates entries with artificially large alignments, e.g. 256 for SSE vector values.
4. Constant pool entry offsets are computed when they are created. However, asm printer group them by sections. That means the offsets are no longer valid. However, asm printer uses them to determine size of padding between entries.
5. Asm printer uses expensive data structure multimap to track constant pool entries by sections.
6. Asm printer iterate over SmallPtrSet when it's emitting constant pool entries. This is non-deterministic.
Solutions:
1. ConstantPoolSDNode alignment field is changed to keep non-log2 value.
2. MachineConstantPool alignment field is also changed to keep non-log2 value.
3. Functions that create ConstantPool nodes are passing in non-log2 alignments.
4. MachineConstantPoolEntry no longer keeps an offset field. It's replaced with an alignment field. Offsets are not computed when constant pool entries are created. They are computed on the fly in asm printer and JIT.
5. Asm printer uses cheaper data structure to group constant pool entries.
6. Asm printer compute entry offsets after grouping is done.
7. Change JIT code to compute entry offsets on the fly.
llvm-svn: 66875
related transformations out of target-specific dag combine into the
ARM backend. These were added by Evan in r37685 with no testcases
and only seems to help ARM (e.g. test/CodeGen/ARM/select_xform.ll).
Add some simple X86-specific (for now) DAG combines that turn things
like cond ? 8 : 0 -> (zext(cond) << 3). This happens frequently
with the recently added cp constant select optimization, but is a
very general xform. For example, we now compile the second example
in const-select.ll to:
_test:
movsd LCPI2_0, %xmm0
ucomisd 8(%esp), %xmm0
seta %al
movzbl %al, %eax
movl 4(%esp), %ecx
movsbl (%ecx,%eax,4), %eax
ret
instead of:
_test:
movl 4(%esp), %eax
leal 4(%eax), %ecx
movsd LCPI2_0, %xmm0
ucomisd 8(%esp), %xmm0
cmovbe %eax, %ecx
movsbl (%ecx), %eax
ret
This passes multisource and dejagnu.
llvm-svn: 66779
alignment of the generated constant pool entry to the
desired alignment of a type. If we don't do this, we end up
trying to do movsd from 4-byte alignment memory. This fixes
450.soplex and 456.hmmer.
llvm-svn: 66641
1. Use the same value# to represent unknown values being merged into sub-registers.
2. When coalescer commute an instruction and the destination is a physical register, update its sub-registers by merging in the extended ranges.
llvm-svn: 66610
the untimed version of getOrCreateSourceID. getOrCreateSourceID calls
GetOrCreateSourceID, of course.
- Move some methods into the "private" section. Constify at least one method.
- General clean-ups.
llvm-svn: 66582
scheduled in multiple regions, liveness data used by the
anti-dependence breaker is carried from one region to the next, however
the information reflects the state of the instructions before scheduling.
After scheduling, there may be new live range overlaps. Handle this by
pessimizing the liveness data carried between regions to the point where
it will be conservatively correct now matter how the earlier region is
scheduled. This fixes a miscompilation in 176.gcc with the post-RA
scheduler enabled.
llvm-svn: 66558
whether a global is dead or not. This should fix PR3749 - linker adds
spurious use to appending globals. I can't reasonably add a testcase
for this, because the bc writer/reader strip dead constant users.
llvm-svn: 66404
and extern_weak_odr. These are the same as the non-odr versions,
except that they indicate that the global will only be overridden
by an *equivalent* global. In C, a function with weak linkage can
be overridden by a function which behaves completely differently.
This means that IP passes have to skip weak functions, since any
deductions made from the function definition might be wrong, since
the definition could be replaced by something completely different
at link time. This is not allowed in C++, thanks to the ODR
(One-Definition-Rule): if a function is replaced by another at
link-time, then the new function must be the same as the original
function. If a language knows that a function or other global can
only be overridden by an equivalent global, it can give it the
weak_odr linkage type, and the optimizers will understand that it
is alright to make deductions based on the function body. The
code generators on the other hand map weak and weak_odr linkage
to the same thing.
llvm-svn: 66339
with multiple chain operands. This can occur when the scheduler
has added chain operands to a node that already has a chain
operand, in order to handle physical register dependencies.
This fixes an llvm-gcc bootstrap failure on x86-64 introduced
in r66058.
llvm-svn: 66240
so it changed it into a 31 via the TLO.ShrinkDemandedConstant() call. Then it
would go through the DAG combiner again. This time it had a value of 31, which
was turned into a -1 by TLI.SimplifyDemandedBits(). This would ping pong
forever.
Teach the TLO.ShrinkDemandedConstant() call not to lower a value if the demanded
value is an XOR of all ones.
llvm-svn: 65985
arbitrary vector sizes. Add an optional MinSplatBits parameter to specify
a minimum for the splat element size. Update the PPC target to use the
revised interface.
llvm-svn: 65899
extracts + build_vector into a shuffle would fail, because the
type of the new build_vector would not be legal. Try harder to
create a legal build_vector type. Note: this will be totally
irrelevant once vector_shuffle no longer takes a build_vector for
shuffle mask.
New:
_foo:
xorps %xmm0, %xmm0
xorps %xmm1, %xmm1
subps %xmm1, %xmm1
mulps %xmm0, %xmm1
addps %xmm0, %xmm1
movaps %xmm1, 0
Old:
_foo:
xorps %xmm0, %xmm0
movss %xmm0, %xmm1
xorps %xmm2, %xmm2
unpcklps %xmm1, %xmm2
pshufd $80, %xmm1, %xmm1
unpcklps %xmm1, %xmm2
pslldq $16, %xmm2
pshufd $57, %xmm2, %xmm1
subps %xmm0, %xmm1
mulps %xmm0, %xmm1
addps %xmm0, %xmm1
movaps %xmm1, 0
llvm-svn: 65791
overly long ints, e.g. i96, into pieces at PHIs
and the nodes that feed into them; however big-endian
reverses the order of the pieces (for some reason), and
wasn't doing it the same way on both sides, so
the pieces didn't match and runtime failures ensued.
Fixes 188.ammp and sqlite3 on ppc32.
llvm-svn: 65481
them are generic changes.
- Use the "fast" flag that's already being passed into the asm printers instead
of shoving it into the DwarfWriter.
- Instead of calling "MI->getParent()->getParent()" for every MI, set the
machine function when calling "runOnMachineFunction" in the asm printers.
llvm-svn: 65379
a DBG_LABEL or not. We want to fall back to the original way of emitting debug
info when we're in -O0/-fast mode.
- Add plumbing in to pass the "Fast" flag to places that need it.
- XFAIL DebugInfo/deaddebuglabel.ll. This is finding 11 labels instead of 8. I
need to investigate still.
llvm-svn: 65367
instruction. The class also consolidates the code for detecting constant
splats that's shared across PowerPC and the CellSPU backends (and might be
useful for other backends.) Also introduces SelectionDAG::getBUID_VECTOR() for
generating new BUILD_VECTOR nodes.
llvm-svn: 65296
Now we're using one gross, but quite robust hack :) (previous ones
did not work, for example, when ext_weak symbol was used deep inside
constant expression in the initializer).
The proper fix of this problem will require some quite huge asmprinter
changes and that's why was postponed. This fixes PR3629 by the way :)
llvm-svn: 65230
Ideally these would never get created in the first place, but until we enhance the spiller to have a more
global picture of what's happening, this is necessary for code quality in some circumstances.
llvm-svn: 65120
that has not been JIT'd yet, the callee is put on a list of pending functions
to JIT. The call is directed through a stub, which is updated with the address
of the function after it has been JIT'd. A new interface for allocating and
updating empty stubs is provided.
Add support for removing the ModuleProvider the JIT was created with, which
would otherwise invalidate the JIT's PassManager, which is initialized with the
ModuleProvider's Module.
Add support under a new ExecutionEngine flag for emitting the infomration
necessary to update Function and GlobalVariable stubs after JITing them, by
recording the address of the stub and the name of the GlobalValue. This allows
code to be copied from one address space to another, where libraries may live
at different virtual addresses, and have the stubs updated with their new
correct target addresses.
llvm-svn: 64906
(Note: Eventually, commits like this will be handled via a pre-commit hook that
does this automagically, as well as expand tabs to spaces and look for 80-col
violations.)
llvm-svn: 64827
U include/llvm/CodeGen/DebugLoc.h
U lib/CodeGen/SelectionDAG/LegalizeDAG.cpp
U lib/CodeGen/SelectionDAG/SelectionDAGBuild.cpp
U lib/Target/X86/AsmPrinter/X86ATTAsmPrinter.cpp
Enable debug location generation at -Os. This goes with the reapplication of the
r63639 patch.
llvm-svn: 64715
one bit set, because the bit may be shifted off the end. Instead,
just check for a constant 1 being shifted. This is still sufficient
to handle all the cases in test/CodeGen/X86/bt.ll. This fixes PR3583.
llvm-svn: 64622
Cleanup some warning.
Remark: when struct/class are declared differently than they are defined, this make problem for VC++ since it seems to mangle class differently that struct. These error are very hard to understand and find. So please, try to keep your definition/declaration in sync.
Only tested with VS2008. hope it does not break anything. feel free to revert.
llvm-svn: 64554
in inline asm as signed (what gcc does). Add partial support
for x86-specific "e" and "Z" constraints, with appropriate
signedness for printing.
llvm-svn: 64400
is determined by whether the node has a Flag operand. However, if the
node does have a Flag operand, it will be glued to its register's
def, so the heuristic would end up spuriously applying to whatever
node is the def.
llvm-svn: 64319
It was transforming (x&y)==y to (x&y)!=0 in the case where
y is variable and known to have at most one bit set (e.g. z&1).
This is not correct; the expressions are not equivalent when y==0.
I believe this patch salvages what can be salvaged, including
all the cases in bt.ll. Dan, please review.
Fixes gcc.c-torture/execute/20040709-[12].c
llvm-svn: 64314
instruction index across each part. Instruction indices are used
to make live range queries, and live ranges can extend beyond
scheduling region boundaries.
Refactor the ScheduleDAGSDNodes class some more so that it
doesn't have to worry about this additional information.
llvm-svn: 64288
a scheduling region boundary. This isn't necessary for
correctness; it helps with compile time, as it avoids the need
for data- and anti-dependencies from all spills and reloads on
the stack-pointer modification.
llvm-svn: 64255
scheduling, and generalize is so that preserves state across
scheduling regions. This fixes incorrect live-range information around
terminators and labels, which are effective region boundaries.
In place of looking for terminators to anchor inter-block dependencies,
introduce special entry and exit scheduling units for this purpose.
llvm-svn: 64254
suprise to some callers, e.g. register coalescer. For now, add an parameter
that tells AnalyzeBranch whether it's safe to modify the mbb. A better
solution is out there, but I don't have time to deal with it right now.
llvm-svn: 64124
Right now if the coalesced copy def is dead and its src is a kill, and that
there are now other uses within the live range, the coalescer would mark the
def of the source register as dead. But it should also check if there are
other kills which means the value has other uses not in the live range.
llvm-svn: 64075
Adjust derived classes to pass UnknownLoc where
a DebugLoc does not make sense. Pick one of
DebugLoc and non-DebugLoc variants to survive
for all such classes.
llvm-svn: 64000
Many targets build placeholder nodes for special operands, e.g.
GlobalBaseReg on X86 and PPC for the PIC base. There's no
sensible way to associate debug info with these. I've left
them built with getNode calls with explicit DebugLoc::getUnknownLoc operands.
I'm not too happy about this but don't see a good improvement;
I considered adding a getPseudoOperand or something, but it
seems to me that'll just make it harder to read.
llvm-svn: 63992
SelectionDAGISel::CreateScheduler, and make it just create the
scheduler. Leave running the scheduler to the higher-level code.
This makes the higher-level code a little more explicit and
easier to follow, and will help enable some future refactoring.
llvm-svn: 63944
that used this header to select a scheduling policy should
use SchedulerRegistry.h instead (llvm-gcc and clang were
updated a while ago).
llvm-svn: 63934
target directories themselves. This also means that VMCore no longer
needs to know about every target's list of intrinsics. Future work
will include converting the PowerPC target to this interface as an
example implementation.
llvm-svn: 63765
but when legalizing the operation, we split the vector type and generate a library
call whose type needs to be promoted. For example, X86 with SSE on but MMX off,
a divide v2i64 will be scalarized to 2 calls to a library using i64.
llvm-svn: 63760
support GraphViz, I've been using the foo->dump() facility. This
patch is a minor rewrite to the SelectionDAG dump() stuff to make it a
little more helpful. The existing foo->dump() functionality does not
change; this patch adds foo->dumpr(). All of this is only useful when
running LLVM under a debugger.
llvm-svn: 63736