promoted to legal types without changing the type of the vector. This is
following a suggestion from Duncan
(http://lists.cs.uiuc.edu/pipermail/llvmdev/2009-February/019923.html).
The transformation that used to be done during type legalization is now
postponed to DAG legalization. This allows the BUILD_VECTORs to be optimized
and potentially handled specially by target-specific code.
It turns out that this is also consistent with an optimization done by the
DAG combiner: a BUILD_VECTOR and INSERT_VECTOR_ELT may be combined by
replacing one of the BUILD_VECTOR operands with the newly inserted element;
but INSERT_VECTOR_ELT allows its scalar operand to be larger than the
element type, with any extra high bits being implicitly truncated. The
result is a BUILD_VECTOR where one of the operands has a type larger than
the vector element type.
Any code that operates on BUILD_VECTORs may now need to be aware of the
potential type discrepancy between the vector element type and the
BUILD_VECTOR operands. This patch updates all of the places that I could
find to handle that case.
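As a rough, hypothetical sketch of what such handling can look like (the helper is made up, and the exact getNode signature varies across LLVM versions):

  #include "llvm/CodeGen/SelectionDAG.h"
  using namespace llvm;

  // Hypothetical helper: make the implicit truncation of an over-wide
  // BUILD_VECTOR operand explicit before operating on the scalar.
  static SDValue truncateBuildVectorOperand(SelectionDAG &DAG, const SDLoc &DL,
                                            SDValue Op, EVT EltVT) {
    if (Op.getValueType() != EltVT)
      return DAG.getNode(ISD::TRUNCATE, DL, EltVT, Op);
    return Op;
  }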
llvm-svn: 68996
This will be used to replace things like X86's MOV32to32_.
Enhance ScheduleDAGSDNodesEmit to be more flexible and robust
in the presence of subregister superclasses and subclasses. It
can now cope with the definition of a virtual register being in
a subclass of a use.
Re-introduce the code for recording register superreg classes and
subreg classes. This is needed so that when subreg extracts and
inserts get coalesced away, the virtual registers are left in
the correct subclass.
llvm-svn: 68961
to support C99 inline, GNU extern inline, etc. Related Bugzilla reports
include PR3517, PR3100, and PR2933. Nothing uses this yet, but it
appears to work.
llvm-svn: 68940
Create the debug_inlined dwarf section using this information. This info is used by gdb, at least on Darwin, to enable a better experience when debugging inlined functions. See DwarfWriter.cpp for more information on the structure of the debug_inlined section.
llvm-svn: 68847
the key. This will cause it to create a new std::string, which isn't
wanted. Instead, pass back the "const char*". Modify the EmitString() method to
take a "const char*".
llvm-svn: 68741
register destinations that are tied to source operands. The
TargetInstrDescr::findTiedToSrcOperand method silently fails for inline
assembly. The existing MachineInstr::isRegReDefinedByTwoAddr was very
close to doing what is needed, so this revision makes a few changes to
that method and also renames it to isRegTiedToUseOperand (for consistency
with the very similar isRegTiedToDefOperand and because it handles both
two-address instructions and inline assembly with tied registers).
llvm-svn: 68714
with SUBREG_TO_REG, teach SimpleRegisterCoalescing to coalesce
SUBREG_TO_REG instructions (which are similar to INSERT_SUBREG
instructions), and teach the DAGCombiner to take advantage of this on
targets which support it. This eliminates many redundant
zero-extension operations on x86-64.
This adds a new TargetLowering hook, isZExtFree. It's similar to
isTruncateFree, except it only applies to actual definitions, and not
no-op truncates which may not zero the high bits.
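For illustration only, a hypothetical target might opt in roughly like this (MyTargetLowering is made up, and the exact overload of the hook is an assumption):

  // Sketch: on this assumed 64-bit target, writing a 32-bit register
  // implicitly zeroes the upper half, so an i32 -> i64 zero extension is free.
  bool MyTargetLowering::isZExtFree(EVT VT1, EVT VT2) const {
    return VT1 == MVT::i32 && VT2 == MVT::i64;
  }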
Also, this adds a new optimization to SimplifyDemandedBits: transform
operations like x+y into (zext (add (trunc x), (trunc y))) on targets
where all the casts are no-ops. In contexts where the high part of the
add is explicitly masked off, this allows the mask operation to be
eliminated. Fix the DAGCombiner to avoid undoing these transformations
to eliminate casts on targets where the casts are no-ops.
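A small worked example of the kind of code this helps, assuming a 64-bit target where the i64/i32 truncate and zero extension are no-ops:

  #include <cstdint>

  // The add only feeds a 32-bit mask, so SimplifyDemandedBits can rewrite it as
  //   (x + y) & 0xFFFFFFFF  ==>  zext(trunc(x) + trunc(y))
  // and, because the zext is free, the explicit masking disappears.
  uint64_t maskedAdd(uint64_t x, uint64_t y) {
    return (x + y) & 0xFFFFFFFFu;
  }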
Also, this adds a new two-address lowering heuristic. Since
two-address lowering runs before coalescing, it helps to be able to
look through copies when deciding whether commuting and/or
three-address conversion are profitable.
Also, fix a bug in LiveInterval::MergeInClobberRanges. It didn't handle
the case that a clobber range extended both before and beyond an
existing live range. In that case, multiple live ranges need to be
added. This was exposed by the new subreg coalescing code.
Remove 2008-05-06-SpillerBug.ll. It was bugpoint-reduced, and the
spiller behavior it was looking for no longer occurs with the new
instruction selection.
llvm-svn: 68576
When compiling in Thumb mode, only the low (R0-R7) registers are available
for most instructions. Breaking the low registers into a new register class
handles this. Uses of R12, SP, etc., are handled explicitly where needed
with copies inserted to move results into low registers where the rest of
the code generator can deal with them.
llvm-svn: 68545
elements in a form that is efficient for the reader to just get a
pointer in memory and start reading. APIs to do efficient reading
and writing are still todo.
llvm-svn: 68465
Constant, MDString and MDNode which can only be used by globals with a name
that starts with "llvm." or as arguments to a function with the same naming
restriction.
llvm-svn: 68420
- Particularly nice for small constant strings, which get optimized
down nicely. On a synthetic benchmark writing out "hello" in a
loop, this is about 2x faster with gcc and 3x faster with
llvm-gcc. llc on insn-attrtab.bc from 403.gcc is about .5% faster.
- I tried for a fancier solution which wouldn't increase code size as
much (by trying to match constant arrays), but can't quite make it
fly.
llvm-svn: 68396
"The code was doing "if (End+NumInputs > Capacity) ...". If End is
close to 0xFFFFFFFF and NumInputs is large, it'll overflow, the
condition will come out false, and the vector won't grow to
accommodate the new elements, and the program will crash in memmove."
Patch by Jeffrey Yasskin!
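A minimal sketch of the overflow-safe form of that check (the names are illustrative, not the actual container code):

  #include <cstddef>

  // Rewriting the comparison avoids the addition that can wrap when End is
  // close to SIZE_MAX. Assumes End <= Capacity.
  bool needsGrowth(size_t End, size_t NumInputs, size_t Capacity) {
    return NumInputs > Capacity - End;  // instead of End + NumInputs > Capacity
  }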
llvm-svn: 68277
- The code is silly, I'm just amusing myself. Rewrite to be efficient
if you like. :)
Also, if you wish to debate the proper names of the triple components
I'm all ears.
llvm-svn: 68252
which are effectively smart pointers to Value*'s. They are both very
lightweight and simple, and react to values being destroyed or being RAUW'd.
WeakVH does a best effort to follow a value around, including through RAUW
operations, and will get nulled out if the value is destroyed. This is useful
for the eventual "metadata that references a value" work, because it is a
reference to a value that does not show up on its use_* list.
AssertingVH is a pointer that compiles down to a dumb raw pointer when
assertions are disabled. When enabled, it emits an assertion if the
pointed-to value is destroyed while it is still being referenced. This
is very useful for Maps and other things, and should have caught the recent
bugs in CallGraph and Reassociate, for example.
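A small usage sketch based on the behavior described above (the header path is an assumption; it has moved over time):

  #include "llvm/IR/Value.h"
  #include "llvm/IR/ValueHandle.h"
  using namespace llvm;

  void trackValue(Value *V, Value *Replacement) {
    WeakVH Weak(V);               // follows V; nulled out if V is destroyed
    AssertingVH<Value> Guard(V);  // raw pointer in release builds; asserts if
                                  // V is destroyed while still referenced
    V->replaceAllUsesWith(Replacement);
    // Per the description above, Weak makes a best effort to follow the RAUW,
    // and neither handle appears on V's use_* lists.
  }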
llvm-svn: 68149
entered via fall-through. Don't miss fallthroughs from blocks
terminated by conditional branches. Also, move
isOnlyReachableByFallthrough out of line.
llvm-svn: 68129
llvm::sys::getOS{Name,Version}.
Right now the implementation just derives from LLVM_HOSTTRIPLE (which
is wrong, but it doesn't look like we have a define for the target
triple). Ideally this routine would actually be able to compute the
triple for targets we care about.
llvm-svn: 68118
only reachable via fall-through edges. This dramatically reduces the
number of labels printed, and thus also the number of labels the
assembler must parse and remember.
llvm-svn: 68073
you to do things like:
/// PointerUnion<int*, float*> P;
/// P = (int*)0;
/// printf("%d %d", P.is<int*>(), P.is<float*>()); // prints "1 0"
/// X = P.get<int*>(); // ok.
/// Y = P.get<float*>(); // runtime assertion failure.
/// Z = P.get<double*>(); // does not compile.
/// P = (float*)0;
/// Y = P.get<float*>(); // ok.
/// X = P.get<int*>(); // runtime assertion failure.
llvm-svn: 67987
function with a new NumLowBitsAvailable enum, which makes the
value available as an integer constant expression.
Add PointerLikeTypeTraits specializations for Instruction* and
Use** since they are only guaranteed 4-byte aligned.
Enhance PointerIntPair to know about (and enforce) the alignment
specified by PointerLikeTypeTraits. This should allow things
like PointerIntPair<PointerIntPair<void*, 1,bool>, 1, bool>
because the inner one knows that 2 low bits are free.
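The nesting mentioned above looks roughly like this (a sketch; the exact number of free bits depends on the traits for the inner pointer type):

  #include "llvm/ADT/PointerIntPair.h"
  using namespace llvm;

  // The inner pair's PointerLikeTypeTraits advertise how many low bits it
  // leaves free, so the outer pair can claim one more bit without overlap.
  typedef PointerIntPair<void *, 1, bool> Inner;
  typedef PointerIntPair<Inner, 1, bool> Outer;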
llvm-svn: 67979
x * 40
=>
shlq $3, %rdi
leaq (%rdi,%rdi,4), %rax
This has the added benefit of allowing more multiplies to be folded into addressing modes, e.g.
a * 24 + b
=>
leaq (%rdi,%rdi,2), %rax
leaq (%rsi,%rax,8), %rax
llvm-svn: 67917
causing a bootstrap failure. Bootstraps here on
x86-32-linux and x86-64-linux. Requested by the
author Gabor Greif who says that a bug that might
have been causing the failure has since been fixed.
llvm-svn: 67844
- Make type declarations match the struct/class keyword of the definition.
- Move AddSignalHandler into the namespace where it belongs.
- Correctly call functions from template base.
- Some other small changes.
With this patch, LLVM and Clang should build properly and with far less noise under VS2008.
llvm-svn: 67347
the inliner; prevents nondeterministic behavior
when the same address is reallocated.
Don't build call graph nodes for debug intrinsic calls;
they're useless, and there were typically a lot of them.
llvm-svn: 67311
the set of blocks in which values are used, the set in which
values are live-through, and the set in which values are
killed. For the live-through and killed sets, conservative
approximations are used.
llvm-svn: 67309
a single character requires only one branch to follow the slow path (see the sketch after this list).
- Never use a buffer when writing on an unbuffered stream.
- Move default buffer size to header.
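A standalone sketch of the one-branch fast path (the class and member names are illustrative, not the real raw_ostream):

  // Illustrative buffered stream: the common case costs a single comparison.
  struct BufferedStream {
    char *OutBufCur = nullptr;
    char *OutBufEnd = nullptr;

    void writeSlow(char C);       // flush/refill, or handle an unbuffered stream

    BufferedStream &operator<<(char C) {
      if (OutBufCur >= OutBufEnd) // the single branch guarding the slow path
        writeSlow(C);
      else
        *OutBufCur++ = C;
      return *this;
    }
  };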
llvm-svn: 67066
write as arguments.
- Add raw_ostream::GetNumBytesInBuffer.
- Privatize buffer pointers.
- Get rid of slow and unnecessary code for writing out large strings.
llvm-svn: 67060
- Flush a known non-empty buffer; enforces the interface to
flush_impl and kills off HandleFlush (which I saw no reason to keep as
an inline method, Chris?).
- Clarify invariant that flush_impl is only called with OutBufCur >
OutBufStart.
- This also clearly collects all places where we have to deal with the
buffer possibly not existing.
- A few more comments and fixing the unbuffered behavior remain in
this commit sequence.
llvm-svn: 67057
by inserting explicit zero extensions where necessary. Included
is a testcase where SelectionDAG produces a virtual register
holding an i1 value which FastISel previously mistakenly assumed
to be zero-extended.
llvm-svn: 66941
changes.
For InvokeInst, all arguments now begin at op_begin().
The Callee, Cont and Fail operands are now faster to get
by accessing them relative to op_end().
This patch introduces some temporary ugliness in CallSite.
Next I'll bring CallInst up to a similar scheme and then
the ugliness will magically vanish.
This patch also exposes how much the libraries rely
on InvokeInst's operand ordering. I am thinking of taking
care of that too.
llvm-svn: 66920
1. The ConstantPoolSDNode alignment field is the log2 of the alignment requirement. This is not consistent with other SDNode variants.
2. The MachineConstantPool alignment field is also a log2 value.
3. However, some places create ConstantPoolSDNodes with raw alignment values rather than log2 values. This creates entries with artificially large alignments, e.g. 256 for SSE vector values.
4. Constant pool entry offsets are computed when the entries are created. However, the asm printer groups them by section, which means the offsets are no longer valid; yet the asm printer still uses them to determine the size of the padding between entries.
5. The asm printer uses an expensive data structure, a multimap, to track constant pool entries by section.
6. The asm printer iterates over a SmallPtrSet when emitting constant pool entries. This is non-deterministic.
Solutions:
1. The ConstantPoolSDNode alignment field is changed to hold the non-log2 value.
2. The MachineConstantPool alignment field is also changed to hold the non-log2 value.
3. Functions that create ConstantPool nodes now pass in non-log2 alignments.
4. MachineConstantPoolEntry no longer keeps an offset field; it's replaced with an alignment field. Offsets are no longer computed when constant pool entries are created; they are computed on the fly in the asm printer and the JIT.
5. The asm printer uses a cheaper data structure to group constant pool entries.
6. The asm printer computes entry offsets after grouping is done.
7. Change JIT code to compute entry offsets on the fly.
llvm-svn: 66875
access each with a fixed negative index from op_end().
This has two important implications:
- getUser() will work faster, because there are fewer iterations
for the waymarking algorithm to perform. This is important
when running various analyses that want to determine callers
of basic blocks.
- getSuccessor() now runs faster, because the indirection via OperandList
is not necessary: Uses corresponding to the successors are at a fixed
offset from "this" (see the sketch below).
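A hedged sketch of the end-relative access pattern (the function and the exact offsets are illustrative, not the real BranchInst layout):

  #include "llvm/IR/BasicBlock.h"
  #include "llvm/IR/User.h"
  using namespace llvm;

  // Successor Uses sit at the tail of the operand list, so they can be reached
  // directly from op_end() without first loading OperandList.
  static BasicBlock *successorFromEnd(User *Term, unsigned Idx,
                                      unsigned NumSuccs) {
    return cast<BasicBlock>(*(Term->op_end() - NumSuccs + Idx));
  }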
The price we pay is the slightly more complicated logic in User::operator
delete, as it has to determine whether it has to free the memory of an
original unconditional BranchInst or a BranchInst that was originally
conditional, but has been shortened to unconditional.
I was not able to come up with a nicer solution to this problem. (And
rest assured, I tried *a lot*).
Similar reorderings will follow for InvokeInst and CallInst. After that
some optimizations to pred_iterator and CallSite will fall out naturally.
llvm-svn: 66815