configure was silently failing to produce anything in the case
where clang wasn't at tools/clang/, resulting in compilation
errors much later in the build when config.h didn't exist.
llvm-svn: 149563
Adds an instruction itinerary to all x86 instructions, giving each a default latency of 1, using the InstrItinClass IIC_DEFAULT.
Sets specific latencies for Atom for the instructions in files X86InstrCMovSetCC.td, X86InstrArithmetic.td, X86InstrControl.td, and X86InstrShiftRotate.td. The Atom latencies for the remainder of the x86 instructions will be set in subsequent patches.
Adds a test to verify that the scheduler is working.
Also changes the scheduling preference to "Hybrid" for i386 Atom, while leaving x86_64 as ILP.
Patch by Preston Gurd!
llvm-svn: 149558
It is simpler to define a composite index directly:
def ssub_2 : SubRegIndex<[dsub_1, ssub_0]>;
def ssub_3 : SubRegIndex<[dsub_1, ssub_1]>;
Than specifying the composite indices on each register:
CompositeIndices = [(ssub_2 dsub_1, ssub_0),
(ssub_3 dsub_1, ssub_1)] in ...
This also makes it clear that SubRegIndex composition is supposed to be
unique.
llvm-svn: 149556
The final tie breaker comparison also needs to return +/-1, or 0.
This is not a less() function.
This could cause otherwise identical super-classes to be ordered
unstably, depending on what the system qsort routine does with a bad
compare function.
llvm-svn: 149549
This new scheduler plugs into the existing selection DAG scheduling framework. It is a top-down critical path scheduler that tracks register pressure and uses a DFA for pipeline modeling.
Patch by Sergei Larin!
llvm-svn: 149547
It could only be specified on the commandline, and wouldn't show
up as an option in the GUI or when invoked via `cmake -i` at all.
This also tells CMake that it's a BOOL, rather than "UNINITIALIZED".
llvm-svn: 149506
The CMake build already generated one. Follows clang r149497.
This brings us one step closer to compiling and configuring clang
separately from LLVM using the autoconf build, too.
(I lack the right version of autoconf et al. to regen, but it
was a simple change, so I just updated configure manually.)
llvm-svn: 149498
The purpose of refactoring is to hide operand roles from SwitchInst user (programmer). If you want to play with operands directly, probably you will need lower level methods than SwitchInst ones (TerminatorInst or may be User). After this patch we can reorganize SwitchInst operands and successors as we want.
What was done:
1. Changed semantics of index inside the getCaseValue method:
getCaseValue(0) means "get first case", not a condition. Use getCondition() if you want to resolve the condition. I propose don't mix SwitchInst case indexing with low level indexing (TI successors indexing, User's operands indexing), since it may be dangerous.
2. By the same reason findCaseValue(ConstantInt*) returns actual number of case value. 0 means first case, not default. If there is no case with given value, ErrorIndex will returned.
3. Added getCaseSuccessor method. I propose to avoid usage of TerminatorInst::getSuccessor if you want to resolve case successor BB. Use getCaseSuccessor instead, since internal SwitchInst organization of operands/successors is hidden and may be changed in any moment.
4. Added resolveSuccessorIndex and resolveCaseIndex. The main purpose of these methods is to see how case successors are really mapped in TerminatorInst.
4.1 "resolveSuccessorIndex" was created if you need to level down from SwitchInst to TerminatorInst. It returns TerminatorInst's successor index for given case successor.
4.2 "resolveCaseIndex" converts low level successors index to case index that curresponds to the given successor.
Note: There are also related compatability fix patches for dragonegg, klee, llvm-gcc-4.0, llvm-gcc-4.2, safecode, clang.
llvm-svn: 149481
The pass pointer should never be referenced after sending it to
schedulePass(), which may delete the pass. To fix this bug I had to
clean up the design leading to more goodness.
You may notice now that any non-analysis pass is printed. So things like loop-simplify and lcssa show up, while target lib, target data, alias analysis do not show up. Normally, analysis don't mutate the IR, but you can now check this by using both -print-after and -print-before. The effects of analysis will now show up in between the two.
The llc path is still in bad shape. But I'll be improving it in my next checkin. Meanwhile, print-machineinstrs still works the same way. With print-before/after, many llc passes that were not printed before now are, some of these should be converted to analysis. A few very important passes, isel and scheduler, are not properly initialized, so not printed.
llvm-svn: 149480
This is the initial checkin of the basic-block autovectorization pass along with some supporting vectorization infrastructure.
Special thanks to everyone who helped review this code over the last several months (especially Tobias Grosser).
llvm-svn: 149468
Changing arguments from being passed as fixed to varargs is unsafe, as
the ABI may require they be handled differently (stack vs. register, for
example).
Remove two tests which rely on the bitcast being folded into the direct
call, which is exactly the transformation that's unsafe.
llvm-svn: 149457
symbol from an assignment. In this case the symbol did not have a fragment so
MCObjectWriter::IsSymbolRefDifferenceFullyResolved() should not have been
calling IsSymbolRefDifferenceFullyResolvedImpl() with a NULL fragment and should
just have returned false in that case.
llvm-svn: 149442
This class is used to represent SubRegIndex instances instead of the raw
Record pointers that were used before.
No functional change intended.
llvm-svn: 149418
now that this handles the release / retain calls.
Adds a regression test for that bug (which is a compile-time
regression) and for the last two changes to the IntrusiveRefCntPtr,
especially tests for the memory leak due to copy construction of the
ref-counted object and ensuring that the traits are used for release /
retain calls.
llvm-svn: 149411
This removes implicit assumption about the form of MI coming into regalloc. In particular, it should be independent of ProcessImplicitDefs which will eventually become a standard part of coming out of SSA--unless we simply can eliminate IMPLICIT_DEF completely. Current unit tests expose this once I remove incidental pass ordering restrictions.
This is not a final fix. Just a temporary workaround until I figure out the right way.
llvm-svn: 149360
kicking in the big win of ConstantDataArray. As part of this, change
the implementation of GetConstantStringInfo in ValueTracking to work
with ConstantDataArray (and not ConstantArray) making it dramatically,
amazingly, more efficient in the process and renaming it to
getConstantStringInfo.
This keeps around a GetConstantStringInfo entrypoint that (grossly)
forwards to getConstantStringInfo and constructs the std::string
required, but existing clients should move over to
getConstantStringInfo instead.
llvm-svn: 149351
vectors of all one bits to be printed more cleverly in the AsmPrinter.
Unfortunately, the byte value for all one bits is the same with
-fsigned-char as the error return of '-1'. Force this to be the unsigned
byte value when returning it to avoid this problem, and update the test
case for the shiny new behavior.
Yay for building LLVM and Clang with -funsigned-char.
Chris, please review, and let me know if there is any reason to not
desire this change. It seems good on the surface, and certainly intended
based on the code written.
llvm-svn: 149299
- Don't call malloc+free in the very hot forward().
- Don't call isTiedToDefOperand().
- Don't create BitVector temporaries.
- Merge DeadRegs into KillRegs.
- Eliminate the early clobber checks, they were irrelevant to scavenging.
- Remove unnecessary code from -Asserts builds.
This speeds up ARM PEI by 3.4x and overall llc -O0 codegen time by 11%.
llvm-svn: 149189
Sometimes there is only one 'resume' instruction per function. In those
situations, we don't need a separate block for the call to _Unwind_Resume. In
fact, it adds a lot of overhead to code-gen if we do that -- especially at -O0.
If we have a single 'resume' instruction, just generate the call within that
block.
<rdar://problem/10694814>
llvm-svn: 149159
Move to a model where we build whatever branches are checked out
in the source directories. This was a bit too smart (and complicated)
in handling details best left to the user and the revision control
system.
In addition, get rid of support for llvm-gcc and building gcc as
these are no longer necessary.
llvm-svn: 149149
around within a basic block while maintaining live-intervals.
Updated ScheduleTopDownLive in MachineScheduler.cpp to use the moveInstr API
when reordering MIs.
llvm-svn: 149147
GEP instructions are there for the compiler and shouldn't really output much
code (if any at all). When a GEP is stored in the entry block, Fast ISel (for
one) will not know that it could fold it into further uses. For instance, inside
of the EH handling code. This results in a lot of unnecessary spills and loads
which bloat code and slows down pretty much everything.
<rdar://problem/10694814>
llvm-svn: 149114
mid-level constant folding APIs instead of doing its own analysis.
This makes it more general (e.g. can now share a <2 x i64> with a
<4 x i32>) and avoid duplicating a bunch of logic.
llvm-svn: 149111
Adjust an example MachObjectWriter diagnostic to use the information
to issue a better message.
Before:
LLVM ERROR: unknown ARM fixup kind!
After:
x.s:6:5: error: unsupported relocation on symbol
beq bar
^
rdar://9800182
llvm-svn: 149093
The Win64 calling convention has xmm6-15 as callee-saved while still
clobbering all ymm registers.
Add a YMM_HI_6_15 pseudo-register that aliases the clobbered part of the
ymm registers, and mark that as call-clobbered. This allows live xmm
registers across calls.
This hack wouldn't be necessary with RegisterMask operands representing
the call clobbers, but they are not quite operational yet.
llvm-svn: 149088
we're at it, allow PatternMatch's "neg" pattern to match integer
vector negations, and enhance ComputeNumSigned bits to handle
shl of vectors.
llvm-svn: 149082
ConstantExpr::getWithOperandReplaced and ConstantExpr::replaceUsesOfWithOnConstant
in terms of ConstantExpr::getWithOperands. While we're at it,
make sure that ConstantExpr::getWithOperands covers all instructions: it was
missing insert/extractvalue.
llvm-svn: 149076
MachineBasicBlock::canFallThrough(). We're interested in the state of the
instruction (i.e., is this a barrier or not?), not if the instruction is
predicable or not.
rdar://10501092
llvm-svn: 149070
The live range of the source register may be extended when a redundant
copy is eliminated. Make sure any kill flags between the two copies are
cleared.
This fixes PR11765.
llvm-svn: 149069
This enables the linker to match concrete relocation types (absolute or relative) with whatever library or C++ support code is being linked against.
llvm-svn: 149057
. "fptosi" and "fptoui" IR instructions are defined with round-to-zero rounding mode.
. Currently for AVX mode for <4xdouble> and <8xdouble> the "VCVTPD2DQ.128" and "VCVTPD2DQ.256" instructions are selected (for .fp_to_sint. DAG node operation ) by AVX codegen. However they use round-to-nearest-even rounding mode.
. Consequently, the conversion produces incorrect numbers.
The fix is to replace selection of VCVTPD2DQ instructions with VCVTTPD2DQ instructions. The latter use truncate (i.e. round-to-zero) rounding mode.
As .fp_to_sint. DAG node operation is used only for lowering of "fptosi" and "fptoui" IR instructions, the fix in X86InstrSSE.td definition file doesn.t have an impact on other LLVM flows.
The patch includes changes in the .td file, LIT test for the changes and a fix in a legacy LIT test (which produced asm code conflicting with LLVN IR spec).
llvm-svn: 149056
ConstantVector. Fix some outright bugs in the implementation of
ConstantArray and Constant struct, which would cause us to not make
one big UndefValue when asking for an array/struct with all undef
elements. Enhance Constant::isAllOnesValue to work with
ConstantDataVector.
llvm-svn: 149021
to reduce the number of cast<>'s we have. This allows someone to use
things like Ty->getVectorNumElements() instead of
cast<VectorType>(Ty)->getNumElements() when you know that a type is a
vector.
It would be a great general cleanup to move the codebase to use these,
I will do so in the code I'm touching.
llvm-svn: 148999
This boils down to using MachineOperand::readsReg() more.
This fixes PR11829 where a use ended up after the first def when
lowering REG_SEQUENCE instructions involving IMPLICIT_DEFs.
llvm-svn: 148996
function. They don't appear to be used, and are inconsistent with handling of
other physreg intervals (i.e. intervals that are not live-in) where ranges are
not inserted for aliases.
llvm-svn: 148986
"Although a Thumb2 instruction, the IT mnemonic shall be permitted in
ARM mode, and the condition verified to match the condition code(s)
on the following instruction(s)."
PR11853
llvm-svn: 148969
savings from a pointer argument becoming an alloca. Sometimes callees will even
compare a pointer to null and then branch to an otherwise unreachable block!
Detect these cases and compute the number of saved instructions, instead of
bailing out and reporting no savings.
llvm-svn: 148941
to 64-bits, and added a new attribute in bit #32. Specifically, remove
this new attribute from the enum used in the C API. It's not yet clear
what the best approach is for exposing these new attributes in the
C API, and several different proposals are on the table. Until then, we
can simply not expose this bit in the API at all.
Also, I've reverted a somewhat unrelated change in the same revision
which switched from "1 << 31" to "1U << 31" for the top enum. While "1
<< 31" is technically undefined behavior, implementations DTRT here.
However, MS and -pedantic mode warn about non-'int' type enumerator
values. If folks feel strongly about this I can put the 'U' back in, but
it seemed best to wait for the proper solution.
llvm-svn: 148937
- Use MipsAnalyzeImmediate to expand immediates that do not fit in 16-bit.
- Change the types of variables so that they are sufficiently large to handle
64-bit pointers.
- Emit instructions to set register $28 in a function prologue after
instructions which store callee-saved registers have been emitted.
llvm-svn: 148917
expand offsets that do not fit in the 16-bit immediate field of load and store
instructions. Also change the types of variables so that they are sufficiently
large to handle 64-bit pointers.
llvm-svn: 148916
A REG_SEQUENCE instruction is lowered into a sequence of partial defs:
%vreg7:ssub_0<def,undef> = COPY %vreg20:ssub_0
%vreg7:ssub_1<def> = COPY %vreg2
%vreg7:ssub_2<def> = COPY %vreg2
%vreg7:ssub_3<def> = COPY %vreg2
The first def needs an <undef> flag to indicate it is the beginning of
the live range, while the other defs are read-modify-write. Previously,
we depended on LiveIntervalAnalysis to notice and fix the missing
<def,undef>, but that solution was never robust, it was causing problems
with ProcessImplicitDefs and the lowering of chained REG_SEQUENCE
instructions.
This fixes PR11841.
llvm-svn: 148879
When not using subsections via symbols, the assembler can resolve
symbol differences (including pcrel references) to non-local
labels at assembly time, not just those in the same atom.
llvm-svn: 148865
dealing in the host triple, be honest about it and document the decision
to default the target triple to the host triple unless overridden.
llvm-svn: 148822