Commit Graph

55460 Commits

Author SHA1 Message Date
Pete Cooper 91244268d7 Consider address spaces for hashing and CSEing DAG nodes. Otherwise two loads from different x86 segments but the same address would get CSEd
llvm-svn: 160987
2012-07-30 20:23:19 +00:00
Kevin Enderby 5c490f1b8f Fix a bug in ARMMachObjectWriter::RecordRelocation() in ARMMachObjectWriter.cpp
where the other_half of the movt and movw relocation entries needs to get set
and only with the 16 bits of the other half.

rdar://10038370

llvm-svn: 160978
2012-07-30 18:46:15 +00:00
Jakob Stoklund Olesen 7361846f32 Add MachineInstr::isTransient().
This is a cleaned up version of the isFree() function in
MachineTraceMetrics.cpp.

Transient instructions are very unlikely to produce any code in the
final output. Either because they get eliminated by RegisterCoalescing,
or because they are pseudo-instructions like labels and debug values.

llvm-svn: 160977
2012-07-30 18:34:14 +00:00
Jakob Stoklund Olesen 3df6c46fdd Add MachineTraceMetrics::verify().
This function verifies the consistency of cached data in the
MachineTraceMetrics analysis.

llvm-svn: 160976
2012-07-30 18:34:11 +00:00
Jakob Stoklund Olesen eb488fe165 Verify that the CFG hasn't changed during invalidate().
The MachineTraceMetrics analysis must be invalidated before modifying
the CFG. This will catch some of the violations of that rule.

llvm-svn: 160969
2012-07-30 17:36:49 +00:00
Jakob Stoklund Olesen fee94ca15b Add MachineBasicBlock::isPredecessor().
A->isPredecessor(B) is the same as B->isSuccessor(A), but it can
tolerate a B that is null or dangling. This shouldn't happen normally,
but it it useful for verification code.

llvm-svn: 160968
2012-07-30 17:36:47 +00:00
Nadav Rotem 77f1b9c477 When constant folding GEP expressions, keep the address space information of pointers.
Together with Ran Chachick <ran.chachick@intel.com>

llvm-svn: 160954
2012-07-30 07:25:20 +00:00
Craig Topper efd97044a3 Mark MOVZX16/MOVSX16 as neverHasSideEffects/mayLoad
llvm-svn: 160953
2012-07-30 07:14:07 +00:00
Craig Topper c6b7ef61f4 Mark MOVZX32_NOREX as isCodeGenOnly and neverHasSideEffects. The isCodeGenOnly change allows special detection of _NOREX instructions to be removed from tablegen disassembler code.
llvm-svn: 160951
2012-07-30 06:48:11 +00:00
Craig Topper 14eac5dda8 Give VCVTTPD2DQ priority over CVTTPD2DQ.
llvm-svn: 160942
2012-07-30 02:20:32 +00:00
Craig Topper f881d385da Fix patterns for CVTTPS2DQ to specify SSE2 instead of SSE1.
llvm-svn: 160941
2012-07-30 02:14:02 +00:00
Craig Topper 415b3586d0 Fix up patterns for VCVTSS2SD. Specifically give it priority over SSE form. Add an OptForSpeed to explicitly pair up with an OptForSize that was already on another pattern.
llvm-svn: 160939
2012-07-30 01:38:57 +00:00
Craig Topper 28402efcb6 Fix load types on intrinsic forms of SS2SD and SD2SS AVX/SSE convert instruction patterns.
llvm-svn: 160938
2012-07-29 23:26:34 +00:00
Craig Topper b6767f3acd Move more SSE/AVX convert instruction patterns into their definitions.
llvm-svn: 160937
2012-07-29 22:30:06 +00:00
Manman Ren f87dd7c01b Revert r160920 and r160919 due to dragonegg and clang selfhost failure
llvm-svn: 160927
2012-07-29 02:44:09 +00:00
Craig Topper fc93281c07 Fold patterns for some of the SSE/AVX convert instructions into their instruction definitions.
llvm-svn: 160922
2012-07-28 18:59:19 +00:00
Craig Topper 024797b9a2 Mark some of the SSE/AVX convert instructions as mayLoad/neverHasSideEffects.
llvm-svn: 160921
2012-07-28 18:36:39 +00:00
Manman Ren 0fa3ab88ba X86 Peephole: fold loads to the source register operand if possible.
Machine CSE and other optimizations can remove instructions so folding
is possible at peephole while not possible at ISel.

rdar://10554090 and rdar://11873276

llvm-svn: 160919
2012-07-28 16:48:01 +00:00
Craig Topper 44f9b5343d Make CVTSS2SI instruction definition consistent with CVTSD2SI.
llvm-svn: 160914
2012-07-28 08:28:23 +00:00
Craig Topper 1c1aef07b8 Fix up memory load types for SSE scalar convert intrinsic patterns.
llvm-svn: 160913
2012-07-28 07:59:59 +00:00
Manman Ren 32367c063b X86 Peephole: fix PR13475 in optimizeCompare.
It is possible that an instruction can use and update EFLAGS.
When checking the safety, we should check the usage of EFLAGS first before
declaring it is safe to optimize due to the update.

llvm-svn: 160912
2012-07-28 03:15:46 +00:00
Andrew Trick 940534371b Reenable a basic SSA DAG builder optimization.
Jakob fixed ProcessImplicifDefs in r159149.

llvm-svn: 160910
2012-07-28 01:48:15 +00:00
Jakob Stoklund Olesen 0563369755 Add more debug output to MachineTraceMetrics.
llvm-svn: 160905
2012-07-27 23:58:38 +00:00
Jakob Stoklund Olesen 1152202cc2 Keep track of the head and tail of the trace through each block.
This makes it possible to quickly detect blocks that are outside the
trace.

llvm-svn: 160904
2012-07-27 23:58:36 +00:00
Eric Christopher 86ca9f9e11 Add a DW_AT_high_pc for CUs that are a single address range. Update
all tests accordingly.

Fixes PR13351.

Patch by shinichiro hamaji!

llvm-svn: 160899
2012-07-27 22:00:05 +00:00
Jakob Stoklund Olesen 7dfe7abdee Also compute register mask lists under -new-live-intervals.
llvm-svn: 160898
2012-07-27 21:56:39 +00:00
Chad Rosier bd9f2ba4d6 Typos.
llvm-svn: 160897
2012-07-27 21:41:59 +00:00
Evan Cheng 249716e8ae Teach CodeGenPrep to look past bitcast when it's duplicating return instruction
into predecessor blocks to enable tail call optimization.

rdar://11958338

llvm-svn: 160894
2012-07-27 21:21:26 +00:00
Jakob Stoklund Olesen 97e14e02f1 Eliminate the IS_PHI_DEF flag and VNInfo::setIsPHIDef().
A value number is a PHI def if and only if it begins at a block
boundary. This can be derived from the def slot, a separate flag is not
necessary.

llvm-svn: 160893
2012-07-27 21:11:14 +00:00
Jakob Stoklund Olesen 4021a7bf25 Add a -new-live-intervals experimental option.
This option replaces the existing live interval computation with one
based on LiveRangeCalc.cpp. The new algorithm does not depend on
LiveVariables, and it can be run at any time, before or after leaving
SSA form.

llvm-svn: 160892
2012-07-27 20:58:46 +00:00
Andrew Kaylor 8e87a75be7 Fixing problems with X86_64_32 relocations and making the assertions more readable.
llvm-svn: 160889
2012-07-27 20:30:12 +00:00
Jakob Stoklund Olesen bc65e8f94e Add <imp-def> of super-register when lowering SUBREG_TO_REG.
Patch by Tyler Nowicki!

llvm-svn: 160888
2012-07-27 20:19:49 +00:00
Andrew Kaylor 782d5c434f Test commit, clean up comment
llvm-svn: 160880
2012-07-27 18:39:47 +00:00
Nuno Lopes 85591f899d fix PR13390: do not loop forever with self-referencing self instructions
llvm-svn: 160876
2012-07-27 18:21:15 +00:00
Nuno Lopes 20c7eb3549 fix infinite loop in instcombine in the presence of a (malformed) self-referencing select inst.
This can happen as long as the instruction is not reachable. Instcombine does generate these unreachable malformed selects when doing RAUW

llvm-svn: 160874
2012-07-27 18:03:57 +00:00
Andrew Kaylor 5c01090c49 Test commit, clean up comment
llvm-svn: 160873
2012-07-27 17:52:42 +00:00
Jakob Stoklund Olesen 0c06121e4e Give MCRegisterInfo an implementation file.
Move some functions from MCRegisterInfo.h that don't need to be inline.

This shrinks llc by 8K.

llvm-svn: 160865
2012-07-27 16:25:20 +00:00
Akira Hatanaka 97ba7696f8 Pass the correct call frame size to callseq_start node. This is needed to
replace uses of function getMaxCallFrameSize defined in MipsFunctionInfo with
the one MachineFrameInfo has.

llvm-svn: 160841
2012-07-26 23:27:01 +00:00
Pete Cooper abc13af9c6 Simplify demanded bits of select sources where the condition is a constant vector
llvm-svn: 160835
2012-07-26 23:10:24 +00:00
Jakob Stoklund Olesen 7cd08536c2 Remove the X86 sub_ss and sub_sd sub-register indexes completely.
llvm-svn: 160833
2012-07-26 23:07:20 +00:00
Jakob Stoklund Olesen 77cd55b4ee Remove the last mentions of sub_ss and sub_sd from patterns.
I'll remove these two sub-register indexes shortly.

llvm-svn: 160831
2012-07-26 23:03:08 +00:00
Jakob Stoklund Olesen b96d0b4e08 Eliminate sub_ss, sub_sd from broadcast patterns.
The (COPY_TO_REGCLASS GR32:$src, VR128) pattern looks odd, but
copyPhysReg does the right thing with it. (The old pattern would
eventually produce the same cross-class copy).

llvm-svn: 160830
2012-07-26 22:59:06 +00:00
Pete Cooper e807e45bff Teach SimplifyDemandedBits how to look through fpext and fptrunc to simplify their operand
llvm-svn: 160823
2012-07-26 22:37:04 +00:00
Jakob Stoklund Olesen 206b825f5c Eliminate more sub_ss / sub_sd patterns.
This gets rid of some more INSERT_SUBREG - IMPLICIT_DEF patterns,
simplifying the emitted code a bit.

llvm-svn: 160820
2012-07-26 22:30:18 +00:00
Jakob Stoklund Olesen 75d17b0577 Eliminate some SUBREG_TO_REG patterns with sub_ss and sub_sd.
The SUBREG_TO_REG instruction has magic semantics asserting that the
source value was defined by an instruction that cleared the high half of
the register. Those semantics are never actually exploited for xmm
registers.

llvm-svn: 160818
2012-07-26 22:03:21 +00:00
Jakob Stoklund Olesen ceee4a9d0c Eliminate a batch of uses of sub_ss and sub_sd in the X86 target.
These idempotent sub-register indices don't do anything --- They simply
map XMM registers to themselves.  They no longer affect register classes
either since the SubRegClasses field has been removed from Target.td.

This patch replaces XMM->XMM EXTRACT_SUBREG and INSERT_SUBREG patterns
with COPY_TO_REGCLASS patterns which simply become COPY instructions.

The number of IMPLICIT_DEF instructions before register allocation is
reduced, and that is the cause of the test case changes.

llvm-svn: 160816
2012-07-26 21:40:42 +00:00
Micah Villmow 7b473d9f72 Add support for v16i32/v16i64 into the code generator. This is required for backends that use i32/i64 vectors for the getSetCCResultType function.
llvm-svn: 160814
2012-07-26 21:22:00 +00:00
Chad Rosier 7c427c40cb Make comments in Debug.cpp and Debug.h consistent. Rename SetCurrentDebugType;
Function names should be camel case, and start with a lower case letter.  No
functional change intended.

llvm-svn: 160813
2012-07-26 20:38:52 +00:00
Jakob Stoklund Olesen 35400b1dda Use an otherwise unused variable.
llvm-svn: 160798
2012-07-26 19:42:56 +00:00
Jakob Stoklund Olesen f9029fef2a Start scaffolding for a MachineTraceMetrics analysis pass.
This is still a work in progress.

Out-of-order CPUs usually execute instructions from multiple basic
blocks simultaneously, so it is necessary to look at longer traces when
estimating the performance effects of code transformations.

The MachineTraceMetrics analysis will pick a typical trace through a
given basic block and provide performance metrics for the trace. Metrics
will include:

- Instruction count through the trace.
- Issue count per functional unit.
- Critical path length, and per-instruction 'slack'.

These metrics can be used to determine the performance limiting factor
when executing the trace, and how it will be affected by a code
transformation.

Initially, this will be used by the early if-conversion pass.

llvm-svn: 160796
2012-07-26 18:38:11 +00:00