llvm-project

Commit Graph

Author	SHA1	Message	Date
Andrew Trick	aca8fb3c45	LSR fix: "Special" users are just like "Basic" users but allow -1 scale. llvm-svn: 158536	2012-06-15 20:07:26 +00:00
Bill Wendling	4fd966347a	Remove assignments which aren't used afterwards. llvm-svn: 158535	2012-06-15 19:30:42 +00:00
Pete Cooper	e24d6a19e3	Allow SROA to split up an array of vectors into multiple vectors, even when the vectors are dynamically indexed llvm-svn: 158529	2012-06-15 18:07:29 +00:00
Rafael Espindola	1821c6c3b0	Some optimizations done by globalopt are safe only for internal linkage, not linkonce linkage. For example, it is not valid to add unnamed_addr. This also fixes a crash in g++.dg/opt/static5.C. llvm-svn: 158528	2012-06-15 18:00:24 +00:00
Jakob Stoklund Olesen	a15a224db0	Preserve <undef> flags in ARMExpandPseudo. This probably mostly shows up in bugpoint-generated code. llvm-svn: 158527	2012-06-15 17:46:54 +00:00
Jakob Stoklund Olesen	5767ad727c	Use regunit liveness in RegisterCoalescer when it is available. We only do very limited physreg coalescing now, but we still merge virtual registers into reserved registers. llvm-svn: 158526	2012-06-15 17:36:48 +00:00
Rafael Espindola	768b41c17a	Factor macro argument parsing into helper methods and add support for .irp. Patch extracted from a larger one by the PaX team. I added the testcases and tightened error handling a bit. llvm-svn: 158523	2012-06-15 14:02:34 +00:00
Duncan Sands	7838603ffc	Fix issues (infinite loop and/or crash) with self-referential instructions, for example degenerate phi nodes and binops that use themselves in unreachable code. Thanks to Charles Davis for the testcase that uncovered this can of worms. llvm-svn: 158508	2012-06-15 08:37:50 +00:00
Craig Topper	11913052d6	Move AVX version of convert instructions that write to GPRs to the Op1 table. llvm-svn: 158497	2012-06-15 07:02:58 +00:00
Marshall Clow	bfb85e676c	Had a closing brace inside an #ifdef -- oops! llvm-svn: 158485	2012-06-15 01:15:47 +00:00
Marshall Clow	71757ef3ed	Adding accessors to COFFObjectFile so that clients can get at the (non-generic) bits llvm-svn: 158484	2012-06-15 01:08:25 +00:00
Pete Cooper	1d1fa72837	Recommit r158407: Allow SROA to look at a vector type and see if the offset is out of range to be replaced with a scalar access. Now with additional fix and test for indexing into a vector inside a struct llvm-svn: 158479	2012-06-14 23:53:53 +00:00
Rafael Espindola	def1b09be2	Implement the isSafeToDiscardIfUnused predicate and use it in globalopt and globaldce. Globaldce was already removing linkonce globals, but globalopt was not. llvm-svn: 158476	2012-06-14 22:48:13 +00:00
Pete Cooper	8bbce768d8	Move X86::VCVTTSD2SIrr from the 2 operand to 1 operand MemRegOp table. Can someone with more knowledge of this please look at other entries to see if others need moved. llvm-svn: 158474	2012-06-14 22:12:58 +00:00
Akira Hatanaka	5fd22485a3	Fix coding style violations. Remove white spaces and tabs. llvm-svn: 158471	2012-06-14 21:10:56 +00:00
Akira Hatanaka	d8ab16b86f	1. introduce MipsPat in place of Pat in order to exclude those from being used by Mips16 or Micro Mips 2. clean up a few lines too long encountered Patch by Reed Kotler. llvm-svn: 158470	2012-06-14 21:03:23 +00:00
Akira Hatanaka	1b420ac4c8	Make machine verifier check the first instruction of the last bundle instead of the last instruction of a basic block. llvm-svn: 158468	2012-06-14 20:51:13 +00:00
Lang Hames	a33db65bd9	Make comment slightly more helpful. llvm-svn: 158467	2012-06-14 20:37:15 +00:00
Pete Cooper	5d19452f3f	Revert r158454: Allow SROA to look at a vector type... It's breaking the vectorise buildbot. This reverts commit 12c1f86ffa731e2952c80d2cc577000c96b8962c. llvm-svn: 158462	2012-06-14 18:32:52 +00:00
Andrew Trick	45877fa011	misched: disable SSA check pending PR13112. llvm-svn: 158461	2012-06-14 17:48:49 +00:00
Pete Cooper	a7e6d58a87	Recommit r158407: Allow SROA to look at a vector type and see if the offset is out of range to be replaced with a scalar access. Now with additional fix and test for indexing into a vector inside a struct llvm-svn: 158454	2012-06-14 16:38:13 +00:00
NAKAMURA Takumi	27bdc671ed	MipsLongBranch.cpp: Tweak llvm::next() to appease msvc. llvm-svn: 158446	2012-06-14 12:29:48 +00:00
Richard Barton	b0ec375b96	Replace assertion failure for badly formatted CPS instruction with error message. llvm-svn: 158445	2012-06-14 10:48:04 +00:00
Jush Lu	ac96b764ea	Cleanup whitespace. llvm-svn: 158443	2012-06-14 06:08:19 +00:00
Manman Ren	c2bc2d106b	InstCombine: fix a bug when combining (fcmp cc0 x, y) && (fcmp cc1 x, y). uno && ueq was converted to ueq, it should be converted to uno. llvm-svn: 158441	2012-06-14 05:57:42 +00:00
Akira Hatanaka	d74b1c1a48	Fix Mips/CMakeLists.txt. llvm-svn: 158437	2012-06-14 01:23:55 +00:00
Akira Hatanaka	a215929d5f	Add file MipsLongBranch.cpp. llvm-svn: 158436	2012-06-14 01:22:24 +00:00
Akira Hatanaka	a1b142f97c	Remove code in MipsAsmPrinter and MipsMCInstLower. llvm-svn: 158434	2012-06-14 01:20:12 +00:00
Akira Hatanaka	eb36522a4d	Add long branch expansion pass for MIPS. llvm-svn: 158433	2012-06-14 01:19:35 +00:00
Akira Hatanaka	64f8df28ed	Add AT to the list of registers clobbered by branches so that it is available as a scratch register when they are expanded to long branches. llvm-svn: 158432	2012-06-14 01:17:59 +00:00
Akira Hatanaka	194a8773ea	In MipsRegisterInfo::eliminateFrameIndex, call Mips::loadImmediate to load an immediate that does not fit into 16-bit. llvm-svn: 158431	2012-06-14 01:17:36 +00:00
Akira Hatanaka	2372c8bb5f	In MipsFrameLowering::emitPrologue and emitEpilogue, call Mips::loadImmediate to load an immediate that does not fit into 16-bit. Also, take into consideration the global base register slot on the stack when computing the stack size. llvm-svn: 158430	2012-06-14 01:17:13 +00:00
Akira Hatanaka	acd1a7dc68	Define function MipsInstrInfo::GetInstSizeInBytes, which will be called to compute the size of basic blocks in a function. Also, define a function which emits a series of instructions to load an immediate. llvm-svn: 158429	2012-06-14 01:16:45 +00:00
Akira Hatanaka	0c76448471	In MipsISelDAGToDAG.cpp, store the global base register to a stack frame object. Long-branches need access to the global base register to get the destination address. llvm-svn: 158428	2012-06-14 01:16:15 +00:00
Akira Hatanaka	51c70c62cf	Add methods to MipsFunctionInfo for initializing and accessing the stack frame object for the global base register. This is the first of a series of patches which implements long branch expansion for MIPS. llvm-svn: 158427	2012-06-14 01:15:36 +00:00
Akira Hatanaka	5ac78681c1	Bundle jump/branch instructions with the instructions in the delay slot in delay slot filler pass of MIPS, per suggestion of Jakob Stoklund Olesen. This change, along with the fix in r158154, enables machine verification to be run after delay slot filling. llvm-svn: 158426	2012-06-13 23:25:52 +00:00
Akira Hatanaka	df5205ef3d	Implement a DAGCombine in MipsISelLowering.cpp which transforms the following pattern: (add v0, (add v1, abs_lo(tjt))) => (add (add v0, v1), abs_lo(tjt)) "tjt" is a TargetJumpTable node. llvm-svn: 158419	2012-06-13 20:33:18 +00:00
Akira Hatanaka	1daf8c2a16	Set a higher value for maxStoresPerMemcpy in MipsISelLowering.cpp. llvm-svn: 158414	2012-06-13 19:33:32 +00:00
Akira Hatanaka	9586618c58	Simplify CreateLoadLR and CreateStoreLR in MipsISelLowering.cpp. llvm-svn: 158413	2012-06-13 19:06:08 +00:00
Akira Hatanaka	f0273603f5	Implement fastcc calling convention for MIPS. llvm-svn: 158410	2012-06-13 18:06:00 +00:00
Richard Osborne	ab7d788eb5	Fix pattern for MKMSK instruction. llvm-svn: 158409	2012-06-13 17:59:12 +00:00
Pete Cooper	e2fe809772	Revert "Allow SROA to look at a vector type and see if the offset is out of range to be replaced with a scalar access" This reverts commit 51786e0aaec76b973205066bd44f7f427b21969f. llvm-svn: 158408	2012-06-13 17:55:22 +00:00
Pete Cooper	e1d4e8b563	Allow SROA to look at a vector type and see if the offset is out of range to be replaced with a scalar access llvm-svn: 158407	2012-06-13 17:30:34 +00:00
Argyrios Kyrtzidis	444fd42634	Fix building ThreadLocal.cpp with --disable-threads. llvm-svn: 158405	2012-06-13 16:30:06 +00:00
Kay Tiong Khoo	f294921e24	*typo: Cyles changed to Cycles llvm-svn: 158404	2012-06-13 15:53:04 +00:00
Duncan Sands	409d8ae165	It is possible for several constants which aren't individually absorbing to combine to the absorbing element. Thanks to nbjoerg on IRC for pointing this out. llvm-svn: 158399	2012-06-13 12:15:56 +00:00
Duncan Sands	318a89ddac	When linearizing a multiplication, return at once if we see a factor of zero, since then the entire expression must equal zero (similarly for other operations with an absorbing element). With this in place a bunch of reassociate code for handling constants is dead since it is all taken care of when linearizing. No intended functionality change. llvm-svn: 158398	2012-06-13 09:42:13 +00:00
Craig Topper	71dc02d659	Fix intrinsics for XOP frczss/sd instructions. These instructions only take one source register and zero the upper bits of the destination rather than preserving them. llvm-svn: 158396	2012-06-13 07:18:53 +00:00
Hal Finkel	9898614854	Add another missing 64-bit itinerary definition for the PPC A2 core. llvm-svn: 158393	2012-06-13 05:55:09 +00:00
Manman Ren	d33f4efbfd	SimplifyCFG: fold unconditional branch to its predecessor if profitable. This patch extends FoldBranchToCommonDest to fold unconditional branches. For unconditional branches, we fold them if it is easy to update the phi nodes in the common successors. rdar://10554090 llvm-svn: 158392	2012-06-13 05:43:29 +00:00
Jakob Stoklund Olesen	1c66b87f7d	Eliminate struct TableGenBackend. TableGen backends are simply written as functions now. Patch by Sean Silva! llvm-svn: 158389	2012-06-13 05:15:49 +00:00
Akira Hatanaka	21371766d1	Clean up trailing blanks in Mips16InstrFormats.td Patch by Reed Kotler. llvm-svn: 158382	2012-06-13 02:42:47 +00:00
Akira Hatanaka	5fa541231b	disable use of directive .set nomicromips until this directive is pushed in gas to open source fsf Patch by Reed Kotler. llvm-svn: 158381	2012-06-13 02:41:14 +00:00
Andrew Trick	344fb64fa3	sched: fix latency of memory dependence chain edges for consistency. For store->load dependencies that may alias, we should always use TrueMemOrderLatency, which may eventually become a subtarget hook. In effect, we should guarantee at least TrueMemOrderLatency on at least one DAG path from a store to a may-alias load. This should fix the standard mode as well as -enable-aa-sched-mi. llvm-svn: 158380	2012-06-13 02:39:03 +00:00
Andrew Trick	5b90645abb	sched: Avoid trivially redundant DAG edges. Take the one with higher latency. llvm-svn: 158379	2012-06-13 02:39:00 +00:00
Akira Hatanaka	3fe00f29ad	1. fix places where immed is used in place of imm to be consistent with non mips16 2. fix some comments to change OPcode->EXTEND for extended instructions Patch by Reed Kotler. llvm-svn: 158378	2012-06-13 02:37:54 +00:00
Hal Finkel	79c39da135	Add some missing 64-bit itinerary definitions for the PPC A2 core. llvm-svn: 158373	2012-06-12 20:32:29 +00:00
Duncan Sands	72aea01b6e	Use DenseMap as SmallMap workaround rather than std::map, at Chandler's request. llvm-svn: 158371	2012-06-12 20:26:43 +00:00
Duncan Sands	67cd591989	Use std::map rather than SmallMap because SmallMap assumes that the value has POD type, causing memory corruption when mapping to APInts with bitwidth > 64. Merge another crash testcase into crash.ll while there. llvm-svn: 158369	2012-06-12 20:16:51 +00:00
Chad Rosier	c6916f88a8	[arm-fast-isel] Add support for -arm-long-calls. Patch by Jush Lu <jush.msn@gmail.com>. llvm-svn: 158368	2012-06-12 19:25:13 +00:00
Hal Finkel	8c33dde666	Split out the PPC instruction class IntSimple from IntGeneral. On the POWER7, adds and logical operations can also be handled in the load/store pipelines. We'll call these IntSimple. llvm-svn: 158366	2012-06-12 19:01:24 +00:00
Hal Finkel	f1cc96ab50	Fixes for PPC host detection and features. POWER4 is a 64-bit CPU (better matched to the 970). The g3 is really the 750 (no altivec), the g4+ is the 74xx (not the 750). Patch by Andreas Tobler. llvm-svn: 158363	2012-06-12 16:39:23 +00:00
Duncan Sands	d7aeefebd6	Now that Reassociate's LinearizeExprTree can look through arbitrary expression topologies, it is quite possible for a leaf node to have huge multiplicity, for example: x0 = x*x, x1 = x0*x0, x2 = x1*x1, ... rapidly gives a value which is x raised to a vast power (the multiplicity, or weight, of x). This patch fixes the computation of weights by correctly computing them no matter how big they are, rather than just overflowing and getting a wrong value. It turns out that the weight for a value never needs more bits to represent than the value itself, so it is enough to represent weights as APInts of the same bitwidth and do the right overflow-avoiding dance steps when computing weights. As a side-effect it reduces the number of multiplies needed in some cases of large powers. While there, in view of external uses (eg by the vectorizer) I made LinearizeExprTree static, pushing the rank computation out into users. This is progress towards fixing PR13021. llvm-svn: 158358	2012-06-12 14:33:56 +00:00
Hal Finkel	59b0ee8a56	Reapply r158337, this time properly protect Darwin/PPC host CPU use with __ppc__. Original commit message: Move PPC host-CPU detection logic from PPCSubtarget into sys::getHostCPUName(). Both the new Linux functionality and the old Darwin functions have been moved. This change also allows this information to be queried directly by clang and other frontends (clang, for example, will now have real -mcpu=native support). llvm-svn: 158349	2012-06-12 03:03:13 +00:00
Argyrios Kyrtzidis	c6dc4d75fd	Satisfy C++ aliasing rules, per suggestion by Chandler. llvm-svn: 158346	2012-06-12 01:06:16 +00:00
Jakob Stoklund Olesen	f8f128606c	Revert r158337 "Move PPC host-CPU detection logic from PPCSubtarget into sys::getHostCPUName()." This commit broke most of the PowerPC unit tests when running on Intel/Apple. llvm-svn: 158345	2012-06-12 00:58:40 +00:00
Argyrios Kyrtzidis	8d19c86c9a	For llvm::sys::ThreadLocalImpl instead of malloc'ing the platform-specific thread local data, embed them in the class using a uint64_t and make sure we get compiler errors if there's a platform where this is not big enough. This makes ThreadLocal more safe for using it in conjunction with CrashRecoveryContext. Related to crash in rdar://11434201. llvm-svn: 158342	2012-06-12 00:21:31 +00:00
Andrew Trick	3e465fb225	misched: When querying RegisterPressureTracker, always save current and max pressure. llvm-svn: 158340	2012-06-11 23:42:23 +00:00
Andrew Trick	d054bd833a	misched: regpressure getMaxPressureDelta, revert accidental checkin. llvm-svn: 158339	2012-06-11 23:42:20 +00:00
Hal Finkel	23c699e497	Move PPC host-CPU detection logic from PPCSubtarget into sys::getHostCPUName(). Both the new Linux functionality and the old Darwin functions have been moved. This change also allows this information to be queried directly by clang and other frontends (clang, for example, will now have real -mcpu=native support). llvm-svn: 158337	2012-06-11 23:14:31 +00:00
Hal Finkel	bddc916f2b	Enable MFOCRF generation on the PPC A2 core. llvm-svn: 158324	2012-06-11 19:57:04 +00:00
Hal Finkel	bfd3d08d18	Rename the PPC target feature gpul to mfocrf. The PPC target feature gpul (IsGigaProcessor) was only used for one thing: To enable the generation of the MFOCRF instruction. Furthermore, this instruction is available on other PPC cores outside of the G5 line. This feature now corresponds to the HasMFOCRF flag. No functionality change. llvm-svn: 158323	2012-06-11 19:57:01 +00:00
Hal Finkel	25d4c568d3	Add A2 to the list of PPC CPUs recognized by Linux host CPU-type detection. llvm-svn: 158322	2012-06-11 19:56:57 +00:00
Hal Finkel	2c09058f19	Emit the two-operand form of the PPC mfcr instruction as mfocrf. This is necessary on Linux and supported on Darwin, see PR2604. llvm-svn: 158315	2012-06-11 15:43:15 +00:00
Hal Finkel	ba671c0ea7	Add local CPU detection for Linux PPC. This functionality mirrors that available on PPC/Darwin. llvm-svn: 158314	2012-06-11 15:43:13 +00:00
Hal Finkel	f2b9c38d6f	Add POWER6 and POWER7 CPU types to the PPC backend. No functional change; these will be used by upcoming scheduler enhancements. llvm-svn: 158313	2012-06-11 15:43:08 +00:00
Jakob Stoklund Olesen	e6aed139f0	Write llvm-tblgen backends as functions instead of sub-classes. The TableGenBackend base class doesn't do much, and will be removed completely soon. Patch by Sean Silva! llvm-svn: 158311	2012-06-11 15:37:55 +00:00
Bill Wendling	4b79647a6e	Re-enable the CMN instruction. We turned off the CMN instruction because it had semantics which we weren't getting correct. If we are comparing with an immediate, then it's okay to use the CMN instruction. <rdar://problem/7569620> llvm-svn: 158302	2012-06-11 08:07:26 +00:00
Benjamin Kramer	2150145ae4	InstCombine: factor code better. No functionality change. llvm-svn: 158301	2012-06-11 08:01:25 +00:00
Benjamin Kramer	8b8a76974f	InstCombine: Turn (zext A) == (B & (1<<X)-1) into A == (trunc B), narrowing the compare. This saves a cast, and zext is more expensive on platforms with subreg support than trunc is. This occurs in the BSD implementation of memchr(3), see PR12750. On the synthetic benchmark from that bug stupid_memchr and bsd_memchr have the same performance now when not inlining either function. stupid_memchr: 323.0us bsd_memchr: 321.0us memchr: 479.0us where memchr is the llvm-gcc compiled bsd_memchr from osx lion's libc. When inlining is enabled bsd_memchr still regresses down to llvm-gcc memchr time, I haven't fully understood the issue yet, something is grossly mangling the loop after inlining. llvm-svn: 158297	2012-06-10 20:35:00 +00:00
Hal Finkel	4e9f1a859f	Enable ILP scheduling for all nodes by default on PPC. Over the entire test-suite, this has an insignificantly negative average performance impact, but reduces some of the worst slowdowns from the anti-dep. change (r158294). Largest speedups: SingleSource/Benchmarks/Stanford/Quicksort - 28% SingleSource/Benchmarks/Stanford/Towers - 24% SingleSource/Benchmarks/Shootout-C++/matrix - 23% MultiSource/Benchmarks/SciMark2-C/scimark2 - 19% MultiSource/Benchmarks/MiBench/automotive-bitcount/automotive-bitcount - 15% (matrix and automotive-bitcount were both in the top-5 slowdown list from the anti-dep. change) Largest slowdowns: MultiSource/Benchmarks/McCat/03-testtrie/testtrie - 28% MultiSource/Benchmarks/mediabench/gsm/toast/toast - 26% MultiSource/Benchmarks/MiBench/automotive-susan/automotive-susan - 21% SingleSource/Benchmarks/CoyoteBench/lpbench - 20% MultiSource/Applications/d/make_dparser - 16% llvm-svn: 158296	2012-06-10 19:32:29 +00:00
Nadav Rotem	17ee58a792	Add AutoUpgrade support for the SSE4 ptest intrinsics. Patch by Michael Kuperstein. llvm-svn: 158295	2012-06-10 18:42:51 +00:00
Hal Finkel	a8100281ae	Use critical anti-dep. breaking on all PPC targets, but also add other register classes. Using 'all' instead of 'critical' would be better because it would make it easier to satisfy the bundling constraints, but, as noted in the FIXME, that is currently not possible with the crs. This yields an average 1% speedup over the entire test suite (on Power 7). Largest speedups: SingleSource/Benchmarks/Shootout-C++/moments - 40% MultiSource/Benchmarks/McCat/03-testtrie/testtrie - 28% SingleSource/Benchmarks/BenchmarkGame/nsieve-bits - 26% SingleSource/Benchmarks/McGill/misr - 23% MultiSource/Applications/JM/ldecod/ldecod - 22% Largest slowdowns: SingleSource/Benchmarks/Shootout-C++/matrix - -29% SingleSource/Benchmarks/Shootout-C++/ary3 - -22% MultiSource/Benchmarks/BitBench/uuencode/uuencode - -18% SingleSource/Benchmarks/Shootout-C++/ary - -17% MultiSource/Benchmarks/MiBench/automotive-bitcount/automotive-bitcount - -15% llvm-svn: 158294	2012-06-10 11:15:36 +00:00
Craig Topper	7afe343be5	Add intrinsics for immediate form of XOP vprot instructions. Use i128mem instead of f128mem for integer XOP instructions. llvm-svn: 158291	2012-06-10 07:31:56 +00:00
Hal Finkel	2edfbddcf0	Improve ext/trunc patterns on PPC64. The PPC64 backend had patterns for i32 <-> i64 extensions and truncations that would leave self-moves in the final assembly. Replacing those patterns with ones based on the SUBREG builtins yields better-looking code. Thanks to Jakob and Owen for their suggestions in this matter. llvm-svn: 158283	2012-06-09 22:10:19 +00:00
Craig Topper	a54893c662	Use XOP vpcom intrinsics in patterns instead of a target specific SDNode type. Remove the custom lowering code that selected the SDNode type. llvm-svn: 158279	2012-06-09 17:02:24 +00:00
Craig Topper	3352ba55b9	Replace XOP vpcom intrinsics with fewer intrinsics that take the immediate as an argument. llvm-svn: 158278	2012-06-09 16:46:13 +00:00
Aaron Ballman	36a978cca2	Disabling a spurious deprecation warning about using PathV1 from within the PathV1 implementation file. llvm-svn: 158274	2012-06-09 13:59:29 +00:00
Aaron Ballman	503bbff367	Fixing a typo in the comments. llvm-svn: 158273	2012-06-09 13:46:36 +00:00
Benjamin Kramer	0748008df5	Allocate the contents of DwarfDebug's StringMaps in a single big BumpPtrAllocator. llvm-svn: 158265	2012-06-09 10:34:15 +00:00
Duncan Sands	556eab8878	Silence a gcc-4.6 warning: GCC fails to understand that secondReg and cmpOp2 are correlated, and thinks that cmpOp2 may be used uninitialized. llvm-svn: 158263	2012-06-09 10:04:03 +00:00
Hal Finkel	eb50c2d4a4	Enable tail merging on PPC. Tail merging had been disabled on PPC because it would disturb bundling decisions made during pre-RA scheduling on the 970 cores. Now, however, all bundling decisions are made during post-RA scheduling, and tail merging is generally beneficial (the average test-suite speedup is insignificantly positive). Largest test-suite speedups: MultiSource/Benchmarks/mediabench/gsm/toast/toast - 30% MultiSource/Benchmarks/BitBench/uuencode/uuencode - 23% SingleSource/Benchmarks/Shootout-C++/ary - 21% SingleSource/Benchmarks/Stanford/Queens - 17% Largest slowdowns: MultiSource/Benchmarks/MiBench/security-sha/security-sha - 24% MultiSource/Benchmarks/McCat/03-testtrie/testtrie - 22% MultiSource/Applications/JM/ldecod/ldecod - 14% MultiSource/Benchmarks/mediabench/g721/g721encode/encode - 9% This is improved by using full (instead of just critical) anti-dependency breaking, but doing so still causes miscompiles and so cannot yet be enabled by default. llvm-svn: 158259	2012-06-09 03:14:50 +00:00
Andrew Trick	fc8ce08be3	Register pressure: added getPressureAfterInstr. llvm-svn: 158256	2012-06-09 02:16:58 +00:00
Jakob Stoklund Olesen	c26fbbfba5	Sketch a LiveRegMatrix analysis pass. The LiveRegMatrix represents the live range of assigned virtual registers in a Live interval union per register unit. This is not fundamentally different from the interference tracking in RegAllocBase that both RABasic and RAGreedy use. The important differences are: - LiveRegMatrix tracks interference per register unit instead of per physical register. This makes interference checks cheaper and assignments slightly more expensive. For example, the ARM D7 register has 24 aliases, so we would check 24 physregs before assigning to one. With unit-based interference, we check 2 units before assigning to 2 units. - LiveRegMatrix caches regmask interference checks. That is currently duplicated functionality in RABasic and RAGreedy. - LiveRegMatrix is a pass which makes it possible to insert target-dependent passes between register allocation and rewriting. Such passes could tweak the register assignments with interference checking support from LiveRegMatrix. Eventually, RABasic and RAGreedy will be switched to LiveRegMatrix. llvm-svn: 158255	2012-06-09 02:13:10 +00:00
Jack Carter	2db37e8226	Test commit llvm-svn: 158250	2012-06-09 00:27:55 +00:00
Jakob Stoklund Olesen	be336295cd	Also compute MBB live-in lists in the new rewriter pass. This deduplicates some code from the optimizing register allocators, and it means that it is now possible to change the register allocators' solutions simply by editing the VirtRegMap between the register allocator pass and the rewriter. llvm-svn: 158249	2012-06-09 00:14:47 +00:00
Dmitri Gribenko	dbeafa773a	Convert comments to proper Doxygen comments. llvm-svn: 158248	2012-06-09 00:01:45 +00:00
Jakob Stoklund Olesen	1224312f5b	Reintroduce VirtRegRewriter. OK, not really. We don't want to reintroduce the old rewriter hacks. This patch extracts virtual register rewriting as a separate pass that runs after the register allocator. This is possible now that CodeGen/Passes.cpp can configure the full optimizing register allocator pipeline. The rewriter pass uses register assignments in VirtRegMap to rewrite virtual registers to physical registers, and it inserts kill flags based on live intervals. These finalization steps are the same for the optimizing register allocators: RABasic, RAGreedy, and PBQP. llvm-svn: 158244	2012-06-08 23:44:45 +00:00
Nuno Lopes	2710f1b049	canonicalize: -%a + 42 into 42 - %a previously we were emitting: -(%a + 42) This fixes the infinite loop in PR12338. The generated code is still not perfect, though. Will work on that next llvm-svn: 158237	2012-06-08 22:30:05 +00:00
Evan Cheng	c5adccab1a	Start implementing pre-ra if-converter: using speculation and selects to eliminate branches. llvm-svn: 158234	2012-06-08 21:53:50 +00:00
Andrew Trick	423fa6faee	TargetInstrInfo hooks implemented in codegen should be declared pure virtual. llvm-svn: 158233	2012-06-08 21:52:38 +00:00
Duncan Sands	3293f460e7	Reapply commit 158073 with a fix (the testcase was already committed). The problem was that by moving instructions around inside the function, the pass could accidentally move the iterator being used to advance over the function too. Fix this by only processing the instruction equal to the iterator, and leaving processing of instructions that might not be equal to the iterator to later (later = after traversing the basic block; it could also wait until after traversing the entire function, but this might make the sets quite big). Original commit message: Grab-bag of reassociate tweaks. Unify handling of dead instructions and instructions to reoptimize. Exploit this to more systematically eliminate dead instructions (this isn't very useful in practice but is convenient for analysing some testcase I am working on). No need for WeakVH any more: use an AssertingVH instead. llvm-svn: 158226	2012-06-08 20:15:33 +00:00
Hal Finkel	41e6fd1df9	Remove the TODO statement in the PPC README re: CTR loops As Chris points out, this can now be removed! TODO: check if the associated section on viterbi's inner loop can also be removed. llvm-svn: 158224	2012-06-08 20:02:09 +00:00
Hal Finkel	c6b5debb40	Enable PPC CTR loop formation by default. Thanks to Jakob's help, this now causes no new test suite failures! Over the entire test suite, this gives an average 1% speedup. The largest speedups are: SingleSource/Benchmarks/Misc/pi - 108% SingleSource/Benchmarks/CoyoteBench/lpbench - 54% MultiSource/Benchmarks/Prolangs-C/unix-smail/unix-smail - 50% SingleSource/Benchmarks/Shootout/ary3 - 32% SingleSource/Benchmarks/Shootout-C++/matrix - 30% The largest slowdowns are: MultiSource/Benchmarks/mediabench/gsm/toast/toast - -30% MultiSource/Benchmarks/Prolangs-C/bison/mybison - -25% MultiSource/Benchmarks/BitBench/uuencode/uuencode - -22% MultiSource/Applications/d/make_dparser - -14% SingleSource/Benchmarks/Shootout-C++/ary - -13% In light of these slowdowns, additional profiling work is obviously needed! llvm-svn: 158223	2012-06-08 19:19:53 +00:00
Hal Finkel	3d32ad3a7f	Mark the PPC CTRRC and CTRRC8 register classes as non-allocatable. Marking these classes as non-allocatable allows CTR loop generation to work correctly with the block placement passes, etc. These register classes are currently used only by some unused TCRETURN patterns. In future cleanup, these will be removed. Thanks again to Jakob for suggesting this fix to the CTR loop problem! llvm-svn: 158221	2012-06-08 19:02:08 +00:00
Manman Ren	6bc2d27073	Enable optimization for integer ABS on X86 if Subtarget has CMOV. llvm-svn: 158220	2012-06-08 18:58:26 +00:00
Chad Rosier	3d464d8068	Fix a crash in APInt::lshr when shiftAmt > BitWidth. Patch by James Benton <jbenton@vmware.com>. llvm-svn: 158213	2012-06-08 18:04:52 +00:00
Andrew Trick	596af1b02e	Fix Target->Codegen dependence. Bulk move of TargetInstrInfo implementation into TargetInstrInfoImpl. This is dirty because the code isn't part of TargetInstrInfoImpl class, nor should it be, because the methods are not target hooks. However, it's the current mechanism for keeping libTarget useful outside the backend. You'll get a not-so-nice link error if you invoke a TargetInstrInfo method that depends on CodeGen. The TargetInstrInfoImpl class should probably be removed since it doesn't really solve this problem. To really fix this, we probably need separate interfaces for the CodeGen/nonCodeGen sides of TargetInstrInfo. llvm-svn: 158212	2012-06-08 17:23:27 +00:00
Nuno Lopes	4b68c1da54	BoundsChecking: add support for ConstantPointerNull. fixes a bunch of instrumentation failures in loops with reallocs llvm-svn: 158210	2012-06-08 16:31:42 +00:00
Hal Finkel	821e00121c	Disable the PPC CTR-Loops pass by default. The pass itself works well, but something in the Machine* infrastructure does not understand terminators which define registers. Without the ability to use the block-placement pass, etc. this causes performance regressions (and so is turned off by default). Turning off the analysis turns off the problems with the Machine* infrastructure. llvm-svn: 158206	2012-06-08 15:38:25 +00:00
Hal Finkel	8b01503ee5	Fix a bug in the new PPC CTR-Loops pass. The code which tests for an induction operation cannot assume that any ADDI instruction will have a register operand because the operand could also be a frame index; for example: %vreg16<def> = ADDI8 <fi#0>, 0; G8RC:%vreg16 llvm-svn: 158205	2012-06-08 15:38:23 +00:00
Hal Finkel	96c2d4d945	Add the PPCCTRLoops pass: a PPC machine-code-level optimization pass to form CTR-based loop branching code. This pass is derived from the Hexagon HardwareLoops pass. The only significant enhancement over the Hexagon pass is that PPCCTRLoops will also attempt to delete the replaced add and compare operations if they are no longer otherwise used. Also, invalid preheader DebugLoc is not used. llvm-svn: 158204	2012-06-08 15:38:21 +00:00
Duncan Sands	9a5cf92250	Revert commit 158073 while waiting for a fix. The issue is that reassociate can move instructions within the instruction list. If the instruction just happens to be the one the basic block iterator is pointing to, and it is moved to a different basic block, then we get into an infinite loop due to the iterator running off the end of the basic block (for some reason this doesn't fire any assertions). Original commit message: Grab-bag of reassociate tweaks. Unify handling of dead instructions and instructions to reoptimize. Exploit this to more systematically eliminate dead instructions (this isn't very useful in practice but is convenient for analysing some testcase I am working on). No need for WeakVH any more: use an AssertingVH instead. llvm-svn: 158199	2012-06-08 13:37:30 +00:00
Manman Ren	2cdc8afccf	X86: optimize generated code for integer ABS This patch will generate the following for integer ABS: movl %edi, %eax negl %eax cmovll %edi, %eax INSTEAD OF movl %edi, %ecx sarl $31, %ecx leal (%rdi,%rcx), %eax xorl %ecx, %eax There exists a target-independent DAG combine for integer ABS, which converts integer ABS to sar+add+xor. For X86, we match this pattern back to neg+cmov. This is implemented in PerformXorCombine. rdar://10695237 llvm-svn: 158175	2012-06-07 22:39:10 +00:00
Nadav Rotem	bbd40f67d8	Do not optimize the used bits of the x86 vselect condition operand, when the condition operand is a vector of 1-bit predicates. This may happen on MIC devices. llvm-svn: 158168	2012-06-07 20:53:48 +00:00
Nadav Rotem	4e50efead6	Fix a bug in FoldSelectOpOp. Bitcast ops may change the number of vector elements, which may disagree with the select condition type. llvm-svn: 158166	2012-06-07 20:28:57 +00:00
Andrew Trick	a5d24ca453	Continue factoring computeOperandLatency. Use it for ARM hasHighOperandLatency. llvm-svn: 158164	2012-06-07 19:42:04 +00:00
Andrew Trick	5b1cadf9f7	ARM getOperandLatency rewrite. Match expectations of the new latency API. Cleanup and make the logic consistent. llvm-svn: 158163	2012-06-07 19:42:00 +00:00
Andrew Trick	3564bdfa61	ARM getOperandLatency should return -1 for unknown, consistent with API llvm-svn: 158162	2012-06-07 19:41:58 +00:00
Andrew Trick	fb1a74c2b2	Fix ARM getInstrLatency logic to work with the current API. llvm-svn: 158161	2012-06-07 19:41:55 +00:00
Manman Ren	746e4859d0	PR13046: we can't replace usage of SUB with CMP in the lowering phase. It will cause assertion failure later on. llvm-svn: 158160	2012-06-07 19:27:33 +00:00
Rafael Espindola	55d1145bd5	Use a base register instead of an index register with the local dynamic model. Fixes pr13048. llvm-svn: 158158	2012-06-07 18:39:19 +00:00
Pete Cooper	cd72016cab	Move terminator machine verification to check MachineBasicBlock::instr_iterator instead of MBB::iterator llvm-svn: 158154	2012-06-07 17:41:39 +00:00
Manman Ren	ae02c5a93e	X86: replace SUB with CMP if possible This patch will optimize the following movq %rdi, %rax subq %rsi, %rax cmovsq %rsi, %rdi movq %rdi, %rax to cmpq %rsi, %rdi cmovsq %rsi, %rdi movq %rdi, %rax Perform this optimization if the actual result of SUB is not used. rdar: 11540023 llvm-svn: 158126	2012-06-07 00:42:47 +00:00
Manman Ren	9c9641812c	Revert r157755. The commit is intended to fix rdar://11540023. It is implemented as part of peephole optimization. We can actually implement this in the SelectionDAG lowering phase. llvm-svn: 158122	2012-06-06 23:53:03 +00:00
Jakob Stoklund Olesen	00e7dffefb	Properly verify liveness with bundled machine instructions. Bundles should be treated as one atomic transaction when checking liveness. That is how the register allocator (and VLIW targets) treats bundles. llvm-svn: 158116	2012-06-06 22:34:30 +00:00
Benjamin Kramer	3f87e3b707	Add accessors for all private members of DisasmContext. LLVM should be -Wunused-private-field clean now. llvm-svn: 158103	2012-06-06 20:45:10 +00:00
Andrew Trick	05ff4667eb	Move RegisterClassInfo.h. Allow targets to access this API. It's required for RegisterPressure. llvm-svn: 158102	2012-06-06 20:29:31 +00:00
Andrew Trick	88517f608c	Move RegisterPressure.h. Make it a general utility for use by Targets. llvm-svn: 158097	2012-06-06 19:47:35 +00:00
Benjamin Kramer	009b1c1cf1	Round 2 of dead private variable removal. LLVM is now -Wunused-private-field clean except for - lib/MC/MCDisassembler/Disassembler.h. Not sure why it keeps all those unaccessible fields. - gtest. llvm-svn: 158096	2012-06-06 19:47:08 +00:00
Benjamin Kramer	628a39faa3	Remove unused private fields found by clang's new -Wunused-private-field. There are some that I didn't remove this round because they looked like obvious stubs. There are dead variables in gtest too, they should be fixed upstream. llvm-svn: 158090	2012-06-06 18:25:08 +00:00
Chad Rosier	5d6f01ad77	Add support for dynamic stack realignment in the presence of dynamic allocas on X86. rdar://11496434 llvm-svn: 158087	2012-06-06 17:37:40 +00:00
Chad Rosier	faa3894628	Fix combine of uno && ord -> false so that the ordering of the fcmps doesn't matter. rdar://11579835 llvm-svn: 158084	2012-06-06 17:22:40 +00:00
Jakob Stoklund Olesen	f435b1867d	Remove dead debug option -disable-rematerialization. Remat has been stable for years, and it isn't done by LiveIntervalAnalysis any longer. (See LiveRangeEdit). llvm-svn: 158079	2012-06-06 16:22:41 +00:00
Duncan Sands	763da45e9e	Grab-bag of reassociate tweaks. Unify handling of dead instructions and instructions to reoptimize. Exploit this to more systematically eliminate dead instructions (this isn't very useful in practice but is convenient for analysing some testcase I am working on). No need for WeakVH any more: use an AssertingVH instead. llvm-svn: 158073	2012-06-06 14:53:10 +00:00
Benjamin Kramer	3de5d40f4d	Stop leaking RegScavengers from TailDuplication. llvm-svn: 158069	2012-06-06 13:53:41 +00:00
Richard Barton	f1ef87ddbb	Correct decoder for T1 conditional B encoding llvm-svn: 158055	2012-06-06 09:12:53 +00:00
Craig Topper	bf2409e8aa	Mark several instructions SSE2 instead of SSE3 as they should be. llvm-svn: 158049	2012-06-06 06:45:27 +00:00
Jakob Stoklund Olesen	c141ba584e	Move LiveUnionArray into LiveIntervalUnion.h It is useful outside RegAllocBase. llvm-svn: 158041	2012-06-05 23:57:30 +00:00
Jakob Stoklund Olesen	46d229c573	Don't print register names in LiveIntervalUnion::print(). Soon we'll be making LiveIntervalUnions for register units as well. This was the only place using the RepReg member, so just remove it. llvm-svn: 158038	2012-06-05 23:07:19 +00:00
Matt Beaumont-Gay	7ba769bedd	Suppress -Wunused-variable in -Asserts build llvm-svn: 158037	2012-06-05 23:00:03 +00:00
Jakob Stoklund Olesen	f3f7d6f6e2	Simplify LiveInterval::print(). Don't print out the register number and spill weight, making the TRI argument unnecessary. This allows callers to interpret the reg field. It can currently be a virtual register, a physical register, a spill slot, or a register unit. llvm-svn: 158031	2012-06-05 22:51:54 +00:00
Jakob Stoklund Olesen	12e03dae44	Add experimental support for register unit liveness. Instead of computing a live interval per physreg, LiveIntervals can compute live intervals per register unit. This makes impossible the confusing situation where aliasing registers could have overlapping live intervals. It should also make fixed interference checking cheaper since registers have fewer register units than aliases. Live intervals for regunits are computed on demand, using MRI use-def chains and the new LiveRangeCalc class. Only regunits live in to ABI blocks are precomputed during LiveIntervals::runOnMachineFunction(). The regunit liveness computations don't depend on LiveVariables. llvm-svn: 158029	2012-06-05 22:02:15 +00:00
Jakob Stoklund Olesen	989b3b1516	Implement LiveRangeCalc::extendToUses() and createDeadDefs(). These LiveRangeCalc methods are to be used when computing a live range from scratch. llvm-svn: 158027	2012-06-05 21:54:09 +00:00
Andrew Trick	4b037005d2	MachineInstr::eraseFromParent fix for removing bundled instrs. Patch by Ivan Llopard. llvm-svn: 158025	2012-06-05 21:44:23 +00:00
Andrew Trick	4544606c71	misched: API for minimum vs. expected latency. Minimum latency determines per-cycle scheduling groups. Expected latency determines critical path and cost. llvm-svn: 158021	2012-06-05 21:11:27 +00:00
Lang Hames	a59100cc08	Add a new intrinsic: llvm.fmuladd. This intrinsic represents a multiply-add expression (a * b + c) that can be implemented as a fused multiply-add (fma) if the target determines that this will be more efficient. This intrinsic will be used to implement FP_CONTRACT support and an aggressive FMA formation mode. If your target has a fast FMA instruction you should override the isFMAFasterThanMulAndAdd method in TargetLowering to return true. llvm-svn: 158014	2012-06-05 19:07:46 +00:00
Yuan Lin	572a3a2cce	Fix header file include order in NVPTX backend NV_CONTRIB llvm-svn: 158013	2012-06-05 19:06:13 +00:00
Andrew Trick	a6fb910fad	LoopUnroll: always check for NULL LoopPassManager llvm-svn: 158007	2012-06-05 17:51:05 +00:00
Roman Divacky	c856653fb3	PPC32 uses R2 as the TLS register. Fix the copy and paste. llvm-svn: 158004	2012-06-05 17:14:17 +00:00
Andrew Trick	39a99140c7	X86 itinerary properties. llvm-svn: 157981	2012-06-05 03:44:46 +00:00
Andrew Trick	b2680c718f	ARM itinerary properties. llvm-svn: 157980	2012-06-05 03:44:43 +00:00
Andrew Trick	73d7736b17	misched: Added MultiIssueItineraries. This allows a subtarget to explicitly specify the issue width and other properties without providing pipeline stage details for every instruction. llvm-svn: 157979	2012-06-05 03:44:40 +00:00
Andrew Trick	a88d46e818	sdsched: Use the right heuristics when -mcpu is not provided and we have no itinerary. Use ILP heuristics for long latency instrs if no scoreboard exists. llvm-svn: 157978	2012-06-05 03:44:34 +00:00
Andrew Trick	ed7c96d7d9	misched: Allow disabling scoreboard hazard checking for subtargets with a valid itinerary but no pipeline stages. An itinerary can contain useful scheduling information without specifying pipeline stages for each instruction. llvm-svn: 157977	2012-06-05 03:44:32 +00:00
Andrew Trick	515f131786	whitespace llvm-svn: 157976	2012-06-05 03:44:29 +00:00
Andrew Trick	d36adece50	misched: comments from code review. llvm-svn: 157975	2012-06-05 03:44:26 +00:00
Jakob Stoklund Olesen	345528944c	Remove the last remat-related code from LiveIntervalAnalysis. Rematerialization is handled by LiveRangeEdit now. llvm-svn: 157974	2012-06-05 01:06:15 +00:00
Jakob Stoklund Olesen	9e27e2621a	Stop using LiveIntervals::isReMaterializable(). It is an old function that does a lot more than required by CalcSpillWeights, which was the only remaining caller. The isRematerializable() function never actually sets the isLoad argument, so don't try to compute that. llvm-svn: 157973	2012-06-05 01:06:12 +00:00
Joel Jones	7f2ac7a2c8	Revert commit r157966 llvm-svn: 157972	2012-06-05 00:47:21 +00:00
Joel Jones	d08534f82e	This change handles a another case for generating the bic instruction when a compile time constant is known. This occurs when implicitly zero extending function arguments from 16 bits to 32 bits. <rdar://problem/11481151> llvm-svn: 157966	2012-06-04 23:38:57 +00:00
Jakob Stoklund Olesen	188d830405	Delete dead code. llvm-svn: 157963	2012-06-04 23:01:41 +00:00
Rafael Espindola	47d988c54c	When gvn decides to replace an instruction with another, we have to patch the replacement to make it at least as generic as the instruction being replaced. This includes: * dropping nsw/nuw flags * getting the least restrictive tbaa and fpmath metadata * merging ranges Fixes PR12979. llvm-svn: 157958	2012-06-04 22:44:21 +00:00
Jakob Stoklund Olesen	11fb248aa6	Switch LiveIntervals member variable to LLVM naming standards. No functional change. llvm-svn: 157957	2012-06-04 22:39:14 +00:00
Jakob Stoklund Olesen	5ef0e0b262	Pass context pointers to LiveRangeCalc::reset(). Remove the same pointers from all the other LiveRangeCalc functions, simplifying the interface. llvm-svn: 157941	2012-06-04 18:21:16 +00:00
Akira Hatanaka	6734685f21	Fix a bug in MipsTargetLowering::LowerLOAD. A shift-right-logical node is inserted after the shift-left-logical node. llvm-svn: 157937	2012-06-04 17:46:29 +00:00
Roman Divacky	e3f15c98d1	Implement local-exec TLS on PowerPC. llvm-svn: 157935	2012-06-04 17:36:38 +00:00
Hans Wennborg	245917b536	MIPS TLS: use the model selected by TargetMachine::getTLSModel(). This was mostly done already in r156162, but I missed one place. llvm-svn: 157929	2012-06-04 14:02:08 +00:00
Nadav Rotem	b7bb72e4f3	Remove the "-promote-elements" flag. This flag is now enabled by default. llvm-svn: 157925	2012-06-04 11:27:21 +00:00
Hans Wennborg	09610f3e09	Better comments for TLS-related X86 MachineOperand flags. llvm-svn: 157920	2012-06-04 09:55:36 +00:00
Craig Topper	c6ac4cefcc	Add intrinsic forms for FMA instructions to opcode folding tables. llvm-svn: 157917	2012-06-04 07:46:16 +00:00
Craig Topper	3cb143016d	Add VFMADDSUB and VFMSUBADD FMA instructions to folding tables. Also add 213 forms of scalar FMA instructions. llvm-svn: 157914	2012-06-04 07:08:21 +00:00
Hal Finkel	1de9bf01e4	Fix a copy-and-paste duplication error in the PPC 440 and A2 schedules (no functionality change). llvm-svn: 157912	2012-06-04 02:39:52 +00:00
Hal Finkel	595817eebe	Enable generating PPC pre-increment (r+imm) instructions by default. It seems that this no longer causes test suite failures on PPC64 (after r157159), and often gives a performance benefit, so it can be enabled by default. llvm-svn: 157911	2012-06-04 02:21:00 +00:00
Rafael Espindola	34b9c511c1	Represent .rept as an anonymous macro. This removes the need for the ActiveRept vector. No functionality change. Extracted from a patch by the PaX Team. llvm-svn: 157909	2012-06-03 23:57:14 +00:00
Rafael Espindola	dd17c237a8	Add a typedef to simplify the code a bit. Not functionality change. Part of a patch by the PaX Team. llvm-svn: 157908	2012-06-03 22:41:23 +00:00
Craig Topper	79dbb0c6e4	Rename FMA3 feature flag to just FMA to match gcc so it can be added to clang. llvm-svn: 157903	2012-06-03 18:58:46 +00:00
Craig Topper	2c5ccd8af7	Simplify the fma4 renaming code. llvm-svn: 157902	2012-06-03 16:48:52 +00:00
Craig Topper	720c7bde5c	Autoupgrade support the rename of x86.fma4 intrinsics to x86.fma from r157898. llvm-svn: 157899	2012-06-03 08:07:25 +00:00
Craig Topper	fd53b80219	Rename fma4 intrinsics to just fma since they are now used for both FMA4 and FMA3. Autoupgrade support coming in a separate commit. llvm-svn: 157898	2012-06-03 07:26:46 +00:00
Manman Ren	5097e4f38a	Revert r157831 llvm-svn: 157896	2012-06-03 03:14:24 +00:00
Craig Topper	29eafea292	Use sse_load_f32/64 for scalar FMA3 intrinsic patterns instead of 128-bit loads to match instruction behavior. llvm-svn: 157895	2012-06-03 01:40:43 +00:00
Craig Topper	badd755a0e	Add neverHasSideEffects and mayLoad to FMA3 instructions. llvm-svn: 157894	2012-06-03 00:30:49 +00:00
Benjamin Kramer	172f80849f	Use access(2) instead of stat(2) to check if a file exists. Apart from being slightly cheaper, this fixes a real bug that hits 32 bit linux systems. When passing a file larger than 2G to be linked (which isn't that uncommon with large projects such as WebKit), clang's driver checks if the file exists but the file size doesn't fit in an off_t and stat(2) fails with EOVERFLOW. Clang then says that the file doesn't exist instead of passing it to the linker. llvm-svn: 157891	2012-06-02 16:28:09 +00:00
Benjamin Kramer	bde9176663	Fix typos found by http://github.com/lyda/misspell-check llvm-svn: 157885	2012-06-02 10:20:22 +00:00
Stepan Dyatkovskiy	0e46d8a08c	PR1255: case ranges. IntRange converted from struct to class. So main change everywhere is replacement of ".Low/High" with ".getLow/getHigh()" llvm-svn: 157884	2012-06-02 09:42:43 +00:00
Stepan Dyatkovskiy	9549f5894b	PR1255: case ranges. IntegersSubsetGeneric, IntegersSubsetMapping: added IntTy template parameter, that allows use either APInt or IntItem. This change allows to write unittest for these classes. llvm-svn: 157880	2012-06-02 07:26:00 +00:00
Akira Hatanaka	6f3b2a670f	Fix a bug in the code which custom-lowers truncating stores in LegalizeDAG. Check that the SDValue TargetLowering::LowerOperation returns is not null before replacing the original node with the returned node. llvm-svn: 157873	2012-06-02 01:10:34 +00:00
Chris Lattner	58268c23ac	remove an unused variable. llvm-svn: 157872	2012-06-02 01:03:42 +00:00
Akira Hatanaka	23327b30ef	Remove code which is no longer needed in MipsAsmPrinter and MipsMCInstLower. llvm-svn: 157867	2012-06-02 00:05:11 +00:00
Akira Hatanaka	019e592f75	Set operation actions for load/store nodes in the Mips backend. llvm-svn: 157866	2012-06-02 00:04:42 +00:00
Akira Hatanaka	f11571d90d	Add definitions of 32/64-bit unaligned load/store instructions for Mips. llvm-svn: 157865	2012-06-02 00:04:19 +00:00
Akira Hatanaka	8f1db778a4	Define functions MipsTargetLowering::LowerLOAD and LowerSTORE which custom-lower unaligned load and store nodes. llvm-svn: 157864	2012-06-02 00:03:49 +00:00
Akira Hatanaka	b9ebf8d644	Define Mips specific unaligned load/store nodes. llvm-svn: 157863	2012-06-02 00:03:12 +00:00
Akira Hatanaka	4e76bf8282	Expand unaligned i16 loads/stores for the Mips backend. This is the first of a series of patches which make changes to the backend to emit unaligned load/store instructions (lwl,lwr,swl,swr) during instruction selection. llvm-svn: 157862	2012-06-02 00:02:45 +00:00
Akira Hatanaka	56bf023a6d	In MipsMCInstLower::LowerSymbolOperand, get offset from symbol if the MachineOperand type has a valid offset. llvm-svn: 157861	2012-06-02 00:02:11 +00:00
Jakob Stoklund Olesen	54038d796c	Switch all register list clients to the new MC*Iterator interface. No functional change intended. Sorry for the churn. The iterator classes are supposed to help avoid giant commits like this one in the future. The TableGen-produced register lists are getting quite large, and it may be necessary to change the table representation. This makes it possible to do so without changing all clients (again). llvm-svn: 157854	2012-06-01 23:28:30 +00:00
Bill Wendling	e85f34969e	Register the gcov "writeout" at init time. Don't list this as a d'tor. Instead, inject some code in that will run via the "__mod_init_func" method that registers the gcov "writeout" function to execute at exit time. The problem is that the "__mod_term_func" method of specifying d'tors is deprecated on Darwin. And it can lead to some ambiguities when dealing with multiple libraries. <rdar://problem/11110106> llvm-svn: 157852	2012-06-01 23:14:32 +00:00
Jakob Stoklund Olesen	ca487d2183	Remove physreg support from adjustCopiesBackFrom and removeCopyByCommutingDef. After physreg coalescing was disabled, these functions can't do anything useful with physregs anyway. llvm-svn: 157849	2012-06-01 22:38:19 +00:00
Jakob Stoklund Olesen	9b09cf0c11	Simplify some more getAliasSet callers. MCRegAliasIterator can include Reg itself in the list. llvm-svn: 157848	2012-06-01 22:38:17 +00:00
Rafael Espindola	103c2cfbbd	Use dominates(Instruction, Use) in the verifier. This removes a bit of context from the verifier errors, but reduces code duplication in a fairly critical part of LLVM and makes dominates easier to test. llvm-svn: 157845	2012-06-01 21:56:26 +00:00
Chad Rosier	f319324082	[arm-fast-isel] Fix handling of the frameaddress intrinsic. If depth is 0 then DestReg is undefined. llvm-svn: 157840	2012-06-01 21:12:31 +00:00
Jakob Stoklund Olesen	92a0083944	Switch some getAliasSet clients to MCRegAliasIterator. MCRegAliasIterator can optionally visit the register itself, allowing for simpler code. llvm-svn: 157837	2012-06-01 20:36:54 +00:00
Manman Ren	879ca9d47d	X86: peephole optimization to remove cmp instruction This patch will optimize the following: sub r1, r3 cmp r3, r1 or cmp r1, r3 bge L1 TO sub r1, r3 bge L1 or ble L1 If the branch instruction can use flag from "sub", then we can eliminate the "cmp" instruction. llvm-svn: 157831	2012-06-01 19:49:33 +00:00
Manman Ren	e873552091	ARM: properly handle alignment for struct byval. Factor out the expansion code into a function. This change is to be enabled in clang. rdar://9877866 llvm-svn: 157830	2012-06-01 19:33:18 +00:00
Nuno Lopes	adf1c859dd	BoundsChecking: fix a bug when the handling of recursive PHIs failed and could leave dangling references in the cache add regression tests for this problem. Can already compile & run: PHP, PCRE, and ICU (i.e., all the software I tried) llvm-svn: 157822	2012-06-01 17:43:31 +00:00
Hans Wennborg	789acfb63d	Implement the local-dynamic TLS model for x86 (PR3985) This implements codegen support for accesses to thread-local variables using the local-dynamic model, and adds a clean-up pass so that the base address for the TLS block can be re-used between local-dynamic access on an execution path. llvm-svn: 157818	2012-06-01 16:27:21 +00:00
Stepan Dyatkovskiy	66305749f1	PR1255: case ranges. IntegersSubset divided into IntegersSubsetGeneric and into IntegersSubset itself. The first has no references to ConstantInt and works with IntItem only. IntegersSubsetMapping also made generic. Here added second template parameter "IntegersSubsetTy" that allows to use one of two IntegersSubset types described below. llvm-svn: 157815	2012-06-01 16:17:57 +00:00
Chris Lattner	cc84e6d2b5	quick fix for PR13006, will check in testcase later. llvm-svn: 157813	2012-06-01 15:02:52 +00:00
Stepan Dyatkovskiy	bd7303b7f7	PR1255: case ranges. IntItem cleanup. IntItemBase, IntItemConstantIntImp and IntItem merged into IntItem. All arithmetic operators was propagated from APInt. Also added comparison operators <,>,<=,>=. Currently you will find set of macros that propagates operators from APInt to IntItem in the beginning of IntegerSubset. Note that THESE MACROS WILL REMOVED after all passes will case-ranges compatible. Also note that these macros much smaller pain that something like this: if (V->getValue().ugt(AnotherV->getValue()) { ... } These changes made IntItem full featured integer object. It allows to make IntegerSubset class generic (move out all ConstantInt references inside and add unit-tests) in next commits. llvm-svn: 157810	2012-06-01 10:06:14 +00:00
Craig Topper	1d4d62d76c	Enable automatic detection of FMA3 support to allow intrinsics to be used. llvm-svn: 157805	2012-06-01 06:10:14 +00:00
Craig Topper	00649d5111	Remove fadd(fmul) patterns for FMA3. This needs to be implemented by paying attention to FP_CONTRACT and matching @llvm.fma which is not available yet. This will allow us to enable intrinsic use at least though. llvm-svn: 157804	2012-06-01 06:07:48 +00:00
Craig Topper	2e127b5274	Add VFNSUB* instructions to folding table. llvm-svn: 157802	2012-06-01 05:48:39 +00:00
Craig Topper	9eadcfdf2a	Remove a trailing space and fix a comment. llvm-svn: 157801	2012-06-01 05:34:01 +00:00
Chris Lattner	466076b95f	enhance the logic for looking through tailcalls to look through transparent casts in multiple-return value scenarios, like what happens on X86-64 when returning small structs. llvm-svn: 157800	2012-06-01 05:29:15 +00:00
Craig Topper	df09da8355	Tidy up. Remove trailing spaces and fix the worst of the 80 column violations. llvm-svn: 157799	2012-06-01 05:24:29 +00:00
Chris Lattner	182fe3eef1	enhance getNoopInput to know about vector<->vector bitcasts of legal types, as well as int<->ptr casts. This allows us to tailcall functions with some trivial casts between the call and return (i.e. because the return types disagree). llvm-svn: 157798	2012-06-01 05:16:33 +00:00
Chris Lattner	4f3615de97	rearrange some logic, no functionality change. llvm-svn: 157796	2012-06-01 05:01:15 +00:00
Manman Ren	9f9111651e	ARM: support struct byval in llvm We handle struct byval by inserting a pseudo op, which will be expanded to a loop at ExpandISelPseudos. A separate patch for clang will be submitted to enable struct byval. rdar://9877866 llvm-svn: 157793	2012-06-01 02:44:42 +00:00
Michael J. Spencer	5c502811f1	Fix 80 columns. llvm-svn: 157788	2012-06-01 00:58:41 +00:00
Eric Christopher	1cf3338bb4	Add support for enum forward declarations. Part of rdar://11570854 llvm-svn: 157786	2012-06-01 00:22:32 +00:00
Chad Rosier	526772de29	Put the shiny new MCSubRegIterator to work. llvm-svn: 157783	2012-06-01 00:02:08 +00:00
Nuno Lopes	288e86ff6b	add -bounds-checking-multiple-traps option to make one trap BB per check disabled by default for now; we can discuss the default value (& name) later llvm-svn: 157777	2012-05-31 22:58:48 +00:00
Nuno Lopes	7d00061d87	revamp BoundsChecking considerably: - compute size & offset at the same time. The side-effects of this are that we now support negative GEPs. It's now approaching a phase that it can be reused by other passes (e.g., lowering of the objectsize intrinsic) - use APInt throughout to handle wrap-arounds - add support for PHI instrumentation - add a cache (required for recursive PHIs anyway) - remove hoisting support for now, since it was wrong in a few cases sorry for the churn here.. tests will follow soon. llvm-svn: 157775	2012-05-31 22:45:40 +00:00
Jakob Stoklund Olesen	4f203ea34b	Add support for return value promotion in X86 calling conventions. Patch by Yiannis Tsiouris! llvm-svn: 157757	2012-05-31 17:28:20 +00:00
Manman Ren	9bccb64e56	X86: replace SUB with CMP if possible This patch will optimize the following movq %rdi, %rax subq %rsi, %rax cmovsq %rsi, %rdi movq %rdi, %rax to cmpq %rsi, %rdi cmovsq %rsi, %rdi movq %rdi, %rax Perform this optimization if the actual result of SUB is not used. rdar: 11540023 llvm-svn: 157755	2012-05-31 17:20:29 +00:00
Jakob Stoklund Olesen	fa9d7db17b	Add a PrintRegUnit helper similar to PrintReg. Reg-units are named after their root registers, and most units have a single root, so they simply print as 'AL', 'XMM0', etc. The rare dual root reg-units print as FPSCR~FPSCR_NZCV, FP0~ST7, ... The printing piggybacks on the existing register name tables, so no extra const data space is required. llvm-svn: 157754	2012-05-31 17:18:29 +00:00
Joel Jones	585bc82489	Fix typos llvm-svn: 157752	2012-05-31 17:11:25 +00:00
Rafael Espindola	e3c5f3e5b1	Fix typos noticed by Benjamin Kramer. Also make the checks stronger and test that we reject ranges that overlap a previous wrapped range. llvm-svn: 157749	2012-05-31 16:04:26 +00:00
Benjamin Kramer	a0396e4583	X86: Rename the CLMUL target feature to PCLMUL. It was renamed in gcc/gas a while ago and causes all kinds of confusion because it was named differently in llvm and clang. llvm-svn: 157745	2012-05-31 14:34:17 +00:00
Rafael Espindola	97d7787788	Require intervals in the range metadata to be in a canonical form: They must be non contiguous, non overlapping and sorted by the lower end. While this is technically a backward incompatibility, every frontend currently produces range metadata with a single interval and we don't have any pass that merges intervals yet, so no existing bitcode files should be rejected by this. llvm-svn: 157741	2012-05-31 13:45:46 +00:00
Elena Demikhovsky	602f3a26d6	Added FMA3 Intel instructions. I disabled FMA3 autodetection, since the result may differ from expected for some benchmarks. I added tests for CodeGen and intrinsics. I did not change llvm.fma.f32/64 - it may be done later. llvm-svn: 157737	2012-05-31 09:20:20 +00:00
Duncan Sands	339bb61e32	Enhance the sinking code to handle diamond patterns. Patch by Carlo Alberto Ferraris. llvm-svn: 157736	2012-05-31 08:09:49 +00:00
Craig Topper	c1ac05dad5	Add intrinsic for pclmulqdq instruction. llvm-svn: 157731	2012-05-31 04:37:40 +00:00
Akira Hatanaka	bff8e31d3c	Cleanup and factoring of mips16 tablegen classes. Make register classes CPU16RegsRegClass and CPURARegRegClass available. Add definition of mips16 jalr instruction. Patch by Reed Kotler. llvm-svn: 157730	2012-05-31 02:59:44 +00:00
Eric Christopher	368461cad0	Fix typo in assembly directive. Noticed by inspection. llvm-svn: 157726	2012-05-31 00:53:18 +00:00
Jakob Stoklund Olesen	5541f6026e	Avoid depending on list orders and register numbering. This code is covered by test/CodeGen/ARM/arm-modifier.ll. llvm-svn: 157720	2012-05-30 23:00:43 +00:00
Jakob Stoklund Olesen	0b97dbcf1a	Extract some pointer hacking to a function. Switch to MCSuperRegIterator while we're there. llvm-svn: 157717	2012-05-30 22:40:03 +00:00
Jakob Stoklund Olesen	05e2245fc6	Prioritize smaller register classes for urgent evictions. It helps compile exotic inline asm. In the test case, normal GR32 virtual registers use up eax-edx so the final GR32_ABCD live range has no registers left. Since all the live ranges were tiny, we had no way of prioritizing the smaller register class. This patch allows tiny unspillable live ranges to be evicted by tiny unspillable live ranges from a smaller register class. <rdar://problem/11542429> llvm-svn: 157715	2012-05-30 21:46:58 +00:00
Eric Christopher	f481ab3877	Add support for the mips inline asm 'm' output modifier. Patch by Jack Carter. llvm-svn: 157709	2012-05-30 19:05:19 +00:00
Owen Anderson	0eda3e1de6	Switch the canonical FMA term operand order to match both the comment I wrote and the usual LLVM convention. llvm-svn: 157708	2012-05-30 18:54:50 +00:00
Owen Anderson	c7aaf523e1	Teach DAGCombine to canonicalize the position of a constant in the term operands of an FMA node. llvm-svn: 157707	2012-05-30 18:50:39 +00:00
Chad Rosier	fba46a64aa	Remove extra space. llvm-svn: 157706	2012-05-30 18:47:55 +00:00
Benjamin Kramer	406a2db1f6	Make sure that we're dealing with a binary SCEVExpr when simplifying. llvm-svn: 157704	2012-05-30 18:42:43 +00:00
Jakob Stoklund Olesen	ad8103dc7b	Fix some uses of getSubRegisters() to use getSubReg() instead. It is better to address sub-registers directly by name instead of relying on their position in the sub-register list. llvm-svn: 157703	2012-05-30 18:40:49 +00:00
Jakob Stoklund Olesen	3a48c06456	Remove some redundant tests. An empty list is not represented as a null pointer. Let TRI do its own shortcuts. llvm-svn: 157702	2012-05-30 18:38:56 +00:00
Benjamin Kramer	50b26ebb2b	Teach SCEV's icmp simplification logic that a-b == 0 is equivalent to a == b. This also required making recursive simplifications until nothing changes or a hard limit (currently 3) is hit. With the simplification in place indvars can canonicalize loops of the form for (unsigned i = 0; i < a-b; ++i) into for (unsigned i = 0; i != a-b; ++i) which used to fail because SCEV created a weird umax expr for the backedge taken count. llvm-svn: 157701	2012-05-30 18:32:23 +00:00
Chris Lattner	1622a99e58	it's pointed out that R11 can be used for magic things, and doing things just for 64-bit registers is silly. Just optimize 3 more. llvm-svn: 157699	2012-05-30 18:08:02 +00:00
Chris Lattner	04d722a68d	Extend the (abi-irrelevant) return convention to be able to return more than two values in integer registers. This is already supported by the fastcc convention, but it doesn't hurt to support it in the standard conventions as well. In cases where we can cheat at the calling convention, this allows us to avoid returning things through memory in more cases. llvm-svn: 157698	2012-05-30 17:50:14 +00:00
Chad Rosier	820d248c4d	[arm-fast-isel] Add support for the llvm.frameaddress() intrinsic. Patch by Jush Lu <jush.msn@gmail.com>. llvm-svn: 157696	2012-05-30 17:23:22 +00:00


55057 Commits