llvm-project

Commit Graph

Author	SHA1	Message	Date
Hal Finkel	ed6a28597b	Enable early if conversion on PPC On cores for which we know the misprediction penalty, and we have the isel instruction, we can profitably perform early if conversion. This enables us to replace some small branch sequences with selects and avoid the potential stalls from mispredicting the branches. Enabling this feature required implementing canInsertSelect and insertSelect in PPCInstrInfo; isel code in PPCISelLowering was refactored to use these functions as well. llvm-svn: 178926	2013-04-05 23:29:01 +00:00
Hal Finkel	85526f2e71	Correct the PPC A2 misprediction penalty The manual states that there is a minimum of 13 cycles from when the mispredicted branch is issued to when the correct branch target is issued. llvm-svn: 178925	2013-04-05 23:28:58 +00:00
Bill Wendling	eb108bad50	Use the target options specified on a function to reset the back-end. During LTO, the target options on functions within the same Module may change. This would necessitate resetting some of the back-end. Do this for X86, because it's a Friday afternoon. llvm-svn: 178917	2013-04-05 21:52:40 +00:00
Renato Golin	91de828f46	Reverting 178851 as it broke buildbots llvm-svn: 178883	2013-04-05 16:39:53 +00:00
Chad Rosier	4a7005e976	[ms-inline asm] Add support for numeric displacement expressions in bracketed memory operands. Essentially, this layers an infix calculator on top of the parsing state machine. The scale on the index register is still expected to be an immediate __asm mov eax, [eax + ebx*4] and will not work with more complex expressions. For example, __asm mov eax, [eax + ebx*(2*2)] The plus and minus binary operators assume the numeric value of a register is zero so as to not change the displacement. Register operands should never be an operand for a multiply or divide operation; the scale*indexreg expression is always replaced with a zero on the operand stack to prevent such a case. rdar://13521380 llvm-svn: 178881	2013-04-05 16:28:55 +00:00
Stepan Dyatkovskiy	6b53a2f50a	Buildbot fix for r178851: mistake was in wrong TargetRegisterInfo::getRegClass usage. llvm-svn: 178854	2013-04-05 07:34:08 +00:00
Stepan Dyatkovskiy	b309b3b33e	Fix for PR14824: "Optimization arm_ldst_opt inserts newly generated instruction vldmia at incorrect position". Patch introduces memory operands tracking in ARMLoadStoreOpt::LoadStoreMultipleOpti. For each register it keeps the order of load operations as it was before optimization pass. It is a kind of deep improvement of the fix proposed by Hao: http://llvm.org/bugs/show_bug.cgi?id=14824#c4 But it also tracks conflicts between different register classes (e.g. D2 and S5). For more details see: Bug description: http://llvm.org/bugs/show_bug.cgi?id=14824 LLVM Commits discussion: http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20130311/167936.html http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20130318/168688.html http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20130325/169376.html http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20130401/170238.html llvm-svn: 178851	2013-04-05 05:52:14 +00:00
Hal Finkel	1a958cf30d	Add a SchedMachineModel for the PPC G5 llvm-svn: 178850	2013-04-05 05:49:18 +00:00
Hal Finkel	5fde1b033e	Add a SchedMachineModel for the PPC A2 llvm-svn: 178848	2013-04-05 05:34:08 +00:00
Arnold Schwaighofer	fb6b9f48d0	ARM scheduler model: Add scheduler info to more instructions and resource descriptions for compares llvm-svn: 178844	2013-04-05 05:01:06 +00:00
Arnold Schwaighofer	5dde1f39c1	ARM scheduler model: Swift has varying latencies, uops for simple ALU ops llvm-svn: 178842	2013-04-05 04:42:00 +00:00
Arnold Schwaighofer	44f902ed7d	X86 cost model: Differentiate cost for vector shifts of constants SSE2 has efficient support for shifts by a scalar. My previous change of making shifts expensive did not take this into account marking all shifts as expensive. This would prevent vectorization from happening where it is actually beneficial. With this change we differentiate between shifts of constants and other shifts. radar://13576547 llvm-svn: 178808	2013-04-04 23:26:24 +00:00
Arnold Schwaighofer	b977387112	CostModel: Add parameter to instruction cost to further classify operand values On certain architectures we can support efficient vectorized version of instructions if the operand value is uniform (splat) or a constant scalar. An example of this is a vector shift on x86. We can efficiently support for (i = 0 ; i < ; i += 4) w[0:3] = v[0:3] << <2, 2, 2, 2> but not for (i = 0; i < ; i += 4) w[0:3] = v[0:3] << x[0:3] This patch adds a parameter to getArithmeticInstrCost to further qualify operand values as uniform or uniform constant. Targets can then choose to return a different cost for instructions with such operand values. A follow-up commit will test this feature on x86. radar://13576547 llvm-svn: 178807	2013-04-04 23:26:21 +00:00
Hal Finkel	e5680b3c36	Rename the current PPC BCL definition to BCLalways BCL is normally a conditional branch-and-link instruction, but has an unconditional form (which is used in the SjLj code, for example). To make clear that this BCL instruction definition is specifically the special unconditional form (which does not meaningfully take a condition-register input), rename it to BCLalways. No functionality change intended. llvm-svn: 178803	2013-04-04 22:55:54 +00:00
Hal Finkel	f96c18e3bc	PPC: Improve code generation for mixed-precision reciprocal sqrt The DAGCombine logic that recognized a/sqrt(b) and transformed it into a multiplication by the reciprocal sqrt did not handle cases where the sqrt and the division were separated by an fpext or fptrunc. llvm-svn: 178801	2013-04-04 22:44:12 +00:00
Jyotsna Verma	a929ab58c0	Hexagon: Expand br_cc. This fixes the following tests for Hexagon: CodeGen/Generic/2003-07-29-BadConstSbyte.ll CodeGen/Generic/2005-10-21-longlonggtu.ll CodeGen/Generic/2009-04-28-i128-cmp-crash.ll CodeGen/Generic/MachineBranchProb.ll CodeGen/Generic/builtin-expect.ll CodeGen/Generic/pr12507.ll llvm-svn: 178794	2013-04-04 21:18:26 +00:00
Richard Osborne	0c12d1851e	[XCore] Add bru instruction. llvm-svn: 178783	2013-04-04 20:05:35 +00:00
Richard Osborne	f18d95f756	[XCore] The RRegs register class is a superset of GRRegs. At the time when the XCore backend was added there were some issues with overlapping register classes but these all seem to be fixed now. Describing the register classes correctly allows us to get rid of a codegen only instruction (LDAWSP_lru6_RRegs) and it means we can disassemble ru6 instructions that use registers above r11. llvm-svn: 178782	2013-04-04 19:57:46 +00:00
Jakob Stoklund Olesen	299475e0c6	Avoid high-latency false CPSR dependencies even for tMOVSi. The Thumb2SizeReduction pass avoids false CPSR dependencies, except it still aggressively creates tMOVi8 instructions because they are so common. Avoid creating false CPSR dependencies even for tMOVi8 instructions when the CPSR flags are known to have high latency. This allows integer computation to overlap floating point computations. Also process blocks in a reverse post-order and propagate high-latency flags to successors. <rdar://problem/13468102> llvm-svn: 178773	2013-04-04 18:25:36 +00:00
Vincent Lejeune	bcbb13d691	R600: Use a mask for offsets when encoding instructions llvm-svn: 178763	2013-04-04 14:00:09 +00:00
Vincent Lejeune	8e377fdba6	R600: Fix wrong address when substituting ENDIF llvm-svn: 178762	2013-04-04 14:00:03 +00:00
Vincent Lejeune	c44fa99719	R600: Take export into account when computing cf address llvm-svn: 178761	2013-04-04 13:59:59 +00:00
Jakob Stoklund Olesen	8cfaffaade	Add SPARC v9 support for select on 64-bit compares. This requires v9 cmov instructions using the %xcc flags instead of the %icc flags. Still missing: - Select floats on %xcc flags. - Select i64 on %fcc flags. llvm-svn: 178737	2013-04-04 03:08:00 +00:00
Arnold Schwaighofer	e9b5016411	X86 cost model: Vector shifts are expensive in most cases The default logic does not correctly identify costs of casts because they are marked as custom on x86. For some cases, where the shift amount is a scalar we would be able to generate better code. Unfortunately, when this is the case the value (the splat) will get hoisted out of the loop, thereby making it invisible to ISel. radar://13130673 radar://13537826 llvm-svn: 178703	2013-04-03 21:46:05 +00:00
Vincent Lejeune	c3d3f9b66e	R600: Fix last ALU of a clause being emitted in a separate clause llvm-svn: 178675	2013-04-03 18:24:47 +00:00
Hal Finkel	b0c810ff6d	Cleanup PPC reciprocal-estimate functionality Incorporating review feedback from Bill Schmidt on r178617. No functionality change intended. llvm-svn: 178672	2013-04-03 17:44:56 +00:00
Vincent Lejeune	80031d9fc4	R600: Factorize maximum alu per clause in a single location llvm-svn: 178667	2013-04-03 16:49:34 +00:00
Vincent Lejeune	b6d6c0d458	R600: Simplify data structure and add DEBUG to R600ControlFlowFinalizer llvm-svn: 178665	2013-04-03 16:24:09 +00:00
Vincent Lejeune	9931298b30	R600: Consider KILLGT as an ALU instruction Mesa does not override llvm behavior wrt KILLGT anymore so llvm has to handle KILLGT on its own. llvm-svn: 178664	2013-04-03 16:24:04 +00:00
Hal Finkel	7ac4592e97	PPC: Enable FRES and FRSQRTE on the default PPC64 description I discussed this with Bill Schmidt on IRC, and it was decided that this is a safe and reasonable default. llvm-svn: 178659	2013-04-03 14:40:18 +00:00
Hal Finkel	0c6d21933a	PPC: Add a FIXME regarding the non-working fma+fneg Altivec pattern llvm-svn: 178658	2013-04-03 14:40:16 +00:00
Hal Finkel	2ed21a8ca6	Remove some obsolete PowerPC/README entries llvm-svn: 178657	2013-04-03 14:25:55 +00:00
Ulrich Weigand	084ff8e891	More direct types in PowerPC AltiVec intrinsics. This patch follows up on work done by Bill Schmidt in r178277, and replaces most of the remaining uses of VRRC in ISEL DAG patterns. The resulting .inc files are identical except for comments, so no change in code generation is expected. llvm-svn: 178656	2013-04-03 14:08:13 +00:00
Bill Schmidt	92e26646bc	Fix PR15632: No support for ppcf128 floating-point remainder on PowerPC. For this we need to use a libcall. Previously LLVM didn't implement libcall support for frem, so I've added it in the usual straightforward manner. A test case from the bug report is included. llvm-svn: 178639	2013-04-03 13:05:44 +00:00
Tim Northover	5816ca117b	AArch64: implement ETMv4 trace system registers. llvm-svn: 178637	2013-04-03 12:31:29 +00:00
Timur Iskhodzhanov	f4e0665e56	Fix SRet for thiscall in i686-pc-win32 llvm-svn: 178634	2013-04-03 11:27:54 +00:00
Tim Northover	5b097a735f	AArch64: switch patterns to be type-based rather than RegClass-based It's a bit of churn in the blame log, but I think there are real benefits to the newer system so I'm making the change in one go. llvm-svn: 178633	2013-04-03 11:19:16 +00:00
Jakob Stoklund Olesen	d9bbdfd3cc	Add 64-bit compare + branch for SPARC v9. The same compare instruction is used for 32-bit and 64-bit compares. It sets two different sets of flags: icc and xcc. This patch adds a conditional branch instruction using the xcc flags for 64-bit compares. llvm-svn: 178621	2013-04-03 04:41:44 +00:00
Hal Finkel	b00fc87608	Remove some unsupported-feature comments from PPC.td These refer to the reciprocal estimate support recently committed. llvm-svn: 178618	2013-04-03 04:03:58 +00:00
Hal Finkel	2e10331057	Use PPC reciprocal estimates with Newton iteration in fast-math mode When unsafe FP math operations are enabled, we can use the fre[s] and frsqrte[s] instructions, which generate reciprocal (sqrt) estimates, together with some Newton iteration, in order to quickly generate floating-point division and sqrt results. All of these instructions are separately optional, and so each has its own feature flag (except for the Altivec instructions, which are covered under the existing Altivec flag). Doing this is not only faster than using the IEEE-compliant fdiv/fsqrt instructions, but allows these computations to be pipelined with other computations in order to hide their overall latency. I've also added a couple of fnmsub patterns which turned out to be missing (but are necessary for good code generation of the Newton iterations). Altivec needs a similar fix, but that will probably be more complicated because fneg is expanded for Altivec's v4f32. llvm-svn: 178617	2013-04-03 04:01:11 +00:00
Eric Christopher	e2fbc67e81	Formatting. llvm-svn: 178589	2013-04-02 23:06:40 +00:00
Akira Hatanaka	023c678a0d	[mips] Small update to the implementation of eh.return for Mips. This patch initializes t9 to the handler address, but only if the relocation model is pic. This handles the case where the handler to which eh.return jumps points to the start of the function. Patch by Sasa Stankovic. llvm-svn: 178588	2013-04-02 23:02:07 +00:00
Akira Hatanaka	2ffc5734e7	[mips] Expand pseudo multiply/divide instructions in MipsCodeEmitter.cpp. This patch fixes the following two tests which have been failing on llvm-mips-linux builder since r178403: LLVM :: Analysis/Profiling/load-branch-weights-ifs.ll LLVM :: Analysis/Profiling/load-branch-weights-loops.ll llvm-svn: 178584	2013-04-02 22:53:58 +00:00
Chad Rosier	8a24466f69	[ms-inline asm] Add support for parsing variables with namespace alias qualifiers. This patch only adds support for parsing these identifiers in the X86AsmParser. The front-end interface isn't capable of looking up these identifiers at this point in time. The end result is the compiler now errors during object file emission, rather than at parse time. Test case coming shortly. Part of rdar://13499009 and PR13340 llvm-svn: 178566	2013-04-02 20:02:33 +00:00
Bill Schmidt	3581cd4b4c	Fix PR15630: Replace faulty stdcx. with stwcx. When doing a partword atomic operation, a lwarx was being paired with a stdcx. instead of a stwcx. when compiling for a 64-bit target. The target has nothing to do with it in this case; we always need a stwcx. Thanks to Kai Nacke for reporting the problem. llvm-svn: 178559	2013-04-02 18:37:08 +00:00
Chad Rosier	7925d280ff	[fast-isel] Use the correct API to disable FastLowerArguments for Win64. llvm-svn: 178549	2013-04-02 16:31:41 +00:00
Justin Holewinski	a922c7e90e	[NVPTX] Fix a few style issues in NVVMReflect llvm-svn: 178536	2013-04-02 12:37:11 +00:00
Jakob Stoklund Olesen	8eabc3ffde	Add 64-bit load and store instructions. There are only a few new instructions; the rest is handled with patterns. llvm-svn: 178528	2013-04-02 04:09:28 +00:00
Jakob Stoklund Olesen	917e07f095	Basic 64-bit ALU operations. SPARC v9 extends all ALU instructions to 64 bits, so we simply need to add patterns to use them for both i32 and i64 values. llvm-svn: 178527	2013-04-02 04:09:23 +00:00
Jakob Stoklund Olesen	bddb20eeef	Materialize 64-bit immediates. The last resort pattern produces 6 instructions, and there are still opportunities for materializing some immediates in fewer instructions. llvm-svn: 178526	2013-04-02 04:09:17 +00:00
Jakob Stoklund Olesen	c1d1a4816e	Add 64-bit shift instructions. SPARC v9 defines new 64-bit shift instructions. The 32-bit shift right instructions are still usable as zero and sign extensions. This adds new F3_Sr and F3_Si instruction formats that probably should be used for the 32-bit shifts as well. They don't really encode an simm13 field. llvm-svn: 178525	2013-04-02 04:09:12 +00:00
Jakob Stoklund Olesen	739d722ef7	Add predicates for distinguishing 32-bit and 64-bit modes. The 'sparc' architecture produces 32-bit code while 'sparcv9' produces 64-bit code. It is also possible to run 32-bit code using SPARC v9 instructions with: llc -march=sparc -mattr=+v9 llvm-svn: 178524	2013-04-02 04:09:06 +00:00
Jakob Stoklund Olesen	0b21f35aca	Add support for 64-bit calling convention. This is far from complete, but it is enough to make it possible to write test cases using i64 arguments. Missing features: - Floating point arguments. - Receiving arguments on the stack. - Calls. llvm-svn: 178523	2013-04-02 04:09:02 +00:00
Jakob Stoklund Olesen	5ad3b35377	Add an I64Regs register class for 64-bit registers. We are going to use the same registers for 32-bit and 64-bit values, but in two different register classes. The I64Regs register class has a larger spill size and alignment. The addition of an i64 register class confuses TableGen's type inference, so it is necessary to clarify the type of some immediates and the G0 register. In 64-bit mode, pointers are i64 and should use the I64Regs register class. Implement getPointerRegClass() to dynamically provide the pointer register class depending on the subtarget. Use ptr_rc and iPTR for memory operands. Finally, add the i64 type to the IntRegs register class. This register class is not used to hold i64 values, I64Regs is for that. The type is required to appease TableGen's type checking in output patterns like this: def : Pat<(add i64:$a, i64:$b), (ADDrr $a, $b)>; SPARC v9 uses the same ADDrr instruction for i32 and i64 additions, and TableGen doesn't know to check the type of register sub-classes. llvm-svn: 178522	2013-04-02 04:08:54 +00:00
Hal Finkel	93d75ea08a	Fix typo in PPCISelLowering Thanks to Bill Schmidt for finding this in review of r178480. llvm-svn: 178521	2013-04-02 03:29:51 +00:00
Andrew Trick	e1d88cfb57	The divide unit is not pipelined, but it is still buffered. Buffered means a later divide may be executed out-of-order while a prior divide is sitting (buffered) in a reservation station. You can tell it's not pipelined, because operations that use it reserve it for more than one cycle: def : WriteRes<WriteIDiv, [HWPort0, HWDivider]> { let Latency = 25; let ResourceCycles = [1, 10]; } We don't currently distinguish between an unpipelined operation and one that is split into multiple micro-ops requiring the same unit, except that the latter may have NumMicroOps > 1 if they also consume issue/dispatch resources. llvm-svn: 178519	2013-04-02 01:58:47 +00:00
NAKAMURA Takumi	fd98f7f2b6	Target/R600: Fix CMake build to add missing files. llvm-svn: 178508	2013-04-01 22:05:58 +00:00
Vincent Lejeune	bfaa63a6db	R600: Add support for native control flow llvm-svn: 178505	2013-04-01 21:48:05 +00:00
Vincent Lejeune	ace6f7351e	R600/SI: Share code recording ShaderTypeAttribute between generations llvm-svn: 178504	2013-04-01 21:47:53 +00:00
Vincent Lejeune	f43bc57b66	R600: Emit CF_ALU and use true kcache register. llvm-svn: 178503	2013-04-01 21:47:42 +00:00
Hal Finkel	3f88d08974	Fix a bad assert in PPCTargetLowering llvm-svn: 178489	2013-04-01 18:42:58 +00:00
Hal Finkel	f6d45f2379	Add more PPC floating-point conversion instructions The P7 and A2 have additional floating-point conversion instructions which allow a direct two-instruction sequence (plus load/store) to convert from all combinations (signed/unsigned i32/i64) <--> (float/double) (on previous cores, only some combinations were directly available). llvm-svn: 178480	2013-04-01 17:52:07 +00:00
Hal Finkel	39caf9f5ec	Use ImmToIdxMap.count in PPCRegisterInfo Code improvement suggested by Jakob (in review of r178450). No functionality change intended. llvm-svn: 178473	2013-04-01 17:02:06 +00:00
Hal Finkel	290376dd78	Add the PPC popcntw instruction The popcntw instruction is available whenever the popcntd instruction is available, and performs a separate popcnt on the lower and upper 32-bits. Ignoring the high-order count, this can be used for the 32-bit input case (saving on the explicit zero extension otherwise required to use popcntd). llvm-svn: 178470	2013-04-01 15:58:15 +00:00
Hal Finkel	60c7510711	Treat PPCISD::STFIWX like the memory opcode that it is PPCISD::STFIWX is really a memory opcode, and so it should come after FIRST_TARGET_MEMORY_OPCODE, and we should use DAG.getMemIntrinsicNode to create nodes using it. No functionality change intended (although there could be optimization benefits from preserving the MMO information). llvm-svn: 178468	2013-04-01 15:37:53 +00:00
Duncan Sands	fee96f832d	Remove unused typedef. llvm-svn: 178462	2013-04-01 13:46:15 +00:00
Arnold Schwaighofer	6793aebb84	ARM Scheduler Model: Add resources instructions, map resources in subtargets Reapply r177968: After commit 178074 we can now have undefined scheduler variants. Move the CortexA9 resources into the CortexA9 SchedModel namespace. Define resource mappings under the CortexA9 SchedModel. Define resources and mappings for the SwiftModel. Incorporate Andrew's feedback. llvm-svn: 178460	2013-04-01 13:07:05 +00:00
Benjamin Kramer	52ceb44331	X86TTI: Add accurate costs for itofp operations, based on the actual instruction counts. llvm-svn: 178459	2013-04-01 10:23:49 +00:00
Vincent Lejeune	53f3525d35	R600: Emit native instructions for tex llvm-svn: 178452	2013-03-31 19:33:04 +00:00
Duncan Sands	e1aa194aab	There is no longer any need to silence this compiler warning as the warning has been turned off globally. llvm-svn: 178451	2013-03-31 17:44:09 +00:00
Hal Finkel	8540f7771c	Cleanup ImmToIdxMap and noImmForm in PPCRegisterInfo ImmToIdxMap should be a DenseMap (not a std::map) because there is no ordering requirement. Also, we don't need a separate list of instructions for noImmForm in eliminateFrameIndex, because this list is essentially the complement of the keys in ImmToIdxMap. No functionality change intended. llvm-svn: 178450	2013-03-31 14:43:31 +00:00
Benjamin Kramer	b60633fb87	X86: Promote sitofp <8 x i16> to <8 x i32> when AVX is available. A vector sext + sitofp is a lot cheaper than 8 scalar conversions. llvm-svn: 178448	2013-03-31 12:49:15 +00:00
Hal Finkel	beb296bea1	Add the PPC lfiwax instruction This instruction is available on modern PPC64 CPUs, and is now used to improve the SINT_TO_FP lowering (by eliminating the need for the separate sign extension instruction and decreasing the amount of needed stack space). llvm-svn: 178446	2013-03-31 10:12:51 +00:00
Hal Finkel	e53429a13e	Cleanup PPC(64) i32 -> float/double conversion The existing SINT_TO_FP code for i32 -> float/double conversion was disabled because it relied on broken EXTSW_32/STD_32 instruction definitions. The original intent had been to enable these 64-bit instructions to be used on CPUs that support them even in 32-bit mode. Unfortunately, this form of lying to the infrastructure was buggy (as explained in the FIXME comment) and had therefore been disabled. This re-enables this functionality, using regular DAG nodes, but only when compiling in 64-bit mode. The old STD_32/EXTSW_32 definitions (which were dead) are removed. llvm-svn: 178438	2013-03-31 01:58:02 +00:00
Benjamin Kramer	9c9e0a2c04	Change '@SECREL' suffix to GAS-compatible '@SECREL32'. '@SECREL' is what is used by the Microsoft assembler, but GNU as expects '@SECREL32'. With the patch, the MC-generated code works fine in combination with a recent GNU as (2.23.51.20120920 here). Patch by David Nadlinger! Differential Revision: http://llvm-reviews.chandlerc.com/D429 llvm-svn: 178427	2013-03-30 16:21:50 +00:00
Justin Holewinski	59fd8ba5f5	[NVPTX] Remove support for SM < 2.0. This was never fully supported anyway. llvm-svn: 178417	2013-03-30 14:29:30 +00:00
Justin Holewinski	b94bd05b95	[NVPTX] Add NVVMReflect pass to allow compile-time selection of specific code paths. This allows us to write code like: if (__nvvm_reflect("FOO")) // Do something else // Do something else and compile into a library, then give "FOO" a value at kernel compile-time so the check becomes a no-op. llvm-svn: 178416	2013-03-30 14:29:25 +00:00
Justin Holewinski	0497ab142d	[NVPTX] Run clang-format on all NVPTX sources. Hopefully this resolves any outstanding style issues and gives us an automated way of ensuring we conform to the style guidelines. llvm-svn: 178415	2013-03-30 14:29:21 +00:00
Akira Hatanaka	b3c1847b30	[mips] Add patterns for DSP indexed load instructions. llvm-svn: 178408	2013-03-30 02:14:45 +00:00
Akira Hatanaka	b1457304cc	[mips] Define reg+imm load/store pattern templates. llvm-svn: 178407	2013-03-30 02:01:48 +00:00
Akira Hatanaka	fb221c197d	[mips] Fix DSP instructions to have explicit accumulator register operands. Check that instruction selection can select multiply-add/sub DSP instructions from a pattern that doesn't have intrinsics. llvm-svn: 178406	2013-03-30 01:58:00 +00:00
Akira Hatanaka	33c060480d	Remove unused variables. llvm-svn: 178405	2013-03-30 01:46:28 +00:00
Akira Hatanaka	9efcd76c2c	[mips] Move the code which does dag-combine for multiply-add/sub nodes to derived class MipsSETargetLowering. We shouldn't be generating madd/msub nodes if target is Mips16, since Mips16 doesn't have support for multiply-add/sub instructions. llvm-svn: 178404	2013-03-30 01:42:24 +00:00
Akira Hatanaka	be8612f6f4	[mips] Fix definitions of multiply, multiply-add/sub and divide instructions. The new instructions have explicit register output operands and use table-gen patterns instead of C++ code to do instruction selection. Mips16's instructions are unaffected by this change. llvm-svn: 178403	2013-03-30 01:36:35 +00:00
Akira Hatanaka	f0ea500c14	[mips] Remove function getFPBranchCodeFromCond. Rename invertFPCondCodeAdd. llvm-svn: 178396	2013-03-30 01:16:38 +00:00
Akira Hatanaka	d5a0e096bc	Fix indentation. llvm-svn: 178395	2013-03-30 01:15:17 +00:00
Akira Hatanaka	28721bd7dd	[mips] Add mips-specific nodes which will be used to select multiply and divide instructions. llvm-svn: 178394	2013-03-30 01:14:04 +00:00
Akira Hatanaka	3a34d14745	[mips] Implement getRepRegClassFor in MipsSETargetLowering. This function is called in several places in ScheduleDAGRRList.cpp. llvm-svn: 178393	2013-03-30 01:12:05 +00:00
Akira Hatanaka	cd77e15cfb	[mips] Fix MipsSEInstrInfo::copyPhysReg, loadRegFromStack and storeRegToStack to handle accumulator registers. llvm-svn: 178392	2013-03-30 01:08:05 +00:00
Akira Hatanaka	3b70145184	[mips] Expand pseudo load, store and copy instructions right before callee-saved scan. The code makes use of the register scavenger's capability to spill multiple registers. llvm-svn: 178391	2013-03-30 01:04:11 +00:00
Akira Hatanaka	c8d85025a0	[mips] Define pseudo instructions for spilling and copying accumulator registers. llvm-svn: 178390	2013-03-30 00:54:52 +00:00
Jyotsna Verma	add82b3c75	Hexagon: Add emitFrameIndexDebugValue function to emit debug information. llvm-svn: 178368	2013-03-29 21:09:53 +00:00
Hal Finkel	f8ac57e289	Implement FRINT lowering on PPC using frin Like nearbyint, rint can be implemented on PPC using the frin instruction. The complication comes from the fact that rint needs to set the FE_INEXACT flag when the result does not equal the input value (and frin does not do that). As a result, we use a custom inserter which, after the rounding, compares the rounded value with the original, and if they differ, explicitly sets the XX bit in the FPSCR register (which corresponds to FE_INEXACT). Once LLVM has better modeling of the floating-point environment we should be able to (often) eliminate this extra complexity. llvm-svn: 178362	2013-03-29 19:41:55 +00:00
Akira Hatanaka	7b8b9b9abf	[mips] Define a function which returns the GPR register class. llvm-svn: 178359	2013-03-29 19:17:42 +00:00
Benjamin Kramer	70671b9937	Remove the old CodePlacementOpt pass. It was superseded by MachineBlockPlacement and disabled by default since LLVM 3.1. llvm-svn: 178349	2013-03-29 17:14:24 +00:00
Jyotsna Verma	26226cea4b	Hexagon: Disable DwarfUsesInlineInfoSection flag. llvm-svn: 178345	2013-03-29 15:46:12 +00:00
Hal Finkel	c20a08d25b	Add PPC FP rounding instructions fri[mnpz] These instructions are available on the P5x (and later) and on the A2. They implement the standard floating-point rounding operations (floor, trunc, etc.). One caveat: frin (round to nearest) does not implement "ties to even", and so is only enabled in fast-math mode. llvm-svn: 178337	2013-03-29 08:57:48 +00:00
Akira Hatanaka	f05e9ad59f	[mips] Change type of accumulator registers to Untyped. Add two more accumulator register classes for Mips64 and DSP-ASE. No functionality changes. llvm-svn: 178328	2013-03-29 03:27:21 +00:00
Akira Hatanaka	465faccafa	[mips] Define overloaded versions of storeRegToStack and loadRegFromStack. No functionality changes. llvm-svn: 178327	2013-03-29 02:14:12 +00:00
Akira Hatanaka	11184e4c8c	[mips] Add parameter Alignment to MipsFrameLowering's constructor. No functionality changes. llvm-svn: 178326	2013-03-29 01:51:04 +00:00
Jack Carter	311246c6d5	[Mips Assembler] Add support for OR macro with immediate operand Mips assembler supports macros that allow the OR instruction to have an immediate parameter. This patch adds an instruction alias that converts this macro into a Mips ORI instruction. Contributor: Vladimir Medic llvm-svn: 178316	2013-03-28 23:45:13 +00:00
Michael Liao	a486a11dcf	Add support of RDSEED defined in AVX2 extension llvm-svn: 178314	2013-03-28 23:41:26 +00:00
Michael Liao	5fff5c7b26	Enhance boolean simplification to handle 16-/64-bit RDRAND - RDRAND always clears the destination value when a random value is not available (i.e. CF == 0). This value is truncated or zero-extended as the false boolean value to be returned. Boolean simplification needs to skip this 'zext' or 'trunc' node. llvm-svn: 178312	2013-03-28 23:38:52 +00:00
Michael Liao	96b42608ab	Skip moving call address loading into callseq when targets prefer register indirect call. To enable a load of a call address to be folded with that call, this load is moved from outside of callseq into callseq. Such a moving adds a non-glued node (that load) into a glued sequence. This non-glue load is only removed when DAG selection folds them into a memory form call instruction. When such instruction selection is disabled, it breaks DAG schedule. To prevent that, such moving is disabled when target favors register indirect call. Previous workaround disabling CALL32m/CALL64m insn selection is removed. llvm-svn: 178308	2013-03-28 23:13:21 +00:00
Jack Carter	e1d85d55e6	[Mips Assembler] Add alias definitions for jal Mips assembler allows the following to be used as aliased instructions: jal $rs for jalr $rs jal $rd,$rd for jalr $rd,$rs This patch provides alias definitions in td files and test cases to show the usage. Contributor: Vladimir Medic llvm-svn: 178304	2013-03-28 23:02:21 +00:00
Nadav Rotem	ff8c45529c	Add the X86 FMAs to the scheduling model. llvm-svn: 178303	2013-03-28 22:54:45 +00:00
Nadav Rotem	e7b6a8aa8c	Add the Haswell machine model. llvm-svn: 178301	2013-03-28 22:34:46 +00:00
Nadav Rotem	a20ec3164e	Remove the unused port from the SandyBridge machine model llvm-svn: 178300	2013-03-28 22:32:41 +00:00
Michael Liao	c93fe7f8b2	Add ADX CPUID detection llvm-svn: 178299	2013-03-28 22:29:53 +00:00
Eric Christopher	6c75232cf0	These two are default in the constructor for MCAsmInfo. llvm-svn: 178293	2013-03-28 21:37:18 +00:00
Timur Iskhodzhanov	a2fd5fdd7a	Make Win32 put the SRet address into EAX, fixes PR15556 llvm-svn: 178291	2013-03-28 21:30:04 +00:00
Hal Finkel	22e41c411e	Only enable 64-bit bswap DAG combines for PPC64 Compiling in 32-bit mode on a P7 would assert after 64-bit DAG combines were added for bswap with load/store. This is because these combines are really only valid in 64-bit mode, regardless of the CPU (and this was not being checked). llvm-svn: 178286	2013-03-28 20:23:46 +00:00
Jyotsna Verma	a46059b74d	Hexagon: Replace switch-case in isDotNewInst with TSFlags. llvm-svn: 178281	2013-03-28 19:44:04 +00:00
Hal Finkel	93492fa696	Fix bad indentation in r178276 Thanks to Bill Schmidt for pointing this out! llvm-svn: 178280	2013-03-28 19:43:12 +00:00
Jyotsna Verma	27c06f3322	Hexagon: Enable SupportDebugInfomation and DwarfInSection flags. llvm-svn: 178279	2013-03-28 19:34:49 +00:00
Bill Schmidt	74b2e72ab3	Use direct types in most PowerPC Altivec instructions and patterns. This follows up Ulrich Weigand's work in PPCInstrInfo.td and PPCInstr64Bit.td by doing the corresponding work for most of the Altivec patterns. I have not been able to do anything for the following classes of instructions: (1) Vector logicals. These don't have corresponding intrinsics and don't have a single obvious vector type. So far as I can tell I need to leave these as VRRC. Affected instructions are: VAND, VANDC, VNOR, VOR, VXOR, V_SET0. (2) Instructions that make use of vector shuffle. The selection code promotes all shuffles to v16i8, so any pattern that matches on a shuffle is constrained. I haven't found any way to make the patterns match on their natural types, so I plan to leave these as VRRC. Affected instructions are: VMRG*, VSPLTB, VSPLTH, VSPLTW, VPKUHUM, VPKUWUM. No change in behavior is anticipated. llvm-svn: 178277	2013-03-28 19:27:24 +00:00
Hal Finkel	31d2956510	Add the PPC64 ldbrx/stdbrx instructions These are 64-bit load/store with byte-swap, and available on the P7 and the A2. Like the similar instructions for 16- and 32-bit words, these are matched in the target DAG-combine phase against load/store-bswap pairs. llvm-svn: 178276	2013-03-28 19:25:55 +00:00
Gordon Keiser	772cf466da	Fix issue with disassembler decoding CBZ/CBNZ immediates as negatives when the upper bit is set. They should always be zero-extended, not sign extended. Added test case. llvm-svn: 178275	2013-03-28 19:22:28 +00:00
Gordon Keiser	fb1ce5fa25	Testing commit access to llvm. Remove two lines of whitespace from the Thumb README. llvm-svn: 178256	2013-03-28 18:26:15 +00:00
Jyotsna Verma	93e740485f	Hexagon: Use multiclass for gp-relative instructions. Remove noV4T gp-relative instructions. llvm-svn: 178246	2013-03-28 16:25:57 +00:00
Tim Northover	08bb3ce383	AArch64: implement GICv3 system registers llvm-svn: 178236	2013-03-28 14:30:46 +00:00
Hal Finkel	a4d074863a	Add the PPC64 popcntd instruction PPC ISA 2.06 (P7, A2, etc.) has a popcntd instruction. Add this instruction and tell TTI about it so that popcount-loop recognition will know about it. llvm-svn: 178233	2013-03-28 13:29:47 +00:00
Hal Finkel	035b4825ce	Cleanup PPC CR-spill kill flags and 32- vs. 64-bit instructions There were a few places where kill flags were not being set correctly, and where 32-bit instruction variants were being used with 64-bit registers. After r178180, this code was being triggered causing llc to assert. llvm-svn: 178220	2013-03-28 03:38:16 +00:00
Hal Finkel	25aab01058	Fix typo in PPCInstr64Bit llvm-svn: 178219	2013-03-28 03:38:08 +00:00
Preston Gurd	d6be4bf87f	This patch is a follow up to r178171, which uses the register form of call in preference to memory indirect on Atom. In this case, the patch applies the optimization to the code for reloading spilled registers. The patch also includes changes to sibcall.ll and movgs.ll, which were failing on the Atom buildbot after the first patch was applied. This patch by Sriram Murali. llvm-svn: 178193	2013-03-27 23:16:18 +00:00
Chad Rosier	1530ba5e73	[ms-inline asm] Add support of imm displacement before bracketed memory expression. Specifically, this syntax: ImmDisp [ BaseReg + Scale*IndexReg + Disp ] We don't currently support: ImmDisp [ Symbol ] rdar://13518671 llvm-svn: 178186	2013-03-27 21:49:56 +00:00
Hal Finkel	37714b8a48	Resynchronize isLoadFromStackSlot with LoadRegFromStackSlot (and stores) in PPCInstrInfo These functions should have the same list of load/store instructions. Now that all load/store forms have been normalized (to single instructions or pseudos) they can be resynchronized. Found by inspection, although hopefully this will improve optimization. I've also added some comments. llvm-svn: 178180	2013-03-27 21:21:15 +00:00
Preston Gurd	663e6f9558	For the current Atom processor, the fastest way to handle a call indirect through a memory address is to load the memory address into a register and then call indirect through the register. This patch implements this improvement by modifying SelectionDAG to force a function address which is a memory reference to be loaded into a virtual register. Patch by Sriram Murali. llvm-svn: 178171	2013-03-27 19:14:02 +00:00
Hal Finkel	1996f3d87f	Fix typo (common to both X86 and PPC) Thanks to Bill Schmidt for pointing this out during code review! llvm-svn: 178170	2013-03-27 19:10:42 +00:00
Hal Finkel	5791f51449	Remove more dead LR-as-GPR PPC code I had removed similar code a few days ago, but somehow missed this. llvm-svn: 178169	2013-03-27 19:10:40 +00:00
Hal Finkel	f1af79ab45	Remove "gpr0 allocation" from the PPC README TODO list As Chris pointed out, post r178123, this is now done! llvm-svn: 178165	2013-03-27 18:39:52 +00:00
Christian Konig	08f5929942	R600/SI: add SETO/SETUO patterns 6 more piglit tests. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> llvm-svn: 178145	2013-03-27 15:27:31 +00:00
Hal Finkel	687143557d	Print PPC ZERO as 0 (not r0) even on Darwin It seems that the Darwin PPC assembler requires r0 to be written as 0 when it means 0 (at least in lwarx/stwcx.). Fixes PR15605. llvm-svn: 178142	2013-03-27 13:20:52 +00:00
Tim Northover	d3490dc06a	Switch to LLVM support function abs64 to keep VS2008 happy. llvm-svn: 178141	2013-03-27 13:15:08 +00:00
Silviu Baranga	dc45336d09	Enabling the generation of dependency breakers for partial updates on Cortex-A15. Also fixing a small bug in getting the update clearance for VLD1LNd32. llvm-svn: 178134	2013-03-27 12:38:44 +00:00
Jyotsna Verma	653d8839c8	Hexagon: Disable optimizations at O0. llvm-svn: 178132	2013-03-27 11:14:24 +00:00
Christian Konig	3c14580acb	R600/SI: add commuting of rev instructions Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Tested-by: Michel Dänzer <michel.daenzer@amd.com> llvm-svn: 178127	2013-03-27 09:12:59 +00:00
Christian Konig	70a5032c1b	R600/SI: add mulhu/mulhs patterns Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Tested-by: Michel Dänzer <michel.daenzer@amd.com> llvm-svn: 178126	2013-03-27 09:12:51 +00:00
Christian Konig	20a7e6b764	R600/SI: add srl/sha patterns for SI Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Tested-by: Michel Dänzer <michel.daenzer@amd.com> llvm-svn: 178125	2013-03-27 09:12:44 +00:00
Hal Finkel	0f77861d9f	Allocate r0 on PPC The R0 register can now be allocated because instructions that cannot use R0 as a GPR have been appropriately marked. llvm-svn: 178123	2013-03-27 06:52:27 +00:00
Hal Finkel	573fc28d64	Use the PPC no-r0 class on the TOC LD pseudos The register parameter in these instructions becomes the base register in an r+i ld instruction (and, thus, cannot be r0). This is not yet testable because we don't yet allocate r0 (and even then any test would be very fragile). llvm-svn: 178121	2013-03-27 06:36:55 +00:00
Hal Finkel	3fa362a51a	Apply the no-r0 register class to the PPC SELECT_CC_I[4\|8] pseudos Either operand of these pseudo instructions can be transformed into the first operand of an isel instruction (and this operand cannot be r0). This is not yet testable because we don't yet allocate r0 (and even when we do, any test would be very fragile). llvm-svn: 178119	2013-03-27 05:57:58 +00:00
Hal Finkel	42a312b261	Apply the no-r0 class to PPC TOC ADDI[S] pseudo instructions Like the addi/addis instructions themselves, these pseudo instructions also cannot have r0 as their register parameter (because it will be interpreted as the value 0). This is not yet testable because we don't yet allocate r0 (and even when we do, any regression test would be very fragile because it would depend on the register allocator heuristics). llvm-svn: 178118	2013-03-27 05:57:56 +00:00
Bill Schmidt	a1b72d0f6a	Remove the link register from the GPR classes on PowerPC. Some implementation detail in the forgotten past required the link register to be placed in the GPRC and G8RC register classes. This is just wrong on the face of it, and causes several extra intersection register classes to be generated. I found this was having evil effects on instruction scheduling, by causing the wrong register class to be consulted for register pressure decisions. No code generation changes are expected, other than some minor changes in instruction order. Seven tests in the test bucket required minor tweaks to adjust to the new normal. llvm-svn: 178114	2013-03-27 02:40:14 +00:00
Hal Finkel	a7b0630ba8	Don't spill PPC VRSAVE on non-Darwin (even in SjLj) As Bill Schmidt pointed out to me, only on Darwin do we need to spill/restore VRSAVE in the SjLj code. For non-Darwin, don't spill/restore VRSAVE (and I've added some asserts to make sure that we're not). As it turns out, we're not currently handling the Darwin case correctly (I've added a FIXME in the test case). I've tried adding various implied register definitions/uses to force the spill without success, so I'll need to address this later. llvm-svn: 178096	2013-03-27 00:02:20 +00:00
Michael Liao	03f9ad0e67	Add XTEST codegen support llvm-svn: 178083	2013-03-26 22:47:01 +00:00
Michael Liao	e344ec919f	Add HLE target feature llvm-svn: 178082	2013-03-26 22:46:02 +00:00
Jakob Stoklund Olesen	1ac7e662d4	Enable SandyBridgeModel for all modern Intel P6 descendants. All Intel CPUs since Yonah look a lot alike, at least at the granularity of the scheduling models. We can add more accurate models for processors that aren't Sandy Bridge if required. Haswell will probably need its own. The Atom processor and anything based on NetBurst is completely different. So are the non-Intel chips. llvm-svn: 178080	2013-03-26 22:19:12 +00:00
Hal Finkel	567fa62ddc	Restore real bit lengths on PPC register numbers As suggested by Bill Schmidt (in reviewing r178067), use the real register number bit lengths (which is self-documenting, and prevents using illegal numbers), and set only the relevant bits in HWEncoding (which defaults to 0). No functionality change intended. llvm-svn: 178077	2013-03-26 21:50:26 +00:00
Hal Finkel	feea653974	PPC: Use HWEncoding and TRI->getEncodingValue As pointed out by Jakob, we don't need to maintain a separate register-numbering table. Instead we should let TableGen generate the table for us from the information (already present) in PPCRegisterInfo.td. TRI->getEncodingValue is now used to access register-encoding values. No functionality change intended. llvm-svn: 178067	2013-03-26 20:08:20 +00:00

23949 Commits