llvm-project

Commit Graph

Author	SHA1	Message	Date
Jim Grosbach	e423e865fe	X86: Fix encoding of 'movd %xmm0, %rax' The assembly string for the VMOVPQIto64rr instruction incorrectly lacked the 'v' prefix, resulting in mis-assembly of the vanilla movd instruction. llvm-svn: 162963	2012-08-31 00:30:30 +00:00
Chad Rosier	98cfa1044f	With the fix in r162954/162955 every cvt function returns true. Thus, have the ConvertToMCInst() return void, rather than a bool. Update all the cvt functions as well. llvm-svn: 162961	2012-08-31 00:03:31 +00:00
Chad Rosier	db482ef7a7	Fix for r162954. Return the Error. llvm-svn: 162955	2012-08-30 23:22:05 +00:00
Chad Rosier	8513ffbb83	Move a check to the validateInstruction() function where it more properly belongs. llvm-svn: 162954	2012-08-30 23:20:38 +00:00
Chad Rosier	5eec49fe09	Typo. llvm-svn: 162952	2012-08-30 23:00:00 +00:00
Michael Liao	bbd10792c2	Introduce 'UseSSEx' to force SSE legacy encoding - Add 'UseSSEx' to prevent SSE legacy insns from being selected when AVX is enabled. Because inter-mixing SSE and AVX instructions carries a penalty, we need to prevent SSE legacy insns from being generated unless explicitly requested through intrinsics. For patterns supported by both SSE and AVX, we have so far relied on AddedComplexity or position in the td file to make the AVX insn be tried first. This is error-prone and introduces bugs accidentally. 'UseSSEx' is disabled when AVX is turned on. For SSE insns inherited by AVX, we need this predicate to force VEX encoding or SSE legacy encoding only. For insns not inherited by AVX, we still use the previous predicates, i.e. 'HasSSEx'. So far, these insns fall into the following categories: * SSE insns with MMX operands * SSE insns with GPR/MEM operands only (xFENCE, PREFETCH, CLFLUSH, CRC, etc.) * SSE4A insns. * MMX insns. * x87 insns added by SSE. 2 test cases are modified: - test/CodeGen/X86/fast-isel-x86-64.ll AVX code generation is different from the SSE one. 'vcvtsi2sdq' cannot be selected by fast-isel due to its complicated pattern, so fast-isel falls back to materializing it from the constant pool. - test/CodeGen/X86/widen_load-1.ll AVX code generation is different from the SSE one after fixing SSE/AVX inter-mixing. Exec-domain fixing prefers 'vmovapd' instead of 'vmovaps'. llvm-svn: 162919	2012-08-30 16:54:46 +00:00
NAKAMURA Takumi	ac49029fd9	PPCISelLowering.cpp: Fix r162725. [Tobias von Koch] What's happening here is that the CR6SET/CR6UNSET is breaking the chain of register copies glued to the function call (BL_SVR4 node). The scheduler then moves other instructions in between those and the function call, which isn't good! Right. That's the case where there is no chain of register copies before the call, so InFlag == 0... Attached is a new revision of the patch which should fix this for good. llvm-svn: 162916	2012-08-30 15:52:29 +00:00
NAKAMURA Takumi	8ad54e04d2	PPCISelLowering.cpp: Whitespace. llvm-svn: 162915	2012-08-30 15:52:23 +00:00
Tim Northover	ca9f384ff8	Add support for moving pure S-register to NEON pipeline if desired llvm-svn: 162898	2012-08-30 10:17:45 +00:00
Craig Topper	e39ad7b549	Only perform DAG combine on FMAs of legal types. llvm-svn: 162892	2012-08-30 06:56:15 +00:00
Michael Liao	3c8980646b	Fix PR13727 - The root cause is that target constant materialization in X86 fast-isel creates a PC-rel addressing which may overflow 32-bit range in non-Small code model if .rodata section is allocated too far away from code segment in MCJIT, which uses Large code model so far. - Follow the similar logic to fix non-Small code model in fast-isel by skipping non-Small code model. llvm-svn: 162881	2012-08-30 00:30:16 +00:00
Jakob Stoklund Olesen	cea3e77433	Rename hasVolatileMemoryRef() to hasOrderedMemoryRef(). Ordered memory operations are more constrained than volatile loads and stores because they must be ordered with respect to all other memory operations. llvm-svn: 162861	2012-08-29 21:19:21 +00:00
Hal Finkel	1859d26528	Reserve space for the mandatory traceback fields on PPC64. We need to reserve space for the mandatory traceback fields, though leaving them as zero is appropriate for now. Although the ABI calls for these fields to be filled in fully, no compiler on Linux currently does this, and GDB does not read these fields. GDB uses the first word of zeroes during exception handling to find the end of the function and the size field, allowing it to compute the beginning of the function. DWARF information is used for everything else. We need the extra 8 bytes of pad so the size field is found in the right place. As a comparison, GCC fills in a few of the fields -- language, number of saved registers -- but ignores the rest. IBM's proprietary OSes do make use of the full traceback table facility. Patch by Bill Schmidt. llvm-svn: 162854	2012-08-29 20:22:24 +00:00
Tim Northover	771f160758	Refactor setExecutionDomain to be clearer about what it's doing and more robust. llvm-svn: 162844	2012-08-29 16:36:07 +00:00
Benjamin Kramer	8f5c5ded4e	Make helper function static. llvm-svn: 162843	2012-08-29 16:17:01 +00:00
Benjamin Kramer	8bcc971174	Make MemoryBuiltins aware of TargetLibraryInfo. This disables malloc-specific optimization when -fno-builtin (or -ffreestanding) is specified. This has been a problem for a long time but became more severe with the recent memory builtin improvements. Since the memory builtin functions are used everywhere, this required passing TLI in many places. This means that functions that now have an optional TLI argument, like RecursivelyDeleteTriviallyDeadFunctions, won't remove dead mallocs anymore if the TLI argument is missing. I've updated most passes to do the right thing. Fixes PR13694 and probably others. llvm-svn: 162841	2012-08-29 15:32:21 +00:00
Craig Topper	a999c66292	Convert FMA4 patterns to use target specific nodes instead of intrinsics to align with FMA3. llvm-svn: 162829	2012-08-29 07:18:25 +00:00
Andrew Trick	b57e225742	Cleanup sloppy code. Jakob's review. llvm-svn: 162825	2012-08-29 04:41:37 +00:00
Jush Lu	e87e559e62	[arm-fast-isel] Add support for ARM PIC. llvm-svn: 162823	2012-08-29 02:41:21 +00:00
Andrew Trick	bd0073ddd7	Fix ARM vector copies of overlapping register tuples. I have tested the fix, but have not been successful in generating a robust unit test. This can only be exposed through particular register assignments. llvm-svn: 162821	2012-08-29 01:58:55 +00:00
Andrew Trick	4cc6949a2b	cleanup llvm-svn: 162820	2012-08-29 01:58:52 +00:00
Chad Rosier	3b1336ceb9	Typo. llvm-svn: 162807	2012-08-28 23:57:47 +00:00
Michael Liao	407d659fa5	Add comments on the literal value used. llvm-svn: 162805	2012-08-28 23:42:17 +00:00
Jack Carter	cd6b0e1368	The instruction DEXT may be transformed into DEXTU or DEXTM depending on the size of the extraction and its position in the 64 bit word. This patch allows support of the dext transformations with mips64 direct object output. 0 <= msb < 32 0 <= lsb < 32 0 <= pos < 32 1 <= size <= 32 DINS The field is entirely contained in the right-most word of the doubleword 32 <= msb < 64 0 <= lsb < 32 0 <= pos < 32 2 <= size <= 64 DINSM The field straddles the words of the doubleword 32 <= msb < 64 32 <= lsb < 64 32 <= pos < 64 1 <= size <= 32 DINSU The field is entirely contained in the left-most word of the doubleword llvm-svn: 162782	2012-08-28 20:07:41 +00:00
Michael Liao	710e1a594b	Explicitly update the number of nodes to be traversed llvm-svn: 162780	2012-08-28 19:20:29 +00:00
Jack Carter	c20a21b855	Some instructions are passed to the assembler to be transformed to the final instruction variant. An example would be dsrll which is transformed into dsll32 if the shift value is greater than 32. For direct object output we need to do this transformation in the codegen. If the instruction was inside branch delay slot, it was being missed. This patch corrects this oversight. llvm-svn: 162779	2012-08-28 19:07:39 +00:00
Roman Divacky	8c4b6a307e	Emit word of zeroes after the last instruction as a start of the mandatory traceback table on PowerPC64. This helps gdb handle exceptions. The other mandatory fields are ignored by gdb and harder to implement so just add there a FIXME. Patch by Bill Schmidt. PR13641. llvm-svn: 162778	2012-08-28 19:06:55 +00:00
Akira Hatanaka	206cefe66c	Follow-up patch to r162731. Fix a couple of bugs in mips' long branch pass. This patch was supposed to be committed along with r162731, so I don't have a new test case. llvm-svn: 162777	2012-08-28 18:58:57 +00:00
Hal Finkel	742b535e40	Add PPC Freescale e500mc and e5500 subtargets. Add subtargets for Freescale e500mc (32-bit) and e5500 (64-bit) to the PowerPC backend. Patch by Tobias von Koch. llvm-svn: 162764	2012-08-28 16:12:39 +00:00
Bill Wendling	cc56718038	The commutative flag is already correctly set within the multiclass. If we set it here, then a 'register-memory' version would wrongly get the commutative flag. <rdar://problem/12180135> llvm-svn: 162741	2012-08-28 07:36:46 +00:00
Craig Topper	72f51c3986	Convert V_SETALLONES/AVX_SETALLONES/AVX2_SETALLONES to Post-RA pseudos. llvm-svn: 162740	2012-08-28 07:30:47 +00:00
Craig Topper	bd509eea4a	Merge AVX_SET0PSY/AVX_SET0PDY/AVX2_SET0 into a single post-RA pseudo. llvm-svn: 162738	2012-08-28 07:05:28 +00:00
Michael Liao	b7d85b6328	Fix PR12312 - Add a target-specific DAG optimization to recognize a pattern PTEST-able. Such a pattern is a OR'd tree with X86ISD::OR as the root node. When X86ISD::OR node has only its flag result being used as a boolean value and all its leaves are extracted from the same vector, it could be folded into an X86ISD::PTEST node. llvm-svn: 162735	2012-08-28 03:34:40 +00:00
Jakob Stoklund Olesen	b3de7b1790	Revert r162713: "Add ATOMIC_LDR* pseudo-instructions to model atomic_load on ARM." This wasn't the right way to enforce ordering of atomics. We are already setting the isVolatile bit on memory operands of atomic operations which is good enough to enforce the correct ordering. llvm-svn: 162732	2012-08-28 03:11:27 +00:00
Akira Hatanaka	b5af7121b1	Fix mips' long branch pass. Instructions emitted to compute branch offsets now use immediate operands instead of symbolic labels. This change was needed because there were problems when R_MIPS_HI16/LO16 relocations were used to make shared objects. llvm-svn: 162731	2012-08-28 03:03:05 +00:00
Hal Finkel	679c73cb33	Split several PPC instruction classes. Slight reorganisation of PPC instruction classes for scheduling. No functionality change for existing subtargets. - Clearly separate load/store-with-update instructions from regular loads and stores. - Split IntRotateD -> IntRotateD and IntRotateDI - Split out fsub and fadd from FPGeneral -> FPAddSub - Update existing itineraries Patch by Tobias von Koch. llvm-svn: 162729	2012-08-28 02:49:14 +00:00
Hal Finkel	686f2ee226	Allow remat of LI on PPC. Allow load-immediates to be rematerialised in the register coalescer for PPC. This makes test/CodeGen/PowerPC/big-endian-formal-args.ll fail, because it relies on a register move getting emitted. The immediate load is equivalent, so change this test case. Patch by Tobias von Koch. llvm-svn: 162727	2012-08-28 02:10:33 +00:00
Hal Finkel	5ab378037f	Eliminate redundant CR moves on PPC32. The 32-bit ABI requires CR bit 6 to be set if the call has fp arguments and unset if it doesn't. The solution up to now was to insert a MachineNode to set/unset the CR bit, which produces a CR vreg. This vreg was then copied into CR bit 6. When the register allocator saw a bunch of these in the same function, it allocated the set/unset CR bit in some random CR register (1 extra instruction) and then emitted CR moves before every vararg function call, rather than just setting and unsetting CR bit 6 directly before every vararg function call. This patch instead inserts a PPCcrset/PPCcrunset instruction which are then matched by a dedicated instruction pattern. Patch by Tobias von Koch. llvm-svn: 162725	2012-08-28 02:10:27 +00:00
Hal Finkel	e39526a789	Optimize zext on PPC64. The zeroextend IR instruction is lowered to an 'and' node with an immediate mask operand, which in turn gets legalised to a sequence of ori's & ands. This can be done more efficiently using the rldicl instruction. Patch by Tobias von Koch. llvm-svn: 162724	2012-08-28 02:10:15 +00:00
Jakob Stoklund Olesen	89d6b29d16	More missing mayLoad flags on AVX multiclasses. llvm-svn: 162714	2012-08-28 00:02:01 +00:00
Jakob Stoklund Olesen	b24cb8c541	Add ATOMIC_LDR* pseudo-instructions to model atomic_load on ARM. It is not safe to use normal LDR instructions because they may be reordered by the scheduler. The ATOMIC_LDR pseudos have a mayStore flag that prevents reordering. Atomic loads are also prevented from participating in rematerialization and load folding. llvm-svn: 162713	2012-08-27 23:58:52 +00:00
Bill Wendling	988a47d7e5	Make sure we add the predicate after all of the registers are added. <rdar://problem/12183003> llvm-svn: 162703	2012-08-27 22:12:44 +00:00
Craig Topper	a737ef8964	Remove MMX shift intrinsic handling code that also exists in SelectionDAGBuilder. llvm-svn: 162661	2012-08-27 08:08:30 +00:00
Craig Topper	5af2fed5f2	Don't allow vextractf128 to be folded with unaligned stores. We don't fold unaligned loads so shouldn't fold unaligned stores as it can cause an alignment fault to occur. llvm-svn: 162658	2012-08-27 07:19:59 +00:00
Craig Topper	6d44554cd4	Fold some patterns into instruction definitions so tablegen can infer flags, removing the need for an explicit 'neverHasSideEffects = 1' llvm-svn: 162656	2012-08-27 07:04:50 +00:00
Craig Topper	f7828f91ee	Add HasAVX1Only predicate and use it for patterns that have an AVX1 instruction and an AVX2 instruction rather than relying on AddedComplexity. llvm-svn: 162654	2012-08-27 06:08:57 +00:00
Richard Smith	228e6d4cf3	Fix integer undefined behavior due to signed left shift overflow in LLVM. Reviewed offline by chandlerc. llvm-svn: 162623	2012-08-24 23:29:28 +00:00
Jakob Stoklund Olesen	3d91b43ad2	Add missing mayLoad flags to a large class of AVX *_Int instructions. llvm-svn: 162622	2012-08-24 23:29:07 +00:00
Jakob Stoklund Olesen	74352494a6	Missed tLEApcrelJT. ARMConstantIslandPass expects this instruction to stay in the same basic block as the jump table branch. llvm-svn: 162615	2012-08-24 22:46:55 +00:00
Jakob Stoklund Olesen	47ac1a8ec0	Explicitly mark LEApcrel pseudos with hasSideEffects. It's not clear that they should be marked as such, but tbb formation fails if t2LEApcrelJT is hoisted out of a loop. This doesn't change the flags on these instructions, UnmodeledSideEffects was already inferred from the missing pattern. llvm-svn: 162603	2012-08-24 21:44:11 +00:00
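
The r162782 message above (cd6b0e1368) quotes msb/lsb/pos/size ranges for the DINS, DINSM and DINSU variants. As a rough illustration only, the selection those ranges imply might be sketched as follows. This is not the code from that commit: the function name select_dins_variant is hypothetical, and the sketch assumes lsb = pos and msb = pos + size - 1.

```python
def select_dins_variant(pos: int, size: int) -> str:
    """Pick a variant name from the ranges quoted in the r162782 message.

    Hypothetical helper for illustration; assumes lsb = pos and
    msb = pos + size - 1 within a 64-bit doubleword.
    """
    if not (0 <= pos < 64 and size >= 1 and pos + size <= 64):
        raise ValueError("field must lie inside the 64-bit doubleword")

    msb = pos + size - 1
    if msb < 32:
        # Field is entirely contained in the right-most word.
        return "DINS"
    if pos < 32:
        # Field starts in the right-most word and ends in the left-most one.
        return "DINSM"
    # Field is entirely contained in the left-most word.
    return "DINSU"
```

For example, select_dins_variant(31, 2) falls in the DINSM range because the field straddles the two words, while select_dins_variant(32, 32) falls in the DINSU range.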

22007 Commits