llvm-project

Commit Graph

Author	SHA1	Message	Date
Sanjay Patel	2366168bad	80-cols; NFC llvm-svn: 244755	2015-08-12 15:12:25 +00:00
Sanjay Patel	dc87d1440c	fix typo; NFC llvm-svn: 244753	2015-08-12 15:09:09 +00:00
Zoran Jovanovic	366783e14c	[mips][microMIPS] Create microMIPS64r6 subtarget and implement DALIGN, DAUI, DAHI, DATI, DEXT, DEXTM and DEXTU instructions Differential Revision: http://reviews.llvm.org/D10923 llvm-svn: 244744	2015-08-12 12:45:16 +00:00
Michael Kuperstein	fe0d9bb6eb	[X86] Disable mul -> shl + lea combine when compiling for minsize Differential Revision: http://reviews.llvm.org/D11904 llvm-svn: 244740	2015-08-12 11:27:26 +00:00
Michael Kuperstein	bc7f99a3ab	[X86] Allow x86 call frame optimization to fold more loads into pushes This abstracts away the test for "when can we fold across a MachineInstruction" into the MI interface, and changes call-frame optimization to use the same test the peephole optimizer uses. Differential Revision: http://reviews.llvm.org/D11945 llvm-svn: 244729	2015-08-12 10:14:58 +00:00
Matt Arsenault	c574686529	AMDGPU: Fix assert on dbg_value instructions llvm-svn: 244728	2015-08-12 09:04:44 +00:00
Simon Pilgrim	8c049d5c03	[InstCombine] Move SSE/AVX vector blend folding to instcombiner As discussed in D11886, this patch moves the SSE/AVX vector blend folding to instcombiner from PerformINTRINSIC_WO_CHAINCombine (which allows us to remove this completely). InstCombiner already had partial support for this, I just had to add support for zero (ConstantAggregateZero) masks and also the case where both selection inputs were the same (allowing us to ignore the mask). I also moved all the relevant combine tests into InstCombine/blend_x86.ll Differential Revision: http://reviews.llvm.org/D11934 llvm-svn: 244723	2015-08-12 08:08:56 +00:00
Saleem Abdulrasool	9e5f2a96f1	X86: hoist a condition into a variable (NFC) The same value is used multiple times through the function. Hoist the condition into a variable. This should fix a silly static analysis warning where the conditions flip around. No functional change intended. llvm-svn: 244713	2015-08-12 02:01:36 +00:00
Sanjay Patel	260b6d36f4	[x86] enable machine combiner reassociations for 256-bit vector FP mul/add llvm-svn: 244705	2015-08-12 00:29:10 +00:00
Alex Lorenz	5659a2f961	PseudoSourceValue: Transform the mips subclass to target independent subclasses This commit transforms the mips-specific 'MipsCallEntry' subclass of the 'PseudoSourceValue' class into two, target-independent subclasses named 'GlobalValuePseudoSourceValue' and 'ExternalSymbolPseudoSourceValue'. This change makes it easier to serialize the pseudo source values by removing target-specific pseudo source values. Reviewers: Akira Hatanaka llvm-svn: 244698	2015-08-11 23:23:17 +00:00
Alex Lorenz	e40c8a2b26	PseudoSourceValue: Replace global manager with a manager in a machine function. This commit removes the global manager variable which is responsible for storing and allocating pseudo source values and instead it introduces a new manager class named 'PseudoSourceValueManager'. Machine functions now own an instance of the pseudo source value manager class. This commit also modifies the 'get...' methods in the 'MachinePointerInfo' class to construct pseudo source values using the instance of the pseudo source value manager object from the machine function. This commit updates calls to the 'get...' methods from the 'MachinePointerInfo' class in a lot of different files because those calls now need to pass in a reference to a machine function to those methods. This change will make it easier to serialize pseudo source values as it will enable me to transform the mips specific MipsCallEntry PseudoSourceValue subclass into two target independent subclasses. Reviewers: Akira Hatanaka llvm-svn: 244693	2015-08-11 23:09:45 +00:00
Alex Lorenz	c49e4fe9cc	PseudoSourceValue: Introduce a 'PSVKind' enumerator. This commit introduces a new enumerator named 'PSVKind' in the 'PseudoSourceValue' class. This enumerator is now used to distinguish between the various kinds of pseudo source values. This change is done in preparation for the changes to the pseudo source value object management and to the PseudoSourceValue's class hierarchy - the next two PseudoSourceValue commits will get rid of the global variable that manages the pseudo source values and the mips specific MipsCallEntry subclass. Reviewers: Akira Hatanaka llvm-svn: 244687	2015-08-11 22:32:00 +00:00
Mark Heffernan	438ffe5eac	Use 32-bit divides instead of 64-bit divides where possible. For NVPTX, try to use 32-bit division instead of 64-bit division when the dividend and divisor fit in 32 bits. This speeds up some internal benchmarks significantly. The underlying reason is that many index computations are carried out in 64-bits but never actually exceed the capacity of a 32-bit word. llvm-svn: 244684	2015-08-11 22:16:34 +00:00
JF Bastien	da06bce8b5	WebAssembly: implement comparison. Some of the FP comparisons (ueq, one, ult, ule, ugt, uge) are currently broken, I'll fix them in a follow-up. Reviewers: sunfish Subscribers: llvm-commits, jfb Differential Revision: http://reviews.llvm.org/D11924 llvm-svn: 244665	2015-08-11 21:02:46 +00:00
Sanjay Patel	2c6a01570d	[x86] enable machine combiner reassociations for 128-bit vector single/double multiplies llvm-svn: 244657	2015-08-11 20:19:23 +00:00
JF Bastien	480c840896	WebAssembly: implement WebAssemblyTargetLowering::getTargetNodeName Summary: Implementation is the same as in AArch64. Subscribers: aemerson, jfb, llvm-commits, sunfish Differential Revision: http://reviews.llvm.org/D11956 llvm-svn: 244655	2015-08-11 20:13:18 +00:00
Rafael Espindola	3adc7ce9f1	Use llvm::make_unique to fix the MSVC build. llvm-svn: 244641	2015-08-11 18:11:17 +00:00
Michael Kuperstein	243c073a2e	[X86] Allow merging of immediates within a basic block for code size savings First step in preventing immediates that occur more than once within a single basic block from being pulled into their users, in order to prevent unnecessarily large instruction encoding. Currently enabled only when optimizing for size. Patch by: zia.ansari@intel.com Differential Revision: http://reviews.llvm.org/D11363 llvm-svn: 244601	2015-08-11 14:10:58 +00:00
James Molloy	b7b2a1e9b4	[AArch64] Match fminnum/fmaxnum for vector fminnm/fmaxnm instead of an intrinsic. Lower Intrinsic::aarch64_neon_fmin/fmax to fminnum/fmaxnum and match that instead. Minimal functional change: - Extra tests added because coverage of scalar fminnm/fmaxnm instructions was nonexistent. - f16 test updated because now that we actually generate scalar fminnm/fmaxnm, we no longer need to bail out to a libcall! llvm-svn: 244595	2015-08-11 12:06:37 +00:00
James Molloy	edf38f0cb0	[AArch64] Replace the custom AArch64ISD::FMIN/MAX nodes with ISD::FMINNAN/MAXNAN NFCI. This just removes custom ISDNodes that are no longer needed. llvm-svn: 244594	2015-08-11 12:06:33 +00:00
James Molloy	d616c642bb	[ARM] Match fminnan/fmaxnan for vector vmin/vmax instead of an intrinsic Lower Intrinsic::arm_neon_vmins/vmaxs to fminnan/fmaxnan and match that instead. This is important because SDAG will soon be able to select FMINNAN itself, so we need a unified lowering path for intrinsics and SDAG. NFCI. llvm-svn: 244593	2015-08-11 12:06:28 +00:00
James Molloy	ee868b2a3e	[ARM] Match fminnum/fmaxnum for vector vminnm/vmaxnm instead of an intrinsic Lower the intrinsic to a FMINNUM/FMAXNUM node and select that instead. This is important because soon SDAG will be able to select FMINNUM/FMAXNUM itself, so we need an integrated lowering path between SDAG and intrinsics. NFCI. llvm-svn: 244592	2015-08-11 12:06:25 +00:00
James Molloy	ea3a687a33	[ARM] Replace ARMISD::VMINNM/VMAXNM with ISD::FMINNUM/FMAXNUM NFCI. This replaces another custom ISDNode with a generic equivalent. llvm-svn: 244591	2015-08-11 12:06:22 +00:00
James Molloy	db8ee4b5a9	[ARM] Replace ARMISD::FMIN/FMAX with the shiny new ISD::FMINNAN/FMAXNAN. NFCI. This removes a custom ISDNode. llvm-svn: 244590	2015-08-11 12:06:15 +00:00
Marina Yatsina	8c997af103	[X86] Add SAL mnemonics for Intel syntax SAL and SHL instructions perform the same operation Differential Revision: http://reviews.llvm.org/D11882 llvm-svn: 244588	2015-08-11 12:05:06 +00:00
Marina Yatsina	d353c45eaf	[X86] Fix REPE, REPZ, REPNZ for intel syntax REPE, REPZ, REPNZ, REPNE should have mnemonics for Intel syntax as well. Currently using these instructions causes compilation errors for Intel syntax. Differential Revision: http://reviews.llvm.org/D11794 llvm-svn: 244584	2015-08-11 11:28:10 +00:00
Marina Yatsina	f6bc15d763	[X86] Fix imul alias for intel syntax The "imul reg, imm" alias is not defined for intel syntax. In intel syntax there is no w/l/q suffix for the imul instruction. Differential Revision: http://reviews.llvm.org/D11887 llvm-svn: 244582	2015-08-11 10:43:04 +00:00
Vasileios Kalintiris	1c78ca6a09	[mips] Remap move as or. Summary: This patch remaps the assembly idiom 'move' to 'or' instead of 'daddu' or 'addu'. The use of addu/daddu instead of or as move was highlighted as a performance issue during the analysis of a recent 64bit design. Originally move was encoded as 'or' by binutils but was changed for the r10k cpu family due to their pipeline which had 2 arithmetic units and a single logical unit, and so could issue multiple (d)addu based moves at the same time but only 1 logical move. This patch preserves the disassembly behaviour so that disassembling a old style (d)addu move still appears as move, but assembling move always gives an or Patch by Simon Dardis. Reviewers: vkalintiris Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D11796 llvm-svn: 244579	2015-08-11 08:56:25 +00:00
Michael Kuperstein	7337ee23d8	[X86] When optimizing for minsize, use POP for small post-call stack clean-up When optimizing for size, replace "addl $4, %esp" and "addl $8, %esp" following a call by one or two pops, respectively. We don't try to do it in general, but only when the stack adjustment immediately follows a call - which is the most common case. That allows taking a short-cut when trying to find a free register to pop into, instead of a full-blown liveness check. If the adjustment immediately follows a call, then every register the call clobbers but doesn't define should be dead at that point, and can be used. Differential Revision: http://reviews.llvm.org/D11749 llvm-svn: 244578	2015-08-11 08:48:48 +00:00
JF Bastien	11bf0da0d7	WebAssembly: NFC fix release build break, unused variable. Summary: Caused by D11914, pointed out by blaikie. Subscribers: llvm-commits, jfb, dblaikie Differential Revision: http://reviews.llvm.org/D11929 llvm-svn: 244570	2015-08-11 04:52:24 +00:00
JF Bastien	ef172fc9f0	WebAssembly: add basic floating-point tests Summary: I somehow forgot to add these when I added the basic floating-point opcodes. Also remove ceil/floor/trunc/nearestint for now, and add them only when properly tested. Subscribers: llvm-commits, sunfish, jfb Differential Revision: http://reviews.llvm.org/D11927 llvm-svn: 244562	2015-08-11 02:45:15 +00:00
Cameron Esfahani	f97999dc46	Explicitly clear the MI operand list when getInstruction() is called. Call MI.clear() within MCD::OPC_Decode case and inside of translateInstruction() for the X86 target. Remove now unnecessary MI.clear() from ARMDisassembler. Summary: Explicitly clear the MI operand list when getInstruction() is called. Reviewers: hfinkel, t.p.northover, hvarga, kparzysz, jyknight, qcolombet, uweigand Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D11665 llvm-svn: 244557	2015-08-11 01:15:07 +00:00
JF Bastien	e73ce68225	WebAssembly: simply assert on SNaN and NaNs with payloads Summary: convertToHexString doesn't represent them correctly at this point in time. This is a follow-up to sunfish's suggestion in D11914. Subscribers: llvm-commits, sunfish, jfb Differential Revision: http://reviews.llvm.org/D11925 llvm-svn: 244551	2015-08-11 00:49:20 +00:00
Joerg Sonnenberger	ebe7bf44ec	Add lduw and lwua aliases for SPARCv9. llvm-svn: 244535	2015-08-10 23:47:22 +00:00
Joerg Sonnenberger	2ee3d76737	Load/store for float registers from/to alternate space. llvm-svn: 244532	2015-08-10 23:33:17 +00:00
JF Bastien	4a6422562d	WebAssembly: print immediates Summary: For now output using C99's hexadecimal floating-point representation. This patch also cleans up how machine operands are printed: instead of special-casing per type of machine instruction, the code now handles operands generically. Reviewers: sunfish Subscribers: llvm-commits, jfb Differential Revision: http://reviews.llvm.org/D11914 llvm-svn: 244520	2015-08-10 22:36:48 +00:00
Joerg Sonnenberger	6dce129051	Add support for the signx instruction alias of SPARCv9. llvm-svn: 244519	2015-08-10 22:32:25 +00:00
JF Bastien	fa9746dc8d	x86: Emit LAHF/SAHF instead of PUSHF/POPF NaCl's sandbox doesn't allow PUSHF/POPF out of security concerns (privileged emulators have forgotten to mask system bits in the past, and EFLAGS's DF bit is a constant source of hilarity). Commit r220529 fixed PR20376 by saving cmpxchg's flags result using EFLAGS; this commit now generates LAHF/SAHF instead, for all of x86 (not just NaCl) because it leads to an overall performance gain over PUSHF/POPF. As with the previous patch this code generation is pretty bad because it occurs very late, after register allocation, and in many cases it rematerializes flags which were already available (e.g. already in a register through SETE). Fortunately it's somewhat rare that this code needs to fire. I did [[ https://github.com/jfbastien/benchmark-x86-flags \| a bit of benchmarking ]], the results on an Intel Haswell E5-2690 CPU at 2.9GHz are: \| Time per call (ms) \| Runtime (ms) \| Benchmark \| \| 0.000012514 \| 6257 \| sete.i386 \| \| 0.000012810 \| 6405 \| sete.i386-fast \| \| 0.000010456 \| 5228 \| sete.x86-64 \| \| 0.000010496 \| 5248 \| sete.x86-64-fast \| \| 0.000012906 \| 6453 \| lahf-sahf.i386 \| \| 0.000013236 \| 6618 \| lahf-sahf.i386-fast \| \| 0.000010580 \| 5290 \| lahf-sahf.x86-64 \| \| 0.000010304 \| 5152 \| lahf-sahf.x86-64-fast \| \| 0.000028056 \| 14028 \| pushf-popf.i386 \| \| 0.000027160 \| 13580 \| pushf-popf.i386-fast \| \| 0.000023810 \| 11905 \| pushf-popf.x86-64 \| \| 0.000026468 \| 13234 \| pushf-popf.x86-64-fast \| Clearly `PUSHF`/`POPF` are suboptimal. It doesn't really seem to be worth teaching LLVM about individual flags, at least not for this purpose. Reviewers: rnk, jvoung, t.p.northover Subscribers: llvm-commits Differential revision: http://reviews.llvm.org/D6629 llvm-svn: 244503	2015-08-10 20:59:36 +00:00
Sanjay Patel	d09391c8cd	fix minsize detection: minsize attribute implies optimizing for size llvm-svn: 244499	2015-08-10 20:45:44 +00:00
Simon Pilgrim	a3a72b41de	[InstCombine] Move SSE2/AVX2 arithmetic vector shift folding to instcombiner As discussed in D11760, this patch moves the (V)PSRA(WD) arithmetic shift-by-constant folding to InstCombine to match the logical shift implementations. Differential Revision: http://reviews.llvm.org/D11886 llvm-svn: 244495	2015-08-10 20:21:15 +00:00
James Y Knight	3994be87de	[Sparc] Implement i64 load/store support for 32-bit sparc. The LDD/STD instructions can load/store a 64bit quantity from/to memory to/from a consecutive even/odd pair of (32-bit) registers. They are part of SparcV8, and also present in SparcV9. (Although deprecated there, as you can store 64bits in one register). As recommended on llvmdev in the thread "How to enable use of 64bit load/store for 32bit architecture" from Apr 2015, I've modeled the 64-bit load/store operations as working on a v2i32 type, rather than making i64 a legal type, but with few legal operations. The latter does not (currently) work, as there is much code in llvm which assumes that if i64 is legal, operations like "add" will actually work on it. The same assumption does not hold for v2i32 -- for vector types, it is workable to support only load/store, and expand everything else. This patch: - Adds a new register class, IntPair, for even/odd pairs of registers. - Modifies the list of reserved registers, the stack spilling code, and register copying code to support the IntPair register class. - Adds support in AsmParser. (note that in asm text, you write the name of the first register of the pair only. So the parser has to morph the single register into the equivalent paired register). - Adds the new instructions themselves (LDD/STD/LDDA/STDA). - Hooks up the instructions and registers as a vector type v2i32. Adds custom legalizer to transform i64 load/stores into v2i32 load/stores and bitcasts, so that the new instructions can actually be generated, and marks all operations other than load/store on v2i32 as needing to be expanded. - Copies the unfortunate SelectInlineAsm hack from ARMISelDAGToDAG. This hack undoes the transformation of i64 operands into two arbitrarily-allocated separate i32 registers in SelectionDAGBuilder, and instead passes them in a single IntPair. (Arbitrarily allocated registers are not useful, asm code expects to be receiving a pair, which can be passed to ldd/std.) Also adds a bunch of test cases covering all the bugs I've added along the way. Differential Revision: http://reviews.llvm.org/D8713 llvm-svn: 244484	2015-08-10 19:11:39 +00:00
Chad Rosier	c56a9132d0	[AArch64] Convert a conditional check that will always be true to an assert. NFC. llvm-svn: 244479	2015-08-10 18:42:45 +00:00
Chad Rosier	caed6db51e	Typo. Move comment closer to relevant code. NFC. llvm-svn: 244465	2015-08-10 17:17:19 +00:00
Sanjay Patel	10294b59de	fix minsize detection: minsize attribute implies optimizing for size llvm-svn: 244464	2015-08-10 17:15:17 +00:00
Sanjay Patel	0f12d71b49	fix minsize detection: minsize attribute implies optimizing for size llvm-svn: 244463	2015-08-10 17:00:44 +00:00
Sanjay Patel	68b0325a9e	fix minsize detection: minsize attribute implies optimizing for size llvm-svn: 244460	2015-08-10 16:47:47 +00:00
Sanjay Patel	9a9003d94c	fix minsize detection: minsize attribute implies optimizing for size llvm-svn: 244458	2015-08-10 16:43:20 +00:00
Marina Yatsina	a0e02410e1	Test commit to verify commit access llvm-svn: 244438	2015-08-10 11:33:10 +00:00
Saleem Abdulrasool	6bc5ed3e7a	X86: remove a dead store (NFC) The SP was always unconditionally assigned to later, but initialised early. This delays the initialisation, and avoids the dead store. Identified by clang static analysis. No functional change intended. llvm-svn: 244423	2015-08-09 20:39:09 +00:00
Sanjay Patel	e0178262d4	[x86] enable machine combiner reassociations for 128-bit vector single/double adds llvm-svn: 244403	2015-08-08 19:08:20 +00:00
Benjamin Kramer	df005cbe19	Fix some comment typos. llvm-svn: 244402	2015-08-08 18:27:36 +00:00
Craig Topper	cb1f601a7b	[X86] Add ADX and RDSEED to Skylake processor. llvm-svn: 244396	2015-08-08 07:31:15 +00:00
Craig Topper	01dd4ea334	Add SlowBTMem to Sandy Bridge and newer Intel CPUs. Reading through Agner Fog's table suggests there have been no improvements to these processors relative to Westmere for bit test instructions. llvm-svn: 244395	2015-08-08 07:20:04 +00:00
Tom Stellard	30cf77457d	AMDGPU/SI: Another attempt to fix Windows bots broken by r244372 llvm-svn: 244383	2015-08-08 01:11:07 +00:00
Matt Arsenault	b130076469	Remove unnecessary includes llvm-svn: 244382	2015-08-08 00:41:53 +00:00
Matt Arsenault	cbd753761a	AMDGPU: Implement AMDGPUOperand::print() llvm-svn: 244381	2015-08-08 00:41:51 +00:00
Matt Arsenault	4635915504	AMDGPU/SI: Remove VCCReg llvm-svn: 244380	2015-08-08 00:41:48 +00:00
Matt Arsenault	6942d1a034	AMDGPU/SI: Remove source uses of VCCReg llvm-svn: 244379	2015-08-08 00:41:45 +00:00
Tom Stellard	fc70950bf2	AMDGPU/SI: Attempt to fix Windows bots broken by r244372 llvm-svn: 244376	2015-08-08 00:17:59 +00:00
Tom Stellard	fd25395c72	AMDGPU: Add pass to lower OpenCL image and sampler arguments. The pass adds new kernel arguments for image attributes, and resolves calls to dummy attribute and resource id getter functions. Patch by: Zoltan Gilian llvm-svn: 244372	2015-08-07 23:19:30 +00:00
Quentin Colombet	7d8c74ff3f	[AArch64][LoadStoreOptimizer] Turn a test into an assert. NFC. At this point the given Opc must be valid, otherwise we should not look for a matching pair to form paired load or store. Thanks to Chad for pointing out this piece of code! llvm-svn: 244366	2015-08-07 22:40:51 +00:00
Tom Stellard	8ebad11ee9	AMDGPU/SI: Use InstAlias instead of MnemonicAlias for VOPC instructions Summary: With InstAlias, we don't need to print the _e32 portion of the mnemonic when we print the $dst operand. This change makes it possible to include vcc in the asm string when we switch VOPC over to having implicit vcc defs. Reviewers: arsenm Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D11813 llvm-svn: 244362	2015-08-07 22:00:56 +00:00
Matt Arsenault	711b390a7c	AMDGPU: Assume SMRD access for constant address space Since r243294 these are selected to SMRD and moved later if required. llvm-svn: 244354	2015-08-07 20:18:34 +00:00
Chad Rosier	9659de379d	[ARM] Remove an unused reference to MachineRegisterInfo. NFC. llvm-svn: 244334	2015-08-07 17:02:29 +00:00
Tom Stellard	c8733e805e	AMDGPU/SI: Use correct encoding of vopc for VI in the assembler Summary: We were using the SI encoding for VI. Reviewers: arsenm Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D11812 llvm-svn: 244332	2015-08-07 16:45:33 +00:00
Tom Stellard	85656cabfb	AMDGPU/SI: v_mac_legacy_f32 does not exist on VI Reviewers: arsenm Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D11810 llvm-svn: 244322	2015-08-07 15:34:30 +00:00
Tom Stellard	11f19f78f0	AMDGPU/SI: Remove unused outs parameter from VOPC TableGen classes Reviewers: arsenm Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D11809 llvm-svn: 244321	2015-08-07 15:34:27 +00:00
Silviu Baranga	a07090f7fa	Fix unused variable warning introduced in r244314 llvm-svn: 244315	2015-08-07 12:05:46 +00:00
Silviu Baranga	3e8e51c1a9	[ARM] Update ReconstructShuffle to handle mismatched types Summary: Port the ReconstructShuffle function from AArch64 to ARM to handle mismatched incoming types in the BUILD_VECTOR node. This fixes an outstanding FIXME in the ReconstructShuffle code. Reviewers: t.p.northover, rengolin Subscribers: aemerson, llvm-commits, rengolin Differential Revision: http://reviews.llvm.org/D11720 llvm-svn: 244314	2015-08-07 11:40:46 +00:00
JF Bastien	315cc06840	WebAssembly: textual emission uses expected opcode names Summary: WebAssembly's tablegen instructions have the names WebAssembly expects, but by LLVM convention they're uppercase and suffixed with their type after an underscore. Leave the C++ code that way, but print out the names WebAssembly expects (lowercase, no type). We could teach tablegen to do this later, maybe by using `!cast<string>(node)` in the .td files. Reviewers: sunfish Subscribers: jfb, llvm-commits Differential Revision: http://reviews.llvm.org/D11776 llvm-svn: 244305	2015-08-07 01:57:03 +00:00
Juergen Ributzka	f09c7a3d0f	[AArch64][FastISel] Always use AND before checking the branch flag. When we are not emitting the condition for the branch, because the condition is in another BB or SDAG did the selection for us, then we have to mask the flag in the register with AND. This is required when the condition comes from a truncate, because SDAG only truncates down to a legal size of i32. This fixes rdar://problem/22161062. llvm-svn: 244291	2015-08-06 22:44:15 +00:00
Juergen Ributzka	9f54dbe7a1	Revert "[AArch64][FastISel] Add more truncation tests." and "[AArch64][FastISel] Always use an AND instruction when truncating to non-legal types." This reverts commit r243198 and 243304. Turns out this wasn't the correct fix for this problem. It works only within FastISel, but fails when the truncate is selected by SDAG. llvm-svn: 244287	2015-08-06 22:13:48 +00:00
Pete Cooper	ebcd748927	Convert a bunch of loops to foreach. NFC. After r244074, we now have a successors() method to iterate over all the successors of a TerminatorInst. This commit changes a bunch of eligible loops to use it. llvm-svn: 244260	2015-08-06 20:22:46 +00:00
Tom Stellard	d488605ed3	AMDGPU/SI: Add Fiji support Patch by: Alex Deucher llvm-svn: 244255	2015-08-06 19:43:02 +00:00
Tom Stellard	217361c33f	AMDGPU/SI: Add support for 32-bit immediate SMRD offsets on CI Reviewers: arsenm Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D11604 llvm-svn: 244254	2015-08-06 19:28:38 +00:00
Tom Stellard	dee26a2876	AMDGPU/SI: Use ComplexPatterns for SMRD addressing modes Summary: This allows us to consolidate several of the TableGen patterns. Reviewers: arsenm Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D11602 llvm-svn: 244253	2015-08-06 19:28:30 +00:00
Nico Rieck	78199518c4	Rename inst_range() to instructions() for consistency. NFC llvm-svn: 244248	2015-08-06 19:10:45 +00:00
Chad Rosier	22eb71056d	[AArch64] Use a static function and other minor cleanup for readability. NFC. llvm-svn: 244233	2015-08-06 17:37:18 +00:00
Chad Rosier	f77e909f0a	[AArch64] Improve the readability of the ld/st optimization pass. NFC. llvm-svn: 244222	2015-08-06 15:50:12 +00:00
Douglas Katzman	63d64da0ce	[SPARC] Don't compare arch name as a string, use the enum instead. Fixes PR22695 llvm-svn: 244221	2015-08-06 15:44:12 +00:00
Michael Liao	66233b7d79	Removing trailing whitespace llvm-svn: 244203	2015-08-06 09:06:20 +00:00
Michael Kuperstein	868dc65444	[X86] Improve EmitLoweredSelect for contiguous CMOV pseudo instructions. This change improves EmitLoweredSelect() so that multiple contiguous CMOV pseudo instructions with the same (or exactly opposite) conditions get lowered using a single new basic-block. This eliminates unnecessary extra basic-blocks (and CFG merge points) when contiguous CMOVs are being lowered. Patch by: kevin.b.smith@intel.com Differential Revision: http://reviews.llvm.org/D11428 llvm-svn: 244202	2015-08-06 08:45:34 +00:00
Alex Lorenz	49873a8382	MIR Serialization: Initial serialization of the machine operand target flags. This commit implements the initial serialization of the machine operand target flags. It extends the 'TargetInstrInfo' class to add two new methods that help to provide text based serialization for the target flags. This commit can serialize only the X86 target flags, and the target flags for the other targets will be serialized in the follow-up commits. Reviewers: Duncan P. N. Exon Smith llvm-svn: 244185	2015-08-06 00:44:07 +00:00
JF Bastien	0f8a99b62f	x86: NFC remove needless InstrCompiler cast Summary: The casts from String to PatFrag weren't needed if we instead provided an SDNode. This fix was suggested by @pete in D11382. Subscribers: pete, llvm-commits Differential Revision: http://reviews.llvm.org/D11788 llvm-svn: 244167	2015-08-05 23:15:37 +00:00
Bjarke Hammersholt Roune	5cbc7d2999	[NVPTX] Use LDG for pointer induction variables. More specifically, make NVPTXISelDAGToDAG able to emit cached loads (LDG) for pointer induction variables. Also fix latent bug where LDG was not restricted to kernel functions. I believe that this could not be triggered so far since we do not currently infer that a pointer is global outside a kernel function, and only loads of global pointers are considered for cached loads. llvm-svn: 244166	2015-08-05 23:11:57 +00:00
David Blaikie	3affe6e264	-Wdeprecated: Remove some dead code that was relying on a questionable (rule-of-3-violating) copy ctor in MCInstPrinter llvm-svn: 244133	2015-08-05 21:15:48 +00:00
Krzysztof Parzyszek	eca6f04074	[Hexagon] Edit a comment. NFC llvm-svn: 244130	2015-08-05 21:08:26 +00:00
JF Bastien	8662083770	x86 atomic: optimize a.store(reg op a.load(acquire), release) Summary: PR24191 finds that the expected memory-register operations aren't generated when relaxed { load ; modify ; store } is used. This is similar to PR17281 which was addressed in D4796, but only for memory-immediate operations (and for memory orderings up to acquire and release). This patch also handles some floating-point operations. Reviewers: reames, kcc, dvyukov, nadav, morisset, chandlerc, t.p.northover, pete Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D11382 llvm-svn: 244128	2015-08-05 21:04:59 +00:00
JF Bastien	7c4218f49c	Revert "Fix MO's analyzePhysReg, it was confusing sub- and super-registers. Problem pointed out by Michael Hordijk." I mistakenly committed the patch for D6629, and was trying to commit another. Reverting until it gets proper signoff. llvm-svn: 244121	2015-08-05 20:53:56 +00:00
JF Bastien	ce5256f5c5	Fix MO's analyzePhysReg, it was confusing sub- and super-registers. Problem pointed out by Michael Hordijk. llvm-svn: 244120	2015-08-05 20:49:46 +00:00
Krzysztof Parzyszek	73e66f323a	[Hexagon] Implement TargetTransformInfo for Hexagon Author: Brendon Cahoon <bcahoon@codeaurora.org> llvm-svn: 244089	2015-08-05 18:35:37 +00:00
Chandler Carruth	93205eb966	[TTI] Make the cost APIs in TargetTransformInfo consistently use 'int' rather than 'unsigned' for their costs. For something like costs in particular there is a natural "negative" value, that of savings or saved cost. As a consequence, there is a lot of code that subtracts or creates negative values based on cost, all of which is prone to awkwardness or bugs when dealing with an unsigned type. Similarly, we never want these values to wrap, as that would cause Very Bad code generation (likely perceived as an infinite loop as we try to emit over 2^32 instructions or some such insanity). All around 'int' seems a much better fit for these basic metrics. I've added asserts to ensure that at least the TTI interface never returns negative numbers here. If we ever have a use case for negative numbers, we can remove this, but this way a bug where someone used '-1' to produce a 'very large' cost will be caught by the assert. This passes all tests, and is also UBSan clean. No functional change intended. Differential Revision: http://reviews.llvm.org/D11741 llvm-svn: 244080	2015-08-05 18:08:10 +00:00
Pete Cooper	3ae0ee5453	Move BB succ_iterator to be inside TerminatorInst. NFC. To get the successors of a BB we currently do successors(BB) which ultimately walks the successors of the BB's terminator. This moves the iterator to TerminatorInst as that's what we're actually using to do the iteration, and adds a member function to TerminatorInst to allow us to iterate directly over successors given an instruction. For example, we can now do for (auto *Succ : BI->successors()) instead of for (unsigned i = 0, e = BI->getNumSuccessors(); i != e; ++i) Reviewed by Tobias Grosser. llvm-svn: 244074	2015-08-05 17:43:01 +00:00
Chad Rosier	69e3eb3c79	[AArch64] Register AArch64DeadRegisterDefinition pass with LLVM pass manager. llvm-svn: 244067	2015-08-05 17:35:34 +00:00
James Y Knight	bce20afe0f	[Sparc] Fix disassembly of popc instruction. And add tests. Patch by David Wiberg! llvm-svn: 244064	2015-08-05 17:00:30 +00:00
Matt Arsenault	95f0606e62	AMDGPU/SI: Remove EXECReg For the same reasons as the other physical registers. llvm-svn: 244062	2015-08-05 16:42:57 +00:00
Matt Arsenault	4c0487bff6	AMDGPU: Remove SCCReg. These should be handled as a physical register rather than a virtual register class with one member. llvm-svn: 244061	2015-08-05 16:42:54 +00:00
Chad Rosier	1c81432eb6	[AArch64] Register (existing) AArch64BranchRelaxation pass with LLVM pass manager. Summary: Among other things, this allows -print-after-all/-print-before-all to dump IR around this pass. llvm-svn: 244060	2015-08-05 16:12:10 +00:00
Chad Rosier	0c6c5fc303	[AArch64] Make the naming of the Address Type Promotion pass consistent. llvm-svn: 244057	2015-08-05 15:32:23 +00:00
Chad Rosier	794b9b2fdd	[AArch64] Register (existing) AArch64AdvSIMDScalar pass with LLVM pass manager. Summary: Among other things, this allows -print-after-all/-print-before-all to dump IR around this pass. IIRC, this pass is off by default, but it's still helpful when debugging. llvm-svn: 244056	2015-08-05 15:18:58 +00:00
Chad Rosier	084b78632e	Make this less error prone by using a #define. NFC. llvm-svn: 244048	2015-08-05 14:48:44 +00:00
Chad Rosier	9378c16ac8	[AArch64] Register (existing) AArch64ExpandPseudo pass with LLVM pass manager. Summary: Among other things, this allows -print-after-all/-print-before-all to dump IR around this pass. llvm-svn: 244046	2015-08-05 14:22:53 +00:00
Chad Rosier	96530b3a43	[AArch64] Register (existing) AArch64LoadStoreOpt pass with LLVM pass manager. Summary: Among other things, this allows -print-after-all/-print-before-all to dump IR around this pass. This is the AArch64 version of r243052. llvm-svn: 244041	2015-08-05 13:44:51 +00:00
Chad Rosier	43f5c84cfc	Update comment. NFC. llvm-svn: 244038	2015-08-05 12:40:13 +00:00
Artyom Skrobov	6fbef2a780	ARMISelDAGToDAG.cpp had this self-contradictory code: return StringSwitch<int>(Flags) .Case("g", 0x1) .Case("nzcvq", 0x2) .Case("nzcvqg", 0x3) .Default(-1); ... // The _g and _nzcvqg versions are only valid if the DSP extension is // available. if (!Subtarget->hasThumb2DSP() && (Mask & 0x2)) return -1; ARMARM confirms that the comment is right, and the code was wrong. llvm-svn: 244029	2015-08-05 11:02:14 +00:00
Tanya Lattner	0d28f80bd1	Rename all references to old mailing lists to new lists.llvm.org address. llvm-svn: 243999	2015-08-05 03:51:17 +00:00
Sanjay Patel	924879ad2c	wrap OptSize and MinSize attributes for easier and consistent access (NFCI) Create wrapper methods in the Function class for the OptimizeForSize and MinSize attributes. We want to hide the logic of "or'ing" them together when optimizing just for size (-Os). Currently, we are not consistent about this and rely on a front-end to always set OptimizeForSize (-Os) if MinSize (-Oz) is on. Thus, there are 18 FIXME changes here that should be added as follow-on patches with regression tests. This patch is NFC-intended: it just replaces existing direct accesses of the attributes by the equivalent wrapper call. Differential Revision: http://reviews.llvm.org/D11734 llvm-svn: 243994	2015-08-04 15:49:57 +00:00
Sanjay Patel	75ced2782b	[x86] machine combiner reassociation: mark EFLAGS operand as 'dead' In the commentary for D11660, I wasn't sure if it was alright to create new integer machine instructions without also creating the implicit EFLAGS operand. From what I can see, the implicit operand is always created by the MachineInstrBuilder based on the instruction type, so we don't have to do that explicitly. However, in reviewing the debug output, I noticed that the operand was not marked as 'dead'. The machine combiner should do that to preserve future optimization opportunities that may be checking for that dead EFLAGS operand themselves. Differential Revision: http://reviews.llvm.org/D11696 llvm-svn: 243990	2015-08-04 15:21:56 +00:00
Vasileios Kalintiris	2f12b2ede5	[mips][FastISel] Disable code generation for unsupported targets through FastISel. Summary: Previously, we would check whether the target is supported or not, only in fastSelectInstruction(). This means that 64-bit targets could use FastISel too. We fix this by checking every overridden method of the FastISel class and by falling back to SelectionDAG if the target isn't supported. This change should have been committed along with r243638, but somehow I missed it. Reviewers: dsanders Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D11755 llvm-svn: 243986	2015-08-04 14:35:50 +00:00
Vasileios Kalintiris	044e172228	Revert r229675 - [mips] Avoid redundant sign extension of the result of binary bitwise instructions. It introduced two regressions on 64-bit big-endian targets running under N32 (MultiSource/Benchmarks/tramp3d-v4/tramp3d-v4, and MultiSource/Applications/kimwitu++/kc) The issue is that on 64-bit targets comparisons such as BEQ compare the whole GPR64 but incorrectly tell the instruction selector that they operate on GPR32's. This leads to the elimination of i32->i64 extensions that are actually required by comparisons to work correctly. There's currently a patch under review that fixes this problem. llvm-svn: 243984	2015-08-04 14:26:35 +00:00
Saleem Abdulrasool	0a2672bb43	ARM: support windows division routines This adds the software division routines for the Windows RTABI. These are not expected to be used often though as most modern Windows ARM capable targets support hardware division. In the case that the target CPU doesn't support hardware division, this will be the fallback. llvm-svn: 243952	2015-08-04 03:57:56 +00:00
Saleem Abdulrasool	67697a7ea9	ARM: make Darwin libcall registration table driven (NFC) Make the libcall updating table driven similar to the approach that the Linux and Windows codepath does below. NFC. llvm-svn: 243951	2015-08-04 03:57:52 +00:00
Ahmed Bougacha	81fda188f9	[AArch64] Rename FP formats to be more consistent. NFC. Some are named "FP", others "SD", others still "FP*SD". Rename all this to just use "FP", which, except for conversions (which don't use this format naming scheme), implies "SD" anyway. llvm-svn: 243936	2015-08-04 01:38:08 +00:00
Ahmed Bougacha	e0e12db8c8	[AArch64] Add isel support for f16 indexed LD/ST. llvm-svn: 243935	2015-08-04 01:29:38 +00:00
Ahmed Bougacha	e8ea9ac32b	[AArch64][v8.1a] The "pan" sysreg isn't MSR-specific. NFCI. It's already in SysRegMappings, no need to also have it in MSRMappings: the latter is only used if we didn't find a match in the former. llvm-svn: 243933	2015-08-04 00:55:11 +00:00
Ahmed Bougacha	0cbe2efcd6	[AArch64] Remove unnecessary "break". NFC. llvm-svn: 243931	2015-08-04 00:49:08 +00:00
Ahmed Bougacha	239d635d3d	[AArch64] Use SDValue bool operator. NFC. llvm-svn: 243930	2015-08-04 00:48:02 +00:00
Ahmed Bougacha	b0ae36f0d1	[AArch64] Vector FCOPYSIGN supports Custom-lowering: mark it as such. There's a bunch of code in LowerFCOPYSIGN that does smart lowering, and is actually already vector-aware; let's use it instead of scalarizing! The only interesting change is that for v2f32, we previously always used v4i32 as the integer vector type. Use v2i32 instead, and mark FCOPYSIGN as Custom. llvm-svn: 243926	2015-08-04 00:42:34 +00:00
Tim Northover	9c340ec6fd	ARM: remove horrible printf left over from debugging llvm-svn: 243907	2015-08-03 22:19:08 +00:00
Pete Cooper	7be8f8f018	Convert some AArch64 code to foreach loops. NFC. Also converted a cast<> to dyn_cast while I was working on the same line of code. llvm-svn: 243894	2015-08-03 19:04:32 +00:00
Tim Northover	910dde7ab2	ARM: prefer allocating VFP regs at stride 4 on Darwin. This is necessary for WatchOS support, where the compact unwind format assumes this kind of layout. For now we only want this on Swift-like CPUs though, where it's been the Xcode behaviour for ages. Also, since it can expand the prologue we don't want it at -Oz. llvm-svn: 243884	2015-08-03 17:20:10 +00:00
John Brawn	f3324cf1a5	[ARM] Make GlobalMerge merge extern globals by default Enabling merging of extern globals appears to be generally either beneficial or harmless. On some benchmarks suites (on Cortex-M4F, Cortex-A9, and Cortex-A57) it gives improvements in the 1-5% range, but in the rest the overall effect is zero. Differential Revision: http://reviews.llvm.org/D10966 llvm-svn: 243874	2015-08-03 12:13:33 +00:00
James Molloy	6967e5e4a3	Be less conservative about forming IT blocks. In http://reviews.llvm.org/rL215382, IT forming was made more conservative under the belief that a flag-setting instruction was unpredictable inside an IT block on ARMv6M. But actually, ARMv6M doesn't even support IT blocks so that's impossible. In the ARMARM for v7M, v7AR and v8AR it states that the semantics of such an instruction changes inside an IT block - it doesn't set the flags. So actually it is fine to use one inside an IT block as long as the flags register is dead afterwards. This gives significant performance improvements in a variety of MPEG based workloads. Differential revision: http://reviews.llvm.org/D11680 llvm-svn: 243869	2015-08-03 09:24:48 +00:00
JF Bastien	fda53373f2	WebAssembly: implement getScalarShiftAmountTy so we can shift by amount, with type Summary: This currently sets the shift amount RHS to the same type as the LHS, and assumes that the LHS is a simple type. This isn't currently the case e.g. with weird integers sizes, but will eventually be true and will assert if not. That's what you get for having an experimental backend: break it and you get to keep both pieces. Most backends either set the RHS to MVT::i32 or MVT::i64, but WebAssembly is a virtual ISA and tries to have regular-looking binary operations where both operands are the same type (even if a 64-bit RHS shifter is slightly silly, hey it's free!). Subscribers: llvm-commits, sunfish, jfb Differential Revision: http://reviews.llvm.org/D11715 llvm-svn: 243860	2015-08-03 00:00:11 +00:00
Craig Topper	e3dcce9700	De-constify pointers to Type since they can't be modified. NFC This was already done in most places a while ago. This just fixes the ones that crept in over time. llvm-svn: 243842	2015-08-01 22:20:21 +00:00
Jingyue Wu	ffa09be222	[NVPTX] allow register copy between float and int Summary: Fixes PR24303. With Bruno's WIP (D11197) on PeepholeOptimizer, across-class register copying (e.g. i32 to f32) becomes possible. Enhance NVPTXInstrInfo::copyPhysReg to handle these cases. Reviewers: jholewinski Subscribers: eliben, jholewinski, llvm-commits, bruno Differential Revision: http://reviews.llvm.org/D11622 llvm-svn: 243839	2015-08-01 18:02:12 +00:00
David Blaikie	78633802c2	-Wdeprecated-clean: Fix cases of violating the rule of 5 in ways that are deprecated in C++11 Remove some unnecessary explicit special members in Hexagon that, once removed, allow the other implicit special members to be used without depending on deprecated features. llvm-svn: 243825	2015-08-01 05:31:27 +00:00
JF Bastien	8f9aea08d4	WebAssembly: handle more than int32 argument/return Summary: Also test 64-bit integers, except shifts for now which are broken because isel dislikes the 32-bit truncate that precedes them. Reviewers: sunfish Subscribers: llvm-commits, jfb Differential Revision: http://reviews.llvm.org/D11699 llvm-svn: 243822	2015-08-01 04:48:44 +00:00
David Blaikie	a5fd382eb3	-Wdeprecated-clean: Fix cases of violating the rule of 5 in ways that are deprecated in C++11 Various targets use std::swap on specific MCAsmOperands (ARM and possibly Hexagon as well). It might be helpful to mark those subclasses as final, to ensure that the availability of move/copy operations can't lead to slicing. (same sort of requirements as the non-virtual dtor - protected or a final class) llvm-svn: 243820	2015-08-01 04:40:41 +00:00
Alex Lorenz	b4d0d6a345	AMDGPU/SI: Add implicit register operands in the correct order. This commit fixes a bug in the class 'SIInstrInfo' where the implicit register machine operands were added to a machine instruction in an incorrect order - the implicit uses were added before the implicit defs. I found this bug while working on moving the implicit register operand verification code from the MIR parser to the machine verifier. This commit also makes the method 'addImplicitDefUseOperands' in the machine instruction class public so that it can be reused in the 'SIInstrInfo' class. Reviewers: Matt Arsenault Differential Revision: http://reviews.llvm.org/D11689 llvm-svn: 243799	2015-07-31 23:30:09 +00:00
Jingyue Wu	cf70053b20	[NVPTX] convert pointers in byval kernel arguments to global Summary: For example, in struct S { int *x; int *y; }; __global__ void foo(S s) { int *b = s.y; // use b } "b" is guaranteed to point to global. NVPTX should emit ld.global/st.global for accessing "b". Reviewers: jholewinski Subscribers: llvm-commits, jholewinski Differential Revision: http://reviews.llvm.org/D11505 llvm-svn: 243790	2015-07-31 21:44:14 +00:00
JF Bastien	4a2d56044f	WebAssembly: handle `ret void`. Summary: Use -1 as numoperands for the return SDTypeProfile, denoting that return is variadic. Note that the patterns in InstrControl.td still need to match the inputs, so this isn't an "anything goes" variadic on ret! The next step will be to handle other local types (not just int32). Reviewers: sunfish Subscribers: llvm-commits, jfb Differential Revision: http://reviews.llvm.org/D11692 llvm-svn: 243783	2015-07-31 21:04:18 +00:00
JF Bastien	e71e653a5f	x86: check hasOpaqueSPAdjustment in canRealignStack Summary: @rnk pointed out in [1] that x86's canRealignStack logic should match that in CantUseSP from hasBasePointer. [1]: http://reviews.llvm.org/D11160?id=29713#inline-89350 Reviewers: rnk Subscribers: rnk, llvm-commits Differential Revision: http://reviews.llvm.org/D11377 llvm-svn: 243772	2015-07-31 18:28:09 +00:00
JF Bastien	d7fcc6f9c7	WebAssembly: handle unused function arguments. Subscribers: llvm-commits, sunfish, jfb Differential Revision: http://reviews.llvm.org/D11684 llvm-svn: 243770	2015-07-31 18:13:27 +00:00
JF Bastien	600aee9805	WebAssembly: print basic integer assembly. Summary: This prints assembly for int32 integer operations defined in WebAssemblyInstrInteger.td only, with major caveats: - The operation names are currently incorrect. - Other integer and floating-point types will be added later. - The printer isn't factored out to handle recursive AST code yet, since it can't even handle control flow anyways. - The assembly format isn't full s-expressions yet either, this will be added later. - This currently disables PrologEpilogCodeInserter as well as MachineCopyPropagation because they don't like virtual registers, which WebAssembly likes quite a bit. This will be fixed by factoring out NVPTX's change (currently a fork of PrologEpilogCodeInserter). Reviewers: sunfish Subscribers: llvm-commits, jfb Differential Revision: http://reviews.llvm.org/D11671 llvm-svn: 243763	2015-07-31 17:53:38 +00:00
Sanjay Patel	9ff4626028	[x86] reassociate integer multiplies using machine combiner pass Add i16, i32, i64 imul machine instructions to the list of reassociation candidates. A new bit of logic is needed to handle integer instructions: they have an implicit EFLAGS operand, so we have to make sure it's dead in order to do any reassociation with integer ops. Differential Revision: http://reviews.llvm.org/D11660 llvm-svn: 243756	2015-07-31 16:21:55 +00:00
Geoff Berry	8a7ef3b2ee	[AArch64] Favor extended reg patterns for sub Summary: Favor the extended reg patterns over the shifted reg patterns that match only the operand shift and not the full sign/zero extend and shift. Reviewers: jmolloy, t.p.northover Subscribers: mcrosier, aemerson, llvm-commits, rengolin Differential Revision: http://reviews.llvm.org/D11569 llvm-svn: 243753	2015-07-31 15:55:54 +00:00
Jingyue Wu	4be014aebe	Refactor: Simplify boolean conditional return statements in lib/Target/NVPTX Summary: Use clang-tidy to simplify boolean conditional return statements Reviewers: rafael, echristo, chandlerc, bkramer, craig.topper, dexonsmith, chapuni, eliben, jingyue, jholewinski Subscribers: llvm-commits, jholewinski Differential Revision: http://reviews.llvm.org/D9983 llvm-svn: 243734	2015-07-31 05:09:47 +00:00
Matt Arsenault	e1ce344b5a	AMDGPU: Fix v16i32 to v16i8 truncstore llvm-svn: 243731	2015-07-31 04:12:04 +00:00
Matt Arsenault	ba01337942	AMDGPU/SI: Set DwarfRegNum This requires a fix in tablegen for the cast<int> from bits<16> to work in the list initializer. llvm-svn: 243723	2015-07-31 01:12:10 +00:00
Tom Stellard	82325598c3	AMDGPU/SI: Remove unused pattern for f32 constant loads Reviewers: arsenm Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D11603 llvm-svn: 243719	2015-07-31 01:02:32 +00:00
Sumanth Gundapaneni	532a13691c	[ARM] Lower modulo operation to generate __aeabi_divmod on Android For a modulo (remainder) operation, clang -target armv7-none-linux-gnueabi generates "__modsi3" clang -target armv7-none-eabi generates "__aeabi_idivmod" clang -target armv7-linux-androideabi generates "__modsi3" Android bionic libc doesn't provide a __modsi3, instead it provides a "__aeabi_idivmod". This patch fixes the LLVM ARMISelLowering to generate the correct call whenever there is a modulo operation. Differential Revision: http://reviews.llvm.org/D11661 llvm-svn: 243717	2015-07-31 00:45:12 +00:00
Sanjay Patel	1166f2ff9f	fix memcpy/memset/memmove lowering when optimizing for size Fixing MinSize attribute handling was discussed in D11363. This is a prerequisite patch to doing that. The handling of OptSize when lowering mem* functions was broken on Darwin because it wants to ignore -Os for these cases, but the existing logic also made it ignore -Oz (MinSize). The Linux change demonstrates a widespread problem. The backend doesn't usually recognize the MinSize attribute by itself; it assumes that if the MinSize attribute exists, then the OptSize attribute must also exist. Fixing this more generally will be a follow-on patch or two. Differential Revision: http://reviews.llvm.org/D11568 llvm-svn: 243693	2015-07-30 21:41:50 +00:00
Matt Arsenault	7a0c3a92c0	AMDGPU: Set SubRegIndex size and offset I'm not sure what reasons the comment here could have had for not setting these. Without these set, there is an assertion hit during DWARF emission. llvm-svn: 243661	2015-07-30 17:03:11 +00:00
Matt Arsenault	b39e858356	AMDGPU: Fix unreachable when emitting binary debug info Copy implementation of applyFixup from AArch64 with AArch64 bits ripped out. Tests will be included with a later commit. Several other problems must be fixed before binary debug info emission will work. llvm-svn: 243660	2015-07-30 17:03:08 +00:00
Tom Stellard	4229aa942d	AMDGPU/SI: Simplify moveSMRDToVALU() Summary: Replace the switch on instruction opcode with a switch on register size. This way we don't need to update the switch statement when we add new SMRD variants. Reviewers: arsenm Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D11601 llvm-svn: 243652	2015-07-30 16:20:42 +00:00
Tom Stellard	9d74076065	AMDGPU/SI: Remove isTriviallyReMaterializable() function from SIInstrInfo Summary: This function is never called. isReallyTriviallyReMaterializable() is the function that should be implemented instead. Reviewers: arsenm Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D11620 llvm-svn: 243651	2015-07-30 16:20:40 +00:00
Vasileios Kalintiris	2041b1dd0b	[mips][FastISel] Remove hidden mips-fast-isel option. Summary: This hidden option would disable code generation through FastISel by default. It was removed from the available options and from the Fast-ISel tests that required it in order to run the tests. Reviewers: dsanders Subscribers: qcolombet, llvm-commits Differential Revision: http://reviews.llvm.org/D11610 llvm-svn: 243638	2015-07-30 12:39:33 +00:00
Vasileios Kalintiris	77fb0a3dcf	[mips][FastISel] Apply only zero-extension to constants prior to their materialization. Summary: Previously, we would sign-extend non-boolean negative constants and zero-extend otherwise. This was problematic for PHI instructions with negative values that had a type with bitwidth less than that of the register used for materialization. More specifically, ComputePHILiveOutRegInfo() assumes the constants present in a PHI node are zero extended in their container and afterwards deduces the known bits. For example, previously we would materialize an i16 -4 with the following instruction: addiu $r, $zero, -4 The register would end-up with the 32-bit 2's complement representation of -4. However, ComputePHILiveOutRegInfo() would generate a constant with the upper 16-bits set to zero. The SelectionDAG builder would use that information to generate an AssertZero node that would remove any subsequent trunc & zero_extend nodes. In theory, we should modify ComputePHILiveOutRegInfo() to consult target-specific hooks about the way they prefer to materialize the given constants. However, git-blame reports that this specific code has not been touched since 2011 and it seems to be working well for every target so far. Reviewers: dsanders Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D11592 llvm-svn: 243636	2015-07-30 11:51:44 +00:00
Michael Kuperstein	cdb076b8d4	[X86] Recognize "flags" as an identifier, not a register in Intel-syntax inline asm Patch by: marina.yatsina@intel.com Differential Revision: http://reviews.llvm.org/D11512 llvm-svn: 243630	2015-07-30 10:10:25 +00:00
Sanjay Patel	5bfbb36a09	push fast-math check for machine-combiner reassociations into instruction-type check; NFC This makes it simpler to add instruction types that don't depend on fast-math. llvm-svn: 243596	2015-07-30 00:04:21 +00:00
Nick Lewycky	c3890d2969	Fix typo "fuction" noticed in comments in AssumptionCache.h, and also all the other files that have the same typo. All comments, no functionality change! (Merely a "fuctionality" change.) Bonus change to remove emacs major mode marker from SystemZMachineFunctionInfo.cpp because emacs already knows it's C++ from the extension. Also fix typo "appeary" in AMDGPUMCAsmInfo.h. llvm-svn: 243585	2015-07-29 22:32:47 +00:00
Eric Christopher	d566fb12a1	Rename hasCompatibleFunctionAttributes->areInlineCompatible based on suggestions. Currently the function is only used for inline purposes and this is more descriptive for the use. llvm-svn: 243578	2015-07-29 22:09:48 +00:00
Simon Pilgrim	ba10f76705	[X86][SSE] Keep 32-bit target i64 vector shifts on SSE unit. This patch improves the 32-bit target i64 constant matching to detect the shuffle vector splats that are introduced by i64 vector shift vectorization (D8416). Differential Revision: http://reviews.llvm.org/D11327 llvm-svn: 243577	2015-07-29 21:44:27 +00:00
Tim Northover	2a9d801fd5	AArch64: use 32-bit MOV rather than UBFX to truncate registers. It's potentially more efficient on Cyclone, and from the optimization guides & schedulers looks like it has no effect on Cortex-A53 or A57. In general you'd expect a MOV to be about the most efficient instruction with its semantics, even though the official "UXTW" alias is really a UBFX. llvm-svn: 243576	2015-07-29 21:34:32 +00:00
Simon Pilgrim	86478c6909	[X86][SSE] Vectorize i64 ASHR operations This patch vectorizes the v2i64/v4i64 ASHR shift operations - the last remaining integer vector shifts that are still being transferred to/from the scalar unit to be completed. Differential Revision: http://reviews.llvm.org/D11439 llvm-svn: 243569	2015-07-29 20:31:45 +00:00
Jingyue Wu	3a04dc6e78	Roll forward r242871 r242871 missed one place that should be guarded with isPhysicalReg. This patch fixes that. llvm-svn: 243555	2015-07-29 18:59:09 +00:00
Bruno Cardoso Lopes	38c0250679	Revert "[PeepholeOptimizer] Look through PHIs to find additional register sources" Reported to have broken some internal tests: PR24303 This reverts commit r243486. llvm-svn: 243540	2015-07-29 17:46:47 +00:00
Tim Northover	cf739b8c3d	AArch64: use AddressingModes.h accessors for compare shifts No functional change because "lsl #12" is actually encoded as 12, but one less bug if someone ever decides to change that for the giggles. llvm-svn: 243536	2015-07-29 16:39:56 +00:00
Jingyue Wu	7ec38530a5	Temporarily revert r242871 PR24299 llvm-svn: 243522	2015-07-29 15:26:11 +00:00
Bill Schmidt	42ddd71120	[PPC] Fix PR24216: Don't generate splat for misaligned shuffle mask Given certain shuffle-vector masks, LLVM emits splat instructions which splat the wrong bytes from the source register. The issue is that the function PPC::isSplatShuffleMask() in PPCISelLowering.cpp does not ensure that the splat pattern found is requesting bytes that are aligned on an EltSize boundary. This patch detects this situation as not a valid splat mask, resulting in a permute being generated instead of a splat. Patch and test case by Tyler Kenney, cleaned up a bit by me. This is a simple bug fix that would be good to incorporate into 3.7. llvm-svn: 243519	2015-07-29 14:31:57 +00:00
Akira Hatanaka	f53b0403f8	[AArch64] Define subtarget feature strict-align. This commit defines subtarget feature strict-align and uses it instead of cl::opt -aarch64-strict-align to decide whether strict alignment should be forced. rdar://problem/21529937 llvm-svn: 243516	2015-07-29 14:17:26 +00:00
Alex Lorenz	d8a1e542ab	Fix broken ArrayRef conversion from r243497. llvm-svn: 243501	2015-07-28 23:34:27 +00:00
Sanjay Patel	1dd15598cf	fix TLI's combineRepeatedFPDivisors interface to return the minimum user threshold This fix was suggested as part of D11345 and is part of fixing PR24141. With this change, we can avoid walking the uses of a divisor node if the target doesn't want the combineRepeatedFPDivisors transform in the first place. No functional change is intended other than that. Differential Revision: http://reviews.llvm.org/D11531 llvm-svn: 243498	2015-07-28 23:05:48 +00:00
Alex Lorenz	ef5c196fb0	MIR Serialization: Serialize the target index machine operands. Reviewers: Duncan P. N. Exon Smith llvm-svn: 243497	2015-07-28 23:02:45 +00:00
Akira Hatanaka	2670f4a550	[ARM] Define subtarget feature strict-align. This commit defines subtarget feature strict-align and uses it instead of cl::opt -arm-strict-align to decide whether strict alignment should be forced. Also, remove the logic that was checking the OS and architecture as clang is now responsible for setting strict-align based on the command line options specified and the target architecture and OS. rdar://problem/21529937 http://reviews.llvm.org/D11470 llvm-svn: 243493	2015-07-28 22:44:28 +00:00
Tim Northover	17ae83a25f	AArch64: be careful of large immediates when optimising cmps. llvm-svn: 243492	2015-07-28 22:42:32 +00:00
Bruno Cardoso Lopes	3c235763e5	[PeepholeOptimizer] Look through PHIs to find additional register sources Reapply 243271 with more fixes; although we are not handling multiple sources with coalescable copies, we were not properly skipping this case. - Teaches the ValueTracker in the PeepholeOptimizer to look through PHI instructions. - Add findNextSourceAndRewritePHI method to lookup into multiple sources returned by the ValueTracker and rewrite PHIs with new sources. With these changes we can find more register sources and rewrite more copies to allow coalescing of bitcast instructions. Hence, we eliminate unnecessary VR64 <-> GR64 copies in x86, but it could be extended to other archs by marking "isBitcast" on target specific instructions. The x86 example follows: A: psllq %mm1, %mm0 movd %mm0, %r9 jmp C B: por %mm1, %mm0 movd %mm0, %r9 jmp C C: movd %r9, %mm0 pshufw $238, %mm0, %mm0 Becomes: A: psllq %mm1, %mm0 jmp C B: por %mm1, %mm0 jmp C C: pshufw $238, %mm0, %mm0 Differential Revision: http://reviews.llvm.org/D11197 rdar://problem/20404526 llvm-svn: 243486	2015-07-28 21:45:50 +00:00
Vasileios Kalintiris	9876946aee	[mips][FastISel] Fix call lowering by bailing out on "fastcc" calls. Summary: Currently, we support only the MIPS O32 ABI calling convention for call lowering. With this change we avoid using the O32 calling convention for lowering calls marked as using the fast calling convention. Reviewers: dsanders Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D11515 llvm-svn: 243485	2015-07-28 21:43:31 +00:00
Vasileios Kalintiris	9ec6114860	[mips][FastISel] Fix generated code for IR's select instruction. Summary: Generate correct code for the select instruction by zero-extending its boolean/condition operand to GPR-width. This is necessary because the conditional-move instructions operate on the whole register. Reviewers: dsanders Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D11506 llvm-svn: 243469	2015-07-28 19:57:25 +00:00
Matt Arsenault	7227cc1a48	AMDGPU: Don't try to use LDS/vector for private if pointer value stored If the pointer is the store's value operand, this would produce a broken module. Make sure the use is actually for the pointer operand. llvm-svn: 243462	2015-07-28 18:47:00 +00:00
Matt Arsenault	fdcd39a8ad	AMDGPU: Fix crash if called function is a bitcast getCalledFunction() is null, so this would crash. Replace crash with an error on unsupported call. llvm-svn: 243461	2015-07-28 18:29:14 +00:00
Matt Arsenault	916cea5682	AMDGPU: Fix return type of getImplicitParameterOffset. Patch by Zoltan Gilian <zoltan.gilian@gmail.com> llvm-svn: 243459	2015-07-28 18:09:55 +00:00
JF Bastien	ae7eebd429	WebAssembly: MCAsmInfo only has one syntax variant for now. Summary: MCAsmInfo is set up with the default AssemblerDialect, which is zero. Subscribers: llvm-commits, sunfish, jfb Differential Revision: http://reviews.llvm.org/D11567 llvm-svn: 243452	2015-07-28 17:23:07 +00:00
Chih-Hung Hsieh	1e859582d6	Implement target independent TLS compatible with glibc's emutls.c. The 'common' section TLS is not implemented. Current C/C++ TLS variables are not placed in common section. DWARF debug info to get the address of TLS variables is not generated yet. clang and driver changes in http://reviews.llvm.org/D10524 Added -femulated-tls flag to select the emulated TLS model, which will be used for old targets like Android that do not support ELF TLS models. Added TargetLowering::LowerToTLSEmulatedModel as a target-independent function to convert a SDNode of TLS variable address to a function call to __emutls_get_address. Added into lib/Target//ISelLowering.cpp to call LowerToTLSEmulatedModel for TLSModel::Emulated. Although all targets supporting ELF TLS models are enhanced, emulated TLS model has been tested only for Android ELF targets. Modified AsmPrinter.cpp to print the emutls_v.* and emutls_t.* variables for emulated TLS variables. Modified DwarfCompileUnit.cpp to skip some DIE for emulated TLS variables. TODO: Add proper DIE for emulated TLS variables. Added new unit tests with emulated TLS. Differential Revision: http://reviews.llvm.org/D10522 llvm-svn: 243438	2015-07-28 16:24:05 +00:00
Geoff Berry	c573bf7a5f	[AArch64] Match float round and convert to int instructions. Summary: Add patterns for doing floating point round with various rounding modes followed by conversion to int as a single FCVT* instruction. Reviewers: t.p.northover, jmolloy Subscribers: aemerson, rengolin, mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D11424 llvm-svn: 243422	2015-07-28 15:24:10 +00:00
Adhemerval Zanella	7bc3319d84	Implement __builtin_thread_pointer This patch adds the aarch64 lowering of __builtin_thread_pointer. It uses the already implemented AArch64ISD::THREAD_POINTER used in TLS generation. llvm-svn: 243412	2015-07-28 13:03:31 +00:00
Michael Kuperstein	cba308cf96	[X86] Remove mergeSPUpdatesUp() X86FrameLowering has both a mergeSPUpdates() that accepts a direction, and an mergeSPUpdatesUp(), which seem to do the same thing, except for a slightly different interface. Removed the less general function. NFC. Differential Revision: http://reviews.llvm.org/D11510 llvm-svn: 243396	2015-07-28 08:56:13 +00:00
Simon Pilgrim	df984f58ad	[X86][SSE] Use bitmasks instead of shuffles where possible. VPAND is a lot faster than VPSHUFB and VPBLENDVB - this patch ensures we attempt to lower to a basic bitmask before lowering to the slower byte shuffle/blend instructions. Split off from D11518. Differential Revision: http://reviews.llvm.org/D11541 llvm-svn: 243395	2015-07-28 08:54:41 +00:00
Igor Breger	8352a0ddf2	AVX512: Implemented encoding and intrinsics for VGETEXPSS/D instructions Added tests for intrinsics and encoding. Differential Revision: http://reviews.llvm.org/D11528 llvm-svn: 243390	2015-07-28 06:53:28 +00:00
Sanjay Patel	8c13e3680d	fix invalid load folding with SSE/AVX FP logical instructions (PR22371) This is a follow-up to the FIXME that was added with D7474 ( http://reviews.llvm.org/rL229531 ). I thought this load folding bug had been made hard-to-hit, but it turns out to be very easy when targeting 32-bit x86 and causes a miscompile/crash in Wine: https://bugs.winehq.org/show_bug.cgi?id=38826 https://llvm.org/bugs/show_bug.cgi?id=22371#c25 The quick fix is to simply remove the scalar FP logical instructions from the load folding table in X86InstrInfo, but that causes us to miss load folds that should be possible when lowering fabs, fneg, fcopysign. So the majority of this patch is altering those lowerings to use vector FP logical instructions (because that's all x86 gives us anyway). That lets us do the load folding legally. Differential Revision: http://reviews.llvm.org/D11477 llvm-svn: 243361	2015-07-28 00:48:32 +00:00
JF Bastien	088c47ee5b	WebAssembly: add a generic CPU Summary: WebAssemblySubtarget.cpp expects a default 'generic' CPU to exist, and this seems to be prevalent with other targets. It makes sense to have something between MVP and bleeding-edge, even though for now it's the same as MVP. This removes a warning that's currently generated. Subscribers: jfb, llvm-commits, sunfish Differential Revision: http://reviews.llvm.org/D11546 llvm-svn: 243345	2015-07-27 23:25:54 +00:00
JF Bastien	6c6efa1786	WebAssembly: more MCAsmInfo nits. Summary: As suggested by sunfish. Subscribers: jfb, llvm-commits, sunfish Differential Revision: http://reviews.llvm.org/D11544 llvm-svn: 243339	2015-07-27 22:40:31 +00:00
Alexandros Lamprineas	4ea707555a	- Added support for parsing HWDiv features using Target Parser. - Architecture extensions are represented as a bitmap. Phabricator: http://reviews.llvm.org/D11457 llvm-svn: 243335	2015-07-27 22:26:59 +00:00
Colin LeMahieu	fe2c8b8015	[llvm-mc] Pushing plumbing through for --fatal-warnings flag. llvm-svn: 243334	2015-07-27 21:56:53 +00:00
Sanjay Patel	1cf245fd96	remove unnecessary forward declaration; NFC llvm-svn: 243328	2015-07-27 21:11:55 +00:00
Sanjay Patel	aa99a2304d	don't repeat function names in comments; NFC llvm-svn: 243327	2015-07-27 21:03:03 +00:00
JF Bastien	1a12bf1aa2	WebAssembly: minor MCAsmInfo fixes Summary: Fix pointer / callee-save stack slot size. Update comment character to be LISP-ish. Subscribers: llvm-commits, sunfish, jfb Differential Revision: http://reviews.llvm.org/D11537 llvm-svn: 243326	2015-07-27 20:46:51 +00:00
Bruno Cardoso Lopes	b20841df44	Revert "[PeepholeOptimizer] Look through PHIs to find additional register sources" Still breaks some ARM buildbots. This reverts r243271. llvm-svn: 243318	2015-07-27 20:26:04 +00:00
Akira Hatanaka	2541e0241c	[AArch64] Remove check for Darwin that was needed to decide if x18 should be reserved. The decision to reserve x18 is going to be made solely by the front-end, so it isn't necessary to check if the OS is Darwin in the backend. llvm-svn: 243308	2015-07-27 19:18:47 +00:00
Diego Novillo	cd973c4f77	Fix ODR violation. NFC. There is an ODR conflict between lib/ExecutionEngine/ExecutionEngineBindings.cpp and lib/Target/TargetMachineC.cpp. The inline definitions should simply be marked static (thanks dblaikie for the hint). llvm-svn: 243298	2015-07-27 18:27:23 +00:00
Marek Olsak	93df060871	AMDGPU: don't match vgpr loads for constant loads Author: Dave Airlie <airlied@redhat.com> In order to implement indirect sampler loads, we don't want to match on a VGPR load but an SGPR one for constants, as we cannot feed VGPRs to the sampler, only SGPRs. This should be applicable for llvm 3.7 as well. llvm-svn: 243294	2015-07-27 18:16:08 +00:00
Sanjay Patel	beb4cffb43	fix typo and spacing; NFC llvm-svn: 243287	2015-07-27 17:39:20 +00:00
Pete Cooper	2e20147403	Revert "Add const to some Type* parameters which didn't need to be mutable. NFC." This reverts commit r243146. Feedback from Craig Topper and David Blaikie was that we don't put const on Type as it has no mutable state. llvm-svn: 243282	2015-07-27 17:15:24 +00:00
Bruno Cardoso Lopes	669c921bfd	[PeepholeOptimizer] Look through PHIs to find additional register sources Reapply r242295 with fixes in the implementation. - Teaches the ValueTracker in the PeepholeOptimizer to look through PHI instructions. - Add findNextSourceAndRewritePHI method to look up multiple sources returned by the ValueTracker and rewrite PHIs with new sources. With these changes we can find more register sources and rewrite more copies to allow coalescing of bitcast instructions. Hence, we eliminate unnecessary VR64 <-> GR64 copies in x86, but it could be extended to other archs by marking "isBitcast" on target specific instructions. The x86 example follows: A: psllq %mm1, %mm0 movd %mm0, %r9 jmp C B: por %mm1, %mm0 movd %mm0, %r9 jmp C C: movd %r9, %mm0 pshufw $238, %mm0, %mm0 Becomes: A: psllq %mm1, %mm0 jmp C B: por %mm1, %mm0 jmp C C: pshufw $238, %mm0, %mm0 Differential Revision: http://reviews.llvm.org/D11197 rdar://problem/20404526 llvm-svn: 243271	2015-07-27 14:39:46 +00:00
Silviu Baranga	7581d22512	[ARM/AArch64] Fix cost model for interleaved accesses Summary: Fix the cost of interleaved accesses for ARM/AArch64. We were calling getTypeAllocSize and using it to check the number of bits, when we should have called getTypeAllocSizeInBits instead. This would potentially cause the vectorizer to generate loads/stores and shuffles which cannot be matched with an interleaved access instruction. No performance changes are expected for now since matching/generating interleaved accesses is still disabled by default. Reviewers: rengolin Subscribers: aemerson, llvm-commits, rengolin Differential Revision: http://reviews.llvm.org/D11524 llvm-svn: 243270	2015-07-27 14:39:34 +00:00
Simon Pilgrim	81accb7b27	[X86] Reordered lowerVectorShuffleAsBitMask before lowerVectorShuffleAsBlend. NFCI. Allows us to show diffs for D11518 more clearly llvm-svn: 243264	2015-07-27 12:37:19 +00:00
Marek Olsak	1354b87695	AMDGPU/SI: Fix the V_FRACT_F64 SI bug workaround This is a candidate for 3.7. llvm-svn: 243263	2015-07-27 11:37:42 +00:00
Sean Silva	e1c6b549ef	Avoid using uncommon acronym "MSROM". llvm-svn: 243256	2015-07-27 00:46:59 +00:00
Igor Breger	f2460112ad	Implemented encoding and intrinsics of the following instructions vunpckhps/pd, vunpcklps/pd, vpunpcklbw, vpunpckhbw, vpunpcklwd, vpunpckhwd, vpunpckldq, vpunpckhdq, vpunpcklqdq, vpunpckhqdq Added tests for intrinsics and encoding. Differential Revision: http://reviews.llvm.org/D11509 llvm-svn: 243246	2015-07-26 14:41:44 +00:00

34097 Commits