llvm-project

Commit Graph

Author	SHA1	Message	Date
Dan Gohman	4da4abd87f	[WebAssembly] Fix scheduling dependencies in register-stackified code Add physical register defs to instructions used from stackified instructions to prevent them from being scheduled into the middle of a stack sequence. This is a conservative measure which may be loosened in the future. Differential Revision: http://reviews.llvm.org/D15252 llvm-svn: 254811	2015-12-05 00:51:40 +00:00
Derek Schuff	9d77952332	[WebAssembly] Support constant offsets on loads and stores This is just a prototype for load/store for i32 types. I'll add them to the rest of the types if we like this direction. Differential Revision: http://reviews.llvm.org/D15197 llvm-svn: 254807	2015-12-05 00:26:39 +00:00
Dan Gohman	35bfb24c28	[WebAssembly] Initial varargs support. Full varargs support will depend on prologue/epilogue support, but this patch gets us started with most of the basic infrastructure. Differential Revision: http://reviews.llvm.org/D15231 llvm-svn: 254799	2015-12-04 23:22:35 +00:00
Hans Wennborg	5000ce8a63	X86: Don't emit SAHF/LAHF for 64-bit targets unless explicitly supported These instructions are not supported by all CPUs in 64-bit mode. Emitting them causes Chromium to crash on start-up for users with such chips. (GCC puts these instructions behind -msahf on 64-bit for the same reason.) This patch adds FeatureLAHFSAHF, enables it by default for 32-bit targets and modern CPUs, and changes X86InstrInfo::copyPhysReg back to the lowering from before r244503 when the instructions are not available. Differential Revision: http://reviews.llvm.org/D15240 llvm-svn: 254793	2015-12-04 23:00:33 +00:00
Chad Rosier	f3491496dc	[AArch64] Expand vector SDIVREM/UDIVREM operations. http://reviews.llvm.org/D15214 Patch by Ana Pazos <apazos@codeaurora.org>! llvm-svn: 254773	2015-12-04 21:38:44 +00:00
Manman Ren	19c7bbe3b7	[CXX TLS calling convention] Add CXX TLS calling convention. This commit adds a new target-independent calling convention for C++ TLS access functions. It aims to minimize overhead in the caller by preserving as many registers as possible. The target-specific implementation for X86-64 is defined as follows: Arguments are passed as for the default C calling convention. The same applies for the return value(s). The callee preserves all GPRs except RAX and RDI. The access function makes C-style TLS function calls in the entry and exit block; C-style TLS functions save a lot more registers than normal calls. The added calling convention ties into the existing implementation of the C-style TLS functions, so we can't simply use existing calling conventions such as preserve_mostcc. rdar://9001553 llvm-svn: 254737	2015-12-04 17:40:13 +00:00
Alexey Bataev	7cf324772f	LEA code size optimization pass (Part 1): Remove redundant address recalculations, by Andrey Turetsky Add new x86 pass which replaces address calculations in load or store instructions with def register of existing LEA (must be in the same basic block), if the LEA calculates address that differs only by a displacement. Works only with -Os or -Oz. Differential Revision: http://reviews.llvm.org/D13294 llvm-svn: 254712	2015-12-04 10:53:15 +00:00
NAKAMURA Takumi	a3561b388c	Move llvm/test/CodeGen/Generic/function-alias.ll to X86. It is incompatible to PECOFF. FIXME: It may be ELF-generic. llvm-svn: 254685	2015-12-04 02:00:12 +00:00
Quentin Colombet	901f036353	[ARM] When a bitcast is about to be turned into a VMOVDRR, try to combine it with its source instead of forcing the values on GPRs. This improves the lowering of vector code when such bitcasts happen in the middle of vector computations. rdar://problem/23691584 llvm-svn: 254684	2015-12-04 01:53:14 +00:00
Matthias Braun	97d0ffbe06	ScheduleDAGInstrs: Rework schedule graph builder. Re-committing with a change that avoids undefined uses getting put into the VRegUses list. The new algorithm remembers the uses encountered while walking backwards until a matching def is found. Contrary to the previous version this: - Works without LiveIntervals being available - Allows to increase the precision to subregisters/lanemasks (not used for now) The changes in the AMDGPU tests are necessary because the R600 scheduler is not stable with respect to the order of nodes in the ready queues. Differential Revision: http://reviews.llvm.org/D9068 llvm-svn: 254683	2015-12-04 01:51:19 +00:00
JF Bastien	580b6572b5	X86InstrInfo::copyPhysReg: workaround reg liveness Summary: computeRegisterLiveness and analyzePhysReg are currently getting confused about liveness in some cases, breaking copyPhysReg's calculation of whether AX is dead in some cases. Work around this issue temporarily by assuming that AX is always live. See detail in: https://llvm.org/bugs/show_bug.cgi?id=25033#c7 And associated bugs PR24535 PR25033 PR24991 PR24992 PR25201. This workaround makes the code correct but slightly inefficient, but it seems to confuse the machine instr verifier which now thinks EAX was undefined in some cases where it's being conservatively saved / restored. Reviewers: majnemer, sanjoy Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D15198 llvm-svn: 254680	2015-12-04 01:18:17 +00:00
Evgeniy Stepanov	7fc3cb5919	Fix function-alias.ll test on non-X86 targets. llvm-svn: 254676	2015-12-04 00:57:25 +00:00
Evgeniy Stepanov	2bb9c5ca22	Emit function alias to data as a function symbol. CFI emits jump slots for indirect functions as a byte array constant, and declares function-typed aliases to these constants. This change fixes AsmPrinter to emit these aliases as function symbols and not data symbols. llvm-svn: 254674	2015-12-04 00:45:43 +00:00
JF Bastien	1ac69947b6	CodeGen peephole: fold redundant phys reg copies Code generation often exposes redundant physical register copies through virtual registers such as: %vreg = COPY %PHYSREG ... %PHYSREG = COPY %vreg There are cases where no intervening clobber of %PHYSREG occurs, and the later copy could therefore be removed. In some cases this further allows us to remove the initial copy. This patch contains a motivating example which comes from the x86 build of Chrome, specifically cc::ResourceProvider::UnlockForRead uses libstdc++'s implementation of hash_map. That example has two tests live at the same time, and after machine sinking LLVM has confused itself enough and thinks spilling EFLAGS is a great idea even though it's never restored and the comparison results are both live. Before this patch we have: DEC32m %RIP, 1, %noreg, <ga:@L>, %noreg, %EFLAGS<imp-def> %vreg1<def> = COPY %EFLAGS; GR64:%vreg1 %EFLAGS<def> = COPY %vreg1; GR64:%vreg1 JNE_1 <BB#1>, %EFLAGS<imp-use> Both copies are useless. This patch tries to eliminate the later copy in a generic manner. dec is especially confusing to LLVM when compared with sub. I wrote this patch to treat all physical registers generically, but only remove redundant copies of non-allocatable physical registers because the allocatable ones caused issues (e.g. when calling conventions weren't properly modeled) and should be handled later by the register allocator anyways. The following tests used to fail when the patch also replaced allocatable registers: CodeGen/X86/StackColoring.ll CodeGen/X86/avx512-calling-conv.ll CodeGen/X86/copy-propagation.ll CodeGen/X86/inline-asm-fpstack.ll CodeGen/X86/musttail-varargs.ll CodeGen/X86/pop-stack-cleanup.ll CodeGen/X86/preserve_mostcc64.ll CodeGen/X86/tailcallstack64.ll CodeGen/X86/this-return-64.ll This happens because COPY has other special meaning for e.g. dependency breakage and x87 FP stack. Note that all other backends' tests pass. Reviewers: qcolombet Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D15157 llvm-svn: 254665	2015-12-03 23:43:56 +00:00
Dan Gohman	391a98afd5	[WebAssembly] Fix dominance check for PHIs in the StoreResult pass When a block has no terminator instructions, getFirstTerminator() returns end(), which can't be used in dominance checks. Check dominance for phi operands separately. Also, remove some bits from WebAssemblyRegStackify.cpp that were causing trouble on the same testcase; they were left behind from an earlier experiment. Differential Revision: http://reviews.llvm.org/D15210 llvm-svn: 254662	2015-12-03 23:07:03 +00:00
Reid Kleckner	93fc520339	[X86] Put no-op ADJCALLSTACK markers around all dynamic lowerings Summary: These ADJCALLSTACK markers don't generate code, but they keep dynamic alloca code that calls chkstk out of the prologue. This slightly pessimizes inalloca calls by preventing some register copy coalescing, but I can live with that. Reviewers: qcolombet Subscribers: hans, llvm-commits Differential Revision: http://reviews.llvm.org/D15200 llvm-svn: 254645	2015-12-03 20:46:59 +00:00
Andrew Kaylor	92b3b16ba3	Move branch folding test to a better location. llvm-svn: 254640	2015-12-03 19:41:25 +00:00
Matthias Braun	0d4505c067	AArch64FastISel: Use cbz/cbnz to branch on i1 In the case of a conditional branch without a preceding cmp we used to emit a "and; cmp; b.eq/b.ne" sequence, use tbz/tbnz instead. Differential Revision: http://reviews.llvm.org/D15122 llvm-svn: 254621	2015-12-03 17:19:58 +00:00
Tom Stellard	9760f03757	AMDGPU/SI: Emit constant arrays in the .hsrodata_readonly_agent section Summary: This is done only when targeting HSA. Reviewers: arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D13807 llvm-svn: 254587	2015-12-03 03:34:32 +00:00
Matthias Braun	2fd672a221	Revert "ScheduleDAGInstrs: Rework schedule graph builder." This works mostly fine but breaks some stage 1 builders when compiling compiler-rt on i386. Revert for further investigation as I can't see an obvious cause/fix. This reverts commit r254577. llvm-svn: 254586	2015-12-03 03:01:10 +00:00
Matthias Braun	d35fe3d984	ScheduleDAGInstrs: Rework schedule graph builder. The new algorithm remembers the uses encountered while walking backwards until a matching def is found. Contrary to the previous version this: - Works without LiveIntervals being available - Allows to increase the precision to subregisters/lanemasks (not used for now) The changes in the AMDGPU tests are necessary because the R600 scheduler is not stable with respect to the order of nodes in the ready queues. Differential Revision: http://reviews.llvm.org/D9068 llvm-svn: 254577	2015-12-03 02:05:27 +00:00
Derek Schuff	5268aaf7b6	[WebAssembly] Add a test for wasm-store-results pass Differential Revision: http://reviews.llvm.org/D15167 llvm-svn: 254570	2015-12-03 00:50:30 +00:00
Kyle Butt	2f713eb438	Tests: PPC: remove unnecessary metadata. NFC Remove unnecessary metadata from a test case. llvm-svn: 254544	2015-12-02 21:08:03 +00:00
Tom Stellard	00f2f91af4	AMDGPU/SI: Correctly emit agent global segment variables when targeting HSA Differential Revision: http://reviews.llvm.org/D14508 llvm-svn: 254540	2015-12-02 19:47:57 +00:00
Kyle Butt	cf6a8bfe51	[CodeGen]: Fix bad interaction with AntiDep breaking and inline asm. AggressiveAntiDepBreaker was renaming registers specified by the user for inline assembly. While this will work for compiler-specified registers, it won't work for user-specified registers, and at the time this runs, I don't currently see a way to distinguish them. llvm-svn: 254532	2015-12-02 18:58:51 +00:00
Tim Northover	f520eff782	AArch64: use ldxp/stxp pair to implement 128-bit atomic loads. The ARM ARM is clear that 128-bit loads are only guaranteed to have been atomic if there has been a corresponding successful stxp. It's less clear for AArch32, so I'm leaving that alone for now. llvm-svn: 254524	2015-12-02 18:12:57 +00:00
Tom Stellard	e3b5aeaf83	AMDGPU/SI: Don't emit group segment global variables Summary: Only global or readonly segment variables should appear in object files. Reviewers: arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D15111 llvm-svn: 254519	2015-12-02 17:00:42 +00:00
Christof Douma	8b5dc2c94e	[AArch64]: Add support for Cortex-A35 Adds support for the new Cortex-A35 ARMv8-A core. llvm-svn: 254503	2015-12-02 11:53:44 +00:00
Nemanja Ivanovic	74e31bc929	Patch to fix a crash in the PowerPC back end due to ISD::ROTL and ISD::ROTR not being expanded. Test case included. llvm-svn: 254501	2015-12-02 10:36:24 +00:00
Simon Pilgrim	3fc3454a0c	[X86][FMA] Optimize FNEG(FMUL) Patterns On FMA targets, we can avoid having to load a constant to negate a float/double multiply by instead using a FNMSUB (-(X*Y)-0) Fix for PR24366 Differential Revision: http://reviews.llvm.org/D14909 llvm-svn: 254495	2015-12-02 09:07:55 +00:00
Asaf Badouh	2489f350c0	[X86][AVX512] add comi with Sae add builtin_ia32_vcomisd and builtin_ia32_vcomisd Differential Revision: http://reviews.llvm.org/D14331 llvm-svn: 254493	2015-12-02 08:17:51 +00:00
Quentin Colombet	f1e91c8bf1	[X86] Make sure the prologue does not clobber EFLAGS when it lives across it. This is a superset of the fix done in r254448. This fixes PR25607. llvm-svn: 254478	2015-12-02 01:22:54 +00:00
Tim Northover	f3be9d5c0b	AArch64: fix 128-bit shifts We mustn't introduce a shift of exactly 64 bits for any inputs, since that's an UNDEF value (and worse, it's not what you want with the natural AArch64 implementation). The generated code is pretty horrific, but I couldn't come up with an obviously better alternative (if the amount is constant EXTR could help). Turns out 128-bit shifts are just nasty. rdar://22491037 llvm-svn: 254475	2015-12-02 00:33:54 +00:00
Matt Arsenault	592d068198	AMDGPU: Error on addrspacecasts that aren't actually implemented llvm-svn: 254469	2015-12-01 23:04:05 +00:00
Matt Arsenault	f9bfeafd00	AMDGPU: Implement isNoopAddrSpaceCast llvm-svn: 254468	2015-12-01 23:04:00 +00:00
Quentin Colombet	9cb01aa30a	[X86] Make sure the prologue does not clobber EFLAGS when it lives across it. This fixes PR25629. llvm-svn: 254448	2015-12-01 19:49:31 +00:00
Artyom Skrobov	5d1f2524a0	Fix Thumb1 epilogue generation Summary: This had been broken for a very long time, but nobody noticed until D14357 enabled shrink-wrapping by default. Reviewers: jroelofs, qcolombet Subscribers: tyomitch, llvm-commits, rengolin Differential Revision: http://reviews.llvm.org/D14986 llvm-svn: 254444	2015-12-01 19:25:11 +00:00
Weiming Zhao	56ab51870c	[AArch64] Fix a corner case in BitField select Summary: When there are no useful bits, BitWidth becomes 0 and APInt will not be happy. See https://llvm.org/bugs/show_bug.cgi?id=25571 We can just mark the operand as IMPLICIT_DEF if none of its bits are used. Reviewers: t.p.northover, jmolloy Subscribers: gberry, jmolloy, mgrang, aemerson, llvm-commits, rengolin Differential Revision: http://reviews.llvm.org/D14803 llvm-svn: 254440	2015-12-01 19:17:49 +00:00
Elena Demikhovsky	aa1f17ea95	AVX-512: regenerated test for avx512 arithmetics, NFC llvm-svn: 254410	2015-12-01 12:35:03 +00:00
Yury Gribov	d7dbb66eb8	Introduce new @llvm.get.dynamic.area.offset.i{32, 64} intrinsics. The @llvm.get.dynamic.area.offset.* intrinsic family is used to get the offset from native stack pointer to the address of the most recent dynamic alloca on the caller's stack. These intrinsics are intended for use in combination with @llvm.stacksave and @llvm.restore to get a pointer to the most recent dynamic alloca. This is useful, for example, for AddressSanitizer's stack unpoisoning routines. Patch by Max Ostapenko. Differential Revision: http://reviews.llvm.org/D14983 llvm-svn: 254404	2015-12-01 11:40:55 +00:00
Cong Hou	d97c100dc4	Replace all weight-based interfaces in MBB with probability-based interfaces, and update all uses of old interfaces. (This is the second attempt to submit this patch. The first caused two assertion failures and was reverted. See https://llvm.org/bugs/show_bug.cgi?id=25687) The patch in http://reviews.llvm.org/D13745 is broken into four parts: 1. New interfaces without functional changes (http://reviews.llvm.org/D13908). 2. Use new interfaces in SelectionDAG, while in other passes treat probabilities as weights (http://reviews.llvm.org/D14361). 3. Use new interfaces in all other passes. 4. Remove old interfaces. This patch is 3+4 above. In this patch, MBB won't provide weight-based interfaces any more, which are totally replaced by probability-based ones. The interface addSuccessor() is redesigned so that the default probability is unknown. We allow unknown probabilities but don't allow using it together with known probabilities in successor list. That is to say, we either have a list of successors with all known probabilities, or all unknown probabilities. In the latter case, we assume each successor has 1/N probability where N is the number of successors. An assertion checks if the user is attempting to add a successor with the disallowed mixed use as stated above. This can help us catch many misuses. All uses of weight-based interfaces are now updated to use probability-based ones. Differential revision: http://reviews.llvm.org/D14973 llvm-svn: 254377	2015-12-01 05:29:22 +00:00
Hans Wennborg	1dbaf67537	Revert r254348: "Replace all weight-based interfaces in MBB with probability-based interfaces, and update all uses of old interfaces." and the follow-up r254356: "Fix a bug in MachineBlockPlacement that may cause assertion failure during BranchProbability construction." Asserts were firing in Chromium builds. See PR25687. llvm-svn: 254366	2015-12-01 03:49:42 +00:00
Cong Hou	fa1917c673	Replace all weight-based interfaces in MBB with probability-based interfaces, and update all uses of old interfaces. The patch in http://reviews.llvm.org/D13745 is broken into four parts: 1. New interfaces without functional changes (http://reviews.llvm.org/D13908). 2. Use new interfaces in SelectionDAG, while in other passes treat probabilities as weights (http://reviews.llvm.org/D14361). 3. Use new interfaces in all other passes. 4. Remove old interfaces. This patch is 3+4 above. In this patch, MBB won't provide weight-based interfaces any more, which are totally replaced by probability-based ones. The interface addSuccessor() is redesigned so that the default probability is unknown. We allow unknown probabilities but don't allow using it together with known probabilities in successor list. That is to say, we either have a list of successors with all known probabilities, or all unknown probabilities. In the latter case, we assume each successor has 1/N probability where N is the number of successors. An assertion checks if the user is attempting to add a successor with the disallowed mixed use as stated above. This can help us catch many misuses. All uses of weight-based interfaces are now updated to use probability-based ones. Differential revision: http://reviews.llvm.org/D14973 llvm-svn: 254348	2015-12-01 00:02:51 +00:00
Simon Pilgrim	db26b3ddfa	[X86][FMA4] Prefer FMA4 to FMA We currently output FMA instructions on targets which support both FMA4 + FMA (i.e. later Bulldozer CPUs bdver2/bdver3/bdver4). This patch flips this so FMA4 is preferred; this is for several reasons: 1 - FMA4 is non-destructive, reducing the need for mov instructions. 2 - It's more straightforward to commute and fold inputs (although the recent work on FMA has reduced this difference). 3 - All supported targets have FMA4 performance equal to or better than FMA - Piledriver (bdver2) in particular has half the throughput when executing FMA instructions. It looks like no future AMD processor lines will support FMA4 after the Bulldozer series so we're not causing problems for later CPUs. Differential Revision: http://reviews.llvm.org/D14997 llvm-svn: 254339	2015-11-30 22:22:06 +00:00
Paul Robinson	a2550a6da3	Have 'optnone' respect the -fast-isel=false option. This is primarily useful for debugging optnone v. ISel issues. Differential Revision: http://reviews.llvm.org/D14792 llvm-svn: 254335	2015-11-30 21:56:16 +00:00
Cong Hou	eb9c7056f0	[X86] Update test/CodeGen/X86/avg.ll with the help of update_llc_test_checks.py. NFC. llvm-svn: 254334	2015-11-30 21:46:08 +00:00
Matt Arsenault	26f8f3db39	AMDGPU: Rework how private buffer passed for HSA If we know we have stack objects, we reserve the registers that the private buffer resource and wave offset are passed in and use them directly. If not, reserve the last 5 SGPRs just in case we need to spill. After register allocation, try to pick the next available registers instead of the last SGPRs, and then insert copies from the inputs to the reserved registers in the prologue. This also only selectively enables all of the input registers which are really required instead of always enabling them. llvm-svn: 254331	2015-11-30 21:16:03 +00:00
Matt Arsenault	0e3d38937e	AMDGPU: Remove SIPrepareScratchRegs It does not work because of emergency stack slots. This pass was supposed to eliminate dummy registers for the spill instructions, but the register scavenger can introduce more during PrologEpilogInserter, so some would end up left behind if they were needed. The potential for spilling the scratch resource descriptor and offset register makes doing something like this overly complicated. Reserve registers to use for the resource descriptor and use them directly in eliminateFrameIndex. Also removes creating another scratch resource descriptor when directly selecting scratch MUBUF instructions. The choice of which registers are reserved is temporary. For now it attempts to pick the next available registers after the user and system SGPRs. llvm-svn: 254329	2015-11-30 21:15:53 +00:00
Matt Arsenault	ff6da2fe89	AMDGPU: Use assert zext for workgroup sizes llvm-svn: 254328	2015-11-30 21:15:45 +00:00
Quentin Colombet	cdad10f333	[ARM] For old thumb ISA like v4t, we cannot use PC directly in pop. Fix the epilogue emission to account for that. llvm-svn: 254325	2015-11-30 20:37:58 +00:00

14348 Commits