llvm-project

Commit Graph

Author	SHA1	Message	Date
Lei Liu	361615cfd0	AArch64: Set shift bit of TLSLE HI12 add instruction Summary: AArch64 LLVM assembler emits add instruction without shift bit to calculate the higher 12-bit address of TLS variables in local exec model. This generates wrong code sequence to access TLS variables with thread offset larger than 0x1000. Reviewers: t.p.northover, peter.smith, rovka Subscribers: salim.nasser, aemerson, llvm-commits, rengolin Differential Revision: https://reviews.llvm.org/D24702 llvm-svn: 282661	2016-09-29 01:05:48 +00:00
Quentin Colombet	40cbc27ff3	[RegisterBankInfo] Uniquely generate OperandsMapping. This is a step toward statically allocate InstructionMapping. Like the previous few commits, the goal is to move toward a TableGen'ed like structure with no dynamic allocation at all. This should already improve compile time by getting rid of a bunch of memmove of SmallVectors. llvm-svn: 282643	2016-09-28 22:20:49 +00:00
Konstantin Zhuravlyov	e14df4b236	[AMDGPU] Promote uniform i16 ops to i32 ops for targets that have 16 bit instructions Differential Revision: https://reviews.llvm.org/D24125 llvm-svn: 282624	2016-09-28 20:05:39 +00:00
Artem Belevich	3e1211581c	[NVPTX] Added intrinsics for atom.gen.{sys|cta}.* instructions. These are only available on sm_60+ GPUs. Differential Revision: https://reviews.llvm.org/D24943 llvm-svn: 282607	2016-09-28 17:25:38 +00:00
Nirav Dave	e524f50882	Revert "In visitSTORE, always use FindBetterChain, rather than only when UseAA is enabled." This reverts commit r282600 due to test failures with MCJIT llvm-svn: 282604	2016-09-28 16:37:50 +00:00
Dylan McKay	1f69cdb321	[AVR] Rename the builtin calling convention names 'BUILTIN' is clearer than 'RT' in this context. llvm-svn: 282602	2016-09-28 16:04:40 +00:00
Marina Yatsina	76bfc6670b	[x86] Accept 'retn' as an alias to 'ret[lqw]'\'ret' (At&t\Intel) Implement 'retn' simply by aliasing it to the relevant 'ret' instruction Commit on behalf of coby Differential Revision: https://reviews.llvm.org/D24346 llvm-svn: 282601	2016-09-28 15:52:56 +00:00
Nirav Dave	e17e055b75	In visitSTORE, always use FindBetterChain, rather than only when UseAA is enabled. Simplify Consecutive Merge Store Candidate Search Now that address aliasing is much less conservative, push through simplified store merging search which only checks for parallel stores through the chain subgraph. This is cleaner as the separation of non-interfering loads/stores from the store-merging logic. When merging stores, search up the chain through a single load, and find all possible stores by looking down through a load and a TokenFactor to all stores visited. This improves the quality of the output SelectionDAG and generally the output CodeGen (with some exceptions). Additional Minor Changes: 1. Finishes removing unused AliasLoad code 2. Unifies the chain aggregation in the merged stores across code paths 3. Re-add the Store node to the worklist after calling SimplifyDemandedBits. 4. Increase GatherAllAliasesMaxDepth from 6 to 18. That number is arbitrary, but seemed sufficient to not cause regressions in tests. This finishes the change Matt Arsenault started in r246307 and jyknight's original patch. Many tests required some changes as memory operations are now reorderable. Some tests relying on the order were changed to use volatile memory operations. Noteworthy tests: CodeGen/AArch64/argument-blocks.ll - It's not entirely clear what the test_varargs_stackalign test is supposed to be asserting, but the new code looks right. CodeGen/AArch64/arm64-memset-inline.lli - CodeGen/AArch64/arm64-stur.ll - CodeGen/ARM/memset-inline.ll - The backend now generates worse code due to store merging succeeding, as we do do a 16-byte constant-zero store efficiently. CodeGen/AArch64/merge-store.ll - Improved, but there still seems to be an extraneous vector insert from an element to itself? CodeGen/PowerPC/ppc64-align-long-double.ll - Worse code emitted in this case, due to the improved store->load forwarding. 
CodeGen/X86/dag-merge-fast-accesses.ll - CodeGen/X86/MergeConsecutiveStores.ll - CodeGen/X86/stores-merging.ll - CodeGen/Mips/load-store-left-right.ll - Restored correct merging of non-aligned stores CodeGen/AMDGPU/promote-alloca-stored-pointer-value.ll - Improved. Correctly merges buffer_store_dword calls CodeGen/AMDGPU/si-triv-disjoint-mem-access.ll - Improved. Sidesteps loading a stored value and merges two stores CodeGen/X86/pr18023.ll - This test has been removed, as it was asserting incorrect behavior. Non-volatile stores CAN be moved past volatile loads, and now are. CodeGen/X86/vector-idiv.ll - CodeGen/X86/vector-lzcnt-128.ll - It's basically impossible to tell what these tests are actually testing. But, looks like the code got better due to the memory operations being recognized as non-aliasing. CodeGen/X86/win32-eh.ll - Both loads of the securitycookie are now merged. CodeGen/AMDGPU/vgpr-spill-emergency-stack-slot-compute.ll - This test appears to work but no longer exhibits the spill behavior. Reviewers: arsenm, hfinkel, tstellarAMD, nhaehnle, jyknight Subscribers: wdng, nhaehnle, nemanjai, arsenm, weimingz, niravd, RKSimon, aemerson, qcolombet, resistor, tstellarAMD, t.p.northover, spatel Differential Revision: https://reviews.llvm.org/D14834 llvm-svn: 282600	2016-09-28 15:50:43 +00:00
Dylan McKay	536239f144	[AVR] Import the LLVM namespace inside AVRMCTargetDesc.cpp llvm-svn: 282598	2016-09-28 15:35:26 +00:00
Dylan McKay	e762094864	[AVR] Add AVRMCTargetDesc.cpp Summary: This adds the AVRMCTargetDesc file in tree. It allows creation of the core classes used in the backend. Reviewers: arsenm, kparzysz Subscribers: wdng, beanz, mgorny Differential Revision: https://reviews.llvm.org/D25023 llvm-svn: 282597	2016-09-28 15:31:12 +00:00
Dylan McKay	d6e7fc6d9a	[AVR] Update the signature of createAVRAsmBackend It has been recently changed to also take a MCTargetOptions structure. llvm-svn: 282594	2016-09-28 14:35:07 +00:00
Dylan McKay	f010a2b41a	[AVR] Enable the assembly parser We very recently landed the code. This commit enables the parser. It also adds a missing include to AVRAsmParser.cpp llvm-svn: 282593	2016-09-28 14:34:42 +00:00
Dylan McKay	0fe1e63837	[AVR] Merge most recent changes to AVRInstrInfo.td This adds two new things: - Operand types per fixup - Atomic pseudo operations llvm-svn: 282588	2016-09-28 13:44:02 +00:00
Dylan McKay	b967d16c43	[AVR] Update the data layout The previous data layout caused issues when dealing with atomics. For example, it is illegal to load a 16-bit value with less than 16-bits of alignment. This changes the data layout so that all types are aligned by at least their own width. Interestingly, this also _slightly_ decreased register pressure in some cases. llvm-svn: 282587	2016-09-28 13:29:10 +00:00
Dylan McKay	1f877f06b9	[AVR] Add assembly parser Summary: This patch adds the AVRAsmParser library. Reviewers: arsenm, kparzysz Subscribers: wdng, beanz, mgorny, kparzysz, simoncook, jtbandes, llvm-commits Differential Revision: https://reviews.llvm.org/D20046 llvm-svn: 282584	2016-09-28 13:02:57 +00:00
Guy Blank	2bdc74a471	[X86][FastISel] Use a COPY from K register to a GPR instead of a K operation The KORTEST was introduced due to a bug where a TEST instruction used a K register, but it turns out that the opposite case of KORTEST using a GPR is now happening. The change removes the KORTEST flow and adds a COPY instruction from the K reg to a GPR. Differential Revision: https://reviews.llvm.org/D24953 llvm-svn: 282580	2016-09-28 11:22:17 +00:00
Simon Pilgrim	55b8eaa505	Strip trailing whitespace llvm-svn: 282579	2016-09-28 11:08:00 +00:00
Jonas Paulsson	58c5a7f55a	[SystemZ] Implementation of getUnrollingPreferences(). This commit enables more unrolling for SystemZ by implementing the SystemZTargetTransformInfo::getUnrollingPreferences() method. It has been found that it is better to only unroll moderately, so the DefaultUnrollRuntimeCount has been moved into UnrollingPreferences in order to set this to a lower value for SystemZ (4). Reviewers: Evgeny Stupachenko, Ulrich Weigand. https://reviews.llvm.org/D24451 llvm-svn: 282570	2016-09-28 09:41:38 +00:00
Quentin Colombet	c0f11a9fb8	[AArch64][RegisterBankInfo] Switch to statically allocated ValueMapping. Another step toward TableGen'ed like structure for the RegisterBankInfo of AArch64. By doing this, we also save a bit of compile time for the exact same output. llvm-svn: 282550	2016-09-27 22:55:04 +00:00
Quentin Colombet	caae9cd246	[AArch64][RegisterBankInfo] Fix copy/paste in comments. NFC. llvm-svn: 282549	2016-09-27 22:54:57 +00:00
Sanjay Patel	764ae8bd72	[x86] add folds for FP logic with vector zeros The 'or' case shows up in copysign. The copysign code also had redundant checking for a scalar zero operand with 'and', so I removed that. I'm not sure how to test vector 'and', 'andn', and 'xor' yet, but it seems better to just include all of the logic ops since we're fixing 'or' anyway. llvm-svn: 282546	2016-09-27 22:28:13 +00:00
Geoff Berry	b124331db7	[TargetRegisterInfo, AArch64] Add target hook for isConstantPhysReg(). Summary: The current implementation of isConstantPhysReg() checks for defs of physical registers to determine if they are constant. Some architectures (e.g. AArch64 XZR/WZR) have registers that are constant and may be used as destinations to indicate the generated value is discarded, preventing isConstantPhysReg() from returning true. This change adds a TargetRegisterInfo hook that overrides the no defs check for cases such as this. Reviewers: MatzeB, qcolombet, t.p.northover, jmolloy Subscribers: junbuml, aemerson, mcrosier, rengolin Differential Revision: https://reviews.llvm.org/D24570 llvm-svn: 282543	2016-09-27 22:17:27 +00:00
Sanjay Patel	43ef1ad0ba	[x86] use isNullFPConstant(); NFCI Also, put the related FP logic functions together to see the similarities. llvm-svn: 282522	2016-09-27 18:48:02 +00:00
Krzysztof Parzyszek	586fc12e32	[RDF] Add "dead" flag to node attributes llvm-svn: 282520	2016-09-27 18:24:33 +00:00
Krzysztof Parzyszek	1d32220721	[RDF] Special treatment of exception handling registers A landing pad can have live-in registers that are defined by the runtime, not the program (exception pointer register and exception selector register). Make sure to recognize that case and not link these registers with any defs in the program. Each landing pad will have phi nodes added at the beginning to provide definitions of these registers, but the uses of those phi nodes will not have any reaching defs. llvm-svn: 282519	2016-09-27 18:18:44 +00:00
Konstantin Zhuravlyov	da4687c531	[AMDGPU] Enable changing instprinter's behavior based on the per-function subtarget This is a prerequisite for coming waitcnt changes Differential Revision: https://reviews.llvm.org/D24939 llvm-svn: 282489	2016-09-27 14:42:48 +00:00
Simon Dardis	d2ed8abb15	[mips] Disable tail calls temporarily Disable tail calls while the remaining bugs are fixed. Enable only for tests. Reviewers: vkalintiris Differential Review: https://reviews.llvm.org/D24912 llvm-svn: 282487	2016-09-27 13:15:54 +00:00
Simon Dardis	0486d585c5	[mips] Add rsqrt, recip for MIPS Add rsqrt.[ds], recip.[ds] for MIPS. Correct the microMIPS definitions for architecture support and register usage. Reviewers: vkalintiris, zoran.jovanoic Differential Review: https://reviews.llvm.org/D24499 llvm-svn: 282485	2016-09-27 12:25:15 +00:00
Nemanja Ivanovic	6f22b41398	[Power9] Builtins for ELF v.2 API conformance - back end portion This patch corresponds to review: https://reviews.llvm.org/D24396 This patch adds support for the "vector count trailing zeroes", "vector compare not equal" and "vector compare not equal or zero instructions" as well as "scalar count trailing zeroes" instructions. It also changes the vector negation to use XXLNOR (when VSX is enabled) so as not to increase register pressure (previously this was done with a splat immediate of all ones followed by an XXLXOR). This was done because the altivec.h builtins (patch to follow) use vector negation and the use of an additional register for the splat immediate is not optimal. llvm-svn: 282478	2016-09-27 08:42:12 +00:00
Craig Topper	789888002a	[X86] Use std::max to calculate alignment instead of assuming RC->getSize() will not return a value greater than 32. I think it theoretically could be 64 for AVX-512. llvm-svn: 282471	2016-09-27 06:44:25 +00:00
Davide Italiano	a9f85d68cc	[CodeGen] Add support for emitting .init_array instead of .ctors on FreeBSD. PR: 30494 llvm-svn: 282451	2016-09-26 22:53:15 +00:00
Derek Schuff	92d300eb8f	[WebAssembly] Use the frame pointer instead of the stack pointer When we have dynamic allocas we have a frame pointer, and when we're lowering frame indexes we should make sure we use it. Patch by Jacob Gravelle Differential Revision: https://reviews.llvm.org/D24889 llvm-svn: 282442	2016-09-26 21:18:03 +00:00
Nirav Dave	6477ce2697	Add support for Code16GCC [X86] The .code16gcc directive parses X86 assembly input in 32-bit mode and outputs in 16-bit mode. Teach parser to switch modes appropriately. Reviewers: dwmw2, craig.topper Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D20109 llvm-svn: 282430	2016-09-26 19:33:36 +00:00
Andrew Kaylor	595307a468	Add optimization bisect support to an optional Mips pass Differential Revision: https://reviews.llvm.org/D19513 llvm-svn: 282428	2016-09-26 19:05:37 +00:00
Tom Stellard	1b9748c6a2	AMDGPU/SI: Don't crash on anonymous GlobalValues Summary: We need to call AsmPrinter::getNameWithPrefix() in order to handle anonymous GlobalValues (e.g. @0, @1). Reviewers: arsenm, b-sumner Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, tony-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D24865 llvm-svn: 282420	2016-09-26 17:29:25 +00:00
Geoff Berry	256fcf975f	[AArch64] Improve add/sub/cmp isel of uxtw forms. Don't match the UXTW extended reg forms of ADD/ADDS/SUB/SUBS if the 32-bit to 64-bit zero-extend can be done for free by taking advantage of the 32-bit defining instruction zeroing the upper 32-bits of the X register destination. This enables better instruction selection in a few cases, such as: sub x0, xzr, x8 instead of: mov x8, xzr sub x0, x8, w9, uxtw madd x0, x1, x1, x8 instead of: mul x9, x1, x1 add x0, x9, w8, uxtw cmp x2, x8 instead of: sub x8, x2, w8, uxtw cmp x8, #0 add x0, x8, x1, lsl #3 instead of: lsl x9, x1, #3 add x0, x9, w8, uxtw Reviewers: t.p.northover, jmolloy Subscribers: mcrosier, aemerson, llvm-commits, rengolin Differential Revision: https://reviews.llvm.org/D24747 llvm-svn: 282413	2016-09-26 15:34:47 +00:00
Evandro Menezes	e45de8a5ec	Add support to optionally limit the size of jump tables. Many high-performance processors have a dedicated branch predictor for indirect branches, commonly used with jump tables. As sophisticated as such branch predictors are, they tend to have well defined limits beyond which their effectiveness is hampered or even nullified. One such limit is the number of possible destinations for a given indirect branch that such branch predictors can handle. This patch considers a limit that a target may set to the number of destination addresses in a jump table. Patch by: Evandro Menezes <e.menezes@samsung.com>, Aditya Kumar <aditya.k7@samsung.com>, Sebastian Pop <s.pop@samsung.com>. Differential revision: https://reviews.llvm.org/D21940 llvm-svn: 282412	2016-09-26 15:32:33 +00:00
Dylan McKay	c4ec11f451	[AVR] Add AVRMCExpr Summary: This adds the AVRMCExpr headers and implementation. Reviewers: arsenm, ruiu, grosbach, kparzysz Subscribers: wdng, beanz, mgorny, kparzysz, jtbandes, llvm-commits Differential Revision: https://reviews.llvm.org/D20503 llvm-svn: 282397	2016-09-26 11:35:32 +00:00
Sam Kolton	984461062f	Revert "[AMDGPU] Disassembler: print label names in branch instructions" This reverts commit 6c6dbe625263ec9fcf8de0df27263cf147cde550. llvm-svn: 282396	2016-09-26 11:29:03 +00:00
Sam Kolton	1559f76257	[AMDGPU] Disassembler: print label names in branch instructions Summary: Add AMDGPUSymbolizer for finding names for labels from ELF symbol table. Reviewers: vpykhtin, artem.tamazov, tstellarAMD Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, tony-tye Differential Revision: https://reviews.llvm.org/D24802 llvm-svn: 282394	2016-09-26 10:05:50 +00:00
James Molloy	9abb2fa5bb	[ARM] Promote small global constants to constant pools If a constant is unnamed_addr and is only used within one function, we can save on the code size and runtime cost of an indirection by changing the global's storage to inside the constant pool. For example, instead of: ldr r0, .CPI0 bl printf bx lr .CPI0: &format_string format_string: .asciz "hello, world!\n" We can emit: adr r0, .CPI0 bl printf bx lr .CPI0: .asciz "hello, world!\n" This can cause significant code size savings when many small strings are used in one function (4 bytes per string). This recommit contains fixes for a nasty bug related to fast-isel fallback - because fast-isel doesn't know about this optimization, if it runs and emits references to a string that we inline (because fast-isel fell back to SDAG) we will end up with an inlined string and also an out-of-line string, and we won't emit the out-of-line string, causing backend failures. It also contains fixes for emitting .text relocations which made the sanitizer bots unhappy. llvm-svn: 282387	2016-09-26 07:26:24 +00:00
Zvi Rackover	839d15a194	[X86] Optimization for replacing LEA with MOV at frame index elimination time Summary: Replace a LEA instruction of the form 'lea (%esp), %ebx' --> 'mov %esp, %ebx' MOV is preferable over LEA because usually there are more issue-slots available to execute MOVs than LEAs. Latest processors also support zero-latency MOVs. Fixes pr29022. Reviewers: hfinkel, delena, igorb, myatsina, mkuper Differential Revision: https://reviews.llvm.org/D24705 llvm-svn: 282385	2016-09-26 06:42:07 +00:00
Ayman Musa	d7a5ed4141	[X86][avx512] Fix bug in masked compress store. Differential Revision: https://reviews.llvm.org/D23984 llvm-svn: 282381	2016-09-26 06:22:08 +00:00
Craig Topper	87155274b8	[X86] Remove what appears to be leftover MMX code involving (v1i64 scalar_to_vector). llvm-svn: 282361	2016-09-25 16:34:11 +00:00
Craig Topper	aab59a48e7	[X86] Remove patterns for scalar_to_vector from FR32/FR64 to 256-bit vectors. Lowering explicitly avoids creating this pattern. llvm-svn: 282360	2016-09-25 16:34:09 +00:00
Craig Topper	0cc188d979	[AVX-512] Replace get512BitSuperRegister with calls to TargetRegisterInfo::getMatchingSuperReg. llvm-svn: 282359	2016-09-25 16:34:06 +00:00
Craig Topper	60d3ef1d72	[AVX-512] Fix some patterns predicates to properly enforce priority for various versions of CVTDQ2PD instruction. llvm-svn: 282358	2016-09-25 16:34:02 +00:00
Craig Topper	3c9faa32c1	[AVX-512] Add rounding versions of instructions to hasUndefRegUpdate. llvm-svn: 282357	2016-09-25 16:33:59 +00:00
Craig Topper	d8b2bd492c	[AVX-512] Add the scalar unsigned integer to fp conversion instructions to hasUndefRegUpdate. llvm-svn: 282356	2016-09-25 16:33:57 +00:00
Craig Topper	ac941b9736	[AVX-512] Remove duplicate instructions for converting integer to scalar floating point. We can use patterns to point to the other instructions instead. llvm-svn: 282355	2016-09-25 16:33:53 +00:00


39472 Commits