llvm-project

Commit Graph

Author	SHA1	Message	Date
Matt Arsenault	663ab8c119	AMDGPU: Use brev for materializing SGPR constants This is already done with VGPR immediates and saves 4 bytes. llvm-svn: 285765	2016-11-01 23:14:20 +00:00
Matt Arsenault	3d463193a9	AMDGPU: Default to using scalar mov to materialize immediate This is the conservatively correct way because it's easy to move or replace a scalar immediate. This was incorrect in the case when the register class wasn't known from the static instruction definition, but still needed to be an SGPR. The main example of this is inlineasm has an SGPR constraint. Also start verifying the register classes of inlineasm operands. llvm-svn: 285762	2016-11-01 22:55:07 +00:00
Eric Christopher	690f8e587e	Move the initialization of PreferredLoopExit into runOnMachineFunction to be near the other function specific initializations. llvm-svn: 285758	2016-11-01 22:15:50 +00:00
Sam McCall	2a36eee4ca	Fix uninitialized access in MachineBlockPlacement. Summary: Currently PreferredLoopExit is set only in buildLoopChains, which is never called if there are no MachineLoops. MSan is currently broken by this: http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-fast/builds/145/steps/check-llvm%20msan/logs/stdio This is a naive fix to get things green again. iteratee: you may have a better fix. This change will also mean PreferredLoopExit will not carry over if buildCFGChains() is called a second time in runOnMachineFunction, this appears to be the right thing. Reviewers: bkramer, iteratee, echristo Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D26069 llvm-svn: 285757	2016-11-01 22:02:14 +00:00
Matt Arsenault	a6319b82ca	AMDGPU: Stop creating unused virtual registers These are only used in the spill to VMEM path. Move them to the one use. llvm-svn: 285756	2016-11-01 21:58:07 +00:00
George Burgess IV	66837aba0a	[MemorySSA] Tighten up types to make our API prettier. NFC. Patch by bryant. Differential Revision: https://reviews.llvm.org/D26126 llvm-svn: 285750	2016-11-01 21:17:46 +00:00
Sanjay Patel	9840cdad4c	[ValueTracking] remove TODO comment; NFC InstCombine should always canonicalize patterns like the one shown in the comment when visiting 'select' insts in adjustMinMax(). Scalars were already handled there, and vector splats are handled after: https://reviews.llvm.org/rL285732 llvm-svn: 285744	2016-11-01 20:43:00 +00:00
Matt Arsenault	2d8c289b4b	AMDGPU: Workaround for instruction size with literals Instructions with a 32-bit base encoding with an optional 32-bit literal encoded after them report their size as 4 for the disassembler. Consider these when computing the MachineInstr size. This fixes problems caused by size estimate consistency in BranchRelaxation. llvm-svn: 285743	2016-11-01 20:42:24 +00:00
Sanjay Patel	c3d89842ad	[InstCombine] allow splat vector folds in adjustMinMax() llvm-svn: 285732	2016-11-01 20:08:02 +00:00
Sanjay Patel	c0339c77ef	[InstCombine] Fold nuw left-shifts in `ugt`/`ule` comparisons. This transforms %a = shl nuw %x, c1 %b = icmp {ugt\|ule} %a, c0 into %b = icmp {ugt\|ule} %x, (c0 >> c1) z3: (declare-const x (_ BitVec 64)) (declare-const c0 (_ BitVec 64)) (declare-const c1 (_ BitVec 64)) (push) (assert (= x (bvlshr (bvshl x c1) c1))) ; nuw (assert (not (= (bvugt (bvshl x c1) c0) (bvugt x (bvlshr c0 c1))))) (check-sat) (get-model) (pop) (push) (assert (= x (bvlshr (bvshl x c1) c1))) ; nuw (assert (not (= (bvule (bvshl x c1) c0) (bvule x (bvlshr c0 c1))))) (check-sat) (get-model) (pop) Patch by bryant! Differential Revision: https://reviews.llvm.org/D25913 llvm-svn: 285729	2016-11-01 19:19:29 +00:00
Krzysztof Parzyszek	654dc11b79	[Hexagon] Rename operand/predicate names for unshifted integers For example, rename s6Ext to s6_0Ext. The names for shifted integers include the underscore and this will make the naming consistent. It also exposed a few duplicates that were removed. llvm-svn: 285728	2016-11-01 19:02:10 +00:00
Matt Arsenault	cb578f84e0	BranchRelaxation: Expand unconditional branches first It's likely if a conditional branch needs to be expanded, the following unconditional branch will also need expansion. By expanding the unconditional branch first, the conditional branch can be simply inverted to jump over the inserted indirect branch block. If the conditional branch is expanded first, it results in an additional branch. This avoids test regressions in future commits. llvm-svn: 285722	2016-11-01 18:34:00 +00:00
Sanjay Patel	644d7c3b8a	[InstCombine] clean up adjustMinMax(); NFCI 1. Change param names for readability 2. Change pointer param to ref 3. Early exit to reduce indent 4. Change switch to if/else llvm-svn: 285718	2016-11-01 18:15:03 +00:00
Konstantin Zhuravlyov	d971a1123f	[AMDGPU] Check if type transforms to i16 (VI+) when getting AMDGPUISD::FFBH_U32 This will prevent following regression when enabling i16 support (D18049): test/CodeGen/AMDGPU/ctlz.ll test/CodeGen/AMDGPU/ctlz_zero_undef.ll Differential Revision: https://reviews.llvm.org/D25802 llvm-svn: 285716	2016-11-01 17:49:33 +00:00
Sanjay Patel	7ce658388b	[InstCombine] add helper function for adjustMinMax(); NFCI This is just a cut and paste; clean-up and enhancements to follow. llvm-svn: 285715	2016-11-01 17:46:08 +00:00
Alex Bradbury	b2e5472d85	[RISCV] Add stub backend This contains just enough for lib/Target/RISCV to compile. Notably a basic RISCVTargetMachine and RISCVTargetInfo. At this point you can attempt llc -march=riscv32 myinput.ll and will find it fails due to the lack of MCAsmInfo. See http://lists.llvm.org/pipermail/llvm-dev/2016-August/103748.html for further discussion Differential Revision: https://reviews.llvm.org/D23560 llvm-svn: 285712	2016-11-01 17:27:54 +00:00
Tom Stellard	9677b60288	AMDGPU: Fix buildbots broken by r285704 llvm-svn: 285711	2016-11-01 17:20:03 +00:00
Alex Bradbury	1524f62b97	[RISCV] Add RISC-V ELF defines Add the necessary definitions for RISC-V ELF files, including relocs. Also make necessary trivial change to ELFYaml, llvm-objdump, and llvm-readobj in order to work with RISC-V ELFs. Differential Revision: https://reviews.llvm.org/D23557 llvm-svn: 285708	2016-11-01 16:59:37 +00:00
Alex Bradbury	b6e784a240	[RISCV] Recognise riscv32 and riscv64 in triple parsing code This is the first in a series of 10 initial patches that incrementally add an MC layer for RISC-V to LLVM. See <http://lists.llvm.org/pipermail/llvm-dev/2016-August/103748.html> for more discussion. Differential Revision: https://reviews.llvm.org/D23557 llvm-svn: 285707	2016-11-01 16:47:54 +00:00
Alex Bradbury	58eba09949	[TableGen] Move OperandMatchResultTy enum to MCTargetAsmParser.h As it stands, the OperandMatchResultTy is only included in the generated header if there is custom operand parsing. However, almost all backends make use of MatchOperand_Success and friends from OperandMatchResultTy for e.g. parseRegister. This is a pain when starting an AsmParser for a new backend that doesn't yet have custom operand parsing. Move the enum to MCTargetAsmParser.h. This patch is a prerequisite for D23563 Differential Revision: https://reviews.llvm.org/D23496 llvm-svn: 285705	2016-11-01 16:32:05 +00:00
Tom Stellard	94c21bc088	AMDGPU: Implement expansion of f16 = FP_TO_FP16 f64 I wanted to implement this as a target independent expansion, however when targets say they want to expand FP_TO_FP16 what they actually want is the unsafe math expansion when possible and expansion to a libcall in all other cases. The only way to make this work as a target independent would be to add logic to target's TargetLowering construction to mark theses nodes as Expand when LegalizeDAG can use the unsafe expansion and mark them as LibCall when it cannot. I think this would be possible, but I think it would be too fragile and complex as it would require targets to keep their expansion logic up to date with the code in LegalizeDAG. Reviewers: bogner, ab, t.p.northover, arsenm Subscribers: wdng, llvm-commits, nhaehnle Differential Revision: https://reviews.llvm.org/D25999 llvm-svn: 285704	2016-11-01 16:31:48 +00:00
Simon Pilgrim	6dd8fab443	[InstCombine] Folding of shifts by the sum of positive values This patch introduces the combine: (C1 shift (A add C2)) -> ((C1 shift C2) shift A) iff A and C2 are both positive If both A and C2 are know to be positive then we can safely split into 2 shifts, permitting the folding of the Inner shift. Fix for the spec benchmark case mentioned by @nadav on PR15141 (assuming we can prove that the inputs as positive). Differential Revision: https://reviews.llvm.org/D26000 llvm-svn: 285696	2016-11-01 15:40:30 +00:00
James Molloy	70a3d6df52	[Thumb-1] Synthesize TBB/TBH instructions to make use of compressed jump tables [Reapplying r284580 and r285917 with fix and testing to ensure emitted jump tables for Thumb-1 have 4-byte alignment] The TBB and TBH instructions in Thumb-2 allow jump tables to be compressed into sequences of bytes or shorts respectively. These instructions do not exist in Thumb-1, however it is possible to synthesize them out of a sequence of other instructions. It turns out this sequence is so short that it's almost never a lose for performance and is ALWAYS a significant win for code size. TBB example: Before: lsls r0, r0, #2 After: add r0, pc adr r1, .LJTI0_0 ldrb r0, [r0, #6] ldr r0, [r0, r1] lsls r0, r0, #1 mov pc, r0 add pc, r0 => No change in prologue code size or dynamic instruction count. Jump table shrunk by a factor of 4. The only case that can increase dynamic instruction count is the TBH case: Before: lsls r0, r4, #2 After: lsls r4, r4, #1 adr r1, .LJTI0_0 add r4, pc ldr r0, [r0, r1] ldrh r4, [r4, #6] mov pc, r0 lsls r4, r4, #1 add pc, r4 => 1 more instruction in prologue. Jump table shrunk by a factor of 2. So there is an argument that this should be disabled when optimizing for performance (and a TBH needs to be generated). I'm not so sure about that in practice, because on small cores with Thumb-1 performance is often tied to code size. But I'm willing to turn it off when optimizing for performance if people want (also note that TBHs are fairly rare in practice!) llvm-svn: 285690	2016-11-01 13:37:41 +00:00
Valery Pykhtin	8a89d3662a	[AMDGPU] Expand vector mulhu/mulhs Differential revision: https://reviews.llvm.org/D26077 llvm-svn: 285684	2016-11-01 10:26:48 +00:00
Nemanja Ivanovic	e70fa63390	[PowerPC] Implement vector shift builtins - llvm portion This patch corresponds to review https://reviews.llvm.org/D26095. Committing on behalf of Tony Jiang. llvm-svn: 285681	2016-11-01 09:42:32 +00:00
Serge Pavlov	6ac8e034f6	Allow resolving response file names relative to including file If a response file included by construct @file itself includes a response file and that file is specified by relative file name, current behavior is to resolve the name relative to the current working directory. The change adds additional flag to ExpandResponseFiles that may be used to resolve nested response file names relative to including file. With the new mode a set of related response files may be kept together and reference each other with short position independent names. Differential Revision: https://reviews.llvm.org/D24917 llvm-svn: 285675	2016-11-01 06:53:29 +00:00
Sanjoy Das	9c729067e9	[TBAA] Use wrapper objects instead of raw getOperand s; NFC This is intended to make the semantic intent clearer. The wrapper objects are now generic to avoid `const_cast` s. Since `const` ness is part of the API of `MDNode::getMostGenericTBAA` (and therefore I can't make things `const` all the way through without some code churn outside TypeBasedAliasAnalysis.cpp), this seemed like the cleanest solution. llvm-svn: 285665	2016-11-01 02:58:30 +00:00
Sanjoy Das	a1a77f3a21	[TBAA] Rename accessors to be more idiomatic; NFC llvm-svn: 285661	2016-11-01 01:21:57 +00:00
Peter Collingbourne	d3a6c70b2d	Bitcode: Simplify BitstreamWriter::EnterBlockInfoBlock() interface. No block info block should need to define local abbreviations, so we can always use a code width of 2. Also change all block info block writers to use EnterBlockInfoBlock. Differential Revision: https://reviews.llvm.org/D26168 llvm-svn: 285660	2016-11-01 01:18:57 +00:00
Matt Arsenault	f3dd863031	AMDGPU: Whitespace fixes llvm-svn: 285659	2016-11-01 00:55:14 +00:00
Sanjay Patel	70c5f02d25	[DAG] disable nsw/nuw for add/sub/mul when simplifying based on demanded bits (PR30841) This bug was exposed by using nsw/nuw for more aggressive folds in: https://reviews.llvm.org/rL284844 The changes mimic the IR demanded bits logic in InstCombiner::SimplifyDemandedUseBits(), but we can't just flip flag bits in the DAG; we have to create a new node that has the bits cleared. This should fix: https://llvm.org/bugs/show_bug.cgi?id=30841 llvm-svn: 285656	2016-10-31 23:28:45 +00:00
Davide Italiano	51cbe13a3f	[Hexagon] Garbage collect dead code. llvm-svn: 285654	2016-10-31 22:56:56 +00:00
Evgeniy Stepanov	1bd9fc7098	Fix a typo. Found with PVS-Studio here: http://www.viva64.com/en/b/0446/ llvm-svn: 285652	2016-10-31 22:42:39 +00:00
Saleem Abdulrasool	e1aa782bd0	CodeGen: further loosen -O0 CG for WoA division Generate the slowest possible codepath for noopt CodeGen. Even trying to be clever with the negated jump can cause out-of-range jumps. Use a wide branch instead. Although the code is modelled simplistically, the later optimizations would recombine the branching into `cbz` if possible. This re-enables the previous optimization as well as hopefully gives us working code in all cases. Addresses PR30356! llvm-svn: 285649	2016-10-31 22:12:37 +00:00
Teresa Johnson	002af9bbce	[ThinLTO] Disable importing and other cross-module optis at -O0 Summary: There is no point to importing at -O0, since we won't inline. We should also disable other cross-module optimizations. (Plan to backport this fix to the 3.9 branch to fix PR30774) Reviewers: pcc Subscribers: johanengelen, mehdi_amini Differential Revision: https://reviews.llvm.org/D25918 llvm-svn: 285648	2016-10-31 22:12:21 +00:00
Justin Lebar	ed1e312f05	[NVPTX] Remove NVPTXFavorNonGenericAddrSpaces pass. Summary: This has been replaced by the NVPTXInferAddressSpaces pass. We've had the new one as the default with the old one accessible via a flag for some months now, and we've had no problems. Reviewers: tra Subscribers: llvm-commits, jholewinski, jingyue, mgorny Differential Revision: https://reviews.llvm.org/D26165 llvm-svn: 285642	2016-10-31 21:51:42 +00:00
Kevin Enderby	d503940e8f	More additional error checks for invalid Mach-O files when the offsets and sizes of an element of the file overlaps with another element in the Mach-O file. This shows the approach to this testing for three elements and contains for tests for their overlap. Checking for all the remain elements will be added next. llvm-svn: 285632	2016-10-31 20:29:48 +00:00
Nemanja Ivanovic	60bdfe5a7c	[PPC] add absolute difference altivec instructions and matching intrinsics This patch corresponds to review https://reviews.llvm.org/D26072. Committing on behalf of Sean Fertile. llvm-svn: 285627	2016-10-31 19:47:52 +00:00
Victor Leschuk	e1156c2eb0	DebugInfo: make DW_TAG_atomic_type valid DW_TAG_atomic_type was already included in Dwarf.defs and emitted correctly, however Verifier didn't recognize it as valid. Thus we introduce the following changes: * Make DW_TAG_atomic_type valid tag for IR and DWARF (enabled only with -gdwarf-5) * Add it to related docs * Add DebugInfo tests Differential Revision: https://reviews.llvm.org/D26144 llvm-svn: 285624	2016-10-31 19:09:38 +00:00
Kuba Brecka	a28c9e8f09	[asan] Move instrumented null-terminated strings to a special section, LLVM part On Darwin, simple C null-terminated constant strings normally end up in the __TEXT,__cstring section of the resulting Mach-O binary. When instrumented with ASan, these strings are transformed in a way that they cannot be in __cstring (the linker unifies the content of this section and strips extra NUL bytes, which would break instrumentation), and are put into a generic __const section. This breaks some of the tools that we have: Some tools need to scan all C null-terminated strings in Mach-O binaries, and scanning all the contents of __const has a large performance penalty. This patch instead introduces a special section, __asan_cstring which will now hold the instrumented null-terminated strings. Differential Revision: https://reviews.llvm.org/D25026 llvm-svn: 285619	2016-10-31 18:51:58 +00:00
Tim Northover	037af52c8b	GlobalISel: allow truncating pointer casts on AArch64. llvm-svn: 285615	2016-10-31 18:31:09 +00:00
Tim Northover	cdf23f1d93	GlobalISel: translate stack protector intrinsics llvm-svn: 285614	2016-10-31 18:30:59 +00:00
Rui Ueyama	ddc79225c3	Define DbiStreamBuilder::addSectionMap. This change enables LLD to construct a Section Map stream in a PDB file. I do not understand all these fields in the Section Map yet, but it seems like a copy of a COFF section header in another format. With this patch, DbiStreamBuilder can emit a Section Map which llvm-pdbdump can dump. Differential Revision: https://reviews.llvm.org/D26112 llvm-svn: 285606	2016-10-31 17:38:56 +00:00
Greg Clayton	cddab279f6	Modify DWARFFormValue to remember the DWARFUnit that it was decoded with. Modifying DWARFFormValue to remember the DWARFUnit that it was encoded with can simplify the usage of instances of this class. Previously users would have to try and pass in the same DWARFUnit that was used to decode the form value and there was a possibility that a different DWARFUnit might be supplied to the functions that extract values (strings, CU relative references, addresses) and cause problems. This fixes this potential issue by storing the DWARFUnit inside the DWARFFormValue so that this mistake can't be made. Instances of DWARFFormValue are not stored permanently and are used as temporary values, so the increase in size of an instance of DWARFFormValue isn't a big deal. This makes decoding form values more bullet proof and is a change that will be used by future modifications. https://reviews.llvm.org/D26052 llvm-svn: 285594	2016-10-31 16:46:02 +00:00
Michael Zuckerman	68a5c53616	[x86][inline-asm][AVX512][llvm][PART-2] Introducing "k" and "Yk" constraints for extended inline assembly, enabling use of AVX512 masked vectorized instructions. Commit on behalf of mharoush Extending inline assembly support, compatible with GCC as folowing: "k" constraint hints the compiler to select any of AVX512 k0-k7 registers. "Yk" constraint is a subset of "k" excluding k0 which is not allowd to be used as a mask. Reviewer: 1. rnk Differential Revision: https://reviews.llvm.org/D25062 llvm-svn: 285591	2016-10-31 16:19:58 +00:00
Artem Tamazov	54bfd548aa	[AMDGPU][MC][gfx8] Support 20-bit immediate offset in SMEM instructions. Fixes Bug 30808. Note that passing subtarget information to predicates seems too complicated, so gfx8-specific def smrd_offset_20 introduced. Old gfx6/7-specific def renamed to smrd_offset_8 for clarity. Lit tests updated. Differential Revision: https://reviews.llvm.org/D26085 llvm-svn: 285590	2016-10-31 16:07:39 +00:00
Krzysztof Parzyszek	22586dcb2a	[Hexagon] Don't expand mux instructions with both sources identical llvm-svn: 285588	2016-10-31 15:45:09 +00:00
Ulrich Weigand	2e5e51b3f3	[SystemZ] Rework processor feature definitions and add -mcpu=archX support This patch implements two changes: - Move processor feature definition into a new file SystemZFeatures.td, and provide explicit lists of supported and unsupported features for each level of the z/Architecture. This allows specifying unsupported features in the scheduler definition files for each processor. - Add optional aliases for the -mcpu processor names according to the level of the z/Architecture, for compatibility with other compilers on the platform. The supported aliases are: -mcpu=arch8 equals -mcpu=z10 -mcpu=arch9 equals -mcpu=z196 -mcpu=arch10 equals -mcpu=zEC12 -mcpu=arch11 equals -mcpu=z13 llvm-svn: 285577	2016-10-31 14:33:29 +00:00
Ulrich Weigand	d28be373d4	[SystemZ] Guard LEFR/LFER with FeatureVector The LEFR/LFER pseudos are aliases for vector instructions and should therefore be guared by FeatureVector. If they aren't, the TableGen scheduler definition checking might complain that there is no data for those pseudos for pre-z13 machines. No functional change intended. llvm-svn: 285576	2016-10-31 14:28:43 +00:00
Ulrich Weigand	d9001301d9	[SystemZ] Correctly diagnose missing features in AsmParser Currently, when using an instruction that is not supported on the currently selected architecture, the LLVM assembler is likely to diagnose an "invalid operand" instead of a "missing feature". This is because many operands require a custom parser in order to be processed correctly, and if an instruction is not available according to the current feature set, the generated parser code will also not detect the associated custom operand parsers. Fixed by temporarily enabling all features while parsing operands. The missing features will then be correctly detected when actually parsing the instruction itself. llvm-svn: 285575	2016-10-31 14:25:05 +00:00
Ulrich Weigand	ec5d779eb8	[SystemZ] Fix encoding of MVCK and .insn ss LLVM currently treats the first operand of MVCK as if it were a regular base+index+displacement address. However, it is in fact a base+displacement combined with a length register field. While the two might look syntactically similar, there are two semantic differences: - %r0 is a valid length register, even though it cannot be used as an index register. - In an expression with just a single register like 0(%rX), the register is treated as base with normal addresses, while it is treated as the length register (with an empty base) for MVCK. Fixed by adding a new operand parser class BDRAddr and reworking the assembler parser to distinguish between address + length register operands and regular addresses. llvm-svn: 285574	2016-10-31 14:21:36 +00:00
Dorit Nuzman	bf2c15b5dc	Second attempt at r285517. llvm-svn: 285568	2016-10-31 13:17:31 +00:00
Jonas Paulsson	6788ddeac9	[SystemZ] Model 2 VBU units (not 1) in SystemZScheduleZ13.td. NFC. Review: Ulrich Weigand. llvm-svn: 285566	2016-10-31 13:05:48 +00:00
Alexey Bataev	d07c731d86	Improved cost model for FDIV and FSQRT, by Andrew Tischenko There is a bug describing poor cost model for floating point operations: Bug 29083 - [X86][SSE] Improve costs for floating point operations. This patch is the second one in series of patches dealing with cost model. Differential Revision: https://reviews.llvm.org/D25722 llvm-svn: 285564	2016-10-31 12:10:53 +00:00
Craig Topper	d4e580705d	[AVX-512] Add missing patterns for selecting masked vector extracts that started from shuffles. llvm-svn: 285546	2016-10-31 05:55:57 +00:00
Sanjoy Das	1707869db5	[SCEV] Try to order n-ary expressions in CompareValueComplexity llvm-svn: 285535	2016-10-31 03:32:43 +00:00
Sanjoy Das	299e67291c	[SCEV] In CompareValueComplexity, order global values by their name llvm-svn: 285529	2016-10-30 23:52:56 +00:00
Sanjoy Das	b4830a84b9	[SCEV] Use auto for consistency with an upcoming change; NFC llvm-svn: 285528	2016-10-30 23:52:53 +00:00
Sanjay Patel	339a51ac13	[DAG] x \| x --> x llvm-svn: 285522	2016-10-30 18:19:35 +00:00
Sanjay Patel	13aee345ca	[DAG] x & x --> x llvm-svn: 285521	2016-10-30 18:13:30 +00:00
Dorit Nuzman	06903d16af	Revert r285517 due to build failures. llvm-svn: 285518	2016-10-30 14:34:57 +00:00
Dorit Nuzman	3c1c658f24	[LoopVectorize] Make interleaved-accesses analysis less conservative about possible pointer-wrap-around concerns, in some cases. Before this patch, collectConstStridedAccesses (part of interleaved-accesses analysis) called getPtrStride with [Assume=false, ShouldCheckWrap=true] when examining all candidate pointers. This is too conservative. Instead, this patch makes collectConstStridedAccesses use an optimistic approach, calling getPtrStride with [Assume=true, ShouldCheckWrap=false], and then, once the candidate interleave groups have been formed, revisits the pointer-wrapping analysis but only where it matters: namely, in groups that have gaps, and where the gaps are not at the very end of the group (in which case the loop is peeled). This second time getPtrStride is called with [Assume=false, ShouldCheckWrap=true], but this could further be improved to using Assume=true, once we also add the logic to track that we are not going to meet the scev runtime checks threshold. Differential Revision: https://reviews.llvm.org/D25276 llvm-svn: 285517	2016-10-30 12:23:26 +00:00
Craig Topper	b7781a95fd	[X86] Use intrinsics table for PMADDUBSW and PMADDWD so that we can use the legacy intrinsics to select EVEX encoded instructions when available. This removes a couple tablegen classes that become unused after this change. Another class gained an additional parameter to allow PMADDUBSW to specify a different result type from its input type. llvm-svn: 285515	2016-10-30 06:56:16 +00:00
Teresa Johnson	bf28c8fa45	[ThinLTO] Use per-summary flag to prevent exporting locals used in inline asm Summary: Instead of using the workaround of suppressing the entire index for modules that call inline asm that may reference locals, use the NoRename flag on the summary for any locals in the llvm.used set, and add a reference edge from any functions containing inline asm. This avoids issues from having no summaries despite the module defining global values, which was preventing more aggressive index-based optimization. It will be followed by a subsequent patch to make a similar fix for local references in module level asm (to fix PR30610). Reviewers: mehdi_amini Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D26121 llvm-svn: 285513	2016-10-30 05:40:44 +00:00
Teresa Johnson	3bc8abdffc	[ThinLTO] Correctly resolve linkonce when importing aliasee Summary: When we have an aliasee that is linkonce, while we can't convert the non-prevailing copies to available_externally, we still need to convert the prevailing copy to weak. If a reference to the aliasee is exported, not converting a copy to weak will result in undefined references when the linkonce is removed in its original module. Add a new test and update existing tests. Reviewers: mehdi_amini Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D26076 llvm-svn: 285512	2016-10-30 05:15:23 +00:00
Craig Topper	bf9e5a16a4	[X86] Don't use loadv2i64 on SSE version of PMULHRSW. Use memopv2i64 instead. This bug was introduced in r285501. llvm-svn: 285510	2016-10-30 00:02:55 +00:00
NAKAMURA Takumi	ff76cfefc0	NativeFormatting.cpp: Fix build for mingw. Where would writePadding() be? llvm-svn: 285509	2016-10-29 23:14:18 +00:00
Teresa Johnson	38d4df714c	[ThinLTO] Rename doPromoteLocalToGlobal to shouldPromoteLocalToGlobal (NFC) Rename as suggested in code review for D26063. llvm-svn: 285508	2016-10-29 21:52:23 +00:00
Teresa Johnson	1b9c2be8f4	[ThinLTO] Use NoPromote flag in summary during promotion Summary: Replace the check of whether a GV has a section with the flag check in the summary. This is in preparation for using the NoPromote flag to convey other situations when we can't promote (e.g. locals used in inline asm). Reviewers: mehdi_amini Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D26063 llvm-svn: 285507	2016-10-29 21:31:48 +00:00
Peter Collingbourne	310474f576	IR: Remove a no longer needed assert. This assert was checking for a miscompile in a version of GCC that we no longer support. llvm-svn: 285506	2016-10-29 20:57:12 +00:00
Craig Topper	defe9ffbb5	[X86] Use intrinsics table for VPMULHRSW intrincis so that the legacy intrinsics can select EVEX encoded instructions when available. This requires a minor rename of the instructions due to the use of different tablegen classes and how the names are concatenated. llvm-svn: 285501	2016-10-29 18:41:45 +00:00
Sanjay Patel	36eeb6d6f6	[ValueTracking] recognize more variants of smin/smax Try harder to detect obfuscated min/max patterns: the initial pattern was added with D9352 / rL236202. There was a bug fix for PR27137 at rL264996, but I think we can do better by folding the corresponding smax pattern and commuted variants. The codegen tests demonstrate the effect of ValueTracking on the backend via SelectionDAGBuilder. We can't expose these differences minimally in IR because we don't have smin/smax intrinsics for IR. Differential Revision: https://reviews.llvm.org/D26091 llvm-svn: 285499	2016-10-29 16:21:19 +00:00
Sanjay Patel	978f827d12	[InstCombine] re-use bitcasted compare operands in selects (PR28001) These mixed bitcast patterns show up with SSE/AVX intrinsics because we bitcast function parameters to <2 x i64>. The bitcasts obfuscate the expected min/max forms as shown in PR28001: https://llvm.org/bugs/show_bug.cgi?id=28001#c6 Differential Revision: https://reviews.llvm.org/D25943 llvm-svn: 285495	2016-10-29 15:22:04 +00:00
Simon Pilgrim	75a697a17e	[DAGCombiner] (REAPPLIED) Add vector demanded elements support to computeKnownBits Currently computeKnownBits returns the common known zero/one bits for all elements of vector data, when we may only be interested in one/some of the elements. This patch adds a DemandedElts argument that allows us to specify the elements we actually care about. The original computeKnownBits implementation calls with a DemandedElts demanding all elements to match current behaviour. Scalar types set this to 1. The approach was found to be easier than trying to add a per-element known bits solution, for a similar usefulness given the combines where computeKnownBits is typically used. I've only added support for a few opcodes so far (the ones that have proven straightforward to test), all others will default to demanding all elements but can be updated in due course. DemandedElts support could similarly be added to computeKnownBitsForTargetNode in a future commit. This looked like this had caused compile time regressions on some buildbots (and was reverted in rL285381), but appears to have just been a harmless bystander! Differential Revision: https://reviews.llvm.org/D25691 llvm-svn: 285494	2016-10-29 11:29:39 +00:00
Elena Demikhovsky	519b4ccd70	Fixed FMA + FNEG combine. Masked form of FMA should be omitted in this optimization. Differential Revision: https://reviews.llvm.org/D25984 llvm-svn: 285492	2016-10-29 08:44:46 +00:00
Matt Arsenault	c88ba36eab	AMDGPU: Use 1/2pi inline imm on VI I'm guessing at how it is supposed to be printed llvm-svn: 285490	2016-10-29 04:05:06 +00:00
Matthias Braun	7d78614ae9	AArch64DeadRegisterDefinitionsPass: Cleanup; NFC - Fix doxygen file comment - reduce indentation in loop - Factor out some common subexpressions - Move independent helper function out of class - Fix Changed flag (this is not strictly NFC but a bugfix, but the flag seems ignored anyway) llvm-svn: 285488	2016-10-29 01:03:41 +00:00
Rui Ueyama	77be2403f6	Define calculateDbgStreamSize for consistency. llvm-svn: 285487	2016-10-29 00:56:44 +00:00
Tim Shen	1bab9cfbe5	[APFloat] Remove the redundent function body of uninitialized ctor, which should be done in r285468 llvm-svn: 285486	2016-10-29 00:51:41 +00:00
Zachary Turner	5b2243e884	Resubmit "Add support for advanced number formatting." This resubmits r284436 and r284437, which were reverted in r284462 as they were breaking the AArch64 buildbot. The breakage on AArch64 turned out to be a miscompile which is still not fixed, but is actively tracked at llvm.org/pr30748. This resubmission re-writes the code in a way so as to make the miscompile not happen. llvm-svn: 285483	2016-10-29 00:27:22 +00:00
Davide Italiano	86168b23cf	[DAGCombiner] Fix a crash visiting `AND` nodes. Instead of asserting that the shift count is != 0 we just bail out as it's not profitable trying to optimize a node which will be removed anyway. Differential Revision: https://reviews.llvm.org/D26098 llvm-svn: 285480	2016-10-28 23:55:32 +00:00
Tom Stellard	6695ba0440	AMDGPU/SI: Don't use non-0 waitcnt values when waiting on Flat instructions Summary: Flat instruction can return out of order, so we need always need to wait for all the outstanding flat operations. Reviewers: tony-tye, arsenm Subscribers: kzhuravl, wdng, nhaehnle, llvm-commits, yaxunl Differential Revision: https://reviews.llvm.org/D25998 llvm-svn: 285479	2016-10-28 23:53:48 +00:00
Matt Arsenault	4e9c1e3a79	AMDGPU: Fix instruction flags for s_endpgm Set isReturn, remove hasSideEffects. Also remove hasCtrlDep, I'm not really sure what that does. llvm-svn: 285476	2016-10-28 23:00:38 +00:00
Adrian Prantl	3cd37d0aeb	Refactor DW_LNE_* into Dwarf.def llvm-svn: 285475	2016-10-28 22:57:02 +00:00
Adrian Prantl	79deba6446	Refactor DW_LNS_* into Dwarf.def llvm-svn: 285474	2016-10-28 22:56:59 +00:00
Adrian Prantl	8580d3f3d3	Refactor DW_APPLE_PROPERTY_* into Dwarf.def llvm-svn: 285473	2016-10-28 22:56:56 +00:00
Adrian Prantl	44a4461b16	Refactor DW_CFA_* into Dwarf.def llvm-svn: 285472	2016-10-28 22:56:53 +00:00
Adrian Prantl	23865816d5	Refactor all DW_FORM_* constants into Dwarf.def llvm-svn: 285470	2016-10-28 22:56:45 +00:00
Tim Shen	b4991548c8	[APFloat] Fix memory bugs revealed by MSan Reviewers: eugenis, hfinkel, kbarton, iteratee, echristo Subscribers: mehdi_amini, llvm-commits Differential Revision: https://reviews.llvm.org/D26102 llvm-svn: 285468	2016-10-28 22:45:33 +00:00
Justin Bogner	db6b6a7f0c	SDAG: Make sure we use an allocatable reg class when we create this vreg As per the discussion on r280783, if constrainRegClass fails we need to call getAllocatableClass like we did before that commit. llvm-svn: 285467	2016-10-28 22:42:54 +00:00
Matt Arsenault	7b6475568d	AMDGPU: Add definitions for scalar store instructions Also add glc bit to the scalar loads since they exist on VI and change the caching behavior. This currently has an assembler bug where the glc bit is incorrectly accepted on SI/CI which do not have it. llvm-svn: 285463	2016-10-28 21:55:15 +00:00
Matt Arsenault	4b6a6cc8e9	AMDGPU: Rename glc operand type While trying to add the glc bit to SMEM instructions on VI with the new refactoring I ran into some kind of shadowing problem for the glc operand when using the pseudoinstruction as a multiclass parameter. Everywhere that currently uses it defines the operand to have the same name as its type, i.e. glc:$glc which works. For some reason now it conflicts, and its up evaluating to the wrong thing. For the real encoding classes, let Inst{16} = !if(ps.has_glc, glc, ?); was not being evaluated and still visible in the Inst initializer in the expanded td file. In other cases I got a a different error about an illegal operand where this was using { 0 } initializer from the bits<1> glc initializer instead of evaluating it as false in the if. For consistency all of the operand types should probably be captialized to avoid conflicting with the variable names unless somebody has a better idea of how to fix this. llvm-svn: 285462	2016-10-28 21:55:08 +00:00
Justin Lebar	f0a80ba385	[NVPTX] Compute 'rem' using the result of 'div', if possible. Summary: In isel, transform Num % Den into Num - (Num / Den) * Den if the result of Num / Den is already available. Reviewers: tra Subscribers: hfinkel, llvm-commits, jholewinski Differential Revision: https://reviews.llvm.org/D26090 llvm-svn: 285461	2016-10-28 21:44:00 +00:00
Justin Lebar	0ede5fb1bb	Don't leave unused divs/rems sitting around in BypassSlowDivision. Summary: This "pass" eagerly creates div and rem instructions even when only one is needed -- it relies on a later pass (machine DCE?) to clean them up. This is problematic not just from a cleanliness perspective (this pass is running during CodeGenPrepare, so should leave the IR in a better state), but it also creates a problem for instruction selection. If we always have a div+rem, isel will always select a divrem instruction (if possible), even when a single div or rem would do. Specifically, in NVPTX, we want to compute rem from the output of div, if available. But if a div is not available, we want to leave the rem alone. This transformation is overeager if div is always available. Because this code runs as part of CodeGenPrepare, it's nontrivial to write a test for this change. But this will effectively be tested by a later patch which adds the aforementioned change to NVPTX isel. Reviewers: tra Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D26088 llvm-svn: 285460	2016-10-28 21:43:54 +00:00
Justin Lebar	468bf73209	Don't claim the udiv created in BypassSlowDivision is exact. Summary: In BypassSlowDivision's short-dividend path, we would create e.g. udiv exact i32 %a, %b "exact" here means that we are asserting that %a is a multiple of %b. But we have no reason to believe this must be true -- this is just a bug, as far as I can tell. Reviewers: tra Subscribers: jholewinski, llvm-commits Differential Revision: https://reviews.llvm.org/D26097 llvm-svn: 285459	2016-10-28 21:43:51 +00:00
Matt Arsenault	4eae301995	AMDGPU: Diagnose using too many SGPRs This is possible when using inline asm. llvm-svn: 285447	2016-10-28 20:31:47 +00:00
Krzysztof Parzyszek	2717175c99	Handle non-~0 lane masks on live-in registers in LivePhysRegs When LivePhysRegs adds live-in registers, it recognizes ~0 as a special lane mask indicating the entire register. If the lane mask is not ~0, it will only add the subregisters that overlap the specified lane mask. The problem is that if a live-in register does not have subregisters, and the lane mask is not ~0, it will not be added to the live set. (The given lane mask may simply be the lane mask of its register class.) If a register does not have subregisters, add it to the live set if the lane mask is non-zero. Differential Revision: https://reviews.llvm.org/D26094 llvm-svn: 285440	2016-10-28 20:06:37 +00:00
Matt Arsenault	ef00283425	SpeculativeExecution: Allow speculating more inst types Partial step towards removing the whitelist and only using TTI's cost. llvm-svn: 285438	2016-10-28 20:00:33 +00:00
Matt Arsenault	08906a3c62	AMDGPU: Fix using incorrect private resource with no allocation It's possible to have a use of the private resource descriptor or scratch wave offset registers even though there are no allocated stack objects. This would result in continuing to use the maximum number reserved registers. This could go over the number of SGPRs available on VI, or violate the SGPR limit requested by the function attributes. llvm-svn: 285435	2016-10-28 19:43:31 +00:00
Nemanja Ivanovic	e28a0fc72a	Implement vector count leading/trailing bytes with zero lsb and vector parity builtins - llvm portion This patch corresponds to review https://reviews.llvm.org/D26003. Committing on behalf of Zaara Syeda. llvm-svn: 285434	2016-10-28 19:38:24 +00:00
Teresa Johnson	7c31cb1665	[ThinLTO] Use flags from summary when writing variable summary (NFC) We already read the flags out of the summary when writing the summary records for functions and aliases, do the same for variables. This is an NFC change for now since the flags computed on the fly from the GlobalValue currently will always match those in the summary already, but once I send a follow-on patch to set the NoRename flag for locals in the llvm.used set this becomes a necessary change. llvm-svn: 285433	2016-10-28 19:36:00 +00:00
George Burgess IV	013fd7315f	[MemorySSA] Add const to getClobberingMemoryAccess. Thanks to bryant for the patch! Differential Revision: https://reviews.llvm.org/D26086 llvm-svn: 285432	2016-10-28 19:22:46 +00:00
Adrian Prantl	7e55f17825	Move the DWARF attribute constants into Dwarf.def and delete 300 lines of silly code. llvm-svn: 285425	2016-10-28 18:21:39 +00:00
Matthias Braun	de8c1b3433	MachineRegisterInfo: Remove unused arg from isConstantPhysReg(); NFC llvm-svn: 285423	2016-10-28 18:05:09 +00:00
Matthias Braun	35a024fe0f	TargetPassConfig: Move addPass of IPRA RegUsageInfoProp down. TargetPassConfig::addMachinePasses() does some housekeeping first: Handling the -print-machineinstrs flag and doing an initial printing "After Instruction Selection". There is no reason for RegUsageInfoProp to run before those two steps. llvm-svn: 285422	2016-10-28 18:05:05 +00:00
Adrian Prantl	c4fbbcf9ed	Import/update constants from the DWARF 5 public review draft document. https://reviews.llvm.org/D26051 llvm-svn: 285421	2016-10-28 17:59:50 +00:00
Krzysztof Parzyszek	87a47be039	[Hexagon] Maintain kill flags through splitting in expand-condsets Do not use LiveIntervals to recalculate kills, because that cannot be done accurately without implicit uses on predicated instructions. llvm-svn: 285409	2016-10-28 15:50:22 +00:00
Tom Stellard	13068995b9	[Loads] Fix crash in is isDereferenceableAndAlignedPointer() Summary: We were trying to add APInt values with different bit sizes after visiting an addrspacecast instruction which changed the bit width of the pointer. Reviewers: majnemer, hfinkel Subscribers: hfinkel, wdng, llvm-commits Differential Revision: https://reviews.llvm.org/D24774 llvm-svn: 285407	2016-10-28 15:32:28 +00:00
Simon Pilgrim	d9189891fc	[SelectionDAG] computeKnownBits - early-out if any BUILD_VECTOR element has no known bits No need to check the remaining elements - no common known bits are available. llvm-svn: 285399	2016-10-28 14:07:44 +00:00
Simon Pilgrim	8c043061e5	[SelectionDAG] Tidyup UDIV computeKnownBits implementation No need to clear KnownOne2/KnownZero2 bits as the next call to computeKnownBits will overwrite them anyway llvm-svn: 285398	2016-10-28 13:42:23 +00:00
Simon Pilgrim	755cef1ba8	[SelectionDAG] Increment computeKnownBits recursion depth for SMIN/SMAX/UMIN/UMAX like all other ops llvm-svn: 285397	2016-10-28 13:13:16 +00:00
Igor Laevsky	c3ccf5d77b	[LCSSA] Perform LCSSA verification only for the current loop nest. Now LPPassManager will run LCSSA verification only for the top-level loop which was processed on the current iteration. Differential Revision: https://reviews.llvm.org/D25873 llvm-svn: 285394	2016-10-28 12:57:20 +00:00
Juergen Ributzka	5cee232be4	Revert "[DAGCombiner] Add vector demanded elements support to computeKnownBits" This seems to have increased LTO compile time bejond 2x of previous builds. See http://lab.llvm.org:8080/green/job/clang-stage2-configure-Rlto/10676/ llvm-svn: 285381	2016-10-28 04:01:12 +00:00
Davide Italiano	631cd27f29	[Reassociate] Removing instructions mutates the IR. Fixes PR 30784. Discussed with Justin, who pointed out that in the new PassManager infrastructure we can have more fine-grained control on which analyses we want to preserve, but this is the best we can do with the current infrastructure. llvm-svn: 285380	2016-10-28 02:47:09 +00:00
Teresa Johnson	02563cd3a6	[ThinLTO] Create AliasSummary when building index Summary: Previously we were creating the alias summary on the fly while writing the summary to bitcode. This moves the creation of these summaries to the module summary index builder where we build the rest of the summary index. This is going to be necessary for setting the NoRename flag for values possibly used in inline asm or module level asm. Reviewers: mehdi_amini Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D26049 llvm-svn: 285379	2016-10-28 02:39:38 +00:00
Teresa Johnson	58fbc916a0	[ThinLTO] Rename HasSection to NoRename (NFC) Summary: This is in preparation for a change to utilize this flag for symbols referenced/defined in either inline or module level assembly. Reviewers: mehdi_amini Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D26048 llvm-svn: 285376	2016-10-28 02:24:59 +00:00
Davide Italiano	6231a7e4d1	[IR] Clang-format my previous commit. NFCI. llvm-svn: 285375	2016-10-28 01:41:56 +00:00
Davide Italiano	30665147f9	[ConstantFold] Get the correct vector type when folding a getelementptr. Differential Revision: https://reviews.llvm.org/D26014 llvm-svn: 285371	2016-10-28 00:53:16 +00:00
Tom Stellard	aea899e2a0	AMDGPU/SI: Handle hazard with s_rfe_b64 Reviewers: arsenm Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, llvm-commits, tony-tye Differential Revision: https://reviews.llvm.org/D25638 llvm-svn: 285368	2016-10-27 23:50:21 +00:00
Tom Stellard	04051b5fad	AMDGPU/SI: Handle hazard with sgpr lane selects for v_{read,write}lane Reviewers: arsenm Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, tony-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D25637 llvm-svn: 285367	2016-10-27 23:42:29 +00:00
Tom Stellard	6b9c1be4ea	AMDGPU/SI: Fix unused variable warning on non-debug builds llvm-svn: 285363	2016-10-27 23:28:03 +00:00
Ekaterina Romanova	b7f96d1241	Reverting back r285355: "Update .debug_line section version information to match DWARF version", while I'm investigating a test failure. llvm-svn: 285362	2016-10-27 23:20:19 +00:00
Tom Stellard	b133fbb9a4	AMDGPU/SI: Handle hazard with > 8 byte VMEM stores Reviewers: arsenm Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, tony-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D25577 llvm-svn: 285359	2016-10-27 23:05:31 +00:00
Tim Shen	139a58f75e	Reapply r285351 "[APFloat] Add DoubleAPFloat mode to APFloat. NFC." with a workaround for old clang. llvm-svn: 285358	2016-10-27 22:52:40 +00:00
Ekaterina Romanova	0b82459c6c	Update .debug_line section version information to match DWARF version. In the past the compiler always emitted .debug_line version 2, though some opcodes from DWARF 3 (e.g. DW_LNS_set_prologue_end, DW_LNS_set_epilogue_begin or DW_LNS_set_isa) and from DWARF 4 could be emitted by the compiler. This patch changes version information of .debug_line to exactly match the DWARF version. For .debug_line version 4, a new field maximum_operations_per_instruction is emitted. Differential Revision: https://reviews.llvm.org/D16697 llvm-svn: 285355	2016-10-27 22:37:25 +00:00
Tim Shen	414b0155c4	Revert "[APFloat] Add DoubleAPFloat mode to APFloat. NFC." This reverts r285351, since it breaks the build. llvm-svn: 285354	2016-10-27 21:54:29 +00:00
Kostya Serebryany	bcfb0802e2	[libFuzzer] enable use_cmp by default llvm-svn: 285353	2016-10-27 21:44:37 +00:00
Tim Shen	f38e87fa48	[APFloat] Add DoubleAPFloat mode to APFloat. NFC. Summary: This patch adds DoubleAPFloat mode to APFloat. Now, an APFloat with semantics PPCDoubleDouble will have DoubleAPFloat layout (APFloat.U.Double), which contains two underlying APFloats as PPCDoubleDoubleImpl and IEEEdouble semantics. Currently the IEEEdouble APFloat is not used, and the first APFloat behaves exactly the same before this change. This patch consists of three kinds of logics: 1) Construction and destruction of APFloat. Now the ctors, dtor, assign opertors and factory functions construct different underlying layout based on the semantics passed in. 2) s/IEEE/getIEEE()/ for normal, lifetime-unrelated computation functions. These functions only access Floats[0] in DoubleAPFloat, which is the same as today's semantic. 3) A "Double dispatch" function, APFloat::convert. Converting between two different layouts requires appropriate logic. Neither of these change the external behavior. Reviewers: hfinkel, kbarton, echristo, iteratee Subscribers: mehdi_amini, llvm-commits Differential Revision: https://reviews.llvm.org/D25977 llvm-svn: 285351	2016-10-27 21:39:51 +00:00
Peter Collingbourne	fc0a99bfda	BitcodeReader: Require clients to read the block info block at most once. This change makes it the client's responsibility to call ReadBlockInfoBlock() at most once. This is in preparation for a future change that will allow there to be multiple block info blocks. See also: http://lists.llvm.org/pipermail/llvm-dev/2016-October/106512.html Differential Revision: https://reviews.llvm.org/D26016 llvm-svn: 285350	2016-10-27 21:39:28 +00:00
Kyle Butt	ab9cca7b0c	CodeGen: Handle missed case of block removal during BlockPlacement. There is a use after free bug in the existing code. Loop layout selects a preferred exit block, and then lays out the loop. If this block is removed during layout, it needs to be invalidated to prevent a use after free. llvm-svn: 285348	2016-10-27 21:37:20 +00:00
Sanjay Patel	c0de9c9e40	[InstCombine] fix foldSPFofSPF() to handle vector splats llvm-svn: 285345	2016-10-27 21:19:40 +00:00
Kevin Enderby	bc5c29a65f	Another additional error check for invalid Mach-O files for the obsolete load commands. Again the philosophy of the error checking in libObject for Mach-O files, the idea behind the checking is that we never will return a Mach-O file out of libObject that contains unknown things the library code can’t operate on. So known obsolete load commands will cause a hard error. Also to make things clear I have added comments to the values and structures in Support/Mach-O.h and Support/MachO.def as to what is obsolete. As noted in a TODO in the code, there may need to be a non-default mode to allow some unknown values for well structured Mach-O files with things like unknown load load commands. So things like using an old lldb on a newer Mach-O file could still provide some limited functionality. llvm-svn: 285342	2016-10-27 20:59:10 +00:00
Tom Stellard	30d30824b4	AMDGPU/SI: Handle s_setreg hazard in GCNHazardRecognizer Reviewers: arsenm Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, llvm-commits, tony-tye Differential Revision: https://reviews.llvm.org/D25528 llvm-svn: 285338	2016-10-27 20:39:09 +00:00
Haicheng Wu	430b3e4893	[LoopUnroll] Check partial unrolling is enabled before initialization. NFC. Differential Revision: https://reviews.llvm.org/D23891 llvm-svn: 285330	2016-10-27 18:40:02 +00:00
Simon Pilgrim	d23219b9ee	[X86][AVX512] Fix MUL v8i64 costs on non-AVX512DQ targets llvm-svn: 285329	2016-10-27 18:32:06 +00:00
Sanjay Patel	611f9f92fc	[InstCombine] handle simple vector integer constants in IsFreeToInvert llvm-svn: 285318	2016-10-27 17:30:50 +00:00
Simon Pilgrim	47c1ff7a43	[X86][AVX512DQ] Move v2i64 and v4i64 MUL lowering to tablegen As suggested by @igorb on D26011 llvm-svn: 285313	2016-10-27 17:07:40 +00:00
Saleem Abdulrasool	075d2e3c59	ARM: ensure that the Windows DBZ check is in range The Windows ARM target expects the compiler to emit a division-by-zero check. The check would use the form of: cmp r?, #0 cbz .Ltrap b .Lbody .Lbody: ... .Ltrap: udf #249 @ __brkdiv0 This works great most of the time. However, if the body of the function is greater than 127 bytes, the branch target limitation of cbz becomes an issue. This occurs in the unoptimized code generation cases sometimes (like in compiler-rt). Since this is a matter of correctness, possibly pay a small penalty instead. We now form this slightly differently: cbnz .Lbody udf #249 @ __brkdiv0 .Lbody: ... The positive case is through the branch instead of being the next instruction. However, because of the basic block layout, the negated branch is going to be a short distance always (2 bytes away, after the inserted __brkdiv0). The new t__brkdiv0 instruction is required to explicitly mark the instruction as a terminator as the generic UDF instruction is not a terminator. Addresses PR30532! llvm-svn: 285312	2016-10-27 16:59:22 +00:00
Greg Clayton	6c273763a3	Switch all DWARF variables for tags, attributes and forms over to use the llvm::dwarf enumerations instead of using raw uint16_t values. This allows easier debugging as users can see the values of the enumerations in the variables view that will show the enumeration string instead of just a number. https://reviews.llvm.org/D26013 llvm-svn: 285309	2016-10-27 16:32:04 +00:00
Dehao Chen	b94c09baa0	Add Loop Sink pass to reverse the LICM based of basic block frequency. Summary: LICM may hoist instructions to preheader speculatively. Before code generation, we need to sink down the hoisted instructions inside to loop if it's beneficial. This pass is a reverse of LICM: looking at instructions in preheader and sinks the instruction to basic blocks inside the loop body if basic block frequency is smaller than the preheader frequency. Reviewers: hfinkel, davidxl, chandlerc Subscribers: anna, modocache, mgorny, beanz, reames, dberlin, chandlerc, mcrosier, junbuml, sanjoy, mzolotukhin, llvm-commits Differential Revision: https://reviews.llvm.org/D22778 llvm-svn: 285308	2016-10-27 16:30:08 +00:00
Vasileios Kalintiris	cfb005a0ee	[mips] Do not allow -opt-bisect-limit to skip the PIC call optimization pass. r282428 added the MipsOptimizePICCall as an opt-in pass that can be skipped when using the -opt-bisect-limit option. However, this pass is needed because it generates code that conforms to the o32 ABI specification by using the $t9 register for PIC calls with JALR instructions. This bug was exposed by the fact that skipFunction() also checks for the "optnone" attribute. This caused functions with that attribute to break the requirements of the o32 ABI. llvm-svn: 285305	2016-10-27 15:50:36 +00:00
Simon Pilgrim	820e1326d7	[X86][AVX512DQ] Improve lowering of MUL v2i64 and v4i64 With DQI but without VLX, lower v2i64 and v4i64 MUL operations with v8i64 MUL (vpmullq). Updated cost table accordingly. Differential Revision: https://reviews.llvm.org/D26011 llvm-svn: 285304	2016-10-27 15:27:00 +00:00
Sanjay Patel	e372aecb8a	[ValueTracking] fix matchSelectPattern to allow vector splat folds of min/max/abs/nabs llvm-svn: 285303	2016-10-27 15:26:10 +00:00
Bjorn Pettersson	807f732ce8	Fix memory issue in AttrBuilder::removeAttribute uses. Summary: Found when running Valgrind. This removes two unnecessary assignments when using AttrBuilder::removeAttribute. AttrBuilder::removeAttribute returns a reference to the object. As the LHSes were the same as the callees, the assignments resulted in memcpy calls where dst = src. Commited on behalf-of: dstenb (David Stenberg) Reviewers: mkuper, rnk Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D25460 llvm-svn: 285298	2016-10-27 14:48:09 +00:00
Krzysztof Parzyszek	046da74699	[Hexagon] Do not expand ISD::SELECT for HVX vectors llvm-svn: 285297	2016-10-27 14:30:16 +00:00
Simon Pilgrim	01e755eab1	[DAGCombiner] Add vector demanded elements support to computeKnownBits Currently computeKnownBits returns the common known zero/one bits for all elements of vector data, when we may only be interested in one/some of the elements. This patch adds a DemandedElts argument that allows us to specify the elements we actually care about. The original computeKnownBits implementation calls with a DemandedElts demanding all elements to match current behaviour. Scalar types set this to 1. The approach was found to be easier than trying to add a per-element known bits solution, for a similar usefulness given the combines where computeKnownBits is typically used. I've only added support for a few opcodes so far (the ones that have proven straightforward to test), all others will default to demanding all elements but can be updated in due course. DemandedElts support could similarly be added to computeKnownBitsForTargetNode in a future commit. Differential Revision: https://reviews.llvm.org/D25691 llvm-svn: 285296	2016-10-27 14:29:28 +00:00
Alexey Bataev	46c0278e7d	[SLP] Fix for PR30626: Compiler crash inside SLP Vectorizer. After successfull horizontal reduction vectorization attempt for PHI node vectorizer tries to update root binary op by combining vectorized tree and the ReductionPHI node. But during vectorization this ReductionPHI can be vectorized itself and replaced by the `undef` value, while the instruction itself is marked for deletion. This 'marked for deletion' PHI node then can be used in new binary operation, causing "Use still stuck around after Def is destroyed" crash upon PHI node deletion. Also the test is fixed to make it perform actual testing. Differential Revision: https://reviews.llvm.org/D25671 llvm-svn: 285286	2016-10-27 12:02:28 +00:00
Sam Parker	e7d9505c08	[ARM] Predicate UMAAL selection on hasDSP. UMAAL is a DSP instruction and it is not available on thumbv7m (Cortex-M3) and thumbv6m (Cortex-M0+1) targets. Also fix wrong CHECK prefix in longMAC.ll test. Patch by Vadzim Dambrouski. Differential Revision: https://reviews.llvm.org/D25890 llvm-svn: 285278	2016-10-27 09:47:10 +00:00
Dylan McKay	dd680cc753	[AVR] Generate all of the TableGen files we need This enables generation of all of the TableGen files that are used downstream. llvm-svn: 285274	2016-10-27 08:20:47 +00:00
Nicolai Haehnle	7b0e25b7ad	AMDGPU: Fix SILoadStoreOptimizer when writes cannot be merged due register dependencies Summary: When finding a match for a merge and collecting the instructions that must be moved, keep in mind that the instruction we merge might actually use one of the defs that are being moved. Fixes piglit spec/arb_enhanced_layouts/execution/component-layout/vs-tcs-load-output[-indirect]. The fact that the ds_read in the test case is not eliminated suggests that there might be another problem related to alias analysis, but that's a separate problem: this pass should still work correctly even when earlier optimization passes missed something or were disabled. Reviewers: tstellarAMD, arsenm Subscribers: kzhuravl, wdng, yaxunl, llvm-commits, tony-tye Differential Revision: https://reviews.llvm.org/D25829 llvm-svn: 285273	2016-10-27 08:15:07 +00:00

1 2 3 4 5 ...

96451 Commits