llvm-project

Commit Graph

Author	SHA1	Message	Date
Eric Christopher	5f54195e4a	Remove a FIXME. Explanation: This function is in TargetLowering because it uses RegClassForVT which would need to be moved to TargetRegisterInfo and would necessitate moving isTypeLegal over as well - a massive change that would just require TargetLowering having a TargetRegisterInfo class member that it would use. llvm-svn: 230585	2015-02-26 00:00:35 +00:00
Eric Christopher	23a3a7c871	Remove an argument-less call to getSubtargetImpl from TargetLoweringBase. This required plumbing a TargetRegisterInfo through computeRegisterProperties and into findRepresentativeClass which uses it for register class iteration. This required passing a subtarget into a few target specific initializations of TargetLowering. llvm-svn: 230583	2015-02-26 00:00:24 +00:00
Ramkumar Ramachandra	f8ea847e48	MemDepPrinter: Fix some nits introduced in r228596 Differential Revision: http://reviews.llvm.org/D7644 llvm-svn: 230582	2015-02-25 23:55:00 +00:00
Justin Bogner	a7ad4b3f3b	Object: Handle Mach-O kext bundle files This particular subtype of Mach-O was missing. Add it. llvm-svn: 230567	2015-02-25 22:59:20 +00:00
Justin Bogner	2e427d4dbd	InstrProf: Make the __llvm_profile_runtime_user symbol hidden This symbol exists only to pull in the required pieces of the runtime, so nothing ever needs to refer to it. Making it hidden avoids the potential for issues with duplicate symbols when linking profiled libraries together. llvm-svn: 230566	2015-02-25 22:52:20 +00:00
Duncan P. N. Exon Smith	738889f752	IR: Drop newline from AssemblyWriter::printMDNodeBody() Remove a newline from `AssemblyWriter::printMDNodeBody()`, and add one to `AssemblyWriter::writeMDNode()`. NFCI for assembly output. However, this drops an inconsistent newline from `Metadata::print()` when `this` is an `MDNode`. Now the newline added by `Metadata::dump()` won't look so verbose. llvm-svn: 230565	2015-02-25 22:46:38 +00:00
Sanjay Patel	cc29f4f2cb	only propagate equality comparisons of FP values that we are certain are non-zero This is a follow-on to r227491 which tightens the check for propagating FP values. If a non-constant value happens to be a zero, we would hit the same bug as before. Bug noted and patch suggested by Eli Friedman. llvm-svn: 230564	2015-02-25 22:46:08 +00:00
Justin Bogner	3588686baf	InstrProf: Remove dead code in CoverageMappingReader Remove a default argument that's never passed and a constructor that's never called. llvm-svn: 230563	2015-02-25 22:44:50 +00:00
Eric Christopher	75dbd7ca3e	Move TargetLoweringBase::getTypeConversion to the .cpp file from the .h file. It's used in only one place (other than recursively) and there's no need to include it everywhere. Saves almost 900k from total llvm object file size. llvm-svn: 230561	2015-02-25 22:41:30 +00:00
JF Bastien	d52c990a90	InstCombine: extract instead of shuffle when performing vector/array type punning Summary: SROA generates code that isn't quite as easy to optimize and contains unusual-sized shuffles, but that code is generally correct. As discussed in D7487 the right place to clean things up is InstCombine, which will pick up the type-punning pattern and transform it into a more obvious bitcast+extractelement, while leaving the other patterns SROA encounters as-is. Test Plan: make check Reviewers: jvoung, chandlerc Subscribers: llvm-commits llvm-svn: 230560	2015-02-25 22:30:51 +00:00
Frederic Riss	de3743453f	[dwarfdump] Fix frame info register number dump. llvm-svn: 230559	2015-02-25 22:30:09 +00:00
Duncan P. N. Exon Smith	89c1eaa531	IR: Annotate dump methods with LLVM_DUMP_METHOD It turns out we have a macro to ensure that debuggers can access `dump()` methods. Use it. Hopefully this will prevent me (and others) from committing crimes like in r223802 (search for /10000/, or just see the fix in r224407). llvm-svn: 230555	2015-02-25 22:08:21 +00:00
Frederic Riss	ac10b0d61d	Try to appease buildbots. It seems ArrayRefs to multi-dimensional arrays confuse some compilers. llvm-svn: 230554	2015-02-25 22:07:43 +00:00
Hal Finkel	cf59921670	[PowerPC] Make LDtocL and friends invariant loads LDtocL, and other loads that roughly correspond to the TOC_ENTRY SDAG node, represent loads from the TOC, which is invariant. As a result, these loads can be hoisted out of loops, etc. In order to do this, we need to generate GOT-style MMOs for TOC_ENTRY, which requires treating it as a legitimate memory intrinsic node type. Once this is done, the MMO transfer is automatically handled for TableGen-driven instruction selection, and for nodes generated directly in PPCISelDAGToDAG, we need to transfer the MMOs manually. Also, we were not transferring MMOs associated with pre-increment loads, so do that too. Lastly, this fixes an exposed bug where R30 was not added as a defined operand of UpdateGBR. This problem was highlighted by an example (used to generate the test case) posted to llvmdev by Francois Pichet. llvm-svn: 230553	2015-02-25 21:36:59 +00:00
Frederic Riss	c0dd7243ee	[dwarfdump] Make debug_frame dump actually useful. This adds support for pretty-printing instruction operands. The new output looks like: 00000000 00000010 ffffffff CIE Version: 1 Augmentation: Code alignment factor: 1 Data alignment factor: -4 Return address column: 8 DW_CFA_def_cfa: reg4 +4 DW_CFA_offset: reg8 -4 DW_CFA_nop: DW_CFA_nop: 00000014 00000010 00000000 FDE cie=00000000 pc=00000000...00000022 DW_CFA_advance_loc: 3 DW_CFA_def_cfa_offset: +12 DW_CFA_nop: llvm-svn: 230551	2015-02-25 21:30:22 +00:00
Frederic Riss	2fe0e54fd6	[dwarfdump] Don't print meaningless pointer. CIE pointers were never filled in before, and printing the pointer is totally pointless anyway. llvm-svn: 230550	2015-02-25 21:30:19 +00:00
Frederic Riss	056ad058bb	DWARFDebugFrame: Move some code around. NFC. Move the FrameEntry::dumpInstructions down in the file at some place where it can see the declarations of FDE and CIE. llvm-svn: 230549	2015-02-25 21:30:16 +00:00
Frederic Riss	41bb2c6d4f	DWARFDebugFrame: Add some trivial accessors. NFC. To be used for dumping. llvm-svn: 230548	2015-02-25 21:30:13 +00:00
Frederic Riss	baf195f7eb	DWARFDebugFrame: Actually collect CIEs associated with FDEs. This is the first commit in a small series aiming at making debug_frame dump more useful (right now it prints a list of operations without their operands). llvm-svn: 230547	2015-02-25 21:30:09 +00:00
Manman Ren	082a336a89	[LTO API] fix memory leakage introduced at r230290. r230290 released the LLVM module but not the LTOModule. rdar://19024554 llvm-svn: 230544	2015-02-25 21:20:53 +00:00
David Majnemer	e1bbad9eb2	X86, Win64: Allow 'mov' to restore the stack pointer if we have a FP The Win64 epilogue structure is very restrictive: it permits a very small number of opcodes and none of them are 'mov'. This means that given: mov %rbp, %rsp pop %rbp The mov isn't the epilogue, only the pop is. This is problematic unless a frame pointer is present, in which case we are free to do whatever we'd like in the "body" of the function. If a frame pointer is present, unwinding will undo the prologue operations in reverse order regardless of the fact that we are at an instruction which is resetting the stack pointer. llvm-svn: 230543	2015-02-25 21:13:37 +00:00
Peter Collingbourne	eba7f73ff9	LowerBitSets: Align referenced globals. This change aligns globals to the next highest power of 2 bytes, up to a maximum of 128. This makes it more likely that we will be able to compress bit sets with a greater alignment. In many more cases, we can now take advantage of a new optimization also introduced in this patch that removes bit set checks if the bit set is all ones. The 128 byte maximum was found to provide the best tradeoff between instruction overhead and data overhead in a recent build of Chromium. It allows us to remove ~2.4MB of instructions at the cost of ~250KB of data. Differential Revision: http://reviews.llvm.org/D7873 llvm-svn: 230540	2015-02-25 20:42:41 +00:00
Andrew Kaylor	b59b80b956	Fixing a problem with insert location in WinEH outlining llvm-svn: 230535	2015-02-25 20:12:49 +00:00
Sanjoy Das	dcc84db264	Bugfix: SCEVExpander incorrectly marks increment operations as no-wrap (The change was landed in r230280 and caused the regression PR22674. This version contains a fix and a test-case for PR22674). When emitting the increment operation, SCEVExpander marks the operation as nuw or nsw based on the flags on the preincrement SCEV. This is incorrect because, for instance, it is possible that {-6,+,1} is <nuw> while {-6,+,1}+1 = {-5,+,1} is not. This change teaches SCEV to mark the increment as nuw/nsw only if it can explicitly prove that the increment operation won't overflow. Apart from the attached test case, another (more realistic) manifestation of the bug can be seen in Transforms/IndVarSimplify/pr20680.ll. Differential Revision: http://reviews.llvm.org/D7778 llvm-svn: 230533	2015-02-25 20:02:59 +00:00
Hal Finkel	0746211811	[PowerPC] Cleanup unused target-specific SDAG nodes We had somehow accumulated a few target-specific SDAG nodes dealing with PPC64 TOC access that were referenced only in TableGen patterns. The associated (pseudo-)instructions are used, but are being generated directly. NFC. llvm-svn: 230518	2015-02-25 18:06:45 +00:00
Matthias Braun	02892ec62d	AArch64: Add debug message for large shift constants. As requested in code review. llvm-svn: 230517	2015-02-25 18:03:50 +00:00
Sanjay Patel	40eaa8df99	Fix really obscure bug in CannotBeNegativeZero() (PR22688) With a diabolically crafted test case, we could recurse through this code and return true instead of false. The larger engineering crime is the use of magic numbers. Added FIXME comments for those. llvm-svn: 230515	2015-02-25 18:00:15 +00:00
Vladimir Medic	bcb7467540	[MIPS] Multiply and add instructions for Mips are currently available in mips32r2/mips64r2 and later but should also be available in mips4, mips5, and mips64. This patch fixes the requested features and updates the corresponding test files. llvm-svn: 230500	2015-02-25 15:24:37 +00:00
Bruno Cardoso Lopes	ab7afa9144	[X86][MMX] Reapply: Add MMX instructions to foldable tables Reapply r230248. Teach the peephole optimizer to work with MMX instructions by adding entries into the foldable tables. This covers folding opportunities not handled during isel. llvm-svn: 230499	2015-02-25 15:14:02 +00:00
Bruno Cardoso Lopes	48b10681f9	[X86][MMX] Prevent MMX_MOVD64rm folding MMX_MOVD64rm zero-extends i32 load results into i64 registers. The peephole optimizer will try to fold it in other MMX foldable instructions, the wrong thing to do, since there's no MMX memory instruction that loads from i32 and does implicit zero extension. Remove 'canFoldAsLoad' from MOVD64rm in order to prevent such folding. The current MMX tests already test this, but since there are no MMX instructions in the foldable tables yet, this did not trigger. This commit prepares the addition of those instructions. llvm-svn: 230498	2015-02-25 15:13:52 +00:00
Renato Golin	b9887ef32a	Improve handling of stack accesses in Thumb-1 Thumb-1 only allows SP-based LDR and STR to be word-sized, and SP-base LDR, STR, and ADD only allow offsets that are a multiple of 4. Make some changes to better make use of these instructions: * Use word loads for anyext byte and halfword loads from the stack. * Enforce 4-byte alignment on objects accessed in this way, to ensure that the offset is valid. * Do the same for objects whose frame index is used, in order to avoid having to use more than one ADD to generate the frame index. * Correct how many bits of offset we think AddrModeT1_s has. Patch by John Brawn. llvm-svn: 230496	2015-02-25 14:41:06 +00:00
Aaron Ballman	5561ed448b	Silencing a "result of 32-bit shift implicitly converted to 64 bits (was 64-bit shift intended?)" warning in MSVC; NFC. llvm-svn: 230489	2015-02-25 13:05:24 +00:00
Aaron Ballman	70c27ded97	Silencing a -Wsign-compare warning triggered in MSVC; NFC. llvm-svn: 230488	2015-02-25 13:02:23 +00:00
NAKAMURA Takumi	b01d86b315	Fix UTF8 chars to ASCII. llvm-svn: 230479	2015-02-25 11:02:00 +00:00
Elena Demikhovsky	56eadcf5ce	AVX-512: Gather and Scatter patterns Gather and scatter instructions additionally write to one of the source operands - mask register. In this case Gather has 2 destination values - the loaded value and the mask. Till now we did not support code gen pattern for gather - the instruction was generated from intrinsic only and machine node was hardcoded. When we introduce the masked_gather node, we need to select instruction automatically, in the standard way. I added a flag "hasTwoExplicitDefs" that allows to handle 2 destination operands. (Some code in the X86InstrFragmentsSIMD.td is commented out, just to split one big patch in many small patches) llvm-svn: 230471	2015-02-25 09:46:31 +00:00
Charles Davis	33d1dc0008	[IC] Turn non-null MD on pointer loads to range MD on integer loads. Summary: This change fixes the FIXME that you recently added when you committed (a modified version of) my patch. When `InstCombine` combines a load and store of a pointer to those of an equivalently-sized integer, it currently drops any `!nonnull` metadata that might be present. This change replaces `!nonnull` metadata with `!range !{ 1, -1 }` metadata instead. Reviewers: chandlerc Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D7621 llvm-svn: 230462	2015-02-25 05:10:25 +00:00
David Blaikie	b5b5efd2d1	[opaque pointer type] Bitcode support for explicit type parameter on GEP. Like r230414, add bitcode support including backwards compatibility, for an explicit type parameter to GEP. At the suggestion of Duncan I tried coalescing the two older bitcodes into a single new bitcode, though I did hit a wrinkle: I couldn't figure out how to create an explicit abbreviation for a record with a variable number of arguments (the indices to the gep). This means the discriminator between inbounds and non-inbounds gep is a full variable-length field I believe? Is my understanding correct? Is there a way to create such an abbreviation? Should I just use two bitcodes as before? Reviewers: dexonsmith Differential Revision: http://reviews.llvm.org/D7736 llvm-svn: 230415	2015-02-25 01:08:52 +00:00
David Blaikie	8503565eec	[opaque pointer type] bitcode support for explicit type parameter to the load instruction Summary: I've taken my best guess at this, but I've cargo culted in places & so explanations/corrections would be great. This seems to pass all the tests (check-all, covering clang and llvm) so I believe that pretty well exercises both the backwards compatibility and common (same version) compatibility given the number of checked in bitcode files we already have. Is that a reasonable approach to testing here? Would some more explicit tests be desired? 1) is this the right way to do back-compat in this case (looking at the number of entries in the bitcode record to disambiguate between the old schema and the new?) 2) I don't quite understand the logarithm logic to choose the encoding type of the type parameter in the abbreviation description, but I found another instruction doing the same thing & it seems to work. Is that the right approach? Reviewers: dexonsmith Differential Revision: http://reviews.llvm.org/D7655 llvm-svn: 230414	2015-02-25 01:07:20 +00:00
Hal Finkel	c93a9a2cb4	[PowerPC] Add support for the QPX vector instruction set This adds support for the QPX vector instruction set, which is used by the enhanced A2 cores on the IBM BG/Q supercomputers. QPX vectors are 256 bits wide, holding 4 double-precision floating-point values. Boolean values, modeled here as <4 x i1> are actually also represented as floating-point values (essentially { -1, 1 } for { false, true }). QPX shares many features with Altivec and VSX, but is distinct from both of them. One major difference is that, instead of adding completely-separate vector registers, QPX vector registers are extensions of the scalar floating-point registers (lane 0 is the corresponding scalar floating-point value). The operations supported on QPX vectors mirror those supported on the scalar floating-point values (with some additional ones for permutations and logical/comparison operations). I've been maintaining this support out-of-tree, as part of the bgclang project, for several years. This is not the entire bgclang patch set, but is most of the subset that can be cleanly integrated into LLVM proper at this time. Adding this to the LLVM backend is part of my efforts to rebase bgclang to the current LLVM trunk, but is independently useful (especially for codes that use LLVM as a JIT in library form). The assembler/disassembler test coverage is complete. The CodeGen test coverage is not, but I've included some tests, and more will be added as follow-up work. llvm-svn: 230413	2015-02-25 01:06:45 +00:00
Rafael Espindola	8bc9ccc60a	Support SHF_MERGE sections in COMDATs. This patch unifies the comdat and non-comdat code paths. By doing this it adds missing features to the comdat side and removes the fixed section assumptions from the non-comdat side. In ELF there is no one true section for "4 byte mergeable" constants. We are better off computing the required properties of the section and asking the context for it. llvm-svn: 230411	2015-02-25 00:52:15 +00:00
David Blaikie	7b0281089e	BitcodeWriter: Refactor common computation of bits required for a type index. Suggested by Duncan. Happy to bikeshed the name, cache the result, etc. llvm-svn: 230410	2015-02-25 00:51:52 +00:00
Peter Collingbourne	1baeaa395a	LowerBitSets: Introduce global layout builder. The builder is based on a layout algorithm that tries to keep members of small bit sets together. The new layout compresses Chromium's bit sets to around 15% of their original size. Differential Revision: http://reviews.llvm.org/D7796 llvm-svn: 230394	2015-02-24 23:17:02 +00:00
David Majnemer	841e0d60ed	PrologEpilogInserter: Clean up math in calculateFrameObjectOffsets There is no need to open-code the alignment calculation, we have a handy RoundUpToAlignment function which "Does The Right Thing (TM)". llvm-svn: 230392	2015-02-24 23:08:13 +00:00
Sanjay Patel	cee38616c8	remove function names from comments; NFC llvm-svn: 230391	2015-02-24 22:43:06 +00:00
Simon Pilgrim	d8820ae70c	Reapplied D7816 & rL230177 & rL230278 - with an additional fix to ensure that the smallest build vector input scalar type is always used. Additional (crash) test cases already committed. llvm-svn: 230388	2015-02-24 22:08:56 +00:00
Andrew Kaylor	1476e6d1bb	Fixing eol-style llvm-svn: 230378	2015-02-24 20:49:35 +00:00
Eric Christopher	af48495130	Revert: Author: Simon Pilgrim <llvm-dev@redking.me.uk> Date: Mon Feb 23 23:04:28 2015 +0000 Fix based on post-commit comment on D7816 & rL230177 - BUILD_VECTOR operand truncation was using the BV's output scalar type instead of the input type. and Author: Simon Pilgrim <llvm-dev@redking.me.uk> Date: Sun Feb 22 18:17:28 2015 +0000 [DagCombiner] Generalized BuildVector Vector Concatenation The CONCAT_VECTORS combiner pass can transform the concat of two BUILD_VECTOR nodes into a single BUILD_VECTOR node. This patch generalises this to support any number of BUILD_VECTOR nodes, and also permits UNDEF nodes to be included as well. This was noticed as AVX vec128 -> vec256 canonicalization sometimes creates a CONCAT_VECTOR with a real vec128 lower and a vec128 UNDEF upper. Differential Revision: http://reviews.llvm.org/D7816 as the root cause of PR22678 which is causing an assertion inside the DAG combiner. I'll follow up to the main thread as well. llvm-svn: 230358	2015-02-24 19:11:00 +00:00
Eric Christopher	fe59972bbc	Rename UpdateRegAllocHint to match style guidelines. llvm-svn: 230357	2015-02-24 19:10:57 +00:00
Matthias Braun	7526035155	AArch64: Relax assert about large shift sizes. The reason why these large shift sizes happen is because OpaqueConstants currently inhibit a lot of DAG combining, but that has to be addressed in another commit (like the proposal in D6946). Differential Revision: http://reviews.llvm.org/D6940 llvm-svn: 230355	2015-02-24 18:52:04 +00:00
Matthias Braun	00a4076e94	DAGCombiner: Move variable definitions closer to use; NFC llvm-svn: 230354	2015-02-24 18:52:01 +00:00
Matthias Braun	a8558ca2ed	DAGCombiner: Move variable declaration closer to definition; NFC llvm-svn: 230353	2015-02-24 18:51:59 +00:00
Tom Stellard	ecc419c31d	R600/SI: Remove isel mubuf legalization We legalize mubuf instructions post-instruction selection, so this code is no longer needed. llvm-svn: 230352	2015-02-24 17:59:19 +00:00
Tim Northover	e95c5b3236	ARM: treat [N x i32] and [N x i64] as AAPCS composite types The logic is almost there already, with our special homogeneous aggregate handling. Tweaking it like this allows front-ends to emit AAPCS compliant code without ever having to count registers or add discarded padding arguments. Only arrays of i32 and i64 are needed to model AAPCS rules, but I decided to apply the logic to all integer arrays for more consistency. llvm-svn: 230348	2015-02-24 17:22:34 +00:00
Tobias Grosser	2ca0ae2a24	Revert "Raising minimum required CMake version to 2.8.12.2." This reverts commit r230062. Debian stable (wheezy) ships still with cmake 2.8.9. The commit broke my LLVM/Polly buildbot, to my knowledge our only Linux+cmake buildbot. llvm-svn: 230343	2015-02-24 16:39:46 +00:00
Sanjay Patel	a709f3a5ae	simplify control flow; NFC llvm-svn: 230342	2015-02-24 16:26:02 +00:00
Hans Wennborg	953d6fb84e	Revert r230280: "Bugfix: SCEVExpander incorrectly marks increment operations as no-wrap" This caused PR22674, failing this assert: Instructions.h:2281: llvm::Value* llvm::PHINode::getOperand(unsigned int) const: Assertion `i_nocapture < OperandTraits<PHINode>::operands(this) && "getOperand() out of range!"' failed. llvm-svn: 230341	2015-02-24 16:19:29 +00:00
Michael Kuperstein	d2f3b87812	[x32] Mark RBX as reserved when EBX is the base pointer. This should have gone into r230334. llvm-svn: 230339	2015-02-24 16:13:16 +00:00
Sanjay Patel	2898548598	fix typo in comment; NFC llvm-svn: 230338	2015-02-24 16:11:05 +00:00
Michael Kuperstein	8ffb409135	[x32] x32 should use ebx as the base pointer. This fixes the original issue in PR22655, but not the secondary one. llvm-svn: 230334	2015-02-24 15:27:13 +00:00
Hal Finkel	cec70130ac	[SDAG] Handle LowerOperation returning its input consistently For almost all node types, if the target requested custom lowering, and LowerOperation returned its input, we'd treat the original node as legal. This did not work, however, for many loads and stores, because they follow slightly different code paths, and we did not account for the possibility of LowerOperation returning its input at those call sites. I think that we now handle this consistently everywhere. At the call sites in LegalizeDAG, we used to assert in this case, so there's no functional change for any existing code there. For the call sites in LegalizeVectorOps, this really only affects whether or not we set Changed = true, but I think makes the semantics clearer. No test case here, but it will be covered by an upcoming PowerPC commit adding QPX support. llvm-svn: 230332	2015-02-24 12:59:47 +00:00
Toma Tabacu	a90f144a1d	[mips] Reformat some TableGen definitions. NFC. Summary: Separated some instruction and pseudo-instruction definitions from InstAlias definitions, added banner for pseudo-instructions and removed a redundant whitespace from a pseudo-instruction definition. No functional change. Reviewers: dsanders Reviewed By: dsanders Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D7552 llvm-svn: 230327	2015-02-24 11:52:19 +00:00
Kuba Brecka	f5875d3026	Fix alloca_instruments_all_paddings.cc test to work under higher -O levels (llvm part) When AddressSanitizer instruments only a single dynamic alloca and no static allocas, due to an early exit from FunctionStackPoisoner::poisonStack we forget to unpoison the dynamic alloca. This patch fixes that. Reviewed at http://reviews.llvm.org/D7810 llvm-svn: 230316	2015-02-24 09:47:05 +00:00
Craig Topper	cf51397c48	[X86] Remove the AbsMem32 type from the assembly parser. Only really need the 16-bit version which will automatically get prioritized over AbsMem. llvm-svn: 230313	2015-02-24 08:02:13 +00:00
Reed Kotler	5fb7d8b508	Beginning of alloca implementation for Mips fast-isel Summary: Begin to add various address modes, including alloca. Test Plan: Make sure there are no regressions in test-suite at O0/O2 in mips32r1/r2 Reviewers: dsanders Reviewed By: dsanders Subscribers: echristo, rfuhler, llvm-commits Differential Revision: http://reviews.llvm.org/D6426 llvm-svn: 230300	2015-02-24 02:36:45 +00:00
Bob Wilson	8e29dec986	Fix handling of negative offsets for AddrModeT2_i8s4 in rewriteT2FrameIndex. This is a follow up to r230233 to fix something that I noticed by inspection. The AddrModeT2_i8s4 addressing mode does not support negative offsets. I spent a good chunk of the day trying to come up with a testcase for this but was not successful. This addressing mode is used to spill and restore GPRPair registers in Thumb2 code and that does not happen often. We also make very limited use of negative offsets when lowering frame indexes. I am going ahead with the change anyway, because I am pretty confident that it is correct. I also added a missing assertion to check that the low bits of the scaled offset are zero. llvm-svn: 230297	2015-02-24 01:37:31 +00:00
Sanjoy Das	b14010d28b	Fix bug 22641 The bug was a result of getPreStartForExtend interpreting nsw/nuw flags on an add recurrence more strongly than is legal. {S,+,X}<nsw> implies S+X is nsw only if the backedge of the loop is taken at least once. NOTE: I had accidentally committed an unrelated change with the commit message of this change in r230275 (r230275 was reverted in r230279). This is the correct change for this commit message. Differential Revision: http://reviews.llvm.org/D7808 llvm-svn: 230291	2015-02-24 01:02:42 +00:00
Manman Ren	6487ce955a	[LTO API] add lto_codegen_set_module to set the destination module. When debugging LTO issues with ld64, we use -save-temps to save the merged optimized bitcode file, then invoke ld64 again on the single bitcode file to speed up debugging code generation passes and ld64 stuff after code generation. llvm linking a single bitcode file via lto_codegen_add_module will generate a different bitcode file from the single input. With the newly-added lto_codegen_set_module, we can make sure the destination module is the same as the input. lto_codegen_set_module will transfer the ownership of the module to the code generator. rdar://19024554 llvm-svn: 230290	2015-02-24 00:45:56 +00:00
Adam Nemet	8bc61df9f2	[LoopAccesses] LAA::getInfo to use const reference for stride parameter And other required const-correctness fixes to make this work. llvm-svn: 230289	2015-02-24 00:41:59 +00:00
David Majnemer	3aa0bd81a2	X86: Only use 'lea' in Win64 epilogues if a frame pointer exists We can only use 'add' in epilogues, 'lea' is not permitted unless we've established a frame pointer in the prologue. llvm-svn: 230286	2015-02-24 00:11:32 +00:00
Sanjoy Das	82ea3d45b5	New instcombine rule: max(~a,~b) -> ~min(a, b) This case is interesting because ScalarEvolutionExpander lowers min(a, b) as ~max(~a,~b). I think the profitability heuristics can be made more clever/aggressive, but this is a start. Differential Revision: http://reviews.llvm.org/D7821 llvm-svn: 230285	2015-02-24 00:08:41 +00:00
Sanjoy Das	18c243b933	Bugfix: SCEVExpander incorrectly marks increment operations as no-wrap When emitting the increment operation, SCEVExpander marks the operation as nuw or nsw based on the flags on the preincrement SCEV. This is incorrect because, for instance, it is possible that {-6,+,1} is <nuw> while {-6,+,1}+1 = {-5,+,1} is not. This change teaches SCEV to mark the increment as nuw/nsw only if it can explicitly prove that the increment operation won't overflow. Apart from the attached test case, another (more realistic) manifestation of the bug can be seen in Transforms/IndVarSimplify/pr20680.ll. NOTE: this change was landed with an incorrect commit message in rL230275 and was reverted for that reason in rL230279. This commit message is the correct one. Differential Revision: http://reviews.llvm.org/D7778 llvm-svn: 230280	2015-02-23 23:22:58 +00:00
Sanjoy Das	c9cf0151cf	Revert 230275. 230275 got committed with an incorrect commit message due to a mixup on my side. Will re-land in a few moments with the correct commit message. llvm-svn: 230279	2015-02-23 23:13:22 +00:00
Simon Pilgrim	662c1d2770	Fix based on post-commit comment on D7816 & rL230177 - BUILD_VECTOR operand truncation was using the BV's output scalar type instead of the input type. llvm-svn: 230278	2015-02-23 23:04:28 +00:00
Andrea Di Biagio	af3f397b10	[X86] Teach how to custom lower double-to-half conversions under fast-math. This patch teaches the backend how to expand a double-half conversion into a double-float conversion immediately followed by a float-half conversion. We do this only under fast-math, and if float-half conversions are legal for the target. Added test CodeGen/X86/fastmath-float-half-conversion.ll Differential Revision: http://reviews.llvm.org/D7832 llvm-svn: 230276	2015-02-23 22:59:02 +00:00
Sanjoy Das	913dfd8f7f	Fix bug 22641 The bug was a result of getPreStartForExtend interpreting nsw/nuw flags on an add recurrence more strongly than is legal. {S,+,X}<nsw> implies S+X is nsw only if the backedge of the loop is taken at least once. Differential Revision: http://reviews.llvm.org/D7808 llvm-svn: 230275	2015-02-23 22:55:13 +00:00
Rafael Espindola	993502eafd	Fix invalid cast. Fixes PR22525. Patch by Ben Longbons with testcase by me. llvm-svn: 230271	2015-02-23 21:51:06 +00:00
David Majnemer	006c490ba8	X86: Use a smaller 'mov' instruction for stack probe calls Prologue emission, in some cases, requires calls to a stack probe helper function. The amount of stack to probe is passed as a register argument in the Win64 ABI but the instruction sequence used is pessimistic: it assumes that the number of bytes to probe is greater than 4 GB. Instead, select a more appropriate opcode depending on the number of bytes we are going to probe. llvm-svn: 230270	2015-02-23 21:50:30 +00:00
David Majnemer	31d868b618	X86: Use 'mov' instead of 'lea' in Win64 SEH prologues when possible 'mov' and 'lea' are equivalent when the displacement applied with 'lea' is zero. However, 'mov' should encode smaller. llvm-svn: 230269	2015-02-23 21:50:27 +00:00
David Majnemer	b85e023b8b	X86: Explain why we cannot use a 'mov' in a Win64 epilogue llvm-svn: 230268	2015-02-23 21:50:25 +00:00
David Majnemer	086f6a7e6e	X86: Consistently use 'epilogue' instead of 'epilog' llvm-svn: 230267	2015-02-23 21:50:18 +00:00
Sanjay Patel	27aa1423d2	add newline for easier reading; NFC llvm-svn: 230265	2015-02-23 21:32:09 +00:00
Bruno Cardoso Lopes	24492b057e	[AsmPrinter] Access pointers to globals via pcrel GOT entries Front-ends could use global unnamed_addr to hold pointers to other symbols, like @gotequivalent below: @foo = global i32 42 @gotequivalent = private unnamed_addr constant i32* @foo @delta = global i32 trunc (i64 sub (i64 ptrtoint (i32** @gotequivalent to i64), i64 ptrtoint (i32* @delta to i64)) to i32) The global @delta holds a data "PC"-relative offset to @gotequivalent, an unnamed pointer to @foo. The darwin/x86-64 assembly output for this follows: .globl _foo _foo: .long 42 .globl _gotequivalent _gotequivalent: .quad _foo .globl _delta _delta: .long _gotequivalent-_delta Since unnamed_addr indicates that the address is not significant, only the content, we can optimize the case above by replacing pc-relative accesses to "GOT equivalent" globals, by a PC relative access to the GOT entry of the final symbol instead. Therefore, "delta" can contain a pc relative relocation to foo's GOT entry and we avoid the emission of "gotequivalent", yielding the assembly code below: .globl _foo _foo: .long 42 .globl _delta _delta: .long _foo@GOTPCREL+4 There are a couple of advantages of doing this: (1) Front-ends that need to emit a great deal of data to store pointers to external symbols could save space by not emitting such "got equivalent" globals and (2) IR constructs combined with this opt opens a way to represent GOT pcrel relocations by using the LLVM IR, which is something we previously had no way to express. Differential Revision: http://reviews.llvm.org/D6922 rdar://problem/18534217 llvm-svn: 230264	2015-02-23 21:26:18 +00:00
Andrew Kaylor	982ea13c79	Removing unused private field. llvm-svn: 230259	2015-02-23 21:03:30 +00:00
Andrew Kaylor	322236eed6	Second attempt to fix WinEHCatchDirector build failures. llvm-svn: 230257	2015-02-23 20:44:34 +00:00
Andrew Kaylor	2e30b459ec	Attempting to fix WinEHCatchDirector destructor related build failures. llvm-svn: 230252	2015-02-23 20:19:15 +00:00
Andrew Kaylor	f22fe4ae18	Remap frame variables for native Windows exception handling. Differential Revision: http://reviews.llvm.org/D7770 llvm-svn: 230249	2015-02-23 20:01:56 +00:00
Bruno Cardoso Lopes	32173cdf06	Revert "[X86][MMX] Add MMX instructions to foldable tables" This reverts commit r230226 since it breaks win buildbots. llvm-svn: 230248	2015-02-23 19:53:37 +00:00
Chad Rosier	1df9124289	Revert "Revert "Raising minimum required CMake version to 2.8.12.2."" This reverts commit r230240, which was an accidental commit. llvm-svn: 230246	2015-02-23 19:34:04 +00:00
Eric Christopher	ed47b22951	Rewrite the global merge pass to be subprogram agnostic for now. It was previously using the subtarget to get values for the global offset without actually checking each function as it was generating code. Go ahead and solidify the current behavior and make the existing FIXMEs more prominent. As a note the ARM backend previously had a thumb1 and non-thumb1 set of defaults. Only the former was tested so I've changed the behavior to only use that for now. llvm-svn: 230245	2015-02-23 19:28:45 +00:00
Chad Rosier	543900539f	Prevent hoisting fmul from THEN/ELSE to IF if there is fmsub/fmadd opportunity. This patch adds the isProfitableToHoist API. For AArch64, we want to prevent a fmul from being hoisted in cases where it is more profitable to form a fmsub/fmadd. Phabricator Review: http://reviews.llvm.org/D7299 Patch by Lawrence Hu <lawrence@codeaurora.org> llvm-svn: 230241	2015-02-23 19:15:16 +00:00
Chad Rosier	7c3310694c	Revert "Raising minimum required CMake version to 2.8.12.2." This reverts commit 247aed4710e8befde76da42b27313661dea7cf66. llvm-svn: 230240	2015-02-23 19:15:08 +00:00
Mehdi Amini	cd3ca6f7dd	InstSimplify: simplify 0 / X if nnan and nsz From: Fiona Glaser <fglaser@apple.com> llvm-svn: 230238	2015-02-23 18:30:25 +00:00
Daniel Sanders	afe27c7d27	[mips] Honour -mno-odd-spreg for vector insert/extract when MSA is enabled. Summary: -mno-odd-spreg prohibits the use of odd-numbered single-precision floating point registers. However, vector insert/extract was still using them when manipulating the subregisters of an MSA register. Fixed this by ensuring that insertion/extraction is only performed on even-numbered vector registers when -mno-odd-spreg is given. Reviewers: vmedic, sstankovic Reviewed By: sstankovic Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D7672 llvm-svn: 230235	2015-02-23 17:22:16 +00:00
Bob Wilson	89e94fc3ad	Fix incorrect immediate size for AddrModeT2_i8s4 in rewriteT2FrameIndex. The natural way to handle this addressing mode would be to say that it has 8 bits and gets scaled by 4, but since the MC layer is expecting the scaling to be already reflected in the immediate value, we have been setting the Scale to 1. That's fine, but then NumBits needs to be adjusted to reflect the effective increase in the range of the immediate. That adjustment was missing. The consequence is that the register scavenger can fail. The estimateRSStackSizeLimit() function in ARMFrameLowering.cpp correctly assumes that the AddrModeT2_i8s4 address mode can handle scaled offsets up to 1020. Under just the right circumstances, we fail to reserve space for the scavenger because it thinks that nothing will be needed. However, the overly pessimistic behavior in rewriteT2FrameIndex causes some frame indexes to be out of range and require scavenged registers, and so the scavenger asserts. Unfortunately I have not been able to come up with a testcase for this. I can only reproduce it on an internal branch where the frame layout and register allocation is slightly different than trunk. We really need a way to serialize MachineInstr-level IR to write reasonable tests for things like this. rdar://problem/19909005 llvm-svn: 230233	2015-02-23 16:57:19 +00:00
Benjamin Kramer	654a85e2ee	Sync the __builtin_expects for our 3 quadratically probed hash table implementations. This assumes that a) finding the bucket containing the value is LIKELY b) finding an empty bucket is LIKELY c) growing the table is UNLIKELY I also switched the a) and b) cases for SmallPtrSet as we seem to use the set mostly more for insertion than for checking existence. In a simple benchmark consisting of 2^21 insertions of 2^20 unique pointers into a DenseMap or SmallPtrSet a few percent speedup on average, but nothing statistically significant. llvm-svn: 230232	2015-02-23 16:41:36 +00:00
Bruno Cardoso Lopes	f488e2ae69	[X86][MMX] Add MMX instructions to foldable tables Teach the peephole optimizer to work with MMX instructions by adding entries into the foldable tables. This covers folding opportunities not handled during isel. llvm-svn: 230226	2015-02-23 15:23:22 +00:00
Bruno Cardoso Lopes	9e1c4c17d9	[X86][MMX] Support folding loads in psll, psrl and psra intrinsics llvm-svn: 230225	2015-02-23 15:23:14 +00:00
Elena Demikhovsky	52e81bc499	AVX-512: recommitted 229837 + bugfix + test llvm-svn: 230223	2015-02-23 15:12:31 +00:00
Elena Demikhovsky	145e5b4409	restructured X86 scalar unary operation templates I made the templates general, no need to define pattern separately for each instruction/intrinsic. Now only need to add r_Int pattern for AVX. llvm-svn: 230221	2015-02-23 14:14:02 +00:00
David Majnemer	eba692dd28	AsmParser: Check ConstantExpr insertvalue operands for type correctness llvm-svn: 230206	2015-02-23 07:13:52 +00:00

77383 Commits