llvm-project

Commit Graph

Author	SHA1	Message	Date
Craig Topper	46469aa4da	[X86] Add IntrNoMem to the AVX512 conflict intrinsics. llvm-svn: 226897	2015-01-23 06:11:45 +00:00
Rafael Espindola	5fa925ebf6	Add STB_GNU_UNIQUE to the ELF writer. This lets llvm-mc assemble files produced by gcc. llvm-svn: 226895	2015-01-23 04:44:35 +00:00
NAKAMURA Takumi	2bbc90cca5	Reformat. llvm-svn: 226888	2015-01-23 01:02:07 +00:00
NAKAMURA Takumi	f6eee4ad67	MipsAsmParser.cpp: Suppress a warning introduced in r226657. [-Wunused-variable] llvm-svn: 226887	2015-01-23 01:01:52 +00:00
Jan Vesely	5f715d36a7	R600: Try to use lower types for 64bit division if possible v2: add and enable tests for SI Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-by: Matt Arsenault <Matthew.Arsenault@amd.com> llvm-svn: 226881	2015-01-22 23:42:43 +00:00
Jan Vesely	6269e3ca2f	SelectionDAG: Add KnownBits and SignBits computation for EXTRACT_ELEMENT v2: use getZExtValue add missing break codestyle v3: add few more comments Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-by: Matt Arsenault <Matthew.Arsenault@amd.com> llvm-svn: 226880	2015-01-22 23:42:41 +00:00
Jan Vesely	f7987ca5a7	R600: Simplify LowerUDIVREM optimizations can handle removing the Hi part operations. The generated code is identical for R600, ~10% icount reduction for SI v2: rebase Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-by: Matt Arsenault <Matthew.Arsenault@amd.com> llvm-svn: 226879	2015-01-22 23:42:39 +00:00
Duncan P. N. Exon Smith	68ab023ef7	IR: Change GenericDwarfNode::getHeader() to StringRef Simplify the API to use a `StringRef` directly rather than exposing the `MDString` bits underneath. llvm-svn: 226876	2015-01-22 23:10:55 +00:00
Duncan P. N. Exon Smith	e8b5e49ffd	IR: DwarfNode => DebugNode, NFC These things are potentially used for non-DWARF data (see the discussion in PR22235), so take the `Dwarf` out of the name. Since the new name gives fewer clues, update the doxygen to properly describe what they are. llvm-svn: 226874	2015-01-22 22:47:44 +00:00
Simon Pilgrim	7e6d573e87	[X86][AVX] Added (V)MOVDDUP / (V)MOVSLDUP / (V)MOVSHDUP memory folding + tests. Minor tweak now that D7042 is complete, we can enable stack folding for (V)MOVDDUP and do proper testing. Added missing AVX ymm folding patterns and fixed alignment for AVX VMOVSLDUP / VMOVSHDUP. llvm-svn: 226873	2015-01-22 22:39:59 +00:00
Chandler Carruth	df8b223dea	[PM] Actually add the new pass manager support for the assumption cache. I had already factored this analysis specifically to enable doing this, but hadn't actually committed the necessary wiring to get at this from the new pass manager. This also nicely shows how the separate cache object can be directly managed by the new pass manager. This analysis didn't have any direct tests and so I've added a printer pass and a boring test case. I chose to print the i1 value which is being assumed rather than the call to llvm.assume as that seems much more useful for testing... but suggestions on an even better printing strategy welcome. My main goal was to make sure things actually work. =] llvm-svn: 226868	2015-01-22 21:53:09 +00:00
Benjamin Kramer	cb36becbeb	Remove dead leak detector parts that fell out of use in r224703. llvm-svn: 226867	2015-01-22 21:43:01 +00:00
Duncan P. N. Exon Smith	8d536973a2	IR: Update references to temporaries before deleting During `MDNode::deleteTemporary()`, call `replaceAllUsesWith(nullptr)` to update all tracking references to `nullptr`. This fixes PR22280, where inverted destruction order between tracking references and the temporaries themselves caused a use-after-free in `LLParser`. An alternative fix would be to add an assertion that there are no users, and continue to fix inverted destruction order in clients (like `LLParser`), but instead I decided to make getting-teardown-right easy. (If someone disagrees let me know.) llvm-svn: 226866	2015-01-22 21:36:45 +00:00
Chris Bieneman	799ef37d02	Refactoring cl::parser construction and initialization. Summary: Some parsers need references back to the option they are members of. This is used for handling the argument string as well as by the various pass name parsers for making pass names into flags. Making parsers that need to refer back to the option have a reference to the option eliminates some of the members of various parsers, and enables further code cleanup. Reviewers: dexonsmith Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D7131 llvm-svn: 226864	2015-01-22 21:01:12 +00:00
Ramkumar Ramachandra	75a4f35b26	Intrinsics: introduce llvm_any_ty aka ValueType Any Specifically, gc.result benefits from this greatly. Instead of: gc.result.int.* gc.result.float.* gc.result.ptr.* ... We now have a gc.result.* that can specialize to literally any type. Differential Revision: http://reviews.llvm.org/D7020 llvm-svn: 226857	2015-01-22 20:14:38 +00:00
Reid Kleckner	f12b33454f	Revert "Don't remove a landing pad if the invoke requires a table entry." This reverts commit r176827. Björn Steinbrink pointed out that this didn't actually fix the bug (PR15555) it was attempting to fix. With this reverted, we can now remove landingpad cleanups that immediately resume unwinding, converting the invoke to a call. llvm-svn: 226850	2015-01-22 19:29:46 +00:00
Sanjay Patel	37c41c1d2c	merge consecutive stores of extracted vector elements (PR21711) This is a 2nd try at the same optimization as http://reviews.llvm.org/D6698. That patch was checked in at r224611, but reverted at r225031 because it caused a failure outside of the regression tests. The cause of the crash was not recognizing consecutive stores that have mixed source values (loads and vector element extracts), so this patch adds a check to bail out if any store value is not coming from a vector element extract. This patch also refactors the shared logic of the constant source and vector extracted elements source cases into a helper function. Differential Revision: http://reviews.llvm.org/D6850 llvm-svn: 226845	2015-01-22 18:21:26 +00:00
David Blaikie	e7d473461e	Revert "PR21408: Workaround the appearance of duplicate variables due to problems when inlining two calls to the same function from the same call site." The underlying bug has been fixed in r226736 so there's no need to workaround this anymore. This reverts commit r220923. llvm-svn: 226842	2015-01-22 17:49:59 +00:00
Tim Northover	7cd58934a8	AArch64: decode all MRS/MSR forms early to avoid saving FeatureBits. Currently, we're adding a uint64_t describing the current subtarget so that matching can check whether the specified register is valid. However, we want to move to a bitset for those bits (x86 has more than 64 of them). This can't live in a union so it's probably better to do the checks early (especially as there are only 3 of them). llvm-svn: 226841	2015-01-22 17:23:04 +00:00
Adrian Prantl	0d7d8e4512	Rewrite DIExpression::printInternal() to use the iterator interface. NFC. llvm-svn: 226836	2015-01-22 16:55:22 +00:00
Adrian Prantl	2585a98d38	Rename DIExpressionIterator to DIExpression::iterator. Addresses review feedback from Duncan. llvm-svn: 226835	2015-01-22 16:55:20 +00:00
Rafael Espindola	5a67ed1038	[pr21886] Change MCJIT/ELF to support MSVC C++ mangled symbol. The ELF format is used on Windows by the MCJIT engine. Thus, on Windows, the ELFObjectWriter can encounter symbols mangled using the MS Visual Studio C++ name mangling. Symbols mangled using the MSVC C++ name mangling can legally have "@@@" as a substring. The ELFObjectWriter should not interpret the "@@@" substring as specifying GNU-style symbol versioning. The ELFObjectWriter therefore checks for the MSVC C++ name mangling prefix which is either "?", "@?", "imp_?" or "imp_?@". llvm-svn: 226830	2015-01-22 14:20:45 +00:00
Aaron Ballman	9ada6cd0f6	Silencing a -Wsign-compare warning (all uses of this constant are within unsigned expressions anyway); NFC. llvm-svn: 226826	2015-01-22 13:57:41 +00:00
Michael Kuperstein	25e34d11f3	[DAGCombine] Produce better code for constant splats This solves PR22276. Splats of constants would sometimes produce redundant shuffles, sometimes ridiculously so (see the PR for details). Fold these shuffles into BUILD_VECTORs early on instead. Differential Revision: http://reviews.llvm.org/D7093 Fixed recommit of r226811. llvm-svn: 226816	2015-01-22 13:07:28 +00:00
Alexander Potapenko	a007905e4e	Mark |TLI| variables used to suppress -Wunused-variable warnings. (These vars are only used in assertions) llvm-svn: 226815	2015-01-22 13:03:33 +00:00
Michael Kuperstein	ff74032018	Revert r226811, MSVC accepts code sane compilers don't. llvm-svn: 226814	2015-01-22 12:48:07 +00:00
Michael Kuperstein	84fad3e5c9	[DAGCombine] Produce better code for constant splats This solves PR22276. Splats of constants would sometimes produce redundant shuffles, sometimes ridiculously so (see the PR for details). Fold these shuffles into BUILD_VECTORs early on instead. Differential Revision: http://reviews.llvm.org/D7093 llvm-svn: 226811	2015-01-22 12:37:23 +00:00
Timur Iskhodzhanov	b4b6b74079	[ASan/Win] Move the shadow to 0x30000000 llvm-svn: 226809	2015-01-22 12:24:21 +00:00
Elena Demikhovsky	150d9f3187	Fixed a bug in type legalizer for masked load/store intrinsics. The problem occurs when after vectorization we have type <2 x i32>. This type is promoted to <2 x i64> and then requires additional efforts for expanding loads and truncating stores. I added EXPAND / TRUNCATE attributes to the masked load/store SDNodes. The code now contains additional shuffles. I've prepared changes in the cost estimation for masked memory operations, it will be submitted separately. llvm-svn: 226808	2015-01-22 12:07:59 +00:00
Elena Demikhovsky	94cfbbab33	Fixed a comment llvm-svn: 226806	2015-01-22 10:01:36 +00:00
Elena Demikhovsky	9c26462a27	Fixed a bug in narrowing store operation. Type MVT::i1 became legal in KNL, but store operation can't be narrowed to this type, since the size of VT (1 bit) is not equal to its actual store size(8 bits). Added a test provided by David (dag@cray.com) llvm-svn: 226805	2015-01-22 09:39:08 +00:00
Sanjoy Das	351db05308	[NFC] Introduce a 'struct Range' for IRCE Use the struct instead of a std::pair<Value *, Value *>. This makes a Range an obviously immutable object, and we can now assert that a range is well-typed (Begin->getType() == End->getType()) on its construction. llvm-svn: 226804	2015-01-22 09:32:02 +00:00
Craig Topper	e0c8e8f6a7	Revert r226798. Guess I missed the patterns. llvm-svn: 226802	2015-01-22 09:01:20 +00:00
Craig Topper	ffef4cf1e1	Use u8imm instead of i32i8imm on a couple instructions that have no patterns and thus no reason to use a larger operand size. llvm-svn: 226798	2015-01-22 08:53:11 +00:00
Craig Topper	9b39e54001	[X86] Remove some unused multiclasses from AVX512 instruction file. llvm-svn: 226797	2015-01-22 08:53:08 +00:00
Sanjoy Das	d1fb13ce4c	Fix crashes in IRCE caused by mismatched types There are places where the inductive range check elimination pass depends on two llvm::Values or llvm::SCEVs to be of the same llvm::Type when they do not need to be. This patch relaxes those restrictions (by bailing out of the optimization if the types mismatch), and adds test cases to trigger those paths. These issues were found by bootstrapping clang with IRCE running in the -O3 pass ordering. Differential Revision: http://reviews.llvm.org/D7082 llvm-svn: 226793	2015-01-22 08:29:18 +00:00
Erik Eckstein	96cfb9c655	SLPVectorizer: add a second limit for the number of alias checks. Even with the current limit on the number of alias checks, the containing loop has quadratic complexity. This begins to hurt for blocks containing > 1K load/store instructions. This commit introduces a limit for the loop count. It reduces the runtime for such very large blocks. llvm-svn: 226792	2015-01-22 08:20:51 +00:00
Elena Demikhovsky	079b2d8c0c	Fixed a bug in masked load/store in reversed loop. Added a test. The bug was submitted to bugzilla: http://llvm.org/bugs/show_bug.cgi?id=22225 llvm-svn: 226791	2015-01-22 08:20:06 +00:00
Chandler Carruth	a917458203	[PM] Rename InstCombine.h to InstCombineInternal.h in preparation for creating a non-internal header file for the InstCombine pass. I thought about calling this InstCombiner.h or in some way more clearly associating it with the InstCombiner class that it is primarily defining, but there are several other utility interfaces defined within this for InstCombine. If, in the course of refactoring, those end up moving elsewhere or going away, it might make more sense to make this the combiner's header alone. Naturally, this is a bikeshed to a certain degree, so feel free to lobby for a different shade of paint if this name just doesn't suit you. llvm-svn: 226783	2015-01-22 05:25:13 +00:00
Chandler Carruth	cd8522ef44	[canonicalize] Teach InstCombine to canonicalize loads which are only ever stored to always use a legal integer type if one is available. Regardless of whether this particular type is good or bad, it ensures we don't get weird differences in generated code (and resulting performance) from "equivalent" patterns that happen to end up using a slightly different type. After some discussion on llvmdev it seems everyone generally likes this canonicalization. However, there may be some parts of LLVM that handle it poorly and need to be fixed. I have at least verified that this doesn't impede GVN and instcombine's store-to-load forwarding powers in any obvious cases. Subtle cases are exactly what we need to flush out if they remain. Also note that this IR pattern should already be hitting LLVM from Clang at least because it is exactly the IR which would be produced if you used memcpy to copy a pointer or floating point between memory instead of a variable. llvm-svn: 226781	2015-01-22 05:08:12 +00:00
Saleem Abdulrasool	10ed0babd3	ARM: fail less catastrophically on invalid Windows input Windows supports a restricted set of relocations (compared to ARM ELF). In some cases, we may end up generating an unsupported relocation. This can occur with bad input to the assembler in particular (the frontend should never generate code that cannot be compiled). Generate an error rather than just aborting. The change in the API is driven by the desire to provide a slightly more helpful message for debugging purposes. llvm-svn: 226779	2015-01-22 04:03:32 +00:00
Chandler Carruth	fa11d837a0	[canonicalize] Move a helper function further up the file so it can be used earlier. NFC. llvm-svn: 226777	2015-01-22 03:34:54 +00:00
Reid Kleckner	f690f50519	Win64 SEH: Emit the constant 1 for catch-all into xdata llvm-svn: 226767	2015-01-22 02:27:44 +00:00
Sanjoy Das	cb47366366	Make ScalarEvolution less aggressive with respect to no-wrap flags. ScalarEvolution currently lowers a subtraction recurrence to an add recurrence with the same no-wrap flags as the subtraction. This is incorrect because `sub nsw X, Y` is not the same as `add nsw X, -Y` and `sub nuw X, Y` is not the same as `add nuw X, -Y`. This patch fixes the issue, and adds two test cases demonstrating the bug. Differential Revision: http://reviews.llvm.org/D7081 llvm-svn: 226755	2015-01-22 00:48:47 +00:00
Adrian Prantl	531641a0c6	Make DwarfExpression use the new DIExpressionIterator. NFC. llvm-svn: 226748	2015-01-22 00:00:59 +00:00
Adrian Prantl	9260ccaeb4	Rewrite DIExpression::Verify() using an iterator. NFC. Addresses review comments for r226627. llvm-svn: 226747	2015-01-22 00:00:52 +00:00
Chandler Carruth	2135b97d8f	[canonicalization] Refactor how we create new stores into a helper function. This is a bit tidier anyways and will make a subsequent patch simpler as I want to add another case to this combine. llvm-svn: 226746	2015-01-21 23:45:01 +00:00
Simon Pilgrim	5fa0fb23ca	[X86][SSE] Missing SSE/AVX1 memory folding integer instructions Added most of the missing integer vector folding patterns for SSE (to SSE42) and AVX1. The most useful of these are probably the i32/i64 extraction, i8/i16/i32/i64 insertions, zero/sign extension, unsigned saturation subtractions, i64 subtractions and the variable mask blends (pblendvb) - others include CLMUL, SSE42 string comparisons and bit tests. Differential Revision: http://reviews.llvm.org/D7094 llvm-svn: 226745	2015-01-21 23:43:30 +00:00
Tim Northover	3007ba0ab3	DAGCombine: fold (or (and X, M), (and X, N)) -> (and X, (or M, N)) It can help with argument juggling on some targets, and is generally a good idea. llvm-svn: 226740	2015-01-21 23:17:19 +00:00
David Blaikie	df706288fb	DebugInfo: Use distinct inlinedAt MDLocations to avoid separate inlined calls being coalesced When two calls from the same MDLocation are inlined they currently get treated as one inlined function call (creating difficulty debugging, duplicate variables, etc). Clang worked around this by including column information on inline calls which doesn't address LTO inlining or calls to the same function from the same line and column (such as through a macro). It also didn't address ctor and member function calls. By making the inlinedAt locations distinct, every call site has an explicitly distinct location that cannot be coalesced with any other call. This can produce linearly (2x in the worst case where every call is inlined and the call instruction has a non-call instruction at the same location) more debug locations. Any increase beyond that is in cases where the Clang workaround was insufficient and the new scheme is creating necessary distinct nodes that were being erroneously coalesced previously. After this change to LLVM the incomplete workarounds in Clang can be removed. That should reduce the number of debug locations (in a build without column info, the default on Darwin, not the default on Linux) by not creating pseudo-distinct locations for every call to an inline function. (oh, and I made the inlined-at chain rebuilding iterative instead of recursive because I was having trouble wrapping my head around it the way it was - open to discussion on the right design for that function (including going back to a recursive solution)) llvm-svn: 226736	2015-01-21 22:57:29 +00:00
Matthias Braun	c1988f384c	LiveIntervalAnalysis: Mark subregister defs as undef when we determined they are only reading a dead superregister value This was not necessary before as this case can only be detected when the liveness analysis is at subregister level. llvm-svn: 226733	2015-01-21 22:55:13 +00:00
Chris Bieneman	9e13af7ac3	Adding a new cl::HideUnrelatedOptions API to allow clang to migrate off cl::getRegisteredOptions. Summary: cl::getRegisteredOptions really exposes some of the innards of how command line parsing is implemented. Exposing new APIs that allow us to disentangle client code from implementation details will allow us to make more extensive changes to command line parsing. Reviewers: chandlerc, dexonsmith, beanz Reviewed By: dexonsmith Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D7100 llvm-svn: 226729	2015-01-21 22:45:52 +00:00
Simon Pilgrim	b16b09b154	[X86][SSE] Added support for SSE3 lane duplication shuffle instructions This patch adds shuffle matching for the SSE3 MOVDDUP, MOVSLDUP and MOVSHDUP instructions. The big use of these being that they avoid many single source shuffles from needing to use (pre-AVX) dual source instructions such as SHUFPD/SHUFPS: causing extra moves and preventing load folds. Adding these instructions uncovered an issue in XFormVExtractWithShuffleIntoLoad which crashed on single operand shuffle instructions (now fixed). It also involved fixing getTargetShuffleMask to correctly identify these instructions as unary shuffles. Also adds a missing tablegen pattern for MOVDDUP. Differential Revision: http://reviews.llvm.org/D7042 llvm-svn: 226716	2015-01-21 22:44:35 +00:00
Jonathan Roelofs	229eb4ca5c	Fix load-store optimizer on thumbv4t Thumbv4t does not have lo->lo copies other than MOVS, and that can't be predicated. So emit MOVS when needed and bail if there's a predicate. http://reviews.llvm.org/D6592 llvm-svn: 226711	2015-01-21 22:39:43 +00:00
David Majnemer	4c0a6e918a	InstCombine: Don't strip bitcasts off of callsites marked 'thunk' The return type of a thunk is meaningless, we just want the arguments and return value to be forwarded. llvm-svn: 226708	2015-01-21 22:32:04 +00:00
Simon Pilgrim	47af023ada	[X86][SSE] movddup shuffle mask decodes Patch to provide shuffle decodes and asm comments for the SSE3/AVX1 movddup double duplication instructions. llvm-svn: 226705	2015-01-21 22:02:30 +00:00
Matthias Braun	311730ac78	LiveIntervalAnalysis: Factor out code to update liveness on vreg def removal This cleans up code and is more in line with the general philosophy of modifying LiveIntervals through LiveIntervalAnalysis instead of changing them directly. This also fixes a case where SplitEditor::removeBackCopies() would miss the subregister ranges. llvm-svn: 226690	2015-01-21 19:02:30 +00:00
Matthias Braun	cfb8ad29b5	LiveIntervalAnalysis: Factor out code to update liveness on physreg def removal This cleans up code and is more in line with the general philosophy of modifying LiveIntervals through LiveIntervalAnalysis instead of changing them directly. llvm-svn: 226687	2015-01-21 18:50:21 +00:00
Matthias Braun	1002baf7b9	LiveIntervalAnalysis: Remove unused pruneValue() variant. llvm-svn: 226686	2015-01-21 18:45:57 +00:00
Adrian Prantl	1292e24d0e	Let subprograms with instructions without parent scopes fail the verification. Tested via a unit test. Follow-up to r226616. llvm-svn: 226684	2015-01-21 18:32:56 +00:00
Matt Arsenault	b00554886f	R600/SI: Custom lower fround This fixes it for SI. It also removes the pattern used previously for Evergreen for f32. I'm not sure if the new R600 output is better or not, but it uses 1 fewer instructions if BFI is available. llvm-svn: 226682	2015-01-21 18:18:25 +00:00
Colin LeMahieu	94269db8ba	[Hexagon] Converting multiply and accumulate with immediate intrinsics to patterns. llvm-svn: 226681	2015-01-21 18:13:15 +00:00
Ahmed Bougacha	8f09e9f7c5	[X86] Declare SSE4.1/AVX2 vector extloads covered by PMOV[SZ]X legal. Now that we can fully specify extload legality, we can declare them legal for the PMOVSX/PMOVZX instructions. This for instance enables a DAGCombine to fire on code such as (and (<zextload-equivalent> ...), <redundant mask>) to turn it into: (zextload ...) as seen in the testcase changes. There is one regression, in widen_load-2.ll: we're no longer able to do store-to-load forwarding with illegal extload memory types. This will be addressed separately. Differential Revision: http://reviews.llvm.org/D6533 llvm-svn: 226676	2015-01-21 17:07:06 +00:00
George Burgess IV	3c898c2119	Fixed a bug with how we determine bitset indices. llvm-svn: 226671	2015-01-21 16:37:21 +00:00
Yaron Keren	3f02c14cc7	Add missing include guards to WindowsSupport.h. llvm-svn: 226669	2015-01-21 16:20:38 +00:00
Tim Northover	cf3d80fedb	Revert "DAGCombine: fold (or (and X, M), (and X, N)) -> (and X, (or M, N))" It hadn't gone through review yet, but was still on my local copy. This reverts commit r226663 llvm-svn: 226665	2015-01-21 15:48:52 +00:00
Tim Northover	b9184f2b1a	AArch64: add backend option to reserve x18 (platform register) AAPCS64 says that it's up to the platform to specify whether x18 is reserved, and a first step on that way is to add a flag controlling it. From: Andrew Turner <andrew@fubar.geek.nz> llvm-svn: 226664	2015-01-21 15:43:31 +00:00
Tim Northover	85cd2791c9	DAGCombine: fold (or (and X, M), (and X, N)) -> (and X, (or M, N)) llvm-svn: 226663	2015-01-21 15:43:28 +00:00
Michael Kuperstein	ada9fa1ca9	[x32] Fast ISel should use LEA64_32r instead of LEA32r to adjust addresses in x32 mode. llvm-svn: 226661	2015-01-21 14:44:05 +00:00
Evgeniy Stepanov	79ca0fd1a0	[msan] Update origin for the entire destination range on memory store. Previously we always stored 4 bytes of origin at the destination address even for 8-byte (and longer) stores. This should fix rare missing, or incorrect, origin stacks in MSan reports. llvm-svn: 226658	2015-01-21 13:21:31 +00:00
Jozef Kolek	5cfebdde2b	[mips][microMIPS] MicroMIPS 16-bit unconditional branch instruction B Implement microMIPS 16-bit unconditional branch instruction B. Implemented 16-bit microMIPS unconditional instruction has real name B16, and B is an alias which expands to either B16 or BEQ according to the rules: b 256 --> b16 256 # R_MICROMIPS_PC10_S1 b 12256 --> beq $zero, $zero, 12256 # R_MICROMIPS_PC16_S1 b label --> beq $zero, $zero, label # R_MICROMIPS_PC16_S1 Differential Revision: http://reviews.llvm.org/D3514 llvm-svn: 226657	2015-01-21 12:39:30 +00:00
Jozef Kolek	2c6d73207e	[mips][microMIPS] Implement ADDIUPC instruction Differential Revision: http://reviews.llvm.org/D6582 llvm-svn: 226656	2015-01-21 12:10:11 +00:00
Chandler Carruth	df5747a900	[PM] Refactor the InstCombiner interface to use an external worklist. Because in its primary function pass the combiner is run repeatedly over the same function until doing so produces no changes, it is essential to not re-allocate the worklist. However, as a utility, the more common pattern would be to put a limited set of instructions in the worklist rather than the entire function body. That is also the more likely pattern when used by the new pass manager. The result is a very lightweight combiner that does the visiting with a separable worklist. This can then be wrapped up in a helper function for users that want a combiner utility, or as I have here it can be wrapped up in a pass which manages the iterations used when combining an entire function's instructions. Hopefully this removes some of the worst of the interface warts that became apparent with the last patch here. However, there is clearly more work. I've again left some FIXMEs for the most egregious. The ones that stick out to me are the exposure of the worklist and IR builder as public members, and the use of pointers rather than references. However, fixing these is likely to be much more mechanical and less interesting so I didn't want to touch them in this patch. llvm-svn: 226655	2015-01-21 11:38:17 +00:00
Chandler Carruth	ba4c5179a0	[PM] Simplify (ha! ha!) the way that instcombine calls the SimplifyLibCalls utility by sinking it into the specific call part of the combiner. This will avoid us needing to do any contortions to build this object in a subsequent refactoring I'm doing and seems generally better factored. We don't need this utility everywhere and it carries no interesting state so we might as well build it on demand. llvm-svn: 226654	2015-01-21 11:23:40 +00:00
Vladimir Medic	435cf8a415	[Mips][Disassembler] When the disassembler meets load/store from coprocessor 2 instructions for mips r6 it crashes as the access to operands array is out of range. This patch adds dedicated decoder method that properly handles decoding of these instructions. llvm-svn: 226652	2015-01-21 10:47:36 +00:00
Craig Topper	42b326ea12	[x86] Remove some unnecessary and slightly confusing typecasts from some patterns. I think it actually went i32->iPtr->i32 in some of these cases. llvm-svn: 226647	2015-01-21 08:43:57 +00:00
Craig Topper	7ff6ab30a9	[X86] Convert all the i8imm used by AVX512 and MMX instructions to u8imm. llvm-svn: 226646	2015-01-21 08:43:49 +00:00
Craig Topper	620b50cc23	[X86] Convert all the i8imm used by SSE and AVX instructions to u8imm. This makes the assembler check their size and removes a hack from the disassembler to avoid sign extending the immediate. llvm-svn: 226645	2015-01-21 08:15:54 +00:00
Craig Topper	f38dea1cfa	[x86] Add assembly parser bounds checking to the immediate value for cmpss/cmpsd/cmpps/cmppd. llvm-svn: 226642	2015-01-21 06:07:53 +00:00
Chandler Carruth	9280382ac6	[PM] Replace an abuse of inheritance to override a single function with a more direct approach: a type-erased glorified function pointer. Now we can pass a function pointer into this for the easy case and we can even pass a lambda into it in the interesting case in the instruction combiner. I'll be using this shortly to simplify the interfaces to InstCombiner, but this helps pave the way and seems like a better design for the libcall simplifier utility. llvm-svn: 226640	2015-01-21 02:11:59 +00:00
Adrian Prantl	34bcbeed03	Make DIExpression::Verify() stricter by checking that the number of elements and the ordering is sane and cleanup the accessors. llvm-svn: 226627	2015-01-21 00:59:20 +00:00
Chandler Carruth	1edb9d63e9	[PM] Separate the InstCombiner from its pass. This creates a small internal pass which runs the InstCombiner over a function. This is the hard part of porting InstCombine to the new pass manager, as at this point none of the code in InstCombine has access to a Pass object any longer. The resulting interface for the InstCombiner is pretty terrible. I'm not planning on leaving it that way. The key thing missing is that we need to separate the worklist from the combiner a touch more. Once that's done, it should be possible for any part of LLVM to just create a worklist with instructions, populate it, and then combine it until empty. The pass will just be the (obvious and important) special case of doing that for an entire function body. For now, this is the first increment of factoring to make all of this work. llvm-svn: 226618	2015-01-20 22:44:35 +00:00
Adrian Prantl	de200dfad2	DebugLocs without a scope should fail the verification. Follow-up to r226588. llvm-svn: 226616	2015-01-20 22:37:25 +00:00
Chandler Carruth	b3d03df3ac	[PM] Reformat this code with clang-format so that subsequent changes don't get muddied up by formatting changes. Some of these don't really seem like improvements to me, but they also don't seem any worse and I care much more about not formatting them manually than I do about the particular formatting. =] llvm-svn: 226610	2015-01-20 21:10:35 +00:00
Colin LeMahieu	988c68f2a7	[Hexagon] Adding intrinsics for doubleword ALU operations. llvm-svn: 226606	2015-01-20 20:45:05 +00:00
Daniel Jasper	6b77455f81	Prevent binary-tree deterioration in sparse switch statements. This addresses part of llvm.org/PR22262. Specifically, it prevents considering the densities of sub-ranges that have fewer than TLI.getMinimumJumpTableEntries() elements. Those densities won't help jump tables. This is not a complete solution but works around the most pressing issue. Review: http://reviews.llvm.org/D7070 llvm-svn: 226600	2015-01-20 19:43:33 +00:00
Ramkumar Ramachandra	be10ece5ed	[GC] Verify-pass void vararg functions in gc.statepoint With the appropriate Verifier changes, extracting the result out of a statepoint wrapping a vararg function crashes. However, a void vararg function works fine: commit this first step. Differential Revision: http://reviews.llvm.org/D7071 llvm-svn: 226599	2015-01-20 19:42:46 +00:00
Adrian Prantl	565cc18d8f	Reapply: Teach SROA how to update debug info for fragmented variables. This reapplies r225379. ChangeLog: - The assertion that this commit previously ran into about the inability to handle indirect variables has since been removed and the backend can handle this now. - Testcases were upgraded to the new MDLocation format. - Instead of keeping a DebugDeclares map, we now use llvm::FindAllocaDbgDeclare(). Original commit message follows. Debug info: Teach SROA how to update debug info for fragmented variables. This allows us to generate debug info for extremely advanced code such as typedef struct { long int a; int b;} S; int foo(S s) { return s.b; } which at -O1 on x86_64 is codegen'd into define i32 @foo(i64 %s.coerce0, i32 %s.coerce1) #0 { ret i32 %s.coerce1, !dbg !24 } with this patch we emit the following debug info for this TAG_formal_parameter [3] AT_location( 0x00000000 0x0000000000000000 - 0x0000000000000006: rdi, piece 0x00000008, rsi, piece 0x00000004 0x0000000000000006 - 0x0000000000000008: rdi, piece 0x00000008, rax, piece 0x00000004 ) AT_name( "s" ) AT_decl_file( "/Volumes/Data/llvm/_build.ninja.release/test.c" ) Thanks to chandlerc, dblaikie, and echristo for their feedback on all previous iterations of this patch! llvm-svn: 226598	2015-01-20 19:42:22 +00:00
Tom Stellard	e99fb65d87	R600/SI: Add subtarget feature to enable VGPR spilling for all shader types This is disabled by default, but can be enabled with the subtarget feature: 'vgpr-spilling' llvm-svn: 226597	2015-01-20 19:33:04 +00:00
Tom Stellard	021053f500	R600/SI: Fix simple-loop.ll test llvm-svn: 226596	2015-01-20 19:33:02 +00:00
Jozef Kolek	0d49117769	Reverted revision 226577. llvm-svn: 226595	2015-01-20 19:29:28 +00:00
Chandler Carruth	3a62216a8a	[PM] Clean up a bunch of the doxygen / API docs on the InstCombiner pass prior to refactoring it. llvm-svn: 226594	2015-01-20 19:27:58 +00:00
Manman Ren	dab999d54f	[llvm link] Destroy ConstantArrays in LLVMContext if they are not used. ConstantArrays constructed during linking can cause quadratic memory explosion. An example is the ConstantArrays constructed when linking in GlobalVariables with appending linkage. Releasing all unused constants can cause a 20% LTO compile-time slowdown for a large application. So this commit releases unused ConstantArrays only. rdar://19040716. It reduces memory footprint from 20+G to 6+G. llvm-svn: 226592	2015-01-20 19:24:59 +00:00
Tom Stellard	3a70d07f51	R600/SI: Remove stray debugging code from r226586 llvm-svn: 226591	2015-01-20 19:24:31 +00:00
Adrian Prantl	f88b2c8c74	Add an assertion and prefer a crash over an infinite loop. llvm-svn: 226588	2015-01-20 18:03:37 +00:00
Tom Stellard	95292bbfcd	R600/SI: Use external symbols for scratch buffer We were passing the scratch buffer address to the shaders via user sgprs, but now we use external symbols and have the driver patch the shader using reloc information. llvm-svn: 226586	2015-01-20 17:49:47 +00:00
Tom Stellard	8255af45cb	R600/SI: Add kill flag when copying scratch offset to a register This allows us to re-use the same register for the scratch offset when accessing large private arrays. llvm-svn: 226585	2015-01-20 17:49:45 +00:00
Tom Stellard	8058069529	R600/SI: Don't store scratch buffer frame index in MUBUF offset field We don't have a good way of legalizing this if the frame index offset is more than 12 bits, which is the size of MUBUF's offset field, so now we store the frame index in the vaddr field. llvm-svn: 226584	2015-01-20 17:49:43 +00:00
Tom Stellard	1106b1c662	R600/SI: Update SIInstrInfo:verifyInstruction() after r225662 Now that we have our own custom register operand types, we need to handle them in the verifier. llvm-svn: 226583	2015-01-20 17:49:41 +00:00
Aaron Ballman	6fa2141dca	Silencing a -Wunused-variable warning in non-asserts builds; NFC. llvm-svn: 226581	2015-01-20 17:10:45 +00:00
Jozef Kolek	45f7f9c1ab	[mips][microMIPS] MicroMIPS 16-bit unconditional branch instruction B Implement microMIPS 16-bit unconditional branch instruction B. Implemented 16-bit microMIPS unconditional instruction has real name B16, and B is an alias which expands to either B16 or BEQ according to the rules: b 256 --> b16 256 # R_MICROMIPS_PC10_S1 b 12256 --> beq $zero, $zero, 12256 # R_MICROMIPS_PC16_S1 b label --> beq $zero, $zero, label # R_MICROMIPS_PC16_S1 Differential Revision: http://reviews.llvm.org/D3514 llvm-svn: 226577	2015-01-20 16:45:27 +00:00
Kai Nacke	63072f81b3	[mips] Add octeon branch instructions bbit0/bbit032/bbit1/bbit132 This commit adds the octeon branch instructions bbit0/bbit032/bbit1/bbit132. It also includes patterns for instruction selection and test cases. Reviewed by D. Sanders llvm-svn: 226573	2015-01-20 16:10:51 +00:00
Evgeniy Stepanov	c5b974e6d2	[msan] Optimize -msan-check-constant-shadow. The new code does not create new basic blocks in the case when shadow is a compile-time constant; it generates either an unconditional __msan_warning call or nothing instead. llvm-svn: 226569	2015-01-20 15:21:35 +00:00
Mohit K. Bhakkad	46ad7f7ec5	[MSan][LLVM][MIPS] Shadow and Origin offsets for MIPS Reviewers: kcc, samsonov, petarj, eugenis Differential Revision: http://reviews.llvm.org/D6146 llvm-svn: 226565	2015-01-20 13:05:42 +00:00
Craig Topper	9f4d485610	[x86] Add some mayLoad/hasSideEffects flags. Remove one that was already covered by a pattern. llvm-svn: 226562	2015-01-20 12:15:30 +00:00
Chandler Carruth	aaf0b4cd57	[PM] Port LoopInfo to the new pass manager, adding both a LoopAnalysis pass and a LoopPrinterPass with the expected associated wiring. I've added a RUN line to the only test case (!!!) we have that actually prints loops. Everything seems to be working. This is somewhat exciting as this is the first analysis using another analysis to go in for the new pass manager. =D I also believe it is the last analysis necessary for porting instcombine, but of course I may yet discover more. llvm-svn: 226560	2015-01-20 10:58:50 +00:00
Daniel Jasper	d106b734cf	Factor out a splitSwitchCase() function so that it can be reused. This is in preparation for a fix to llvm.org/PR22262. One of the ideas here is to first find a good jump table range and then split before and after it. Thereby, we don't need to use the split-based-on-density heuristic at all, which can make the "binary tree" deteriorate in various cases. Also some minor cleanups. No functional changes. llvm-svn: 226551	2015-01-20 08:57:44 +00:00
Chandler Carruth	5175b9a7b9	[PM] Move the LoopInfo analysis pointer into the InstCombiner class along with the other analyses. The most obvious reason why is because eventually I need to separate out the pass layer from the rest of the instcombiner. However, it is also probably a compile time win as every query through the pass manager layer is pretty slow these days. llvm-svn: 226550	2015-01-20 08:35:24 +00:00
Karthik Bhat	0b0f4660fa	Fix Operandreorder logic in SLPVectorizer to generate longer vectorizable chain. This patch fixes 2 issues in reorderInputsAccordingToOpcode 1) AllSameOpcodeLeft and AllSameOpcodeRight was being calculated incorrectly resulting in code not being vectorized in few cases. 2) Adds logic to reorder operands if we get longer chain of consecutive loads enabling vectorization. Handled the same for cases where we have AltOpcode. Thanks Michael for inputs and review. Review: http://reviews.llvm.org/D6677 llvm-svn: 226547	2015-01-20 06:11:00 +00:00
David Majnemer	3087b22e1a	Bitcode: Don't create comdats when autoupgrading macho bitcode Don't infer COMDAT groups from older bitcode if the target is macho, it doesn't have COMDATs. llvm-svn: 226546	2015-01-20 05:58:07 +00:00
Duncan P. N. Exon Smith	aa687a3d4c	Reapply "IR: Simplify DIBuilder's HeaderBuilder API, NFC" This reverts commit r226542, effectively reapplying r226540. This time, initialize `IsEmpty` in the copy and move constructors as well. llvm-svn: 226545	2015-01-20 05:02:42 +00:00
Duncan P. N. Exon Smith	5f39dfd429	Revert "IR: Simplify DIBuilder's HeaderBuilder API, NFC" This reverts commit r226540, since I hit an unexpected bot failure [1]. I'll investigate. [1]: http://bb.pgr.jp/builders/cmake-llvm-x86_64-linux/builds/20244 llvm-svn: 226542	2015-01-20 03:01:27 +00:00
Duncan P. N. Exon Smith	03e0583a2d	IR: Move MDNode clone() methods from ValueMapper to MDNode, NFC Now that the clone methods used by `MapMetadata()` don't do any remapping (and return a temporary), they make more sense as member functions on `MDNode` (and subclasses). llvm-svn: 226541	2015-01-20 02:56:57 +00:00
Duncan P. N. Exon Smith	8a07e7f657	IR: Simplify DIBuilder's HeaderBuilder API, NFC Change `HeaderBuilder` API to work well even when it's not starting with a tag. There's already one case like this, and the tag is moving elsewhere as part of PR22235. llvm-svn: 226540	2015-01-20 02:54:07 +00:00
Duncan P. N. Exon Smith	a7477285b9	AsmParser: PARSE_MD_FIELD() => ParseMDField(), NFC Extract most of `PARSE_MD_FIELD()` into a function. llvm-svn: 226539	2015-01-20 02:42:29 +00:00
Duncan P. N. Exon Smith	8839cb1dc8	AsmParser: Refactor duplicate code, NFC llvm-svn: 226538	2015-01-20 02:39:21 +00:00
Chandler Carruth	10f28f26fd	[PM] Replace the Pass argument in MergeBasicBlockIntoOnlyPred with a DominatorTree argument as that is the analysis that it wants to update. This removes the last non-loop utility function in Utils/ which accepts a raw Pass argument. llvm-svn: 226537	2015-01-20 01:37:09 +00:00
Duncan P. N. Exon Smith	408f5a25fa	IR: Delete GenericDwarfNode during teardown Fix a leak in `LLVMContextImpl` teardown that the leak sanitizer tracked down [1]. I've just switched to automatic dispatch here (since I'll inevitably forget again with the next class). [1]: http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-fast/builds/811/steps/check-llvm%20asan/logs/stdio llvm-svn: 226536	2015-01-20 01:18:32 +00:00
Duncan P. N. Exon Smith	db6bc8bfdf	Bitcode: Simplify MDNode subclass dispatch, NFC llvm-svn: 226535	2015-01-20 01:03:09 +00:00
Duncan P. N. Exon Smith	6592deeab2	Bitcode: WriteMDNode() => WriteMDTuple(), NFC llvm-svn: 226534	2015-01-20 01:01:53 +00:00
Duncan P. N. Exon Smith	9a6f64e7b8	Bitcode: Add ValueEnumerator::getMetadataOrNullID(), NFC llvm-svn: 226533	2015-01-20 01:00:23 +00:00
Duncan P. N. Exon Smith	2da09e4408	IR: Canonicalize GenericDwarfNode empty headers to null llvm-svn: 226532	2015-01-20 00:58:46 +00:00
Duncan P. N. Exon Smith	0f529998a5	IR: Detect whether to call recalculateHash() via SFINAE, NFC Rather than relying on updating switch statements correctly, detect whether `setHash()` exists in the subclass. If so, call `recalculateHash()` and `setHash(0)` appropriately. llvm-svn: 226531	2015-01-20 00:57:33 +00:00
Duncan P. N. Exon Smith	fed199a758	IR: Introduce GenericDwarfNode As part of PR22235, introduce `DwarfNode` and `GenericDwarfNode`. The former is a metadata node with a DWARF tag. The latter matches our current (generic) schema of a header with string (and stringified integer) data and an arbitrary number of operands. This doesn't move it into place yet; that change will require a large number of testcase updates. llvm-svn: 226529	2015-01-20 00:01:43 +00:00
Duncan P. N. Exon Smith	2a6b5fcfaa	AsmParser: Abstract more of MDLocation parser, NFC llvm-svn: 226527	2015-01-19 23:44:41 +00:00
Duncan P. N. Exon Smith	66ca92e509	AsmParser: Split up ParseMDFieldsImpl(), NFC llvm-svn: 226526	2015-01-19 23:39:32 +00:00
Duncan P. N. Exon Smith	13890af51c	AsmParser: Fix error location for missing fields llvm-svn: 226524	2015-01-19 23:32:36 +00:00
Duncan P. N. Exon Smith	909131b95f	IR: Cleanup MDNode field use, NFC Swap usage of `SubclassData32` and `MDNodeSubclassData`, and rename `MDNodeSubclassData` to `NumUnresolved`. Small drive-by cleanup to `countUnresolvedOperands()` since otherwise the name clash with local vars named `NumUnresolved` would be confusing. llvm-svn: 226523	2015-01-19 23:18:34 +00:00
Duncan P. N. Exon Smith	8647529250	IR: Move replaceWithUniqued(), etc., to source file, NFC llvm-svn: 226522	2015-01-19 23:17:09 +00:00
Duncan P. N. Exon Smith	a1ae4f6b30	IR: Cleanup MDNode::MDNode(), NFC llvm-svn: 226521	2015-01-19 23:15:21 +00:00
Duncan P. N. Exon Smith	2bc00f4a38	IR: Merge UniquableMDNode back into MDNode, NFC As pointed out in r226501, the distinction between `MDNode` and `UniquableMDNode` is confusing. When we need subclasses of `MDNode` that don't use all its functionality it might make sense to break it apart again, but until then this makes the code clearer. llvm-svn: 226520	2015-01-19 23:13:14 +00:00
Duncan P. N. Exon Smith	93e983e707	IR: Extract MDNodeOpsKey, NFC Make the MDTuple operand hashing logic reusable. llvm-svn: 226519	2015-01-19 22:53:18 +00:00
Duncan P. N. Exon Smith	f9d1bc9919	IR: Simplify uniquifyImpl(), NFC llvm-svn: 226518	2015-01-19 22:52:07 +00:00
Duncan P. N. Exon Smith	6cf10d2786	IR: Simplify erasing from uniquing store, NFC llvm-svn: 226517	2015-01-19 22:47:08 +00:00
Duncan P. N. Exon Smith	6dc22bf27b	Utils: Simplify MapMetadata(), NFC Extract out the operand remapping loops, which are now very similar. llvm-svn: 226515	2015-01-19 22:44:32 +00:00
Duncan P. N. Exon Smith	9fa10658ce	Skip upcast, NFC llvm-svn: 226514	2015-01-19 22:41:14 +00:00
Simon Pilgrim	20bc37c7db	[X86][AVX] Missing AVX1 memory folding float instructions Now that we can create much more exhaustive X86 memory folding tests, this patch adds the missing AVX1/F16C floating point instruction stack foldings we can easily test for including the scalar intrinsics (add, div, max, min, mul, sub), conversions float/int to double, half precision conversions, rounding, dot product and bit test. The patch also adds a couple of obviously missing SSE instructions (more to follow once we have full SSE testing). Now that scalar folding is working it broke a very old test (2006-10-07-ScalarSSEMiscompile.ll) - this test appears to make no sense as its trying to ensure that a scalar subtraction isn't folded as it 'would zero the top elts of the loaded vector' - this test just appears to be wrong to me. Differential Revision: http://reviews.llvm.org/D7055 llvm-svn: 226513	2015-01-19 22:40:45 +00:00
Duncan P. N. Exon Smith	c862be860d	Fix whitespace, NFC llvm-svn: 226512	2015-01-19 22:40:25 +00:00
Duncan P. N. Exon Smith	0dcffe2cdc	Utils: Simplify MapMetadata(), NFC Take advantage of the new ability of temporary nodes to mutate to distinct and uniqued nodes to greatly simplify the `MapMetadata()` helper functions. llvm-svn: 226511	2015-01-19 22:39:07 +00:00
Duncan P. N. Exon Smith	e33530909d	IR: Allow temporary nodes to become uniqued or distinct Add `MDNode::replaceWithUniqued()` and `MDNode::replaceWithDistinct()`, which mutate temporary nodes to become uniqued or distinct. On uniquing collisions, the unique version is returned and the node is deleted. This takes advantage of temporary nodes being folded back in, and should let me clean up some awkward logic in `MapMetadata()`. llvm-svn: 226510	2015-01-19 22:24:52 +00:00
Duncan P. N. Exon Smith	c5a0e2e3a7	IR: Split out countUnresolvedOperands(), NFC llvm-svn: 226508	2015-01-19 22:18:29 +00:00
Duncan P. N. Exon Smith	422e5c7acc	Cleanup whitespace, NFC llvm-svn: 226507	2015-01-19 22:16:01 +00:00
Duncan P. N. Exon Smith	7d82313bcd	IR: Return unique_ptr from MDNode::getTemporary() Change `MDTuple::getTemporary()` and `MDLocation::getTemporary()` to return (effectively) `std::unique_ptr<T, MDNode::deleteTemporary>`, and clean up call sites. (For now, `DIBuilder` call sites just call `release()` immediately.) There's an accompanying change in each of clang and polly to use the new API. llvm-svn: 226504	2015-01-19 21:30:18 +00:00
Rafael Espindola	2658554aec	Add r224985 back with fixes. The fixes are to note that AArch64 has additional restrictions on when local relocations can be used. In particular, ld64 requires that relocations to cstring/cfstrings use linker visible symbols. Original message: In an assembly expression like bar: .long L0 + 1 the intended semantics is that bar will contain a pointer one byte past L0. In sections that are merged by content (strings, 4 byte constants, etc), a single position in the section doesn't give the linker enough information. For example, it would not be able to tell a relocation must point to the end of a string, since that would look just like the start of the next. The solution used in ELF is to use relocation with symbols if there is a non-zero addend. In MachO before this patch we would just keep all symbols in some sections. This would miss some cases (only cstrings on x86_64 were implemented) and was inefficient since most relocations have an addend of 0 and can be represented without the symbol. This patch implements the non-zero addend logic for MachO too. llvm-svn: 226503	2015-01-19 21:11:14 +00:00
Duncan P. N. Exon Smith	946fdcc50c	IR: Remove MDNodeFwdDecl Remove `MDNodeFwdDecl` (as promised in r226481). Aside from API changes, there's no real functionality change here. `MDNode::getTemporary()` now forwards to `MDTuple::getTemporary()`, which returns a tuple with `isTemporary()` equal to true. The main point is that we can now add temporaries of other `MDNode` subclasses, needed for PR22235 (I introduced `MDNodeFwdDecl` in the first place because I didn't recognize this need, and thought they were only needed to handle forward references). A few things left out of (or highlighted by) this commit: - I've had to remove the (few) uses of `std::unique_ptr<>` to deal with temporaries, since the destructor is no longer public. `getTemporary()` should probably return the equivalent of `std::unique_ptr<T, MDNode::deleteTemporary>`. - `MDLocation::getTemporary()` doesn't exist yet (worse, it actually does exist, but does the wrong thing: `MDNode::getTemporary()` is inherited and returns an `MDTuple`). - `MDNode` now only has one subclass, `UniquableMDNode`, and the distinction between them is actually somewhat confusing. I'll fix those up next. llvm-svn: 226501	2015-01-19 20:36:39 +00:00
Colin LeMahieu	0ee02fc9fe	[Hexagon] Updating muxir/ri/ii intrinsics. Setting predicate registers as compatible with i32 rather than doing custom type conversion. llvm-svn: 226500	2015-01-19 20:31:18 +00:00
Duncan P. N. Exon Smith	5b8c440100	IR: Extract out and reuse `storeImpl()`, NFC llvm-svn: 226499	2015-01-19 20:18:13 +00:00
Duncan P. N. Exon Smith	b57f9e9735	IR: Extract out getUniqued(), NFC llvm-svn: 226498	2015-01-19 20:16:50 +00:00
Duncan P. N. Exon Smith	1b0064d0d2	IR: Reuse `getImpl()` for `getDistinct()`, NFC Merge `getDistinct()`'s implementation with those of `get()` and `getIfExists()` for both `MDTuple` and `MDLocation`. This will make it easier to scale to supporting temporaries. llvm-svn: 226497	2015-01-19 20:14:15 +00:00
Duncan P. N. Exon Smith	efdf285bbe	IR: Simplify MDNode::setOperand(), NFC llvm-svn: 226492	2015-01-19 19:29:25 +00:00
Duncan P. N. Exon Smith	3d5805685b	IR: Simplify handleChangedOperand() fast path, NFC Use `isUniqued()` instead of `isStoredDistinctInContext()`, and remove an assertion that won't be valid once temporaries are merged back in. llvm-svn: 226491	2015-01-19 19:28:28 +00:00
Duncan P. N. Exon Smith	b8f796031f	IR: Remove direct comparisons against Metadata::Storage, NFC llvm-svn: 226490	2015-01-19 19:26:24 +00:00
Duncan P. N. Exon Smith	f08b8b4be6	IR: Assert that resolve() is only called on uniqued nodes, NFC Add an assertion in `UniquableMDNode::resolve()` to prevent temporaries from being resolved (once they're merged back in). Needed to shuffle order of `resolve()` and `storeDistinctInContext()` to prevent it from firing. llvm-svn: 226489	2015-01-19 19:25:33 +00:00
Duncan P. N. Exon Smith	105acf7885	IR: Remove isa<UniquableMDNode>, NFC llvm-svn: 226488	2015-01-19 19:10:14 +00:00
Duncan P. N. Exon Smith	9b1c6d34e5	IR: Simplify DIBuilder::trackIfUnresolved(), NFC llvm-svn: 226487	2015-01-19 19:09:14 +00:00
Duncan P. N. Exon Smith	e34014d11c	IR: Remove isa<MDNodeFwdDecl>, NFC llvm-svn: 226486	2015-01-19 19:06:41 +00:00
Duncan P. N. Exon Smith	66ed52231f	IR: Unify code for MDNode::isResolved(), NFC Unify the definitions of `MDNode::isResolved()` and `UniquableMDNode::isResolved()`. Previously, `UniquableMDNode` could answer this question more efficiently, but now that RAUW support has been unified with `MDNodeFwdDecl`, `MDNode` doesn't need any casts to figure out the answer. llvm-svn: 226485	2015-01-19 19:03:18 +00:00
Duncan P. N. Exon Smith	2711ca7c28	IR: Store RAUW support and Context in the same pointer, NFC Add an `LLVMContext &` to `ReplaceableMetadataImpl`, create a class that either holds a reference to an `LLVMContext` or owns a `ReplaceableMetadataImpl`, and use the new class in `MDNode`. - This saves a pointer in `UniquableMDNode` at the cost of a pointer in `ValueAsMetadata` (which didn't used to store the `LLVMContext`). There are far more of the former. - Unifies RAUW support between `MDNodeFwdDecl` (which is going away, see r226481) and `UniquableMDNode`. llvm-svn: 226484	2015-01-19 19:02:06 +00:00
Colin LeMahieu	fcd4569af6	[Hexagon] Converting intrinsics combine imm/imm, simple shifts and extends. llvm-svn: 226483	2015-01-19 18:56:19 +00:00
Duncan P. N. Exon Smith	de03a8b38d	IR: Add isUniqued() and isTemporary() Change `MDNode::isDistinct()` to only apply to 'distinct' nodes (not temporaries), and introduce `MDNode::isUniqued()` and `MDNode::isTemporary()` for the other two possibilities. llvm-svn: 226482	2015-01-19 18:45:35 +00:00
Duncan P. N. Exon Smith	f134045365	IR: Use an enum to describe Metadata storage, NFC More clearly describe the type of storage used for `Metadata`. - `Uniqued`: uniqued, stored in the context. - `Distinct`: distinct, stored in the context. - `Temporary`: not owned by anyone. This is the first in a series of commits to fix a design problem with `MDNodeFwdDecl` that I need to solve for PR22235. While `MDNodeFwdDecl` works well as a forward declaration, we use `MDNode::getTemporary()` for more than forward declarations -- we also need to create early versions of nodes (with fields not filled in) that we'll fill out later (see `DIBuilder::finalize()` and `CGDebugInfo::finalize()` for examples). This was a blind spot I had when I introduced `MDNodeFwdDecl` (which David Blaikie (indirectly) highlighted in an unrelated review [1]). [1]: http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20150112/252381.html In general, we need `MDTuple::getTemporary()` to give a temporary tuple (like `MDNodeFwdDecl`), `MDLocation::getTemporary()` to give a temporary location, and (the problem at hand) `GenericDebugMDNode::getTemporary()` to give a temporary generic debug node. So I need to fold the idea of "temporary" nodes back into `UniquableMDNode`. (More commits to follow as I refactor.) llvm-svn: 226481	2015-01-19 18:36:18 +00:00
Colin LeMahieu	9327bdad2f	[Hexagon] Converting remaining ALU32/ALU intrinsics. llvm-svn: 226480	2015-01-19 18:33:58 +00:00
Colin LeMahieu	663419b008	[Hexagon] Converting ALU32/ALU intrinsics to new patterns. llvm-svn: 226478	2015-01-19 18:22:19 +00:00
Adrian Prantl	5883af3faa	Remove support for DIVariable's FlagIndirectVariable and expect frontends to use a DIExpression with a DW_OP_deref instead. This is not only a much more natural place for this information; there is also a technical reason: The FlagIndirectVariable is used to mark a variable that is turned into a reference by virtue of the calling convention; this happens for example to aggregate return values. The inliner, for example, may actually need to undo this indirection to correctly represent the value in its new context. This is impossible to implement because the DIVariable can't be safely modified. We can however safely construct a new DIExpression on the fly. llvm-svn: 226476	2015-01-19 17:57:29 +00:00
Greg Fitzgerald	fa78d08675	[AArch64] Implement GHC calling convention Original patch by Luke Iannini. Minor improvements and test added by Erik de Castro Lopo. Differential Revision: http://reviews.llvm.org/D6877 From: Erik de Castro Lopo <erikd@mega-nerd.com> llvm-svn: 226473	2015-01-19 17:40:05 +00:00
Colin LeMahieu	310bad8b7e	[Hexagon] Converting halfword to double accumulating multiply intrinsics. llvm-svn: 226472	2015-01-19 17:36:32 +00:00
Rafael Espindola	c569ac46eb	Produce errors when an assignment expression would use a common symbol. An assignment will produce a symbol with a given section and offset. There is no way to represent something like "1 byte after a common symbol". This matches the behavior of GNU as. Part of PR22217. llvm-svn: 226470	2015-01-19 17:30:24 +00:00
Bradley Smith	3131e85edd	[ARM] SSAT/USAT with an 'asr #32' shift should result in an undefined encoding rather than unpredictable llvm-svn: 226469	2015-01-19 16:37:17 +00:00
Bradley Smith	30057b245e	[ARM] Fixup sign extend instruction availability w.r.t. DSP extension llvm-svn: 226468	2015-01-19 16:36:02 +00:00
Rafael Espindola	12ca34f53f	Bring r226038 back. No change in this commit, but clang was changed to also produce trivial comdats when needed. Original message: Don't create new comdats in CodeGen. This patch stops the implicit creation of comdats during codegen. Clang now sets the comdat explicitly when it is required. With this patch clang and gcc now produce the same result in pr19848. llvm-svn: 226467	2015-01-19 15:16:06 +00:00
Chandler Carruth	d450056c78	[PM] Replace the Pass argument to SplitEdge with specific analyses used and updated. This may appear to remove handling for things like alias analysis when splitting critical edges here, but in fact no callers of SplitEdge relied on this. Similarly, all of them wanted to preserve LCSSA if there was any update of the loop info. That makes the interface much simpler. With this, all of BasicBlockUtils.h is free of Pass arguments and prepared for the new pass manager. This is the majority of utilities that relied on pass arguments. llvm-svn: 226459	2015-01-19 12:36:53 +00:00
Chandler Carruth	f8753fc48d	[PM] Cleanup a dead option to critical edge splitting that I noticed while refactoring this API for the new pass manager. No functionality changed here, the code didn't actually support this option. llvm-svn: 226457	2015-01-19 12:12:00 +00:00
Chandler Carruth	37df2cfbf8	[PM] Remove the Pass argument from all of the critical edge splitting APIs and replace it and numerous booleans with an option struct. The critical edge splitting API has a really large surface of flags and so it seems worth burning a small option struct / builder. This struct can be constructed with the various preserved analyses and then flags can be flipped in a builder style. The various users are now responsible for directly passing along their analysis information. This should be enough for the critical edge splitting to work cleanly with the new pass manager as well. This API is still pretty crufty and could be cleaned up a lot, but I've focused on this change just threading an option struct rather than a pass through the API. llvm-svn: 226456	2015-01-19 12:09:11 +00:00
Chandler Carruth	ad34d91343	[PM] Relax asserts and always try to reconstruct loop simplify form when we can while splitting critical edges. The only code which called this and didn't require simplified loops to be preserved is polly, and the code behaves correctly there anyways. Without this change, it becomes really hard to share this code with the new pass manager where things like preserving loop simplify form don't make any sense. If anyone discovers this code behaving incorrectly, what it should be testing for is whether the loops it needs to be in simplified form are in fact in that form. It should always be trying to preserve that form when it exists. llvm-svn: 226443	2015-01-19 10:23:00 +00:00
Erik Eckstein	76cb53a839	SLPVectorizer: limit the number of alias checks to reduce the runtime. In case of blocks with many memory-accessing instructions, alias checking can take a lot of time (because calculating the memory dependencies has quadratic complexity). I chose a limit which resulted in no changes when running the benchmarks. llvm-svn: 226439	2015-01-19 09:33:38 +00:00
Hal Finkel	c3168129af	[PowerPC] Minor correction to r226432 We don't need to exclude patchpoints from the implicit r2 dependence in FastISel because it is added as an implicit operand and, thus, should not confuse that StackMap code. By inspection / no test case. llvm-svn: 226434	2015-01-19 07:44:45 +00:00
Michael Kuperstein	54c61edee7	[MIScheduler] Slightly better handling of constrainLocalCopy when both source and dest are local This fixes PR21792. Differential Revision: http://reviews.llvm.org/D6823 llvm-svn: 226433	2015-01-19 07:30:47 +00:00
Hal Finkel	af51993ee1	[PowerPC] Add r2 as an operand for all calls under both PPC64 ELF V1 and V2 Our PPC64 ELF V2 call lowering logic added r2 as an operand to all direct call instructions in order to represent the dependency on the TOC base pointer value. Restricting this to ELF V2, however, does not seem to make sense: calls under ELF V1 have the same dependence, and indirect calls have an r2 dependence just as direct ones. Make sure the dependence is noted for all calls under both ELF V1 and ELF V2. llvm-svn: 226432	2015-01-19 07:20:27 +00:00
Craig Topper	f4bf9119a1	[x86] Change AVX512 intrinsics to take an 8-bit immediate for the comparison kind instead of a 32-bit immediate. This better aligns with the emitted instruction. It also matches SSE and AVX1 equivalents. Also add auto upgrade support. llvm-svn: 226430	2015-01-19 06:07:27 +00:00
Chandler Carruth	0eae112009	[PM] Lift the analyses into the interface for SplitLandingPadPredecessors and remove the Pass argument from its interface. Another step to the utilities being usable with both old and new pass managers. llvm-svn: 226426	2015-01-19 03:03:39 +00:00
David Blaikie	186db431c0	unique_ptrify the RelInfo parameter to TargetRegistry::createMCSymbolizer llvm-svn: 226416	2015-01-18 20:45:48 +00:00
David Blaikie	9459832ebd	std::unique_ptrify the MCStreamer argument to createAsmPrinter llvm-svn: 226414	2015-01-18 20:29:04 +00:00
Hal Finkel	58884f9fe6	[PowerPC] Don't hard-code R2 as register when processing TOC relocations Instructions that have high-order TOC relocations always carry R2 as their base register, so it does not matter whether we take the register from the instruction or just hard-code it in PPCAsmPrinter. In the future, however, we might want to apply these relocations to instructions using a different register, so taking the register from the instruction is a better thing to do. No change in functionality here, however. llvm-svn: 226403	2015-01-18 15:59:44 +00:00
Hal Finkel	8ea446b6a4	[PowerPC] Add some FIXMEs for fastcc and FPR <-> GPR moves So we don't forget, once we support FPR <-> GPR moves on the P8, we'll likely want to re-visit this part of the calling convention. llvm-svn: 226401	2015-01-18 14:31:10 +00:00
Hal Finkel	f81b6dd7a2	[PowerPC] Initial PPC64 calling-convention changes for fastcc The default calling convention specified by the PPC64 ELF (V1 and V2) ABI is designed to work with both prototyped and non-prototyped/varargs functions. As a result, GPRs and stack space are allocated for every argument, even those that are passed in floating-point or vector registers. GlobalOpt::OptimizeFunctions will transform local non-varargs functions (that do not have their address taken) to use the 'fast' calling convention. When functions are using the 'fast' calling convention, don't allocate GPRs for arguments passed in other types of registers, and don't allocate stack space for arguments passed in registers. Other changes for the fast calling convention may be added in the future. llvm-svn: 226399	2015-01-18 12:08:47 +00:00
Chandler Carruth	b5797b659f	[PM] Pull the analyses used for another utility routine into its API rather than relying on the pass object. This one is a bit annoying, but will pay off. First, supporting this one will make the next one much easier, and for utilities like LoopSimplify, this is moving them (slowly) closer to not having to pass the pass object around throughout their APIs. llvm-svn: 226396	2015-01-18 09:21:15 +00:00
Chandler Carruth	32c52c7e04	[PM] Sink the specific analyses preserved by SplitBlock into its interface, removing Pass from its interface. This also makes those analyses optional so that passes which don't even preserve these (or use them) can skip the logic entirely. llvm-svn: 226394	2015-01-18 02:39:37 +00:00
Chandler Carruth	b5c115357c	[PM] Replace another Pass argument with specific analyses that are optionally updated by MergeBlockIntoPredecessors. No functionality changed, just refactoring to clear the way for the new pass manager. llvm-svn: 226392	2015-01-18 02:11:23 +00:00
Chandler Carruth	94209094a5	[PM] Refactor how the LoopRotation pass access the DominatorTree. Instead of querying the pass every where we need to, do that once and cache a pointer in the pass object. This is both simpler and I'm about to add yet another place where we need to dig out that pointer. llvm-svn: 226391	2015-01-18 02:08:05 +00:00
Chandler Carruth	5eee895ccf	[PM] Lift the actual analyses used into the inferface rather than accepting a Pass and querying it for analyses. This is necessary to allow the utilities to work both with the old and new pass managers, and I also think this makes the interface much more clear and helps the reader know what analyses the utility can actually handle. I plan to repeat this process iteratively to clean up all the pass utilities. llvm-svn: 226386	2015-01-18 01:45:07 +00:00
Chandler Carruth	691addc25f	[PM] Now that LoopInfo isn't in the Pass type hierarchy, it is much cleaner to derive from the generic base. This removes a ton of boilerplate code and somewhat strange and pointless indirections. It also removes a bunch of the previously needed friend declarations. To fully remove these, I also lifted the verify logic into the generic LoopInfoBase, which seems good anyways -- it is generic and useful logic even for the machine side. llvm-svn: 226385	2015-01-18 01:25:51 +00:00
Chandler Carruth	bc045a5a33	[PM] Cleanup more warnings my refactoring exposed where now we have unused variables in a no-asserts build. I've fixed this by putting the entire loop behind an #ifndef as it contains nothing other than asserts. llvm-svn: 226377	2015-01-17 14:49:23 +00:00
Chandler Carruth	24fd029a60	[PM] Remove a dead field. This was dead even before I refactored how we initialized it, but my refactoring made it trivially dead and it is now caught by a Clang warning. This fixes the warning and should clean up the -Werror bot failures (sorry!). llvm-svn: 226376	2015-01-17 14:31:35 +00:00
Chandler Carruth	4f8f307c77	[PM] Split the LoopInfo object apart from the legacy pass, creating a LoopInfoWrapperPass to wire the object up to the legacy pass manager. This switches all the clients of LoopInfo over and paves the way to port LoopInfo to the new pass manager. No functionality change is intended with this iteration. llvm-svn: 226373	2015-01-17 14:16:18 +00:00
Hal Finkel	c19805a75d	[PowerPC] Don't list R11 as a patchpoint scratch register R11's status is the same under both the PPC64 ELF V1 and V2 ABIs: it is reserved for use as an "environment pointer" for compilation models that require such a thing. We don't, we also don't need a second scratch register, and because we support only "local" patchpoint call targets, we might as well let R11 be used for anyregcc patchpoints. llvm-svn: 226369	2015-01-17 03:57:34 +00:00
Mehdi Amini	37f316afaf	Improve DAG combine pass on certain IR vector patterns Loading 2 2x32-bit float vectors into the bottom half of a 256-bit vector produced suboptimal code in AVX2 mode with certain IR combinations. In particular, the IR optimizer folded 2f32 + 2f32 -> 4f32, 4f32 + 4f32 (undef) -> 8f32 into a 2f32 + 2f32 -> 8f32, which seems more canonical, but then mysteriously generated rather bad code; the movq/movhpd combination didn't match. The problem lay in the BUILD_VECTOR optimization path. The 2f32 inputs would get promoted to 4f32 by the type legalizer, eventually resulting in a BUILD_VECTOR on two 4f32 into an 8f32. The BUILD_VECTOR then, recognizing these were both half the output size, concatted them and then produced a shuffle. However, the resulting concat + shuffle was more complex than it should be; in the case where the upper half of the output is undef, we probably want to generate shuffle + concat instead. This enhancement causes the vector_shuffle combine step to recognize this suboptimal pattern and correct it. I included it there instead of in BUILD_VECTOR in case the same suboptimal pattern occurs for other reasons. This results in the optimizer correctly producing the optimal movq + movhpd sequence for all three variations on this IR, even with AVX2. I've included a test case. Radar link: rdar://problem/19287012 Fix for PR 21943. From: Fiona Glaser <fglaser@apple.com> llvm-svn: 226360	2015-01-17 01:35:56 +00:00
Lang Hames	2996895f28	[RuntimeDyld] Tidy up emitCommonSymbols a little. NFC. llvm-svn: 226358	2015-01-17 00:55:05 +00:00
Richard Trieu	73d06526ba	Remove std::move that was preventing return value optimization. llvm-svn: 226356	2015-01-17 00:46:44 +00:00
Matthias Braun	7618b2b23d	RegisterCoalescer: Cleanup and improved comment for a subtle detail. llvm-svn: 226353	2015-01-17 00:33:13 +00:00
Matthias Braun	0eb940aed0	RegisterCoalescer: Cleanup by factoring out a common expression llvm-svn: 226352	2015-01-17 00:33:11 +00:00
Matthias Braun	e2fa081615	RegisterCoalescer: Cleanup comment style - Consistently put comments above the function declaration, not the definition. To achieve this some duplicate comments got merged and some comment parts describing implementation details got moved into their functions. - Consistently use doxygen comments above functions. - Do not use doxygen comments inside functions. llvm-svn: 226351	2015-01-17 00:33:09 +00:00
Matthias Braun	fc6ef3a270	RegisterCoalescer: Drive-by typo + whitespace fix llvm-svn: 226350	2015-01-17 00:33:06 +00:00
Lang Hames	1f7eab338f	[RuntimeDyld] Remove the brace initialization that was introduced in r226341. Evidently MSVC doesn't like it. llvm-svn: 226349	2015-01-17 00:32:56 +00:00
Philip Reames	287987ca13	Update a comment Be a bit more explicit about the fact that addrspace(1) is not reserved. llvm-svn: 226344	2015-01-16 23:21:07 +00:00
Philip Reames	36319538d0	clang-format all the GC related files (NFC) Nothing interesting here... llvm-svn: 226342	2015-01-16 23:16:12 +00:00
Lang Hames	6bfd398022	[RuntimeDyld] Track symbol visibility in RuntimeDyld. RuntimeDyld symbol info previously consisted of just a Section/Offset pair. This patch replaces that pair type with a SymbolInfo class that also tracks symbol visibility. A new method, RuntimeDyld::getExportedSymbolLoadAddress, is introduced which only returns a non-zero result for exported symbols. For non-exported or nonexistent symbols this method will return zero. The RuntimeDyld::getSymbolAddress method retains its current behavior, returning non-zero results for all symbols regardless of visibility. No in-tree clients of RuntimeDyld are changed. The newly introduced functionality will be used by the Orc APIs. No test case: Since this patch doesn't modify the behavior for any in-tree clients we don't have a good tool to test this with yet. Once Orc is in we can use it to write regression tests that test these changes. llvm-svn: 226341	2015-01-16 23:13:56 +00:00
Kevin Enderby	c1271893af	Fix the Archive::Child::getRawSize() method used by llvm-objdump’s -archive-headers option and tweak its use in llvm-objdump. Add back the test case for the -archive-headers option. llvm-svn: 226332	2015-01-16 22:10:36 +00:00
Colin LeMahieu	823415b881	[Hexagon] Converting halfword to doubleword multiply intrinsics. llvm-svn: 226326	2015-01-16 21:41:57 +00:00
Colin LeMahieu	cd9b276966	[Hexagon] Converting accumulating halfword multiply intrinsics to patterns. llvm-svn: 226324	2015-01-16 21:36:34 +00:00
Colin LeMahieu	3b047e0ee5	[Hexagon] Beginning converting intrinsics to patterns instead of duplicated definitions. Converting halfword multiply intrinsics. llvm-svn: 226318	2015-01-16 20:38:54 +00:00
Colin LeMahieu	54adb6a5d5	[Hexagon] Fix 226309, replacement atomic store patterns didn't actually exist, added new versions. llvm-svn: 226315	2015-01-16 20:16:14 +00:00
Saleem Abdulrasool	c3f8ad3e83	X86: fix comment typo in AsmParser Fix a typo. NFC. llvm-svn: 226313	2015-01-16 20:16:06 +00:00
Philip Reames	2b45395876	Move ownership of GCStrategy objects to LLVMContext Note: This change ended up being slightly more controversial than expected. Chandler has tentatively okayed this for the moment, but I may be revisiting this in the near future after we settle some high level questions. Rather than have the GCStrategy object owned by the GCModuleInfo - which is an immutable analysis pass used mainly by gc.root - have it be owned by the LLVMContext. This simplifies the ownership logic (i.e. can you have two instances of the same strategy at once?), but more importantly, allows us to access the GCStrategy in the middle end optimizer. To this end, I add an accessor through Function which becomes the canonical way to get at a GCStrategy instance. In the near future, this will allow me to move some of the checks from http://reviews.llvm.org/D6808 into the Verifier itself, and to introduce optimization legality predicates for some of the recent additions to InstCombine. (These will follow as separate changes.) Differential Revision: http://reviews.llvm.org/D6811 llvm-svn: 226311	2015-01-16 20:07:33 +00:00
Colin LeMahieu	bb6718b30e	[Hexagon] Removing old duplicate atomic load/store patterns. llvm-svn: 226309	2015-01-16 19:53:35 +00:00
Philip Reames	7de640a876	Remove gc.root's findCustomSafePoints mechanism Searching all of the existing gc.root implementations I'm aware of (all three of them), there was exactly one use of this mechanism, and that was to implement a performance improvement that should have been applied to the default lowering. Having this function is requiring a dependency on a CodeGen class (MachineFunction), in a class which is otherwise completely independent of CodeGen. I could solve this differently, but given that I see absolutely no value in preserving this mechanism, I'm going to just get rid of it. Note: This is the first time I'm intentionally breaking previously supported gc.root functionality. Given 3.6 has branched, I believe this is a good time to do this. Differential Revision: http://reviews.llvm.org/D7004 llvm-svn: 226305	2015-01-16 19:33:28 +00:00
Colin LeMahieu	7d1f632380	[Hexagon] Converting old patterns to new versions using classes. llvm-svn: 226304	2015-01-16 19:29:59 +00:00
Adam Nemet	3e8b22bc1b	[AVX512] Add intrinsics for masked aligned FP loads and stores Similar to the unaligned cases. Test was generated with update_llc_test_checks.py. Part of <rdar://problem/17688758> llvm-svn: 226296	2015-01-16 18:50:09 +00:00
Duncan P. N. Exon Smith	2f5bb31302	IR: Allow 16-bits for column info Raise the limit for column information from 8 bits to 16 bits. llvm-svn: 226291	2015-01-16 17:33:08 +00:00
Duncan P. N. Exon Smith	c9cddb0837	IR: Cleanup dead code, NFC Line/column fixups already exist in `MDLocation`. Delete the duplicated logic in `DebugLoc`. llvm-svn: 226290	2015-01-16 17:31:29 +00:00
Colin LeMahieu	2e3a26de0c	[Hexagon] Updating call/jump instruction patterns. llvm-svn: 226288	2015-01-16 17:05:27 +00:00
Andrea Di Biagio	ae47bc6ab9	[X86][DAG] Disable target specific combine on INSERTPS dag nodes at -O0. This patch disables target specific combine on X86ISD::INSERTPS dag nodes if optlevel is CodeGenOpt::None. The backend currently implements a target specific combine rule that converts a vector load used by an INSERTPS dag node into a scalar load plus a scalar_to_vector. This allows ISel to select a single INSERTPSrm instead of two instructions (i.e. a vector load plus INSERTPSrr). However, the existing target combine rule on INSERTPS nodes only works under the assumption that ISel will always be able to match an INSERTPSrm. This is not true in general at -O0, since the backend only allows folding a load into the memory operand of an instruction if the optimization level is not CodeGenOpt::None. In the example below: // __m128 test(__m128 a, __m128 b) { __m128 c = _mm_insert_ps(a, b, 1 << 6); return c; } // Before this patch, at -O0, the backend would have canonicalized the load to 'b' into a scalar load plus scalar_to_vector. Later on, ISel would have selected an INSERTPSrr leaving the insertps mask in an inconsistent state: movss 4(%rdi), %xmm1 insertps $64, %xmm1, %xmm0 # xmm0 = xmm1[1],xmm0[1,2,3]. With this patch, the backend avoids folding the vector load into the operand of the INSERTPS. The new codegen at -O0 is: movaps (%rdi), %xmm1 insertps $64, %xmm1, %xmm0 # %xmm1[1],xmm0[1,2,3]. llvm-svn: 226277	2015-01-16 14:55:26 +00:00
Toma Tabacu	f476200c63	[mips] Remove a redundant semicolon and add space before curly brackets. NFC. llvm-svn: 226269	2015-01-16 10:45:15 +00:00
Timur Iskhodzhanov	60b721363c	Revert r226242 - Revert Revert Don't create new comdats in CodeGen This breaks AddressSanitizer (ninja check-asan) on Windows llvm-svn: 226251	2015-01-16 08:38:45 +00:00
Hal Finkel	52f7c018d3	[PowerPC] Adjust PatchPoints for ppc64le Bill Schmidt pointed out that some adjustments would be needed to properly support powerpc64le (using the ELF V2 ABI). For one thing, R11 is not available as a scratch register, so we need to use R12. R12 is also available under ELF V1, so to maintain consistency, I flipped the order to make R12 the first scratch register in the array under both ABIs. llvm-svn: 226247	2015-01-16 04:40:58 +00:00
Mehdi Amini	590a2700fc	Fix Reassociate handling of constant in presence of undef float http://reviews.llvm.org/D6993 llvm-svn: 226245	2015-01-16 03:00:58 +00:00
Rafael Espindola	67a79e72f5	Revert "Revert Don't create new comdats in CodeGen" This reverts commit r226173, adding r226038 back. No change in this commit, but clang was changed to also produce trivial comdats for constructors, destructors and vtables when needed. Original message: Don't create new comdats in CodeGen. This patch stops the implicit creation of comdats during codegen. Clang now sets the comdat explicitly when it is required. With this patch clang and gcc now produce the same result in pr19848. llvm-svn: 226242	2015-01-16 02:22:55 +00:00
Sanjoy Das	a1837a342d	Add a new pass "inductive range check elimination" IRCE eliminates range checks of the form 0 <= A * I + B < Length by splitting a loop's iteration space into three segments in a way that the check is completely redundant in the middle segment. As an example, IRCE will convert len = < known positive > for (i = 0; i < n; i++) { if (0 <= i && i < len) { do_something(); } else { throw_out_of_bounds(); } } to len = < known positive > limit = smin(n, len) // no first segment for (i = 0; i < limit; i++) { if (0 <= i && i < len) { // this check is fully redundant do_something(); } else { throw_out_of_bounds(); } } for (i = limit; i < n; i++) { if (0 <= i && i < len) { do_something(); } else { throw_out_of_bounds(); } } IRCE can deal with multiple range checks in the same loop (it takes the intersection of the ranges that will make each of them redundant individually). Currently IRCE does not do any profitability analysis. That is a TODO. Please note that the status of this pass is experimental, and it is not part of any default pass pipeline. Having said that, I will love to get feedback and general input from people interested in trying this out. This pass was originally r226201. It was reverted because it used C++ features not supported by MSVC 2012. Differential Revision: http://reviews.llvm.org/D6693 llvm-svn: 226238	2015-01-16 01:03:22 +00:00
Kevin Enderby	a975d4df1d	This should fix the build bot clang-cmake-armv7-a15-full failing on the macho-archive-headers.test added with r226228. llvm-svn: 226232	2015-01-16 00:27:31 +00:00
Matt Arsenault	eeb2a7e688	R600/SI: Add patterns for v_cvt_{flr\|rpi}_i32_f32 llvm-svn: 226230	2015-01-15 23:58:35 +00:00
Filipe Cabecinhas	c552c9abce	Fix edge case when Start overflowed in 32 bit mode llvm-svn: 226229	2015-01-15 23:50:44 +00:00
Kevin Enderby	13023a1af6	Add the option, -archive-headers, used with -macho to print the Mach-O archive headers to llvm-objdump. llvm-svn: 226228	2015-01-15 23:19:11 +00:00
Matt Arsenault	268757ba60	R600/SI: Fix trailing comma with modifiers Instructions with 1 operand can still use source modifiers, so make sure we don't print an extra comma afterwards. llvm-svn: 226226	2015-01-15 23:17:03 +00:00
Colin LeMahieu	cd9c4e3e07	[Hexagon] Adding new-value store and bit reverse instructions. llvm-svn: 226224	2015-01-15 23:10:29 +00:00
Filipe Cabecinhas	4013950034	Report fatal errors instead of segfaulting/asserting on a few invalid accesses while reading MachO files. Summary: Shift an older “invalid file” test to get a consistent naming for these tests. Bugs found by afl-fuzz Reviewers: rafael Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D6945 llvm-svn: 226219	2015-01-15 22:52:38 +00:00
Lang Hames	7e0692b614	[Object] Add SF_Exported flag. This flag will be set on all symbols that would be exported from a dylib if their containing object file were linked into one. No test case: No command line tools query this flag, and there are no Object unit tests. llvm-svn: 226217	2015-01-15 22:33:30 +00:00
Sanjoy Das	7f62ac8e4d	Revert r226201 (Add a new pass "inductive range check elimination") The change used C++11 features not supported by MSVC 2012. I will fix the change to use things supported by MSVC 2012 and recommit shortly. llvm-svn: 226216	2015-01-15 22:18:10 +00:00
David Majnemer	f1f72c9e43	InductiveRangeCheckElimination: Remove extra ';' This silences a GCC warning. llvm-svn: 226215	2015-01-15 21:55:16 +00:00
Andrew Kaylor	204096b59e	Fixing pedantic build warnings. llvm-svn: 226214	2015-01-15 21:50:53 +00:00
Colin LeMahieu	c59328e627	[Hexagon] Fix 226206 by uncommenting required pattern and changing patterns for simple load-extends. llvm-svn: 226210	2015-01-15 21:35:49 +00:00
Hal Finkel	e2ab0f17cf	[PowerPC] Loosen ELFv1 PPC64 func descriptor loads for indirect calls Function pointers under PPC64 ELFv1 (which is used on PPC64/Linux on the POWER7, A2 and earlier cores) are really pointers to a function descriptor, a structure with three pointers: the actual pointer to the code to which to jump, the pointer to the TOC needed by the callee, and an environment pointer. We used to chain these loads, and make them opaque to the rest of the optimizer, so that they'd always occur directly before the call. This is not necessary, and in fact, highly suboptimal on embedded cores. Once the function pointer is known, the loads can be performed ahead of time; in fact, they can be hoisted out of loops. Now these function descriptors are almost always generated by the linker, and thus the contents of the descriptors are invariant. As a result, by default, we'll mark the associated loads as invariant (allowing them to be hoisted out of loops). I've added a target feature to turn this off, however, just in case someone needs that option (constructing an on-stack descriptor, casting it to a function pointer, and then calling it cannot be well-defined C/C++ code, but I can imagine some JIT-compilation system doing so). Consider this simple test: $ cat call.c typedef void (fp)(); void bar(fp x) { for (int i = 0; i < 1600000000; ++i) x(); } $ cat main.c typedef void (fp)(); void bar(fp x); void foo() {} int main() { bar(foo); } On the PPC A2 (the BG/Q supercomputer), marking the function-descriptor loads as invariant brings the execution time down to ~8 seconds from ~32 seconds with the loads in the loop. The difference on the POWER7 is smaller. Compiling with: gcc -std=c99 -O3 -mcpu=native call.c main.c : ~6 seconds [this is 4.8.2] clang -O3 -mcpu=native call.c main.c : ~5.3 seconds clang -O3 -mcpu=native call.c main.c -mno-invariant-function-descriptors : ~4 seconds (looks like we'd benefit from additional loop unrolling here, as a first guess, because this is faster with the extra loads) The -mno-invariant-function-descriptors will be added to Clang shortly. llvm-svn: 226207	2015-01-15 21:17:34 +00:00
Colin LeMahieu	f87697f05e	[Hexagon] Updating indexed load-extend patterns and changing test to new expected output. llvm-svn: 226206	2015-01-15 21:07:52 +00:00
Sanjoy Das	7059e2959d	Add a new pass "inductive range check elimination" IRCE eliminates range checks of the form 0 <= A * I + B < Length by splitting a loop's iteration space into three segments in a way that the check is completely redundant in the middle segment. As an example, IRCE will convert len = < known positive > for (i = 0; i < n; i++) { if (0 <= i && i < len) { do_something(); } else { throw_out_of_bounds(); } } to len = < known positive > limit = smin(n, len) // no first segment for (i = 0; i < limit; i++) { if (0 <= i && i < len) { // this check is fully redundant do_something(); } else { throw_out_of_bounds(); } } for (i = limit; i < n; i++) { if (0 <= i && i < len) { do_something(); } else { throw_out_of_bounds(); } } IRCE can deal with multiple range checks in the same loop (it takes the intersection of the ranges that will make each of them redundant individually). Currently IRCE does not do any profitability analysis. That is a TODO. Please note that the status of this pass is experimental, and it is not part of any default pass pipeline. Having said that, I will love to get feedback and general input from people interested in trying this out. Differential Revision: http://reviews.llvm.org/D6693 llvm-svn: 226201	2015-01-15 20:45:46 +00:00
Hal Finkel	5ef58eb86d	Revert "r226086 - Revert "r226071 - [RegisterCoalescer] Remove copies to reserved registers"" Reapply r226071 with fixes. Two fixes: 1. We need to manually remove the old and create the new 'dead defs' associated with physical register definitions when we move the definition of the physical register from the copy point to the point of the original vreg def. This problem was picked up by the machinstr verifier, and could trigger a verification failure on test/CodeGen/X86/2009-02-12-DebugInfoVLA.ll, so I've turned on the verifier in the tests. 2. When moving the def point of the phys reg up, we need to make sure that it is neither defined nor read in between the two instructions. We don't, however, extend the live ranges of phys reg defs to cover uses, so just checking for live-range overlap between the pair interval and the phys reg aliases won't pick up reads. As a result, we manually iterate over the range and check for reads. A test soon to be committed to the PowerPC backend will test this change. Original commit message: [RegisterCoalescer] Remove copies to reserved registers This allows the RegisterCoalescer to join "non-flipped" range pairs with a physical destination register -- which allows the RegisterCoalescer to remove copies like this: <vreg> = something (maybe a load, for example) ... (things that don't use PHYSREG) PHYSREG = COPY <vreg> (with all of the restrictions normally applied by the RegisterCoalescer: having compatible register classes, etc. ) Previously, the RegisterCoalescer handled only the opposite case (copying from a physical register). I don't handle the problem fully here, but try to get the common case where there is only one use of <vreg> (the COPY). An upcoming commit to the PowerPC backend will make this pattern much more common on PPC64/ELF systems. llvm-svn: 226200	2015-01-15 20:32:09 +00:00
Philip Reames	66c9fb0d52	Style cleanup of old gc.root lowering code Use static functions for helpers rather than static member functions. a) this changes the linking (minor at best), and b) this makes it obvious no object state is involved. llvm-svn: 226198	2015-01-15 19:49:25 +00:00
Philip Reames	b87144160e	clang-format GCStrategy.cpp & GCRootLowering.cpp (NFC) llvm-svn: 226196	2015-01-15 19:39:17 +00:00
Philip Reames	f27f373895	Split GCStrategy.cpp into two files (NFC) This is preparation for an update to http://reviews.llvm.org/D6811. GCStrategy.cpp will hopefully be moving into IR/, whereas the lowering logic needs to stay in CodeGen/ llvm-svn: 226195	2015-01-15 19:29:42 +00:00
Colin LeMahieu	538b85810c	[Hexagon] Removing old versions of vsplice, valign, cl0, ct0 and updating references to new versions. llvm-svn: 226194	2015-01-15 19:28:32 +00:00
Marek Olsak	f0b130ace0	R600/SI: Unify VOP2 instructions which are VOP3-only on VI This removes some duplicated classes and definitions. These instructions are defined: _e32 // pseudo _e32_si _e64 // pseudo _e64_si _e64_vi llvm-svn: 226191	2015-01-15 18:43:06 +00:00
Marek Olsak	c536850526	R600/SI: Use 64-bit encoding by default for opcodes that are VOP3-only on VI llvm-svn: 226190	2015-01-15 18:43:01 +00:00
Marek Olsak	15e4a59899	R600/SI: Add V_READLANE_B32 and V_WRITELANE_B32 for VI These are VOP3-only on VI. The new multiclass doesn't define VOP3 versions of VOP2 instructions. llvm-svn: 226189	2015-01-15 18:42:55 +00:00


76152 Commits