llvm-project

Commit Graph

Author	SHA1	Message	Date
Duncan P. N. Exon Smith	14cc94c1c6	Utils: Separate out mapDistinctNode(), NFC llvm-svn: 225902	2015-01-14 01:03:05 +00:00
Duncan P. N. Exon Smith	3956a85e6e	Utils: Use helper function directly, NFC llvm-svn: 225901	2015-01-14 01:02:17 +00:00
Adrian Prantl	7813d9c979	Debug Info: Implement DwarfCompileUnit::addComplexAddress() using DIEDwarfExpression (and get rid of a bunch of redundant code). NFC llvm-svn: 225900	2015-01-14 01:01:30 +00:00
Adrian Prantl	ad768c3719	Debug Info: Emitting a register in DwarfExpression may fail. Report the status in a bool and let the users deal with the error. NFC. llvm-svn: 225899	2015-01-14 01:01:28 +00:00
Adrian Prantl	658676c3ea	Debug Info: Move DIEDwarfExpression into DwarfExpression.h because it needs to be accessed from both DwarfCompileUnit.cpp and DwarfUnit.cpp. NFC. llvm-svn: 225898	2015-01-14 01:01:22 +00:00
Duncan P. N. Exon Smith	077affdbb9	Utils: Extract helper function, NFC llvm-svn: 225897	2015-01-14 01:01:19 +00:00
Duncan P. N. Exon Smith	34651ee2f6	Utils: Use MDTuple::get() directly, NFC Working towards supporting `MDLocation` in `MapMetadata()`. llvm-svn: 225896	2015-01-14 00:59:57 +00:00
Ahmed Bougacha	71d7b18e3d	[SimplifyLibCalls] Don't try to simplify indirect calls. It turns out, all callsites of the simplifier are guarded by a check for CallInst::getCalledFunction (i.e., to make sure the callee is direct). This check wasn't done when trying to further optimize a simplified fortified libcall, introduced by a refactoring in r225640. Fix that, add a testcase, and document the requirement. llvm-svn: 225895	2015-01-14 00:55:05 +00:00
Eric Christopher	16370678e3	Remove unused predicate. llvm-svn: 225893	2015-01-14 00:50:33 +00:00
Eric Christopher	6e30cd95cb	Migrate ABIName to MCTargetOptions so that it can be shared between the TargetMachine level and the MC level. llvm-svn: 225891	2015-01-14 00:50:31 +00:00
Chandler Carruth	11f5032368	Revert r225854: [PM] Move the LazyCallGraph printing functionality to a print method. This was formulated on a bad idea, but sadly I didn't uncover how bad this was until I got further down the path. I had hoped that we could provide a low boilerplate way of printing analyses, but it just doesn't seem like this really fits the needs of the analyses. Not all analyses really want to do printing, and those that do don't all use the same interface. Instead, with the new pass manager let's just take advantage of the fact that creating an explicit printer pass like the LCG has is pretty low boilerplate already and rely on that for testing. llvm-svn: 225861	2015-01-14 00:27:45 +00:00
Adrian Prantl	8efadbf868	Debug Info: Don't bother emitting DW_AT_frame_base if the function has no frame register. "Tested" via an assertion triggered by DwarfExpression. llvm-svn: 225858	2015-01-14 00:15:16 +00:00
Adrian Prantl	1411577ad9	Revert "Debug Info: Bail out of AddMachineRegPiece() if MachineReg is not a" This reverts commit r225852, it was a bad idea. MachineReg should always be a physical register. If it isn't this DebugLoc shouldn't have been created in the first place. llvm-svn: 225857	2015-01-14 00:15:12 +00:00
Chandler Carruth	76890d82c0	[PM] Move the LazyCallGraph printing functionality to a print method. I'm adding generic analysis printing utility pass support which will require such a method (or a specialization) so this will let the existing printing logic satisfy that. llvm-svn: 225854	2015-01-13 23:53:50 +00:00
Adrian Prantl	e8e0bac270	Debug Info: Bail out of AddMachineRegPiece() if MachineReg is not a physical register. The call to getMinimalPhysRegClass() later on asserts on this condition. llvm-svn: 225852	2015-01-13 23:39:15 +00:00
Adrian Prantl	092d9489ed	Debug Info: Move the complex expression handling (=the remainder) of emitDebugLocValue() into DwarfExpression. Ought to be NFC, but it actually uncovered a bug in the debug-loc-asan.ll testcase. The testcase checks that the address of variable "y" is stored at [RSP+16], which also lines up with the comment. It also check(ed) that the value of "y" is stored in RDI before that, but that is actually incorrect, since RDI is the very value that is stored in [RSP+16]. Here's the assembler output: movb 2147450880(%rcx), %r8b #DEBUG_VALUE: bar:y <- RDI cmpb $0, %r8b movq %rax, 32(%rsp) # 8-byte Spill movq %rsi, 24(%rsp) # 8-byte Spill movq %rdi, 16(%rsp) # 8-byte Spill .Ltmp3: #DEBUG_VALUE: bar:y <- [RSP+16] Fixed the comment to spell out the correct register and the check to expect an address rather than a value. Note that the range that is emitted for the RDI location was and is still wrong, it claims to begin at the function prologue, but really it should start where RDI is first assigned. llvm-svn: 225851	2015-01-13 23:39:11 +00:00
Adrian Prantl	0a3bfdbd37	cleanup. llvm-svn: 225848	2015-01-13 23:11:51 +00:00
Adrian Prantl	172ab66a11	Document, cleanup, and clang-format DwarfExpression.h llvm-svn: 225847	2015-01-13 23:11:07 +00:00
Adrian Prantl	8995f5c92f	Debug Info: Turn DIExpression::getFrameRegister() into an isFrameRegister() function. NFC. llvm-svn: 225846	2015-01-13 23:10:43 +00:00
Tom Stellard	fb77f00be8	R600/SI: Add pattern for bitcasting fp immediates to integers The backend now assumes that all immediates are integers. This allows us to simplify immediate handling code, because we no longer need to handle fp and integer immediates differently. llvm-svn: 225844	2015-01-13 22:59:41 +00:00
Chandler Carruth	703378f156	[PM] Remove the defunct CGSCC-specific debug flag. Even before I sunk the debug flag into the opt tool this had been made obsolete by factoring the pass and analysis managers into a single set of templates that all used the core flag. No functionality changed here. llvm-svn: 225842	2015-01-13 22:45:13 +00:00
Chandler Carruth	14a759e3d9	[PM] Push the debug option for the new pass manager into the opt tool and expose the necessary hooks in the API directly. This makes it much cleaner for example to log the usage of a pass manager from a library. It also makes it more obvious that this functionality isn't "optional" or "asserts-only" for the pass manager. llvm-svn: 225841	2015-01-13 22:42:38 +00:00
Adam Nemet	e5dbcb7fd0	[AVX512] Unpack support in new shuffle lowering This now handles both 32 and 64-bit element sizes. In this version, the tests are in vector-shuffle-512-v8.ll, canonicalized by Chandler's update_llc_test_checks.py. Part of <rdar://problem/17688758> llvm-svn: 225838	2015-01-13 22:20:18 +00:00
Adam Nemet	67c8484794	[AVX512] Add pretty-printing of shuffle mask for unpacks llvm-svn: 225837	2015-01-13 22:20:14 +00:00
Matthias Braun	f50ab43214	DAGCombiner: simplify by using condition variables; NFC llvm-svn: 225836	2015-01-13 22:17:46 +00:00
Duncan P. N. Exon Smith	6a4848324b	AsmParser/Bitcode: Add support for MDLocation This adds assembly and bitcode support for `MDLocation`. The assembly side is rather big, since this is the first `MDNode` subclass (that isn't `MDTuple`). Part of PR21433. (If you're wondering where the mountains of testcase updates are, we don't need them until I update `DILocation` and `DebugLoc` to actually use this class.) llvm-svn: 225830	2015-01-13 21:10:44 +00:00
Matt Arsenault	bf0db918b2	R600: Implement getRecipEstimate This requires a new hook to prevent expanding sqrt in terms of rsqrt and reciprocal. v_rcp_f32, v_rsq_f32, and v_sqrt_f32 are all the same rate, so this expansion would just double the number of instructions and cycles. llvm-svn: 225828	2015-01-13 20:53:23 +00:00
Matt Arsenault	e93d06a579	R600: Implement getRsqrtEstimate Only do for f32 since I'm unclear on both what this is expecting for the refinement steps in terms of accuracy, and what f64 instruction actually provides. llvm-svn: 225827	2015-01-13 20:53:18 +00:00
Duncan P. N. Exon Smith	de03ff5721	IR: Add MDLocation class Add a new subclass of `UniquableMDNode`, `MDLocation`. This will be the IR version of `DebugLoc` and `DILocation`. The goal is to rename this to `DILocation` once the IR classes supersede the `DI`-prefixed wrappers. This isn't used anywhere yet. Part of PR21433. llvm-svn: 225824	2015-01-13 20:44:56 +00:00
Matt Arsenault	b56d843348	R600: Make cttz / ctlz cheap to speculate Speculating things is generally good. SI+ has instructions for these for 32-bit values. This is still probably better even with the expansion for 64-bit values, although it is odd that this callback doesn't have the size as a parameter. llvm-svn: 225822	2015-01-13 19:46:48 +00:00
Julien Lerouge	0473cb5ab7	Fix non-determinism issue in SLP The issue was introduced in r214638: + for (auto &BSIter : BlocksSchedules) { + scheduleBlock(BSIter.second.get()); + } Because BlocksSchedules is a DenseMap with BasicBlock* keys, blocks are scheduled in non-deterministic order, resulting in unpredictable IR. Patch by Daniel Reynaud! llvm-svn: 225821	2015-01-13 19:45:52 +00:00
Ulrich Weigand	bd039299c0	Use the integrated assembler as default on SystemZ This was already done in clang, this commit now uses the integrated assembler as default when using LLVM tools directly. A number of test cases deliberately using an invalid instruction in inline asm now have to use -no-integrated-as. llvm-svn: 225820	2015-01-13 19:45:16 +00:00
Ulrich Weigand	6b577e26f0	Use the integrated assembler as default on PowerPC This was already done in clang, this commit now uses the integrated assembler as default when using LLVM tools directly. A number of test cases using inline asm had to be adapted, either by updating the expected output, or by using -no-integrated-as (for such tests that deliberately use an invalid instruction in inline asm). llvm-svn: 225819	2015-01-13 19:43:45 +00:00
Chris Bieneman	5d23224f21	Running clang-format on CommandLine.h and CommandLine.cpp. No functional changes, I'm just going to be doing a lot of work in these files and it would be helpful if they had more current LLVM style. llvm-svn: 225817	2015-01-13 19:14:20 +00:00
Hal Finkel	63fb928109	Revert "r225808 - [PowerPC] Add StackMap/PatchPoint support" Reverting this while I investigate buildbot failures (segfaulting in GetCostForDef at ScheduleDAGRRList.cpp:314). llvm-svn: 225811	2015-01-13 18:25:05 +00:00
Hal Finkel	76a31f8c12	[PowerPC] Add missing override keyword llvm-svn: 225809	2015-01-13 18:02:22 +00:00
Hal Finkel	821befd52b	[PowerPC] Add StackMap/PatchPoint support This commit does two things: 1. Refactors PPCFastISel to use more of the common infrastructure for call lowering (this lets us take advantage of this common code for lowering some common intrinsics, stackmap/patchpoint among them). 2. Adds support for stackmap/patchpoint lowering. For the most part, this is very similar to the support in the AArch64 target, with the obvious differences (different registers, NOP instructions, etc.). The test cases are adapted from the AArch64 test cases. One difference of note is that the patchpoint call sequence takes 24 bytes, so you can't use less than that (on AArch64 you can go down to 16). Also, as noted in the docs, we take the patchpoint address to be the actual code address (assuming the call is local in the TOC-sharing sense), which should yield higher performance than generating the full cross-DSO indirect-call sequence and is likely just as useful for JITed code (if not, we'll change it). StackMaps and Patchpoints are still marked as experimental, and so this support is doubly experimental. So go ahead and experiment! llvm-svn: 225808	2015-01-13 17:48:12 +00:00
Hal Finkel	c4ee2c5188	[StackMaps] Use CurrentFnSymForSize When computing the call-site offset, use AP.CurrentFnSymForSize instead of AP.CurrentFnSym. There should be no change for other targets, but this is necessary for generating valid expressions for PPC64/ELF. llvm-svn: 225807	2015-01-13 17:48:07 +00:00
Hal Finkel	0ad96c818c	[StackMaps] Mark in CallLoweringInfo when lowering a patchpoint While, generally speaking, the process of lowering arguments for a patchpoint is the same as lowering a regular indirect call, on some targets it may not be exactly the same. Targets may not, for example, want to add additional register dependencies that apply only to making cross-DSO calls through linker stubs, may not want to load additional registers out of function descriptors, and may not want to add additional side-effect-causing instructions that cannot be removed later with the call itself being generated. The PowerPC target will use this in a future commit (for all of the reasons stated above). llvm-svn: 225806	2015-01-13 17:48:04 +00:00
Hal Finkel	df87f9383b	[StackMaps] Allow the target to pre-process the live-out mask Some targets, PowerPC for example, have pseudo-registers (such as that used to represent the rounding mode), that don't have DWARF register numbers or a register class. These are used only for internal dependency tracking, and should not appear in the recorded live-outs. This adds a callback allowing the target to pre-process the live-out mask in order to remove these kinds of registers so that the StackMaps code does not complain about them and/or attempt to include them in the output. This will be used by the PowerPC target in a future commit. llvm-svn: 225805	2015-01-13 17:47:59 +00:00
Hal Finkel	f4a22c0d48	[PowerPC] Split the blr definition into BLR and BLR8 We really need a separate 64-bit version of this instruction so that it can be marked as clobbering LR8 (instead of just LR). No change in functionality (although the verifier might be slightly happier), however, it is required for stackmap/patchpoint support. Thus, this will be covered by stackmap test cases once those are added. llvm-svn: 225804	2015-01-13 17:47:54 +00:00
Hal Finkel	7d3d50bcb2	[PowerPC] Add DWARF numbers for CA (XER), etc. For registers that have DWARF numbers (like CA, which is really part of XER), add them. Also, RM is not an SPR, and the declaration hack (where it is declared as an SPR with an arbitrary number) is not needed, so just declare it as a register. NFC; although CA's register number will be needed when stackmap/patchpoint support is added. llvm-svn: 225800	2015-01-13 17:45:11 +00:00
Jozef Kolek	e7cad7a1df	[mips][microMIPS] Fix issue with 16b instructions in jr instruction delay slot 16 bit instructions are not allowed in jr delay slot. Same stands for PseudoIndirectBranch and PseudoReturn. Differential Revision: http://reviews.llvm.org/D6815 llvm-svn: 225798	2015-01-13 15:59:17 +00:00
Olivier Sallenave	325096980b	Added TLI hook for isFPExtFree. Some of the FMA combine heuristics are now guarded with that hook. llvm-svn: 225795	2015-01-13 15:06:36 +00:00
Erik Eckstein	a168ef753f	Revert "SLPVectorizer: Cache results from memory alias checking." The alias cache has a problem of incorrect collisions in case a new instruction is allocated at the same address as a previously deleted instruction. llvm-svn: 225790	2015-01-13 14:36:46 +00:00
Erik Eckstein	4a445c047f	SLPVectorizer: Cache results from memory alias checking. This speeds up the dependency calculations for blocks with many load/store/call instructions. Beside the improved runtime, there is no functional change. llvm-svn: 225786	2015-01-13 11:37:51 +00:00
Chandler Carruth	816702ffe0	[PM] Refactor the new pass manager to use a single template to implement the generic functionality of the pass managers themselves. In the new infrastructure, the pass "manager" isn't actually interesting at all. It just pipelines a single chunk of IR through N passes. We don't need to know anything about the IR or the passes to do this really and we can replace the 3 implementations of the exact same functionality with a single generic PassManager template, complementing the single generic AnalysisManager template. I've left typedefs in place to give convenient names to the various obvious instantiations of the template. With this, I think I've nuked almost all of the redundant logic in the managers, and I think the overall design is actually simpler for having single templates that clearly indicate there is no special logic here. The logging is made somewhat more annoying by this change, but I don't think the difference is worth having heavy-weight traits to help log things. llvm-svn: 225783	2015-01-13 11:13:56 +00:00
Mehdi Amini	22e59748ef	Peephole opt needs optimizeSelect() to keep track of newly created MIs Peephole optimizer is scanning a basic block forward. At some point it needs to answer the question "given a pointer to an MI in the current BB, is it located before or after the current instruction". To perform this, it keeps a set of the MIs already seen during the scan, if a MI is not in the set, it is assumed to be after. It means that newly created MIs have to be inserted in the set as well. This commit passes the set as an argument to the target-dependent optimizeSelect() so that it can properly update the set with the (potentially) newly created MIs. llvm-svn: 225772	2015-01-13 07:07:13 +00:00
Ramkumar Ramachandra	181233b2b7	fix {typo, build failure} in r225760 llvm-svn: 225762	2015-01-13 04:17:47 +00:00
Ramkumar Ramachandra	40c3e03e27	Standardize {pred,succ,use,user}_empty() The functions {pred,succ,use,user}_{begin,end} exist, but many users have to check _begin() with _end() by hand to determine if the BasicBlock or User is empty. Fix this with a standard *_empty(), demonstrating a few usecases. llvm-svn: 225760	2015-01-13 03:46:47 +00:00
Saleem Abdulrasool	faa4f074eb	ARM: prepare prefix parsing for improved AAELF support AAELF specifies a number of ELF specific relocation types which have custom prefixes for the symbol reference. Switch the parser to be more table driven with an idea of file formats for which they apply. NFC. llvm-svn: 225758	2015-01-13 03:22:49 +00:00
Chandler Carruth	7ad6d620b7	[PM] Fold all three analysis managers into a single AnalysisManager template. This consolidates three copies of nearly the same core logic. It adds "complexity" to the ModuleAnalysisManager in that it makes it possible to share a ModuleAnalysisManager across multiple modules... But it does so by deleting all of the code, so I'm OK with that. This will naturally make fixing bugs in this code much simpler, etc. The only down side here is that we have to use 'typename' and 'this->' in various places, and the implementation is lifted into the header. I'll take that for the code size reduction. The convenient names are still typedef-ed and used throughout so that users can largely ignore this aspect of the implementation. The follow-up change to this will do the exact same refactoring for the PassManagers. =D It turns out that the interesting different code is almost entirely in the adaptors. At the end, that should be essentially all that is left. llvm-svn: 225757	2015-01-13 02:51:47 +00:00
Sanjay Patel	db8e6f472e	fix typo; NFC llvm-svn: 225753	2015-01-13 01:51:52 +00:00
Reid Kleckner	3542ace6ef	Rename llvm.recoverframeallocation to llvm.framerecover This name is less descriptive, but it sort of puts things in the 'llvm.frame...' namespace, relating it to frameallocate and frameaddress. It also avoids using "allocate" and "allocation" together. llvm-svn: 225752	2015-01-13 01:51:34 +00:00
Reid Kleckner	e9b8931873	Add the llvm.frameallocate and llvm.recoverframeallocation intrinsics These intrinsics allow multiple functions to share a single stack allocation from one function's call frame. The function with the allocation may only perform one allocation, and it must be in the entry block. Functions accessing the allocation call llvm.recoverframeallocation with the function whose frame they are accessing and a frame pointer from an active call frame of that function. These intrinsics are very difficult to inline correctly, so the intention is that they be introduced rarely, or at least very late during EH preparation. Reviewers: echristo, andrew.w.kaylor Differential Revision: http://reviews.llvm.org/D6493 llvm-svn: 225746	2015-01-13 00:48:10 +00:00
Duncan P. N. Exon Smith	845755c4bb	IR: Remove an invalid assertion when replacing resolved operands This adds back the testcase from r225738, and adds to it. Looks like we need both sides for now (the assertion was incorrect both ways, and although it seemed reasonable (when written correctly) it wasn't particularly important). llvm-svn: 225745	2015-01-13 00:46:34 +00:00
Matt Arsenault	a982e4f82b	Combine fcmp + select to fminnum / fmaxnum if no nans and legal Also require unsafe FP math for now since there isn't a way to test for signed zeros. llvm-svn: 225744	2015-01-13 00:43:00 +00:00
Chandler Carruth	2e7522e9ce	[PM] Re-clang-format much of this code as the code has changed some and so has clang-format. Notably, this fixes a bunch of formatting in the CGSCC pass manager side of things that has been improved in clang-format recently. llvm-svn: 225743	2015-01-13 00:36:47 +00:00
Duncan P. N. Exon Smith	2cc792b1d1	Revert "IR: Fix an inverted assertion when replacing resolved operands" This reverts commit r225738. Maybe the assertion is just plain wrong, but this version fails on WAY more bots. I'll make sure both ways work in a follow-up but I want to get bots green in the meantime. llvm-svn: 225742	2015-01-13 00:34:21 +00:00
Eric Christopher	acf25766ad	Grammar and spelling. llvm-svn: 225740	2015-01-13 00:21:14 +00:00
Duncan P. N. Exon Smith	e4c842f816	IR: Fix an inverted assertion when replacing resolved operands Add a unit test, since this bug was only exposed by clang tests. Thanks to Rafael for tracking this down! llvm-svn: 225738	2015-01-13 00:10:38 +00:00
Adrian Prantl	66f2595845	Debug Info: Move support for constants into DwarfExpression. Move the declaration of DebugLocDwarfExpression into DwarfExpression.h because it needs to be accessed from AsmPrinterDwarf.cpp and DwarfDebug.cpp NFC. llvm-svn: 225734	2015-01-13 00:04:06 +00:00
Duncan P. N. Exon Smith	a6de6a4013	IR: Split out writeMDTuple(), NFC Prepare for more subclasses of `UniquableMDNode` than `MDTuple`. llvm-svn: 225732	2015-01-12 23:45:31 +00:00
Adrian Prantl	a4c30d6509	Make DwarfExpression store the AsmPrinter instead of the TargetMachine. NFC. llvm-svn: 225731	2015-01-12 23:36:56 +00:00
Adrian Prantl	9cffbd8daa	remove extra semicolon llvm-svn: 225730	2015-01-12 23:36:50 +00:00
Reid Kleckner	bba20f06de	musttail: Only set the inreg flag for fastcall and vectorcall Otherwise we'll attempt to forward ECX, EDX, and EAX for cdecl and stdcall thunks, leaving us with no scratch registers for indirect call targets. Fixes PR22052. llvm-svn: 225729	2015-01-12 23:28:23 +00:00
Matt Arsenault	64dae8354b	R600/SI: Remove redundant setting expand on f64 vectors None of these are legal types already, so they default to Expand. llvm-svn: 225728	2015-01-12 23:13:00 +00:00
Adrian Prantl	337e360279	Run clang-format on the parts of AsmPrinterDwarf where it improves the readability. llvm-svn: 225726	2015-01-12 23:03:23 +00:00
Adrian Prantl	0fec811d7b	Debug Info: Add a virtual destructor to DwarfExpression. Thanks Chandler for noticing! llvm-svn: 225724	2015-01-12 22:59:28 +00:00
Chandler Carruth	2482fe0b52	[PM] Sink the reference vs. value decision for IR units out of the templated interface. So far, every single IR unit I can come up with has address-identity. That is, when two units of IR are both active in LLVM, their addresses will be distinct if the IR is distinct. This is clearly true for Modules, Functions, BasicBlocks, and Instructions. It turns out that the only practical way to make the CGSCC stuff work the way we want is to make it true for SCCs as well. I expect this pattern to continue. When first designing the pass manager code, I kept this dimension of freedom in the type parameters, essentially allowing for a wrapper-type whose address did not form identity. But that really no longer makes sense and is making the code more complex or subtle for no gain. If we ever have an actual use case for this, we can figure out what makes sense then and there. It will be better because then we will have the actual example in hand. While the simplifications afforded in this patch are fairly small (mostly sinking the '&' out of many type parameters onto a few interfaces), it would have become much more pronounced with subsequent changes. I have a sequence of changes that will completely remove the code duplication that currently exists between all of the pass managers and analysis managers. =] Should make things much cleaner and avoid bug fixing N times for the N pass managers. llvm-svn: 225723	2015-01-12 22:53:31 +00:00
Adrian Prantl	0d5df0ac1c	Untwine this expression. Thanks to David for noticing! llvm-svn: 225720	2015-01-12 22:39:14 +00:00
Simon Pilgrim	d88ab87064	[X86][SSE] Minor regression fix for r225551 r225551 vector byte shuffle optimization caused an assertion as fully zeroable vectors can be produced under certain circumstances. This fix drops the assert and returns a zero vector where the assert would have failed. llvm-svn: 225718	2015-01-12 22:38:08 +00:00
Adrian Prantl	0e6ffb9d0d	Debug Info: Implement DwarfUnit::addRegisterOpPiece() using DwarfExpression. NFC. llvm-svn: 225717	2015-01-12 22:37:16 +00:00
Duncan P. N. Exon Smith	49503f827d	Bitcode: Range-based for, NFC llvm-svn: 225716	2015-01-12 22:35:34 +00:00
Duncan P. N. Exon Smith	b1ad5d39a9	Bitcode: Add abbreviation for METADATA_NAME llvm-svn: 225715	2015-01-12 22:34:10 +00:00
Duncan P. N. Exon Smith	f8dd6ad6de	Bitcode: Range-based for, NFC llvm-svn: 225714	2015-01-12 22:33:00 +00:00
Duncan P. N. Exon Smith	73d5aae74c	Bitcode: Range-based for, NFC llvm-svn: 225713	2015-01-12 22:31:35 +00:00
Duncan P. N. Exon Smith	2fcf60e78e	Bitcode: Simplify emission of METADATA_BLOCK Refactor logic so that we know up-front whether to open a block and whether we need an MDString abbreviation. This is almost NFC, but will start emitting `MDString` abbreviations when the first record is not an `MDString`. llvm-svn: 225712	2015-01-12 22:30:34 +00:00
Duncan P. N. Exon Smith	0b31dd1d67	AsmParser: Use subclass API instead of MDNode wrappers, NFC Use subclass API instead of the wrappers in `MDNode` in the assembly parser. This will make the code easier to follow once we have multiple subclasses. llvm-svn: 225711	2015-01-12 22:27:39 +00:00
Duncan P. N. Exon Smith	f825dae836	AsmParser: Factor duplicated code into ParseMDNode(), NFC llvm-svn: 225710	2015-01-12 22:26:48 +00:00
Duncan P. N. Exon Smith	62a7919f6b	AsmParser: Reorder ParseMetadata() logic, NFC llvm-svn: 225709	2015-01-12 22:24:50 +00:00
Duncan P. N. Exon Smith	dbcff30bd1	AsmParser: Simplify ParseMDTuple(), NFC llvm-svn: 225708	2015-01-12 22:23:04 +00:00
Adrian Prantl	00dbc2a7d3	Debug Info: Implement DwarfUnit::addRegisterOffset using DwarfExpression. No functional change. llvm-svn: 225707	2015-01-12 22:19:26 +00:00
Adrian Prantl	b16d9ebb0c	Debug info: Factor out the creation of DWARF expressions from AsmPrinter into a new class DwarfExpression that can be shared between AsmPrinter and DwarfUnit. This is the first step towards unifying the two entirely redundant implementations of dwarf expression emission in DwarfUnit and AsmPrinter. Almost no functional change — Testcases were updated because asm comments that used to be on two lines now appear on the same line, which is actually preferable. llvm-svn: 225706	2015-01-12 22:19:22 +00:00
Duncan P. N. Exon Smith	58ef9d142a	AsmParser: ParseMDNode() => ParseMDTuple(), NFC This isn't parsing arbitrary subclasses of `MDNode`, just `MDTuple`. llvm-svn: 225702	2015-01-12 21:23:11 +00:00
Sanjay Patel	06d5589a84	80-cols; NFC llvm-svn: 225700	2015-01-12 21:21:28 +00:00
Duncan P. N. Exon Smith	a8d9a026d9	AsmParser: Remove unused version of ParseMDNodeID() Merge the two versions of `ParseMDNodeID()` now that no one needs special forward references. llvm-svn: 225699	2015-01-12 21:14:38 +00:00
Duncan P. N. Exon Smith	ab617d5977	AsmParser: Use normal references for metadata attachments, NFC Remove special parsing logic for metadata attachments. Now that `DebugLoc` is stored normally (since the metadata/value split), we don't need this special forward referencing logic. llvm-svn: 225698	2015-01-12 21:13:09 +00:00
Duncan P. N. Exon Smith	bf68e80d06	IR: Prepare for a new UniquableMDNode subclass, NFC Add generic dispatch for the parts of `UniquableMDNode` that cast to `MDTuple`. This makes adding other subclasses (like PR21433's `MDLocation`) easier. llvm-svn: 225697	2015-01-12 20:56:33 +00:00
Duncan P. N. Exon Smith	6b1f4659f9	IR: Stop erasing MDNodes from uniquing sets during teardown Stop erasing `MDNode`s from the uniquing sets in `LLVMContextImpl` during teardown (in particular, during `UniquableMDNode::~UniquableMDNode()`). Although it's currently feasible, there isn't any clear benefit and it may not be feasible for other subclasses (which don't explicitly store the lookup hash). llvm-svn: 225696	2015-01-12 20:50:25 +00:00
Ahmed Bougacha	291833b959	[X86] Also create+widen FMIN/FMAX nodes for v2f32. This happens in the HINT benchmark, where the SLP-vectorizer created v2f32 fcmp/select code. The "correct" solution would have been to teach the vectorizer cost model that v2f32 isn't legal (because really, it isn't), but if we can vectorize we might as well do so. We legalize these v2f32 FMIN/FMAX nodes by widening to v4f32 later on. v3f32 were already widened to v4f32 by the generic unroll-and-build-vector legalization. rdar://15763436 Differential Revision: http://reviews.llvm.org/D6557 llvm-svn: 225691	2015-01-12 20:31:30 +00:00
Duncan P. N. Exon Smith	942623540b	IR: Move creation logic to MDNodeFwdDecl, NFC Same as with `MDTuple`, factor out a `friend MDNode` by moving creation logic to the concrete subclass. llvm-svn: 225690	2015-01-12 20:21:37 +00:00
Duncan P. N. Exon Smith	ac3128d901	IR: Move creation logic down to MDTuple, NFC Move creation logic for `MDTuple`s down where it belongs. Once there are a few more subclasses, these functions really won't make much sense here (the `friend` relationship was already awkward). For now, leave the `MDNode` versions around, but have it forward down. llvm-svn: 225685	2015-01-12 20:13:56 +00:00
Duncan P. N. Exon Smith	3c94844a48	IR: Push storeDistinctInContext() down to UniquableMDNode, NFC llvm-svn: 225683	2015-01-12 20:11:32 +00:00
Duncan P. N. Exon Smith	118632dbf6	IR: Split GenericMDNode into MDTuple and UniquableMDNode Split `GenericMDNode` into two classes (with more descriptive names). - `UniquableMDNode` will be a common subclass for `MDNode`s that are sometimes uniqued like constants, and sometimes 'distinct'. This class gets the (short-lived) RAUW support and related API. - `MDTuple` is the basic tuple that has always been returned by `MDNode::get()`. This is as opposed to more specific nodes to be added soon, which have additional fields, custom assembly syntax, and extra semantics. This class gets the hash-related logic, since other subclasses of `UniquableMDNode` may need to hash based on other fields. To keep this diff from getting too big, I've added casts to `MDTuple` that won't really scale as new subclasses of `UniquableMDNode` are added, but I'll clean those up incrementally. (No functionality change intended.) llvm-svn: 225682	2015-01-12 20:09:34 +00:00
Duncan P. N. Exon Smith	0c87d77175	IR: Invert logic to simplify control flow, NFC llvm-svn: 225670	2015-01-12 19:45:44 +00:00
Duncan P. N. Exon Smith	34c3d10363	IR: Separate out decrementUnresolvedOperandCount(), NFC llvm-svn: 225667	2015-01-12 19:43:15 +00:00
Duncan P. N. Exon Smith	d9e6eb7108	IR: Prevent handleChangedOperand() recursion Instead of returning early on `handleChangedOperand()` recursion (finally identified (and test added) in r225657), prevent it upfront by releasing operands before RAUW. Aside from massively different program flow, there should be no functionality change ;). llvm-svn: 225665	2015-01-12 19:36:35 +00:00
Tom Stellard	b6550529a6	R600/SI: Use RegisterOperands to specify which operands can accept immediates There are some operands which can take either immediates or registers and we were previously using different register class to distinguish between operands that could take immediates and those that could not. This patch switches to using RegisterOperands which should simplify the backend by reducing the number of register classes and also make it easier to implement the assembler. llvm-svn: 225662	2015-01-12 19:33:18 +00:00
Sanjay Patel	5f1d9eaad3	GVN: propagate equalities for floating point compares Allow optimizations based on FP comparison values in the same way as integers. This resolves PR17713: http://llvm.org/bugs/show_bug.cgi?id=17713 Differential Revision: http://reviews.llvm.org/D6911 llvm-svn: 225660	2015-01-12 19:29:48 +00:00
Duncan P. N. Exon Smith	5f4618923c	IR: Add test for handleChangedOperand() recursion Turns out this can happen. Remove the `FIXME` and add a testcase that crashes without the extra logic. llvm-svn: 225657	2015-01-12 19:22:04 +00:00
Duncan P. N. Exon Smith	967629e14a	IR: Separate out recalculateHash(), NFC llvm-svn: 225655	2015-01-12 19:16:34 +00:00
Duncan P. N. Exon Smith	3a16d80a44	IR: Separate out helper: resolveAfterOperandChange(), NFC llvm-svn: 225654	2015-01-12 19:14:15 +00:00
Duncan P. N. Exon Smith	5c5710b890	IR: Use SubclassData32 directly, NFC Simplify some logic by accessing `SubclassData32` directly instead of relying on API. llvm-svn: 225653	2015-01-12 19:12:37 +00:00
Matthias Braun	f5d931f716	RegisterCoalescer: Turn some impossible conditions into asserts This is a fixed version of reverted r225500. It fixes the too early if() continue; of the last patch and adds a comment to the unorthodox loop. llvm-svn: 225652	2015-01-12 19:10:17 +00:00
Duncan P. N. Exon Smith	6c0aee3248	IR: Don't allow operands to become unresolved Operands shouldn't change from being resolved to unresolved during graph construction. Simplify the logic based on that assumption. llvm-svn: 225649	2015-01-12 18:59:40 +00:00
Duncan P. N. Exon Smith	c0286874d7	IR: Remove redundant comment, NFC llvm-svn: 225648	2015-01-12 18:45:32 +00:00
Duncan P. N. Exon Smith	686162b1bc	IR: Simplify code, NFC llvm-svn: 225647	2015-01-12 18:45:01 +00:00
Rafael Espindola	d9c3e308f5	Add r224985 back with two fixes. One is that AArch64 has additional restrictions on when local relocations can be used. We have to take those into consideration when deciding to put a L symbol in the symbol table or not. The other is that ld64 requires the relocations to cstring to use linker visible symbols on AArch64. Thanks to Michael Zolotukhin for testing this! Remove doesSectionRequireSymbols. In an assembly expression like bar: .long L0 + 1 the intended semantics is that bar will contain a pointer one byte past L0. In sections that are merged by content (strings, 4 byte constants, etc), a single position in the section doesn't give the linker enough information. For example, it would not be able to tell a relocation must point to the end of a string, since that would look just like the start of the next. The solution used in ELF is to use relocations with symbols if there is a non-zero addend. In MachO before this patch we would just keep all symbols in some sections. This would miss some cases (only cstrings on x86_64 were implemented) and was inefficient since most relocations have an addend of 0 and can be represented without the symbol. This patch implements the non-zero addend logic for MachO too. llvm-svn: 225644	2015-01-12 18:13:07 +00:00
Duncan P. N. Exon Smith	daa335a9c2	IR: Simplify replaceOperandWith(), NFC This will call `handleChangedOperand()` less frequently, but in that case (i.e., `isStoredDistinctInContext()`) it has identical logic to here. llvm-svn: 225643	2015-01-12 18:01:45 +00:00
Duncan P. N. Exon Smith	54df9896e3	IR: Remove redundant calls to MDNode::setHash(), NFC `storeDistinctInContext()` already calls `setHash(0)`. llvm-svn: 225642	2015-01-12 17:57:38 +00:00
Timur Iskhodzhanov	00ede84084	[ASan] Move the shadow on Windows 32-bit from 0x20000000 to 0x40000000 llvm-svn: 225641	2015-01-12 17:38:58 +00:00
Ahmed Bougacha	e03bef7543	[SimplifyLibCalls] Factor out fortified libcall handling. This lets us remove CGP duplicate. Differential Revision: http://reviews.llvm.org/D6541 llvm-svn: 225640	2015-01-12 17:22:43 +00:00
Ahmed Bougacha	6722f5e5b3	[SimplifyLibCalls] Factor out str/mem libcall optimizations. Put them in a separate function, so we can reuse them to further simplify fortified libcalls as well. Differential Revision: http://reviews.llvm.org/D6540 llvm-svn: 225639	2015-01-12 17:20:06 +00:00
Ahmed Bougacha	b7d8afb6c5	[SimplifyLibCalls] Factor out signature checks for fortifiable libcalls. The checks are the same for fortified counterparts to the libcalls, so we might as well do them in a single place. Differential Revision: http://reviews.llvm.org/D6539 llvm-svn: 225638	2015-01-12 17:18:19 +00:00
Jozef Kolek	9761e96b01	[mips][microMIPS] Implement BEQZ16 and BNEZ16 instructions Differential Revision: http://reviews.llvm.org/D5271 llvm-svn: 225627	2015-01-12 12:03:34 +00:00
Hal Finkel	87deb0b8e3	[PowerPC] Fix calls to non-function objects Looking at r225438 inspired me to see how the PowerPC backend handled the situation (calling a bitcasted TLS global), and it turns out we also produced an error (cannot select ...). What it means to "call" something that is not a function is implementation and platform specific, but in the name of doing something (besides crashing), this makes sure we do what GCC does (treat all such calls as calls through a function pointer -- meaning that the pointer is assumed, as is the convention on PPC, to point to a function descriptor structure holding the actual code address along with the function's TOC pointer and environment pointer). As GCC does, we now do the same for calling regular (non-TLS) non-function globals too. I'm not sure whether this is the most useful way to define the behavior, but at least we won't be alone. llvm-svn: 225617	2015-01-12 04:34:47 +00:00
Simon Pilgrim	b5869f6c7c	[X86][SSE] Minor fix to VPBLENDW AVX2 commutation. D6015 / rL221313 enabled commutation for SSE immediate blend instructions, but due to a typo the AVX2 VPBLENDW ymm instructions weren't flagged as commutative along with the others in the tables, but were still being commuted in code and tested for. llvm-svn: 225612	2015-01-11 22:08:01 +00:00
David Majnemer	14141f941a	Revert most of r225597 We can't rely on a DataLayout enlightened constant folder. llvm-svn: 225599	2015-01-11 07:29:51 +00:00
David Majnemer	292d0c796b	X86: Properly decode shuffle masks when the constant pool type is weird It's possible for the constant pool entry for the shuffle mask to come from a completely different operation. This occurs when Constants have the same bit pattern but have different types. Make DecodePSHUFBMask tolerant of types which, after a bitcast, are appropriately sized vector types. This fixes PR22188. llvm-svn: 225597	2015-01-11 05:08:57 +00:00
Saleem Abdulrasool	9cf2679d3b	X86: teach X86TargetLowering about L,M,O constraints Teach the ISelLowering for X86 about the L,M,O target specific constraints. Although, for the moment, clang performs constraint validation and prevents passing along inline asm which may have immediate constant constraints violated, the backend should be able to cope with the invalid inline asm a bit better. llvm-svn: 225596	2015-01-11 04:39:24 +00:00
Saleem Abdulrasool	fe781977b9	ARM: add support for segment base relocations (SBREL) This adds support for parsing and emitting the SBREL relocation variant for the ARM target. Handling this relocation variant is necessary for supporting the full ARM ELF specification. Addresses PR22128. llvm-svn: 225595	2015-01-11 04:39:18 +00:00
Sanjoy Das	81401d4b19	Fix PR22179. We were incorrectly inferring nsw for certain SCEVs. We can be more aggressive here (see Richard Smith's comment on http://llvm.org/bugs/show_bug.cgi?id=22179) but this change just focuses on correctness. Differential Revision: http://reviews.llvm.org/D6914 llvm-svn: 225591	2015-01-10 23:41:24 +00:00
Joerg Sonnenberger	8a36a8e5d4	Revert r225500, it leads to infinite loops. llvm-svn: 225590	2015-01-10 21:49:36 +00:00
Simon Pilgrim	94a4cc027a	[X86][SSE] Improved (v)insertps shuffle matching In the current code we only attempt to match against insertps if we have exactly one element from the second input vector, irrespective of how much of the shuffle result is zeroable. This patch checks to see if there is a single non-zeroable element from either input that requires insertion. It also supports matching of cases where only one of the inputs need to be referenced. We also split insertps shuffle matching off into a new lowerVectorShuffleAsInsertPS function. Differential Revision: http://reviews.llvm.org/D6879 llvm-svn: 225589	2015-01-10 19:45:33 +00:00
Hal Finkel	5d5d1539cc	[PowerPC] Mark zext of a small scalar load as free This initial implementation of PPCTargetLowering::isZExtFree marks as free zexts of small scalar loads (that are not sign-extending). This callback is used by SelectionDAGBuilder's RegsForValue::getCopyToRegs, and thus to determine whether a zext or an anyext is used to lower illegally-typed PHIs. Because later truncates of zero-extended values are nops, this allows for the elimination of later unnecessary truncations. Fixes the initial complaint associated with PR22120. llvm-svn: 225584	2015-01-10 08:21:59 +00:00
Justin Hibbits	17744c1e0d	Remove some whitespace. llvm-svn: 225583	2015-01-10 07:50:31 +00:00
Justin Hibbits	654346e6f9	Fully fix Bug #22115. Summary: In the previous commit, the register was saved, but space was not allocated. This resulted in the parameter save area potentially clobbering r30, leading to nasty results. Test Plan: Tests updated Reviewers: hfinkel Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D6906 llvm-svn: 225573	2015-01-10 01:57:21 +00:00
Alexey Samsonov	7c8a725116	Fix undefined behavior (shift of negative value) in RuntimeDyldMachOAArch64::encodeAddend. Test Plan: regression test suite with/without UBSan. Reviewers: lhames, ributzka Subscribers: aemerson, llvm-commits Differential Revision: http://reviews.llvm.org/D6908 llvm-svn: 225568	2015-01-10 00:46:38 +00:00
Hal Finkel	611b127ad8	[PowerPC] Readjust the loop unrolling threshold Now that the way that the partial unrolling threshold for small loops is used to compute the unrolling factor has been corrected, a slightly smaller threshold is preferable. This is expected; other targets may need to re-tune as well. llvm-svn: 225566	2015-01-10 00:31:10 +00:00
Hal Finkel	38dd590861	[LoopUnroll] Fix the partial unrolling threshold for small loop sizes When we compute the size of a loop, we include the branch on the backedge and the comparison feeding the conditional branch. Under normal circumstances, these don't get replicated with the rest of the loop body when we unroll. This led to the somewhat surprising behavior that really small loops would not get unrolled enough -- they could be unrolled more and the resulting loop would be below the threshold, because we were assuming they'd take (LoopSize * UnrollingFactor) instructions after unrolling, instead of (((LoopSize-2) * UnrollingFactor)+2) instructions. This fixes that computation. llvm-svn: 225565	2015-01-10 00:30:55 +00:00
Rafael Espindola	d0b23bef6f	Use the DiagnosticHandler to print diagnostics when reading bitcode. The bitcode reading interface used std::error_code to report an error to the callers and it is the caller's job to print diagnostics. This is not ideal for error handling or diagnostic reporting: * For error handling, all that the callers care about is 3 possibilities: * It worked * The bitcode file is corrupted/invalid. * The file is not bitcode at all. * For diagnostic, it is user friendly to include far more information about the invalid case so the user can find out what is wrong with the bitcode file. This comes up, for example, when a developer introduces a bug while extending the format. The compromise we had was to have a lot of error codes. With this patch we use the DiagnosticHandler to communicate with the human and std::error_code to communicate with the caller. This allows us to have far fewer error codes and adds the infrastructure to print better diagnostics. This is so because the diagnostics are printed when the issue is found. The code that detected the problem is alive on the stack and can pass down as much context as needed. As an example the patch updates test/Bitcode/invalid.ll. Using a DiagnosticHandler also moves the fatal/non-fatal error decision to the caller. A simple one like llvm-dis can just use fatal errors. The gold plugin needs a bit more complex treatment because of being passed non-bitcode files. A hypothetical interactive tool would make all bitcode errors non-fatal. llvm-svn: 225562	2015-01-10 00:07:30 +00:00
Andrew Kaylor	a10379ad49	Fix the JIT event listeners and replace the associated tests. The changes to EventListenerCommon.h were contributed by Arch Robison. This fixes bug 22095. http://reviews.llvm.org/D6905 llvm-svn: 225554	2015-01-09 22:53:24 +00:00
Michael Zolotukhin	d9ade185b9	Update comment. llvm-svn: 225553	2015-01-09 22:15:06 +00:00
Hans Wennborg	dcc6e5bc03	SimplifyCFG: check uses of constant-foldable instrs in switch destinations (PR20210) The previous code assumed that such instructions could not have any uses outside CaseDest, with the motivation that the instruction could not dominate CommonDest because CommonDest has phi nodes in it. That simply isn't true; e.g., CommonDest could have an edge back to itself. llvm-svn: 225552	2015-01-09 22:13:31 +00:00
Simon Pilgrim	ec1f2c2cab	[X86][SSE] Avoid vector byte shuffles with zero by using pshufb to create zeros pshufb can shuffle in zero bytes as well as bytes from a source vector - we can use this to avoid having to shuffle 2 vectors and ORing the result when the used inputs from a vector are all zeroable. Differential Revision: http://reviews.llvm.org/D6878 llvm-svn: 225551	2015-01-09 22:03:19 +00:00
Michael Zolotukhin	1c38bc12de	Remove duplicating code. NFC. The removed condition is checked in the previous loop. llvm-svn: 225542	2015-01-09 20:36:19 +00:00
Tim Northover	eb16112e97	Re-reapply r221924: "[GVN] Perform Scalar PRE on gep indices that feed loads before doing Load PRE" It's not really expected to stick around, last time it provoked a weird LTO build failure that I can't reproduce now, and the bot logs are long gone. I'll re-revert it if the failures recur. Original description: Perform Scalar PRE on gep indices that feed loads before doing Load PRE. llvm-svn: 225536	2015-01-09 19:19:56 +00:00
Lang Hames	1e923ec122	Recommit r224935 with a fix for the ObjC++/AArch64 bug that that revision introduced. A test case for the bug was already committed in r225385. Patch by Rafael Espindola. llvm-svn: 225534	2015-01-09 18:55:42 +00:00
Duncan P. N. Exon Smith	9ed19665bb	Revert "Bitcode: Move the DEBUG_LOC record to DEBUG_LOC_OLD" This reverts commit r225498 (but leaves r225499, which was a worthy cleanup). My plan was to change `DEBUG_LOC` to store the `MDNode` directly rather than its operands (patch was to go out this morning), but on reflection it's not clear that it's strictly better. (I had missed that the current code is unlikely to emit the `MDNode` at all.) Conflicts: lib/Bitcode/Reader/BitcodeReader.cpp (due to r225499) llvm-svn: 225531	2015-01-09 17:53:27 +00:00
Daniel Sanders	1440bb2a26	[mips] Add support for accessing $gp as a named register. Summary: Mips Linux uses $gp to hold a pointer to thread info structure and accesses it with a named register. This makes this work for LLVM. The N32 ABI doesn't quite work yet since the frontend generates incorrect IR for this case. It neglects to truncate the 64-bit GPR to a 32-bit value before converting to a pointer. Given correct IR (as in the testcase in this patch), it works correctly. Reviewers: sstankovic, vmedic, atanasyan Reviewed By: atanasyan Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D6893 llvm-svn: 225529	2015-01-09 17:21:30 +00:00
Sanjay Patel	2a385e2494	remove names from comments; NFC llvm-svn: 225526	2015-01-09 16:47:20 +00:00
Sanjay Patel	938e279082	fix typos; NFC llvm-svn: 225525	2015-01-09 16:35:37 +00:00
Sanjay Patel	e6e58c1a9e	fix typo; NFC llvm-svn: 225524	2015-01-09 16:29:50 +00:00
Sanjay Patel	d729115fa7	more efficient use of a dyn_cast; no functional change intended llvm-svn: 225523	2015-01-09 16:28:15 +00:00
Hal Finkel	b359b735d6	[PowerPC] Enable late partial unrolling on the POWER7 The P7 benefits from not having really-small loops so that we either have multiple dispatch groups in the loop and/or the ability to form more-full dispatch groups during scheduling. Setting the partial unrolling threshold to 44 seems good, empirically, for the P7. Compared to using no late partial unrolling, this yields the following test-suite speedups: SingleSource/Benchmarks/Adobe-C++/simple_types_constant_folding -66.3253% +/- 24.1975% SingleSource/Benchmarks/Misc-C++/oopack_v1p8 -44.0169% +/- 29.4881% SingleSource/Benchmarks/Misc/pi -27.8351% +/- 12.2712% SingleSource/Benchmarks/Stanford/Bubblesort -30.9898% +/- 22.4647% I've speculatively added a similar setting for the P8. Also, I've noticed that the unroller does not quite calculate the unrolling factor correctly for really tiny loops because it neglects to account for the fact that not every loop body replicant contains an ending branch and counter increment. I'll fix that later. llvm-svn: 225522	2015-01-09 15:51:16 +00:00
Toma Tabacu	68e8a9c0dd	[mips] Add comment which explains why we need to change the assembler options before and after inline asm blocks. NFC. llvm-svn: 225521	2015-01-09 15:00:30 +00:00
Suyog Sarda	85d0473650	Assumption that "VectorizedValue" will always be an Instruction is not correct. It can be a constant or a vector argument. ex : define i32 @hadd(<4 x i32> %a) #0 { entry: %vecext = extractelement <4 x i32> %a, i32 0 %vecext1 = extractelement <4 x i32> %a, i32 1 %add = add i32 %vecext, %vecext1 %vecext2 = extractelement <4 x i32> %a, i32 2 %add3 = add i32 %add, %vecext2 %vecext4 = extractelement <4 x i32> %a, i32 3 %add5 = add i32 %add3, %vecext4 ret i32 %add5 } llvm-svn: 225517	2015-01-09 10:23:48 +00:00
Saleem Abdulrasool	b68fa3b576	ARM: add support for R_ARM_ABS16 Add support for R_ARM_ABS16 relocation mapping. Addresses PR22156. llvm-svn: 225510	2015-01-09 06:57:24 +00:00
Saleem Abdulrasool	3c0f78a2fc	ARM: add support for R_ARM_ABS8 relocations Add support for R_ARM_ABS8 relocation. Addresses PR22126. llvm-svn: 225507	2015-01-09 05:59:12 +00:00
Matthias Braun	7e87384592	RegisterCoalescer: Fix removeCopyByCommutingDef with subreg liveness The code that eliminated additional coalescable copies in removeCopyByCommutingDef() used MergeValueNumberInto() which internally may merge A into B or B into A. In this case A and B had different Def points, so we have to reset ValNo.Def to the intended one after merging. llvm-svn: 225503	2015-01-09 03:01:31 +00:00
Matthias Braun	ea399e59cf	RegisterCoalescer: Some cleanup in removeCopyByCommutingDef(), NFC llvm-svn: 225502	2015-01-09 03:01:28 +00:00
Matthias Braun	55586a2f2d	RegisterCoalescer: No need to set kill flags, they are recomputed later anyway llvm-svn: 225501	2015-01-09 03:01:26 +00:00
Matthias Braun	6588b145fc	RegisterCoalescer: Turn some impossible conditions into asserts llvm-svn: 225500	2015-01-09 03:01:23 +00:00
Duncan P. N. Exon Smith	52d0f16e1b	Bitcode: Share logic for last instruction, NFC Share logic for getting the last instruction emitted. llvm-svn: 225499	2015-01-09 02:51:45 +00:00
Duncan P. N. Exon Smith	11fae74ae5	Bitcode: Move the DEBUG_LOC record to DEBUG_LOC_OLD Prepare to simplify the `DebugLoc` record. llvm-svn: 225498	2015-01-09 02:48:48 +00:00
Hal Finkel	5ff00b4350	[PowerPC] Add a flag for experimenting with subreg liveness tracking This cannot yet be enabled by default, it causes ~50 miscompiles in the test suite. llvm-svn: 225497	2015-01-09 02:03:11 +00:00
Hal Finkel	6c39269a4c	[PowerPC] Fold [sz]ext with fp_to_int lowering where possible On modern cores with lfiw[az]x, we can fold a sign or zero extension from i32 to i64 into the load necessary for an i64 -> fp conversion. llvm-svn: 225493	2015-01-09 01:34:30 +00:00
Hal Finkel	0ce7f372e5	[DAGCombine] Remainder of fix to r225380 (More FMA folding opportunities) As pointed out by Aditya (and Owen), when we elide an FP extend to form an FMA, we need to extend the incoming operands so that the resulting node will really be legal. This is currently enabled only for PowerPC, and it happens to work there regardless, but this should fix the functionality for everyone else should anyone else wish to use it. llvm-svn: 225492	2015-01-09 01:29:29 +00:00
Chandler Carruth	685b1803ab	[x86] Add a flag to control the vector shuffle legality predicates that complements the new vector shuffle lowering code path. This flag, naturally, is off because we've not tested or evaluated the results of this at all. However, the flag will make it much easier to evaluate whether we can be this aggressive and whether there are missing vector shuffle lowering optimizations. llvm-svn: 225491	2015-01-09 01:24:36 +00:00
Chandler Carruth	f4ea3d3d9c	Clean up ValueHandle to no longer keep a PointerIntPair for the Value*. This was used previously for metadata but is no longer needed there. Not doing this simplifies ValueHandle and will make it easier to fix things like AssertingVH's DenseMapInfo. llvm-svn: 225487	2015-01-09 00:48:47 +00:00
Hal Finkel	33ead6f901	Partial fix to r225380 (More FMA folding opportunities) As pointed out by Aditya (and Owen), there are two things wrong with this code. First, it adds patterns which elide FP extends when forming FMAs, and that might not be profitable on all targets (it belongs behind the pre-existing aggressive-FMA-formation flag). This is fixed by this change. Second, the resulting nodes might have operands of different types (the extensions need to be re-added). That will be fixed in the follow-up commit. llvm-svn: 225485	2015-01-09 00:45:54 +00:00
Philip Reames	33d7f9de33	[REFACTOR] Push logic from MemDepPrinter into getNonLocalPointerDependency Previously, MemDepPrinter handled volatile and unordered accesses without involving MemoryDependencyAnalysis. By making a slight tweak to the documented interface - which is respected by both callers - we can move this responsibility to MDA for the benefit of any future callers. This is basically just cleanup. In the future, we may decide to extend MDA's non local dependency analysis to return useful results for ordered or volatile loads. I believe (but have not really checked in detail) that local dependency analysis does get useful results for ordered, but not volatile, loads. llvm-svn: 225483	2015-01-09 00:26:45 +00:00
Philip Reames	567feb98f0	[Refactor] Have getNonLocalPointerDependency take the query instruction Previously, MemoryDependenceAnalysis::getNonLocalPointerDependency was taking a list of properties about the instruction being queried. Since I'm about to need one more property to be passed down through the infrastructure - I need to know a query instruction is non-volatile in an inner helper - fix the interface once and for all. I also added some assertions and behaviour clarifications around volatile and ordered field accesses. At the moment, this is mostly to document expected behaviour. The only non-standard instructions which can currently reach this are atomic, but unordered, loads and stores. Neither ordered nor volatile accesses can reach here. The call in GVN is protected by an isSimple check when it first considers the load. The calls in MemDepPrinter are protected by isUnordered checks. Both utilities also check isVolatile for loads and stores. llvm-svn: 225481	2015-01-09 00:04:22 +00:00
Duncan P. N. Exon Smith	953e1a48f0	Utils: Keep distinct MDNodes distinct in MapMetadata() Create new copies of distinct `MDNode`s instead of following the uniquing `MDNode` logic. Just like self-references (or other cycles), `MapMetadata()` creates a new node. In practice most calls use `RF_NoModuleLevelChanges`, in which case nothing is duplicated anyway. Part of PR22111. llvm-svn: 225476	2015-01-08 22:42:30 +00:00
Duncan P. N. Exon Smith	090a19bd3c	IR: Add 'distinct' MDNodes to bitcode and assembly Propagate whether `MDNode`s are 'distinct' through the other types of IR (assembly and bitcode). This adds the `distinct` keyword to assembly. Currently, no one actually calls `MDNode::getDistinct()`, so these nodes only get created for: - self-references, which are never uniqued, and - nodes whose operands are replaced that hit a uniquing collision. The concept of distinct nodes is still not quite first-class, since distinct-ness doesn't yet survive across `MapMetadata()`. Part of PR22111. llvm-svn: 225474	2015-01-08 22:38:29 +00:00
Hal Finkel	3c0952b072	[PowerPC] Mark all instructions as non-cheap for MachineLICM MachineLICM uses a callback named hasLowDefLatency to determine if an instruction def operand has a 'low' latency. If all relevant operands have a 'low' latency, the instruction is considered too cheap to hoist out of loops even in low-register-pressure situations. On PowerPC cores, both the embedded cores and the others, there is no reason to believe that this is a good choice: all instructions have a cost inside a loop, and hoisting them when not limited by register pressure is a reasonable default. llvm-svn: 225471	2015-01-08 22:11:49 +00:00
Hal Finkel	0709f5160f	[MachineLICM] A command-line option to hoist even cheap instructions Add a command-line option to enable hoisting even cheap instructions (in low-register-pressure situations). This is turned off by default, but has proved useful for testing purposes. llvm-svn: 225470	2015-01-08 22:10:48 +00:00
Duncan P. N. Exon Smith	e90f1165d8	CodeGen: Use handy new-fangled post-increment, NFC Drive-by cleanup; I noticed this when reviewing the patch that became r225466. llvm-svn: 225468	2015-01-08 21:07:55 +00:00
Akira Hatanaka	442b40c2eb	[ARM] Fix a bug in constant island pass that was triggering an assertion. The assert was being triggered when the distance between a constant pool entry and its user exceeded the maximally allowed distance after thumb2 branch shortening. A padding was inserted after a thumb2 branch instruction was shrunk, which caused the user to be out of range. This is wrong as the padding should have been inserted by the layout algorithm so that the distance between two instructions doesn't grow later during thumb2 instruction optimization. This commit fixes the code in ARMConstantIslands::createNewWater to call computeBlockSize and set BasicBlock::Unalign when a branch instruction is inserted to create new water after a basic block. A non-zero Unalign causes the worst-case padding to be inserted when adjustBBOffsetsAfter is called to recompute the basic block offsets. rdar://problem/19130476 llvm-svn: 225467	2015-01-08 20:44:50 +00:00
Duncan P. N. Exon Smith	5914a97af8	CodeGen: Use range-based for loops, NFC Patch by Ramkumar Ramachandra! llvm-svn: 225466	2015-01-08 20:44:33 +00:00
Matt Arsenault	b935d9df4c	Fix fcmp + fabs instcombines when using the intrinsic This was only handling the libcall. This is another example of why only the intrinsic should ever be used when it exists. llvm-svn: 225465	2015-01-08 20:09:34 +00:00
Eric Christopher	90724285a2	Make the TargetMachine in MipsSubtarget a reference rather than a pointer to make unifying code a bit easier. llvm-svn: 225459	2015-01-08 18:18:57 +00:00
Eric Christopher	d8abc3a956	Update include - this class doesn't use the target machine, but only the subtarget. llvm-svn: 225458	2015-01-08 18:18:54 +00:00
Eric Christopher	1933f20aa4	Fix a couple of odd formatting issues. llvm-svn: 225457	2015-01-08 18:18:53 +00:00
Eric Christopher	09455d94bf	This routine is in InstrInfo, there's no need to access it again. llvm-svn: 225456	2015-01-08 18:18:50 +00:00
Ahmed Bougacha	d716121888	[X86] Reflow comment. NFC. llvm-svn: 225455	2015-01-08 17:49:48 +00:00
Rafael Espindola	261d25b940	clang-format. NFC. llvm-svn: 225454	2015-01-08 16:25:01 +00:00
Justin Hibbits	98a532dd8e	Add saving and restoring of r30 to the prologue and epilogue, respectively Summary: The PIC additions didn't update the prologue and epilogue code to save and restore r30 (PIC base register). This does that. Test Plan: Tests updated. Reviewers: hfinkel Reviewed By: hfinkel Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D6876 llvm-svn: 225450	2015-01-08 15:47:19 +00:00
Rafael Espindola	bec6af62b8	Explicitly handle LinkOnceODRAutoHideLinkage. NFC. We already have a test. llvm-svn: 225449	2015-01-08 15:39:50 +00:00
Rafael Espindola	7b4b2dcd0a	Update naming style and clang-format. NFC. llvm-svn: 225448	2015-01-08 15:36:32 +00:00
Kristof Beyls	933de7aa06	Fix large stack alignment codegen for ARM and Thumb2 targets This partially fixes PR13007 (ARM CodeGen fails with large stack alignment): for ARM and Thumb2 targets, but not for Thumb1, as it seems stack alignment for Thumb1 targets hasn't been supported at all. Producing an aligned stack pointer is done by zeroing out the lower bits of the stack pointer. The BIC instruction was used for this. However, the immediate field of the BIC instruction can only encode an immediate that zeroes out up to a maximum of the 8 lower bits. When a larger alignment is requested, a BIC instruction cannot be used; llvm was silently producing incorrect code in this case. This commit fixes code generation for large stack alignments by using the BFC instruction instead, when the BFC instruction is available. When not, it uses 2 instructions: a right shift, followed by a left shift to zero out the lower bits. The lowering of ARM::Int_eh_sjlj_dispatchsetup still has code that unconditionally uses BIC to realign the stack pointer, so it very likely has the same problem. However, I wasn't able to produce a test case for that. This commit adds an assert so that the compiler will fail the assert instead of silently generating wrong code if this is ever reached. llvm-svn: 225446	2015-01-08 15:09:14 +00:00
Tom Stellard	654d669e56	R600/SI: Remove SIISelLowering::legalizeOperands() Its functionality has been replaced by calling SIInstrInfo::legalizeOperands() from SIISelLowering::AdjstInstrPostInstrSelection() and running the SIFoldOperands and SIShrinkInstructions passes. llvm-svn: 225445	2015-01-08 15:08:17 +00:00
Elena Demikhovsky	285fbd551a	Masked Load/Store - fixed a bug in type legalization. llvm-svn: 225441	2015-01-08 12:29:19 +00:00
Michael Kuperstein	698ea3b488	Fix include ordering, NFC. llvm-svn: 225439	2015-01-08 11:59:43 +00:00
Michael Kuperstein	46f7d525c3	[X86] Don't try to generate direct calls to TLS globals The call lowering assumes that if the callee is a global, we want to emit a direct call. This is correct for regular globals, but not for TLS ones. Differential Revision: http://reviews.llvm.org/D6862 llvm-svn: 225438	2015-01-08 11:50:58 +00:00
Michael Kuperstein	8c65e31a5a	Move SPAdj logic from PEI into the targets (NFC) PEI tries to keep track of how much starting or ending a call sequence adjusts the stack pointer by, so that it can resolve frame-index references. Currently, it takes a very simplistic view of how SP adjustments are done - both FrameStartOpcode and FrameDestroyOpcode adjust it exactly by the amount written in its first argument. This view is in fact incorrect for some targets (e.g. due to stack re-alignment, or because it may want to adjust the stack pointer in multiple steps). However, that doesn't cause breakage, because most targets (the only in-tree exception appears to be 32-bit ARM) rely on being able to simplify the call frame pseudo-instructions earlier, so this code is never hit. Moving the computation into TargetInstrInfo allows targets to override the way the adjustment is computed if they need to have a non-zero SPAdj. Differential Revision: http://reviews.llvm.org/D6863 llvm-svn: 225437	2015-01-08 11:04:38 +00:00
Craig Topper	7c10252943	[X86] Don't print 'dword ptr' or 'qword ptr' on the operand to some of the LEA variants in Intel syntax. The memory operand is inherently unsized. llvm-svn: 225432	2015-01-08 07:41:30 +00:00
Adrian Prantl	2561bb8831	Revert "Reapply: Teach SROA how to update debug info for fragmented variables." This reverts commit r225379 while investigating an assertion failure reported by Alexey. llvm-svn: 225424	2015-01-08 02:02:00 +00:00
Quentin Colombet	a799e2e014	[RegAllocGreedy] Introduce a late pass to repair broken hints. A broken hint is a copy where both ends are assigned different colors. When a variable gets evicted in the neighborhood of such copies, it is likely we can reconcile some of them. Context: Copies are inserted during the register allocation via splitting. These split points are required to relax the constraints on the allocation problem. When such a point is inserted, both ends of the copy would not share the same color with respect to the current allocation problem. When variables get evicted, the allocation problem becomes different and some split points may not be required anymore. However, the related variables may already have been colored. This usually shows up in the assembly with a pattern like this: def A ... save A to B def A use A restore A from B ... use B Whereas we could simply have done: def B ... def A use A ... use B Proposed Solution: A variable having a broken hint is marked for late recoloring if and only if selecting a register for it evicts another variable. Indeed, if no eviction happens, it is pointless to look for recoloring opportunities, as it means the situation was the same as the initial allocation problem where we had to break the hint. Finally, when everything has been allocated, we look for recoloring opportunities for all the identified candidates. The recoloring is performed very late to rely on accurate copy cost (all involved variables are allocated). The recoloring is simple, unlike the last chance recoloring. It propagates the color of the broken hint to all its copy-related variables. If the color is available for them, the recoloring uses it; otherwise it gives up on that hint even if a more complex coloring would have worked. The recoloring happens only if it is profitable. The profitability is evaluated using the expected frequency of the copies of the currently recolored variable with a) its current color and b) the target color. If a) is greater than or equal to b), then it is profitable and the recoloring happens. Example: Consider the following example: BB1: a = b = BB2: ... = b = a Let us assume b gets split: BB1: a = b = BB2: c = b ... d = c = d = a Because of how the allocation works, b, c, and d may be assigned different colors. Now, if a gets evicted to make room for c, assuming b and d were assigned to something different than a, we end up with: BB1: a = st a, SpillSlot b = BB2: c = b ... d = c = d e = ld SpillSlot = e It is likely that we can assign the same register for b, c, and d, getting rid of 2 copies. Performance: Both ARM64 and x86_64 show performance improvements of up to 3% for the llvm-testsuite + externals with Os and O3. There are a few regressions too that come from the (in)accuracy of the block frequency estimate. <rdar://problem/18312047> llvm-svn: 225422	2015-01-08 01:16:39 +00:00
Ahmed Bougacha	2b6917b020	[SelectionDAG] Allow targets to specify legality of extloads' result type (in addition to the memory type). The LoadExt legalization handling used to only have one type, the memory type. This forced users to assume that as long as the extload for the memory type was declared legal, and the result type was legal, the whole extload was legal. However, this isn't always the case. For instance, on X86, with AVX, this is legal: v4i32 load, zext from v4i8 but this isn't: v4i64 load, zext from v4i8 Whereas v4i64 is (arguably) legal, even without AVX2. Note that the same thing was done a while ago for truncstores (r46140), but I assume no one needed it yet for extloads, so here we go. Calls to getLoadExtAction were changed to add the value type, found manually in the surrounding code. Calls to setLoadExtAction were mechanically changed, by wrapping the call in a loop, to match previous behavior. The loop iterates over the MVT subrange corresponding to the memory type (FP vectors, etc...). I also pulled neighboring setTruncStoreActions into some of the loops; those shouldn't make a difference, as the additional types are illegal. (e.g., i128->i1 truncstores on PPC.) No functional change intended. Differential Revision: http://reviews.llvm.org/D6532 llvm-svn: 225421	2015-01-08 00:51:32 +00:00
Nick Lewycky	c99cc19650	Remove empty statement. No functionality change. llvm-svn: 225420	2015-01-08 00:47:03 +00:00
Matthias Braun	ada0adf396	X86: VZeroUpperInserter: shortcut should not trigger if we have any function live-ins. llvm-svn: 225419	2015-01-08 00:33:48 +00:00
Matthias Braun	9d7bc0874c	RegisterCoalescer: Do not remove IMPLICIT_DEFS if they are required for subranges. The register coalescer used to remove implicit_defs when they are covered by the main range anyway. With subreg liveness tracking we can't do that anymore in places where the IMPLICIT_DEF is required as the beginning of a subregister live range. llvm-svn: 225416	2015-01-08 00:21:23 +00:00
Matthias Braun	d55e6ddacf	RegisterCoalescer: Fix valuesIdentical() in some subrange merge cases. I got confused and assumed SrcIdx/DstIdx of the CoalescerPair is a subregister index in SrcReg/DstReg, but they are actually subregister indices of the coalesced register that get you back to SrcReg/DstReg when applied. Fixed the bug, improved comments and simplified code accordingly. Testcase by Tom Stellard! llvm-svn: 225415	2015-01-07 23:58:38 +00:00
Matthias Braun	4fe686af00	LiveInterval: Implement feedback by Quentin Colombet. llvm-svn: 225413	2015-01-07 23:35:11 +00:00
Philip Reames	76ebd15437	[GC] improve testing around gc.relocate and fix a test Patch by: Ramkumar Ramachandra <artagnon@gmail.com> "This patch started out as an exploration of gc.relocate, and an attempt to write a simple test in call-lowering. I then noticed that the arguments of gc.relocate were not checked fully, so I went in and fixed a few things. Finally, the most important outcome of this patch is that my new error handling code caught a bug in a callsite in stackmap-format." Differential Revision: http://reviews.llvm.org/D6824 llvm-svn: 225412	2015-01-07 22:48:01 +00:00
Tom Stellard	0599297cb4	R600/SI: Commute instructions to enable more folding opportunities llvm-svn: 225410	2015-01-07 22:44:19 +00:00
Duncan P. N. Exon Smith	5e5b85098d	IR: Add MDNode::getDistinct() Allow distinct `MDNode`s to be explicitly created. There's no way (yet) of representing their distinctness in assembly/bitcode, however, so this still isn't first-class. Part of PR22111. llvm-svn: 225406	2015-01-07 22:24:46 +00:00
Tom Stellard	26cc18df43	R600/SI: Only fold immediates that have one use Folding the same immediate into multiple instructions will increase program size, which can hurt performance. llvm-svn: 225405	2015-01-07 22:18:27 +00:00


75765 Commits