llvm-project

Commit Graph

Author	SHA1	Message	Date
Reid Kleckner	a9f4cc9510	[codeview] Add new directives to record inlined call site line info Summary: Previously we were trying to represent this with the "contains" list of the .cv_inline_linetable directive, which was not enough information. Now we directly represent the chain of inlined call sites, so we know what location to emit when we encounter a .cv_loc directive of an inner inlined call site while emitting the line table of an outer function or inlined call site. Fixes PR29146. Also fixes PR29147, where we would crash when .cv_loc directives crossed sections. Now we write down the section of the first .cv_loc directive, and emit an error if any other .cv_loc directive for that function is in a different section. Also fixes issues with discontiguous inlined source locations, like in this example: volatile int unlikely_cond = 0; extern void __declspec(noreturn) abort(); __forceinline void f() { if (!unlikely_cond) abort(); } int main() { unlikely_cond = 0; f(); unlikely_cond = 0; } Previously our tables gave bad location information for the 'abort' call, and the debugger wouldn't snow the inlined stack frame for 'f'. It is important to emit good line tables for this code pattern, because it comes up whenever an asan bug occurs in an inlined function. The __asan_report* stubs are generally placed after the normal function epilogue, leading to discontiguous regions of inlined code. Reviewers: majnemer, amccarth Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D24014 llvm-svn: 280822	2016-09-07 16:15:31 +00:00
Chad Rosier	90bcb9176e	[LoopInterchange] Improve debug output. NFC. llvm-svn: 280820	2016-09-07 16:07:17 +00:00
Chad Rosier	f5814f56b8	[LoopInterchange] Improve debug output. NFC. llvm-svn: 280819	2016-09-07 15:56:59 +00:00
Justin Lebar	3a5f40c191	[LSV] Use the original loads' names for the extractelement instructions. Summary: LSV replaces multiple adjacent loads with one vectorized load and a bunch of extractelement instructions. This patch makes the extractelement instructions' names match those of the original loads, for (hopefully) improved readability. Reviewers: asbirlea, tstellarAMD Subscribers: arsenm, mzolotukhin Differential Revision: https://reviews.llvm.org/D23748 llvm-svn: 280818	2016-09-07 15:49:48 +00:00
Sanjay Patel	0bf9a99c7d	[x86] move combines of 'select of 2 constants' to its own function; NFC There are missing folds here and possibly folds that could be made generic. llvm-svn: 280817	2016-09-07 15:47:34 +00:00
Simon Pilgrim	d311311beb	Fix typo in test - it should be masking bits0-15 not bit16 llvm-svn: 280816	2016-09-07 15:19:07 +00:00
Andrea Di Biagio	bdd576dbb0	Regenerate vector bitcast folding tests using update_test_checks.py. Two tests have been merged together, regenerated and then moved to a more appropriate directory. No functional change. llvm-svn: 280814	2016-09-07 14:50:07 +00:00
Simon Pilgrim	32cfa5ba83	[X86][SSE] Added or combine tests for known bits of vectors Part of the yak shaving for D24253 llvm-svn: 280813	2016-09-07 14:49:50 +00:00
Simon Pilgrim	65cdc058b6	[X86][SSE] Added and+or+zext combine tests for known bits of vectors Part of the yak shaving for D24253 llvm-svn: 280810	2016-09-07 14:00:52 +00:00
Simon Pilgrim	2415144425	[X86][SSE] Added and+or combine tests currently failing with vectors (and (or x, C), D) -> D if (C & D) == D Part of the yak shaving for D24253 llvm-svn: 280809	2016-09-07 13:40:03 +00:00
Pablo Barrio	fc752bb70a	[ARM] Lower UDIV+UREM to UDIV+MLS (and the same for SREM) Summary: This saves a library call to __aeabi_uidivmod. However, the processor must feature hardware division in order to benefit from the transformation. Reviewers: scott-0, jmolloy, compnerd, rengolin Subscribers: t.p.northover, compnerd, aemerson, rengolin, samparker, llvm-commits Differential Revision: https://reviews.llvm.org/D24133 llvm-svn: 280808	2016-09-07 12:49:15 +00:00
Andrea Di Biagio	f3fd316223	[InstCombine][SSE4a] Fix assertion failure in the insertq/insertqi combining logic. This fixes a similar issue to the one already fixed by r280804 (revieved in D24256). Revision 280804 fixed the problem with unsafe dyn_casts in the extrq/extrqi combining logic. However, it turns out that even the insertq/insertqi logic was affected by the same problem. llvm-svn: 280807	2016-09-07 12:47:53 +00:00
Andrea Di Biagio	8df5b9cf48	[InstCombine][SSE4a] Fix assertion failure caused by unsafe dyn_casts on the operands of extrq/extrqi intrinsic calls. This patch fixes an assertion failure caused by unsafe dynamic casts on the constant operands of sse4a intrinsic calls to extrq/extrqi The combine logic that simplifies sse4a extrq/extrqi intrinsic calls currently checks if the input operands are constants. Internally, that logic relies on dyn_casts of values returned by calls to method Constant::getAggregateElement. However, method getAggregateElemet may return nullptr if the constant element cannot be retrieved. So, all the dyn_casts can potentially fail. This is what happens for example if a constexpr value is passed in input to an extrq/extrqi intrinsic call. This patch fixes the problem by using a dyn_cast_or_null (instead of a simple dyn_cast) on the result of each call to Constant::getAggregateElement. Added reproducible test cases to x86-sse4a.ll. Differential Revision: https://reviews.llvm.org/D24256 llvm-svn: 280804	2016-09-07 12:03:03 +00:00
Renato Golin	c69e0818e0	Revert "[EfficiencySanitizer] Adds shadow memory parameters for 40-bit virtual memory address." This reverts commit r280796, as it broke the AArch64 bots for no reason. The tests were passing and we should try to keep them passing, so a proper review should make that happen. llvm-svn: 280802	2016-09-07 10:54:42 +00:00
Vasileios Kalintiris	1ed49fd384	[mips] Disable the TImode shift libcalls for 32-bit targets. Summary: The o32 ABI doesn't not support the TImode helpers. For the time being, disable just the shift libcalls as they break recursive builds on MIPS. Reviewers: sdardis Subscribers: llvm-commits, sdardis Differential Revision: https://reviews.llvm.org/D24259 llvm-svn: 280798	2016-09-07 10:01:18 +00:00
Sagar Thakur	69c78d8db7	[EfficiencySanitizer] Adds shadow memory parameters for 40-bit virtual memory address. Adding 40-bit shadow memory parameters because MIPS64 uses 40-bit virtual memory addresses. Reviewed by bruening Differential: D23801 llvm-svn: 280796	2016-09-07 09:45:37 +00:00
James Molloy	6c009c1c85	[SimplifyCFG] Followup fix to r280790 In failure cases it's not guaranteed that the PHI we're inspecting is actually in the successor block! In this case we need to bail out early, and never query getIncomingValueForBlock() as that will cause an assert. llvm-svn: 280794	2016-09-07 09:01:22 +00:00
James Molloy	ec905a62ae	[SimplifyCFG] Update workaround for PR30188 to also include loads I should have realised this the first time around, but if we're avoiding sinking stores where the operands come from allocas so they don't create selects, we also have to do the same for loads because SROA will be just as defective looking at loads of selected addresses as stores. Fixes PR30188 (again). llvm-svn: 280792	2016-09-07 08:40:20 +00:00
Diana Picus	42431e7ce7	[CMake] Use CMake's default RPATH for the unit tests In the top-level CMakeLists.txt, we set CMAKE_BUILD_WITH_INSTALL_RPATH to ON, and then for the unit tests we set it to <test>/../../lib. This works for tests that live in unittest/<whatever>, but not for those that live in subdirectories e.g. unittest/Transforms/IPO or unittest/ExecutionEngine/Orc. When building with BUILD_SHARED_LIBRARIES, such tests don't manage to find their libraries. Since the tests are run from the build directory, it makes sense to set their RPATH for the build tree, rather than the install tree. This is the default in CMake since 2.6, so all we have to do is set CMAKE_BUILD_WITH_INSTALL_RPATH to OFF for the unit tests. llvm-svn: 280791	2016-09-07 08:37:15 +00:00
James Molloy	bf1837d9c9	[SimplifyCFG] Check PHI uses more accurately PR30292 showed a case where our PHI checking wasn't correct. We were checking that all values were used by the same PHI before deciding to sink, but we weren't checking that the incoming values for that PHI were what we expected. As a result, we had to bail out after block splitting which caused us to never reach a steady state in SimplifyCFG. Fixes PR30292. llvm-svn: 280790	2016-09-07 08:15:54 +00:00
Hal Finkel	42c83f131e	[PowerPC] Fix address-offset folding for plain addi When folding an addi into a memory access that can take an immediate offset, we were implicitly assuming that the existing offset was zero. This was incorrect. If we're dealing with an addi with a plain constant, we can add it to the existing offset (assuming that doesn't overflow the immediate, etc.), but if we have anything else (i.e. something that will become a relocation expression), we'll go back to requiring the existing immediate offset to be zero (because we don't know what the requirements on that relocation expression might be - e.g. maybe it is paired with some addis in some relevant way). On the other hand, when dealing with a plain addi with a regular constant immediate, the alignment restrictions (from the TOC base pointer, etc.) are irrelevant. I've added the test case from PR30280, which demonstrated the bug, but also demonstrates a missed optimization opportunity (i.e. we don't need the memory accesses at all). Fixes PR30280. llvm-svn: 280789	2016-09-07 07:36:11 +00:00
Elena Demikhovsky	f0ddd1b8b5	AVX512F: FMA intrinsic + FNEG - sequence optimization The previous commit (r280368 - https://reviews.llvm.org/D23313) does not cover AVX-512F, KNL set. FNEG(x) operation is lowered to (bitcast (vpxor (bitcast x), (bitcast constfp(0x80000000))). It happens because FP XOR is not supported for 512-bit data types on KNL and we use integer XOR instead. I added pattern match for integer XOR. Differential Revision: https://reviews.llvm.org/D24221 llvm-svn: 280785	2016-09-07 06:54:28 +00:00
Matt Arsenault	479ba3aac0	AMDGPU: Make some scalar instructions commutable llvm-svn: 280784	2016-09-07 06:25:55 +00:00
Matt Arsenault	6cda10c950	Remove unnecessary call to getAllocatableRegClass This reapplies r252565 and r252674, effectively reverting r252956. This allows VS_32/VS_64 to be unallocatable like they should be. llvm-svn: 280783	2016-09-07 06:16:45 +00:00
Craig Topper	0e473955a0	[X86] Add hasSideEffects=0 to some instructions. llvm-svn: 280782	2016-09-07 04:46:15 +00:00
Craig Topper	b880ad3a71	[AVX-512] Add support for commuting masked instructions in findCommutedOpIndices. The default implementation doesn't skip the mask input or the preserved input. llvm-svn: 280781	2016-09-07 04:46:11 +00:00
Saleem Abdulrasool	a7ade33d16	Revert "CodeGen: ensure that libcalls are always AAPCS CC" This reverts SVN r280683. Revert until I figure out why this is breaking lli tests. llvm-svn: 280778	2016-09-07 03:17:19 +00:00
Nick Lewycky	edd0a7023f	Fix typo in comment, NFC llvm-svn: 280774	2016-09-07 01:49:41 +00:00
Davide Italiano	24c29b1426	[LTO] Rename variables to be more explicative. Thanks to Mehdi for the suggestion! llvm-svn: 280772	2016-09-07 01:08:31 +00:00
Davide Italiano	c5d0a5cef1	[opt] Remove an unused argument to runPassPipeline(). I have plans to use this API also in libLTO (and maybe lld). llvm-svn: 280770	2016-09-07 00:48:47 +00:00
Zachary Turner	c998571493	Re-add "Make FieldList records print as a YAML sequence" This was originally submitted in r280549, and reverted in r280577 due to breaking one MSVC buildbot. The issue is that MSVC 2013 doesn't synthesize move constructors. So even though i was writing std::move(A) it was copying it, leading to a bogus ArrayRef. The solution here is to simply remove the std::vector<> from the type, since it is unused and unnecessary. This way the ArrayRef continues to point into the original memory backing the CVType. llvm-svn: 280769	2016-09-06 23:45:47 +00:00
Hal Finkel	8ca2ed22b2	[DAGCombine] More fixups to SETCC legality checking (visitANDLike/visitORLike) I might have called this "r246507, the sequel". It fixes the same issue, as the issue has cropped up in a few more places. The underlying problem is that isSetCCEquivalent can pick up select_cc nodes with a result type that is not legal for a setcc node to have, and if we use that type to create new setcc nodes, nothing fixes that (and so we've violated the contract that the infrastructure has with the backend regarding setcc node types). Fixes PR30276. For convenience, here's the commit message from r246507, which explains the problem is greater detail: [DAGCombine] Fixup SETCC legality checking SETCC is one of those special node types for which operation actions (legality, etc.) is keyed off of an operand type, not the node's value type. This makes sense because the value type of a legal SETCC node is determined by its operands' value type (via the TLI function getSetCCResultType). When the SDAGBuilder creates SETCC nodes, it either creates them with an MVT::i1 value type, or directly with the value type provided by TLI.getSetCCResultType. The first problem being fixed here is that DAGCombine had several places querying TLI.isOperationLegal on SETCC, but providing the return of getSetCCResultType, instead of the operand type directly. This does not mean what the author thought, and "luckily", most in-tree targets have SETCC with Custom lowering, instead of marking them Legal, so these checks return false anyway. The second problem being fixed here is that two of the DAGCombines could create SETCC nodes with arbitrary (integer) value types; specifically, those that would simplify: (setcc a, b, op1) and\|or (setcc a, b, op2) -> setcc a, b, op3 (which is possible for some combinations of (op1, op2)) If the operands of the and\|or node are actual setcc nodes, then this is not an issue (because the and\|or must share the same type), but, the relevant code in DAGCombiner::visitANDLike and DAGCombiner::visitORLike actually calls DAGCombiner::isSetCCEquivalent on each operand, and that function will recognise setcc-like select_cc nodes with other return types. And, thus, when creating new SETCC nodes, we need to be careful to respect the value-type constraint. This is even true before type legalization, because it is quite possible for the SELECT_CC node to have a legal type that does not happen to match the corresponding TLI.getSetCCResultType type. To be explicit, there is nothing that later fixes the value types of SETCC nodes (if the type is legal, but does not happen to match TLI.getSetCCResultType). Creating SETCCs with an MVT::i1 value type seems to work only because, either MVT::i1 is not legal, or it is what TLI.getSetCCResultType returns if it is legal. Fixing that is a larger change, however. For the time being, restrict the relevant transformations to produce only SETCC nodes with a value type matching TLI.getSetCCResultType (or MVT::i1 prior to type legalization). Fixes PR24636. llvm-svn: 280767	2016-09-06 23:02:23 +00:00
Vedant Kumar	5053b11edc	[llvm-cov] Use colors consistently in the summary Use the same color for counts and percentages. There doesn't seem to be a reason for them to be different, and the summary looks more consistent this way. llvm-svn: 280765	2016-09-06 22:46:00 +00:00
Vedant Kumar	702bb9d9b2	[llvm-cov] Clean up the summary class, delete dead code (NFC) llvm-svn: 280764	2016-09-06 22:45:57 +00:00
Dehao Chen	3857f8f0ac	Explicitly require DominatorTreeAnalysis pass for instsimplify pass. Summary: DominatorTreeAnalysis is always required by instsimplify. Reviewers: danielcdh, davidxl Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D24173 llvm-svn: 280760	2016-09-06 22:17:16 +00:00
Ying Yi	24e91bd05f	[llvm-cov] Add the project summary to the text coverage report for each source file. This patch is a spin-off from https://reviews.llvm.org/D23922. It extends the text view to preserve the same feature as the html view. Differential Revision: https://reviews.llvm.org/D24241 llvm-svn: 280756	2016-09-06 21:41:38 +00:00
Rafael Espindola	d5aa733769	Avoid using alignas and constexpr. This requires removing the custom allocator, since Demangle cannot depend on Support and so cannot use Compiler.h. llvm-svn: 280750	2016-09-06 20:36:24 +00:00
Konstantin Zhuravlyov	864718c666	[AMDGPU] Wave and register controls - Add missing test llvm-svn: 280749	2016-09-06 20:29:10 +00:00
Chris Bieneman	76864f9c77	[CMake] Cleanup LLVM_OPTIMIZED_TABLEGEN This cleanup removes the need for the native support library to have its own target. That target was only needed because makefile builds were tripping over each other if two tablegen targets were building at the same time. This causes problems because the parallel make invocations through CMake can't communicate with each other. This is fixed by invoking make directly instead of through CMake which is how we handle this in External Project invocations. The other part of the cleanup is to mark the custom commands as USES_TERMINAL. This is a bit of a hack, but we need to ensure that Ninja generators don't invoke multiple tablegen targets in the same build dir in parallel, because that too would be bad. Marking as USES_TERMINAL does have some downside for Ninja because it results in decreased parallelism, but correct builds are worth the minor loss and LLVM_OPTIMZIED_TABLEGEN is such a huge win, it is worth it. llvm-svn: 280748	2016-09-06 20:27:07 +00:00
Konstantin Zhuravlyov	1d65026ca6	[AMDGPU] Wave and register controls - Implemented amdgpu-flat-work-group-size attribute - Implemented amdgpu-num-active-waves-per-eu attribute - Implemented amdgpu-num-sgpr attribute - Implemented amdgpu-num-vgpr attribute - Dynamic LDS constraints are in a separate patch Patch by Tom Stellard and Konstantin Zhuravlyov Differential Revision: https://reviews.llvm.org/D21562 llvm-svn: 280747	2016-09-06 20:22:28 +00:00
Rafael Espindola	ec73f5dacf	Try to fix a circular dependency in the modules build. llvm-svn: 280746	2016-09-06 20:16:19 +00:00
Tom Stellard	2add8a1140	AMDGPU/SI: Teach SIInstrInfo::FoldImmediate() to fold immediates into copies Summary: I put this code here, because I want to re-use it in a few other places. This supersedes some of the immediate folding code we have in SIFoldOperands. I think the peephole optimizers is probably a better place for folding immediates into copies, since it does some register coalescing in the same time. This will also make it easier to transition SIFoldOperands into a smarter pass, where it looks at all uses of instruction at once to determine the optimal way to fold operands. Right now, the pass just considers one operand at a time. Reviewers: arsenm Subscribers: wdng, nhaehnle, arsenm, llvm-commits, kzhuravl Differential Revision: https://reviews.llvm.org/D23402 llvm-svn: 280744	2016-09-06 20:00:26 +00:00
Wei Ding	5e832e866e	AMDGPU : Add XNACK feature to GPUs that support it. Differential Revision: http://reviews.llvm.org/D24276 llvm-svn: 280742	2016-09-06 19:55:17 +00:00
Reid Kleckner	b2881f1fbe	Fix ItaniumDemangle.cpp build with MSVC 2013 llvm-svn: 280740	2016-09-06 19:39:56 +00:00
Ying Yi	d36b47c481	[llvm-cov] Add the "Go to first unexecuted line" feature. This patch provides easy navigation to find the zero count lines, especially useful when the source file is very large. Differential Revision: https://reviews.llvm.org/D23277 llvm-svn: 280739	2016-09-06 19:31:18 +00:00
Evandro Menezes	405c90e6cc	[AArch64] Adjust the scheduling model for Exynos M1. Further refine the model for branches. llvm-svn: 280736	2016-09-06 19:22:29 +00:00
Evandro Menezes	77e6b5d4e0	[AArch64] Adjust the scheduling model for Exynos M1. Further refine the model for stores. llvm-svn: 280735	2016-09-06 19:22:27 +00:00
Evandro Menezes	199cad4f17	[AArch64] Adjust the scheduling model for Exynos M1. Further refine the model for loads. llvm-svn: 280734	2016-09-06 19:22:19 +00:00
Rafael Espindola	b940b66c60	Add an c++ itanium demangler to llvm. This adds a copy of the demangler in libcxxabi. The code also has no dependencies on anything else in LLVM. To enforce that I added it as another library. That way a BUILD_SHARED_LIBS will fail if anyone adds an use of StringRef for example. The no llvm dependency combined with the fact that this has to build on linux, OS X and Windows required a few changes to the code. In particular: No constexpr. No alignas On OS X at least this library has only one global symbol: __ZN4llvm16itanium_demangleEPKcPcPmPi My current plan is: Commit something like this Change lld to use it Change lldb to use it as the fallback Add a few #ifdefs so that exactly the same file can be used in libcxxabi to export abi::__cxa_demangle. Once the fast demangler in lldb can handle any names this implementation can be replaced with it and we will have the one true demangler. llvm-svn: 280732	2016-09-06 19:16:48 +00:00
Sanjay Patel	4e463b4a2c	fix formatting; NFC llvm-svn: 280727	2016-09-06 18:16:31 +00:00
Davide Italiano	5715012b9e	[MCTargetDesc] Delete dead code. Found by GCC7 -Wunused-function. Also unbreak newer gcc build with -Werror. llvm-svn: 280726	2016-09-06 18:02:09 +00:00
Victor Leschuk	a2cd413129	Fix comment formatting for DebugInfoFlags.def llvm-svn: 280722	2016-09-06 17:22:48 +00:00
Justin Bogner	1c03915500	bugpoint: Return Errors instead of passing around strings This replaces the threading of `std::string &Error` through all of these APIs with checked Error returns instead. There are very few places here that actually emit any errors right now, but threading the APIs through will allow us to replace a bunch of exit(1)'s that are scattered through this code with proper error handling. This is more or less NFC, but does move around where a couple of error messages are printed out. llvm-svn: 280720	2016-09-06 17:18:22 +00:00
Krzysztof Parzyszek	7c9b012629	[RDF] Ignore undef use operands llvm-svn: 280717	2016-09-06 17:03:13 +00:00
Leny Kholodov	40c6235b79	Formatting with clang-format patch r280700 llvm-svn: 280716	2016-09-06 17:03:02 +00:00
Simon Pilgrim	1b4462b7c1	[SelectionDAG] Simplify extract_subvector( insert_subvector ( Vec, In, Idx ), Idx ) -> In If we are extracting a subvector that has just been inserted then we should just use the original inserted subvector. This has come up in certain several x86 shuffle lowering cases where we are crossing 128-bit lanes. Differential Revision: https://reviews.llvm.org/D24254 llvm-svn: 280715	2016-09-06 16:42:05 +00:00
Adam Nemet	c520822dbf	[JumpThreading] Only write back branch-weight MDs for blocks that originally had PGO info Currently the pass updates branch weights in the IR if the function has any PGO info (entry frequency is set). However we could still have regions of the CFG that does not have branch weights collected (e.g. a cold region). In this case we'd use static estimates. Since static estimates for branches are determined independently, they are inconsistent. Updating them can "randomly" inflate block frequencies. I've run into this in a completely cold loop of h264ref from SPEC. -Rpass-with-hotness showed the loop to be completely cold during inlining (before JT) but completely hot during vectorization (after JT). The new testcase demonstrate the problem. We check array elements against 1, 2 and 3 in a loop. The check against 3 is the loop-exiting check. The block names should be self-explanatory. In this example, jump threading incorrectly updates the weight of the loop-exiting branch to 0, drastically inflating the frequency of the loop (in the range of billions). There is no run-time profile info for edges inside the loop, so branch probabilities are estimated. These are the resulting branch and block frequencies for the loop body: check_1 (16) (8) / \| eq_1 \| (8) \ \| check_2 (16) (8) / \| eq_2 \| (8) \ \| check_3 (16) (1) / \| (loop exit) \| (15) \| (back edge) First we thread eq_1 -> check_2 to check_3. Frequencies are updated to remove the frequency of eq_1 from check_2 and then from the false edge leaving check_2. Changed frequencies are highlighted with * : check_1 (16) (8) / \| eq_1~ \| (8) / \| / check_2 (8) / (8) / \| \ eq_2 \| (0) \ \ \| ` --- check_3 (16) (1) / \| (loop exit) \| (15) \| (back edge) Next we thread eq_1 -> check_3 and eq_2 -> check_3 to check_1 as new back edges. Frequencies are updated to remove the frequency of eq_1 and eq_3 from check_3 and then the false edge leaving check_3 (changed frequencies are highlighted with ): check_1 (16) (8) / \| eq_1~ \| (8) / \| / check_2 (8) / (8) / \| /-- eq_2~ \| (0) (back edge) \| check_3 (0) (0) / \| (loop exit) \| (0*) \| (back edge) As a result, the loop exit edge ends up with 0 frequency which in turn makes the loop header to have maximum frequency. There are a few potential problems here: 1. The profile data seems odd. There is a single profile sample of the loop being entered. On the other hand, there are no weights inside the loop. 2. Based on static estimation we shouldn't set edges to "extreme" values, i.e. extremely likely or unlikely. 3. We shouldn't create profile metadata that is calculated from static estimation. I am not sure what policy is but it seems to make sense to treat profile metadata as something that is known to originate from profiling. Estimated probabilities should only be reflected in BPI/BFI. Any one of these would probably fix the immediate problem. I went for 3 because I think it's a good policy to have and added a FIXME about 2. Differential Revision: https://reviews.llvm.org/D24118 llvm-svn: 280713	2016-09-06 16:08:33 +00:00
Leny Kholodov	dabff7d8cd	Fix for Bindings/Go/go.test after patch r280700 llvm-svn: 280711	2016-09-06 15:03:54 +00:00
Chris Dewhurst	92cac9322d	[Sparc][Leon] Corrected supported atomics size for processors supporting Leon CASA instruction back to 32 bits. This was erroneously checked-in for 64 bits while trying to find if there was a way to get 64 bit atomicity in Leon processors. There is not and this change should not have been checked-in. There is no unit test for this as the existing unit tests test for behaviour to 32 bits, which was the original intention of the code. llvm-svn: 280710	2016-09-06 14:41:09 +00:00
Simon Dardis	b432a3ed7e	[mips] Tighten FastISel restrictions LLVM PR/29052 highlighted that FastISel for MIPS attempted to lower arguments assuming that it was using the paired 32bit registers to perform operations for f64. This mode of operation is not supported for MIPSR6. This patch resolves the reported issue by adding additional checks for unsupported floating point unit configuration. Thanks to mike.k for reporting this issue! Reviewers: seanbruno, vkalintiris Differential Review: https://reviews.llvm.org/D23795 llvm-svn: 280706	2016-09-06 12:36:24 +00:00
Krzysztof Parzyszek	020ec299bf	[PPC] Claim stack frame before storing into it, if no red zone is present Unlike PPC64, PPC32/SVRV4 does not have red zone. In the absence of it there is no guarantee that this part of the stack will not be modified by any interrupt. To avoid this, make sure to claim the stack frame first before storing into it. This fixes https://llvm.org/bugs/show_bug.cgi?id=26519. Differential Revision: https://reviews.llvm.org/D24093 llvm-svn: 280705	2016-09-06 12:30:00 +00:00
Leny Kholodov	5fcc4185f5	DebugInfo: use strongly typed enum for debug info flags Use ADT/BitmaskEnum for DINode::DIFlags for the following purposes: Get rid of unsigned int for flags to avoid problems on platforms with sizeof(int) < 4 Flags are now strongly typed Patch by: Victor Leschuk <vleschuk@gmail.com> Differential Revision: https://reviews.llvm.org/D23766 llvm-svn: 280700	2016-09-06 10:46:28 +00:00
Silviu Baranga	0b7c4af359	[RegisterScavenger] Remove aliasing registers of operands from the candidate set Summary: In addition to not including the register operand of the current instruction also don't include any aliasing registers. We can't consider these as candidates because using them will clobber the corresponding register operand of the current instruction. This change doesn't include a test case and it would probably be difficult to produce a stable one since the bug depends on the results of register allocation. Reviewers: MatzeB, qcolombet, hfinkel Subscribers: hfinkel, llvm-commits Differential Revision: https://reviews.llvm.org/D24130 llvm-svn: 280698	2016-09-06 10:10:21 +00:00
Craig Topper	4fa3b50fc3	[AVX-512] Fix masked VPERMI2PS isel when the index comes from a bitcast. We need to bitcast the index operand to a floating point type so that it matches the result type. If not then the passthru part of the DAG will be a bitcast from the index's original type to the destination type. This makes it very difficult to match. The other option would be to add 5 sets of patterns for every other possible type. llvm-svn: 280696	2016-09-06 06:56:59 +00:00
Craig Topper	cf9f1b8dfa	[AVX-512] Add a test case to show that we don't select masked vpermi2ps when the index operand comes from a bitcast. It doesn't work because we're looking for a bitcast from the v4i32 index operand to v4f32 for the passthru part of the DAG. But since the index is bitcasted from v2i64 and bitcasts fold, we actually have a bitcast from v2i64 to v4f32 in the passthru part of the DAG. Taken from optimized output from clang's test case. llvm-svn: 280695	2016-09-06 05:45:27 +00:00
Craig Topper	43fbd840dd	[X86] Remove unused encoding from IntrinsicType enum. llvm-svn: 280694	2016-09-06 05:45:24 +00:00
Craig Topper	a0055d315d	[X86] Fix indentation. NFC llvm-svn: 280693	2016-09-06 05:45:21 +00:00
Justin Bogner	24dac6afe2	Revert "bugpoint: Stop threading errors through APIs that never fail" This isn't the right thing to do - it turns out a number of the APIs that "never fail" just exit(1) if something bad happens. We can and should thread Error through this instead. That diff will make more sense with this reverted. Sorry for the noise. This reverts r280690 llvm-svn: 280691	2016-09-06 04:45:37 +00:00
Justin Bogner	46b1a9a70c	bugpoint: Stop threading errors through APIs that never fail This simplifies ListReducer and most of its subclasses by removing the std::string &Error that was threaded through all of them but almost never used. If we end up needing error handling in more places here we can reinstate it using llvm::Error instead of these unwieldy strings. The 2 cases (out of 12) that actually can hit the error cases are a little bit awkward now, but those will clean up as I refactor this API further. llvm-svn: 280690	2016-09-06 04:04:13 +00:00
Saleem Abdulrasool	bfa25bd1ac	ARM: workaround bundled operation predication This is a Windows ARM specific issue. If the code path in the if conversion ends up using a relocation which will form a IMAGE_REL_ARM_MOV32T, we end up with a bundle to ensure that the mov.w/mov.t pair is not split up. This is normally fine, however, if the branch is also predicated, then we end up trying to predicate the bundle. For now, report a bundle as being unpredicatable. Although this is false, this would trigger a failure case previously anyways, so this is no worse. That is, there should not be any code which would previously have been if converted and predicated which would not be now. Under certain circumstances, it may be possible to "predicate the bundle". This would require scanning all bundle instructions, and ensure that the bundle contains only predicatable instructions, and converting the bundle into an IT block sequence. If the bundle is larger than the maximal IT block length (4 instructions), it would require materializing multiple IT blocks from the single bundle. llvm-svn: 280689	2016-09-06 04:00:12 +00:00
Mehdi Amini	3821b53b84	Revert "DebugInfo: use strongly typed enum for debug info flags" This reverts commit r280686, bots are broken. llvm-svn: 280688	2016-09-06 03:26:37 +00:00
Mehdi Amini	767e1457d8	[LTO] Constify (NFC) llvm-svn: 280687	2016-09-06 03:23:45 +00:00
Mehdi Amini	356d6b636b	DebugInfo: use strongly typed enum for debug info flags Use ADT/BitmaskEnum for DINode::DIFlags for the following purposes: * Get rid of unsigned int for flags to avoid problems on platforms with sizeof(int) < 4 * Flags are now strongly typed Patch by: Victor Leschuk <vleschuk@gmail.com> Differential Revision: https://reviews.llvm.org/D23766 llvm-svn: 280686	2016-09-06 03:14:06 +00:00
Mehdi Amini	ac00212f16	Fix DensetSet::insert_as() for MSVC2015 (NFC) The latest MSVC update apparently resolve the call from the const ref variant to itself, leading to an infinite recursion. It is not clear to me why the r-value overload is not selected. `ValueT` is a pointer type, and the functional-style cast in the call `insert_as(ValueT(V), LookupKey);` should result in a r-value ref. A bug in MSVC? Differential Revision: https://reviews.llvm.org/D23956 llvm-svn: 280685	2016-09-06 03:03:15 +00:00
Craig Topper	62d0a5e7d3	[AVX-512] Fix v8i64 shift by immediate lowering on 32-bit targets. llvm-svn: 280684	2016-09-06 00:31:10 +00:00
Saleem Abdulrasool	a6519b1d54	CodeGen: ensure that libcalls are always AAPCS CC All of the builtins are designed to be invoked with ARM AAPCS CC even on ARM AAPCS VFP CC hosts. Tweak the default initialisation to ARM AAPCS CC rather than C CC for ARM/thumb targets. The changes to the tests are necessary to ensure that the calling convention for the lowered library calls are honoured. Furthermore, these adjustments cause certain branch invocations to change to branch-and-link since the returned value needs to be moved across registers (d0 -> r0, r1). llvm-svn: 280683	2016-09-06 00:28:43 +00:00
Craig Topper	dfc4fc9f02	[AVX-512] Teach fastisel load/store handling to use EVEX encoded instructions for 128/256-bit vectors and scalar single/double. Still need to fix the register classes to allow the extended range of registers. llvm-svn: 280682	2016-09-05 23:58:40 +00:00
Craig Topper	70e1348031	[X86] Update fast-isel store test to have more 256 and 512-bit test cases. Add command lines for AVX and AVX512 feature sets. llvm-svn: 280681	2016-09-05 23:58:37 +00:00
Craig Topper	f54ebca2dd	[X86] Update fast-isel vector load test to have more 256 and 512-bit test cases. Add a command line for SKX features too. llvm-svn: 280680	2016-09-05 23:58:32 +00:00
Sanjay Patel	e341c919b0	fix FileCheck variables for test added with r280677 The script (utils/update_test_checks.py) seems to have problems with variable names that start with the same string. llvm-svn: 280679	2016-09-05 23:49:32 +00:00
Gor Nishanov	ccabaca273	[Coroutines] Part12: Handle alloca address-taken Summary: Move early uses of spilled variables after CoroBegin. For example, if a parameter had address taken, we may end up with the code like: define @f(i32 %n) { %n.addr = alloca i32 store %n, %n.addr ... call @coro.begin This patch fixes the problem by moving uses of spilled variables after CoroBegin. Reviewers: majnemer Subscribers: mehdi_amini, llvm-commits Differential Revision: https://reviews.llvm.org/D24234 llvm-svn: 280678	2016-09-05 23:45:45 +00:00
Sanjay Patel	eea2ef7862	[InstCombine] don't assert that division-by-constant has been folded (PR30281) This is effectively a revert of: https://reviews.llvm.org/rL280115 And this should fix https://llvm.org/bugs/show_bug.cgi?id=30281: llvm-svn: 280677	2016-09-05 23:38:22 +00:00
Sanjay Patel	46f9df5b71	[InstCombine] revert r280637 because it causes test failures on an ARM bot http://lab.llvm.org:8011/builders/clang-cmake-armv7-a15/builds/14952/steps/ninja%20check%201/logs/FAIL%3A%20LLVM%3A%3Aicmp.ll llvm-svn: 280676	2016-09-05 22:36:32 +00:00
Ahmed Bougacha	f245f56217	[lit] Downgrade error to warning on gtest crashes during discovery. Lots of unittests started failing under asan after r280455. It seems they've been failing for a long time, but lit silently ignored them. Downgrade the error so we can figure out what is going on. Filed http://llvm.org/PR30285. llvm-svn: 280674	2016-09-05 20:53:14 +00:00
Craig Topper	93f7b5699b	[AVX-512] Integrate mask register copying more completely into X86InstrInfo::copyPhysReg and simplify. No functional change intended. The code is now written in terms of source and dest classes with feature checks inside each type of copy instead of having separate functions for each feature set. llvm-svn: 280673	2016-09-05 20:34:50 +00:00
Simon Pilgrim	6dd4b3526a	[X86][SSE] Add test cases for PR29078 'Failure to recognise i64 sitofp/uitofp conversions that can be performed as i32' llvm-svn: 280671	2016-09-05 18:11:17 +00:00
Simon Pilgrim	efab350967	[X86][SSE] Add test cases for PR29079 'Failure to recognise uitofp conversions that can be performed as sitofp' llvm-svn: 280670	2016-09-05 18:04:38 +00:00
whitequark	9bd34f9751	CODE_OWNERS: bring my entry up to date llvm-svn: 280667	2016-09-05 17:42:46 +00:00
Simon Pilgrim	5f14ff75ec	[X86][SSE] Regenerate odd shuffle tests with common prefixes llvm-svn: 280661	2016-09-05 14:15:38 +00:00
Oliver Stannard	ef38d53a7e	[SimplifyCFG] Add test for sinking inline asm in if/else This test code previously caused a failure in the module verifier, because SimplifyCFG created this invalid instruction, which tries to take the address of inline asm: %.sink = select i1 %1, i64 ()* asm "mov $0, #1", "=r", i64 ()* asm %"mov $0, #2", "=r" This has been fixed recently, presumably by James Molloy's patches that re-wrote and changed parts of SimplifyCFG, so this patch just adds a regression test for it. Differential Revision: https://reviews.llvm.org/D24231 llvm-svn: 280660	2016-09-05 13:49:26 +00:00
Benjamin Kramer	ef0a45aaa5	[WebAssembly] Unbreak the build. Not sure why ADL isn't working here. llvm-svn: 280656	2016-09-05 12:06:47 +00:00
Valery Pykhtin	8bc659637c	[AMDGPU] Refactor FLAT TD instructions Differential revision: https://reviews.llvm.org/D24072 llvm-svn: 280655	2016-09-05 11:22:51 +00:00
James Molloy	728cf85950	[Thumb1] Add relocations for fixups fixup_arm_thumb_{br,bcc} These need to be mapped through to R_ARM_THM_JUMP{11,8} respectively. Fixes PR30279. llvm-svn: 280651	2016-09-05 08:29:15 +00:00
Igor Breger	a2f8ca9a34	[AVX512] Fix v8i1 /v16i1 zext + bitcast lowering pattern. Explicitly zero upper bits. Differential Revision: http://reviews.llvm.org/D23983 llvm-svn: 280650	2016-09-05 08:26:51 +00:00
Craig Topper	428169a5d6	[X86] Make some static arrays of opcodes const and shrink to uint16_t. NFC llvm-svn: 280649	2016-09-05 07:14:21 +00:00
Craig Topper	d9ca3d97ef	[AVX-512] Simplify X86InstrInfo::copyPhysReg for 128/256-bit vectors with AVX512, but not VLX. We should use the VEX opcodes and trust the register allocator to not use the extended XMM/YMM register space. Previously we were extending to copying the whole ZMM register. The register allocator shouldn't use XMM16-31 or YMM16-31 in this configuration as the instructions to spill them aren't available. llvm-svn: 280648	2016-09-05 06:43:06 +00:00
Craig Topper	87a675be50	[Target] Remove the AvailableRegClasses vector from TargetLoweringBase. It was a private member with no code reading from it. llvm-svn: 280647	2016-09-05 06:43:00 +00:00
Gor Nishanov	0e18f75a92	[Coroutines] Part11: Add final suspend handling. Summary: A frontend may designate a particular suspend to be final, by setting the second argument of the coro.suspend intrinsic to true. Such a suspend point has two properties: * it is possible to check whether a suspended coroutine is at the final suspend point via coro.done intrinsic; * a resumption of a coroutine stopped at the final suspend point leads to undefined behavior. The only possible action for a coroutine at a final suspend point is destroying it via coro.destroy intrinsic. This patch adds final suspend handling logic to CoroEarly and CoroSplit passes. Now, the final suspend point example from docs\Coroutines.rst compiles and produces expected result (see test/Transform/Coroutines/ex5.ll). Reviewers: majnemer Subscribers: mehdi_amini, llvm-commits Differential Revision: https://reviews.llvm.org/D24068 llvm-svn: 280646	2016-09-05 04:44:30 +00:00
Craig Topper	42b69dd961	[X86] Add AVX and AVX512 command lines to the vec_ss_load_fold test. llvm-svn: 280645	2016-09-05 02:20:53 +00:00
Craig Topper	e3807febd8	[X86] Remove FsVMOVAPSrm/FsVMOVAPDrm/FsMOVAPSrm/FsMOVAPDrm. Due to their placement in the td file they had lower precedence than (V)MOVSS/SD and could almost never be selected. The only way to select them was in AVX512 mode because EVEX VMOVSS/SD was below them and the patterns weren't qualified properly for AVX only. So if you happened to have an aligned FR32/FR64 load in AVX512 you could get a VEX encoded VMOVAPS/VMOVAPD. I tried to search back through history and it seems like these instructions were probably unselectable for at least 5 years, at least to the time the VEX versions were added. But I can't prove they ever were. llvm-svn: 280644	2016-09-05 02:20:49 +00:00
Peter Zotov	e9dd46f4d0	[CMake] [OCaml] Allow building OCaml bindings out of tree. That is, add build system support for building the OCaml bindings against preinstalled LLVM libraries. This is important for package managers such as OPAM, because OCaml libraries need to be built against a specific OCaml compiler installation. llvm-svn: 280642	2016-09-05 01:42:22 +00:00
NAKAMURA Takumi	1700d021f3	lit/util.py: Another fix for py3. 'str' object has no attribute 'decode'. llvm-svn: 280641	2016-09-05 00:00:40 +00:00
Sanjay Patel	c641e9d6ff	[InstCombine] allow icmp (and X, C2), C1 folds for splat constant vectors The code to calculate 'UsesRemoved' could be simplified. As-is, that code is a victim of PR30273: https://llvm.org/bugs/show_bug.cgi?id=30273 llvm-svn: 280637	2016-09-04 20:58:27 +00:00
Craig Topper	040b10784e	[AVX-512] Add EVEX encoded scalar FMA intrinsic instructions to isNonFoldablePartialRegisterLoad. llvm-svn: 280636	2016-09-04 19:33:47 +00:00
Simon Pilgrim	cded03f163	[X86] Regenerate x64 mmx/f64 return value tests llvm-svn: 280634	2016-09-04 18:14:45 +00:00
Craig Topper	4177345d7f	[AVX-512] Remove 128-bit and 256-bit masked floating point add/sub/mul/div intrinsics and upgrade to native IR. llvm-svn: 280633	2016-09-04 18:13:33 +00:00
Lang Hames	38c7927b6f	[ORC] Clone module flags metadata into the globals module in the CompileOnDemandLayer. Also contains a tweak to the orc-lazy jit in LLI to enable the test case. llvm-svn: 280632	2016-09-04 17:53:30 +00:00
Simon Pilgrim	128047fde5	[X86] Regenerate trunc-store legalization test llvm-svn: 280631	2016-09-04 17:50:03 +00:00
Simon Pilgrim	c228f5c195	[X86][SSE] Regenerate fcmp/uitofp combine tests llvm-svn: 280629	2016-09-04 17:16:01 +00:00
Lang Hames	6c9bd78507	[ORC] Fix an unfinished comment. llvm-svn: 280628	2016-09-04 16:31:41 +00:00
Sanjay Patel	6b4909749b	[InstCombine] recode icmp fold in a vector-friendly way; NFC The transform in question: icmp (and (trunc W), C2), C1 -> icmp (and W, C2'), C1' ...is still not enabled for vectors, thus no functional change intended. It's not clear to me if this is a good transform for vectors or even scalars in general. Changing that behavior may be a follow-on patch. llvm-svn: 280627	2016-09-04 14:32:15 +00:00
Hal Finkel	f0bc9db96e	[PowerPC] During branch relaxation, recompute padding offsets before each iteration We used to compute the padding contributions to the block sizes during branch relaxation only at the start of the transformation. As we perform branch relaxation, we change the sizes of the blocks, and so the amount of inter-block padding might change. Accordingly, we need to recompute the (alignment-based) padding in between every iteration on our way toward the fixed point. Unfortunately, I don't have a test case (and none was provided in the bug report), and while this obviously seems needed, algorithmically, I don't have any way of generating a small and/or non-fragile regression test. llvm-svn: 280626	2016-09-04 14:18:29 +00:00
Igor Breger	7e2a0dfa0c	revert r279960. https://llvm.org/bugs/show_bug.cgi?id=30249 llvm-svn: 280625	2016-09-04 14:03:52 +00:00
Simon Pilgrim	9a36318c54	EOL fixes llvm-svn: 280624	2016-09-04 13:30:46 +00:00
Simon Pilgrim	122b0de1c1	Strip trailing whitespace llvm-svn: 280623	2016-09-04 13:28:46 +00:00
Chandler Carruth	ccd44939ef	[PM] Revert r280447: Add a unittest for invalidating module analyses with an SCC pass. This was mistakenly committed. The world isn't ready for this test, the test code has horrible debugging code in it that should never have landed in tree, it currently passes because of bugs elsewhere, and it needs to be rewritten to not be susceptible to passing for the wrong reasons. I'll re-land this in a better form when the prerequisite patches land. So sorry that I got this mixed into a series of commits that were ready to land. I shouldn't have. =[ What's worse is that it stuck around for so long and I discovered it while fixing the underlying bug that caused it to pass. llvm-svn: 280620	2016-09-04 08:42:31 +00:00
Chandler Carruth	11b3f60cd9	[LCG] Clean up and make NDEBUG verify calls more rigorous with make_scope_exit now that we have that utility. This makes the code much more clear and readable by isolating the check. It also makes it easy to go through and make sure all the interesting update routines have a start and end verify so we don't slowly let the graph drift into an invalid state. llvm-svn: 280619	2016-09-04 08:34:31 +00:00
Chandler Carruth	1f621f0a70	[LCG] A NFC refactoring to extract the logic for doing a postorder-sequence based update after edge insertion into a generic helper function. This separates the SCC-specific logic into two fairly simple lambdas and extracts the rest into a generic helper template function. I think this is a net win on its own merits because it disentangles different pieces of the algorithm. Now there is one place that does the two-step partition to identify a set of newly connected components and at the same time update the postorder sequence. However, I'm also hoping to re-use this an upcoming patch to update a cached post-order sequence of RefSCCs when doing the analogous update to the RefSCC graph, and I don't want to have two copies. The diff is quite messy but this really is just moving things around and making types generic rather than specific. llvm-svn: 280618	2016-09-04 08:34:24 +00:00
Dorit Nuzman	abd15f69b2	[InstCombine] Preserve llvm.mem.parallel_loop_access metadata when replacing memcpy with ld/st. When InstCombine replaces a memcpy with loads+stores it does not copy over the llvm.mem.parallel_loop_access from the memcpy instruction. This patch fixes that. Differential Revision: https://reviews.llvm.org/D23499 llvm-svn: 280617	2016-09-04 07:49:39 +00:00
Lang Hames	3301c7eebd	[ExecutionEngine] Move ObjectCache::anchor from MCJIT to ExecutionEngine. ObjectCache is an ExecutionEngine utility, so its anchor belongs there. The practical impact of this change is that ORC users no longer need to link MCJIT to use ObjectCaches. llvm-svn: 280616	2016-09-04 07:24:11 +00:00
Dorit Nuzman	7673ba7ac2	Test commit. llvm-svn: 280615	2016-09-04 07:06:00 +00:00
Hal Finkel	73390c7acd	[PowerPC] Zero-extend constants in FastISel As it turns out, whether we zero-extend or sign-extend i8/i16 constants, which are illegal types promoted to i32 on PowerPC, is a choice constrained by assumptions within the infrastructure. Specifically, the logic in FunctionLoweringInfo::ComputePHILiveOutRegInfo assumes that constant PHI operands will be zero extended, and so, at least when materializing constants that are PHI operands, we must do the same. The rest of our fast-isel implementation does not appear to depend on the fact that we were sign-extending i8/i16 constants, and all other targets also appear to zero-extend small-bitwidth constants in fast-isel; we'll now do the same (we had been doing this only for i1 constants, and sign-extending the others). Fixes PR27721. llvm-svn: 280614	2016-09-04 06:07:19 +00:00
Craig Topper	af0d63d2e7	[AVX-512] Remove masked integer add/sub/mull intrinsics and upgrade to native IR. llvm-svn: 280611	2016-09-04 02:09:53 +00:00
Joseph Tremoulet	e92e0a9042	Fix inliner funclet unwind memoization Summary: The inliner may need to determine where a given funclet unwinds to, and this determination may depend on other funclets throughout the funclet tree. The code that performs this walk in getUnwindDestToken memoizes results to avoid redundant computations. In the case that a funclet's unwind destination is derived from its ancestor, there's code to walk back down the tree from the ancestor updating the memo map of its descendants to record the unwind destination. This change fixes that code to account for the case that some descendant has a different unwind destination, which can happen if that unwind dest is a descendant of the EHPad being queried and thus didn't determine its unwind destination. Also update test inline-funclets.ll, which is supposed to cover such scenarios, to include a case that fails an assertion without this fix but passes with it. Fixes PR29151. Reviewers: majnemer Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D24117 llvm-svn: 280610	2016-09-04 01:23:20 +00:00
Craig Topper	a57d2ca406	[X86] Combine some of the strings in autoupgrade code. llvm-svn: 280603	2016-09-03 23:55:13 +00:00
Xinliang David Li	7a28a7fbd8	Cleanup : Use metadata preserving API for branch creation Use the wrapper API in IRBuilder that does meta data copy to create new branch in LoopUnswitch. llvm-svn: 280602	2016-09-03 22:26:11 +00:00
Xinliang David Li	241e6c7086	[Profile] preserve branch metadata lowering select in CGP CGP currently drops select's MD_prof profile data when generating conditional branch which can lead to bad code layout. The patch fixes the issue. Differential Revision: http://reviews.llvm.org/D24169 llvm-svn: 280600	2016-09-03 21:26:36 +00:00
Mehdi Amini	ebb3434850	Fix ThinLTO crash with debug info Because the recent change about ODR type uniquing in the context, we can reach types defined in another module during IR linking. This triggered some assertions in case we IR link without starting from an empty module. To alleviate that, we can self-map metadata defined in the destination module so that they won't be visited. Differential Revision: https://reviews.llvm.org/D23841 llvm-svn: 280599	2016-09-03 21:12:33 +00:00
Simon Pilgrim	3606d2346c	Strip trailing whitespace llvm-svn: 280598	2016-09-03 20:36:05 +00:00
Matt Arsenault	ac42ba8633	AMDGPU: Set sizes of spill pseudos llvm-svn: 280595	2016-09-03 17:25:44 +00:00
Matt Arsenault	5ffe3e1d93	AMDGPU: Fix adding duplicate implicit exec uses I'm not sure if this should be considered a bug in copyImplicitOps or not, but implicit operands that are part of the static instruction definition should not be copied. llvm-svn: 280594	2016-09-03 17:25:39 +00:00
Craig Topper	907b580d72	[AVX-512] Add integer ADD/SUB instructions to load folding tables. Add an AVX512 stack folding test. llvm-svn: 280593	2016-09-03 17:20:07 +00:00
Craig Topper	392cd0300d	[AVX-512] Mark EVEX encoded vpcmpeq as commutable just like its AVX and SSE equivalent. llvm-svn: 280592	2016-09-03 16:28:03 +00:00
Nicolai Haehnle	3bba6a8438	AMDGPU: Reduce the duration of whole-quad-mode Summary: This contains two changes that reduce the time spent in WQM, with the intention of reducing bandwidth required by VMEM loads: 1. Sampling instructions by themselves don't need to run in WQM, only their coordinate inputs need it (unless of course there is a dependent sampling instruction). The initial scanInstructions step is modified accordingly. 2. When switching back from WQM to Exact, switch back as soon as possible. This affects the logic in processBlock. This should always be a win or at best neutral. There are also some cleanups (e.g. remove unused ExecExports) and some new debugging output. Reviewers: arsenm, tstellarAMD, mareko Subscribers: arsenm, llvm-commits, kzhuravl Differential Revision: http://reviews.llvm.org/D22092 llvm-svn: 280590	2016-09-03 12:26:38 +00:00
Nicolai Haehnle	a246dccc26	AMDGPU: Fix an interaction between WQM and polygon stippling Summary: This fixes a rare bug in polygon stippling with non-monolithic pixel shaders. The underlying problem is as follows: the prolog part contains the polygon stippling sequence, i.e. a kill. The main part then enables WQM based on the _reduced_ exec mask, effectively undoing most of the polygon stippling. Since we cannot know whether polygon stippling will be used, the main part of a non-monolithic shader must always return to exact mode to fix this problem. Reviewers: arsenm, tstellarAMD, mareko Subscribers: arsenm, llvm-commits, kzhuravl Differential Revision: https://reviews.llvm.org/D23131 llvm-svn: 280589	2016-09-03 12:26:32 +00:00
Matt Arsenault	46a0382ab2	AMDGPU: Do basic folding of class intrinsic This allows more of the OCML builtin library to be constant folded. llvm-svn: 280586	2016-09-03 07:06:58 +00:00
Matt Arsenault	2510a31677	AMDGPU: Fix spilling of m0 readlane/writelane do not support using m0 as the output/input. Constrain the register class of spill vregs to try to avoid this, but also handle spilling of the physreg when necessary by inserting an additional copy to a normal SGPR. llvm-svn: 280584	2016-09-03 06:57:55 +00:00
Matt Arsenault	f3d1a1a1b6	Improve debug error message with register name llvm-svn: 280583	2016-09-03 06:57:49 +00:00
Craig Topper	892ce56901	[AVX-512] Add EVEX encoded VPCMPEQ and VPCMPGT to the load folding tables. llvm-svn: 280581	2016-09-03 04:37:50 +00:00
NAKAMURA Takumi	76dbaebd02	Make lit/util.py py3-compatible. llvm-svn: 280579	2016-09-03 04:06:37 +00:00
Nico Weber	05e78450be	Revert r280549. The test it added doesn't pass: http://lab.llvm.org:8011/builders/clang-x64-ninja-win7/builds/15318/steps/ninja%20check%201/logs/FAIL%3A%20LLVM%3A%3Apdbdump-yaml-types.test Command Output (stdout): -- $ "D:/buildslave/clang-x64-ninja-win7/stage1/./bin\llvm-pdbdump.EXE" "pdb2yaml" "-tpi-stream" "D:\buildslave\clang-x64-ninja-win7\llvm\test\DebugInfo\PDB/Inputs/empty.pdb" $ "D:/buildslave/clang-x64-ninja-win7/stage1/./bin\FileCheck.EXE" "-check-prefix=YAML" "D:\buildslave\clang-x64-ninja-win7\llvm\test\DebugInfo\PDB\pdbdump-yaml-types.test" # command stderr: D:\buildslave\clang-x64-ninja-win7\llvm\test\DebugInfo\PDB\pdbdump-yaml-types.test:36:7: error: expected string not found in input YAML: Name: apartment ^ <stdin>:153:10: note: scanning from here Value: 161 ^ llvm-svn: 280577	2016-09-03 03:18:49 +00:00
Duncan P. N. Exon Smith	3c8472852e	ADT: Use std::list in SparseBitVector, NFC The only intrusive thing about SparseBitVector's usage of ilist<> was that new was usually called externally. There were no custom traits. It seems like the reason to switch to ilist in r41855 was to avoid pointer invalidation, but std::list<> has that feature too. Maybe std::list<>::emplace makes this a little more obvious than it was then. Switch over to std::list<> and simplify the code. llvm-svn: 280573	2016-09-03 02:43:42 +00:00
Hal Finkel	522e4d9d66	[PowerPC] Support asm parsing for bc[l][a][+-] mnemonics PowerPC assembly code in the wild, so it seems, has things like this: bc+ 12, 28, .L9 This is a bit odd because the '+' here becomes part of the BO field, and the BO field is otherwise the first operand. Nevertheless, the ISA specification does clearly say that the +- hint syntax applies to all conditional-branch mnemonics (that test either CTR or a condition register, although not the forms which check both), both basic and extended, so this is supposed to be valid. This introduces some asm-parser-only definitions which take only the upper three bits from the specified BO value, and the lower two bits are implied by the +- suffix (via some associated aliases). Fixes PR23646. llvm-svn: 280571	2016-09-03 02:31:44 +00:00
Duncan P. N. Exon Smith	4e229a7c0a	ADT: Do not inherit from std::iterator in ilist_iterator Inheriting from std::iterator uses more boiler-plate than manual typedefs. Avoid that in both ilist_iterator and MachineInstrBundleIterator. This has the side effect of removing ilist_iterator from certain ADL lookups in namespace std; calls to std::next need to be qualified by "std::" that didn't have to before. The one case of this in-tree was operating on a temporary, so I used the more compact operator++. llvm-svn: 280570	2016-09-03 02:27:35 +00:00
Duncan P. N. Exon Smith	3453d51c51	ADT: Split out iplist_impl from iplist, NFC Split out iplist_impl from iplist, and change SymbolTableList to inherit directly from iplist_impl. This makes it more straightforward to add new template paramaters to iplist []: - iplist_impl takes a "base" list that provides the intrusive functionality (usually simple_ilist<T>) and a traits class. - iplist no longer takes a "Traits" template parameter. It only takes the value_type, T, and instantiates iplist_impl with simple_ilist<T> and ilist_traits<T>. - SymbolTableList now inherits from iplist_impl, instead of iplist. Note for out-of-tree code: if you have an iplist whose second template parameter was not* the default (i.e., not ilist_traits<YourT>), you have three options: - Stop using a custom traits class, and instead specialize ilist_traits<YourT>. This is the usual thing to do. - Specialize iplist<YourT> to pass your custom traits class into iplist_impl. - Create your own trivial list type that passes your custom traits class into iplist_impl (see SymbolTableList<> for an example). [*]: The eventual goal is to start tracking a sentinel bit on the MachineInstr list even when LLVM_ENABLE_ABI_BREAKING_CHECKS is off, which will enable MachineBasicBlock::reverse_iterator to have normal list invalidation semantics that matching the new iplist<>::reverse_iterator from r280032. llvm-svn: 280569	2016-09-03 02:07:45 +00:00
Wei Mi	c37307a5f4	Fix buildbot error. Add -mtriple=x86_64-unknown-linux-gnu for the test and move it to CodeGen/X86. llvm-svn: 280568	2016-09-03 01:43:28 +00:00
Duncan P. N. Exon Smith	7b1fec0372	ADT: Rename NodeTy to T in iplist/ilist template parameters And use other typedefs so that the next rename has a smaller diff. llvm-svn: 280567	2016-09-03 01:42:40 +00:00
Duncan P. N. Exon Smith	8cc24eadd2	ADT: Remove external uses of ilist_iterator, NFC Delete the dead code for Write(ilist_iterator) in the IR Verifier, inline report(ilist_iterator) at its call sites in the MachineVerifier, and use simple_ilist<>::iterator in SymbolTableListTraits. The only remaining reference to ilist_iterator outside of the ilist implementation is from MachineInstrBundleIterator. I'll get rid of that in a follow-up. llvm-svn: 280565	2016-09-03 01:22:56 +00:00
Duncan P. N. Exon Smith	e974f57298	ADT: Fix up IListTest.privateNode and get it passing This test was using the wrong type, and so not actually testing much. ilist_iterator constructors weren't going through ilist_node_access, so they didn't actually work with private inheritance. llvm-svn: 280564	2016-09-03 01:06:08 +00:00
Hal Finkel	28842b96f3	[PowerPC] Add asm parser/disassembler support for hrfid,nap,slbmfev These few book-III instructions are used by the Linux kernel. Partially fixes PR24796. llvm-svn: 280560	2016-09-02 23:42:01 +00:00
Hal Finkel	277736eee6	[PowerPC] Add support for the extended dcbf form and mnemonics dcbf has an optional hint-like field, add support for the extended form and the associated mnemonics (dcbfl and dcbflp). Partially fixes PR24796. llvm-svn: 280559	2016-09-02 23:41:54 +00:00
Yunzhong Gao	27ea29b3b7	(LLVM part) Implement MASM-flavor intel syntax behavior for inline MS asm block: 1. 0xNN and NNh are accepted as valid hexadecimal numbers, but 0xNNh is not. 0xNN and NNh may come with optional U or L suffix. 2. NNb is accepted as a valid binary (base-2) number, but 0bNN is not. NNb may come with optional U or L suffix. Differential Revision: https://reviews.llvm.org/D22112 llvm-svn: 280555	2016-09-02 23:15:29 +00:00
Ron Lieberman	88159e5549	Make sure to maintain register liveness when generating predicated instructions. Author: Krzysztof Parzyszek <kparzysz@codeaurora.org> Differential Revision: https://reviews.llvm.org/D24209 llvm-svn: 280552	2016-09-02 22:56:24 +00:00
Gor Nishanov	534c7028ab	gitignore: ignore VS Code editor files Summary: VS code creates .vscode folder to keep its stuff that we really don't need in git. Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D24211 llvm-svn: 280551	2016-09-02 22:54:26 +00:00
Ivan Krasin	92bbf96c97	lit: print process output, if getting the list of google-tests failed. Summary: This is a follow up to r280455, where a check for the process exit code was introduced. Some ASAN bots throw this error now, but it's impossible to understand what's wrong with them, and the issue is not reproducible. Reviewers: vitalybuka Differential Revision: https://reviews.llvm.org/D24210 llvm-svn: 280550	2016-09-02 22:31:24 +00:00
Zachary Turner	83849415de	[codeview] Make FieldList records print as a yaml sequence. Before we were kind of imitating the behavior of a Yaml sequence by outputting each record one after the other. This makes it a little cumbersome when we want to go the other direction -- from Yaml to Pdb. So this treats FieldList records as no different than any other list of records, by printing them as a Yaml sequence with the exact same format. llvm-svn: 280549	2016-09-02 22:19:01 +00:00
Xinliang David Li	40afd5c9e4	[Profile] handle select instruction in 'expect' lowering Builtin expect lowering currently ignores select. This patch fixes the issue Differential Revision: http://reviews.llvm.org/D24166 llvm-svn: 280547	2016-09-02 22:03:40 +00:00
Hal Finkel	7b104d4721	[PowerPC] For larger offsets, when possible, fold offset into addis toc@ha When we have an offset into a global, etc. that is accessed relative to the TOC base pointer, and the offset is larger than the minimum alignment of the global itself and the TOC base pointer (which is 8-byte aligned), we can still fold the @toc@ha into the memory access, but we must update the addis instruction's symbol reference with the offset as the symbol addend. When there is only one use of the addi to be folded and only one use of the addis that would need its symbol's offset adjusted, then we can make the adjustment and fold the @toc@l into the memory access. llvm-svn: 280545	2016-09-02 21:37:07 +00:00
James Y Knight	6ef32bf2af	[Sparc] Mark i128 shift libcalls unavailable in 32-bit mode. Recently, llvm wants to emit calls to these functions, while it didn't seem to be an issue before. Not sure why. Nor do I know why only these three are important to disable, out of all of the i128 libcalls. Nevertheless, many other targets have this snippet of code, so, just copying it to sparc as well, to unbreak things. llvm-svn: 280537	2016-09-02 20:29:11 +00:00
Jan Vesely	ea45746d5a	AMDGPU/R600: EXTRACT_VECT_ELT should only bypass BUILD_VECTOR if the vectors have the same number of elements. Fixes R600 piglit regressions since r280298 Differential Revision: https://reviews.llvm.org/D24174 llvm-svn: 280535	2016-09-02 20:13:19 +00:00
Sjoerd Meijer	6c4140b6c0	Setting fp trapping mode and denormal type: this an improvement of r280246 and calculates compatibility of functions attributes in a better way. Differential Revision: https://reviews.llvm.org/D24070 llvm-svn: 280534	2016-09-02 19:51:34 +00:00
Krzysztof Parzyszek	3bf4aeccd6	Do not consider subreg defs as reads when computing subrange liveness Subregister definitions are considered uses for the purpose of tracking liveness of the whole register. At the same time, when calculating live interval subranges, subregister defs should not be treated as uses. Differential Revision: https://reviews.llvm.org/D24190 llvm-svn: 280532	2016-09-02 19:48:55 +00:00
Sanjay Patel	70277411d3	[InstCombine] auto-generate assertions for tighter checking llvm-svn: 280531	2016-09-02 19:38:37 +00:00
Chad Rosier	c50dfe38ac	[SLP] Don't pass a global CL option as an argument. NFC. Differential Revision: https://reviews.llvm.org/D24199 llvm-svn: 280527	2016-09-02 19:09:50 +00:00
Jan Vesely	00864886f4	AMDGPU/R600: Expand unaligned writes to local and global AS LOCAL and GLOBAL AS only PRIVATE needs special treatment Differential Revision: https://reviews.llvm.org/D23971 llvm-svn: 280526	2016-09-02 19:07:06 +00:00
Jan Vesely	cd6b12b12e	AMDGPU: Reorganize store tests Split by AS. Merge with some prviously failing tests. Differential Revision: https://reviews.llvm.org/D23969 llvm-svn: 280523	2016-09-02 18:52:28 +00:00
Reid Kleckner	fa28396f97	[codeview] Use the correct max CV record length of 0xFF00 Previously we were splitting our records at 0xFFFF bytes, which the Microsoft tools don't like. Should fix failure on the new Windows self-host buildbot. This length appears in microsoft-pdb/PDB/dbi/dbiimpl.h llvm-svn: 280522	2016-09-02 18:43:27 +00:00
Kyle Butt	e31cc84290	IfConversion: Add assertions that both sides of a diamond don't pred-clobber. One side of a diamond may end with a predicate clobbering instruction. That side of the diamond has to be if-converted second. Both sides can't clobber the predicate or the ifconversion is invalid. This is checked elsewhere, but add an assert as a safety check. NFC llvm-svn: 280518	2016-09-02 18:29:28 +00:00
Kyle Butt	8699921c4b	IfConversion: Fix bug introduced by rescanning diamonds. Passing the wrong values for predicate-clobbering. Simple to miss. Added an assert to make this easier to catch in the future. llvm-svn: 280517	2016-09-02 18:29:26 +00:00
Adam Nemet	26c43c879c	Fix up comment from r280442, noticed by Justin. llvm-svn: 280508	2016-09-02 17:20:32 +00:00
Wei Mi	c54d1298f5	Split the store of a wide value merged from an int-fp pair into multiple stores. For the store of a wide value merged from a pair of values, especially int-fp pair, sometimes it is more efficent to split it into separate narrow stores, which can remove the bitwise instructions or sink them to colder places. Now the feature is only enabled on x86 target, and only store of int-fp pair is splitted. It is possible that the application scope gets extended with perf evidence support in the future. Differential Revision: https://reviews.llvm.org/D22840 llvm-svn: 280505	2016-09-02 17:17:04 +00:00
Sanjay Patel	521f19f249	[InsttCombine] fold insertelement of constant into shuffle with constant operand (PR29126) The motivating case occurs with SSE/AVX scalar intrinsics, so this is a first step towards shrinking that to a single shufflevector. Note that the transform is intentionally limited to shuffles that are equivalent to vector selects to avoid creating arbitrary shuffle masks that may not lower well. This should solve PR29126: https://llvm.org/bugs/show_bug.cgi?id=29126 Differential Revision: https://reviews.llvm.org/D23886 llvm-svn: 280504	2016-09-02 17:05:43 +00:00
Davide Italiano	41bfd6bd6c	[lib/LTO] Simplify. No functional change intended. llvm-svn: 280503	2016-09-02 16:37:31 +00:00
Reid Kleckner	30e2067e3e	Quick fix to make LIT_PRESERVES_TMP work again llvm-svn: 280502	2016-09-02 16:33:15 +00:00
Reid Kleckner	8663b44318	[lit] Clean up temporary files created by tests Do this by creating a temp directory in the normal system temp directory, and cleaning it up on exit. It is still possible for this temp directory to leak if Python exits abnormally, but this is probably good enough for now. Fixes PR18335 llvm-svn: 280501	2016-09-02 16:29:24 +00:00
Derek Schuff	a66ae923e4	[WebAssembly] Update known test failures Fixed an issue with the experimental C headers llvm-svn: 280498	2016-09-02 16:26:24 +00:00
Matthew Simpson	b65c230eab	[LV] Ensure reverse interleaved group GEPs remain uniform For uniform instructions, we're only required to generate a scalar value for the first vector lane of each unroll iteration. Thus, if we have a reverse interleaved group, computing the member index off the scalar GEP corresponding to the last vector lane of its pointer operand technically makes the GEP non-uniform. We should compute the member index off the first scalar GEP instead. I've added the updated member index computation to the existing reverse interleaved group test. llvm-svn: 280497	2016-09-02 16:19:22 +00:00
Andrea Di Biagio	bff3fd6700	Simplify code a bit. No functional change intended. We don't need to call `GetCompareTy(LHS)' every single time true or false is returned from function SimplifyFCmpInst as suggested by Sanjay in review D24142. llvm-svn: 280491	2016-09-02 15:55:25 +00:00
Sanjay Patel	bd51c164d9	fix documentation comments; NFC llvm-svn: 280489	2016-09-02 15:43:25 +00:00
Andrea Di Biagio	805815f407	[instsimplify] Fix incorrect folding of an ordered fcmp with a vector of all NaN. This patch fixes a crash caused by an incorrect folding of an ordered comparison between a packed floating point vector and a splat vector of NaN. An ordered comparison between a vector and a constant vector of NaN, should always be folded into a constant vector where each element is i1 false. Since revision 266175, SimplifyFCmpInst folds the ordered fcmp into a scalar 'false'. Later on, this would cause an assertion failure, since the value type of the folded value doesn't match the expected value type of the uses of the original instruction: "Assertion failed: New->getType() == getType() && "replaceAllUses of value with new value of different type!". This patch fixes the issue and adds a test case to the already existing test InstSimplify/floating-point-compares.ll. Differential Revision: https://reviews.llvm.org/D24143 llvm-svn: 280488	2016-09-02 14:47:43 +00:00
Andrea Di Biagio	fd503e5af3	[DAGcombiner] Fix incorrect sinking of a truncate into the operand of a shift. This fixes a regression introduced by revision 268094. Revision 268094 added the following dag combine rule: // trunc (shl x, K) -> shl (trunc x), K => K < vt.size / 2 That rule converts a truncate of a shift-by-constant into a shift of a truncated value. We do this only if the shift count is less than half the size in bits of the truncated value (K < vt.size / 2). The problem is that the constraint on the shift count is incorrect, so the rule doesn't work well in some cases involving vector types. The combine rule should have been written instead like this: // trunc (shl x, K) -> shl (trunc x), K => K < vt.getScalarSizeInBits() Basically, if K is smaller than the "scalar size in bits" of the truncated value then we know that by "sinking" the truncate into the operand of the shift we would never accidentally make the shift undefined. This patch fixes the check on the shift count, and adds test cases to make sure that we don't regress the behavior. Differential Revision: https://reviews.llvm.org/D24154 llvm-svn: 280482	2016-09-02 11:29:09 +00:00
Andrey Bokhanko	7454145785	Fixed a typo (LLVM/Support/CFG.h -> LLVM/IR/CFG.h) llvm-svn: 280481	2016-09-02 11:13:35 +00:00
Chandler Carruth	0f0ef132af	[PM] Try to fix an MSVC2013 failure due to finding a template constructor when trying to do copy construction by adding an explicit move constructor. Will watch the bots to discover if this is sufficient. llvm-svn: 280479	2016-09-02 10:49:58 +00:00
Alexey Bataev	ed27d69037	[InstCombine] Add test for insertelementinsts with constants. Added a tests that shows that several insertelementinsts with constant indexes/data are not folded into a single shuffleinst. llvm-svn: 280474	2016-09-02 09:00:53 +00:00
George Rimar	d8dfeec019	[Support] - Fix possible crash in match() of llvm::Regex. Crash was possible if match() method was called on object that was moved or object created with empty constructor. Testcases updated. DIfferential revision: https://reviews.llvm.org/D24123 llvm-svn: 280473	2016-09-02 08:44:46 +00:00
George Rimar	d8a4ecac3b	[llvm-readobj] - Teach readobj to print DT_AUXILIARY dynamic tag in human readable form. Previously DT_AUXILIARY was unknown, patch fixes that. Differential revision: https://reviews.llvm.org/D24138 llvm-svn: 280471	2016-09-02 07:35:19 +00:00
James Molloy	f3cf2a494b	[SimplifyCFG] Add a workaround to fix PR30188 We're sinking stores, which is a good thing, but in the process creating selects for the store address operand, which SROA/Mem2Reg can't look through, which caused serious regressions. The real fix is in SROA, which I'll be looking into. llvm-svn: 280470	2016-09-02 07:29:00 +00:00
Craig Topper	ad79bf4712	[AVX-512] Move tests for masked floating point logical operations to avx512dqvl-intrinsics-upgrade.ll since they have now been autoupgraded. llvm-svn: 280467	2016-09-02 06:11:31 +00:00
Craig Topper	e75c49543c	[AVX-512] Remove floating point logical operation instrinsics and replace them with native IR. llvm-svn: 280466	2016-09-02 05:29:17 +00:00
Craig Topper	45d6503089	[AVX-512] Add more patterns for masked and broadcasted logical operations where the select or broadcast has a floating point type. These are needed in order to remove the masked floating point logical operation intrinsics and use native IR. llvm-svn: 280465	2016-09-02 05:29:13 +00:00
Craig Topper	00aecd97bf	[AVX-512] Add execution domain fixing for logical operations with broadcast loads. This builds on the handling of masked ops since we need to keep element size the same. llvm-svn: 280464	2016-09-02 05:29:09 +00:00
Craig Topper	f8ad647b93	[X86] Strengthen some SDNode type constraints. llvm-svn: 280463	2016-09-02 04:25:33 +00:00
Craig Topper	8b9e671e97	[AVX-512] Add NoVLX Predicates to some patterns so they don't rely on pattern ordering to be lower priority than their equivalent VLX pattern. llvm-svn: 280462	2016-09-02 04:25:30 +00:00
Lang Hames	c5d41d4ada	[Docs] Fix another typo in the Error/Expected docs. llvm-svn: 280461	2016-09-02 03:50:50 +00:00
Lang Hames	42f5dd8066	[Docs] Fix a couple of typos in the Error/Expected docs. llvm-svn: 280460	2016-09-02 03:46:08 +00:00
Lang Hames	a10e516596	[ORC] Fix some missing fields in OrcRemoteTargetClient's move constructor. llvm-svn: 280459	2016-09-02 03:45:44 +00:00
George Burgess IV	e93c477eda	Add missing &. NFC. llvm-svn: 280458	2016-09-02 03:38:43 +00:00
Hal Finkel	5ef4b03106	[PowerPC] hasAndNotCompare should return true As Sanjay suggested when he added the hook, PPC should return true from hasAndNotCompare. We have an efficient negated 'and' on PPC (which can feed a compare). Fixes PR27203. llvm-svn: 280457	2016-09-02 02:58:25 +00:00
Greg Parker	a9bac92890	[lit] Fail testing if a googletest executable crashes during test discovery googletest formatted tests are discovered by running the test executable. Previously testing would silently succeed if the test executable crashed during the discovery process. Now testing fails with "error: unable to discover google-tests ..." if the test executable exits with a non-zero status. llvm-svn: 280455	2016-09-02 02:44:07 +00:00
Hal Finkel	a39fd4bc53	[PowerPC] Add a pattern for a runtime bit check Following a suggestion by Sanjay, we should lower: %shl = shl i32 1, %y %and = and i32 %x, %shl %cmp = icmp eq i32 %and, %shl ret i1 %cmp into: subfic r4, r4, 32 rlwnm r3, r3, r4, 31, 31 Add this pattern and some associated patterns for the 64-bit case and the not-equal case. Fixes PR27356. llvm-svn: 280454	2016-09-02 02:34:44 +00:00
Dehao Chen	4b5e7f750c	revert r280429 and r280425: r280425 \| dehao \| 2016-09-01 16:15:50 -0700 (Thu, 01 Sep 2016) \| 9 lines Refactor LICM pass in preparation for LoopSink pass. Summary: LoopSink pass uses some common function in LICM. This patch refactor the LICM code to make it usable by LoopSink pass (https://reviews.llvm.org/D22778). r280429 \| dehao \| 2016-09-01 16:31:25 -0700 (Thu, 01 Sep 2016) \| 9 lines Refactor LICM to expose canSinkOrHoistInst to LoopSink pass. Summary: LoopSink pass shares the same canSinkOrHoistInst functionality with LICM pass. This patch exposes this function in preparation of https://reviews.llvm.org/D22778 llvm-svn: 280453	2016-09-02 01:59:27 +00:00
Dehao Chen	820372c0ed	revert r280432: r280432 \| dehao \| 2016-09-01 16:51:37 -0700 (Thu, 01 Sep 2016) \| 9 lines Explicitly require DominatorTreeAnalysis pass for instsimplify pass. Summary: DominatorTreeAnalysis is always required by instsimplify. llvm-svn: 280452	2016-09-02 01:47:13 +00:00
NAKAMURA Takumi	9aec5c3797	llvm/test/Transforms/GCOVProfiling/three-element-mdnode.ll: Use %/T instead of %T, not to emit backslashes. llvm-svn: 280451	2016-09-02 01:33:00 +00:00
Justin Bogner	8d0a08115a	bugpoint: clang-format all of bugpoint. NFC I'm going to clean up the APIs here a bit and touch many many lines anyway. llvm-svn: 280450	2016-09-02 01:21:37 +00:00
NAKAMURA Takumi	bc46927659	raw_pwrite_stream_test.cpp: _putenv_s() may be assumed as win32-generic. llvm-svn: 280449	2016-09-02 01:20:18 +00:00
Kyle Butt	93e94e8a12	IfConversion: Don't count branches in # of duplicates. If the entire blocks match, we would count the branch instructions toward the number of duplicated instructions. This doesn't match what we do elsewhere, and was causing a bug. llvm-svn: 280448	2016-09-02 01:20:06 +00:00
Chandler Carruth	c906ff63da	[PM] Add a unittest for invalidating module analyses with an SCC pass. This wasn't really well explicitly tested with a nice unittest before. It seems good to have reasonably broken out unittests for this kind of functionality as I'm workin go other invalidation features to make sure none of the existing ones regress. This still has too much duplicated code, I plan to factor that out in a subsequent commit to use common helpers for repeated parts of this. llvm-svn: 280447	2016-09-02 01:16:27 +00:00
Chandler Carruth	4f83742ab6	[PM] (NFC) Split the IR parsing into a fixture so that I can split out more testing into other test routines while using the same core module. llvm-svn: 280446	2016-09-02 01:14:05 +00:00
Reid Kleckner	1a4398a198	Fix a real temp file leak in FileOutputBuffer If we failed to commit the buffer but did not die to a signal, the temp file would remain on disk on Windows. Having an open file mapping and file handle prevents the file from being deleted. I am choosing not to add an assertion of success on the temp file removal, since virus scanners and other environmental things can often cause removal to fail in real world tools. Also fix more temp file leaks in unit tests. llvm-svn: 280445	2016-09-02 01:10:53 +00:00
Chandler Carruth	d4e80a9615	[PM] (NFC) Refactor the CGSCC pass manager tests to use lambda-based passes. This simplifies the test some and makes it more focused and clear what is being tested. It will also make it much easier to extend with further testing of different pass behaviors. I've also replaced a pointless module pass with running the requires pass directly as that is all that it was really doing. llvm-svn: 280444	2016-09-02 01:08:04 +00:00
Reid Kleckner	75e557f1fd	Try to fix some temp file leaks in SupportTests, PR18335 llvm-svn: 280443	2016-09-02 00:51:34 +00:00
Adam Nemet	1200a050ff	[CFGPrinter] Display branch weight on the edges Summary: This is pretty useful especially in connection with BFI's -view-block-freq-propagation-dags. It helped me to track down the bug that is being fixed in D24118. While -view-block-freq-propagation-dags displays the high-level information with static heuristics included (and block frequencies), the new thing only shows the raw weight as presented by PGO without any of the static estimates. This helps to distinguished what has been measured vs. estimated. For the sample loop in D24118, -view-block-freq-propagation-dags=integer looks like this: https://reviews.llvm.org/F2381352 While with -view-cfg-only you can see the underlying branch weights: https://reviews.llvm.org/F2392296 Reviewers: dexonsmith, bogner, davidxl Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D24144 llvm-svn: 280442	2016-09-02 00:28:26 +00:00
Hal Finkel	b54579fab6	[PowerPC] Don't apply the PPC64 address-formation peephole for offsets greater than 7 When applying our address-formation PPC64 peephole, we are reusing the @ha TOC addis value with the low parts associated with different offsets (i.e. different effective symbol addends). We were assuming this was okay so long as the offsets were less than the alignment of the global variable being accessed. This ignored the fact, however, that the TOC base pointer itself need only be 8-byte aligned. As a result, what we were doing is legal only for offsets less than 8 regardless of the alignment of the object being accessed. Fixes PR28727. llvm-svn: 280441	2016-09-02 00:28:20 +00:00
Hal Finkel	1e8218cc09	[PowerPC] Don't consider fusion in PPC64 address-formation peephole The logic in this function assumes that the P8 supports fusion of addis/addi, but it does not. As a result, there is no advantage to restricting our peephole application, merging addi instructions into dependent memory accesses, even when the addi has multiple users, regardless of whether or not we're optimizing for size. We might need something like this again for the P9; I suspect we'll revisit this code when we work on P9 tuning. llvm-svn: 280440	2016-09-02 00:27:50 +00:00
Dehao Chen	e573b77772	Explicitly require DominatorTreeAnalysis pass for instsimplify pass. Summary: DominatorTreeAnalysis is always required by instsimplify. Reviewers: davidxl, danielcdh Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D24173 llvm-svn: 280432	2016-09-01 23:51:37 +00:00
Aditya Kumar	356f79d535	[SelectionDAGBuilder] Add const to relevant places Reviewers: hans, evandro, sebpop Differential Revision: https://reviews.llvm.org/D24112 llvm-svn: 280430	2016-09-01 23:35:26 +00:00
Dehao Chen	e81d50b3b9	Refactor LICM to expose canSinkOrHoistInst to LoopSink pass. Summary: LoopSink pass shares the same canSinkOrHoistInst functionality with LICM pass. This patch exposes this function in preparation of https://reviews.llvm.org/D22778 Reviewers: chandlerc, davidxl, danielcdh Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D24171 llvm-svn: 280429	2016-09-01 23:31:25 +00:00
Dehao Chen	ddd0c125e3	Refactor replaceDominatedUsesWith to have a flag to control whether to replace uses in BB itself. Summary: This is in preparation for LoopSink pass which calls replaceDominatedUsesWith to update after sinking. Reviewers: chandlerc, davidxl, danielcdh Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D24170 llvm-svn: 280427	2016-09-01 23:26:48 +00:00
Dehao Chen	bc4e5bba0e	Refactor LICM pass in preparation for LoopSink pass. Summary: LoopSink pass uses some common function in LICM. This patch refactor the LICM code to make it usable by LoopSink pass (https://reviews.llvm.org/D22778). Reviewers: chandlerc, davidxl, danielcdh Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D24168 llvm-svn: 280425	2016-09-01 23:15:50 +00:00
Michael Kuperstein	7bc54cebea	[Legalizer] Don't throw away false low half when expanding GT/LT SETCC When expanding a SETCC for which the low half is known to evaluate to false, we can only throw it away for LT/GT comparisons, not LE/GE. This fixes PR29170. Differential Revision: https://reviews.llvm.org/D24151 llvm-svn: 280424	2016-09-01 23:02:32 +00:00
Chandler Carruth	67473526cb	Make the coding standards a bit more clear that we prefer the fancy new auto-brief format for doxygen comments. Most notable is switching to that in the example doxygen comment. I've also tweaked the wording but am happy to tweak it further if others have suggestions here. Mostly doing this to capture something I and others have been writing consistently and repeatedly in code reviews. llvm-svn: 280419	2016-09-01 22:18:25 +00:00
Michael Kuperstein	5f17d08f49	[SelectionDAG] Generate vector_shuffle nodes for undersized result vector sizes Prior to this, we could generate a vector_shuffle from an IR shuffle when the size of the result was exactly the sum of the sizes of the input vectors. If the output vector was narrower - e.g. a <12 x i8> being formed by a shuffle with two <8 x i8> inputs - we would lower the shuffle to a sequence of extracts and inserts. Instead, we can form a larger vector_shuffle, and then extract a subvector of the right size - e.g. shuffle the two <8 x i8> inputs into a <16 x i8> and then extract a <12 x i8>. This also includes a target-specific X86 combine that in the presence of AVX2 combines: (vector_shuffle <mask> (concat_vectors t1, undef) (concat_vectors t2, undef)) into: (vector_shuffle <mask> (concat_vectors t1, t2), undef) in cases where this allows us to form VPERMD/VPERMQ. (This is not a separate commit, as that pattern does not appear without the DAGBuilder change.) llvm-svn: 280418	2016-09-01 21:32:09 +00:00
Heejin Ahn	c0f18172f5	[WebAssembly] Add asm.js-style setjmp/longjmp handling for wasm (reland r280302) Summary: This patch adds asm.js-style setjmp/longjmp handling support for WebAssembly. It also uses JavaScript's try and catch mechanism. Reviewers: jpp, dschuff Subscribers: jfb, dschuff Differential Revision: https://reviews.llvm.org/D24121 llvm-svn: 280415	2016-09-01 21:05:15 +00:00
Justin Bogner	06d466a9bc	bugpoint: clang-format and modernize comments in ListReducer. NFC llvm-svn: 280414	2016-09-01 21:04:36 +00:00
Tim Northover	8d8812c5d7	GlobalISel: add a G_PHI instruction to give phis a type. They're another source of generic vregs, which are going to need a type on the definition when we remove the register width from MachineRegisterInfo. llvm-svn: 280412	2016-09-01 20:45:41 +00:00
Reid Kleckner	d1882f2188	Fix the ASan fuse-lld.cc test after LLD r280012 With that change, images built with 'lld-link /debug' always have a debug directory. If no PDB filename was passed on the command line, then the filename in the executable is empty. PDB information would never work anyway if the PDB file name is empty, so go ahead and try DWARF in that case. llvm-svn: 280410	2016-09-01 20:28:59 +00:00
Matthew Simpson	9e7851ed31	[LV] Use ScalarParts for ad-hoc pointer IV scalarization (NFCI) We can now maintain scalar values in VectorLoopValueMap. Thus, we no longer have to create temporary vectors with insertelement instructions when handling pointer induction variables. This case was mistakenly missed from r279649 when refactoring the other scalarization code. llvm-svn: 280405	2016-09-01 19:40:19 +00:00
Sanjay Patel	199a8e6a92	[InstCombine] add tests to show potential shuffle+insert folds llvm-svn: 280403	2016-09-01 19:14:19 +00:00
Andrey Turetskiy	cde38b6a99	[X86] Loosen memory folding requirements for cvtdq2pd and cvtps2pd instructions. According to spec cvtdq2pd and cvtps2pd instructions don't require memory operand to be aligned to 16 bytes. This patch removes this requirement from the memory folding table. Differential Revision: https://reviews.llvm.org/D23919 llvm-svn: 280402	2016-09-01 18:50:02 +00:00
Yaxun Liu	add05a8d95	AMDGPU: Add runtime metadata for pointee alignment of argument. Add runtime metdata for pointee alignment of pointer type kernel argument. The key is KeyArgPointeeAlign and the value is a 32 bit unsigned integer. Differential Revision: https://reviews.llvm.org/D24145 llvm-svn: 280399	2016-09-01 18:46:49 +00:00
Davide Italiano	22d36c167e	[lib/LTO] Simplify a bit. NFCI. llvm-svn: 280396	2016-09-01 18:34:47 +00:00
Chris Bieneman	342134ed68	[CMake] Connecting check-all and test-depends targets correctly My previous attempt at this connected the sub-project check targets to the test-depends target instead of to the check-all target. That resulted in the tests running multiple times on bots that built "test-depends" and "check-all" in separate build invocations. llvm-svn: 280392	2016-09-01 18:26:01 +00:00
Michael Kuperstein	b4743597bd	Rename some variables to have meaningful names. NFC. llvm-svn: 280391	2016-09-01 18:24:42 +00:00
Matthew Simpson	922af076c7	[LV] Move VectorParts allocation and mapping into PHI widening (NFC) This patch moves the allocation of VectorParts for PHI nodes into the actual PHI widening code. Previously, we allocated these VectorParts in vectorizeBlockInLoop, and passed them by reference to widenPHIInstruction. Upon returning, we would then map the VectorParts in VectorLoopValueMap. This behavior is problematic for the cases where we only want to generate a scalar version of a PHI node. For example, if in the future we only generate a scalar version of an induction variable, we would end up inserting an empty vector entry into the map once we return to vectorizeBlockInLoop. We now no longer need to pass VectorParts to the various PHI widening functions, and we can keep VectorParts allocation as close as possible to the point at which they are actually mapped in VectorLoopValueMap. llvm-svn: 280390	2016-09-01 18:14:27 +00:00
Zachary Turner	5c7c2307a8	[codeview] Properly propagate the TypeLeafKind through the pipeline. llvm-svn: 280388	2016-09-01 18:08:19 +00:00
Michael Kuperstein	65bc3c89ff	[DAGCombine] Don't fold a trunc if it feeds an anyext Legalization tends to create anyext(trunc) patterns. This should always be combined - into either a single trunc, a single ext, or nothing if the types match exactly. But if we happen to combine the trunc first, we may pull the trunc away from the anyext or make it implicit (e.g. the truncate(extract) -> extract(bitcast) fold). To prevent this, we can avoid doing the fold, similarly to how we already handle fpround(fpextend). Differential Revision: https://reviews.llvm.org/D23893 llvm-svn: 280386	2016-09-01 17:59:24 +00:00
Changpeng Fang	b28fe0307f	AMDGPU/SI: MIMG TD Refactoring. Summary: Created a new td file MIMGInstructions.td which contains all definitions of MIMG related instructions. Reviewed by: kzhuravl, vpykhtin Differential Revision: http://reviews.llvm.org/D24106 llvm-svn: 280385	2016-09-01 17:54:54 +00:00
Reid Kleckner	f309bfc5f9	[lit] Use multiprocessing by default on Windows Apparently nobody evaluated multiprocessing on Windows since Daniel enabled multiprocessing on Unix in r193279. It works so far as I can tell. Today this is worth about an 8x speedup (631.29s to 73.25s) on my 24 core Windows machine. Hopefully this will improve Windows buildbot cycle time, where currently it takes more time to run check-all than it does to self-host with assertions enabled: http://lab.llvm.org:8011/builders/clang-x86-windows-msvc2015/builds/20 build stage 2 ninja all ( 28 mins, 22 secs ) ninja check 2 stage 2 ( 37 mins, 38 secs ) llvm-svn: 280382	2016-09-01 17:19:44 +00:00
Chris Bieneman	be765196fa	[CMake] Revive LLVM__DIRS variables This is a partial revert of r280013. Brad King pointed out these variable names are matching CMake conventions, so we should preserve them. I've also added a direct mapping of the LLVM__DIR variables which we need to make projects support building in and out of tree. llvm-svn: 280380	2016-09-01 16:43:39 +00:00
Geoff Berry	fcb186ca9d	[EarlyCSE] Change C API pass interface for EarlyCSE w/ MemorySSA Previous change broke the C API for creating an EarlyCSE pass w/ MemorySSA by adding a bool parameter to control whether MemorySSA was used or not. This broke the OCaml bindings. Instead, change the old C API entry point back and add a new one to request an EarlyCSE pass with MemorySSA. llvm-svn: 280379	2016-09-01 15:07:46 +00:00
Simon Dardis	1fa1fb0f8d	[mips] Include missed file from previous commit llvm-svn: 280377	2016-09-01 15:03:13 +00:00
Simon Pilgrim	ce0e9f0b91	[X86][SSE] Dropped (V)CVTPD2PS intrinsic patterns now that its bound to X86vfpround It now uses X86vfpround patterns directly instead. Followup to D23797 llvm-svn: 280376	2016-09-01 14:59:20 +00:00
Simon Dardis	bd27154757	[mips] interAptiv based generic schedule model This scheduler describes a processor which covers all MIPS ISAs based around the interAptiv and P5600 timings. Reviewers: vkalintiris, dsanders Differential Revision: https://reviews.llvm.org/D23551 llvm-svn: 280374	2016-09-01 14:53:53 +00:00
Andrey Bokhanko	b7201cce3b	[CMake] Fix LLVM_ENABLE_EH and LLVM_ENABLE_RTTI on MSVC Patch by Johannes Sebastian Mueller-Roemer. Differential Revision: https://reviews.llvm.org/D23645 llvm-svn: 280371	2016-09-01 14:39:54 +00:00
Sanjay Patel	dd861964d1	[InstCombine] remove fold of an icmp pattern that should never happen While removing a scalar shackle from an icmp fold, I noticed that I couldn't find any tests to trigger this code path. The 'and' shrinking transform should be handled by InstCombiner::foldCastedBitwiseLogic() or eliminated with InstSimplify. The icmp narrowing is part of InstCombiner::foldICmpWithCastAndCast(). Differential Revision: https://reviews.llvm.org/D24031 llvm-svn: 280370	2016-09-01 14:20:43 +00:00
Krzysztof Parzyszek	07d9f53b51	[Hexagon] Deal with undefs when extending live intervals Reapply r280275, since MSVC accepts r280358. llvm-svn: 280369	2016-09-01 13:59:35 +00:00
Elena Demikhovsky	4d7738dfde	Optimized FMA intrinsic + FNEG , like -(ab+c) and FNEG + FMA, like ab-c or (-a)*b+c. The bug description is here : https://llvm.org/bugs/show_bug.cgi?id=28892 Differential revision: https://reviews.llvm.org/D23313 llvm-svn: 280368	2016-09-01 13:58:53 +00:00
James Molloy	88cad7e5cf	[SimplifyCFG] Handle tail-sinking of more than 2 incoming branches This was a real restriction in the original version of SinkIfThenCodeToEnd. Now it's been rewritten, the restriction can be lifted. As part of this, we handle a very common and useful case where one of the incoming branches is actually conditional. Consider: if (a) x(1); else if (b) x(2); This produces the following CFG: [if] / \ [x(1)] [if] \| \| \ \| \| \ \| [x(2)] \| \ \| / [ end ] [end] has two unconditional predecessor arcs and one conditional. The conditional refers to the implicit empty 'else' arc. This same pattern can also be caused by an empty default block in a switch. We can't sink the call to x() down to end because no call to x() happens on the third incoming arc (assume that x() has sideeffects for the sake of argument; if something is safe to speculate we could indeed sink nevertheless but this cannot happen in the general case and causes many extra selects). We are now able to detect this case and split off the unconditional arcs to a common successor: [if] / \ [x(1)] [if] \| \| \ \| \| \ \| [x(2)] \| \ / \| [sink.split] \| \ / [ end ] Now we can sink the call to x() into %sink.split. This can cause significant code simplification in many testcases. llvm-svn: 280364	2016-09-01 12:58:13 +00:00
Krzysztof Parzyszek	4f863d75f4	Add an optional parameter with a list of undefs to extendToIndices Reapply r280268, hopefully in a version that MSVC likes. llvm-svn: 280358	2016-09-01 12:10:36 +00:00
Honggyu Kim	9eb6a10251	[IR] Properly handle escape characters in Attribute::getAsString() If an attribute name has special characters such as '\01', it is not properly printed in LLVM assembly language format. Since the format expects the special characters are printed as it is, it has to contain escape characters to make it printable. Before: attributes #0 = { ... "counting-function"="^A__gnu_mcount_nc" ... After: attributes #0 = { ... "counting-function"="\01__gnu_mcount_nc" ... Reviewers: hfinkel, rengolin, rjmccall, compnerd Subscribers: nemanjai, mcrosier, hans, shenhan, majnemer, llvm-commits Differential Revision: https://reviews.llvm.org/D23792 llvm-svn: 280357	2016-09-01 11:44:06 +00:00
James Molloy	eec6df3193	[SimplifyCFG] Change the algorithm in SinkThenElseCodeToEnd r279460 rewrote this function to be able to handle more than two incoming edges and took pains to ensure this didn't regress anything. This time we change the logic for determining if an instruction should be sunk. Previously we used a single pass greedy algorithm - sink instructions until one requires more than one PHI node or we run out of instructions to sink. This had the problem that sinking instructions that had non-identical but trivially the same operands needed extra logic so we sunk them aggressively. For example: %a = load i32* %b %d = load i32* %b %c = gep i32* %a, i32 0 %e = gep i32* %d, i32 1 Sinking %c and %e would naively require two PHI merges as %a != %d. But the loads are obviously equivalent (and maybe can't be hoisted because there is no common predecessor). This is why we implemented the fairly complex function areValuesTriviallySame(), to look through trivial differences like this. However it's just not clever enough. Instead, throw areValuesTriviallySame away, use pointer equality to check equivalence of operands and switch to a two-stage algorithm. In the "scan" stage, we look at every sinkable instruction in isolation from end of block to front. If it's sinkable, we keep track of all operands that required PHI merging. In the "sink" stage, we iteratively sink the last non-terminator in the source blocks. But when calculating how many PHIs are actually required to be inserted (to work out if we should stop or not) we remove any values that have already been sunk from the set of PHI-merges required, which allows us to be more aggressive. This turns an algorithm with potentially recursive lookahead (looking through GEPs, casts, loads and any other instruction potentially not CSE'd) to two linear scans. llvm-svn: 280351	2016-09-01 10:44:35 +00:00
Hal Finkel	5081ac27c7	Add ISD::EH_DWARF_CFA, simplify @llvm.eh.dwarf.cfa on Mips, fix on PowerPC LLVM has an @llvm.eh.dwarf.cfa intrinsic, used to lower the GCC-compatible __builtin_dwarf_cfa() builtin. As pointed out in PR26761, this is currently broken on PowerPC (and likely on ARM as well). Currently, @llvm.eh.dwarf.cfa is lowered using: ADD(FRAMEADDR, FRAME_TO_ARGS_OFFSET) where FRAME_TO_ARGS_OFFSET defaults to the constant zero. On x86, FRAME_TO_ARGS_OFFSET is lowered to 2*SlotSize. This setup, however, does not work for PowerPC. Because of the way that the stack layout works, the canonical frame address is not exactly (FRAMEADDR + FRAME_TO_ARGS_OFFSET) on PowerPC (there is a lower save-area offset as well), so it is not just a matter of implementing FRAME_TO_ARGS_OFFSET for PowerPC (unless we redefine its semantics -- We can do that, since it is currently used only for @llvm.eh.dwarf.cfa lowering, but the better to directly lower the CFA construct itself (since it can be easily represented as a fixed-offset FrameIndex)). Mips currently does this, but by using a custom lowering for ADD that specifically recognizes the (FRAMEADDR, FRAME_TO_ARGS_OFFSET) pattern. This change introduces a ISD::EH_DWARF_CFA node, which by default expands using the existing logic, but can be directly lowered by the target. Mips is updated to use this method (which simplifies its implementation, and I suspect makes it more robust), and updates PowerPC to do the same. Fixes PR26761. Differential Revision: https://reviews.llvm.org/D24038 llvm-svn: 280350	2016-09-01 10:28:47 +00:00
Valery Pykhtin	1b13886b5f	[AMDGPU] Scalar Memory instructions TD refactoring Differential revision: https://reviews.llvm.org/D23996 llvm-svn: 280349	2016-09-01 09:56:47 +00:00
Hal Finkel	40d7f5c277	Add a counter-function insertion pass As discussed in https://reviews.llvm.org/D22666, our current mechanism to support -pg profiling, where we insert calls to mcount(), or some similar function, is fundamentally broken. We insert these calls in the frontend, which means they get duplicated when inlining, and so the accumulated execution counts for the inlined-into functions are wrong. Because we don't want the presence of these functions to affect optimizaton, they should be inserted in the backend. Here's a pass which would do just that. The knowledge of the name of the counting function lives in the frontend, so we're passing it here as a function attribute. Clang will be updated to use this mechanism. Differential Revision: https://reviews.llvm.org/D22825 llvm-svn: 280347	2016-09-01 09:42:39 +00:00
Chandler Carruth	1ab16f8c2d	[Support] Fix a warning introduced in r280339 due to the member initializers not being in the same order as the members. Specifically, 'preg' is the first member followed by 'error', so they will be initialized in that order and should be written in the member initializer list in that order. For the constructor in question, there is no change in behavior. llvm-svn: 280345	2016-09-01 09:31:02 +00:00
James Molloy	21744689b9	[SimplifyCFG] Fix nondeterministic iteration order We iterate over the result from SafeToMergeTerminators, so make it a SmallSetVector instead of a SmallPtrSet. Should fix stage3 convergence builds. llvm-svn: 280342	2016-09-01 09:01:34 +00:00
George Rimar	e34257f21c	Commit of forgotten header for r280339 "[LLVM/Support] - Create no-arguments constructor for llvm::Regex" llvm-svn: 280340	2016-09-01 08:02:20 +00:00
George Rimar	a9ff072fe8	[LLVM/Support] - Create no-arguments constructor for llvm::Regex This is useful when need to defer the construction, e.g. using Regex as a member of class. Differential revision: https://reviews.llvm.org/D24101 llvm-svn: 280339	2016-09-01 08:00:28 +00:00
James Molloy	e656642295	[SimplifyCFG] Improve FoldValueComparisonIntoPredecessors to handle more cases A very important case is not handled here: multiple arcs to a single block with a PHI. Consider: a: %1 = icmp %b, 1 br %1, label %c, label %e c: %2 = icmp %b, 2 br %2, label %d, label %e d: br %e e: phi [0, %a], [1, %c], [2, %d] FoldValueComparisonIntoPredecessors will refuse to fold this, as it doesn't know how to deal with two arcs to a common destination with different PHI values. The answer is obvious - just split all conflicting arcs. llvm-svn: 280338	2016-09-01 07:45:25 +00:00
Dean Michael Berris	6d6addbe15	[NFC] Remove unnecessary comment llvm-svn: 280336	2016-09-01 01:58:24 +00:00
Dean Michael Berris	e8ae5baaf7	[XRay] Detect and emit sleds for sibling/tail calls Summary: This change promotes the 'isTailCall(...)' member function to TargetInstrInfo as a query interface for determining on a per-target basis whether a given MachineInstr is a tail call instruction. We build upon this in the XRay instrumentation pass to emit special sleds for tail call optimisations, where we emit the correct kind of sled. The tail call sleds look like a mix between the function entry and function exit sleds. Form-wise, the sled comes before the "jmp" instruction that implements the tail call similar to how we do it for the function entry sled. Functionally, because we know this is a tail call, it behaves much like an exit sled -- i.e. at runtime we may use the exit trampolines instead of a different kind of trampoline. A follow-up change to recognise these sleds will be done in compiler-rt, so that we can start intercepting these initially as exits, but also have the option to have different log entries to more accurately reflect that this is actually a tail call. Reviewers: echristo, rSerge, majnemer Subscribers: mehdi_amini, dberris, llvm-commits Differential Revision: https://reviews.llvm.org/D23986 llvm-svn: 280334	2016-09-01 01:29:13 +00:00
Kostya Serebryany	e2d0f63654	[libFuzzer] add -minimize_crash flag (to minimize crashers). also add two tests that I failed to commit last time llvm-svn: 280332	2016-09-01 01:22:27 +00:00
Dean Michael Berris	40e6ba16a1	[XRay][NFC] Promote isTailCall() as virtual in TargetInstrInfo. This change is broken out from D23986, where XRay detects tail call exits. llvm-svn: 280331	2016-09-01 01:03:22 +00:00
Heejin Ahn	10a7086700	Revert "Add asm.js-style setjmp/longjmp handling for wasm" This reverts commit r280302, it broke the integration tests. llvm-svn: 280329	2016-09-01 00:44:37 +00:00
Justin Bogner	f1382fc71e	Support: Avoid errors with LLVM_FALLTHROUGH in clang 3.6 and below in C mode Older versions of clang defined __has_cpp_attribute in C mode, but would choke on scoped attributes, as per llvm.org/PR23435. Since we support building with clang all the way back to 3.1, we have to work around this issue. llvm-svn: 280326	2016-08-31 23:43:14 +00:00
Nick Lewycky	8dd4dad08b	Add cast to appease windows builder. Fixes build break introduced in r280306. llvm-svn: 280311	2016-08-31 23:24:43 +00:00
Zachary Turner	77807637ff	[codeview] Have visitTypeBegin return the record type. Previously we were assuming that any visitation of types would necessarily be against a type we had binary data for. Reasonable assumption when were just reading PDBs and dumping them, but once we start writing PDBs from Yaml this breaks down, because we have no binary data yet, only Yaml, and from that we need to read the record kind and perform the switch based on that. So this patch does that. Instead of having the visitor switch on the kind that is already in the CVType record, we change the visitTypeBegin() method to return the Kind, and switch on the returned value. This way, the default implementation can still return the value from the CVType, but the implementation which visits Yaml records and serializes binary PDB type records can use the field in the Yaml as the source of the switch. llvm-svn: 280307	2016-08-31 23:14:31 +00:00
Nick Lewycky	97e49ac59e	Add -fprofile-dir= to clang. -fprofile-dir=path allows the user to specify where .gcda files should be emitted when the program is run. In particular, this is the first flag that causes the .gcno and .o files to have different paths, LLVM is extended to support this. -fprofile-dir= does not change the file name in the .gcno (and thus where lcov looks for the source) but it does change the name in the .gcda (and thus where the runtime library writes the .gcda file). It's different from a GCOV_PREFIX because a user can observe that the GCOV_PREFIX_STRIP will strip paths off of -fprofile-dir= but not off of a supplied GCOV_PREFIX. To implement this we split -coverage-file into -coverage-data-file and -coverage-notes-file to specify the two different names. The !llvm.gcov metadata node grows from a 2-element form {string coverage-file, node dbg.cu} to 3-elements, {string coverage-notes-file, string coverage-data-file, node dbg.cu}. In the 3-element form, the file name is already "mangled" with .gcno/.gcda suffixes, while the 2-element form left that to the middle end pass. llvm-svn: 280306	2016-08-31 23:04:32 +00:00
Reid Kleckner	34151b3000	Fix the MSVC 2013 build by using Elf_Word instead of making a local typedef llvm-svn: 280304	2016-08-31 22:45:36 +00:00
NAKAMURA Takumi	96a6c49b6b	[CMake] Increase stack size to 16MiB for all mingw executables. llvm-svn: 280303	2016-08-31 22:43:23 +00:00
Heejin Ahn	23d57103a4	Add asm.js-style setjmp/longjmp handling for wasm Summary: This patch adds asm.js-style setjmp/longjmp handling support for WebAssembly. It also uses JavaScript's try and catch mechanism. Reviewers: jpp, dschuff Subscribers: jfb, dschuff Differential Revision: https://reviews.llvm.org/D23928 llvm-svn: 280302	2016-08-31 22:40:34 +00:00
Reid Kleckner	109448ee81	Revert "Add an optional parameter with a list of undefs to extendToIndices" This reverts commit r280268, it causes all MSVC 2013 to ICE. This appears to have been fixed in a later MSVC 2013 update, because I cannot reproduce it locally. That said, all upstream LLVM bots are broken right now, so I am reverting. Also reverts dependent change r280275, "[Hexagon] Deal with undefs when extending live intervals". llvm-svn: 280301	2016-08-31 22:36:02 +00:00
Sanjay Patel	0d70831d73	[InstCombine] allow icmp (shr exact X, C2), C fold for splat constant vectors The enhancement to foldICmpDivConstant ( http://llvm.org/viewvc/llvm-project?view=revision&revision=280299 ) allows us to remove the ConstantInt check; no other changes needed. llvm-svn: 280300	2016-08-31 22:18:43 +00:00
Sanjay Patel	541aef4661	[InstCombine] allow icmp (div X, Y), C folds for splat constant vectors Converting all of the overflow ops to APInt looked risky, so I've left that as a TODO. llvm-svn: 280299	2016-08-31 21:57:21 +00:00
Matt Arsenault	b50eb8dc2b	AMDGPU: Fix introducing stack access on unaligned v16i8 llvm-svn: 280298	2016-08-31 21:52:27 +00:00
Matt Arsenault	1d2151781b	AMDGPU: Use copy instead of mov during frame lowering This occurs before RA pseudos are expanded. It's less code to emit the copy. llvm-svn: 280297	2016-08-31 21:52:25 +00:00
Matt Arsenault	57bc4324f8	AMDGPU: Refactor frame lowering This will make future changes easier. llvm-svn: 280296	2016-08-31 21:52:21 +00:00
Zachary Turner	2f951ce9c9	[codeview] Add TypeVisitorCallbackPipeline. We were kind of hacking this together before by embedding the ability to forward requests into the TypeDeserializer. When we want to start adding more different kinds of visitor callback interfaces though, this doesn't scale well and is very inflexible. So introduce the notion of a pipeline, which itself implements the TypeVisitorCallbacks interface, but which contains an internal list of other callbacks to invoke in sequence. Also update the existing uses of CVTypeVisitor to use this new pipeline class for deserializing records before visiting them with another visitor. llvm-svn: 280293	2016-08-31 21:42:26 +00:00
Tim Northover	11a2354670	GlobalISel: use G_TYPE to annotate physregs with a type. More preparation for dropping source types from MachineInstrs: regsters coming out of already-selected code (i.e. non-generic instructions) don't have a type, but that information is needed so we must add it manually. This is done via a new G_TYPE instruction. llvm-svn: 280292	2016-08-31 21:24:02 +00:00
Derek Schuff	1b258d313c	[WebAssembly] Disable folding of GA+reg into load/store constant offsets Summary: If the register has a negative value then unsigned overflow will occur; this case is sometimes even created intentionally by LSR. For now disable GA+reg folding. Fixes PR29127 Differential Revision: https://reviews.llvm.org/D24053 llvm-svn: 280285	2016-08-31 20:27:20 +00:00
Sanjay Patel	85d79744df	[InstCombine] change insertRangeTest() to use APInt instead of Constant; NFCI This is prep work before changing the callers to also use APInt which will allow folds for splat vectors. Currently, the callers have ConstantInt guards in place, so no functional change intended with this commit. llvm-svn: 280282	2016-08-31 19:49:56 +00:00
Michael Zolotukhin	e0b2d97b52	[LoopInfo] Add verification by recomputation. Summary: Current implementation of LI verifier isn't ideal and fails to detect some cases when LI is incorrect. For instance, it checks that all recorded loops are in a correct form, but it has no way to check if there are no more other (unrecorded in LI) loops in the function. This patch adds a way to detect such bugs. Reviewers: chandlerc, sanjoy, hfinkel Subscribers: llvm-commits, silvas, mzolotukhin Differential Revision: https://reviews.llvm.org/D23437 llvm-svn: 280280	2016-08-31 19:26:19 +00:00
Geoff Berry	8d84605f25	[EarlyCSE] Optionally use MemorySSA. NFC. Summary: Use MemorySSA, if requested, to do less conservative memory dependency checking. This change doesn't enable the MemorySSA enhanced EarlyCSE in the default pipelines, so should be NFC. Reviewers: dberlin, sanjoy, reames, majnemer Subscribers: mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D19821 llvm-svn: 280279	2016-08-31 19:24:10 +00:00
Quentin Colombet	bd850f4185	Actually check for the diagnostic to be emitted! This makes the test case in r280273 actually useful! llvm-svn: 280276	2016-08-31 18:53:32 +00:00
Krzysztof Parzyszek	e21a0b3b9f	[Hexagon] Deal with undefs when extending live intervals llvm-svn: 280275	2016-08-31 18:52:09 +00:00
Tom Stellard	ba5730884b	AMDGPU/SI: Make sure llvm.amdgcn.implicitarg.ptr() is at least 4-byte aligned Summary: This fixes some OpenCV tests that were broken by libclc commit r276443. Reviewers: arsenm, jvesely Subscribers: arsenm, wdng, llvm-commits Differential Revision: https://reviews.llvm.org/D24051 llvm-svn: 280274	2016-08-31 18:46:07 +00:00
Quentin Colombet	1c06a73a7c	[TargetPassConfig] Add a hook to tell whether GlobalISel should warm on fallback. Thanks to this patch, we know have a way to easly see if GlobalISel failed. llvm-svn: 280273	2016-08-31 18:43:04 +00:00
Quentin Colombet	612cd1fdd6	[ResetMachineFunction] Emit the diagnostic isel fallback when asked. This pass is now able to report when the function is being reset. llvm-svn: 280272	2016-08-31 18:43:01 +00:00
Quentin Colombet	0a243f47b0	[DiagnosticInfo] Add a diagnostic class for the fallback of ISel. This will be used to warm when we fallback in GlobalISel. llvm-svn: 280271	2016-08-31 18:42:55 +00:00
Chad Rosier	83a120337a	Fix indent. NFC. llvm-svn: 280270	2016-08-31 18:37:52 +00:00
Krzysztof Parzyszek	576225daf5	Add an optional parameter with a list of undefs to extendToIndices llvm-svn: 280268	2016-08-31 18:02:19 +00:00
Kevin Enderby	9d0c945ad6	Next set of additional error checks for invalid Mach-O files for bad load commands that use the Mach::linkedit_data_command type for the load commands that are currently used in the MachOObjectFile constructor. This contains the missing checks for LC_DATA_IN_CODE and LC_LINKER_OPTIMIZATION_HINT load commands and the fields for the Mach::linkedit_data_command type. Checking for other load commands that use this type will be added later. Also fixed a couple of places that was using sizeof(MachOObjectFile::LoadCommandInfo) that should have been using sizeof(MachO::load_command). llvm-svn: 280267	2016-08-31 17:57:46 +00:00
Geoff Berry	64f5ed172a	[EarlyCSE] Allow forwarding a non-invariant load into an invariant load. Reviewers: sanjoy Subscribers: mcrosier, llvm-commits Differential Revision: https://reviews.llvm.org/D23935 llvm-svn: 280265	2016-08-31 17:45:31 +00:00
Chad Rosier	0de580aaab	[SLP] Update the debug based on Michael's suggestion. Passing the types/opcode check still doesn't guarantee we'll actually vectorize. Therefore, just make it clear we're attempting to vectorize. llvm-svn: 280263	2016-08-31 17:41:12 +00:00
Geoff Berry	4bda57622b	[LangRef] Clarify !invariant.load semantics. Based on discussion on llvm-dev. llvm-svn: 280262	2016-08-31 17:39:21 +00:00
Chad Rosier	54807a9b9d	[SLP] Sink debug after checking for matching types/opcode. Differential Revision: https://reviews.llvm.org/D24090 llvm-svn: 280260	2016-08-31 17:31:09 +00:00
Davide Italiano	1e9d3d3b40	[lib/LTO] Factor out logic for running passes. This is in preparation for adding an option to run a custom pipeline with the new PM. It's currently used in lld. llvm-svn: 280258	2016-08-31 17:02:44 +00:00
Tim Shen	48f814e8a3	s/static inline/static/ for headers I have changed in r279475. NFC. llvm-svn: 280257	2016-08-31 16:48:13 +00:00
Teresa Johnson	8068cc68f7	[LTO] Fix common test to reflect r279911 and move to X86 subdirectory Adjust the test to reflect the changes to common handling in r279911. This test wasn't running due to an incorrect REQUIRES and thus missed being modified for r279911 before. It was changed to XFAIL when the bad REQUIRES was discovered. Remove the XFAIL and move to a new X86 subdirectory that will properly disable on non-X86. llvm-svn: 280256	2016-08-31 16:15:39 +00:00
Reid Kleckner	9dac47319d	[codeview] Emit vtable shape information The shape of the vtable is passed down as the size of the __vtbl_ptr_type. This special pointer type appears both as the pointee type of the vptr type, and by itself in every dynamic class. For classes with multiple vtables, only the shape of the primary vftable is included, as the shape of all secondary vftables will be the same as in the base class. Fixes PR28150 llvm-svn: 280254	2016-08-31 15:59:30 +00:00

... 4 5 6 7 8 ...

137947 Commits