llvm-project

Commit Graph

Author	SHA1	Message	Date
Dan Gohman	a39ca60126	[WebAssembly] Add an assertion to catch unexpected MCFixupKindInfo flags. llvm-svn: 257657	2016-01-13 19:31:57 +00:00
Dan Gohman	938ff9f0aa	[WebAssembly] MCFixupKindInfo's TargetSize is in bits rather than bytes. llvm-svn: 257655	2016-01-13 19:29:37 +00:00
Hans Wennborg	81efb6b418	Fix struct/class mismatch for MachineSchedContext llvm-svn: 257648	2016-01-13 18:59:45 +00:00
Sanjay Patel	da08082a57	rangify; NFCI llvm-svn: 257646	2016-01-13 18:37:28 +00:00
Sanjay Patel	c5d29aa7c4	don't repeat names in comments ; NFC llvm-svn: 257643	2016-01-13 17:43:35 +00:00
Sanjay Patel	f23416852f	fix typo llvm-svn: 257626	2016-01-13 17:23:52 +00:00
Marek Olsak	46dadbfab2	AMDGPU/SI: Fix a GPU hang with POS_W_FLOAT enabled Reviewers: tstellarAMD, arsenm Subscribers: arsenm Differential Revision: http://reviews.llvm.org/D16037 llvm-svn: 257625	2016-01-13 17:23:20 +00:00
Marek Olsak	3c0ebc71f1	AMDGPU/SI: Remove ending s_endpgm from non-void functions Reviewers: tstellarAMD, arsenm Subscribers: arsenm Differential Revision: http://reviews.llvm.org/D16035 llvm-svn: 257623	2016-01-13 17:23:12 +00:00
Marek Olsak	8e9cc63bfb	AMDGPU/SI: Add s_waitcnt at the end of non-void functions Summary: v2: Make ReturnsVoid private, so that I can another 8 lines of code and look more productive. Reviewers: tstellarAMD, arsenm Subscribers: arsenm Differential Revision: http://reviews.llvm.org/D16034 llvm-svn: 257622	2016-01-13 17:23:09 +00:00
Marek Olsak	8a0f335ad6	AMDGPU/SI: Add support for non-void functions Summary: Return values can be stored in SGPRs (i32) and VGPRs (f32). This will be used by functions which expect some bytecode or other binary to be appended at the end. It allows defining in which registers the return values will be stored. v2: don't do this for compute shaders Reviewers: tstellarAMD, arsenm Subscribers: arsenm Differential Revision: http://reviews.llvm.org/D16033 llvm-svn: 257621	2016-01-13 17:23:04 +00:00
Derek Schuff	9c3bf3187a	[WebAssembly] Invalidate liveness in CFG stackifier WebAssemblyCFGStackify does not track liveness for EXPR_STACK, causing verifier failure if liveness has not already been invalidated. llvm-svn: 257620	2016-01-13 17:10:28 +00:00
Sanjay Patel	c775fa43d0	fix typo llvm-svn: 257613	2016-01-13 16:34:10 +00:00
Nicolai Haehnle	02c3291566	AMDGPU/SI: Add SI Machine Scheduler Summary: It is off by default, but can be used with --misched=si Patch by: Axel Davy Reviewers: arsenm, tstellarAMD, nhaehnle Subscribers: nhaehnle, solenskiner, arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D11885 llvm-svn: 257609	2016-01-13 16:10:10 +00:00
Michael Zuckerman	6b35f460ac	Fixing warning by adding the X86ISD::VROTRI case. Differential Revision: http://reviews.llvm.org/D16052 llvm-svn: 257607	2016-01-13 15:48:42 +00:00
Krzysztof Parzyszek	a3c5d44437	[Hexagon] Do not insert non-phis before phis in bit simplification llvm-svn: 257606	2016-01-13 15:48:18 +00:00
Michael Zuckerman	0e31b22487	[AVX512] Adding PMOVSXBD/W/Q , PMOVZSDQ and PMOVZSWD/Q Intrinsics . Differential Revision: http://reviews.llvm.org/D16111 llvm-svn: 257604	2016-01-13 14:59:19 +00:00
Michael Zuckerman	43cea85db9	[AVX512] Adding PMOVZXBD/W/Q, PMOVZXDQ and PMOVZXWD/Q Intrinsics Differential Revision: http://reviews.llvm.org/D16071 llvm-svn: 257601	2016-01-13 14:25:21 +00:00
Ulrich Weigand	46ff7ec317	[PowerPC] Fix large code model with the ELFv2 ABI The global entry point prologue currently assumes that the TOC associated with a function is less than 2GB away from the function entry point. This is always true when using the medium or small code model, but may not be the case when using the large code model. This patch adds a new variant of the ELFv2 global entry point prologue that lifts the 2GB restriction when building with -mcmodel=large. This works by emitting a quadword containing the distance from the function entry point to its associated TOC immediately before the entry point, and then using a prologue like: ld r2,-8(r12) add r2,r2,r12 Since creation of the entry point prologue is now split across two separate routines (PPCLinuxAsmPrinter::EmitFunctionEntryLabel emits the data word, PPCLinuxAsmPrinter::EmitFunctionBodyStart the prolog code), I've switched to using named labels instead of just temporaries to indicate the locations of the global and local entry points and the new TOC offset data word. These names are provided by new routines in PPCFunctionInfo modeled after the existing PPCFunctionInfo::getPICOffsetSymbol. Note that a corresponding change was committed to GCC here: https://gcc.gnu.org/ml/gcc-patches/2015-12/msg00355.html Reviewers: hfinkel Differential Revision: http://reviews.llvm.org/D15500 llvm-svn: 257597	2016-01-13 13:12:23 +00:00
Michael Zuckerman	298a680c80	[AVX512] adding PRORQ , PRORD , PRORLVQ and PRORLVD Intrinsics Differential Revision: http://reviews.llvm.org/D16052 llvm-svn: 257594	2016-01-13 12:39:33 +00:00
Marek Olsak	4e99b6ec01	AMDGPU/SI: Allow more shader inputs Reviewers: tstellarAMD, arsenm Subscribers: arsenm Differential Revision: http://reviews.llvm.org/D16032 llvm-svn: 257593	2016-01-13 11:46:48 +00:00
Marek Olsak	b6c8c3d165	AMDGPU/SI: Allow any number of PS inputs Summary: With the ability to concatenate shader binaries, the limit of 15 no longer applies. Reviewers: tstellarAMD, arsenm Subscribers: arsenm Differential Revision: http://reviews.llvm.org/D16031 llvm-svn: 257592	2016-01-13 11:46:10 +00:00
Marek Olsak	fccabaf57e	AMDGPU/SI: Add new target attribute InitialPSInputAddr Summary: This allows Mesa to pass initial SPI_PS_INPUT_ADDR to LLVM. The register assigns VGPR locations to PS inputs, while the ENA register determines whether or not they are loaded. Mesa needs to set some inputs as not-movable, so that a pixel shader prolog binary appended at the beginning can assume where some inputs are. v2: Make PSInputAddr private, because there is never enough silly getters and setters for people to read. Reviewers: tstellarAMD, arsenm Subscribers: arsenm Differential Revision: http://reviews.llvm.org/D16030 llvm-svn: 257591	2016-01-13 11:45:36 +00:00
Marek Olsak	926c56f50c	AMDGPU/SI: Fix a bug in SIFoldOperands Summary: ret.ll will contain a test for this Reviewers: tstellarAMD, arsenm Subscribers: arsenm Differential Revision: http://reviews.llvm.org/D16029 llvm-svn: 257590	2016-01-13 11:44:29 +00:00
Andrey Turetskiy	1ce2c9973f	LEA code size optimization pass (Part 2): Remove redundant LEA instructions. Make x86 OptimizeLEAs pass remove LEA instruction if there is another LEA (in the same basic block) which calculates address differing only by a displacement. Works only for -Oz. Differential Revision: http://reviews.llvm.org/D13295 llvm-svn: 257589	2016-01-13 11:30:44 +00:00
Junmo Park	b98cc2a617	Remove extra whitespace. NFC. llvm-svn: 257578	2016-01-13 07:03:42 +00:00
James Y Knight	7699494f08	[SPARC] Revamp AnalyzeBranch and add ReverseBranchCondition. AnalyzeBranch on X86 (and, previously, SPARC, which implementation was copied from X86) tries to modify the branches based on block layout (e.g. checking isLayoutSuccessor), when AllowModify is true. The rest of the architectures leave that up to the caller, which can call InsertBranch, RemoveBranch, and ReverseBranchCondition as appropriate. That appears to be the preferred way to do it nowadays. This commit makes SPARC like the rest: replaces AnalyzeBranch with an implementation cribbed from AArch64, and adds a ReverseBranchCondition implementation. Additionally, a test-case has been added (also cribbed from AArch64) demonstrating that redundant branch sequences no longer get emitted. E.g., it used to emit code like this: bne .LBB1_2 nop ba .LBB1_1 nop .LBB1_2: And now emits: cmp %i0, 42 be .LBB1_1 nop llvm-svn: 257572	2016-01-13 04:44:14 +00:00
Xinliang David Li	81f18a58f1	[Coverage] Refactor coverage mapping reader code (Resubmit after fixing a typo that breaks test on big endian machines) In this refactoring, member functions are introduced to access CovMap header/func record members and hide layout details. This will enable further code restructuring to support reading multiple versions of coverage mapping data with shared/templatized code. (When the coverage map format version changes, backward compatibility should be preserved). llvm-svn: 257571	2016-01-13 04:36:15 +00:00
Xinliang David Li	c8c39ea822	Rollback r257551 -- unexpected test failures TBI llvm-svn: 257564	2016-01-13 02:46:40 +00:00
Keno Fischer	78e5c9e6e2	Re-Revert r257105 (Verifier debug info changes) While I investigate some new buildbot failures. This was originally reapplied as r257550 and r257558. llvm-svn: 257563	2016-01-13 02:31:14 +00:00
Kostya Serebryany	72fdb32dac	[libFuzzer] make sure to update CurrentUnit when drilling llvm-svn: 257560	2016-01-13 01:58:27 +00:00
Keno Fischer	e7a4e5613e	Use utostr rather than std::to_string Looks like std::to_string is not available for Android. Hopefully this fixes the bot. llvm-svn: 257558	2016-01-13 01:26:57 +00:00
Matthias Braun	4cc3421a24	AsmPrinter: Fix wrong OS X versions being emitted for darwin triples The version numbers of the darwin kernel are different from the version numbers of OS X, so we need adjustments if we had "--darwin" triples. Use the existing utility functions in TargetTriple for this. Fixes rdar://22056966 Differential Revision: http://reviews.llvm.org/D14601 llvm-svn: 257555	2016-01-13 01:18:13 +00:00
David Majnemer	c3340db77d	[CodeView] Mark our lines as statements, not expressions The line tables for CodeView make a distinction between expressions and statements. As it turns out, MSVC always emits them as statements and we always emit them as expressions. Let's switch to statements to match the CodeView that they emit. llvm-svn: 257553	2016-01-13 01:05:23 +00:00
Xinliang David Li	b92bc6dff2	[Coverage] Refactor coverage mapping reader code /NFC (Resubmit after fixing build bot failures) In this refactoring, member functions are introduced to access CovMap header/func record members and hide layout details. This will enable further code restructuring to support reading multiple versions of coverage mapping data with shared/templatized code. (When the coverage map format version changes, backward compatibility should be preserved). llvm-svn: 257551	2016-01-13 00:53:46 +00:00
Keno Fischer	25916079ff	Reapply r257105 "[Verifier] Check that debug values have proper size" The following extra changes were made to test cases: Manually making the variable be the actual type instead of a pointer to avoid pointer-size differences in generic code: LLVM :: DebugInfo/Generic/2010-03-24-MemberFn.ll LLVM :: DebugInfo/Generic/2010-04-06-NestedFnDbgInfo.ll LLVM :: DebugInfo/Generic/2010-05-03-DisableFramePtr.ll LLVM :: DebugInfo/Generic/varargs.ll Delete sizing information from debug info for the same reason (but the presence of the pointer was important to the test case): LLVM :: DebugInfo/Generic/restrict.ll LLVM :: DebugInfo/Generic/tu-composite.ll LLVM :: Linker/type-unique-type-array-a.ll LLVM :: Linker/type-unique-simple2.ll Fixing an incorrect DW_OP_deref LLVM :: DebugInfo/Generic/2010-05-03-OriginDIE.ll Fixing a missing DW_OP_deref LLVM :: DebugInfo/Generic/incorrect-variable-debugloc.ll Additionally, clang should no longer complain during bootstrap after r257534. The original commit message was: ``` Summary: Teach the Verifier to make sure that the storage size given to llvm.dbg.declare or the value size given to llvm.dbg.value agree with what is declared in DebugInfo. This is implicitly assumed in a number of passes (e.g. in SROA). Additionally this catches a number of common mistakes, such as passing a pointer when a value was intended or vice versa. One complication comes from stack coloring which modifies the original IR when it merges allocas in order to make sure that if AA falls back to the IR it gets the correct result. However, given this new invariant, indiscriminately replacing one alloca by a different (differently sized one) is no longer valid. Fix this by just undefing out any use of the alloca in a dbg.declare in this case. Additionally, I had to fix a number of test cases. Of particular note: - I regenerated dbg-changes-codegen-branch-folding.ll from the given source as it was affected by the bug fixed in r256077 - two-cus-from-same-file.ll was changed to avoid having a variable-typed debug variable as that would depend on the target, even though this test is supposed to be generic - I had to manually declare size/align for reference type. See also the discussion for D14275/r253186. - fpstack-debuginstr-kill.ll required changing `double` to `long double` - most others were just a question of adding OP_deref ``` llvm-svn: 257550	2016-01-13 00:31:44 +00:00
Xinliang David Li	8c65278179	Rollback r257547 -- buildbot failure TBI llvm-svn: 257549	2016-01-13 00:27:24 +00:00
Xinliang David Li	c3498b07db	[Coverage] Refactor coverage mapping reader code /NFC In this refactoring, member functions are introduced to access CovMap header/func record members and hide layout details. This will enable further code restructuring to support reading multiple versions of coverage mapping data with shared/templatized code. (When the coverage map format version changes, backward compatibility should be preserved). llvm-svn: 257547	2016-01-13 00:16:43 +00:00
Ana Pazos	359cab3bb3	Guard fabs to bfc convert with V6T2 flag Summary: BFC instructions are available in ARMv6T2 and above. Reviewers: t.p.northover Subscribers: aemerson Differential Revision: http://reviews.llvm.org/D16076 llvm-svn: 257546	2016-01-13 00:03:35 +00:00
Quentin Colombet	f8e3030794	[ARM] Mark VMOV with immediate: isAsCheapAsMove. VMOVs are not strictly speaking cheap, but they are as expensive as a vector copy (VORR), so we should prefer rematerialization over splitting when it applies. rdar://problem/23754176 llvm-svn: 257545	2016-01-13 00:02:40 +00:00
Fiona Glaser	db7824f0c1	CannotBeOrderedLessThanZero: add some missing cases llvm-svn: 257542	2016-01-12 23:37:30 +00:00
Rui Ueyama	6161b38dbc	COFF: Teach llvm-objdump how to dump DLL forwarder symbols. llvm-svn: 257539	2016-01-12 23:28:42 +00:00
Derek Schuff	4377e2d713	[WebAssembly] Fix disassembler shared-libs build llvm-svn: 257536	2016-01-12 23:03:40 +00:00
Matthias Braun	b505c76c9a	RegisterPressure: Expose RegisterOperands API Previously the RegisterOperands have only been used internally in RegisterPressure.cpp. However this datastructure can be useful for other tasks as well and allows refactoring of PDiff initialisation out of RPTracker::recede(). This patch: - Exposes RegisterOperands as public API - Splits RPTracker::recede() into a part that skips DebugValues and maintains the region borders, and the core that changes register pressure when given a set of RegisterOperands. - This allows moving the PDiff initialisation out of recede() into a method of the PressureDiffs class. - The upcoming subregister scheduling code will also use RegisterOperands to avoid pushing more unrelated functionality into recede()/advance(). Differential Revision: http://reviews.llvm.org/D15473 llvm-svn: 257535	2016-01-12 22:57:35 +00:00
Keno Fischer	9aae445e09	[Utils] Insert DW_OP_bit_piece when only describing part of the variable Summary: The dbg.declare -> dbg.value conversion looks through any zext/sext to find a value to describe the variable (in the expectation that those zext/sext instruction will go away later). However, those values do not cover the entire variable and thus need a DW_OP_bit_piece. Reviewers: aprantl Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D16061 llvm-svn: 257534	2016-01-12 22:46:09 +00:00
Nathan Slingerland	7bee316890	[Support] Add saturating multiply-add support function Summary: Add SaturatingMultiplyAdd convenience function template since A + (X * Y) comes up frequently when doing weighted arithmetic. Reviewers: davidxl, silvas Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D15385 llvm-svn: 257532	2016-01-12 22:34:00 +00:00
David Majnemer	c81c8c66d5	[CodeView] Initialize column-end to zero CodeView, unlike DWARF, can associate code with a range of columns. However, LLVM can only represent a single column position internally. We used to claim that the end column and start column were the same which yielded less than satisfactory results: we would stop printing at the _beginning_ of the source expression! Instead, mark the column-end as 'zero' to indicate that we don't have one (as per the documentation for IDiaLineNumber::get_lineNumberEnd). llvm-svn: 257528	2016-01-12 21:58:20 +00:00
Dan Gohman	0656f5f845	[WebAssembly] Register the MC register info. llvm-svn: 257525	2016-01-12 21:27:55 +00:00
Michael Zuckerman	2ddcbcf464	[AVX512] adding PROLQ and PROLD Intrinsics Differential Revision: http://reviews.llvm.org/D16048 llvm-svn: 257523	2016-01-12 21:19:17 +00:00
Kyle Butt	cec40806f1	Codegen: [PPC] Handle weighted comparisons when inserting selects. Only non-weighted predicates were handled in PPCInstrInfo::insertSelect. Handle the weighted predicates as well. This latent bug was triggered by r255398, because it added use of the branch-weighted predicates. While here, switch over an enum instead of an int to get the compiler to enforce totality in the future. llvm-svn: 257518	2016-01-12 21:00:43 +00:00
Dan Gohman	4635017176	[WebAssembly] Add an EM_WEBASSEMBLY value, and several bits of code that use it. A request has been made to the official registry, but an official value is not yet available. This patch uses a temporary value in order to support development. When an official value is received, the value of EM_WEBASSEMBLY will be updated. llvm-svn: 257517	2016-01-12 20:56:01 +00:00
Dan Gohman	3469ee120c	[WebAssembly] Introduce a WebAssemblyTargetStreamer class. Refactor .param, .result, .local, and .endfunc, as directives, using the proper MCTargetStreamer mechanism, rather than fake instructions. llvm-svn: 257511	2016-01-12 20:30:51 +00:00
Krzysztof Parzyszek	f62d44be28	Replace inherited constructor with an explicit one Some bots failed when the inherited constructor was used. llvm-svn: 257508	2016-01-12 19:27:59 +00:00
Dan Gohman	1d68e80f26	[WebAssembly] Make CFG stackification independent of basic-block labels. This patch changes the way labels are referenced. Instead of referencing the basic-block label name (eg. .LBB0_0), instructions now just have an immediate which indicates the depth in the control-flow stack to find a label to jump to. This makes them much closer to what we expect to have in the binary encoding, and avoids the problem of basic-block label names not being explicit in the binary encoding. Also, it terminates blocks and loops with end_block and end_loop instructions, rather than basic-block label names, for similar reasons. This will also fix problems where two constructs appear to have the same label, because we no longer explicitly use labels, so consumers that need labels will presumably create their own labels, and presumably they won't reuse labels when they do. This patch does make the code a little more awkward to read; as a partial mitigation, this patch also introduces comments showing where the labels are, and comments on each branch showing where it's branching to. llvm-svn: 257505	2016-01-12 19:14:46 +00:00
Krzysztof Parzyszek	1279881315	[Hexagon] Implement RDF-based post-RA optimizations - Handle simple cases of register copies (what current RDF CP allows). - Hexagon-specific dead code elimination: handles dead address updates in post-increment instructions. llvm-svn: 257504	2016-01-12 19:09:01 +00:00
Sanjay Patel	53ba88dbb0	[LibCallSimplifier] use instruction-level fast-math-flags to transform pow(x, 0.5) calls Also, propagate the FMF to the newly created sqrt() call. llvm-svn: 257503	2016-01-12 19:06:35 +00:00
Sanjay Patel	046c1d6355	rangify; NFCI llvm-svn: 257500	2016-01-12 18:47:59 +00:00
Reid Kleckner	304af56d51	Auto-link with ole32.dll to simplify building LLVM.dll Patch by Jakob Bornecrantz llvm-svn: 257499	2016-01-12 18:33:49 +00:00
Sanjay Patel	a252815bc1	function names start with a lower case letter ; NFC llvm-svn: 257496	2016-01-12 18:03:37 +00:00
Teresa Johnson	388497e8be	[ThinLTO] Handle an external call from an import to an alias in dest The findExternalCalls routine ignores calls to functions already defined in the dest module. This was not handling the case where the definition in the current module is actually an alias to a function call. llvm-svn: 257493	2016-01-12 17:48:44 +00:00
Sanjay Patel	6002e78a06	[LibCallSimplifier] use instruction-level fast-math-flags to transform pow(exp(x)) calls See also: http://reviews.llvm.org/rL255555 http://reviews.llvm.org/rL256871 http://reviews.llvm.org/rL256964 http://reviews.llvm.org/rL257400 http://reviews.llvm.org/rL257404 http://reviews.llvm.org/rL257414 llvm-svn: 257491	2016-01-12 17:30:37 +00:00
Krzysztof Parzyszek	c09d630e50	RDF: Copy propagation This is a very limited implementation of DFG-based copy propagation. It only handles actual COPY instructions (does not handle other equivalents such as add-immediate with a 0 operand). The major limitation is that it does not update the DFG: that will be the change required to make it more robust (hopefully coming up soon). llvm-svn: 257490	2016-01-12 17:23:48 +00:00
Tom Stellard	f421837250	AMDGPU: Emit note directive for HSA even if there are no functions Reviewers: arsenm, echristo Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D16010 llvm-svn: 257488	2016-01-12 17:18:17 +00:00
Krzysztof Parzyszek	6f4000e763	RDF: Dead code elimination Utility class to perform DFG-based dead code elimination. llvm-svn: 257485	2016-01-12 17:01:16 +00:00
Krzysztof Parzyszek	8dca45efa8	Fix compiler warnings from r257477 llvm-svn: 257483	2016-01-12 16:51:55 +00:00
Kostya Serebryany	4b83a4f6fe	[libFuzzer] add a macro LLVM_FUZZER_DEFINES_SANITIZER_WEAK_HOOOKS llvm-svn: 257482	2016-01-12 16:50:18 +00:00
Krzysztof Parzyszek	acdff46a9c	RDF: Implement register liveness analysis Compute block live-ins and operand kill flags from the DFG. llvm-svn: 257480	2016-01-12 15:56:33 +00:00
Daniel Sanders	5e1d5a789a	[mips] Correct operand order in DSP's mthi/mtlo Summary: The result register is the second operand as per the other mt* instructions. Reviewers: vkalintiris Subscribers: llvm-commits, dsanders Differential Revision: http://reviews.llvm.org/D15993 llvm-svn: 257478	2016-01-12 15:15:14 +00:00
Krzysztof Parzyszek	b5b5a1d7ad	Register Data Flow: data flow graph Target independent, SSA-based data flow framework for representing data flow between physical registers. This commit implements the creation of the actual data flow graph. llvm-svn: 257477	2016-01-12 15:09:49 +00:00
Benjamin Kramer	ab8cc02ba5	[Hexagon] Make helper function static. NFC. llvm-svn: 257476	2016-01-12 14:58:49 +00:00
Keno Fischer	00021429d4	[ARM] Fix several state persistence bugs Summary: This fixes three bugs, in all of which state is not reset, or is incorrectly reset, between objects (i.e. when reusing the same pass manager to create multiple object files): 1) AttributeSection needs to be reset to nullptr, because otherwise the backend will try to emit into the old object file's attribute section causing a segmentation fault. 2) MappingSymbolCounter needs to be reset, otherwise the second object file will start where the first one left off. 3) The MCStreamer base class resets the Streamer's e_flags settings. Since EF_ARM_EABI_VER5 is set on streamer creation, we need to set it again after the MCStreamer was reset. Also rename Reset (upper case) to EHReset to avoid confusion with reset (lower case). Reviewers: rengolin Differential Revision: http://reviews.llvm.org/D15950 llvm-svn: 257473	2016-01-12 13:38:15 +00:00
Andrey Turetskiy	fed110f646	Test commit access - tiny comment and code style fix. llvm-svn: 257472	2016-01-12 13:34:11 +00:00
Robert Lougher	6abd69a60b	The isel pattern that selects the memory-register form of VCVTPH2PS (64 to 128-bit) matches against the pattern fragment 'vzmovl_v2i64' (a zero-extended 64-bit load). However, a change in r248784 teaches the instruction combiner that only the lower 64 bits of the input to a 128-bit vcvtph2ps are used. This means the instruction combiner will ordinarily optimize away the upper 64-bit insertelement instruction in the zero-extension and so we no longer select the memory-register form. To fix this a new pattern has been added. Differential Revision: http://reviews.llvm.org/D16067 llvm-svn: 257470	2016-01-12 11:48:25 +00:00
Christof Douma	f617e678e9	The --debug-only option now takes a comma separated list of debug types. This means that the DEBUG_TYPE cannot take a comma anymore. All existing passes conform to this rule. Differential Revision: http://reviews.llvm.org/D15645 llvm-svn: 257466	2016-01-12 10:23:13 +00:00
Igor Breger	ea8e8e9f97	AVX512: VPMOVAPS/PD and VPMOVUPS/PD (load) intrinsic implementation. Differential Revision: http://reviews.llvm.org/D16042 llvm-svn: 257463	2016-01-12 10:02:32 +00:00
Justin Bogner	b8d82abb78	LoopUnroll: Move the actual unrolling logic to a standalone function. NFC This is pure code motion - break the actual work out of runOnLoop into a reusable standalone function. llvm-svn: 257445	2016-01-12 05:21:37 +00:00
Dan Gohman	1a42728719	[WebAssembly] Implement a prototype instruction encoder and disassembler. This is using an extremely simple temporary made-up binary format, not the official binary format (which isn't defined yet). llvm-svn: 257440	2016-01-12 03:32:29 +00:00
Dan Gohman	afd7e3ada8	[WebAssembly] Register the MC subtarget info. llvm-svn: 257439	2016-01-12 03:30:06 +00:00
Dan Gohman	a11fb2373c	[WebAssembly] Define OperandTypes for decoding immediate values. llvm-svn: 257438	2016-01-12 03:09:16 +00:00
Kostya Serebryany	4174005622	[libFuzzer] when a new unit is discovered using a dictionary, print all used dictionary entries llvm-svn: 257435	2016-01-12 02:36:59 +00:00
Kostya Serebryany	859e86d962	[libFuzzer] add various debug prints. Also don't mutate based on a cmp trace like (a eq a) or (a neq a) llvm-svn: 257434	2016-01-12 02:08:37 +00:00
Dan Gohman	85159ca224	[WebAssembly] Use TSFlags instead of keeping a list of special-case opcodes. llvm-svn: 257433	2016-01-12 01:45:12 +00:00
Manman Ren	ed967f3752	CXX_FAST_TLS calling convention: performance improvement for x86-64. This is the same change on x86-64 as r255821 on AArch64. rdar://9001553 llvm-svn: 257428	2016-01-12 01:08:46 +00:00
Justin Bogner	921b04e9a4	LoopUnroll: Make canUnrollCompletely static - it doesn't use any state. NFC llvm-svn: 257427	2016-01-12 01:06:32 +00:00
Justin Bogner	a1dd493159	LoopUnroll: Clean up the maze of initialization for unroll parameters. NFC The layering of where the various loop unroll parameters are initialized and overridden here was very confusing, making it pretty difficult to tell just how the various sources interacted. Instead, we put all of the initialization logic together in a single function so that it's obvious what overrides what. llvm-svn: 257426	2016-01-12 00:55:26 +00:00
Manman Ren	5e9e65e705	CXX_FAST_TLS calling convention: performance improvement for ARM. This is the same change on ARM as r255821 on AArch64. rdar://9001553 llvm-svn: 257424	2016-01-12 00:47:18 +00:00
Kostya Serebryany	e3580956ea	[libFuzzer] extend the weak memcmp/strcmp/strncmp interceptors to receive the result of the computations. With that, don't do any mutations if memcmp/etc returned 0 llvm-svn: 257423	2016-01-12 00:43:42 +00:00
Teresa Johnson	5fe40050bd	[IRMover] Don't copy personality, etc unless creating def Function::copyAttributesFrom will copy the personality function, prefix data and prolog data from the source function to the new function, and is invoked when the IRMover copies the function prototype. This puts a reference to a constant in the source module on a function in the dest module, which causes an error when deleting the source module after importing, since the personality function in the source module still has uses (this would presumably also be an issue for the prologue and prefix data). Remove the copies added to the dest copy when creating the new prototype, as they are mapped properly when/if we link the function body. llvm-svn: 257420	2016-01-12 00:24:24 +00:00
Manman Ren	1602605bf8	CXX_FAST_TLS calling convention: Add support for ARM on Darwin. rdar://9001553 llvm-svn: 257417	2016-01-11 23:50:43 +00:00
Dan Gohman	26c6765bd6	[WebAssembly] Define WebAssembly-specific relocation codes. Currently WebAssembly has two kinds of relocations; data addresses and function addresses. This adds ELF relocations for them, as well as an MC symbol kind to indicate which type of relocation is needed. llvm-svn: 257416	2016-01-11 23:38:05 +00:00
Reid Kleckner	5fb7a586e9	Avoid the deprecated GetVersionEx API Apparently the preferred version is the incredibly complicated VerifyVersionInfoW function. Rename the function to avoid potential future name clashes. llvm-svn: 257415	2016-01-11 23:33:03 +00:00
Sanjay Patel	e896ede7f1	[LibCallSimplifier] use instruction-level fast-math-flags to transform log calls Also, add tests to verify that we're checking 'fast' on both calls of each transform pair, tighten the CHECK lines, and give the tests more meaningful names. This is a continuation of: http://reviews.llvm.org/rL255555 http://reviews.llvm.org/rL256871 http://reviews.llvm.org/rL256964 http://reviews.llvm.org/rL257400 http://reviews.llvm.org/rL257404 llvm-svn: 257414	2016-01-11 23:31:48 +00:00
Rafael Espindola	36a425b618	Remove a bogus assert. There is no reason the value being printed has to be positive. Fixes pr25802. llvm-svn: 257412	2016-01-11 23:21:45 +00:00
Sanjay Patel	6c1ddbb7b6	[LibCallSimplifier] don't allow sqrt transform unless all ops are unsafe Fix the FIXME added with: http://reviews.llvm.org/rL257400 llvm-svn: 257404	2016-01-11 22:50:36 +00:00
Justin Bogner	0fb7ed5726	LoopUnroll: Use the optsize threshold for minsize as well Currently we're unrolling loops more in minsize than in optsize, which means -Oz will have a larger code size than -Os. That doesn't make any sense. This resolves the FIXME about this in LoopUnrollPass and extends the optsize test to make sure we use the smaller threshold for minsize as well. llvm-svn: 257402	2016-01-11 22:39:43 +00:00
Sanjay Patel	9f67dadea2	more space; NFC llvm-svn: 257401	2016-01-11 22:35:39 +00:00
Sanjay Patel	683f29735f	[LibCallSimplifier] use instruction-level fast-math-flags to transform sqrt calls This is a continuation of adding FMF to call instructions: http://reviews.llvm.org/rL255555 The intent of the patch is to preserve the current behavior of the transform except that we use the sqrt instruction's 'fast' attribute as a trigger rather than the function-level attribute. But this raises a bug noted by the new FIXME comment. In order to do this transform: sqrt((x * x) * y) ---> fabs(x) * sqrt(y) ...we need all of the sqrt, the first fmul, and the second fmul to be 'fast'. If any of those ops is strict, we should bail out. Differential Revision: http://reviews.llvm.org/D15937 llvm-svn: 257400	2016-01-11 22:34:19 +00:00
Sanjay Patel	34ea70a5c9	getParent()->getParent() == getFunction() and clang-format ; NFC llvm-svn: 257399	2016-01-11 22:24:35 +00:00
Sanjay Patel	472cc78ccb	don't repeat function names in comments; NFC llvm-svn: 257396	2016-01-11 22:14:42 +00:00
Dan Gohman	f225a63849	[WebAssembly] Reorganize address offset folding. Always expect tglobaladdr and texternalsym to be wrapped in WebAssemblywrapper nodes. Also, split out a regPlusGA from regPlusImm so that it can special-case global addresses, as they can be folded in more cases. Unfortunately this doesn't enable any new optimizations yet due to SelectionDAG limitations. I'll be submitting changes to the SelectionDAG infrastructure, along with tests, in a separate patch. llvm-svn: 257394	2016-01-11 22:05:44 +00:00
Matt Arsenault	5e0bdb8b95	AMDGPU: Implement {{s\|u}}int_to_fp i64 -> f32 The old lowering for uint_to_fp failed opencl conformance. It might be OK for fast math mode, but I'm not sure. llvm-svn: 257393	2016-01-11 22:01:48 +00:00
Teresa Johnson	b43257d594	Split resolveCycles(bool AllowTemps) into two interfaces and document Address review feedback from r255909. Move body of resolveCycles(bool AllowTemps) to resolveRecursivelyImpl(bool AllowTemps). Revert resolveCycles back to asserting on temps, and add new resolveNonTemporaries interface to invoke the new implementation with AllowTemps=true. Document the differences between these interfaces, specifically the effect on RAUW support and uniquing. Call appropriate interface from ValueMapper. llvm-svn: 257389	2016-01-11 21:37:41 +00:00
Matt Arsenault	800fecf9de	AMDGPU: Fix crash with dispatch.ptr intrinsic with non-HSA target It might be better to let this be a select failure instead. llvm-svn: 257386	2016-01-11 21:18:33 +00:00
Reid Kleckner	6cdf844d75	Revert "[Windows] Simplify assertion code. NFC." This reverts commit r254363. load64BitDebugHelp() has the side effect of loading dbghelp and setting globals. It should be called in no-asserts builds as well as debug builds. llvm_unreachable is also not appropriate here, since we actually want to return if dbghelp couldn't be loaded in a non-asserts build. llvm-svn: 257384	2016-01-11 21:07:48 +00:00
Reid Kleckner	ffbe12f4c5	Use ::GetVersionEx directly rather than the Win8.1 SDK helpers This removes ifdefs and fixes the build for users of the Win8.0 SDK, which I happen to be. Upgrading is not hard, but executing the same code everywhere seems better. llvm-svn: 257379	2016-01-11 20:35:45 +00:00
Adhemerval Zanella	e600c99a4e	[sanitizer] [msan] Fix origin store of array types This patch fixes the memory sanitizer origin store instrumentation for array types. This can be triggered by cases where frontend lowers function return to array type instead of aggregation. For instance, the C code: -- struct mypair { int64_t x; int y; }; mypair my_make_pair(int64_t x, int y) { mypair p; p.x = x; p.y = y; return p; } int foo (int p) { mypair z = my_make_pair(p, 0); return z.y + z.x; } -- It will be lowered with target set to aarch64-linux and -O0 to: -- [...] define i32 @_Z3fooi(i32 %p) #0 { [...] %call = call [2 x i64] @_Z12my_make_pairxi(i64 %conv, i32 0) %1 = bitcast %struct.mypair* %z to [2 x i64]* store [2 x i64] %call, [2 x i64]* %1, align 8 [...] -- The origin store will emit a 'icmp' to test each store value against the TLS origin array. However since 'icmp' does not support ArrayType the memory instrumentation phase will bail out with an error. This patch changes it by using the same strategy used for struct type on array. It fixes the 'test/msan/insertvalue_origin.cc' for aarch64 (the -O0 case). llvm-svn: 257375	2016-01-11 19:55:27 +00:00
Chen Li	509ff21300	Code refactoring for commit r257278. llvm-svn: 257366	2016-01-11 19:20:53 +00:00
Chad Rosier	f35395eac1	[NFC] Fix whitespace. llvm-svn: 257365	2016-01-11 19:17:36 +00:00
Matt Arsenault	5319b0add5	AMDGPU: Fix ctlz combine for sub 32-bit types llvm-svn: 257353	2016-01-11 17:02:06 +00:00
Matt Arsenault	de5fbe9c60	AMDGPU: Pattern match ffbh pattern to instruction. The hardware instruction's output on 0 is -1 rather than 32. Eliminate a test and select to -1. This removes an extra instruction from the compatibility function with HSAIL's firstbit instruction. llvm-svn: 257352	2016-01-11 17:02:00 +00:00
Matt Arsenault	f058d67643	AMDGPU: Custom lower i64 ctlz llvm-svn: 257348	2016-01-11 16:50:29 +00:00
Matt Arsenault	a0e5cd55ad	Mips: Remove lowerSELECT_CC This is the same as the default expansion. llvm-svn: 257346	2016-01-11 16:44:48 +00:00
Matt Arsenault	5ca3c72c5a	LegalizeDAG: Expand ctlz with ctlz_zero_undef if legal llvm-svn: 257345	2016-01-11 16:37:46 +00:00
Matt Arsenault	02d45dfeda	AMDGPU: Remove dead target dag combine llvm-svn: 257344	2016-01-11 16:37:40 +00:00
Lang Hames	9d7a269f47	[LLI] Replace the LLI remote-JIT support with the new ORC remote-JIT components. The new ORC remote-JITing support provides a superset of the old code's functionality, so we can replace the old stuff. As a bonus, a couple of previously XFAILed tests have started passing. llvm-svn: 257343	2016-01-11 16:35:55 +00:00
Silviu Baranga	603954ef0e	Revert r257164 - it has caused spec2k6 failures in LTO mode llvm-svn: 257340	2016-01-11 16:19:38 +00:00
Daniel Sanders	4d32300cfd	[mips] Never select JAL for calls to an absolute immediate address. Summary: It actually takes an offset into the current PC-region. This fixes the 'expr' command in lldb. Reviewers: vkalintiris, jaydeep, bhushan Subscribers: dsanders, llvm-commits Differential Revision: http://reviews.llvm.org/D16054 llvm-svn: 257339	2016-01-11 15:57:46 +00:00
Krzysztof Parzyszek	bc17b68a47	[Hexagon] Add check for nullptr in getFixupNoBits llvm-svn: 257338	2016-01-11 15:51:53 +00:00
Krzysztof Parzyszek	f49a8411f8	[Hexagon] Add implicit uses of GP to GP-relative loads and stores llvm-svn: 257337	2016-01-11 15:49:58 +00:00
Krzysztof Parzyszek	b024445444	[Hexagon] Mark D14 and GP as reserved registers llvm-svn: 257336	2016-01-11 15:47:41 +00:00
Alexey Bataev	28f0c5efec	[X86] Reduce complexity of the LEA optimization pass, by Andrey Turetsky. In the OptimizeLEA pass keep instructions' positions in the basic block saved and use them for calculation of the distance between two instructions instead of std::distance. This reduces complexity of the pass from O(n^3) to O(n^2) and thus the compile time. Differential Revision: http://reviews.llvm.org/D15692 llvm-svn: 257328	2016-01-11 11:52:29 +00:00
Junmo Park	7ceec0b82f	[BranchFolding] Set correct mem refs (2nd try) This is a recommit of r257253 which was reverted in r257270. Previous testcase can make failure on some targets due to using opt with O3 option. Original Summary: Merge MBBICommon and MBBI's MMOs. Differential Revision: http://reviews.llvm.org/D15990 llvm-svn: 257317	2016-01-11 07:15:38 +00:00
Lang Hames	4d0a5a9ec6	[Orc] Add support for remote JITing to the ORC API. This patch adds utilities to ORC for managing a remote JIT target. It consists of: 1. A very primitive RPC system for making calls over a byte-stream. See RPCChannel.h, RPCUtils.h. 2. An RPC API defined in the above system for managing memory, looking up symbols, creating stubs, etc. on a remote target. See OrcRemoteTargetRPCAPI.h. 3. An interface for creating high-level JIT components (memory managers, callback managers, stub managers, etc.) that operate over the RPC API. See OrcRemoteTargetClient.h. 4. A helper class for building servers that can handle the RPC calls. See OrcRemoteTargetServer.h. The system is designed to work neatly with the existing ORC components and functionality. In particular, the ORC callback API (and consequently the CompileOnDemandLayer) is supported, enabling lazy compilation of remote code. Assuming this doesn't trigger any builder failures, a follow-up patch will be committed which tests these utilities by using them to replace LLI's existing remote-JITing demo code. llvm-svn: 257305	2016-01-11 01:40:11 +00:00
Craig Topper	9d2cab7742	[AVX-512] Remove another extra space from the Intel syntax asm strings. llvm-svn: 257304	2016-01-11 01:03:40 +00:00
Lang Hames	70b2406f78	[Orc] Rename OrcTargetSupport to OrcArchitectureSupport to avoid confusion with the upcoming remote-target support classes. llvm-svn: 257302	2016-01-11 00:56:15 +00:00
Craig Topper	9feea57844	[AVX-512] Remove more superfluous spaces from asm strings. llvm-svn: 257301	2016-01-11 00:44:58 +00:00
Craig Topper	156622ad9d	[AVX-512] Remove unused Round and Itinerary from the maskable_cmp multiclasses. They weren't used and there were extra spaces in the asm string to prepare for the concatenations of the round string that wasn't ever used. llvm-svn: 257300	2016-01-11 00:44:56 +00:00
Craig Topper	bfe13ff6ca	[AVX-512] Make spacing between comma and {sae} operand consistent in asm strings. llvm-svn: 257299	2016-01-11 00:44:52 +00:00
Craig Topper	5be407ab27	[X86] Remove extra spaces from MPX instruction asm strings. llvm-svn: 257298	2016-01-11 00:44:46 +00:00
Lang Hames	4026f90e5d	[Orc] Add error codes and a new std::error_category for remote-jit errors. These will be used by an upcoming patch that adds remote-jit support utilities to ORC. llvm-svn: 257297	2016-01-11 00:34:13 +00:00
Lang Hames	b0934294b2	[RuntimeDyld] Add a notifyObjectLoaded method to RuntimeDyld::MemoryManager. This is a more generic version of the MCJITMemoryManager::notifyObjectLoaded method: It provides only a RuntimeDyld reference (rather than an ExecutionEngine), and so can be used with ORC JIT stacks. llvm-svn: 257296	2016-01-10 23:59:41 +00:00
Xinliang David Li	8a5bdb5dce	Move coveragemap_error enum into coverage namespace and InstrProf.h /NFC llvm-svn: 257295	2016-01-10 21:56:33 +00:00
Lang Hames	b2b7a3c179	[RuntimeDyld] Add alignment arguments to the reserveAllocationSpace method of RuntimeDyld::MemoryManager. The RuntimeDyld::MemoryManager::reserveAllocationSpace method is called when object files are loaded, and gives clients a chance to pre-allocate memory for all segments. Previously only the size of each segment (code, ro-data, rw-data) was supplied but not the alignment. This hasn't caused any problems so far, as most clients allocate via the MemoryBlock interface which returns page-aligned blocks. Adding alignment arguments enables finer grained allocation while still satisfying alignment restrictions. llvm-svn: 257294	2016-01-10 18:51:50 +00:00
Keno Fischer	875b122dfd	[SectionMemoryManager] Don't just drop the RO free list In r255760, I optimized the SectionMemoryManager to make better use of virtual memory on platforms where the allocation granularity was bigger than the protection granularity. As part of this, fixing up the free list became more complicated and was moved into `applyMemoryGroupPermissions`. Unfortunately, I forgot to actually remove the call that drops the free list for RO memory (I did remove the corresponding one for RX memory), defeating the whole optimization. llvm-svn: 257293	2016-01-10 18:17:12 +00:00
Daniel Berlin	7256059ef0	Speed up LiveDebugValues Summary: Use proper dataflow ordering to speed convergence. This will converge the testcase on bug 26055 in 2 iterations. (data structures speedups to come to make even that faster) Reviewers: kcc, samsonov, echristo, dblaikie, tvvikram Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D16039 llvm-svn: 257292	2016-01-10 18:08:32 +00:00
Elena Demikhovsky	542dfcf44c	Optimized instruction sequence for sitofp operation on X86-32 Optimized sitofp i64 %x to double. The current sequence movl %ecx, 8(%esp) movl %edx, 12(%esp) fildll 8(%esp) is replaced with: movd %ecx, %xmm0 movd %edx, %xmm1 punpckldq %xmm1, %xmm0 movq %xmm0, 8(%esp) Differential Revision: http://reviews.llvm.org/D15946 llvm-svn: 257285	2016-01-10 09:41:22 +00:00
Michael Zuckerman	885f61c534	[AVX512] add PRORVQ and PRORVD Intrinsic Differential Revision:http://reviews.llvm.org/D15955 llvm-svn: 257283	2016-01-10 09:16:41 +00:00
David Majnemer	d9833ea579	[JumpThreading] Don't forget to report that the IR changed JumpThreading's runOnFunction is supposed to return true if it made any changes. JumpThreading has a call to removeUnreachableBlocks which may result in changes to the IR but runOnFunction didn't appropriately account for this possibility, leading to badness. While we are here, make sure to call LazyValueInfo::eraseBlock in removeUnreachableBlocks; JumpThreading preserves LVI. This fixes PR26096. llvm-svn: 257279	2016-01-10 07:13:04 +00:00
Chen Li	c375450e3f	Fix a control flow problem in commit rL257277. llvm-svn: 257278	2016-01-10 06:13:32 +00:00
Chen Li	1689c2f54b	[SimplifyCFG] Extend SimplifyResume to handle phi of trivial landing pad. Summary: This is a fix of D13718. D13718 was committed but then reverted because of the following bug: https://llvm.org/bugs/show_bug.cgi?id=25299 This patch fixes the issue shown in the bug. Reviewers: majnemer, reames Subscribers: jevinskie, llvm-commits Differential Revision: http://reviews.llvm.org/D14308 llvm-svn: 257277	2016-01-10 05:48:01 +00:00
Joseph Tremoulet	a9a05cbcf9	[WinEH] Fix catchpad pred verification Summary: The code was simply ensuring that the catchpad's pred is its catchswitch, which was letting cases slip through where the flow edge was the unwind edge of the catchswitch rather than one of its catch clauses. Reviewers: andrew.w.kaylor, rnk, majnemer Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D16011 llvm-svn: 257275	2016-01-10 04:32:03 +00:00
Joseph Tremoulet	8ea8086322	[WinEH] Disallow cyclic unwinds Summary: Funclet-based EH personalities/tables likely can't handle these, and they can't be generated at source, so make them officially illegal in IR as well. Reviewers: andrew.w.kaylor, rnk, majnemer Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D15963 llvm-svn: 257274	2016-01-10 04:31:05 +00:00
Joseph Tremoulet	81e81960e3	[WinEH] Verify consistent funclet unwind exits Summary: A funclet EH pad may be exited by an unwind edge, which may be a cleanupret exiting its cleanuppad, an invoke exiting a funclet, or an unwind out of a nested funclet transitively exiting its parent. Funclet EH personalities require all such exceptional exits from a given funclet to have the same unwind destination, and EH preparation / state numbering / table generation implicitly depends on this. Formalize it as a rule of the IR in the LangRef and verifier. Reviewers: rnk, majnemer, andrew.w.kaylor Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D15962 llvm-svn: 257273	2016-01-10 04:30:02 +00:00
Joseph Tremoulet	e28885e693	[WinEH] Verify unwind edges against EH pad tree Summary: Funclet EH personalities require a tree-like nesting among funclets (enforced by the ParentPad linkage in the IR), and also require that unwind edges conform to certain rules with respect to the tree: - An unwind edge may exit 0 or more ancestor pads - An unwind edge must enter exactly one EH pad, which must be distinct from any exited pads - A cleanupret's edge must exit its cleanuppad Describe these rules in the LangRef, and enforce them in the verifier. Reviewers: rnk, majnemer, andrew.w.kaylor Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D15961 llvm-svn: 257272	2016-01-10 04:28:38 +00:00
Daniel Berlin	ca4d93a82f	Don't use random class variables across functions llvm-svn: 257271	2016-01-10 03:25:42 +00:00
Michael Zolotukhin	0fc89c67cc	Revert "[BranchFolding] Set correct mem refs" This reverts commit 1ff11017d2669b933b29fcbb6451cfcda34ad693. llvm-svn: 257270	2016-01-09 23:53:16 +00:00
Simon Pilgrim	c7bebcbfd8	[X86][AVX] Match broadcast loads through a bitcast AVX1 v8i32/v4i64 shuffles are bitcasted to v8f32/v4f64, this patch peeks through any bitcast to check for a load node to allow broadcasts to occur. This is a re-commit of r257055 after r257264 fixed 32-bit broadcast loads of i64 scalars. llvm-svn: 257266	2016-01-09 20:59:39 +00:00
Lang Hames	829826bf96	[Orc] Enable user-supplied memory managers in the CompileOnDemand layer. Previously the CompileOnDemand layer was hard-coded to use a new SectionMemoryManager for each function when it was called. llvm-svn: 257265	2016-01-09 20:55:18 +00:00
Simon Pilgrim	2e7a1849c9	[X86][AVX] Add support for i64 broadcast loads on 32-bit targets Added 32-bit AVX1/AVX2 broadcast tests. llvm-svn: 257264	2016-01-09 19:59:27 +00:00
Lang Hames	859d73ce95	[Orc][RuntimeDyld] Prevent duplicate calls to finalizeMemory on shared memory managers. Prior to this patch, recursive finalization (where finalization of one RuntimeDyld instance triggers finalization of another instance on which the first depends) could trigger memory access failures: When the inner (dependent) RuntimeDyld instance and its memory manager are finalized, memory allocated (but not yet relocated) by the outer instance is locked, and relocation in the outer instance fails with a memory access error. This patch adds a latch to the RuntimeDyld::MemoryManager base class that is checked by a new method: RuntimeDyld::finalizeWithMemoryManagerLocking, ensuring that shared memory managers are only finalized by the outermost RuntimeDyld instance. This allows ORC clients to supply the same memory manager to multiple calls to addModuleSet. In particular it enables the use of user-supplied memory managers with the CompileOnDemandLayer which must reuse the supplied memory manager for each function that is lazily compiled. llvm-svn: 257263	2016-01-09 19:50:40 +00:00
Benjamin Kramer	543762da3e	[JumpThreading] Use range-based for loops. No functionality change intended. llvm-svn: 257262	2016-01-09 18:43:01 +00:00
Benjamin Kramer	530e0db333	[TRE] Simplify code with range-based loops and std::find. No functional change intended. llvm-svn: 257261	2016-01-09 17:35:29 +00:00
Junmo Park	e1582cec34	[BranchFolding] Set correct mem refs Merge MBBICommon and MBBI's MMOs. Differential Revision: http://reviews.llvm.org/D15990 llvm-svn: 257253	2016-01-09 07:30:13 +00:00
Manuel Jacob	734e73342d	[RS4GC] Update and simplify handling of Constants in findBaseDefiningValueOfVector(). Summary: This is analogous to r256079, which removed an overly strong assertion, and r256812, which simplified the code by replacing three conditionals by one. Reviewers: reames Subscribers: sanjoy, llvm-commits Differential Revision: http://reviews.llvm.org/D16019 llvm-svn: 257250	2016-01-09 04:02:16 +00:00
Kostya Serebryany	1f9c40db1d	[libFuzzer] debug prints in tracing llvm-svn: 257249	2016-01-09 03:46:08 +00:00
Kostya Serebryany	b65805a939	[libFuzzer] change the way trace-based mutations are applied. Instead of a custom code just rely on the automatically created dictionary llvm-svn: 257248	2016-01-09 03:08:58 +00:00
Manuel Jacob	0593cfd336	[RS4GC] Unify two asserts. NFC. llvm-svn: 257247	2016-01-09 03:08:49 +00:00
Kostya Serebryany	c573316eee	[libFuzzer] don't limit memcmp tracing with 8 bytes llvm-svn: 257245	2016-01-09 01:39:55 +00:00
Philip Reames	5715f576ea	[rs4gc] Optionally directly relocated vector of pointers This patch teaches rewrite-statepoints-for-gc to relocate vector-of-pointers directly rather than trying to split them. This builds on the recent lowering/IR changes to allow vector typed gc.relocates. The motivation for this is that we recently found a bug in the vector splitting code where depending on visit order, a vector might not be relocated at some safepoint. Specifically, the bug is that the splitting code wasn't updating the side tables (live vector) of other safepoints. As a result, a vector which was live at two safepoints might not be updated at one of them. However, if you happened to visit safepoints in post order over the dominator tree, everything worked correctly. Weirdly, it turns out that post order is actually an incredibly common order to visit instructions in in practice. Frustratingly, I have not managed to write a test case which actually hits this. I can only reproduce it in large IR files produced by actual applications. Rather than continue to make this code more complicated, we can remove all of the complexity by just representing the relocation of the entire vector natively in the IR. At the moment, the new functionality is hidden behind a flag. To use this code, you need to pass "-rs4gc-split-vector-values=0". Once I have a chance to stress test with this option and get feedback from other users, my plan is to flip the default and remove the original splitting code. I would just remove it now, but given the rareness of the bug, I figured it was better to leave it in place until the new approach has been stress tested. Differential Revision: http://reviews.llvm.org/D15982 llvm-svn: 257244	2016-01-09 01:31:13 +00:00
Kostya Serebryany	e7583d21e3	[libFuzzer] refactor the way we collect cmp traces (don't use std::vector, don't limit with 8 bytes) llvm-svn: 257239	2016-01-09 00:38:40 +00:00
Mike Aizatsky	6129dacdbb	fixing type. llvm-svn: 257238	2016-01-09 00:31:56 +00:00
NAKAMURA Takumi	0929fcfbcf	llvm/lib/DebugInfo/Symbolize/DIPrinter.cpp: Fix build in -m32. 1L is incompatible to int64_t. llvm-svn: 257237	2016-01-09 00:28:50 +00:00
Mike Aizatsky	17dbc2831e	[llvm-symbolizer] -print-source-context-lines option to print source code around the line. Differential Revision: http://reviews.llvm.org/D15909 llvm-svn: 257236	2016-01-09 00:14:35 +00:00
Sanjay Patel	9f088ab5e2	rangify; NFCI llvm-svn: 257226	2016-01-08 22:59:42 +00:00
Tobias Edler von Koch	ccd3bfc3c8	[Hexagon] Replace a static member variable in HexagonCVIResource (NFC) This creates one instance of TUL per HexagonShuffler, which avoids thread-safety issues with future changes. llvm-svn: 257215	2016-01-08 22:07:25 +00:00
Sanjay Patel	9f49b683e0	variable names start with an upper case letter; NFC llvm-svn: 257213	2016-01-08 22:05:03 +00:00
Sanjay Patel	1dc7dfb9d9	[DAGCombiner] don't dereference an operand that doesn't exist (PR26070) The bug was introduced with changes for x86-64 fp128: http://reviews.llvm.org/rL254653 I don't know why an x86 change is here, so I'll follow up in: http://reviews.llvm.org/D15134 Should fix: https://llvm.org/bugs/show_bug.cgi?id=26070 llvm-svn: 257200	2016-01-08 19:53:24 +00:00
Haicheng Wu	a6a3279bd3	[JumpThreading] Split select that has constant conditions coming from the PHI node Look for PHI/Select in the same BB of the form bb: %p = phi [false, %bb1], [true, %bb2], [false, %bb3], [true, %bb4], ... %s = select p, trueval, falseval And expand the select into a branch structure. This later enables jump-threading over bb in this pass. Using the similar approach of SimplifyCFG::FoldCondBranchOnPHI(), unfold select if the associated PHI has at least one constant. If the unfolded select is not jump-threaded, it will be folded again in the later optimizations. llvm-svn: 257198	2016-01-08 19:39:39 +00:00
Tim Shen	9b68bd48ca	Test commit access - add a blank line in comment. llvm-svn: 257192	2016-01-08 19:20:23 +00:00
Justin Bogner	e9fb228d59	LoopInfo: Simplify ownership of Loop objects It's strange that LoopInfo mostly owns the Loop objects, but that it defers deleting them to the loop pass manager. Instead, change the oddly named "updateUnloop" to "markAsRemoved" and have it queue the Loop object for deletion. We can't delete the Loop immediately when we remove it, since we need its pointer identity still, so we'll mark the object as "invalid" so that clients can see what's going on. llvm-svn: 257191	2016-01-08 19:08:53 +00:00
Weiming Zhao	4b3b13d3bc	RBIT Instruction only available for ARMv6t2 and above. Summary: r255334 matches bit-reverse pattern in InstCombine and generates calls to Intrinsic::bitreverse. RBIT instruction is only available for ARMv6t2 and above. This patch has the intrinsic expanded during legalization for ARMv4 and ARMv5. Patch by Z. Zheng <zhaoshiz@codeaurora.org> Reviewers: apazos, jmolloy, weimingz Subscribers: aemerson, rengolin, llvm-commits Differential Revision: http://reviews.llvm.org/D15932 llvm-svn: 257188	2016-01-08 18:43:41 +00:00
Weiming Zhao	48c033e021	Disable shrink-wrap for Thumb1 Summary: In ARMConstantIslandPass, which runs after Shrink Wrap pass, long jumps will be fixed up as BL (tBfar) which depends on spilling LR in epilogue. However, shrink-wrap may remove the LR, which causes issues when the function returns. Reviewers: qcolombet, rengolin Subscribers: aemerson, rengolin Differential Revision: http://reviews.llvm.org/D15984 llvm-svn: 257187	2016-01-08 18:37:43 +00:00
Easwaran Raman	7f18729039	Remove CloningDirector and associated code With the removal of the old landing pad code in r249918, CloningDirector is not used anywhere else. NFCI. llvm-svn: 257185	2016-01-08 18:23:17 +00:00
Pirama Arumuga Nainar	bf5ccdccb2	Do not ASSERTZEXT for i16 result of bitcast from f16 operand Summary: During legalization of i16, do not ASSERTZEXT the result of FP_TO_FP16. Directly return an FP_TO_FP16 node with return type as the promote-to-type of i16. This patch also removes extraneous length check. This legalization should be valid even if integer and float types are of different lengths. This patch breaks a hard-float test for fp16 args. The test is changed to allow a vmov to zero-out the top bits, and also ensure that the return value is in an FP register. Reviewers: ab, jmolloy Subscribers: srhines, llvm-commits Differential Revision: http://reviews.llvm.org/D15438 llvm-svn: 257184	2016-01-08 17:46:05 +00:00
David Majnemer	2a6368f609	[WinEH] CatchHandler which don't have catch objects in StackColoring StackColoring rewrites the frame indices of operations involving allocas if it can find that the life time of two objects do not overlap. MSVC EH needs to be kept aware of this if it happens in the event that a catch object has moved around. However, we represent the non-existence of a catch object with a sentinel frame index (INT_MAX). This sentinel also happens to be the EmptyKey of the SlotRemap DenseMap. Testing for whether or not we need to translate the frame index fails in this case because we call the count method on the DenseMap with the EmptyKey, leading to assertions. Instead, check if it is our sentinel value before trying to look into the DenseMap. This fixes PR26073. llvm-svn: 257182	2016-01-08 17:24:47 +00:00
Teresa Johnson	1b00f2d99a	[ThinLTO] Use new in-place symbol changes for exporting module Due to the new in-place ThinLTO symbol handling support added in r257174, we now invoke renameModuleForThinLTO on the current module from within the FunctionImport pass. Additionally, renameModuleForThinLTO no longer needs to return the Module as it is performing the renaming in place on the one provided. This commit will be immediately preceded by a companion clang patch to remove its invocation of renameModuleForThinLTO. llvm-svn: 257181	2016-01-08 17:06:29 +00:00
Teresa Johnson	4504c1bc80	[ThinLTO] Enable in-place symbol changes for exporting module Summary: Move ThinLTO global value processing functions out of ModuleLinker and into a new ThinLTOGlobalProcessor class, which performs any necessary linkage and naming changes on the given module in place. As a result, renameModuleForThinLTO no longer needs to create a new Module when performing any necessary local to global promotion on a module that we are possibly exporting from during a ThinLTO backend compilation. During function importing the ThinLTO processing is still invoked from the ModuleLinker (via the new class), as it needs to perform renaming and linkage changes on the source module, e.g. in order to get the correct renaming during local to global promotion. Reviewers: joker.eph Subscribers: davidxl, llvm-commits, joker.eph Differential Revision: http://reviews.llvm.org/D15696 llvm-svn: 257174	2016-01-08 15:00:00 +00:00
Tom Stellard	4c4c72db48	AMDGPU/SI: Emit global variable sizes when targeting HSA Reviewers: arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D15952 llvm-svn: 257173	2016-01-08 14:50:28 +00:00
Tom Stellard	ad8f5e8111	AMDGPU: Emit functions sizes Reviewers: arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D15951 llvm-svn: 257172	2016-01-08 14:50:23 +00:00
Teresa Johnson	a1080ee6f0	[ThinLTO] Delay metadata materialization in function importer The function importer was still materializing metadata when modules were loaded for function importing. We only want to materialize it when we are going to invoke the metadata linking postpass. Materializing it before function importing is not only unnecessary, but also causes metadata referenced by imported functions to be mapped in early, and then not connected to the rest of the module level metadata when it is ultimately linked in. Augmented the test case to specifically check for the metadata being properly connected, which it wasn't before this fix. llvm-svn: 257171	2016-01-08 14:17:41 +00:00
Nemanja Ivanovic	2314e83227	Prevent renaming of CR fields in AADB when a CR restore is present This patch corresponds to review: http://reviews.llvm.org/D15930 Moves to and from CR fields depend on shifts/masks that depend on the target/source CR field. Thus, post-ra anti-dep breaking must not later change that CR register assignment. llvm-svn: 257168	2016-01-08 13:09:54 +00:00
NAKAMURA Takumi	134d31e328	InstCombineCompares.cpp: Fix a warning. [-Wbraced-scalar-init] llvm-svn: 257167	2016-01-08 12:50:03 +00:00
Silviu Baranga	9e007efad2	Re-commit r257064, this time with a fixed assert In setInsertionPoint if the value is not a PHI, Instruction or Argument it should be a Constant, not a ConstantExpr. Original commit message: [InstCombine] Look through PHIs, GEPs, IntToPtrs and PtrToInts to expose more constants when comparing GEPs Summary: When comparing two GEP instructions which have the same base pointer and one of them has a constant index, it is possible to only compare indices, transforming it to a compare with a constant. This removes one use for the GEP instruction with the constant index, can reduce register pressure and can sometimes lead to removing the comparison entirely. InstCombine was already doing this when comparing two GEPs if the base pointers were the same. However, in the case where we have complex pointer arithmetic (GEPs applied to GEPs, PHIs of GEPs, conversions to or from integers, etc) the value of the original base pointer will be hidden to the optimizer and this transformation will be disabled. This change detects when the two sides of the comparison can be expressed as GEPs with the same base pointer, even if they don't appear as such in the IR. The transformation will convert all the pointer arithmetic to arithmetic done on indices and all the relevant uses of GEPs to GEPs with a common base pointer. The GEP comparison will be converted to a comparison done on indices. Reviewers: majnemer, jmolloy Subscribers: hfinkel, jevinskie, jmolloy, aadg, llvm-commits Differential Revision: http://reviews.llvm.org/D15146 llvm-svn: 257164	2016-01-08 11:11:04 +00:00
Chandler Carruth	1926b70e37	[attrs] Split the late-revisit pattern for deducing norecurse in a top-down manner into a true top-down or RPO pass over the call graph. There are specific patterns of function attributes, notably the norecurse attribute, which are most effectively propagated top-down because all they use is caller information. Walk in RPO over the call graph SCCs takes the form of a module pass run immediately after the CGSCC pass manager's postorder walk of the SCCs, trying again to deduce norecurse for each singular SCC in the call graph. This removes a very legacy pass manager specific trick of using a lazy revisit list traversed during finalization of the CGSCC pass. There is no analogous finalization step in the new pass manager, and a lazy revisit list is just trying to produce an RPO iteration of the call graph. We can do that more directly if more expensively. It seems unlikely that this will be the expensive part of any compilation though as we never examine the function bodies here. Even in an LTO run over a very large module, this should be a reasonably fast set of operations over a reasonably small working set -- the function call graph itself. In the future, if this really is a compile time performance issue, we can look at building support for both post order and RPO traversals directly into a pass manager that builds and maintains the PO list of SCCs. Differential Revision: http://reviews.llvm.org/D15785 llvm-svn: 257163	2016-01-08 10:55:52 +00:00
David Majnemer	086fec23ec	[WinEH] Update WinEHFuncInfo if StackColoring merges allocas Windows EH keeps track of which frame index corresponds to a catchpad in order to inform the runtime where the catch parameter should be initialized. LLVM's optimizations are able to prove that the memory used by the catch parameter can be reused with another memory optimization, changing its frame index. We need to keep WinEHFuncInfo up to date with respect to this or we will miscompile/assert. This fixes PR26069. llvm-svn: 257158	2016-01-08 08:03:55 +00:00
Dylan McKay	cc6733aa83	[AVR] Added AVRSelectionDAGInfo header file llvm-svn: 257152	2016-01-08 06:32:27 +00:00
Craig Topper	048e700828	[AVX-512] Remove superfluous spaces from some asm strings. llvm-svn: 257150	2016-01-08 06:09:20 +00:00
Craig Topper	04493fda81	[X86] Don't print the aliased version of CVTSD2SI64rm. This appears to be a mistake I made years ago. llvm-svn: 257149	2016-01-08 06:09:18 +00:00
Craig Topper	29510c0430	[X86] Use \t instead of space after mnemonics in a bunch of InstAliases for consistency. llvm-svn: 257148	2016-01-08 06:09:13 +00:00
Xinliang David Li	062cde9cc3	[PGO] Ensure vp data in indexed profile always sorted Done in InstrProfWriter to eliminate the need for client code to do the sorting. The operation is done once and reused many times so it is more efficient. Update unit test to remove sorting. Also update expected output of affected tests. llvm-svn: 257145	2016-01-08 05:45:21 +00:00
Junmo Park	aa9243a25d	Remove extra whitespace. NFC. llvm-svn: 257144	2016-01-08 04:20:32 +00:00
Xinliang David Li	51dc04cff2	[PGO] Fix a bug in InstProfWriter addRecord For a new record with weight != 1, only edge profiling counters are scaled, VP data is not properly scaled. This patch refactors the code and fixes the problem. Also added sort by count interface (for follow up patch). llvm-svn: 257143	2016-01-08 03:49:59 +00:00
Mehdi Amini	599ebf2767	Remove static global GCNames from Function.cpp and move it to the Context This removes the need for locking when deleting a function. Differential Revision: http://reviews.llvm.org/D15988 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 257139	2016-01-08 02:28:20 +00:00
Kyle Butt	bfcff3856a	Add call sequence start and end for __tls_get_addr This is a fix for bug http://llvm.org/bugs/show_bug.cgi?id=25839. For a PIC TLS variable access in a function, prologue (mflr followed by std and stdu) gets scheduled after a tls_get_addr call. tls_get_addr messed up LR but no one saves/restores it. Also added a test for save/restore clobbered registers during calling __tls_get_addr. Patch by Tim Shen llvm-svn: 257137	2016-01-08 02:06:19 +00:00
Kyle Butt	a02ce98bd4	[Vectorization] Actually return from error case in isStridedPtr The early return seems to be missed. This causes a radical and wrong loop optimization on powerpc. It isn't reproducible on x86_64, because "UseInterleaved" is false. Patch by Tim Shen. llvm-svn: 257134	2016-01-08 01:55:13 +00:00
Sanjay Patel	d72a458d28	[InstCombine] insert a new shuffle in a safe place (PR25999) Limit this transform to a basic block and guard against PHIs. Hopefully, this fixes the remaining failures in PR25999: https://llvm.org/bugs/show_bug.cgi?id=25999 llvm-svn: 257133	2016-01-08 01:39:16 +00:00
Dan Gohman	4ef99433aa	[WebAssembly] Minor code cleanups. NFC. llvm-svn: 257131	2016-01-08 01:18:00 +00:00
Matthias Braun	7c66afb887	IntEqClasses: Let join() return the new leader The new leader is known anyway so we can return it for some micro optimization in code where it is easy to pass along the result to the next join(). llvm-svn: 257130	2016-01-08 01:16:39 +00:00
Matthias Braun	bf47f63b74	LiveInterval: A LiveRange is enough for ConnectedVNInfoEqClasses::Classify() llvm-svn: 257129	2016-01-08 01:16:35 +00:00
Dan Gohman	35e4a28947	[WebAssembly] Minor code cleanups. NFC. llvm-svn: 257128	2016-01-08 01:06:00 +00:00
Dan Gohman	8633eedb30	[WebAssembly] Remove an unused def : Pat. WebAssemblyISelLowering.cpp does not wrap jump table nodes inside of WebAssemblywrapper nodes, so this pattern is not currently used. llvm-svn: 257127	2016-01-08 00:50:33 +00:00

86304 Commits